glefundes/Multimethod-Binarization

Multimethod Binarization

Efficient implementation of local thresholding binarization methods in Python

This was originally developed while studying low-level character segmentation methods for LPR (License Plate Recognition). The more popular global thresholding methods for binarization (Otsu, etc.) are not well suited to LPR systems, as explored by the authors of the cited paper.

In the article, the authors posit that using a single binarization method with static parameters, while efficient under certain conditions, is not the best approach. Different methods or parameters perform best for different visual features across images (or even different regions of a single image), so it follows that applying multiple methods and merging the results should yield better overall accuracy.

Motivated by the lack of support for local thresholding binarization in popular computer vision libraries, I wrote this code to provide a simple interface for producing and using multiple binary images in character segmentation (especially in LPR systems, but it should be useful to OCR applications in general). It currently supports Niblack's, Sauvola's and Wolf's binarization methods.


Requirements

  • Python3
  • numpy
  • OpenCV
  • scipy
  • bottleneck

Usage

Multimethod Binarization

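The main binarization function is located in multibin.py. It can be used as follows:

    import cv2
    import multibin as mb

    # img_path: path to the input image
    # bin_methods: list of method dictionaries (format described further down)
    img = cv2.imread(img_path)
    bin_imgs = mb.binarize(img, bin_methods)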

Optional arguments are:

  • resize: Resize the input image to the desired output dimensions;
  • morph_kernel: Define a morphological kernel used to apply an opening to the image to reduce noise (see: this line). This improves the results slightly;
  • return_original: Return a copy of the original image, resized to the output dimensions, as the first element of the resulting list. Useful for prototyping.

You can find an example using all of them in the demo notebook included in this repo.

This function returns a list containing one binary image for each method given. The methods are defined as a list of dictionary objects with the following format:

    [{
    'type' : Binarization method (string),
    'window_size': Moving square window dimension (int),
    'k_factor': Constant (int)
    },
    (...)]
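
As an illustration of the format above, a method list might look like the following sketch. The 'type' strings, window sizes and k values are hypothetical placeholders chosen for the example (check multibin.py for the names and value types it actually accepts), and check_methods is a helper invented here, not part of the repository.

```python
# Hypothetical example values: the 'type' strings and the numbers below are
# assumptions for illustration, not values confirmed by the repository.
bin_methods = [
    {'type': 'niblack', 'window_size': 25, 'k_factor': -0.2},
    {'type': 'sauvola', 'window_size': 15, 'k_factor': 0.5},
    {'type': 'wolf', 'window_size': 15, 'k_factor': 0.5},
]

REQUIRED_KEYS = {'type', 'window_size', 'k_factor'}


def check_methods(methods):
    """Return True if every entry has exactly the three documented keys."""
    for method in methods:
        if set(method) != REQUIRED_KEYS:
            return False
        if not isinstance(method['type'], str):
            return False
        # The window is a square, so a single positive integer is expected.
        if not isinstance(method['window_size'], int) or method['window_size'] < 1:
            return False
        if not isinstance(method['k_factor'], (int, float)):
            return False
    return True
```

Validating the list before calling the binarization function makes a misspelled key fail early rather than somewhere inside the thresholding code.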

You can read more about the selection of the window size and the k constant in the paper that inspired this code. The threshold is calculated using bottleneck internally to speed up obtaining the moving average and standard deviation parameters.
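
To make the role of the moving mean and standard deviation concrete, here is a minimal sketch of the three threshold formulas as they are commonly stated in the literature. It uses plain NumPy with a centered, reflect-padded window instead of bottleneck, so it is not the repository's implementation. The function names, the default k values and R = 128 for Sauvola are assumptions made for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_mean_std(gray, window_size):
    """Mean and standard deviation over a centered square window (odd size)."""
    pad = window_size // 2
    padded = np.pad(gray.astype(np.float64), pad, mode='reflect')
    windows = sliding_window_view(padded, (window_size, window_size))
    return windows.mean(axis=(-2, -1)), windows.std(axis=(-2, -1))


def niblack_threshold(mean, std, k=-0.2):
    # Niblack: T = m + k * s
    return mean + k * std


def sauvola_threshold(mean, std, k=0.5, R=128.0):
    # Sauvola: T = m * (1 + k * (s / R - 1)), R = dynamic range of the std
    return mean * (1.0 + k * (std / R - 1.0))


def wolf_threshold(gray, mean, std, k=0.5):
    # Wolf: T = (1 - k) * m + k * M + k * (s / R) * (m - M), where M is the
    # minimum gray level of the image and R the maximum local std.
    M = float(gray.min())
    R = float(std.max())
    ratio = std / R if R > 0 else np.zeros_like(std)
    return (1.0 - k) * mean + k * M + k * ratio * (mean - M)


def apply_threshold(gray, threshold):
    """Pixels brighter than their local threshold become white (255)."""
    return (gray > threshold).astype(np.uint8) * 255
```

Each threshold is a per-pixel array with the same shape as the input, so one pass over the local statistics can feed several methods.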

CCA Analysis for ROI selection

Some auxiliary functions defined in utils/cs_utils.py demonstrate how to select potential ROIs from a binarized image. This is very rudimentary, since I've given up on using this method in my original project and moved on to Deep Learning instead (who hasn't?). Anyway, should anyone ever need or want to explore binarization-based OCR, it should be a helpful start. The demonstration notebook should be useful for visualizing what the system is doing and the possible next steps.
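
For readers unfamiliar with the idea, the sketch below shows connected component analysis in its simplest form: label blobs in a binary image, take their bounding boxes, and discard boxes that are unlikely to be characters. It is a pure Python illustration and not the code in utils/cs_utils.py. The function names, the 4-connectivity and the filtering criteria are assumptions made for this example.

```python
from collections import deque

import numpy as np


def connected_components(binary):
    """Return bounding boxes (x, y, width, height) of 4-connected blobs.

    Non-zero pixels are treated as foreground.
    """
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for y0 in range(h):
        for x0 in range(w):
            if binary[y0, x0] == 0 or seen[y0, x0]:
                continue
            # Breadth-first flood fill starting from an unvisited pixel.
            queue = deque([(y0, x0)])
            seen[y0, x0] = True
            min_y = max_y = y0
            min_x = max_x = x0
            while queue:
                y, x = queue.popleft()
                min_y, max_y = min(min_y, y), max(max_y, y)
                min_x, max_x = min(min_x, x), max(max_x, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny, nx] != 0 and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


def filter_boxes(boxes, min_height, max_aspect):
    """Keep boxes that are tall enough and not too wide for a character."""
    return [b for b in boxes if b[3] >= min_height and b[2] / b[3] <= max_aspect]
```

The per-pixel loop is slow on large images; a library routine for connected component labeling would be the practical choice, and this version only serves to show the logic.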

Features:

  • Wolf's, Sauvola's and Niblack's local thresholding methods
  • CCA analysis for blob extraction
  • Discard uninteresting blobs following guidelines from this work
  • Other local thresholding algorithms as described in the paper
  • Perform non-maximum suppression on redundant regions
  • Implement character recognition for final ROIs

I'll probably not be coming back to work on this anymore, but should anyone feel the urge to continue the work, I'll happily be of assistance.
