Tips for Improving OCR Results

Important to Know

Tesseract is a library for performing optical character recognition (OCR), but it's important to know that Tesseract performs best when it is given a preprocessed image: ideally, crisp black text on a clean white background.

The following sections provide some tips on how to preprocess images before running them through Tesseract, to improve both the accuracy and the speed of OCR.

Upstream tips

The upstream Tesseract library has a Wiki page on how to improve the quality of OCR results here: https://code.google.com/p/tesseract-ocr/wiki/ImproveQuality

It's worth reading because it explains the kinds of processing Tesseract does and does not do, which is useful in determining what preprocessing to perform on an image.

Using GPUImage's Adaptive Threshold Filter

GPUImage is a fantastic image processing library for iOS that filters images on the GPU, so it's really fast. It even comes with photo camera and live video camera classes that you can use to build a pipeline of one or more filters, as sketched below.
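As a rough sketch of that pipeline idea (the session preset, orientation, and view setup here are only illustrative; see the GPUImage documentation for the full setup), a live camera feed could be run through the adaptive threshold filter discussed below and rendered to a GPUImageView:

#import <GPUImage/GPUImage.h>

// Capture live video from the back camera
GPUImageVideoCamera *videoCamera =
    [[GPUImageVideoCamera alloc] initWithSessionPreset:AVCaptureSessionPreset640x480
                                        cameraPosition:AVCaptureDevicePositionBack];
videoCamera.outputImageOrientation = UIInterfaceOrientationPortrait;

// Run the camera output through the adaptive threshold filter
GPUImageAdaptiveThresholdFilter *thresholdFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
[videoCamera addTarget:thresholdFilter];

// Render the filtered feed into a GPUImageView in your view hierarchy
GPUImageView *filteredView = (GPUImageView *)self.view;
[thresholdFilter addTarget:filteredView];

[videoCamera startCameraCapture];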

You can use GPUImage's GPUImageAdaptiveThresholdFilter to preprocess an image for performing OCR, which "determines the local luminance around a pixel, then turns the pixel black if it is below that local luminance and white if above. This can be useful for picking out text under varying lighting conditions."

Here's some sample code to get you started:

// Be sure to #import <GPUImage/GPUImage.h> (or "GPUImage.h") at the top of your file

// Grab the image you want to preprocess
UIImage *inputImage = [UIImage imageNamed:@"my_test_image.jpg"];

// Initialize our adaptive threshold filter
GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
stillImageFilter.blurRadiusInPixels = 4.0; // adjust this to tweak the blur radius of the filter, defaults to 4.0

// Retrieve the filtered image from the filter
UIImage *filteredImage = [stillImageFilter imageByFilteringImage:inputImage];

// Give Tesseract the filtered image
tesseract.image = filteredImage;
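For completeness, here is a minimal sketch of how the tesseract object referenced above could be created and run (assuming the TesseractOCR iOS framework is installed along with the "eng" traineddata file; adjust the language code to suit):

#import <TesseractOCR/TesseractOCR.h>

// Create a Tesseract instance for English
G8Tesseract *tesseract = [[G8Tesseract alloc] initWithLanguage:@"eng"];

// Give Tesseract the filtered image and run recognition
tesseract.image = filteredImage;
[tesseract recognize];

NSLog(@"Recognized text:\n%@", [tesseract recognizedText]);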

Bypassing Tesseract's Internal Thresholder

By default, Tesseract applies Otsu's thresholding method to every image as a pre-processing step of the recognition process.
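(For background: Otsu's method picks a single global threshold t for the whole image, chosen to maximize the between-class variance σ_b²(t) = ω₀(t)·ω₁(t)·(μ₀(t) − μ₁(t))², where ω₀/ω₁ and μ₀/μ₁ are the weights and mean intensities of the pixels below and above t. A single global threshold works well for evenly lit images but can struggle under uneven lighting, which is exactly the case the adaptive, local threshold filter above is designed to handle.)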

But if you've already performed your own preprocessing/thresholding (as with the GPUImage code above), you will probably want to bypass Tesseract's internal thresholding step. You can do that with the preprocessedImageForTesseract:sourceImage: delegate method: if implemented, it is called before the internal thresholder, and as long as it returns an image, the internal thresholder is skipped.

To skip the internal thresholding step, the GPUImage code above should be changed as follows:

// somewhere in a method of your class
// set the delegate
tesseract.delegate = self;
// give the original, unprocessed image to Tesseract
tesseract.image = [UIImage imageNamed:@"my_test_image.jpg"];

// Tesseract delegate method inside of your class
- (UIImage *)preprocessedImageForTesseract:(G8Tesseract *)tesseract sourceImage:(UIImage *)sourceImage {

    // sourceImage is the same image you sent to Tesseract above
    UIImage *inputImage = sourceImage;

    // Initialize our adaptive threshold filter
    GPUImageAdaptiveThresholdFilter *stillImageFilter = [[GPUImageAdaptiveThresholdFilter alloc] init];
    stillImageFilter.blurRadiusInPixels = 4.0; // adjust this to tweak the blur radius of the filter, defaults to 4.0

    // Retrieve the filtered image from the filter
    UIImage *filteredImage = [stillImageFilter imageByFilteringImage:inputImage];

    // Give the filteredImage to Tesseract instead of the original one, 
    // allowing us to bypass the internal thresholding step.
    // filteredImage will be sent immediately to the recognition step
    return filteredImage;
}
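Note that for the tesseract.delegate = self assignment above to work, your class needs to declare that it conforms to the G8TesseractDelegate protocol. A minimal sketch (the class name here is just a placeholder):

// MyViewController.h (placeholder name) -- declare conformance to G8TesseractDelegate
#import <TesseractOCR/TesseractOCR.h>

@interface MyViewController : UIViewController <G8TesseractDelegate>
@end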