Tesseract for iOS
tesseract-ios is not actively maintained anymore. I encourage you to use gali8's Tesseract-OCR-iOS instead.
Tesseract-ios is an Objective-C wrapper for Tesseract OCR.
This project couldn't exist without Ângelo Suzuki's blog post. A lot of the code came from his article.
- iOS SDK 6.0, iOS 5.0+ (there is no support for armv6)
- Tesseract and Leptonica libraries from the tesseract-ios-lib repo.
- Clone this repo from your project folder.
- Download the appropriate Tesseract trained data file for your language from https://code.google.com/p/tesseract-ocr/downloads/list and put it in your project folder.
- You should have the following folder structure:
- Add tesseract-ios as a group, and tessdata by reference, to your project.
- Go to your project settings, and ensure that C++ Standard Library is set to libstdc++.
Here is the default workflow to extract text from an image:
- Instantiate Tesseract with data path and language
- Set variables (character set, …)
- Set the image to analyze
- Start recognition
- Get recognized text
    #import "Tesseract.h"

    Tesseract* tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata" language:@"eng"];
    [tesseract setVariableValue:@"0123456789" forKey:@"tessedit_char_whitelist"];
    [tesseract setImage:[UIImage imageNamed:@"image_sample.jpg"]];
    [tesseract recognize];
    NSLog(@"%@", [tesseract recognizedText]);
    [tesseract clear];
- (id)initWithDataPath:(NSString *)dataPath language:(NSString *)language
Initialize a new Tesseract object.
dataPath: a relative path from the application bundle to the folder containing the .traineddata files. You can find these files in the Tesseract downloads section.
language: language used for recognition. For example, with eng, Tesseract will search for an eng.traineddata file in the dataPath folder.
Returns nil if instantiation failed.
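Since the initializer returns nil on failure, it may be worth checking the result before going further. The following is a minimal sketch, assuming ARC and a tessdata folder reference in the app bundle that contains eng.traineddata. The helper name MakeEnglishTesseract is hypothetical and not part of tesseract-ios.

```objc
#import <UIKit/UIKit.h>
#import "Tesseract.h"

// Hypothetical helper (not part of tesseract-ios): returns a configured
// Tesseract instance, or nil if initialization failed.
static Tesseract *MakeEnglishTesseract(void) {
    // Assumption: "tessdata" was added to the project by reference and
    // holds eng.traineddata.
    Tesseract *tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata"
                                                      language:@"eng"];
    if (tesseract == nil) {
        // A common cause is a missing or misplaced trained data file.
        NSLog(@"Tesseract init failed; check tessdata/eng.traineddata in the bundle.");
    }
    return tesseract;
}
```

Callers can then skip recognition entirely when the helper returns nil, instead of sending messages to a nil object and silently getting no text back.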
- (void)setVariableValue:(NSString *)value forKey:(NSString *)key
Set the Tesseract variable key to value. See http://www.sk-spell.sk.cx/tesseract-ocr-en-variables for a complete (but not up-to-date) list.
For instance, use tessedit_char_whitelist to restrict characters to a specific set.
- (void)setImage:(UIImage *)image
Set the image to recognize.
- (BOOL)setLanguage:(NSString *)language
Override the language defined with initWithDataPath:language:.
- (BOOL)recognize
Start text recognition. You might want to run this process in the background.
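One way to keep recognition off the main thread is Grand Central Dispatch. This is only a sketch under a few assumptions: ARC is enabled, the tessdata folder with eng.traineddata is in the bundle, and the helper name RecognizeInBackground and its completion block are hypothetical, not part of tesseract-ios.

```objc
#import <UIKit/UIKit.h>
#import "Tesseract.h"

// Hypothetical helper (not part of tesseract-ios): runs OCR on a background
// queue and hands the text (or nil on failure) back on the main queue.
static void RecognizeInBackground(UIImage *image,
                                  void (^completion)(NSString *text)) {
    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_async(queue, ^{
        NSString *text = nil;
        // Assumption: "tessdata" in the bundle contains eng.traineddata.
        Tesseract *tesseract = [[Tesseract alloc] initWithDataPath:@"tessdata"
                                                          language:@"eng"];
        if (tesseract != nil) {
            [tesseract setImage:image];
            [tesseract recognize];
            text = [tesseract recognizedText];
            [tesseract clear];
        }
        // UI updates belong on the main queue.
        dispatch_async(dispatch_get_main_queue(), ^{
            if (completion) {
                completion(text);
            }
        });
    });
}
```

A caller might pass a block that assigns the text to a label. The Tesseract instance is created and used entirely inside the background block, so it is never shared between threads.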
- (NSString *)recognizedText
Get the text extracted from the image.
- (void)clear
Clears the Tesseract object after text has been recognized from an image, preventing memory leaks.