Python SIP wrapper for libtesseract (Apache license)

Yet another Python library wrapper for Tesseract; this one uses SIP to do the
wrapping. It is distributed under the same license as Tesseract.

Why does this exist?

python-tesseract is GPL licensed, and I needed something with an Apache
license.

Build Requirements

You must have a compiler installed that is supported by Python distutils.
Check out LIBS.txt for other dependency details. 

Additionally, SIP must be installed. You can get SIP at:

There are working examples in the 'example' folder. They require you to
unpack the Tesseract OCR data to a directory called 'tessdata'.
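
Before running the examples, it can help to confirm that the 'tessdata'
directory is where you expect and actually contains language data. The sketch
below is not part of this wrapper; find_traineddata is a hypothetical helper,
and it assumes the unpacked language files use the '.traineddata' suffix
(for example 'eng.traineddata').

```python
import os


def find_traineddata(tessdata_dir="tessdata"):
    """Return the sorted language codes found in tessdata_dir.

    Hypothetical helper, not part of the wrapper. Raises IOError if the
    directory is missing or contains no *.traineddata files.
    """
    if not os.path.isdir(tessdata_dir):
        raise IOError("tessdata directory not found: %r" % tessdata_dir)

    suffix = ".traineddata"
    # Keep only language data files and strip the suffix, e.g. "eng".
    langs = sorted(
        name[:-len(suffix)]
        for name in os.listdir(tessdata_dir)
        if name.endswith(suffix)
    )
    if not langs:
        raise IOError("no *.traineddata files in %r" % tessdata_dir)
    return langs
```

Calling find_traineddata("tessdata") from the directory that holds the
examples either returns the list of installed language codes or fails early
with a message that names the missing piece.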

The python-tesseract wiki has some useful Python code samples. Anything
that uses the TessBaseAPI will most likely be compatible with this library.
However, python-tesseract has a bunch of other functions (which may or may
not be useful for you) that are not implemented in this wrapper. See the wiki
on the python-tesseract Google Code site for similar usage examples.

Platforms Tested

    Windows 7 x64
        - Python 2.7 x86, MSVC 2008 (Tesseract 3.02, Leptonica 1.68)
        - Python 2.7 x64, MSVC 2008 (custom x64 build of Tess/Lept)


If you find bugs, please report them at the GitHub site for
python-tesseract-sip and send fixes my way. However, no technical support
will be provided. Good luck and have fun!

Dustin Spicuzza

Get the latest version of this code at