Dramatically improve deskew performance with leptonica #61

Closed
wants to merge 4 commits into from

Commits on Jan 19, 2014

  1. Implement ctypes wrapper around Leptonica to access its deskew function

    A few design notes:
    Leptonica's deskew is far superior to ImageMagick's convert -deskew command --
    around 30-40x faster.  Subjectively, the output also appears to this
    contributor to be of higher quality.  The difference is the algorithm:
    ImageMagick uses the complex Hough transform to find the skew angle, while
    Leptonica uses a simpler method, Postl's variance of differential line sums --
    conceptually, shear the image by each candidate angle and score how strongly
    adjacent horizontal line sums differ; the score peaks when the text lines are
    straight and horizontal.  In this case simplicity wins.  Finding the skew
    angle is the bulk of the work.
    
    Leptonica's author explains the advantages of his approach here:
    http://www.leptonica.com/skew-measurement.html
    
    Leptonica is the low-level library that Tesseract depends on.  Hence, this
    project already depends on Leptonica.  Leptonica can read and write most
    common image file types on its own.
    
    Unfortunately its error handling is poor: it seldom returns any meaningful
    error codes.  The best it manages is writing messages to stderr, which in
    the context of a verbose script is just confusing since the error's source
    is not indicated.  The problem is compounded by Tesseract's use of Leptonica,
    which will produce exactly the same errors in some cases.  So we trap stderr
    between calls to Leptonica and parse it for a few different types of error
    message.
    
    leptonica.py is Python 2/3 compatible and set up to provide access to other
    Leptonica functions as needed.  Of particular interest is its orientation
    detection (including flip and rotation errors), which works by comparing
    text ascenders to descenders.
    
    There is a PyPI "pyleptonica" package; however, it is a few years out of
    date, and it implements all of Leptonica with Python wrappers -- so it is
    massive, with one .py file at 2.5 MB.  This module is loosely inspired by
    pyleptonica but is more modern and up to date, and it contains only limited
    functionality.
    Jim Barlow committed Jan 19, 2014
    62edc15
  2. Replace ImageMagick-convert with Leptonica

    Jim Barlow committed Jan 19, 2014
    6703434
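
For illustration, the approach described in the first commit above (a ctypes wrapper that traps stderr around Leptonica calls) might look roughly like the sketch below. It is not the PR's leptonica.py. It assumes Python 3, assumes the shared library can be located under the name "lept" or "leptonica", and assumes Leptonica's error messages start with "Error in ". The names captured_stderr, leptonica_errors, load_leptonica, deskew and LeptonicaError are hypothetical; pixRead, pixDeskew, pixWriteImpliedFormat and pixDestroy are Leptonica's C functions.

```python
import ctypes
import ctypes.util
import os
import sys
import tempfile
from contextlib import contextmanager


class LeptonicaError(Exception):
    """Hypothetical exception type for failures reported by Leptonica."""


@contextmanager
def captured_stderr():
    """Redirect file descriptor 2 to a temp file so output from C code is caught.

    Yields a list that is filled with the captured lines when the block exits.
    """
    lines = []
    sys.stderr.flush()
    saved_fd = os.dup(2)
    with tempfile.TemporaryFile() as tmp:
        os.dup2(tmp.fileno(), 2)
        try:
            yield lines
        finally:
            sys.stderr.flush()
            os.dup2(saved_fd, 2)  # restore the real stderr
            os.close(saved_fd)
            tmp.seek(0)
            lines.extend(tmp.read().decode("utf-8", "replace").splitlines())


def leptonica_errors(lines):
    """Keep only lines that look like Leptonica errors (assumed prefix)."""
    return [line for line in lines if line.startswith("Error in ")]


def load_leptonica():
    """Load the shared library and declare the few functions used here."""
    path = ctypes.util.find_library("lept") or ctypes.util.find_library("leptonica")
    if path is None:
        raise LeptonicaError("Leptonica shared library not found")
    lept = ctypes.CDLL(path)
    lept.pixRead.argtypes = [ctypes.c_char_p]
    lept.pixRead.restype = ctypes.c_void_p
    lept.pixDeskew.argtypes = [ctypes.c_void_p, ctypes.c_int]
    lept.pixDeskew.restype = ctypes.c_void_p
    lept.pixWriteImpliedFormat.argtypes = [
        ctypes.c_char_p, ctypes.c_void_p, ctypes.c_int, ctypes.c_int]
    lept.pixWriteImpliedFormat.restype = ctypes.c_int
    lept.pixDestroy.argtypes = [ctypes.POINTER(ctypes.c_void_p)]
    lept.pixDestroy.restype = None
    return lept


def deskew(infile, outfile):
    """Read infile, deskew it, and write outfile, raising on failure."""
    lept = load_leptonica()
    pix_in = pix_out = None
    status = 1  # pixWriteImpliedFormat returns 0 on success
    with captured_stderr() as messages:
        pix_in = lept.pixRead(os.fsencode(infile))
        if pix_in:
            pix_out = lept.pixDeskew(pix_in, 0)  # 0 = default search reduction
        if pix_out:
            status = lept.pixWriteImpliedFormat(os.fsencode(outfile), pix_out, 0, 0)
    # pixDestroy takes PIX** and drops one reference per call.
    for address in (pix_in, pix_out):
        if address:
            lept.pixDestroy(ctypes.byref(ctypes.c_void_p(address)))
    if status != 0:
        raise LeptonicaError("; ".join(leptonica_errors(messages)) or "deskew failed")
```

Redirecting the file descriptor rather than replacing sys.stderr matters here: Leptonica writes from C, so a Python-level swap would not see its messages. A call such as deskew("in.png", "out.png") would then raise LeptonicaError with the trapped text instead of leaving an unattributed line in the script's output.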

Commits on Jan 20, 2014

  1. Fix a silly typo, and other minor cleanup

    Jim Barlow committed Jan 20, 2014
    8cfbdaf

Commits on Jan 22, 2014

  1. Bug fix: leptonica generates .png when asked to produce .pbm/pgm/ppm

    Leptonica does not interpret those extensions correctly.  However, when
    asked to produce a .pnm file, it will produce the expected .pbm/pgm/ppm
    file depending on the input.  So ask it to produce a .pnm and then
    adjust the extension.
    
    And add a test case.
    Jim Barlow committed Jan 22, 2014
    5ace690
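
The workaround in this last commit can be sketched as follows. This is an illustration under assumptions, not the PR's code: write_image is a hypothetical callable standing in for whatever function writes the file through Leptonica, and write_with_pnm_workaround is a made-up name.

```python
import os

# Extensions that, per the commit message, Leptonica does not interpret correctly.
NETPBM_EXTENSIONS = {".pbm", ".pgm", ".ppm"}


def write_with_pnm_workaround(write_image, outfile):
    """Write outfile via write_image(path), detouring through a .pnm name.

    write_image is a hypothetical callable that writes an image to the given
    path and picks the format from the extension.
    """
    root, ext = os.path.splitext(outfile)
    if ext.lower() not in NETPBM_EXTENSIONS:
        write_image(outfile)
        return outfile
    # Ask for .pnm, which yields the expected PBM/PGM/PPM data for the input,
    # then give the file the extension the caller asked for.
    # Note: an existing file at the temporary .pnm path would be overwritten.
    temp_path = root + ".pnm"
    write_image(temp_path)
    os.replace(temp_path, outfile)
    return outfile
```

Requests for any other extension pass straight through, so only the three affected formats take the rename detour.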