Heroku Custom Buildpack for Tesseract OCR
bin
.gitignore
LICENSE
README.md
ell.traineddata
eng.traineddata
eng_fast.traineddata
grc.traineddata
tesseract-ocr-4.00.eng.tar.gz
tesseract-ocr-4.00.tar.gz


Heroku Buildpack Tesseract

This package provides a custom Heroku buildpack that supplies the Tesseract OCR binary and all the required libraries to Heroku apps. Training data for English is included.

Configuration

  1. Set up your app as follows:

    heroku buildpacks:set heroku/LANG
    heroku buildpacks:add https://github.com/caueguedes/heroku-buildpack-tesseract
    

    where LANG is the language used by your app (e.g., ruby, python, or nodejs). A complete list of Heroku buildpacks can be found here.

  2. You can now use the tesseract binary in your Heroku app!

  3. Deploy :)
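Once the buildpack is installed, an app can call the binary like any other command-line tool. The following is a minimal Python sketch, not part of this buildpack. It assumes that `tesseract` is on the dyno's `PATH` and that the English training data is discoverable by the binary; the helper names `build_tesseract_command` and `ocr_image` are hypothetical.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for one OCR run.

    Passing "stdout" as the output base makes tesseract print the
    recognized text instead of writing it to a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run tesseract on image_path and return the recognized text."""
    # Assumption: the buildpack exposes the binary on PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout
```

Calling `ocr_image("scan.png")` would return the text recognized in a hypothetical `scan.png`; checking for the binary first gives a clearer error if the buildpack was not applied.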

Example

A minimal functioning Heroku app using this buildpack can be found here. The app is coded in Python and provides a REST method that accepts an image and returns the Tesseract OCR output as a JSON object. The REST functionality is implemented through the Flask web microframework.

Note

This fork solves the issue of the missing libraries libtiff4 and libjpeg62.

License

MIT License.

Original work Copyright (c) 2013 Marco Azimonti.
Modified work Copyright (c) 2015 Matteo Maggioni.
Modified work Copyright (c) 2017 Brian Castor.
Modified work Copyright (c) 2018 Caue Guedes.
