A layer for AWS Lambda containing the tesseract C libraries and tesseract executable.
Tesseract OCR Lambda Layer

See also my new repo for layer deployment via the AWS Cloud Development Kit (CDK).

This project creates an AWS Lambda layer that contains the Tesseract 4.0.0 OCR libraries. The fast German, English, and OSD (orientation and script detection) data files are included by default. To change the language or the data file variant, edit the Dockerfile:

...
ARG DIST=/opt/build-dist
# change OCR_LANG to enable the layer for different languages
ARG OCR_LANG=deu
# change TESSERACT_SUFFIX to use different datafiles (options: "_best", "_fast" and "")
ARG TESSERACT_SUFFIX=_fast
...
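
To make the effect of the two build arguments more concrete, here is a minimal Python sketch of how they could map to upstream data files. It assumes that TESSERACT_SUFFIX selects one of the tesseract-ocr/tessdata, tesseract-ocr/tessdata_fast, or tesseract-ocr/tessdata_best repositories, and that the osd and eng files are always included alongside OCR_LANG. The function name `tessdata_sources` is hypothetical and is not part of this project; check the Dockerfile for the actual download logic.

```python
def tessdata_sources(ocr_lang: str, suffix: str = "_fast") -> dict:
    """Return the assumed upstream repository and data file names for a build."""
    if suffix not in ("", "_fast", "_best"):
        raise ValueError(f"unsupported TESSERACT_SUFFIX: {suffix!r}")
    # Assumption: osd and eng are always shipped; OCR_LANG adds one more language.
    languages = ["osd", "eng"]
    if ocr_lang not in languages:
        languages.append(ocr_lang)
    return {
        "repo": f"tesseract-ocr/tessdata{suffix}",
        "files": [f"{lang}.traineddata" for lang in languages],
    }


if __name__ == "__main__":
    # Example: French with the "best" data files instead of the defaults.
    print(tessdata_sources("fra", "_best"))
```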

The library files included in the layer are stripped before deployment, which makes them more suitable for the Lambda environment.

Build & Deploy layer

# Build Layer components
./build.sh
# Deploy via Serverless
sls deploy
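
Between the build and the deploy step it can be useful to check how large the built layer is, since Lambda limits the unzipped size of a function plus all its layers to 250 MB. The following is a rough sketch of such a check, not part of this project's tooling; it assumes the build output lands in a directory named layer, and the helper names are hypothetical.

```python
import os

# Lambda quota for the unzipped function package including all layers.
MAX_UNZIPPED_BYTES = 250 * 1024 * 1024


def directory_size(path: str) -> int:
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def fits_in_lambda(path: str, limit: int = MAX_UNZIPPED_BYTES) -> bool:
    """Return True if the directory contents stay within the given limit."""
    return directory_size(path) <= limit


if __name__ == "__main__":
    size = directory_size("layer")
    print(f"layer/ is {size / 1024 / 1024:.1f} MiB, fits: {fits_in_lambda('layer')}")
```

Keep in mind that the limit applies to the function code and every attached layer together, so the layer alone should stay well below it.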

How to use

An example showing how to use this layer with the Serverless Framework is included.
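
Independently of the bundled example, a function that has the layer attached could call the executable directly. The sketch below is an illustration under stated assumptions, not the project's own handler: it assumes the binary is at /opt/bin/tesseract and the data files are at /opt/tesseract/share/tessdata (matching the layer contents listed further down), that the Lambda runtime already has /opt/lib on its library search path, and that the image has been written to local storage beforehand. The event keys `image_path` and `lang` are hypothetical.

```python
import subprocess

# Paths assumed from the layer layout once it is mounted under /opt.
TESSERACT_BIN = "/opt/bin/tesseract"
TESSDATA_DIR = "/opt/tesseract/share/tessdata"


def build_command(image_path: str, lang: str = "deu") -> list:
    """Build the tesseract call; 'stdout' as output base prints the text."""
    return [
        TESSERACT_BIN,
        "--tessdata-dir", TESSDATA_DIR,
        image_path,
        "stdout",
        "-l", lang,
    ]


def handler(event, context):
    """Run OCR on a local image file and return the recognized text."""
    # Hypothetical event shape: {"image_path": "/tmp/scan.png", "lang": "deu"}
    command = build_command(event["image_path"], event.get("lang", "deu"))
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return {"text": result.stdout}
```

Passing the data directory explicitly avoids relying on a compiled-in default or an environment variable, at the cost of hard-coding the layer layout in the function.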

Misc: Layer contents

When a function uses the layer, the layer contents are deployed to /opt. The built layer directory contains the following:

$ ls -laR layer
layer:
total 24
drwxr-xr-x 5 bweigel bweigel 4096 Dez  2 22:42 .
drwxrwxr-x 8 bweigel bweigel 4096 Dez  2 23:24 ..
drwxr-xr-x 2 bweigel bweigel 4096 Dez  2 22:42 bin
drwxr-xr-x 2 bweigel bweigel 4096 Dez  2 22:42 lib
-rw-rw-r-- 1 bweigel bweigel   42 Dez  2 22:42 .slsignore
drwxr-xr-x 3 bweigel bweigel 4096 Dez  2 22:42 tesseract

layer/bin:
total 320
drwxr-xr-x 2 bweigel bweigel   4096 Dez  2 22:42 .
drwxr-xr-x 5 bweigel bweigel   4096 Dez  2 22:42 ..
-rwxr-xr-x 1 bweigel bweigel 316127 Dez  2 22:42 tesseract

layer/lib:
total 6072
drwxr-xr-x 2 bweigel bweigel    4096 Dez  2 22:42 .
drwxr-xr-x 5 bweigel bweigel    4096 Dez  2 22:42 ..
-rwxr-xr-x 1 bweigel bweigel 2534424 Dez  2 22:42 liblept.so.5
-rwxr-xr-x 1 bweigel bweigel 3354640 Dez  2 22:42 libtesseract.so.4
-rwxr-xr-x 1 bweigel bweigel  311352 Dez  2 22:42 libwebp.so.4

layer/tesseract:
total 12
drwxr-xr-x 3 bweigel bweigel 4096 Dez  2 22:42 .
drwxr-xr-x 5 bweigel bweigel 4096 Dez  2 22:42 ..
drwxr-xr-x 3 bweigel bweigel 4096 Dez  2 22:42 share

layer/tesseract/share:
total 12
drwxr-xr-x 3 bweigel bweigel 4096 Dez  2 22:42 .
drwxr-xr-x 3 bweigel bweigel 4096 Dez  2 22:42 ..
drwxr-xr-x 2 bweigel bweigel 4096 Dez  2 22:42 tessdata

layer/tesseract/share/tessdata:
total 15836
drwxr-xr-x 2 bweigel bweigel     4096 Dez  2 22:42 .
drwxr-xr-x 3 bweigel bweigel     4096 Dez  2 22:42 ..
-rw-r--r-- 1 bweigel bweigel  1525436 Dez  2 22:42 deu.traineddata
-rw-r--r-- 1 bweigel bweigel  4113088 Dez  2 22:42 eng.traineddata
-rw-r--r-- 1 bweigel bweigel 10562727 Dez  2 22:42 osd.traineddata
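
Given this layout, a function can find out at runtime which data files the attached layer provides. The following is a small sketch that assumes the tessdata directory ends up at /opt/tesseract/share/tessdata; the helper name `available_languages` is hypothetical.

```python
from pathlib import Path


def available_languages(tessdata_dir: str = "/opt/tesseract/share/tessdata") -> list:
    """List the names of all *.traineddata files in the tessdata directory."""
    # A missing directory simply yields an empty list.
    return sorted(p.stem for p in Path(tessdata_dir).glob("*.traineddata"))


if __name__ == "__main__":
    print(available_languages())
```

With the default build shown above, this would be expected to report deu, eng, and osd.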