OCR-D/ocrd_anybaseocr

Document Cropping

Tools to crop scanned images for OCR-D

Installing

Requires Python >= 3.6.

  1. Create a new venv unless you already have one

     python3 -m venv venv
    
  2. Activate the venv

     source venv/bin/activate
    
  3. To install from source, get GNU make and do:

     make install
    

    Prebuilt packages are also available on PyPI:

     pip install ocrd_anybaseocr
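
After either install route, it can help to confirm that the processors ended up on the PATH of the active venv. The following is a minimal sketch, not part of this repository: `check_tools` is a hypothetical helper name, and the executable names in the usage line are taken from the examples further down in this README.

```shell
# Hypothetical helper: report whether each named executable is on PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage, with the venv activated: `check_tools ocrd-anybaseocr-crop ocrd-anybaseocr-layout-analysis`. A "missing" line usually means the venv is not active or the install step did not complete.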
    

Tools

All tools, also called processors, abide by the OCR-D CLI specification, which roughly looks like this:

ocrd-<processor-name> [-m <path to METS input file>] -I <input group> -O <output group> [-p <path to parameter file>]* [-P <param name> <param value>]*
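
To illustrate how this pattern expands, here is a small sketch that assembles a command line from a processor name, the two file groups, and any number of parameter name/value pairs. `build_ocrd_cmd` is a hypothetical helper, not part of OCR-D. It only prints the command, and it assumes that no value contains spaces.

```shell
# Hypothetical helper: print an OCR-D processor call following the pattern above.
# Arguments: <processor-name> <input group> <output group> [<param name> <param value>]...
build_ocrd_cmd() {
  processor=$1
  input_grp=$2
  output_grp=$3
  shift 3
  cmd="ocrd-$processor -I $input_grp -O $output_grp"
  # Each remaining name/value pair gets its own -P option.
  while [ "$#" -ge 2 ]; do
    cmd="$cmd -P $1 $2"
    shift 2
  done
  printf '%s\n' "$cmd"
}
```

For example, `build_ocrd_cmd anybaseocr-crop OCR-D-DESKEW OCR-D-CROP rulerAreaMax 0 marginLeft 0.1` prints a call with two `-P` options, one per parameter.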

Cropper

Method Behaviour

For each page, this processor takes a document image as input and computes the border around the page content area, thereby removing textual noise and any other noise around the page frame. It also annotates a cropped image.

The input image does not need to be binarized, but should be deskewed for the module to work optimally.

Implemented via rule-based methods (gradient-based line segment detection and morphology-based textline detection).

Example:

ocrd-anybaseocr-crop -I OCR-D-DESKEW -O OCR-D-CROP -P rulerAreaMax 0 -P marginLeft 0.1
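
The same two parameters can also be kept in a parameter file and passed with `-p`, which is easier to reuse across runs. This is a sketch under assumptions: the file name `crop-params.json` is arbitrary, the parameter file is assumed to be JSON as is usual for OCR-D processors, and the command is expected to run inside a workspace that contains the `OCR-D-DESKEW` group.

```shell
# Write the parameters from the example above into a JSON file
# (crop-params.json is an arbitrary, illustrative name).
cat > crop-params.json <<'EOF'
{
  "rulerAreaMax": 0,
  "marginLeft": 0.1
}
EOF

# Run the cropper only if it is installed; otherwise print a hint.
if command -v ocrd-anybaseocr-crop >/dev/null 2>&1; then
  ocrd-anybaseocr-crop -I OCR-D-DESKEW -O OCR-D-CROP -p crop-params.json
else
  echo "ocrd-anybaseocr-crop not found; activate the venv first" >&2
fi
```

Values given with `-P` on the command line can still be combined with a file passed via `-p`, as the CLI pattern above allows both.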

Document Analyser

Method Behaviour

For the whole document, this processor takes all the cropped page images and their corresponding text regions as input and computes the logical structure (page types and sections).

The input image should be binarized and segmented for this module to work.

Implemented via data-driven methods (neural Inception-V3 image classification model trained with TensorFlow/Keras).

Example:

ocrd-anybaseocr-layout-analysis -I OCR-D-LINE -O OCR-D-STRUCT

Testing

To test the tools under realistic conditions (on OCR-D workspaces), download OCR-D/assets. In particular, the code is tested with the dfki-testdata dataset.

To download the data:

make assets

To run module tests:

make test

To run processor/workflow tests:

make cli-test

License

 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
