This project is a python wrapper to grab text from images and save as text files using Google Tesseract Engine. Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License, Version 2.0, and development has been sponsored by Google since 2006. In 2006 Tesseract was considered one of the most accurate open-source OCR engines then available.

Usage

python main.py -i <input_path> -o <output_path>

usage: main.py [-h] -i INPUT [-o OUTPUT] [-d]

required arguments: -i INPUT, --input INPUT Single image file path or images directory path

optional arguments: -o OUTPUT, --output OUTPUT (Optional) Output directory for converted text -d, --debug Enable verbose DEBUG logging

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Files

README.md

Latest commit

History

README.md

File metadata and controls