Skip to content

Latest commit

 

History

History
15 lines (9 loc) · 786 Bytes

README.md

File metadata and controls

15 lines (9 loc) · 786 Bytes

This project is a python wrapper to grab text from images and save as text files using Google Tesseract Engine. Tesseract is an optical character recognition engine for various operating systems. It is free software, released under the Apache License, Version 2.0, and development has been sponsored by Google since 2006. In 2006 Tesseract was considered one of the most accurate open-source OCR engines then available.

Usage

python main.py -i <input_path> -o <output_path>

usage: main.py [-h] -i INPUT [-o OUTPUT] [-d]

required arguments: -i INPUT, --input INPUT Single image file path or images directory path

optional arguments: -o OUTPUT, --output OUTPUT (Optional) Output directory for converted text -d, --debug Enable verbose DEBUG logging