Text Recognition and Redaction using OpenCV and Pytesseract

OS: Ubuntu 22.04 with Python 3.10

Libraries used:

Python, OpenCV, Tesseract

Install the system requirements for the project:

sudo apt-get install unzip tesseract-ocr

Unzip the zip file:

unzip Soroco-task.zip

Change into the project directory:

cd Soroco-task

Install the required Python libraries for the project:

pip install -r requirements.txt

Run the project:

python3 text_recognize.py <image_path>

For example, I have used the image ocr_input.png, which is already present in the directory.

Input Image:

python3 text_recognize.py ocr_input.png

Output:

This will generate input_image_text.json and input_image_redacted.png in the current directory.

input_image_text.json: JSON file containing the recognized words and their bounding boxes.
input_image_redacted.png: Redacted image with the text removed.
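
The two outputs described above can be illustrated with a minimal sketch. It is not the code in text_recognize.py, whose internals are not shown here. It assumes the OCR results arrive in the column layout that pytesseract.image_to_data returns with output_type=pytesseract.Output.DICT. The function names words_from_ocr_data and redact, the record fields (text, left, top, width, height), and the black fill color are all hypothetical choices made for illustration; the real JSON layout and redaction style may differ.

```python
import json


def words_from_ocr_data(data, min_conf=0.0):
    """Turn pytesseract-style OCR columns into a list of word records."""
    words = []
    for i, text in enumerate(data["text"]):
        text = text.strip()
        # Tesseract reports non-word rows (blocks, lines) with conf -1,
        # so they are dropped by the confidence check or the empty-text check.
        if not text or float(data["conf"][i]) < min_conf:
            continue
        words.append({
            "text": text,
            "left": int(data["left"][i]),
            "top": int(data["top"][i]),
            "width": int(data["width"][i]),
            "height": int(data["height"][i]),
        })
    return words


def redact(image, words, color=(0, 0, 0)):
    """Paint a filled rectangle over every word box (modifies image in place)."""
    import cv2  # imported lazily so the JSON helper works without OpenCV

    for w in words:
        top_left = (w["left"], w["top"])
        bottom_right = (w["left"] + w["width"], w["top"] + w["height"])
        # A thickness of -1 asks OpenCV to fill the rectangle.
        cv2.rectangle(image, top_left, bottom_right, color, -1)
    return image


# Hand-written sample data in the assumed column layout (not real OCR output).
sample = {
    "text": ["", "Hello", "  ", "World"],
    "conf": ["-1", "96", "-1", "91"],
    "left": [0, 10, 0, 70],
    "top": [0, 12, 0, 12],
    "width": [200, 50, 0, 55],
    "height": [40, 18, 0, 18],
}

words = words_from_ocr_data(sample)
print(json.dumps(words, indent=2))
```

With a real image, the same helpers would be fed from pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT) on an image loaded with cv2.imread, and the redacted result written with cv2.imwrite. Filling boxes with a solid color is only one way to remove text; the project may use a different fill or an inpainting step.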

Redacted Image:

Repository: fti-vsaxena/Readme-write
