fti-vsaxena/Readme-write

Text Recognition and Redaction using OpenCV and Pytesseract

OS: Ubuntu 22.04 with Python 3.10

Libraries used:

Python, OpenCV, Tesseract

Install the system packages the project needs:

sudo apt-get install unzip tesseract-ocr

Unzip the zip file:

unzip Soroco-task.zip 

Move inside the directory:

cd Soroco-task

Install the required Python libraries:

pip install -r requirements.txt
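
Before running the project, it can help to confirm that both installation steps worked. The following is a minimal sketch, not part of the repository: the function name check_environment is hypothetical, and the import names cv2 and pytesseract are assumed from the project title, since the contents of requirements.txt are not shown here.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether the Tesseract binary and the Python packages are visible."""
    return {
        # The tesseract-ocr package installs a `tesseract` executable on PATH.
        "tesseract": shutil.which("tesseract") is not None,
        # Assumed import names: OpenCV is imported as cv2.
        "cv2": importlib.util.find_spec("cv2") is not None,
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
    }


for name, found in check_environment().items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

Any entry reported as MISSING points back to the corresponding install command above.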

Run Project:

python3 text_recognize.py <image_path>

For example, I have used the image ocr_input.png, which is already present in the directory.

Input Image:

ocr_input

python3 text_recognize.py ocr_input.png

Output:

This will generate input_image_text.json and input_image_redacted.png in the current directory.

  • input_image_text.json : JSON file containing the words and their bounding boxes.
  • input_image_redacted.png : Redacted image with the text removed.
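
To illustrate how words and bounding boxes can be obtained, here is a minimal sketch built on pytesseract.image_to_data. It is an assumption about the approach, not the code in text_recognize.py: the function names, the min_conf parameter, and the record layout (text, left, top, width, height) are hypothetical, and the real JSON schema may differ.

```python
def words_with_boxes(data, min_conf=0.0):
    """Turn the dict returned by pytesseract.image_to_data into word records."""
    words = []
    for i, text in enumerate(data["text"]):
        # Rows for pages, blocks and lines have empty text and a conf of -1.
        if not text.strip() or float(data["conf"][i]) < min_conf:
            continue
        words.append({
            "text": text,
            "left": int(data["left"][i]),
            "top": int(data["top"][i]),
            "width": int(data["width"][i]),
            "height": int(data["height"][i]),
        })
    return words


def recognize(image_path):
    """Run OCR on an image file and return its word records."""
    # Imported here so words_with_boxes stays usable without OCR installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    # OpenCV loads images as BGR; Tesseract expects RGB.
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    data = pytesseract.image_to_data(rgb, output_type=pytesseract.Output.DICT)
    return words_with_boxes(data)
```

Calling json.dumps(recognize("ocr_input.png"), indent=2) would then give a JSON document of the same general kind as the one described above.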

Redacted Image:

ocr_input_redacted
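
One common way to produce a redacted image like this is to paint a filled rectangle over each word's bounding box. The sketch below assumes that approach; the project itself may remove text differently (for example by inpainting). The names clamp_box and redact, the pad parameter, and the box record layout are hypothetical.

```python
def clamp_box(left, top, width, height, image_width, image_height, pad=2):
    """Grow a box by `pad` pixels and clip it to the image bounds."""
    x1 = max(left - pad, 0)
    y1 = max(top - pad, 0)
    x2 = min(left + width + pad, image_width - 1)
    y2 = min(top + height + pad, image_height - 1)
    return x1, y1, x2, y2


def redact(image_path, boxes, out_path, color=(0, 0, 0)):
    """Cover every box in `boxes` with a filled rectangle and save the result."""
    # Imported here so clamp_box can be used without OpenCV installed.
    import cv2

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    image_height, image_width = image.shape[:2]
    for box in boxes:
        x1, y1, x2, y2 = clamp_box(
            box["left"], box["top"], box["width"], box["height"],
            image_width, image_height,
        )
        # thickness=-1 tells OpenCV to fill the rectangle instead of outlining it.
        cv2.rectangle(image, (x1, y1), (x2, y2), color, thickness=-1)
    cv2.imwrite(out_path, image)
```

The small padding is a design choice: OCR boxes often hug the glyphs tightly, so a couple of extra pixels helps hide anti-aliased edges.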
