DeepOCR

Description

DeepOCR is an Optical Character Recognition library that can train Deep Neural Network models and deploy them to a Streamlit web application. The base library is written using PyTorch and PyTorch-Lightning, while the dashboard is built with Streamlit.

How to run

First, install the dependencies:

# clone deepOCR   
git clone https://github.com/das-projects/deepOCR

# install deepOCR in editable mode
cd deepOCR
pip install -e .
# install the FreeFont TrueType fonts (Debian/Ubuntu package)
sudo apt-get install fonts-freefont-ttf -y
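To confirm that the editable install worked, a quick check along these lines can help. It is a minimal sketch using only the standard library: `deepocr` is the package name imported later in this README, while `torch` and `streamlit` are assumed to be the import names of the libraries mentioned in the description.

```python
from importlib import util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if util.find_spec(name) is None]


# Assumed import names: `deepocr` (this project), `torch` (PyTorch), `streamlit`.
required = ["deepocr", "torch", "streamlit"]
missing = missing_packages(required)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, rerunning `pip install -e .` from the repository root is the first thing to try.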

Streamlit Webapp

Next, try out the Streamlit dashboard for a demonstration:

# go to the repository root (skip this step if you are already there)
cd deepOCR
# run demo
streamlit run demo/app.py

Python code API

This project is set up as a package, which means you can import its modules from any other Python file:

# Download an example PDF (shell command)
wget https://eforms.com/download/2019/01/Cash-Payment-Receipt-Template.pdf

import matplotlib.pyplot as plt

from deepocr.io import DocumentFile
from deepocr.models import ocr_predictor

# Load the pdf file
doc = DocumentFile.from_pdf("Cash-Payment-Receipt-Template.pdf").as_images()
print(f"Number of pages: {len(doc)}")

# Use the predictor object to detect and recognize text
predictor = ocr_predictor(pretrained=True)

# show the predictor output!
result = predictor(doc)
result.show(doc)

# Use the synthesize method to regenerate each page as an image from the recognized text
synthetic_pages = result.synthesize()
plt.imshow(synthetic_pages[0])
plt.axis('off')
plt.show()
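The `wget` step above is a shell command, so it will not run inside a Python script or notebook cell as written. As a rough, standard-library-only alternative, the example PDF could be fetched from Python as sketched below. The helper names `filename_from_url` and `download_if_missing` are hypothetical and not part of deepOCR.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URL of the example document used in this README
EXAMPLE_URL = "https://eforms.com/download/2019/01/Cash-Payment-Receipt-Template.pdf"


def filename_from_url(url):
    """Return the last path component of a URL, ignoring any query string."""
    return PurePosixPath(urlparse(url).path).name


def download_if_missing(url, dest_dir="."):
    """Download `url` into `dest_dir` unless the file is already there; return its path."""
    target = Path(dest_dir) / filename_from_url(url)
    if not target.exists():
        urlretrieve(url, str(target))
    return target
```

Calling `download_if_missing(EXAMPLE_URL)` returns the local path, which can then be passed as a string to `DocumentFile.from_pdf` as in the example above.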

Models architecture reference

Text Detection

Text Recognition

Citation

@misc{deepocr2022,
  title={deep OCR: Optical Character Recognition with Deep Learning},
  author={Arijit Das and Raphael Kronberg},
  howpublished={\url{https://github.com/das-projects/deepOCR}},
  year={2022}
}
