Invoice extractor extracts key-value pairs from an invoice.
It uses the EAST text detection model to locate text regions. Each detected region is then passed to Tesseract, which reads the text so the key-value pairs can be extracted.
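The detect-then-read flow described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the `key: value` line pattern, the thresholds, and the row-grouping tolerance are all assumptions, and `cv2.dnn.TextDetectionModel_EAST` requires OpenCV 4.5.1 or newer plus a downloaded EAST model file.

```python
import re

def quad_to_box(quad, width, height, pad=2):
    """Axis-aligned box (x0, y0, x1, y1) around a 4-point quadrangle, clamped to the image."""
    xs = [int(p[0]) for p in quad]
    ys = [int(p[1]) for p in quad]
    return (max(min(xs) - pad, 0), max(min(ys) - pad, 0),
            min(max(xs) + pad, width), min(max(ys) + pad, height))

def group_into_lines(words, tolerance=10):
    """Join (box, text) words whose vertical centres lie within `tolerance` pixels
    of the first word in a row, ordering each row left to right."""
    rows = []  # each row is [centre_y_of_first_word, [(x0, text), ...]]
    for (x0, y0, x1, y1), text in sorted(words, key=lambda w: (w[0][1] + w[0][3]) / 2):
        centre = (y0 + y1) / 2
        if rows and abs(rows[-1][0] - centre) <= tolerance:
            rows[-1][1].append((x0, text))
        else:
            rows.append([centre, [(x0, text)]])
    return [" ".join(t for _, t in sorted(row[1])) for row in rows]

# Assumed pattern: a key, a colon, then the value on the same line.
KEY_VALUE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")

def parse_key_values(lines):
    """Return a dict of the lines that look like 'key: value'."""
    pairs = {}
    for line in lines:
        match = KEY_VALUE.match(line)
        if match:
            pairs[match.group(1)] = match.group(2)
    return pairs

def extract_from_image(image_path, east_model_path):
    """Detect text with EAST, read each region with Tesseract, return key-value pairs."""
    # Imported here so the pure helpers above work without these packages installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]

    detector = cv2.dnn.TextDetectionModel_EAST(east_model_path)
    detector.setConfidenceThreshold(0.5)  # illustrative threshold
    detector.setNMSThreshold(0.4)         # illustrative threshold
    # EAST expects input dimensions that are multiples of 32.
    detector.setInputParams(scale=1.0, size=(320, 320),
                            mean=(123.68, 116.78, 103.94), swapRB=True)
    quads, _confidences = detector.detect(image)

    words = []
    for quad in quads:
        box = quad_to_box(quad, width, height)
        x0, y0, x1, y1 = box
        crop = cv2.cvtColor(image[y0:y1, x0:x1], cv2.COLOR_BGR2RGB)
        # --psm 7 tells Tesseract to treat the crop as a single text line.
        text = pytesseract.image_to_string(crop, config="--psm 7").strip()
        if text:
            words.append((box, text))
    return parse_key_values(group_into_lines(words))
```

EAST typically returns word-level boxes, which is why the sketch regroups words into rows before looking for a key and its value. A call such as `extract_from_image("invoice.png", "frozen_east_text_detection.pb")` would use hypothetical file names; substitute your own paths.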
- Pytesseract
- Flask
- OpenCV
- Pillow
- Pdf2img
- EAST text detection model
pip install -r requirements.txt
python app.py
The project currently achieves only 20% accuracy on the test set, but this can be improved by training a custom text detection model, adding language support, and tweaking the image preprocessing.
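How the accuracy figure is computed is not stated here. One plausible definition, shown below purely as an assumption, is the fraction of expected fields whose extracted value matches the ground truth; `field_accuracy` and the sample field names are hypothetical.

```python
def field_accuracy(predicted, expected):
    """Fraction of expected fields whose predicted value matches,
    ignoring surrounding whitespace and letter case."""
    if not expected:
        return 0.0

    def norm(value):
        return str(value).strip().casefold()

    correct = sum(
        1 for key, value in expected.items()
        if key in predicted and norm(predicted[key]) == norm(value)
    )
    return correct / len(expected)
```

Tracking a number like this per invoice makes it easier to tell whether a change to preprocessing or to the detection model actually helps.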
- Deployed on Google Compute Engine; upload your PDF to test
This project is licensed under the MIT License - see the LICENSE.md file for details