System that uses OCR (Optical Character Recognition) to extract data from invoice photos (e.g. products, supermarket, prices), and displays it in a dashboard.

dinispeixoto/invoice-scanner

Invoice Scanner

This project is the result of a hackathon whose objective was to create a system that extracts data from invoice photos (products bought, supermarket, prices, etc.) and displays the extracted data in a dashboard.

Scanning the invoices

To detect the text on the invoice photos we relied on tesseract. Before running it, we used imagemagick to improve the quality of the photos by cropping the image, reducing the brightness, increasing the contrast, reducing to a gray colorspace and sharpening the edges. The cropping is done using Fred's multicrop script.
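
The preprocessing and OCR steps described above could be chained roughly as follows. This is only a sketch, not the code used in the project: the function names, the `-10x30` brightness/contrast values, the `0x1` sharpen radius and the `eng` language default are all assumptions chosen for illustration, and the multicrop step is left out because its exact invocation is not given here. It assumes the `convert` (ImageMagick) and `tesseract` binaries are on the `PATH`.

```python
import subprocess
from pathlib import Path


def preprocess_command(src: Path, dst: Path) -> list[str]:
    """Build an ImageMagick `convert` call for the clean-up steps.

    The numeric values are illustrative assumptions, not the project's settings.
    """
    return [
        "convert", str(src),
        "-brightness-contrast", "-10x30",  # lower brightness, raise contrast
        "-colorspace", "Gray",             # reduce to a gray colorspace
        "-sharpen", "0x1",                 # sharpen the edges
        str(dst),
    ]


def ocr_command(image: Path, lang: str = "eng") -> list[str]:
    """Build a tesseract call that writes the recognised text to stdout."""
    return ["tesseract", str(image), "stdout", "-l", lang]


def scan_invoice(photo: Path, workdir: Path) -> str:
    """Preprocess one photo and return the text tesseract finds in it."""
    cleaned = workdir / (photo.stem + "_clean.png")
    subprocess.run(preprocess_command(photo, cleaned), check=True)
    result = subprocess.run(
        ocr_command(cleaned), check=True, capture_output=True, text=True
    )
    return result.stdout
```

Calling `scan_invoice(Path("invoice.jpg"), Path("/tmp"))` would then return the raw OCR text for a single photo. Keeping the command construction separate from the `subprocess` calls makes it easy to tweak the image settings without touching the rest.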

The main issues in this step were that the invoices we had on us at the time were already old and battered, and that tesseract was trained on books. Furthermore, the cropping step takes too long.

Invoice example

Kibana dashboard

The data from the various scanned receipts is collected and presented in a Kibana dashboard. This way you can keep track of your purchases and where you spend your money.
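
Kibana reads its data from Elasticsearch, so the OCR output has to be turned into documents first. The sketch below shows one possible way to do that, under stated assumptions: the `product  price` line layout, the field names, the `invoices` index name and both helper functions are hypothetical and not taken from the project.

```python
import json
import re

# Assumed receipt layout: a product name, whitespace, then a price such as
# "0,89" or "1.20" at the end of the line.
LINE_RE = re.compile(r"^(?P<name>.+?)\s+(?P<price>\d+[.,]\d{2})$")


def parse_items(ocr_text: str) -> list[dict]:
    """Extract product/price pairs from raw OCR text, skipping other lines."""
    items = []
    for raw in ocr_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if match is None:
            continue
        price = float(match["price"].replace(",", "."))
        items.append({"product": match["name"], "price": price})
    return items


def to_bulk_payload(items: list[dict], supermarket: str, index: str = "invoices") -> str:
    """Serialise items as newline-delimited JSON for the Elasticsearch bulk API."""
    lines = []
    for item in items:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps({**item, "supermarket": supermarket}))  # document
    return "".join(line + "\n" for line in lines)
```

Note that a line such as `TOTAL 2,09` would also match this pattern and be treated as a product, so a real parser would need extra rules for totals, taxes and OCR noise.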

Kibana screenshot
