Richard-Rumpel/receipt-parser

 
 

A fuzzy receipt parser written in Python

This is a fuzzy receipt parser written in Python. It extracts information such as the shop, the date, and the total from receipts. It can run as a standalone script or as part of the iOS and Android application.
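To give a rough idea of what "fuzzy" means here, the following is a minimal sketch that tolerates OCR errors when looking for the shop, the date, and the total. It is not the library's actual API: the names `KNOWN_SHOPS`, `find_shop`, `find_date`, and `find_total`, the keyword list, and the 0.7 similarity cutoff are all assumptions made for illustration, and the sketch uses only the standard-library `difflib` and `re` modules.

```python
import difflib
import re

# Hypothetical shop list for illustration; not taken from the project.
KNOWN_SHOPS = ["rewe", "aldi", "lidl", "penny"]


def find_shop(lines, cutoff=0.7):
    """Return the first known shop that approximately matches a word."""
    for line in lines:
        for word in line.lower().split():
            match = difflib.get_close_matches(word, KNOWN_SHOPS, n=1, cutoff=cutoff)
            if match:
                return match[0]
    return None


def find_date(text):
    """Return the first date-like token such as 12.03.2016, or None."""
    found = re.search(r"\b(\d{1,2}[./-]\d{1,2}[./-]\d{2,4})\b", text)
    return found.group(1) if found else None


def find_total(lines, keywords=("total", "summe", "sum"), cutoff=0.7):
    """Return the last amount on the first line that fuzzily mentions a total."""
    for line in lines:
        words = line.lower().split()
        if any(difflib.get_close_matches(w, keywords, n=1, cutoff=cutoff) for w in words):
            amounts = re.findall(r"\d+[.,]\d{2}", line)
            if amounts:
                # Accept both decimal comma and decimal point.
                return float(amounts[-1].replace(",", "."))
    return None


# Sample OCR output with typical misreads ("RFWE", "T0TAL").
sample = "RFWE Markt GmbH\nDatum: 12.03.2016\nMilch 0,99\nT0TAL 12,49"
sample_lines = sample.splitlines()
shop = find_shop(sample_lines)
date = find_date(sample)
total = find_total(sample_lines)
```

With the sample text above, the misread words "RFWE" and "T0TAL" are still close enough to "rewe" and "total" to be recognized, which is the point of matching approximately rather than exactly.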

History

This project started as a hackathon idea. Read more about it on the trivago techblog, and see the comments on Hacker News. There is also a talk about the project. The library is now available on PyPI.

Dependencies

The receipt-parser-core library depends on ImageMagick. Please install ImageMagick with your favorite package manager.

Usage

To convert all images from the data/img/ folder to text using tesseract and parse the resulting text files, run

make run
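As a rough illustration of the OCR step, the sketch below runs the `tesseract` command-line tool on every image in a folder and collects the resulting text files. It is an assumption about how such a pipeline could look, not a description of what the Makefile actually does: the helper names `ocr_command` and `ocr_folder` and the output folder `data/txt` are hypothetical, and any ImageMagick preprocessing is left out. It relies only on the fact that `tesseract <image> <outputbase>` writes its result to `<outputbase>.txt`.

```python
import subprocess
from pathlib import Path

# File extensions treated as receipt images (an assumption for this sketch).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def ocr_command(image, out_dir):
    """Build the tesseract call for one image.

    tesseract appends '.txt' to the output base name itself.
    """
    image = Path(image)
    return ["tesseract", str(image), str(Path(out_dir) / image.stem)]


def ocr_folder(img_dir="data/img", out_dir="data/txt"):
    """Run tesseract on every image in img_dir and return the text file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    text_files = []
    for image in sorted(Path(img_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # check=True raises CalledProcessError if tesseract fails.
        subprocess.run(ocr_command(image, out), check=True)
        text_files.append(out / (image.stem + ".txt"))
    return text_files
```

Calling `ocr_folder()` requires `tesseract` to be installed and on the `PATH`; the returned text files can then be handed to the parser.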

Docker

A Dockerfile is available with all dependencies needed to run the program.
To build the image, run

make docker-build

To run it on the sample files, try

make docker-run

By default, running the image executes the make run command. To use it with your own images, mount your image folder into the container:

docker run -v <path_to_input_images>:/usr/src/app/data/img mre0/receipt_parser

