Extract data from receipts/bills and enrich OFF with prices notion #65

Open
devingfx opened this issue Jan 20, 2021 · 3 comments

devingfx commented Jan 20, 2021

I read somewhere that table recognition is on the roadmap.
Once it is ready, scanned bills or invoices could be used to extract product prices by brand, store, and date.

With shared price information, price comparators and other apps would become possible:

  • See price over time of a product
  • Compare stores' margins
  • Detect price fluctuations
  • Compare average prices across categories
  • Compare average category prices across countries
  • An app that works out the best pair of stores for the grocery list you need for your next dinner, built from the recurring products in my recent receipt scans
    ... and so on
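
To make the first few ideas concrete, here is a minimal sketch of what "price over time" and "detect price fluctuations" could look like once prices are shared. The record shape `(product, store, date, price)`, the function names, and the 10% threshold are all assumptions made for illustration; nothing here reflects an existing OFF data model.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def price_history(records, product):
    """Return (date, price) observations for one product, oldest first.

    `records` is an iterable of hypothetical (product, store, date, price) tuples.
    """
    return sorted((d, p) for prod, _store, d, p in records if prod == product)


def fluctuations(history, threshold=0.10):
    """Return (date, relative_change) for consecutive observations whose
    relative price change is at least `threshold` (assumed default: 10%)."""
    flagged = []
    for (_d0, p0), (d1, p1) in zip(history, history[1:]):
        if p0 == 0:
            continue  # avoid dividing by zero on a bad record
        change = (p1 - p0) / p0
        if abs(change) >= threshold:
            flagged.append((d1, round(change, 3)))
    return flagged


def average_by_store(records, product):
    """Return {store: mean price} for one product."""
    by_store = defaultdict(list)
    for prod, store, _d, price in records:
        if prod == product:
            by_store[store].append(price)
    return {store: mean(prices) for store, prices in by_store.items()}
```

Calling `fluctuations(price_history(records, "pasta 500g"))` would list the dates where the price moved by 10% or more compared with the previous observation.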

Privacy would probably need to be discussed, though!
One option could be a mix of anonymous price data and a way to keep the pictures, OCR output, and extracted data local on the user's device (i.e. let app developers run OCR locally).

My 2 cents

@devingfx
Author

I developed a draft bill data extractor that works on PDFs: it detects header and footer information (store, address, date, SIRET, etc.) and then parses the table rows to CSV.

The PDF files are currently generated by an external app, TextFairy, which uses Tesseract to extract the text and its positioning.
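
As a rough illustration of the "header info, then table rows to CSV" steps, here is a minimal sketch that starts from text lines already extracted by OCR. It is not the draft extractor itself: the regular expressions, the "first non-empty line is the store name" rule, the DD/MM/YYYY date format, and the function names are all assumptions, and real receipts vary far more than this handles.

```python
import csv
import io
import re

# Hypothetical patterns; real receipts differ a lot between stores.
DATE_RE = re.compile(r"\b(\d{2})/(\d{2})/(\d{4})\b")
SIRET_RE = re.compile(r"\bSIRET\s*:?\s*((?:\d\s?){14})", re.IGNORECASE)
# An item row: a label followed by a price at the end of the line.
ITEM_RE = re.compile(r"^(?P<label>.+?)\s+(?P<price>\d+[.,]\d{2})\s*(?:€|EUR)?$")


def parse_receipt(lines):
    """Split OCR text lines into header info and (label, price) item rows."""
    header = {"store": None, "date": None, "siret": None}
    items = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if header["store"] is None:
            header["store"] = line  # assumption: first non-empty line is the store
            continue
        match = SIRET_RE.search(line)
        if match:
            header["siret"] = re.sub(r"\s", "", match.group(1))
            continue
        match = DATE_RE.search(line)
        if match and header["date"] is None:
            day, month, year = match.groups()
            header["date"] = f"{year}-{month}-{day}"
            continue
        match = ITEM_RE.match(line)
        if match and not line.upper().startswith("TOTAL"):
            price = float(match.group("price").replace(",", "."))
            items.append((match.group("label").strip(), price))
    return header, items


def to_csv(header, items):
    """Serialize item rows to CSV text, repeating store and date on each row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["store", "date", "label", "price"])
    for label, price in items:
        writer.writerow([header["store"], header["date"], label, f"{price:.2f}"])
    return buf.getvalue()
```

Using the positioning data from Tesseract instead of plain regexes would likely be more robust for multi-column receipts, but that is beyond this sketch.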

I found OpenFoodFacts while searching for a way to get product information from a "partial general name" + store.

Member

teolemon commented Sep 9, 2021

@devingfx We made this prototype during a hackathon: https://github.com/openreceipts/openreceipts-server

@teolemon changed the title from "Extract data from bills and enrich OFF with prices notion" to "Extract data from receipts/bills and enrich OFF with prices notion" Sep 9, 2021
@raphael0202
Contributor

@devingfx I'm not sure whether you're still interested in the subject, but we've launched Open Prices (https://prices.openfoodfacts.org), a crowdsourced database of food product prices around the world.
Having ML automatically extract data from receipts and price tags would help tremendously.

@teolemon added the ✨ enhancement label May 11, 2024