Using this repository you can extract invoices from your Gmail account as PDF files and sort them based on their dates. It is useful when you want to submit your expenses to your accountant/tax authorities for VAT declaration.
There are two types of invoices that I usually get:
- A PDF file that is attached to the email
- An HTML-based invoice, which is the body of the email itself
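
To illustrate how the two types can be told apart, here is a minimal sketch (not the code this repository actually uses). It assumes the message is available as a Gmail API `payload` dictionary with `mimeType`, `filename`, and nested `parts` fields; the function names `iter_parts` and `invoice_kind` are hypothetical.

```python
from typing import Iterator


def iter_parts(payload: dict) -> Iterator[dict]:
    """Yield the payload itself and every nested MIME part, depth first."""
    yield payload
    for part in payload.get("parts", []):
        yield from iter_parts(part)


def invoice_kind(payload: dict) -> str:
    """Return 'pdf_attachment', 'html_body', or 'unknown' for one message payload."""
    parts = list(iter_parts(payload))
    # A PDF attachment takes priority: such emails often carry an HTML body too.
    for part in parts:
        filename = part.get("filename") or ""
        if part.get("mimeType") == "application/pdf" or filename.lower().endswith(".pdf"):
            return "pdf_attachment"
    # No PDF found: fall back to the HTML body, if there is one.
    if any(part.get("mimeType") == "text/html" for part in parts):
        return "html_body"
    return "unknown"
```

Checking for the attachment first is a design choice in this sketch: an email with both a PDF and an HTML body is treated as the first type.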
The second type was a bit tricky to convert to PDF: since there are no pure Python implementations that convert text to a nicely formatted PDF, I used `libreoffice`, which is installed in the Docker image we are using.
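
As a rough illustration of this conversion step, the sketch below calls LibreOffice in headless mode from Python. It is an assumption about how such a call could look, not the repository's actual code: the helper names are hypothetical, the binary is assumed to be on the `PATH` as `libreoffice`, and some setups may need an explicit export filter instead of plain `pdf`.

```python
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(html_path: str, out_dir: str, binary: str = "libreoffice") -> List[str]:
    """Build the headless LibreOffice command that converts one file to PDF."""
    return [binary, "--headless", "--convert-to", "pdf", "--outdir", out_dir, html_path]


def convert_html_to_pdf(html_path: str, out_dir: str) -> Path:
    """Run the conversion and return the path where the PDF is expected."""
    # check=True raises CalledProcessError if LibreOffice exits with an error.
    subprocess.run(build_convert_command(html_path, out_dir), check=True)
    # LibreOffice names the output after the input file, with a .pdf extension.
    return Path(out_dir) / (Path(html_path).stem + ".pdf")
```

Keeping the command construction separate from the `subprocess` call makes it possible to inspect or log the command without LibreOffice being installed.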
- You need to enable your Gmail API.
- Then you need to create a credentials file as JSON and put it under the `config` directory.
- Now make sure you create Gmail filters that apply appropriate labels to the emails you want to extract invoices from. While creating these filters you can use the `from` keyword or `has the words`, and in the end make sure to apply a new label for this filter. You can test your filter by clicking on the label that you created under Labels in your Gmail UI.
- Now modify `config/data.json` to include the labels you want. Indicate the type of the invoice for each label. You also need to indicate the time period for which you want to fetch these invoices.
- Run `docker-compose up`.
- This will generate a URL for the OAuth step for your Gmail account. Authorize the application and let the script do its job.
- You will have all the invoices stored under the `/data` directory.
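
To make the role of the labels and the time period more concrete, here is a minimal sketch of how they could be turned into a Gmail search query. This is an illustration under assumptions, not the repository's implementation: `build_query` is a hypothetical helper, the label is assumed to contain no spaces, and only the standard Gmail search operators `label:`, `after:`, `before:`, and `from:` are used.

```python
from datetime import date
from typing import Optional


def build_query(label: str, start: date, end: date, sender: Optional[str] = None) -> str:
    """Combine a label and a date range into one Gmail search string."""
    # Gmail search accepts dates in YYYY/MM/DD form.
    terms = [f"label:{label}", f"after:{start:%Y/%m/%d}", f"before:{end:%Y/%m/%d}"]
    if sender:
        # Optional extra restriction, mirroring the `from` keyword of a filter.
        terms.append(f"from:{sender}")
    return " ".join(terms)


if __name__ == "__main__":
    print(build_query("invoices", date(2023, 1, 1), date(2023, 4, 1)))
```

A string like this can be pasted into the Gmail search box to check which emails a label and period would cover, or passed as the `q` parameter when listing messages through the Gmail API.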
We welcome contributors who want to improve this project or add new features. Open items:
- Better text to PDF formatting
DISCLAIMER: This repository was initially intended for my personal use only; it is not implemented efficiently, nor has it been tested thoroughly. If you use this repository, do so at your own risk and make sure you know what you are doing.