Ever wanted to use Python and basic data analysis skills to help do your taxes?
I'll show you how it's possible using a few features of the popular personal finance service Mint.com, in conjunction with your favorite data analysis package. Mint is a service that syncs with your financial statements from various credit cards and banks, and allows you to view your transactions from across multiple accounts in one unified dashboard.
For the purposes of this tutorial, I will use the Python programming language and the Pandas data manipulation library, but R or Excel could be great alternatives if you are more familiar with those. This project assumes you will obtain a Mint transactions CSV export, with example code snippets in .py and IPython/Jupyter notebook form.
THIS DEMO IS DISTRIBUTED FOR EDUCATIONAL PURPOSES ONLY. IT IS NOT TAX-RELATED ADVICE OR FILING SOFTWARE.
You will need a Mint.com account with any bank or credit card accounts you'd like to analyze connected to it. If you do not have an account yet, set one up if you wish to follow along with your own data.
This tutorial's syntax assumes you're using Python 2.7.
You will need the following Python package installed: pandas. This can be done in one fell swoop using the requirements.txt file provided, by running the command pip install -r requirements.txt.
Go to your Mint Transactions tab; it should look something like this:
Next, to export your transaction history, scroll to the bottom of the tab and click the export link, which will prompt your browser to download a transactions.csv file. Move it to your workspace / working directory.
Jupyter: run jupyter notebook in your working directory (the project directory with these scripts), which should let you open review_transactions.ipynb in your browser window and begin exploring your transactions log.
Python: run python review_transactions.py
Pandas is a data manipulation library which allows for the creation of in-memory DataFrame objects.
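As a rough illustration of creating such a DataFrame from a CSV export, here is a minimal sketch. It is not necessarily what review_transactions.py does. The sample rows, the column names (Date, Description, Amount, Transaction Type, Category), the date format, and the load_transactions helper are all assumptions made for the example; check the header row of your own transactions.csv before relying on them.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for a real transactions.csv export.
# The column names and date format are assumptions about the export's layout.
SAMPLE_CSV = u"""Date,Description,Amount,Transaction Type,Category
01/05/2016,Coffee Shop,4.50,debit,Coffee Shops
01/15/2016,Paycheck,2000.00,credit,Paycheck
02/01/2016,Office Supply Store,35.20,debit,Office Supplies
"""


def load_transactions(source):
    """Read a transactions CSV (a path or file-like object) into a DataFrame."""
    df = pd.read_csv(source)
    # Convert the date strings into real datetime values.
    df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")
    return df


# With a real export this would be load_transactions("transactions.csv").
transactions = load_transactions(io.StringIO(SAMPLE_CSV))
print(transactions.dtypes)
```

Printing the dtypes is a quick way to confirm that amounts were read as numbers and dates as datetimes rather than plain strings.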
DataFrames allow for quick access to views of your original DataFrame with various query predicates (aka attribute filtering) applied. In this example, the transactions are the rows, with Dates, Descriptions, Amounts, etc. as attributes.
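To make the idea of a query predicate concrete, the sketch below filters a small, made-up DataFrame. The rows and the Transaction Type column are assumptions for illustration only; your export may use different column names.

```python
import pandas as pd

# Hypothetical transactions; the data and column names are illustrative only.
transactions = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2016-01-05", "2016-01-15", "2016-02-01", "2016-03-10"]
    ),
    "Description": [
        "Coffee Shop", "Paycheck", "Office Supply Store", "Coffee Shop",
    ],
    "Amount": [4.50, 2000.00, 35.20, 5.25],
    "Transaction Type": ["debit", "credit", "debit", "debit"],
})

# A predicate is a boolean Series holding one True/False value per row.
is_debit = transactions["Transaction Type"] == "debit"
over_five = transactions["Amount"] > 5.00

# Indexing with a predicate returns only the matching rows; the original
# DataFrame is left unchanged.
debits = transactions[is_debit]

# Predicates combine with & (and) and | (or).
large_debits = transactions[is_debit & over_five]
print(large_debits)
```

Keeping each predicate in its own named variable makes it easy to reuse and combine filters as the questions you ask of the data get more specific.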
This allows for statistical manipulations on different data attribute types (integers, floats, strings). In this tutorial, you will specifically see some string attribute searches, and numeric summations in action.
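A string search combined with a numeric summation might look like the following sketch. The sample rows and the total_matching helper are hypothetical, and the keyword is only an example; substitute whatever descriptions matter for your own records.

```python
import pandas as pd

# Hypothetical transactions; the data is illustrative only.
transactions = pd.DataFrame({
    "Description": [
        "Coffee Shop", "Paycheck", "Office Supply Store", "COFFEE SHOP #2",
    ],
    "Amount": [4.50, 2000.00, 35.20, 5.25],
})


def total_matching(df, keyword):
    """Sum the Amount column over rows whose Description contains keyword."""
    # case=False ignores letter case, na=False treats missing descriptions
    # as non-matches, and regex=False searches for the literal text.
    matches = df["Description"].str.contains(
        keyword, case=False, na=False, regex=False
    )
    return df.loc[matches, "Amount"].sum()


coffee_total = total_matching(transactions, "coffee")
print(coffee_total)
```

Searching for literal text rather than a regular expression avoids surprises when a description contains characters such as # or *, which are common in merchant names.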