### Vendor payments made by the City of Phoenix

#### The City of Phoenix publishes vendor payment information from 2014 onward. I'm interested in understanding how categories of spending change from year to year. Each year has several hundred thousand vendor payments (roughly 330,000 to 530,000 rows in the years loaded below), so this is a fun opportunity for me to work with data at a larger scale and work through the issues that come with data sets of this size.

### 2017 Data

In [8]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
import seaborn as sns
import datetime


# 2017 extracts, listed in calendar order
files = [
    'data/2017/january-june-2017.csv',
    'data/2017/july-september-2017.csv',
    'data/2017/october-2017.csv',
    'data/2017/november-2017.csv',
    'data/2017/december-2017.csv',
]

phx_ven_pay_17 = pd.concat([pd.read_csv(f) for f in files], sort=True)
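
One side effect of calling `pd.concat` this way is that every file keeps its own row index, so index labels repeat across files. A minimal sketch of an alternative loader follows. The helper name `load_year` and the extra `source_file` column are my own additions for illustration, not part of the original notebook, and the extra column would make `.shape` report one more column than shown in this notebook.

```python
import pandas as pd


def load_year(paths):
    """Read several CSV extracts and stack them into one frame.

    Hypothetical helper: `paths` can hold file paths or open file-like objects.
    """
    frames = []
    for path in paths:
        frame = pd.read_csv(path)
        # Record where each row came from; handy when files disagree on columns.
        frame["source_file"] = str(path)
        frames.append(frame)
    # ignore_index=True builds a fresh 0..n-1 index instead of repeating
    # each file's own row numbers; sort=True orders the columns by name.
    return pd.concat(frames, ignore_index=True, sort=True)
```

Used as `phx_ven_pay_17 = load_year(files)`, this keeps the same rows while making the origin of each row easy to check.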

In [9]:
phx_ven_pay_17.shape

(487380, 9)

In [10]:
phx_ven_pay_17.head()

| | Invoice Net Amt | Check/Payment Date | Commitmt Item Name | Dept. Descrptn | Document Nbr | Fund Center | Invoice Net Amt.1 | Vendor ID Number | Vendor Name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 400.0 | 1/9/2017 | Percent Arts-Prf Svc | Office of Arts and Culture | 1905025000.0 | AR66000039 | | 3521555.0 | JOHNSON, GARTH W |
| 1 | 5153.6 | 1/27/2017 | Spec Contractual Svc | Human Services | 5200546000.0 | 8980150005 | | 3072374.0 | 1 N 10 INC |
| 2 | 2400.0 | 1/27/2017 | Spec Contractual Svc | Human Services | 5200546000.0 | 8980150005 | | 3072374.0 | 1 N 10 INC |
| 3 | 5153.6 | 1/27/2017 | Spec Contractual Svc | Human Services | 5200546000.0 | 8980150005 | | 3072374.0 | 1 N 10 INC |
| 4 | 2610.61 | 1/27/2017 | Spec Contractual Svc | Human Services | 5200546000.0 | 8980150005 | | 3072374.0 | 1 N 10 INC |


### 2016 Data

In [17]:
files = [
    'data/2016/citycheckbookjantojune2016.csv',
    'data/2016/citycheckbookjulytodec2016.csv',
]

phx_ven_pay_16 = pd.concat([pd.read_csv(f) for f in files], sort=True)

In [18]:
phx_ven_pay_16.head()

| | Amount | Date | Department | Description | G/L Description | Vendor Display |
|---|---|---|---|---|---|---|
| 0 | $855 | 01-04-2016 | Community & Economic Development | Other Commodities | | ABM PARKING SERVICES |
| 1 | $471 | 01-04-2016 | Community & Economic Development | Other Commodities | | ABM PARKING SERVICES |
| 2 | $813 | 01-04-2016 | Community & Economic Development | Other Commodities | | ABM PARKING SERVICES |
| 3 | $578.92 | 01-04-2016 | Fire | Plumbing Services | | ABOVE ALL PLUMBING SERVICES INC |
| 4 | $2,712.04 | 01-04-2016 | Fire | Plumbing Services | | ABOVE ALL PLUMBING SERVICES INC |


In [20]:
phx_ven_pay_16.dtypes

Amount             object
Date               object
Department         object
Description        object
G/L Description    object
Vendor Display     object
dtype: object

In [26]:
phx_ven_pay_16.shape

(533015, 6)
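
Every column here is `object`, which is the most memory-hungry way to hold half a million rows of repeated text such as department names. As a rough sketch of one way to check and reduce that cost, the snippet below measures memory with `memory_usage(deep=True)` and converts a repetitive text column to pandas' `category` dtype. The helper name `shrink_text_columns` and the sample frame are made up for illustration.

```python
import pandas as pd


def shrink_text_columns(df, columns):
    """Return a copy with the given text columns stored as 'category' (sketch)."""
    out = df.copy()
    for col in columns:
        # Each distinct string is stored once; rows hold small integer codes.
        out[col] = out[col].astype("category")
    return out


# Made-up sample that mimics a column with few distinct values.
sample = pd.DataFrame({
    "Department": ["Fire", "Aviation"] * 500,
    "Amount": ["$1"] * 1000,
})
before = sample.memory_usage(deep=True).sum()
after = shrink_text_columns(sample, ["Department"]).memory_usage(deep=True).sum()
print(f"bytes before: {before}, after: {after}")
```

The saving depends on how few distinct values a column has, so 'Department' and 'Description' are likelier candidates than 'Vendor Display'.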

### To clean up:
+ The 'Amount' column stores values as strings with a leading dollar sign and comma thousands separators (e.g. "$2,712.04"); both must be removed before the values can be used as numbers.
+ Convert 'Date' from object to datetime.
+ 'Commitmt Item Name' does not appear in this dataset, but it is present in 2017 and later years.
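
The first two cleanup items could be handled along these lines. This is a sketch rather than the notebook's actual cleaning code: the function name `clean_checkbook` is hypothetical, and the date format `%m-%d-%Y` is an assumption based on the sample rows (month first, as in the 2017 files). Values that fail to parse, such as amounts written in parentheses, become `NaN`/`NaT` instead of raising.

```python
import pandas as pd


def clean_checkbook(df):
    """Return a copy with numeric 'Amount' and datetime 'Date' (sketch)."""
    out = df.copy()
    # Strip '$' and ',' so "$2,712.04" becomes "2712.04", then parse as a number.
    out["Amount"] = pd.to_numeric(
        out["Amount"].astype(str).str.replace(r"[$,]", "", regex=True),
        errors="coerce",
    )
    # ASSUMPTION: dates are MM-DD-YYYY, e.g. "01-04-2016" is January 4, 2016.
    out["Date"] = pd.to_datetime(out["Date"], format="%m-%d-%Y", errors="coerce")
    return out


# Sample values copied from the head() output above.
sample = pd.DataFrame({
    "Amount": ["$855", "$2,712.04"],
    "Date": ["01-04-2016", "01-04-2016"],
})
cleaned = clean_checkbook(sample)
print(cleaned.dtypes)
```

Counting the missing values afterwards with `cleaned["Amount"].isna().sum()` is a quick way to see how many rows did not match the assumed formats.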

### 2015 Data

In [21]:
files = [
    'data/2015/citycheckbookjantojune2015.csv',
    'data/2015/citycheckbookjulytodec2015.csv',
]

phx_ven_pay_15 = pd.concat([pd.read_csv(f) for f in files], sort=True)

In [22]:
phx_ven_pay_15.head()

| | Amount | Date | Department | Description | Vendor Display |
|---|---|---|---|---|---|
| 0 | $80 | 01-02-2015 | Municipal Court | Interpreters/Transl | A FOREIGN LANGUAGE SERVICE CORP |
| 1 | $1,888.85 | 01-02-2015 | Aviation | Small Tools/ Equip | A TO Z EQUIPMENT RENTALS |
| 2 | $22.46 | 01-02-2015 | Aviation | Motor Vehicle Parts | A TO Z EQUIPMENT RENTALS |
| 3 | $1,973.6 | 01-02-2015 | Aviation | Small Tools/ Equip | A TO Z EQUIPMENT RENTALS |
| 4 | $17.33 | 01-02-2015 | Public Works | Inventories | A-Z LOCK PRODUCTS CO INC |


In [23]:
phx_ven_pay_15.dtypes

Amount            object
Date              object
Department        object
Description       object
Vendor Display    object
dtype: object

In [24]:
phx_ven_pay_15.shape

(333440, 5)
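
The three years use different column sets, so comparing spending across years needs a shared schema first. The sketch below shows one possible approach. The mapping from 2017 column names to the 2015/2016 names is my guess from the sample rows and is not documented anywhere in the data, and `to_shared_schema` is a hypothetical helper.

```python
import pandas as pd

# ASSUMPTION: this pairing of 2017 headers with 2015/2016 headers is inferred
# from the sample rows; treating 'Commitmt Item Name' as the 2017 counterpart
# of 'Description' in particular is unverified.
COLUMN_MAP_2017 = {
    "Invoice Net Amt": "Amount",
    "Check/Payment Date": "Date",
    "Dept. Descrptn": "Department",
    "Commitmt Item Name": "Description",
    "Vendor Name": "Vendor Display",
}
SHARED_COLUMNS = ["Amount", "Date", "Department", "Description", "Vendor Display"]


def to_shared_schema(df, year, column_map=None):
    """Rename columns to the shared names, keep only those, and tag the year."""
    out = df.rename(columns=str.strip)           # tolerate padded CSV headers
    out = out.rename(columns=column_map or {})   # no-op for 2015/2016 frames
    out = out[SHARED_COLUMNS].copy()
    out["Year"] = year
    return out


# One row per year, copied from the head() outputs above.
sample_2017 = pd.DataFrame({
    "Invoice Net Amt": [400.0],
    "Check/Payment Date": ["1/9/2017"],
    "Dept. Descrptn": ["Office of Arts and Culture"],
    "Commitmt Item Name": ["Percent Arts-Prf Svc"],
    "Vendor Name": ["JOHNSON, GARTH W"],
})
sample_2016 = pd.DataFrame({
    "Amount": ["$855"],
    "Date": ["01-04-2016"],
    "Department": ["Community & Economic Development"],
    "Description": ["Other Commodities"],
    "Vendor Display": ["ABM PARKING SERVICES"],
})
combined = pd.concat(
    [
        to_shared_schema(sample_2017, 2017, COLUMN_MAP_2017),
        to_shared_schema(sample_2016, 2016),
    ],
    ignore_index=True,
)
print(combined)
```

Renaming only aligns the headers. 'Amount' and 'Date' still differ in format between years (floats and `1/9/2017` in 2017, `$`-prefixed strings and `01-04-2016` before that), so the cleanup items listed earlier still apply before any totals are computed.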