# Getting & Munging Wash. State Political Contribution Data 

### About the Data
  
This project analyzes contributions made during the election cycles starting in 2007 and continuing through 2017.  

The data comes from the mandatory reports filed by Washington State political campaigns and political action committees (PACs) with the __Washington Public Disclosure Commission (PDC)__. The PDC aggregates the last 10 years of reports and makes them available under a public domain license at this website: [data.wa.gov](https://data.wa.gov/Politics/Contributions-to-Candidates-and-Political-Committe/kv7h-kjye). The dataset is updated nearly every weekday.  
  
Campaign contribution laws are a bit complicated, but in general, this data includes political contributions of more than \$25 made by any person other than the candidate, not including loans. However, if a candidate raises and spends no more than \$5,000 and receives no more than \$500 from any one person (or organization) other than the candidate, they can file a "mini report," which means their records are not included in this dataset. For more information about "full reporting" versus "mini reporting," see the [PDC website](https://www.pdc.wa.gov/learn/publications/candidate-instructions/basic-information/reporting-options-choosing-mini-or-full).  
  
Additionally, this data includes all state-wide, county level, and city level elections along with other local races like the port-authority or school board and initiatives. Essentially, these records cover all races except United States (U.S.) President, U.S. Senate, and U.S. House of Representatives.  
  
The full dataset is over 1.34 GB with 37 columns and more than 3.65 million rows. This size makes data analysis difficult on a typical desktop computer.    
  
Alternatively, you can drop columns and filter rows on `data.wa.gov` before downloading so you get a smaller dataset. Those filtering instructions are included below. But, even when we reduce the size, it's still a big file, so the code for data cleaning and generating smaller CSV files is included in this notebook file.  
  
You can also access the contribution data using the __Data.WA API__, powered by Socrata. To learn how to use the API, this [API Documentation](https://dev.socrata.com/foundry/data.wa.gov/74eq-kst5) provides excellent instructions. However, this notebook only covers the method for downloading a CSV file.
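
For readers curious about the API route, here is a minimal sketch that only builds a query URL and never contacts the server. It assumes the usual Socrata conventions (a `/resource/<dataset-id>.csv` endpoint with `$where` and `$limit` parameters), takes the dataset identifier `74eq-kst5` from the documentation link above, and assumes `election_year` can be compared as a number. All of these should be checked against the API documentation. The helper name `build_query_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: dataset identifier copied from the API documentation link above.
DATASET_ID = "74eq-kst5"
BASE_URL = "https://data.wa.gov/resource/" + DATASET_ID + ".csv"


def build_query_url(first_year, last_year, limit=1000):
    """Build a query URL for contributions in a range of election years."""
    params = {
        # Assumption: election_year is numeric on the API side.
        "$where": "election_year between {} and {}".format(first_year, last_year),
        "$limit": limit,
    }
    return BASE_URL + "?" + urlencode(params)


url = build_query_url(2007, 2017)
print(url)
```

The resulting URL could be passed to `pd.read_csv`, but the rest of this notebook sticks with the manual CSV download.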

## Part 1. Get the Data

### 1. How to Download the Contribution Data
Visit https://data.wa.gov/Politics/Contributions-to-Candidates-and-Political-Committe/kv7h-kjye.   
You should see a page like the screenshot below.  
Click the dropdown menu __Explore Data__ and select __View Data__. 
![Image](images/pdc-data-window.jpeg)  

### 2. How to Remove Columns 
As shown in the screenshot below, to remove unnecessary columns, click on the menu icon in the column header and select __Hide Column__ for each of the following columns:  
 - report_number  
 - origin  
 - first_name   
 - middle_initial 
 - last_name
 - position 
 - jurisdiction_county
 - description
 - memo 
 - primary_general  
 - url  

![Image](images/hide-col.jpeg)
  
  
### 3. How to Filter Rows
Since we are only interested in contributions for election years 2007-2017, we'll create that filter by clicking the __Filter__ tab and then selecting `election_year` is between 2007 and 2017, just like in the screenshot below.  
  
This filter removes all the rows with records for other election years. Note: you can contribute money in 2017 to a campaign for an election that won't occur until 2018 or even 2020. So, don't confuse the values in the `receipt_date` column with the values in the `election_year` column.  
![Image](images/filter.jpeg)  
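
The same row filter can be expressed in pandas after downloading, which also makes the difference between the two columns concrete. This is a small sketch on a made-up DataFrame (the values are illustrative, not taken from the PDC data). It relies on `Series.between`, which is inclusive on both ends by default.

```python
import pandas as pd

# Toy records, made up for illustration.
toy = pd.DataFrame({
    "election_year": [2006, 2016, 2018, 2017],
    "receipt_date": pd.to_datetime(
        ["2006-05-01", "2015-11-20", "2017-03-14", "2017-08-02"]),
})

# Keep rows whose election year falls in 2007-2017, inclusive.
in_range = toy[toy["election_year"].between(2007, 2017)]

# Rows where the money was received in a different calendar year
# than the election it was given for.
early = in_range[in_range["receipt_date"].dt.year != in_range["election_year"]]

print(in_range)
print(early)
```

Note that the 2018 election row is filtered out even though its contribution was received in 2017.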

### 4. Download the CSV File
Finally, download the data as a CSV file by clicking the __Export__ tab and then, under Download As, clicking CSV.  
  
The default name of the downloaded dataset is: `Contributions_to_Candidates_and_Political_Committees.csv`.  
  
If you followed steps 1-3, your CSV file should be approximately 811 MB. This is still too large to upload to a GitHub repository, since GitHub prevents users from pushing files larger than 100 MB. ([GitHub Help Documentation](https://help.github.com/articles/what-is-my-disk-quota/)) So, the code below shows how I created the smaller CSV files currently in the data folder of this repository.
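
Before pushing, it can be handy to check which files would exceed that limit. The sketch below is one way to do it. The helper name `oversized_files` is hypothetical, and treating 100 MB as 100 * 1024 * 1024 bytes is an assumption. The demonstration uses throwaway files and a tiny limit so nothing large is written.

```python
import os
import tempfile

# Assumption: the 100 MB limit is treated as 100 MiB here.
GITHUB_LIMIT_BYTES = 100 * 1024 * 1024


def oversized_files(folder, limit=GITHUB_LIMIT_BYTES):
    """Return (name, size in bytes) for each file in `folder` larger than `limit`."""
    results = []
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path) and os.path.getsize(path) > limit:
            results.append((name, os.path.getsize(path)))
    return results


# Demonstration on temporary files with a 10-byte limit.
with tempfile.TemporaryDirectory() as folder:
    for name, size in [("small.csv", 5), ("big.csv", 50)]:
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("x" * size)
    flagged = oversized_files(folder, limit=10)

print(flagged)
```

In this repository, the call would be something like `oversized_files("data")`.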

## Part 2. Clean the Data

In [1]:
# Import statements for Python 3 libraries
import datetime
import numpy as np
import pandas as pd
import time

from IPython.display import display, Image, HTML
from pandas.io.formats.style import Styler

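# Increase the maximum number of columns displayed to 50.
pd.set_option('display.max_columns', 50)

# Display floating-point values with 2 decimal places.
# (pd.option_context only takes effect inside a `with` block, so use set_option.)
pd.set_option('display.precision', 2)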

# This code displays all results created within a jupyter notebook cell.
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

### 1. Reading CSV File Into a Pandas DataFrame (df)
  
We could just read the CSV file without any additional parameters, but we would have to do additional data munging later. After applying all the parameters, our 811 MB CSV file takes up only 461 MB in memory if we make `election_year` a categorical variable, and 485 MB otherwise.  

When reading a large file into a df, if you don't set the parameter `low_memory=False`, pandas may emit a `DtypeWarning` about columns with mixed types, and depending on how large the file is, you might watch the kernel die.  
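
An alternative that this notebook does not use is to read the file in pieces with the `chunksize` parameter of `pd.read_csv`, which returns an iterator of DataFrames instead of one large df. Here is a minimal sketch, using a small in-memory CSV with made-up rows as a stand-in for the real file:

```python
import io

import pandas as pd

# Stand-in for a large CSV file (made-up rows).
csv_text = "election_year,amount\n2007,10\n2008,20\n2007,30\n2009,40\n2008,50\n"

totals = {}
# With chunksize set, read_csv yields DataFrames of at most 2 rows each.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    per_year = chunk.groupby("election_year")["amount"].sum()
    for year, amount in per_year.items():
        totals[int(year)] = totals.get(int(year), 0) + int(amount)

print(totals)
```

Only one chunk is held in memory at a time, at the cost of having to combine the partial results yourself.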
  
Unless you specify the data type for each column, pandas infers it, and text columns default to the pandas type `object`, which holds Python strings. This is very inefficient. So, if you have a column with only a few different values (typically text values, but could be numeric), set the `dtype` parameter for that column as `category`. [See Categorical Data](http://pandas.pydata.org/pandas-docs/stable/categorical.html). You can have null values in a `category`; they are represented as NumPy NaN (`np.nan`).  

However, per the documentation, categorical data are NOT automatically ordered. You must explicitly pass `ordered=True` to indicate an ordered Categorical. For example, you could make the `election_year` column a categorical variable, but would have to set it to ordered. This has the added benefit of telling pandas not to add those values when you call sum() on a df object since numerical operations don't work for `category` variables. But, I am going to declare `election_year` as an integer.
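
To make the memory argument and the ordering caveat concrete, here is a small sketch on made-up data. The column contents and sizes are illustrative only, so the byte counts will not match the real dataset.

```python
import pandas as pd

# Made-up column: a few distinct values repeated many times.
party = pd.Series(["DEMOCRAT", "REPUBLICAN", "NON PARTISAN"] * 10000)

as_object = party.astype("object")
as_category = party.astype("category")

# deep=True counts the memory used by the Python strings themselves.
object_bytes = as_object.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)
print(object_bytes, category_bytes)

# Categories are unordered unless ordered=True is requested explicitly.
print(as_category.cat.ordered)

years = pd.Series([2012, 2008, 2016])
year_type = pd.CategoricalDtype(categories=[2008, 2012, 2016], ordered=True)
ordered_years = years.astype(year_type)
print(ordered_years.cat.ordered)
print(ordered_years.min())  # min/max are only defined for ordered categories
```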

In [2]:
start = time.time()
# The initial, filtered dataset has 3,635,991 rows and 26 columns.
# I ran into errors reading this CSV file with encoding="utf8", so I'm using "latin-1".
data = pd.read_csv("/Users/erinorbits/Downloads/Contributions_to_Candidates_and_Political_Committees-2.csv",
                   header=0, sep=",", encoding="latin-1", keep_default_na=False,
                   na_values=[0], parse_dates=["receipt_date"], low_memory=False,
                   dtype = {"type": "category",
                            "office": "category",
                            "legislative_district": "category",
                            "party": "category",
                            "for_or_against": "category",
                            "jurisdiction": "category", 
                            "jurisdiction_type": "category",
                            "election_year": "int64",
                            "cash_or_in_kind": "category",
                            "primary_general": "category",
                            "code": "category",
                            "contributor_employer_state": "category"})
end = time.time()
print("\nTotal time to read data into df:", str((end-start)/60), "min.\n")
data.info()


Total time to read data into df: 6.343477348486583 min.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3635991 entries, 0 to 3635990
Data columns (total 26 columns):
id                            object
filer_id                      object
type                          category
filer_name                    object
office                        category
legislative_district          category
party                         category
ballot_number                 object
for_or_against                category
jurisdiction                  category
jurisdiction_type             category
election_year                 int64
amount                        object
cash_or_in_kind               category
receipt_date                  datetime64[ns]
code                          category
contributor_name              object
contributor_address           object
contributor_city              object
contributor_state             object
contributor_zip               object
contributor_occupation     

In [3]:
# To get a list of categories within jurisdiction column
data["jurisdiction"].cat.categories

# Verifying that the categories for jurisdiction are unordered.
data["jurisdiction"].cat.ordered

Index(['', 'ABERDEEN SD 005', 'ANACORTES SD 103', 'APPEALS, COURT OF DIV I',
       'APPEALS, COURT OF DIV II', 'APPEALS, COURT OF DIV III',
       'ARLINGTON SD 016', 'ASOTIN CO', 'ATTORNEY GENERAL, OFFICE OF',
       'AUBURN SD 408 *',
       ...
       'WHATCOM PUD 01', 'WHIDBEY ISL HOSPITAL DIST', 'WHITE RIVER SD 416',
       'WHITMAN CO', 'WOODINVILLE FIRE & RESCUE', 'WOODINVILLE WATER DIST',
       'YAKIMA CO', 'YAKIMA CO DIST CT', 'YAKIMA CO SUPERIOR COURT',
       'YAKIMA FIRE PROT DIST 04'],
      dtype='object', length=500)

False

In [4]:
print("\nvalue counts in the 'election_year' column:")
data["election_year"].value_counts()

print("\nvalue counts in the 'contributor_state' column:")
data["contributor_state"].value_counts()


value counts in the 'election_year' column:


2012    561683
2008    490703
2016    439703
2014    335711
2010    298597
2013    284077
2009    276754
2015    274607
2017    266564
2011    209605
2007    197987
Name: election_year, dtype: int64


value counts in the 'contributor_state' column:


WA    3282023
       148957
CA      35714
OR      26123
ID      11722
NY      11191
TX      10665
DC       7511
FL       7319
IL       6702
VA       6637
GA       5876
MA       5178
PA       4541
AZ       4502
OH       4455
MD       4439
NJ       4304
CO       4079
MI       2898
MN       2875
MO       2842
NC       2607
CT       2381
WV       2233
WI       2011
IN       1928
TN       1843
NM       1510
MT       1358
       ...   
IS          2
3           2
SE          2
CH          2
W           2
ES          2
NB          1
IT          1
AT          1
B.          1
SW          1
OS          1
FM          1
YT          1
5           1
OE          1
DI          1
JP          1
PH          1
GR          1
GE          1
QU          1
NT          1
JA          1
SK          1
WO          1
AU          1
RA          1
LT          1
CE          1
Name: contributor_state, Length: 122, dtype: int64

### 2. Reformatting Column Values
  
We want to clean the data before breaking out subsets of the data into smaller CSV files.  
  
As you can see above, the `amount` column is considered an object (aka string) value because the values have dollar signs. We need to remove the dollar signs and convert the dtype to float so we can sum the values.  

In [5]:
# Strip the dollar sign and then change dtype to float.
data["amount"] = data["amount"].map(lambda x: x.lstrip("$"))
data["amount"] = pd.to_numeric(data["amount"],
                               downcast="float", errors="raise")
data["amount"][0:10]

0      21.0
1     700.0
2    5000.0
3     250.0
4      50.0
5     100.0
6      50.0
7      50.0
8     100.0
9     100.0
Name: amount, dtype: float32

We could also convert `contributor_zip` from an object to an integer, but since we are not going to do arithmetic on zip codes, I didn't bother reformatting that column. Keeping zip codes as strings also preserves any leading zeros.
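
Here is a small sketch of both points on made-up values. It uses the vectorized `.str.lstrip` accessor instead of `map` with a lambda, and compares the default `float64` result of `pd.to_numeric` with the `float32` produced by `downcast="float"`. The sample amounts and zip codes are illustrative only.

```python
import pandas as pd

# Made-up amounts in the same "$" format as the raw column.
raw = pd.Series(["$21.00", "$700.00", "$5000.00"])

stripped = raw.str.lstrip("$")
amount64 = pd.to_numeric(stripped)                    # default: float64
amount32 = pd.to_numeric(stripped, downcast="float")  # smallest float: float32

print(amount64.dtype, amount32.dtype)
print(amount64.sum())

# Made-up zip codes: converting to integers drops the leading zero.
zips = pd.Series(["98101", "02139"])
print(zips.astype("int64").tolist())
```

`float32` halves the memory for the column, but it carries fewer significant digits, so `float64` may be the safer choice when summing millions of dollar amounts.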

### 3. Create Subsets of Data

Since we are concerned about the size of our CSV files, we first create a `candidates` df and drop the columns that are not relevant to the subsets of data listed below.  
  
__governors__ : The set of all state governor races 2007-2017  
  
__atty_generals__ : The set of all state attorney general races 2007-2017  
  
__mayors__ : The set of all city mayoral races 2007-2017  
  
__state_senators__ : The set of all state senate races 2007-2017  

And because we also want to look at contribution amounts over time, we'll create CSV files with the data from each year.

In [6]:
candidates = data[data["type"] == "Candidate"]
# Assign the result of drop() instead of using inplace=True on a slice of
# `data`, which triggers pandas' SettingWithCopyWarning.
candidates = candidates.drop(["type",
                              "legislative_district",
                              "ballot_number",
                              "for_or_against"], axis=1)

governors = candidates[candidates["office"] == "GOVERNOR"]
print("governors df:", str(governors.shape))
governors.to_csv("data/governors.csv",
                 encoding='utf-8', index=False)

atty_generals = candidates[candidates["office"] == "ATTORNEY GENERAL"]
print("atty_generals df:", str(atty_generals.shape))
atty_generals.to_csv("data/atty_generals.csv",
                     encoding='utf-8', index=False)

mayors = candidates[candidates["office"] == "MAYOR"]
print("mayors df:", str(mayors.shape))

# If you only want Seattle Mayors
seattle_mayors = mayors[mayors["jurisdiction"] == "CITY OF SEATTLE"]
print("seattle_mayors df:", str(seattle_mayors.shape))

# Since the total number of rows in mayors is 69,649 we can keep them all
# instead of subsetting just the 32,785 of Seattle mayors
mayors.to_csv("data/mayors.csv",
              encoding='utf-8', index=False)

state_senators = candidates[candidates["office"] == "STATE SENATOR"]
print("state_senators df:", str(state_senators.shape))
state_senators.to_csv("data/state_senators.csv",
                      encoding='utf-8', index=False)

governors df: (330086, 22)
atty_generals df: (33109, 22)
mayors df: (69649, 22)
seattle_mayors df: (32785, 22)
state_senators df: (148623, 22)
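
The subsetting pattern used here (filter rows, drop columns, filter again) can be tried on a toy frame. In this sketch the row filter and the column drop are chained into one expression that returns a new DataFrame, which is one common way to avoid pandas' `SettingWithCopyWarning`. The toy values and the `_toy` names are made up for illustration.

```python
import pandas as pd

# Toy records, made up for illustration.
toy = pd.DataFrame({
    "type": ["Candidate", "Political Committee", "Candidate"],
    "office": ["GOVERNOR", "", "MAYOR"],
    "ballot_number": ["", "1234", ""],
    "amount": [100.0, 250.0, 50.0],
})

# Filter rows, then drop columns; the result is a new DataFrame
# and `toy` itself is left unchanged.
candidates_toy = toy[toy["type"] == "Candidate"].drop(
    columns=["type", "ballot_number"])

governors_toy = candidates_toy[candidates_toy["office"] == "GOVERNOR"]

print(toy.shape, candidates_toy.shape, governors_toy.shape)
```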


In [8]:
display(governors.head(10), atty_generals.head(10),
        state_senators.head(10), mayors.head(10))

Unnamed: 0,id,filer_id,filer_name,office,party,jurisdiction,jurisdiction_type,election_year,amount,cash_or_in_kind,receipt_date,code,contributor_name,contributor_address,contributor_city,contributor_state,contributor_zip,contributor_occupation,contributor_employer_name,contributor_employer_city,contributor_employer_state,contributor_location
57,3763399.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,25.0,Cash,2011-11-26,Individual,SUGGETT SANDRA H.,12519 238TH ST SE,SNOHOMISH,WA,98296,,,,,"(47.7822, -122.06259)"
58,3763400.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,100.0,Cash,2011-11-26,Individual,DORSEY NANCY L.,1517 1ST ST,WENATCHEE,WA,98801,,,,,"(47.42512, -120.33991)"
59,3763401.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,100.0,Cash,2011-11-26,Individual,GRIM G. KEITH,2000 43RD AVE E APT 101,SEATTLE,WA,98112,RETIRED,,,,"(47.63716, -122.27651)"
60,3763402.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,25.0,Cash,2011-11-26,Individual,NELSON THELMA B.,12746 EVANSTON AVE N,SEATTLE,WA,98133,,,,,"(47.72272, -122.35132)"
61,3763403.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,50.0,Cash,2011-11-26,Individual,BODINE JANICE,585 SW MOUNT CEDAR DR,ISSAQUAH,WA,98027,RETIRED,,,,"(47.52707, -122.04664)"
62,3763404.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,100.0,Cash,2011-11-26,Individual,GASS PATRICIA L.,1305 N HIGHLANDS PKWY APT E4,TACOMA,WA,98406,RETIRED,,,,"(47.2628, -122.52276)"
63,3763405.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,50.0,Cash,2011-11-26,Individual,MCGEE DONNA J.,2322 SW 116TH ST,SEATTLE,WA,98146,,,,,"(47.49936, -122.36411)"
64,3763406.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,25.0,Cash,2011-11-26,Individual,FIFIELD CAROL M.,6672 HIGHWAY 291,NINE MILE FALLS,WA,99026,,,,,"(47.88924, -117.65953)"
65,3763407.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,25.0,Cash,2011-11-26,Individual,HANSON ELAINE B.,3703 MCCORMICK ST SE,OLYMPIA,WA,98501,,,,,"(47.01456, -122.87698)"
66,3763408.rcpt,MCKER2 006,MCKENNA ROBERT M,GOVERNOR,REPUBLICAN,"GOVERNOR, OFFICE OF",Statewide,2012,100.0,Cash,2011-11-26,Individual,DILLON JOHN G.,PO BOX 2208,GIG HARBOR,WA,98335,RETIRED,,,,"(47.30572, -122.59775)"


Unnamed: 0,id,filer_id,filer_name,office,party,jurisdiction,jurisdiction_type,election_year,amount,cash_or_in_kind,receipt_date,code,contributor_name,contributor_address,contributor_city,contributor_state,contributor_zip,contributor_occupation,contributor_employer_name,contributor_employer_city,contributor_employer_state,contributor_location
1077,2459995.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,50.0,Cash,2008-06-26,Other,ABBEY C. R.,17745 - 2ND AVE NW,SHORELINE,WA,98177,RETIRED,,,,"(47.75856, -122.36031)"
1078,2459996.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,20.0,Cash,2008-06-23,Other,AHO SUSAN,11652 SE 46TH ST,BELLEVUE,WA,98006,,,,,"(47.56389, -122.18289)"
1079,2459997.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,100.0,Cash,2008-06-29,Other,ALBERG KAY LYNN,2307 NW BLUE RIDGE DR,SEATTLE,WA,98177,AGRICULTURE,AK RESOURCES,SHORELINE,WA,"(47.70337, -122.38502)"
1080,2459998.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,25.0,Cash,2008-06-23,Other,ALLEN ROY.,2725 N HUNT RD,OAK HARBOR,WA,98277,,,,,"(48.3087, -122.61015)"
1081,2459999.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,15.0,Cash,2008-06-23,Other,ANDERSON C. ROBERT,721 S 304TH ST,FEDERAL WAY,WA,98003,,,,,"(47.32938, -122.32423)"
1082,2460000.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,1000.0,Cash,2008-06-27,Other,ANDERSON DAN & PORTIA,22250 SE 50TH STREET,ISSAQUAH,WA,98029,RETIRED,,,,"(47.55892, -122.04313)"
1083,2460001.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,250.0,Cash,2008-06-27,Other,ANDERSON DONALD,11520 MADERA DR SW,LAKEWOOD,WA,98499,ATTORNEY,EISENHOWER & CARLSON LLC,TACOMA,WA,"(47.15274, -122.5259)"
1084,2460002.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,25.0,Cash,2008-06-23,Other,ANDERSON EDWIN,8133 NE 115TH CT,KIRKLAND,WA,98034,,,,,"(47.70385, -122.23219)"
1085,2460003.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,500.0,Cash,2008-06-23,Other,ANDERSON HERMAN,35 COUNTRY CLUB DR SW,TACOMA,WA,98498,RETIRED,,,WA,"(47.13405, -122.54052)"
1086,2460004.rcpt,MCKER 006,MCKENNA ROBERT M,ATTORNEY GENERAL,REPUBLICAN,"ATTORNEY GENERAL, OFFICE OF",Statewide,2008,1.73,Cash,2008-06-29,Other,ANONYMOUS A.,UNKNOWN,UNKNOWN,WA,98004,,,,WA,"(47.62002, -122.20695)"


Unnamed: 0,id,filer_id,filer_name,office,party,jurisdiction,jurisdiction_type,election_year,amount,cash_or_in_kind,receipt_date,code,contributor_name,contributor_address,contributor_city,contributor_state,contributor_zip,contributor_occupation,contributor_employer_name,contributor_employer_city,contributor_employer_state,contributor_location
71,3763413.rcpt,DAMMB 374,DAMMEIER BRUCE F,STATE SENATOR,REPUBLICAN,LEG DISTRICT 25 - SENATE,Legislative,2012,150.0,Cash,2012-09-15,Individual,NASH WALLY,13408 110TH ST CT E,PUYALLUP,WA,98374,PRESIDENT,BO-NASH (NORTH AMERICA) INC.,PUYALLUP,WA,"(47.15572, -122.24984)"
72,3763414.rcpt,DAMMB 374,DAMMEIER BRUCE F,STATE SENATOR,REPUBLICAN,LEG DISTRICT 25 - SENATE,Legislative,2012,100.0,Cash,2012-09-14,Individual,TANABE LANE,19423 1ST AVENUE SOUTH,NORMANDY PARK,WA,98148,,,,,"(47.42812, -122.33578)"
112,3661650.rcpt,NASSC2 110,ROLFES CHRISTINE N,STATE SENATOR,DEMOCRAT,LEG DISTRICT 23 - SENATE,Legislative,2012,250.0,Cash,2012-07-30,Individual,SULLIVAN ELIZABETH,192 GARDEN PARKWAY,WILLIAMSVILLE,NY,14221,RETIRED,,,,"(42.95581, -78.748401)"
156,3763498.rcpt,HATFB 577,HATFIELD BRIAN A,STATE SENATOR,DEMOCRAT,LEG DISTRICT 19 - SENATE,Legislative,2012,300.0,Cash,2012-09-17,Political Action Committee,ABBOTT LABORATORIES EMPLOYEE PAC,100 ABBOTT PARK ROAD,ABBOTT PARK,IL,60064,,,,,"(42.30236, -87.890831)"
157,3763499.rcpt,HATFB 577,HATFIELD BRIAN A,STATE SENATOR,DEMOCRAT,LEG DISTRICT 19 - SENATE,Legislative,2012,250.0,Cash,2012-09-17,Political Action Committee,WA RESTAURANT ASSN PAC,"510 PLUM ST. SE, SUITE 200",OLYMPIA,WA,98501,,,,,"(47.04444, -122.89244)"
158,3763500.rcpt,HATFB 577,HATFIELD BRIAN A,STATE SENATOR,DEMOCRAT,LEG DISTRICT 19 - SENATE,Legislative,2012,500.0,Cash,2012-09-18,Business,PROCTER & GAMBLE COMPANY GOOD GOVERNMENT COMMI...,ONE PROCTER & GAMBLE PLAZA,CINCINNATI,OH,45202,,,,,"(39.103, -84.50658)"
159,3763501.rcpt,HATFB 577,HATFIELD BRIAN A,STATE SENATOR,DEMOCRAT,LEG DISTRICT 19 - SENATE,Legislative,2012,300.0,Cash,2012-09-19,Political Action Committee,WA DAIRY PAC,"575 E. MAIN STREET, SUITE #2",ELMA,WA,98541,,,,,"(47.00522, -123.39266)"
160,3763502.rcpt,HATFB 577,HATFIELD BRIAN A,STATE SENATOR,DEMOCRAT,LEG DISTRICT 19 - SENATE,Legislative,2012,900.0,Cash,2012-09-20,Political Action Committee,WA ST COUNCIL OF FIREFIGHTERS,1069 ADAMS STREET SE,OLYMPIA,WA,98501,,,,,"(47.03855, -122.89749)"
161,3763503.rcpt,HATFB 577,HATFIELD BRIAN A,STATE SENATOR,DEMOCRAT,LEG DISTRICT 19 - SENATE,Legislative,2012,500.0,Cash,2012-09-20,Union,UNITED ASSOCIATION OF JOURNEYMEN & APPRENTICES,8501 ZENITH CT. N.E.,LACEY,WA,98516,,,,,"(47.06918, -122.75989)"
162,3763504.rcpt,HATFB 577,HATFIELD BRIAN A,STATE SENATOR,DEMOCRAT,LEG DISTRICT 19 - SENATE,Legislative,2012,900.0,Cash,2012-09-21,Political Action Committee,CENTURYLINK WA PAC,"1600 - 7TH AVE., RM. 1508",SEATTLE,WA,98191,,,,,"(47.60621, -122.33206)"


Unnamed: 0,id,filer_id,filer_name,office,party,jurisdiction,jurisdiction_type,election_year,amount,cash_or_in_kind,receipt_date,code,contributor_name,contributor_address,contributor_city,contributor_state,contributor_zip,contributor_occupation,contributor_employer_name,contributor_employer_city,contributor_employer_state,contributor_location
1734,1849785.rcpt,KEEND 225,KEENAN DON H,MAYOR,NON PARTISAN,CITY OF BELLINGHAM,Local,2007,320.0,In kind,2007-08-23,Business,RODERICK C BURTON ART & DESIGN,238 N. FOREST STREET,BELLINGHAM,WA,98225,,,,,"(48.73696, -122.49187)"
1769,1849871.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,50.0,Cash,2007-09-09,Individual,CUYKENDALL CLYDIA,4209 AMBER CT SE,OLYMPIA,WA,98501,,,,,"(47.00833, -122.87025)"
1770,1849872.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,50.0,Cash,2007-09-09,Individual,MCPHEE MARGARET,3512 COUNTRY CLUB DR NW,OLYMPIA,WA,98502,,,,,"(47.08272, -122.9357)"
1771,1849873.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,100.0,Cash,2007-09-09,Individual,RAY BEN,8049 68TH LOOP SE,OLYMPIA,WA,98513,,,,,"(46.9863, -122.76787)"
1772,1849874.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,100.0,Cash,2007-09-09,Individual,PARSONS MIKE & STEFANI,4001 BERWICK LN SE,OLYMPIA,WA,98501,,,,,"(46.99347, -122.83494)"
1773,1849875.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,75.0,Cash,2007-09-09,Individual,WILLIAMS BRENDAN,3901 MORTON CT SE,OLYMPIA,WA,98501,,,,,"(47.01254, -122.88119)"
1774,1849876.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,50.0,Cash,2007-09-09,Individual,CARLSON DENNIS & LUCILLE,1517 BRIARWOOD CT NW,OLYMPIA,WA,98502,,,,,"(47.05984, -122.94928)"
1775,1849877.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,100.0,Cash,2007-09-09,Individual,VAN WAGENEN DICK,1503 5TH AVE SE,OLYMPIA,WA,98501,LAWYER,STATE OF WASHINGTON,OLYMPIA,WA,"(47.04495, -122.88162)"
1776,1849878.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,100.0,Cash,2007-09-09,Individual,SHAPIRO SCOTT,2621 2ND AVE #1005,SEATTLE,WA,98121,,,,,"(47.61605, -122.34963)"
1777,1849879.rcpt,MAH D 504,MAH DOUGLAS A,MAYOR,NON PARTISAN,CITY OF OLYMPIA,Local,2007,50.0,Cash,2007-09-09,Individual,AHLF ANGIE,7501 MAGNOLIA CT SE,LACEY,WA,98503,,,,,"(47.01156, -122.78025)"


In [None]:
df = data[["type", # Candidate or Political Committee
           "filer_name",
           "party", # DEMOCRAT, REPUBLICAN, NON PARTISAN
           "jurisdiction", # e.g. 'APPEALS, COURT OF DIV I', 'APPEALS, COURT OF DIV II'
           "election_year", # 2007 - 2017
           "amount",
           "code", # Other, Individual, Political Action Committee, Business, Party
           "contributor_name",
           "contributor_city",
           "contributor_state",
           "contributor_occupation"]]

# Write one CSV file per election year, from 2017 back to 2007.
for year in range(2017, 2006, -1):
    file_name = "data/electyr_" + str(year) + ".csv"
    df[df["election_year"] == year].to_csv(file_name, encoding='utf-8', index=False)
    print(file_name, "saved")
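
An alternative to filtering once per year is `DataFrame.groupby`, which yields each year together with its rows. Here is a minimal sketch on made-up data that writes to a temporary directory as a stand-in for the `data/` folder:

```python
import os
import tempfile

import pandas as pd

# Toy records, made up for illustration.
toy = pd.DataFrame({
    "election_year": [2016, 2017, 2016, 2017, 2015],
    "amount": [25.0, 100.0, 50.0, 75.0, 10.0],
})

written = []
with tempfile.TemporaryDirectory() as out_dir:  # stand-in for data/
    # groupby iterates over (year, rows for that year), sorted by year.
    for year, year_df in toy.groupby("election_year"):
        path = os.path.join(out_dir, "electyr_{}.csv".format(year))
        year_df.to_csv(path, encoding="utf-8", index=False)
        written.append((os.path.basename(path), len(year_df)))

print(written)
```

This version only writes files for years that actually appear in the data, so there is no year range to keep in sync with the filter.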