# Out-of-State-Contributions: Contributors Analysis

How much out-of-state money have contributors donated so far in the 2018 election cycle, and how does that compare with the same point in the 2014 and 2010 cycles?

In [1]:
import numpy as np
import pandas as pd

pd.set_option("display.max_columns", 100)
pd.set_option("display.max_rows", 500)
pd.options.display.float_format = "{:,.2f}".format # Format floats

Import the contributions data.

In [2]:
contributions = pd.read_csv("data/contributions.csv")
contributions.info()



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6947770 entries, 0 to 6947769
Data columns (total 23 columns):
candidate                 object
candidate_id              int64
year                      int64
state                     object
party                     object
election_status           object
contributor               object
amount                    float64
date                      object
contributor_street        object
contributor_city          object
contributor_state         object
contributor_zip           float64
in_out_state              object
no_veto                   object
office                    object
last_day                  object
redistricting_role        object
independent_commission    object
single_house_district     object
standardized_office       object
standardized_status       object
two_year_term             object
dtypes: float64(2), int64(2), object(19)
memory usage: 1.2+ GB
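
The summary above shows `contributor_zip` stored as `float64`, which cannot hold leading zeros. The analysis below does not use that column, so this is only a minimal sketch of how a ZIP column could be read as text instead; the three-row sample is hypothetical and stands in for `data/contributions.csv`.

```python
import io

import pandas as pd

# Hypothetical sample standing in for data/contributions.csv
sample_csv = io.StringIO(
    "contributor,amount,contributor_zip\n"
    "A. Donor,100.0,02139\n"
    "B. Donor,250.0,94105\n"
    "C. Donor,50.0,\n"
)

# Reading the ZIP column as a string keeps leading zeros intact
sample = pd.read_csv(sample_csv, dtype={"contributor_zip": str})
print(sample["contributor_zip"].tolist())
```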


Convert the contribution date and last day columns to datetime data type.

In [3]:
contributions["date"] = pd.to_datetime(contributions["date"], errors="coerce")
contributions["last_day"] = pd.to_datetime(contributions["last_day"], errors="coerce")
contributions.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6947770 entries, 0 to 6947769
Data columns (total 23 columns):
candidate                 object
candidate_id              int64
year                      int64
state                     object
party                     object
election_status           object
contributor               object
amount                    float64
date                      datetime64[ns]
contributor_street        object
contributor_city          object
contributor_state         object
contributor_zip           float64
in_out_state              object
no_veto                   object
office                    object
last_day                  datetime64[ns]
redistricting_role        object
independent_commission    object
single_house_district     object
standardized_office       object
standardized_status       object
two_year_term             object
dtypes: datetime64[ns](2), float64(2), int64(2), object(17)
memory usage: 1.2+ GB
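
With `errors="coerce"`, values that cannot be parsed become `NaT` rather than raising an error, so it may be worth counting how many rows were affected. A small sketch on made-up values (the series below is illustrative, not taken from the dataset):

```python
import pandas as pd

# Illustrative values: one valid date, one unparseable string, one missing value
raw_dates = pd.Series(["2018-03-01", "not a date", None])

parsed = pd.to_datetime(raw_dates, errors="coerce")

# Unparseable and missing entries both end up as NaT
n_unparsed = int(parsed.isna().sum())
print(n_unparsed)
```

The same `isna().sum()` check could be run on `contributions["date"]` and `contributions["last_day"]` after the conversion.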


## Is there a difference in the average out-of-state contribution to Democratic vs. Republican candidates?

Group by year, party, office and in-vs.-out-of-state contribution status and calculate the total contributions, average contributions, median contributions and number of contributions per group.

In [4]:
contributions_by_party_office = contributions.groupby(["year", "party", "standardized_office", "in_out_state"])["amount"].agg([sum, np.average, "median", len]).reset_index()
contributions_by_party_office

Unnamed: 0,year,party,standardized_office,in_out_state,sum,average,median,len
0,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,249405839.52,816.77,100.0,305356.0
1,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,35779114.12,1071.46,250.0,33393.0
2,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,unknown,369604.83,378.69,100.0,976.0
3,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,in-state,272893234.09,414.35,100.0,658609.0
4,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,28680059.56,494.21,250.0,58032.0
5,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,unknown,1342191.95,340.49,100.0,3942.0
6,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,in-state,1001559.24,463.47,300.0,2161.0
7,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,143053.7,529.83,500.0,270.0
8,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,unknown,-19438.72,-1295.91,-75.0,15.0
9,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,in-state,482613987.05,1106.53,100.0,436151.0
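
To make the aggregation step concrete, here is a minimal sketch of the same group-and-summarize pattern on a toy dataframe. The data is invented, and the string aggregation names are a substitution for the functions used above, so the resulting columns are labelled `mean` and `count` rather than `average` and `len`.

```python
import pandas as pd

# Invented toy data with the same grouping columns as the real dataset
toy = pd.DataFrame({
    "party": ["Democratic", "Democratic", "Republican", "Republican", "Republican"],
    "in_out_state": ["in-state", "in-state", "in-state", "out-of-state", "out-of-state"],
    "amount": [100.0, 300.0, 50.0, 500.0, 1500.0],
})

# One row per (party, status) group with total, mean, median and row count
summary = (
    toy.groupby(["party", "in_out_state"])["amount"]
    .agg(["sum", "mean", "median", "count"])
    .reset_index()
)
print(summary)
```

Note that `"count"` ignores missing amounts, whereas `len` counts every row in the group.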


Drop non-major parties and unknown contribution statuses.

In [5]:
major_parties = contributions_by_party_office["party"].isin(["Democratic", "Republican"])
known_status = contributions_by_party_office["in_out_state"] != "unknown"
contributions_by_party_office = contributions_by_party_office[major_parties & known_status]
contributions_by_party_office

Unnamed: 0,year,party,standardized_office,in_out_state,sum,average,median,len
0,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,249405839.52,816.77,100.0,305356.0
1,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,35779114.12,1071.46,250.0,33393.0
3,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,in-state,272893234.09,414.35,100.0,658609.0
4,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,28680059.56,494.21,250.0,58032.0
9,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,in-state,482613987.05,1106.53,100.0,436151.0
10,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,39151620.03,1168.91,200.0,33494.0
12,2010,Republican,STATE HOUSE/ASSEMBLY/SENATE,in-state,231341884.21,414.63,125.0,557950.0
13,2010,Republican,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,19206369.09,489.93,350.0,39202.0
21,2014,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,208592453.21,606.98,50.0,343657.0
22,2014,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,45439296.05,490.08,35.0,92719.0


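Pivot the dataframe so that each year and office combination occupies a single row. Each group appears only once, so `pivot_table`'s default mean aggregation leaves the values unchanged.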

In [6]:
contributions_by_party_office = pd.pivot_table(contributions_by_party_office, index=["year", "standardized_office"], columns=["party", "in_out_state"]).reset_index()
contributions_by_party_office

Unnamed: 0_level_0,year,standardized_office,average,average,average,average,len,len,len,len,median,median,median,median,sum,sum,sum,sum
party,Unnamed: 1_level_1,Unnamed: 2_level_1,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican
in_out_state,Unnamed: 1_level_2,Unnamed: 2_level_2,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,816.77,1071.46,1106.53,1168.91,305356.0,33393.0,436151.0,33494.0,100.0,250.0,100.0,200.0,249405839.52,35779114.12,482613987.05,39151620.03
1,2010,STATE HOUSE/ASSEMBLY/SENATE,414.35,494.21,414.63,489.93,658609.0,58032.0,557950.0,39202.0,100.0,250.0,125.0,350.0,272893234.09,28680059.56,231341884.21,19206369.09
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,606.98,490.08,676.67,466.36,343657.0,92719.0,335088.0,110509.0,50.0,35.0,100.0,50.0,208592453.21,45439296.05,226742799.24,51536542.44
3,2014,STATE HOUSE/ASSEMBLY/SENATE,423.85,411.31,539.82,603.68,564875.0,74246.0,532252.0,46714.0,100.0,100.0,200.0,500.0,239422578.44,30538086.59,287319524.73,28200167.11
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,901.61,278.51,1056.14,286.4,495128.0,214778.0,465667.0,148855.0,50.0,25.0,80.0,50.0,446414596.86,59816942.73,491807244.52,42632032.87
5,2018,STATE HOUSE/ASSEMBLY/SENATE,378.54,356.38,656.45,732.62,662957.0,116008.0,436078.0,39711.0,100.0,100.0,250.0,500.0,250955888.64,41343033.37,286263804.75,29093022.07
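
The three header rows above come from the column `MultiIndex` that `pivot_table` builds when it is given several `columns` keys and no explicit `values`. A small sketch on invented numbers shows the shape of the result; the dataframe `long` and its figures are hypothetical.

```python
import pandas as pd

# Invented long-format data: one row per (year, party, status) group
long = pd.DataFrame({
    "year": [2018, 2018, 2018, 2018],
    "party": ["Democratic", "Democratic", "Republican", "Republican"],
    "in_out_state": ["in-state", "out-of-state", "in-state", "out-of-state"],
    "sum": [400.0, 90.0, 300.0, 60.0],
    "len": [4, 3, 2, 1],
})

# Every remaining numeric column becomes a value; column keys are (statistic, party, status)
wide = pd.pivot_table(long, index=["year"], columns=["party", "in_out_state"]).reset_index()
print(wide.columns.tolist())
```

Individual cells can then be addressed with a tuple key, for example `wide[("sum", "Republican", "out-of-state")]`.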


Flatten the resulting dataframe's multi-index columns.

In [7]:
contributions_by_party_office.columns = ["year", "standardized_office",
                                  "avg_dem_in_state", "avg_dem_out_of_state",
                                  "avg_rep_in_state", "avg_rep_out_of_state",
                                  "num_dem_in_state", "num_dem_out_of_state",
                                  "num_rep_in_state", "num_rep_out_of_state",
                                  "med_dem_in_state", "med_dem_out_of_state",
                                  "med_rep_in_state", "med_rep_out_of_state",
                                  "sum_dem_in_state", "sum_dem_out_of_state",
                                  "sum_rep_in_state", "sum_rep_out_of_state"
                                  ]
contributions_by_party_office

Unnamed: 0,year,standardized_office,avg_dem_in_state,avg_dem_out_of_state,avg_rep_in_state,avg_rep_out_of_state,num_dem_in_state,num_dem_out_of_state,num_rep_in_state,num_rep_out_of_state,med_dem_in_state,med_dem_out_of_state,med_rep_in_state,med_rep_out_of_state,sum_dem_in_state,sum_dem_out_of_state,sum_rep_in_state,sum_rep_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,816.77,1071.46,1106.53,1168.91,305356.0,33393.0,436151.0,33494.0,100.0,250.0,100.0,200.0,249405839.52,35779114.12,482613987.05,39151620.03
1,2010,STATE HOUSE/ASSEMBLY/SENATE,414.35,494.21,414.63,489.93,658609.0,58032.0,557950.0,39202.0,100.0,250.0,125.0,350.0,272893234.09,28680059.56,231341884.21,19206369.09
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,606.98,490.08,676.67,466.36,343657.0,92719.0,335088.0,110509.0,50.0,35.0,100.0,50.0,208592453.21,45439296.05,226742799.24,51536542.44
3,2014,STATE HOUSE/ASSEMBLY/SENATE,423.85,411.31,539.82,603.68,564875.0,74246.0,532252.0,46714.0,100.0,100.0,200.0,500.0,239422578.44,30538086.59,287319524.73,28200167.11
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,901.61,278.51,1056.14,286.4,495128.0,214778.0,465667.0,148855.0,50.0,25.0,80.0,50.0,446414596.86,59816942.73,491807244.52,42632032.87
5,2018,STATE HOUSE/ASSEMBLY/SENATE,378.54,356.38,656.45,732.62,662957.0,116008.0,436078.0,39711.0,100.0,100.0,250.0,500.0,250955888.64,41343033.37,286263804.75,29093022.07
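
Assigning the flat names by position relies on the pivoted columns arriving in exactly the order listed. As an alternative sketch, the names could be derived from the `MultiIndex` tuples themselves. The `flatten` helper and the two abbreviation dictionaries are hypothetical and simply mirror the hand-written names above.

```python
import pandas as pd

# Hypothetical abbreviations mirroring the hand-written column names
STAT = {"average": "avg", "len": "num", "median": "med", "sum": "sum"}
PARTY = {"Democratic": "dem", "Republican": "rep"}


def flatten(col):
    """Turn a (statistic, party, status) tuple into a single flat name."""
    stat, party, status = col
    if not party:  # index columns such as ("year", "", "")
        return stat
    return f"{STAT[stat]}_{PARTY[party]}_{status.replace('-', '_')}"


# A few column tuples of the kind produced by the pivot step
columns = pd.MultiIndex.from_tuples([
    ("year", "", ""),
    ("standardized_office", "", ""),
    ("average", "Democratic", "in-state"),
    ("sum", "Republican", "out-of-state"),
])

flat = [flatten(col) for col in columns]
print(flat)
```

Applied to the real dataframe, this would read `df.columns = [flatten(col) for col in df.columns]`, which gives the same names regardless of column order.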


Calculate the relative difference between the parties: the Republican value minus the Democratic value, divided by the Democratic value.

In [8]:
# Relative difference per metric and status: (Republican - Democratic) / Democratic
for metric in ["avg", "num", "med", "sum"]:
    for status in ["in_state", "out_of_state"]:
        dem = contributions_by_party_office[f"{metric}_dem_{status}"]
        rep = contributions_by_party_office[f"{metric}_rep_{status}"]
        contributions_by_party_office[f"diff_{metric}_{status}"] = (rep - dem) / dem
contributions_by_party_office

Unnamed: 0,year,standardized_office,avg_dem_in_state,avg_dem_out_of_state,avg_rep_in_state,avg_rep_out_of_state,num_dem_in_state,num_dem_out_of_state,num_rep_in_state,num_rep_out_of_state,med_dem_in_state,med_dem_out_of_state,med_rep_in_state,med_rep_out_of_state,sum_dem_in_state,sum_dem_out_of_state,sum_rep_in_state,sum_rep_out_of_state,diff_avg_in_state,diff_avg_out_of_state,diff_num_in_state,diff_num_out_of_state,diff_med_in_state,diff_med_out_of_state,diff_sum_in_state,diff_sum_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,816.77,1071.46,1106.53,1168.91,305356.0,33393.0,436151.0,33494.0,100.0,250.0,100.0,200.0,249405839.52,35779114.12,482613987.05,39151620.03,0.35,0.09,0.43,0.0,0.0,-0.2,0.94,0.09
1,2010,STATE HOUSE/ASSEMBLY/SENATE,414.35,494.21,414.63,489.93,658609.0,58032.0,557950.0,39202.0,100.0,250.0,125.0,350.0,272893234.09,28680059.56,231341884.21,19206369.09,0.0,-0.01,-0.15,-0.32,0.25,0.4,-0.15,-0.33
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,606.98,490.08,676.67,466.36,343657.0,92719.0,335088.0,110509.0,50.0,35.0,100.0,50.0,208592453.21,45439296.05,226742799.24,51536542.44,0.11,-0.05,-0.02,0.19,1.0,0.43,0.09,0.13
3,2014,STATE HOUSE/ASSEMBLY/SENATE,423.85,411.31,539.82,603.68,564875.0,74246.0,532252.0,46714.0,100.0,100.0,200.0,500.0,239422578.44,30538086.59,287319524.73,28200167.11,0.27,0.47,-0.06,-0.37,1.0,4.0,0.2,-0.08
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,901.61,278.51,1056.14,286.4,495128.0,214778.0,465667.0,148855.0,50.0,25.0,80.0,50.0,446414596.86,59816942.73,491807244.52,42632032.87,0.17,0.03,-0.06,-0.31,0.6,1.0,0.1,-0.29
5,2018,STATE HOUSE/ASSEMBLY/SENATE,378.54,356.38,656.45,732.62,662957.0,116008.0,436078.0,39711.0,100.0,100.0,250.0,500.0,250955888.64,41343033.37,286263804.75,29093022.07,0.73,1.06,-0.34,-0.66,1.5,4.0,0.14,-0.3
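
As a quick check on how to read the `diff_` columns, the calculation can be reproduced by hand for one cell. The two inputs are copied from the 2010 GOVERNOR/LIEUTENANT GOVERNOR row above.

```python
# Average in-state contribution, 2010 governor races, from the table above
avg_dem_in_state = 816.77
avg_rep_in_state = 1106.53

# Relative difference: (Republican - Democratic) / Democratic
diff_avg_in_state = (avg_rep_in_state - avg_dem_in_state) / avg_dem_in_state
print(round(diff_avg_in_state, 2))  # 0.35
```

A value of 0.35 means the average Republican contribution was about 35 percent larger than the Democratic one; negative values mean the Democratic figure was larger.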


Rearrange the columns.

In [9]:
contributions_by_party_office = contributions_by_party_office[["year", "standardized_office",
                                  "num_dem_in_state", "num_rep_in_state", "diff_num_in_state",
                                  "num_dem_out_of_state", "num_rep_out_of_state", "diff_num_out_of_state",
                                  "avg_dem_in_state", "avg_rep_in_state", "diff_avg_in_state",
                                  "avg_dem_out_of_state", "avg_rep_out_of_state", "diff_avg_out_of_state",
                                  "med_dem_in_state", "med_rep_in_state", "diff_med_in_state",
                                  "med_dem_out_of_state", "med_rep_out_of_state", "diff_med_out_of_state",
                                  "sum_dem_in_state", "sum_rep_in_state", "diff_sum_in_state",
                                  "sum_dem_out_of_state", "sum_rep_out_of_state", "diff_sum_out_of_state"]]
contributions_by_party_office

Unnamed: 0,year,standardized_office,num_dem_in_state,num_rep_in_state,diff_num_in_state,num_dem_out_of_state,num_rep_out_of_state,diff_num_out_of_state,avg_dem_in_state,avg_rep_in_state,diff_avg_in_state,avg_dem_out_of_state,avg_rep_out_of_state,diff_avg_out_of_state,med_dem_in_state,med_rep_in_state,diff_med_in_state,med_dem_out_of_state,med_rep_out_of_state,diff_med_out_of_state,sum_dem_in_state,sum_rep_in_state,diff_sum_in_state,sum_dem_out_of_state,sum_rep_out_of_state,diff_sum_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,305356.0,436151.0,0.43,33393.0,33494.0,0.0,816.77,1106.53,0.35,1071.46,1168.91,0.09,100.0,100.0,0.0,250.0,200.0,-0.2,249405839.52,482613987.05,0.94,35779114.12,39151620.03,0.09
1,2010,STATE HOUSE/ASSEMBLY/SENATE,658609.0,557950.0,-0.15,58032.0,39202.0,-0.32,414.35,414.63,0.0,494.21,489.93,-0.01,100.0,125.0,0.25,250.0,350.0,0.4,272893234.09,231341884.21,-0.15,28680059.56,19206369.09,-0.33
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,343657.0,335088.0,-0.02,92719.0,110509.0,0.19,606.98,676.67,0.11,490.08,466.36,-0.05,50.0,100.0,1.0,35.0,50.0,0.43,208592453.21,226742799.24,0.09,45439296.05,51536542.44,0.13
3,2014,STATE HOUSE/ASSEMBLY/SENATE,564875.0,532252.0,-0.06,74246.0,46714.0,-0.37,423.85,539.82,0.27,411.31,603.68,0.47,100.0,200.0,1.0,100.0,500.0,4.0,239422578.44,287319524.73,0.2,30538086.59,28200167.11,-0.08
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,495128.0,465667.0,-0.06,214778.0,148855.0,-0.31,901.61,1056.14,0.17,278.51,286.4,0.03,50.0,80.0,0.6,25.0,50.0,1.0,446414596.86,491807244.52,0.1,59816942.73,42632032.87,-0.29
5,2018,STATE HOUSE/ASSEMBLY/SENATE,662957.0,436078.0,-0.34,116008.0,39711.0,-0.66,378.54,656.45,0.73,356.38,732.62,1.06,100.0,250.0,1.5,100.0,500.0,4.0,250955888.64,286263804.75,0.14,41343033.37,29093022.07,-0.3


## Is there a difference in the average out-of-state contributor to Democratic vs. Republican candidates?

First, group the data by contributor.

In [10]:
contributors = contributions.groupby(["year", "party", "standardized_office", "in_out_state", "contributor"])["amount"].sum().reset_index()
contributors.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3824804 entries, 0 to 3824803
Data columns (total 6 columns):
year                   int64
party                  object
standardized_office    object
in_out_state           object
contributor            object
amount                 float64
dtypes: float64(1), int64(1), object(4)
memory usage: 175.1+ MB
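
Summing per contributor first changes what the averages measure: a donor who gives several times counts once, with their combined total. A toy illustration with invented donors "A" and "B":

```python
import pandas as pd

# Invented data: donor A gives three times, donor B once
toy = pd.DataFrame({
    "contributor": ["A", "A", "A", "B"],
    "amount": [100.0, 100.0, 100.0, 500.0],
})

# Average size of a single contribution: 800 / 4
avg_per_contribution = toy["amount"].mean()

# Average total given per contributor: (300 + 500) / 2
per_contributor = toy.groupby("contributor")["amount"].sum()
avg_per_contributor = per_contributor.mean()

print(avg_per_contribution, avg_per_contributor)
```

Because the grouping above also includes year, party, office and in-vs.-out-of-state status, a name that appears in more than one of those groups is counted once in each.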


Then group by year, party, office and in-vs.-out-of-state contribution status and calculate the total contributions, average contributions, median contributions and number of contributors per group.

In [11]:
contributors_by_party_office = contributors.groupby(["year", "party", "standardized_office", "in_out_state"])["amount"].agg([sum, np.average, "median", len]).reset_index()
contributors_by_party_office

Unnamed: 0,year,party,standardized_office,in_out_state,sum,average,median,len
0,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,249405839.52,1236.21,175.0,201750.0
1,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,35779114.12,1485.84,250.0,24080.0
2,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,unknown,369604.83,605.91,250.0,610.0
3,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,in-state,272893234.09,811.46,100.0,336297.0
4,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,28679809.56,1026.85,120.0,27930.0
5,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,unknown,1342191.95,446.5,100.0,3006.0
6,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,in-state,1001559.24,2491.44,500.0,402.0
7,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,143053.7,2135.13,1000.0,67.0
8,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,unknown,-19438.72,-1495.29,-500.0,13.0
9,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,in-state,482613812.05,1731.69,135.0,278695.0


Drop non-major parties and unknown contribution statuses.

In [12]:
major_parties = contributors_by_party_office["party"].isin(["Democratic", "Republican"])
known_status = contributors_by_party_office["in_out_state"] != "unknown"
contributors_by_party_office = contributors_by_party_office[major_parties & known_status]
contributors_by_party_office

Unnamed: 0,year,party,standardized_office,in_out_state,sum,average,median,len
0,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,249405839.52,1236.21,175.0,201750.0
1,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,35779114.12,1485.84,250.0,24080.0
3,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,in-state,272893234.09,811.46,100.0,336297.0
4,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,28679809.56,1026.85,120.0,27930.0
9,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,in-state,482613812.05,1731.69,135.0,278695.0
10,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,39151620.03,1531.39,250.0,25566.0
12,2010,Republican,STATE HOUSE/ASSEMBLY/SENATE,in-state,231341884.21,741.93,125.0,311809.0
13,2010,Republican,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,19206319.09,1209.85,200.0,15875.0
21,2014,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,208592453.21,1091.98,100.0,191022.0
22,2014,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,45439296.05,825.45,95.0,55048.0


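Pivot the dataframe so that each year and office combination occupies a single row. Each group appears only once, so `pivot_table`'s default mean aggregation leaves the values unchanged.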

In [13]:
contributors_by_party_office = pd.pivot_table(contributors_by_party_office, index=["year", "standardized_office"], columns=["party", "in_out_state"]).reset_index()
contributors_by_party_office

Unnamed: 0_level_0,year,standardized_office,average,average,average,average,len,len,len,len,median,median,median,median,sum,sum,sum,sum
party,Unnamed: 1_level_1,Unnamed: 2_level_1,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican
in_out_state,Unnamed: 1_level_2,Unnamed: 2_level_2,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,1236.21,1485.84,1731.69,1531.39,201750.0,24080.0,278695.0,25566.0,175.0,250.0,135.0,250.0,249405839.52,35779114.12,482613812.05,39151620.03
1,2010,STATE HOUSE/ASSEMBLY/SENATE,811.46,1026.85,741.93,1209.85,336297.0,27930.0,311809.0,15875.0,100.0,120.0,125.0,200.0,272893234.09,28679809.56,231341884.21,19206319.09
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,1091.98,825.45,1106.21,816.12,191022.0,55048.0,204972.0,63148.0,100.0,95.0,100.0,75.0,208592453.21,45439296.05,226742799.24,51536542.44
3,2014,STATE HOUSE/ASSEMBLY/SENATE,796.9,732.66,1031.8,1851.86,300444.0,41681.0,278465.0,15228.0,100.0,100.0,150.0,200.0,239422578.44,30538086.59,287319475.73,28200167.11
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,1808.5,495.66,2012.84,499.31,246842.0,120681.0,244335.0,85382.0,100.0,30.0,141.2,75.0,446414076.86,59816870.05,491806944.52,42632032.87
5,2018,STATE HOUSE/ASSEMBLY/SENATE,732.52,638.7,1236.44,2201.35,342590.0,64728.0,231522.0,13216.0,100.0,100.0,200.0,250.0,250955375.64,41341934.37,286263329.75,29093022.07


Flatten the resulting dataframe's multi-index columns.

In [14]:
contributors_by_party_office.columns = ["year", "standardized_office",
                                  "avg_dem_in_state", "avg_dem_out_of_state",
                                  "avg_rep_in_state", "avg_rep_out_of_state",
                                  "num_dem_in_state", "num_dem_out_of_state",
                                  "num_rep_in_state", "num_rep_out_of_state",
                                  "med_dem_in_state", "med_dem_out_of_state",
                                  "med_rep_in_state", "med_rep_out_of_state",
                                  "sum_dem_in_state", "sum_dem_out_of_state",
                                  "sum_rep_in_state", "sum_rep_out_of_state"
                                  ]
contributors_by_party_office

Unnamed: 0,year,standardized_office,avg_dem_in_state,avg_dem_out_of_state,avg_rep_in_state,avg_rep_out_of_state,num_dem_in_state,num_dem_out_of_state,num_rep_in_state,num_rep_out_of_state,med_dem_in_state,med_dem_out_of_state,med_rep_in_state,med_rep_out_of_state,sum_dem_in_state,sum_dem_out_of_state,sum_rep_in_state,sum_rep_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,1236.21,1485.84,1731.69,1531.39,201750.0,24080.0,278695.0,25566.0,175.0,250.0,135.0,250.0,249405839.52,35779114.12,482613812.05,39151620.03
1,2010,STATE HOUSE/ASSEMBLY/SENATE,811.46,1026.85,741.93,1209.85,336297.0,27930.0,311809.0,15875.0,100.0,120.0,125.0,200.0,272893234.09,28679809.56,231341884.21,19206319.09
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,1091.98,825.45,1106.21,816.12,191022.0,55048.0,204972.0,63148.0,100.0,95.0,100.0,75.0,208592453.21,45439296.05,226742799.24,51536542.44
3,2014,STATE HOUSE/ASSEMBLY/SENATE,796.9,732.66,1031.8,1851.86,300444.0,41681.0,278465.0,15228.0,100.0,100.0,150.0,200.0,239422578.44,30538086.59,287319475.73,28200167.11
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,1808.5,495.66,2012.84,499.31,246842.0,120681.0,244335.0,85382.0,100.0,30.0,141.2,75.0,446414076.86,59816870.05,491806944.52,42632032.87
5,2018,STATE HOUSE/ASSEMBLY/SENATE,732.52,638.7,1236.44,2201.35,342590.0,64728.0,231522.0,13216.0,100.0,100.0,200.0,250.0,250955375.64,41341934.37,286263329.75,29093022.07


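Calculate the relative difference between the parties: the Republican value minus the Democratic value, divided by the Democratic value.

In [15]: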
# Relative difference per metric and status: (Republican - Democratic) / Democratic
for metric in ["avg", "num", "med", "sum"]:
    for status in ["in_state", "out_of_state"]:
        dem = contributors_by_party_office[f"{metric}_dem_{status}"]
        rep = contributors_by_party_office[f"{metric}_rep_{status}"]
        contributors_by_party_office[f"diff_{metric}_{status}"] = (rep - dem) / dem
contributors_by_party_office

Unnamed: 0,year,standardized_office,avg_dem_in_state,avg_dem_out_of_state,avg_rep_in_state,avg_rep_out_of_state,num_dem_in_state,num_dem_out_of_state,num_rep_in_state,num_rep_out_of_state,med_dem_in_state,med_dem_out_of_state,med_rep_in_state,med_rep_out_of_state,sum_dem_in_state,sum_dem_out_of_state,sum_rep_in_state,sum_rep_out_of_state,diff_avg_in_state,diff_avg_out_of_state,diff_num_in_state,diff_num_out_of_state,diff_med_in_state,diff_med_out_of_state,diff_sum_in_state,diff_sum_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,1236.21,1485.84,1731.69,1531.39,201750.0,24080.0,278695.0,25566.0,175.0,250.0,135.0,250.0,249405839.52,35779114.12,482613812.05,39151620.03,0.4,0.03,0.38,0.06,-0.23,0.0,0.94,0.09
1,2010,STATE HOUSE/ASSEMBLY/SENATE,811.46,1026.85,741.93,1209.85,336297.0,27930.0,311809.0,15875.0,100.0,120.0,125.0,200.0,272893234.09,28679809.56,231341884.21,19206319.09,-0.09,0.18,-0.07,-0.43,0.25,0.67,-0.15,-0.33
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,1091.98,825.45,1106.21,816.12,191022.0,55048.0,204972.0,63148.0,100.0,95.0,100.0,75.0,208592453.21,45439296.05,226742799.24,51536542.44,0.01,-0.01,0.07,0.15,0.0,-0.21,0.09,0.13
3,2014,STATE HOUSE/ASSEMBLY/SENATE,796.9,732.66,1031.8,1851.86,300444.0,41681.0,278465.0,15228.0,100.0,100.0,150.0,200.0,239422578.44,30538086.59,287319475.73,28200167.11,0.29,1.53,-0.07,-0.63,0.5,1.0,0.2,-0.08
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,1808.5,495.66,2012.84,499.31,246842.0,120681.0,244335.0,85382.0,100.0,30.0,141.2,75.0,446414076.86,59816870.05,491806944.52,42632032.87,0.11,0.01,-0.01,-0.29,0.41,1.5,0.1,-0.29
5,2018,STATE HOUSE/ASSEMBLY/SENATE,732.52,638.7,1236.44,2201.35,342590.0,64728.0,231522.0,13216.0,100.0,100.0,200.0,250.0,250955375.64,41341934.37,286263329.75,29093022.07,0.69,2.45,-0.32,-0.8,1.0,1.5,0.14,-0.3


Rearrange the columns.

In [16]:
contributors_by_party_office = contributors_by_party_office[["year", "standardized_office",
                                  "num_dem_in_state", "num_rep_in_state", "diff_num_in_state",
                                  "num_dem_out_of_state", "num_rep_out_of_state", "diff_num_out_of_state",
                                  "avg_dem_in_state", "avg_rep_in_state", "diff_avg_in_state",
                                  "avg_dem_out_of_state", "avg_rep_out_of_state", "diff_avg_out_of_state",
                                  "med_dem_in_state", "med_rep_in_state", "diff_med_in_state",
                                  "med_dem_out_of_state", "med_rep_out_of_state", "diff_med_out_of_state",
                                  "sum_dem_in_state", "sum_rep_in_state", "diff_sum_in_state",
                                  "sum_dem_out_of_state", "sum_rep_out_of_state", "diff_sum_out_of_state"]]
contributors_by_party_office

Unnamed: 0,year,standardized_office,num_dem_in_state,num_rep_in_state,diff_num_in_state,num_dem_out_of_state,num_rep_out_of_state,diff_num_out_of_state,avg_dem_in_state,avg_rep_in_state,diff_avg_in_state,avg_dem_out_of_state,avg_rep_out_of_state,diff_avg_out_of_state,med_dem_in_state,med_rep_in_state,diff_med_in_state,med_dem_out_of_state,med_rep_out_of_state,diff_med_out_of_state,sum_dem_in_state,sum_rep_in_state,diff_sum_in_state,sum_dem_out_of_state,sum_rep_out_of_state,diff_sum_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,201750.0,278695.0,0.38,24080.0,25566.0,0.06,1236.21,1731.69,0.4,1485.84,1531.39,0.03,175.0,135.0,-0.23,250.0,250.0,0.0,249405839.52,482613812.05,0.94,35779114.12,39151620.03,0.09
1,2010,STATE HOUSE/ASSEMBLY/SENATE,336297.0,311809.0,-0.07,27930.0,15875.0,-0.43,811.46,741.93,-0.09,1026.85,1209.85,0.18,100.0,125.0,0.25,120.0,200.0,0.67,272893234.09,231341884.21,-0.15,28679809.56,19206319.09,-0.33
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,191022.0,204972.0,0.07,55048.0,63148.0,0.15,1091.98,1106.21,0.01,825.45,816.12,-0.01,100.0,100.0,0.0,95.0,75.0,-0.21,208592453.21,226742799.24,0.09,45439296.05,51536542.44,0.13
3,2014,STATE HOUSE/ASSEMBLY/SENATE,300444.0,278465.0,-0.07,41681.0,15228.0,-0.63,796.9,1031.8,0.29,732.66,1851.86,1.53,100.0,150.0,0.5,100.0,200.0,1.0,239422578.44,287319475.73,0.2,30538086.59,28200167.11,-0.08
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,246842.0,244335.0,-0.01,120681.0,85382.0,-0.29,1808.5,2012.84,0.11,495.66,499.31,0.01,100.0,141.2,0.41,30.0,75.0,1.5,446414076.86,491806944.52,0.1,59816870.05,42632032.87,-0.29
5,2018,STATE HOUSE/ASSEMBLY/SENATE,342590.0,231522.0,-0.32,64728.0,13216.0,-0.8,732.52,1236.44,0.69,638.7,2201.35,2.45,100.0,200.0,1.0,100.0,250.0,1.5,250955375.64,286263329.75,0.14,41341934.37,29093022.07,-0.3


## Export the data

In [17]:
# The context manager saves and closes the workbook on exit
with pd.ExcelWriter("data/contributors_analysis.xlsx") as writer:
    contributions_by_party_office.to_excel(writer, sheet_name="contributions_by_party_office", index=False)
    contributors_by_party_office.to_excel(writer, sheet_name="contributors_by_party_office", index=False)