# Out-of-State-Contributions: Contributors Analysis

How much out-of-state money have contributors donated so far in the 2018 election cycle, and how does that compare with the same point in the 2014 and 2010 cycles?

In [1]:
import numpy as np
import pandas as pd

%load_ext jupyternotify

pd.set_option("display.max_columns", 100)
pd.set_option("display.max_rows", 500)
pd.options.display.float_format = "{:,.2f}".format # Format floats


Import contributions data.

In [2]:
%%notify
contributions = pd.read_csv("data/contributions.csv")
contributions.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6514274 entries, 0 to 6514273
Data columns (total 23 columns):
candidate                 object
candidate_id              int64
year                      int64
state                     object
party                     object
election_status           object
contributor               object
amount                    float64
date                      object
contributor_street        object
contributor_city          object
contributor_state         object
contributor_zip           float64
in_out_state              object
no_veto                   object
office                    object
latest_month              object
redistricting_role        object
independent_commission    object
single_house_district     object
standardized_office       object
standardized_status       object
two_year_term             object
dtypes: float64(2), int64(2), object(19)
memory usage: 1.1+ GB



Convert the contribution date and latest-month columns to the datetime data type. With `errors="coerce"`, values that cannot be parsed become `NaT` instead of raising an error.

In [3]:
contributions["date"] = pd.to_datetime(contributions["date"], errors="coerce")
contributions["latest_month"] = pd.to_datetime(contributions["latest_month"], errors="coerce")
contributions.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6514274 entries, 0 to 6514273
Data columns (total 23 columns):
candidate                 object
candidate_id              int64
year                      int64
state                     object
party                     object
election_status           object
contributor               object
amount                    float64
date                      datetime64[ns]
contributor_street        object
contributor_city          object
contributor_state         object
contributor_zip           float64
in_out_state              object
no_veto                   object
office                    object
latest_month              datetime64[ns]
redistricting_role        object
independent_commission    object
single_house_district     object
standardized_office       object
standardized_status       object
two_year_term             object
dtypes: datetime64[ns](2), float64(2), int64(2), object(17)
memory usage: 1.1+ GB
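
Because `errors="coerce"` replaces unparseable values silently, it can be worth counting how many dates ended up missing after the conversion. The following is a minimal sketch on made-up sample strings (the notebook does not show the raw `date` values, so `raw_dates` is purely illustrative):

```python
import pandas as pd

# Made-up sample values; the notebook does not show the raw "date" strings.
raw_dates = pd.Series(["2018-03-01", "2014-11-04", "not a date", None])

# errors="coerce" turns anything unparseable into NaT instead of raising.
parsed = pd.to_datetime(raw_dates, errors="coerce")

# Count the entries that ended up missing, whether unparseable or already empty.
num_missing = int(parsed.isna().sum())
print(num_missing)
```

The same `isna().sum()` check could be run on `contributions["date"]` and `contributions["latest_month"]` to see how many rows were affected.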


## Is there a difference in the average out-of-state contribution to Democratic vs. Republican candidates?

Group by year, party, office and in-vs.-out-of-state contribution status and calculate the total contributions, average contributions, median contributions and number of contributions per group.

In [4]:
contributions_by_party_office = contributions.groupby(["year", "party", "standardized_office", "in_out_state"])["amount"].agg([sum, np.average, "median", len]).reset_index()
contributions_by_party_office

Unnamed: 0,year,party,standardized_office,in_out_state,sum,average,median,len
0,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,233462186.85,871.1,100.0,268008.0
1,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,33830741.69,1129.91,250.0,29941.0
2,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,unknown,330453.83,375.52,100.0,880.0
3,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,in-state,246518283.49,399.5,100.0,617074.0
4,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,26870531.36,496.22,250.0,54150.0
5,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,unknown,1244914.43,333.58,100.0,3732.0
6,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,in-state,941726.0,459.15,300.0,2051.0
7,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,136753.7,530.05,500.0,258.0
8,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,unknown,-19688.72,-1406.34,-287.5,14.0
9,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,in-state,458345956.83,1110.01,100.0,412922.0
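
To make the four statistics concrete, here is a small sketch of the same group-and-aggregate pattern on toy data. The amounts are invented, and the aggregations are named by string (`"sum"`, `"mean"`, `"median"`, `"count"`) rather than passed as functions, so the resulting column names differ from the ones in the table above. Note that `"count"` skips missing amounts, whereas `len` counts every row.

```python
import pandas as pd

# Toy data using the notebook's column names; the values are invented.
toy = pd.DataFrame({
    "year": [2018] * 6,
    "party": ["Democratic", "Democratic", "Democratic",
              "Republican", "Republican", "Republican"],
    "in_out_state": ["in-state", "in-state", "out-of-state",
                     "in-state", "in-state", "in-state"],
    "amount": [100.0, 300.0, 50.0, 100.0, 200.0, 1500.0],
})

# One row per (year, party, in_out_state) group with four summary statistics.
summary = (
    toy.groupby(["year", "party", "in_out_state"])["amount"]
    .agg(["sum", "mean", "median", "count"])
    .reset_index()
)
print(summary)
```

In the toy Republican in-state group, a single 1,500 contribution pulls the mean up to 600 while the median stays at 200, which is why it helps to report both.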


Drop non-major parties and unknown contribution statuses.

In [5]:
contributions_by_party_office = contributions_by_party_office[((contributions_by_party_office["party"] == "Democratic") | (contributions_by_party_office["party"] == "Republican")) & (contributions_by_party_office["in_out_state"] != "unknown")]
contributions_by_party_office

Unnamed: 0,year,party,standardized_office,in_out_state,sum,average,median,len
0,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,233462186.85,871.1,100.0,268008.0
1,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,33830741.69,1129.91,250.0,29941.0
3,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,in-state,246518283.49,399.5,100.0,617074.0
4,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,26870531.36,496.22,250.0,54150.0
9,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,in-state,458345956.83,1110.01,100.0,412922.0
10,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,33221930.04,1103.32,250.0,30111.0
12,2010,Republican,STATE HOUSE/ASSEMBLY/SENATE,in-state,204847421.32,395.42,125.0,518049.0
13,2010,Republican,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,17813376.93,490.98,350.0,36281.0
21,2014,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,194377098.85,631.94,50.0,307589.0
22,2014,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,43113946.23,535.48,50.0,80514.0
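
The same filter can also be written with `Series.isin`, which may be easier to read when there are several values to keep. This is a sketch on toy rows, not the notebook's data:

```python
import pandas as pd

# Toy rows using the notebook's column names; the values are invented.
toy = pd.DataFrame({
    "party": ["Democratic", "Republican", "Nonpartisan", "Republican", "Democratic"],
    "in_out_state": ["in-state", "out-of-state", "in-state", "unknown", "out-of-state"],
    "sum": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Build each condition separately, then combine them with &.
major_party = toy["party"].isin(["Democratic", "Republican"])
known_status = toy["in_out_state"] != "unknown"
filtered = toy[major_party & known_status]

print(list(filtered.index))
```

Boolean filtering keeps the original index labels, which is why the index in the output above skips the dropped rows.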


Pivot the dataframe so that each year and office's data sits in a single row.

In [6]:
contributions_by_party_office = pd.pivot_table(contributions_by_party_office, index=["year", "standardized_office"], columns=["party", "in_out_state"]).reset_index()
contributions_by_party_office

Unnamed: 0_level_0,year,standardized_office,average,average,average,average,len,len,len,len,median,median,median,median,sum,sum,sum,sum
party,Unnamed: 1_level_1,Unnamed: 2_level_1,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican
in_out_state,Unnamed: 1_level_2,Unnamed: 2_level_2,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,871.1,1129.91,1110.01,1103.32,268008.0,29941.0,412922.0,30111.0,100.0,250.0,100.0,250.0,233462186.85,33830741.69,458345956.83,33221930.04
1,2010,STATE HOUSE/ASSEMBLY/SENATE,399.5,496.22,395.42,490.98,617074.0,54150.0,518049.0,36281.0,100.0,250.0,125.0,350.0,246518283.49,26870531.36,204847421.32,17813376.93
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,631.94,535.48,657.21,489.92,307589.0,80514.0,316912.0,100580.0,50.0,50.0,100.0,50.0,194377098.85,43113946.23,208278324.93,49276308.34
3,2014,STATE HOUSE/ASSEMBLY/SENATE,417.4,405.42,515.64,607.6,530690.0,71081.0,503061.0,43734.0,100.0,100.0,200.0,500.0,221510080.53,28817810.64,259399579.33,26572572.55
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,958.13,333.01,1076.55,292.78,453559.0,173209.0,453983.0,142747.0,50.0,25.0,100.0,50.0,434568674.95,57680264.43,488736935.24,41793755.71
5,2018,STATE HOUSE/ASSEMBLY/SENATE,380.29,360.48,655.32,733.68,651324.0,113349.0,429432.0,39146.0,100.0,100.0,250.0,500.0,247690649.42,40859889.7,281413250.95,28720743.88
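
The three header rows in the output come from the column `MultiIndex` that `pd.pivot_table` builds: statistic on top, then party, then in-state status. The sketch below reproduces the structure on a toy long-form table with only two statistics and invented values. Each combination of year, office, party and status appears once, so the default mean aggregation returns each value unchanged.

```python
import pandas as pd

# Toy long-form table with one row per party/status combination; values are invented.
long_form = pd.DataFrame({
    "year": [2018] * 4,
    "standardized_office": ["GOVERNOR/LIEUTENANT GOVERNOR"] * 4,
    "party": ["Democratic", "Democratic", "Republican", "Republican"],
    "in_out_state": ["in-state", "out-of-state", "in-state", "out-of-state"],
    "average": [10.0, 20.0, 30.0, 40.0],
    "sum": [100.0, 200.0, 300.0, 400.0],
})

# Spread party and status across the columns; every remaining numeric column is pivoted.
wide = pd.pivot_table(
    long_form,
    index=["year", "standardized_office"],
    columns=["party", "in_out_state"],
).reset_index()

print(wide.shape)
print(list(wide.columns))
```

Each column label is a tuple such as `("sum", "Republican", "out-of-state")`, and the two index columns restored by `reset_index` carry empty strings in the lower levels.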


Flatten the resulting dataframe's multi-index columns.

In [7]:
contributions_by_party_office.columns = ["year", "standardized_office",
                                  "avg_dem_in_state", "avg_dem_out_of_state",
                                  "avg_rep_in_state", "avg_rep_out_of_state",
                                  "num_dem_in_state", "num_dem_out_of_state",
                                  "num_rep_in_state", "num_rep_out_of_state",
                                  "med_dem_in_state", "med_dem_out_of_state",
                                  "med_rep_in_state", "med_rep_out_of_state",
                                  "sum_dem_in_state", "sum_dem_out_of_state",
                                  "sum_rep_in_state", "sum_rep_out_of_state"
                                  ]
contributions_by_party_office

Unnamed: 0,year,standardized_office,avg_dem_in_state,avg_dem_out_of_state,avg_rep_in_state,avg_rep_out_of_state,num_dem_in_state,num_dem_out_of_state,num_rep_in_state,num_rep_out_of_state,med_dem_in_state,med_dem_out_of_state,med_rep_in_state,med_rep_out_of_state,sum_dem_in_state,sum_dem_out_of_state,sum_rep_in_state,sum_rep_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,871.1,1129.91,1110.01,1103.32,268008.0,29941.0,412922.0,30111.0,100.0,250.0,100.0,250.0,233462186.85,33830741.69,458345956.83,33221930.04
1,2010,STATE HOUSE/ASSEMBLY/SENATE,399.5,496.22,395.42,490.98,617074.0,54150.0,518049.0,36281.0,100.0,250.0,125.0,350.0,246518283.49,26870531.36,204847421.32,17813376.93
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,631.94,535.48,657.21,489.92,307589.0,80514.0,316912.0,100580.0,50.0,50.0,100.0,50.0,194377098.85,43113946.23,208278324.93,49276308.34
3,2014,STATE HOUSE/ASSEMBLY/SENATE,417.4,405.42,515.64,607.6,530690.0,71081.0,503061.0,43734.0,100.0,100.0,200.0,500.0,221510080.53,28817810.64,259399579.33,26572572.55
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,958.13,333.01,1076.55,292.78,453559.0,173209.0,453983.0,142747.0,50.0,25.0,100.0,50.0,434568674.95,57680264.43,488736935.24,41793755.71
5,2018,STATE HOUSE/ASSEMBLY/SENATE,380.29,360.48,655.32,733.68,651324.0,113349.0,429432.0,39146.0,100.0,100.0,250.0,500.0,247690649.42,40859889.7,281413250.95,28720743.88
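
Assigning a hard-coded list of names relies on the columns arriving in exactly the expected order. One alternative, sketched here under the assumption that the column tuples look like the ones in the pivot output, is to derive each flat name from its tuple. The lookup dictionaries and the `flatten` helper are hypothetical and not part of the notebook.

```python
import pandas as pd

# A few column tuples in the shape produced by the pivot step (not the full set).
columns = pd.MultiIndex.from_tuples([
    ("year", "", ""),
    ("standardized_office", "", ""),
    ("average", "Democratic", "in-state"),
    ("average", "Republican", "out-of-state"),
    ("len", "Democratic", "in-state"),
    ("median", "Republican", "in-state"),
    ("sum", "Democratic", "out-of-state"),
])

# Hypothetical lookup tables mapping each level's label to its short form.
STAT = {"average": "avg", "len": "num", "median": "med", "sum": "sum"}
PARTY = {"Democratic": "dem", "Republican": "rep"}
STATUS = {"in-state": "in_state", "out-of-state": "out_of_state"}


def flatten(column):
    """Turn one (statistic, party, status) tuple into a flat column name."""
    stat, party, status = column
    if not party:  # columns restored by reset_index have empty lower levels
        return stat
    return f"{STAT[stat]}_{PARTY[party]}_{STATUS[status]}"


flat = [flatten(column) for column in columns]
print(flat)
```

A dataframe's columns could then be replaced with `df.columns = [flatten(c) for c in df.columns]`, which keeps names and data aligned even if the column order changes.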


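Calculate the relative difference between the parties for each measure: the Republican value minus the Democratic value, divided by the Democratic value.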

In [8]:
contributions_by_party_office["diff_avg_in_state"] = (contributions_by_party_office["avg_rep_in_state"] - contributions_by_party_office["avg_dem_in_state"]) / contributions_by_party_office["avg_dem_in_state"]
contributions_by_party_office["diff_avg_out_of_state"] = (contributions_by_party_office["avg_rep_out_of_state"] - contributions_by_party_office["avg_dem_out_of_state"]) / contributions_by_party_office["avg_dem_out_of_state"]
contributions_by_party_office["diff_num_in_state"] = (contributions_by_party_office["num_rep_in_state"] - contributions_by_party_office["num_dem_in_state"]) / contributions_by_party_office["num_dem_in_state"]
contributions_by_party_office["diff_num_out_of_state"] = (contributions_by_party_office["num_rep_out_of_state"] - contributions_by_party_office["num_dem_out_of_state"]) / contributions_by_party_office["num_dem_out_of_state"]
contributions_by_party_office["diff_med_in_state"] = (contributions_by_party_office["med_rep_in_state"] - contributions_by_party_office["med_dem_in_state"]) / contributions_by_party_office["med_dem_in_state"]
contributions_by_party_office["diff_med_out_of_state"] = (contributions_by_party_office["med_rep_out_of_state"] - contributions_by_party_office["med_dem_out_of_state"]) / contributions_by_party_office["med_dem_out_of_state"]
contributions_by_party_office["diff_sum_in_state"] = (contributions_by_party_office["sum_rep_in_state"] - contributions_by_party_office["sum_dem_in_state"]) / contributions_by_party_office["sum_dem_in_state"]
contributions_by_party_office["diff_sum_out_of_state"] = (contributions_by_party_office["sum_rep_out_of_state"] - contributions_by_party_office["sum_dem_out_of_state"]) / contributions_by_party_office["sum_dem_out_of_state"]
contributions_by_party_office

Unnamed: 0,year,standardized_office,avg_dem_in_state,avg_dem_out_of_state,avg_rep_in_state,avg_rep_out_of_state,num_dem_in_state,num_dem_out_of_state,num_rep_in_state,num_rep_out_of_state,med_dem_in_state,med_dem_out_of_state,med_rep_in_state,med_rep_out_of_state,sum_dem_in_state,sum_dem_out_of_state,sum_rep_in_state,sum_rep_out_of_state,diff_avg_in_state,diff_avg_out_of_state,diff_num_in_state,diff_num_out_of_state,diff_med_in_state,diff_med_out_of_state,diff_sum_in_state,diff_sum_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,871.1,1129.91,1110.01,1103.32,268008.0,29941.0,412922.0,30111.0,100.0,250.0,100.0,250.0,233462186.85,33830741.69,458345956.83,33221930.04,0.27,-0.02,0.54,0.01,0.0,0.0,0.96,-0.02
1,2010,STATE HOUSE/ASSEMBLY/SENATE,399.5,496.22,395.42,490.98,617074.0,54150.0,518049.0,36281.0,100.0,250.0,125.0,350.0,246518283.49,26870531.36,204847421.32,17813376.93,-0.01,-0.01,-0.16,-0.33,0.25,0.4,-0.17,-0.34
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,631.94,535.48,657.21,489.92,307589.0,80514.0,316912.0,100580.0,50.0,50.0,100.0,50.0,194377098.85,43113946.23,208278324.93,49276308.34,0.04,-0.09,0.03,0.25,1.0,0.0,0.07,0.14
3,2014,STATE HOUSE/ASSEMBLY/SENATE,417.4,405.42,515.64,607.6,530690.0,71081.0,503061.0,43734.0,100.0,100.0,200.0,500.0,221510080.53,28817810.64,259399579.33,26572572.55,0.24,0.5,-0.05,-0.38,1.0,4.0,0.17,-0.08
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,958.13,333.01,1076.55,292.78,453559.0,173209.0,453983.0,142747.0,50.0,25.0,100.0,50.0,434568674.95,57680264.43,488736935.24,41793755.71,0.12,-0.12,0.0,-0.18,1.0,1.0,0.12,-0.28
5,2018,STATE HOUSE/ASSEMBLY/SENATE,380.29,360.48,655.32,733.68,651324.0,113349.0,429432.0,39146.0,100.0,100.0,250.0,500.0,247690649.42,40859889.7,281413250.95,28720743.88,0.72,1.04,-0.34,-0.65,1.5,4.0,0.14,-0.3
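
Each `diff_` column applies the same formula, (Republican - Democratic) / Democratic, so the eight assignments could be generated in a loop. The sketch below does this for two of the four statistics and for in-state values only, using the 2018 figures copied from the table above:

```python
import pandas as pd

# In-state averages and counts for the two 2018 rows, copied from the table above.
rows = pd.DataFrame({
    "avg_dem_in_state": [958.13, 380.29],
    "avg_rep_in_state": [1076.55, 655.32],
    "num_dem_in_state": [453559.0, 651324.0],
    "num_rep_in_state": [453983.0, 429432.0],
})

# Same formula for every statistic: (Republican - Democratic) / Democratic.
for stat in ["avg", "num"]:
    dem = rows[f"{stat}_dem_in_state"]
    rep = rows[f"{stat}_rep_in_state"]
    rows[f"diff_{stat}_in_state"] = (rep - dem) / dem

print(rows[["diff_avg_in_state", "diff_num_in_state"]].round(2))
```

A value of 0.72 in `diff_avg_in_state` means the average Republican in-state contribution was about 72 percent larger than the Democratic one, and a negative value means the Republican figure was smaller.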


Rearrange the columns.

In [9]:
contributions_by_party_office = contributions_by_party_office[["year", "standardized_office",
                                  "num_dem_in_state", "num_rep_in_state", "diff_num_in_state",
                                  "num_dem_out_of_state", "num_rep_out_of_state", "diff_num_out_of_state",
                                  "avg_dem_in_state", "avg_rep_in_state", "diff_avg_in_state",
                                  "avg_dem_out_of_state", "avg_rep_out_of_state", "diff_avg_out_of_state",
                                  "med_dem_in_state", "med_rep_in_state", "diff_med_in_state",
                                  "med_dem_out_of_state", "med_rep_out_of_state", "diff_med_out_of_state",
                                  "sum_dem_in_state", "sum_rep_in_state", "diff_sum_in_state",
                                  "sum_dem_out_of_state", "sum_rep_out_of_state", "diff_sum_out_of_state"]]
contributions_by_party_office

Unnamed: 0,year,standardized_office,num_dem_in_state,num_rep_in_state,diff_num_in_state,num_dem_out_of_state,num_rep_out_of_state,diff_num_out_of_state,avg_dem_in_state,avg_rep_in_state,diff_avg_in_state,avg_dem_out_of_state,avg_rep_out_of_state,diff_avg_out_of_state,med_dem_in_state,med_rep_in_state,diff_med_in_state,med_dem_out_of_state,med_rep_out_of_state,diff_med_out_of_state,sum_dem_in_state,sum_rep_in_state,diff_sum_in_state,sum_dem_out_of_state,sum_rep_out_of_state,diff_sum_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,268008.0,412922.0,0.54,29941.0,30111.0,0.01,871.1,1110.01,0.27,1129.91,1103.32,-0.02,100.0,100.0,0.0,250.0,250.0,0.0,233462186.85,458345956.83,0.96,33830741.69,33221930.04,-0.02
1,2010,STATE HOUSE/ASSEMBLY/SENATE,617074.0,518049.0,-0.16,54150.0,36281.0,-0.33,399.5,395.42,-0.01,496.22,490.98,-0.01,100.0,125.0,0.25,250.0,350.0,0.4,246518283.49,204847421.32,-0.17,26870531.36,17813376.93,-0.34
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,307589.0,316912.0,0.03,80514.0,100580.0,0.25,631.94,657.21,0.04,535.48,489.92,-0.09,50.0,100.0,1.0,50.0,50.0,0.0,194377098.85,208278324.93,0.07,43113946.23,49276308.34,0.14
3,2014,STATE HOUSE/ASSEMBLY/SENATE,530690.0,503061.0,-0.05,71081.0,43734.0,-0.38,417.4,515.64,0.24,405.42,607.6,0.5,100.0,200.0,1.0,100.0,500.0,4.0,221510080.53,259399579.33,0.17,28817810.64,26572572.55,-0.08
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,453559.0,453983.0,0.0,173209.0,142747.0,-0.18,958.13,1076.55,0.12,333.01,292.78,-0.12,50.0,100.0,1.0,25.0,50.0,1.0,434568674.95,488736935.24,0.12,57680264.43,41793755.71,-0.28
5,2018,STATE HOUSE/ASSEMBLY/SENATE,651324.0,429432.0,-0.34,113349.0,39146.0,-0.65,380.29,655.32,0.72,360.48,733.68,1.04,100.0,250.0,1.5,100.0,500.0,4.0,247690649.42,281413250.95,0.14,40859889.7,28720743.88,-0.3


## Is there a difference in the average out-of-state contributor to Democratic vs. Republican candidates?

First, group the data by contributor.

In [10]:
contributors = contributions.groupby(["year", "party", "standardized_office", "in_out_state", "contributor"])["amount"].sum().reset_index()
contributors.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3618450 entries, 0 to 3618449
Data columns (total 6 columns):
year                   int64
party                  object
standardized_office    object
in_out_state           object
contributor            object
amount                 float64
dtypes: float64(1), int64(1), object(4)
memory usage: 165.6+ MB
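
Summing by contributor first changes what the later averages measure: a donor who gives several times counts once, with their combined total. The toy example below (invented names and amounts) illustrates the difference. It also shows that `groupby` drops rows whose key is missing by default, which is one possible reason for per-contributor totals coming out slightly lower than per-contribution totals.

```python
import pandas as pd

# Invented contributions: "A" gives three times, "B" once, and one row has no name.
toy = pd.DataFrame({
    "contributor": ["A", "A", "A", "B", None],
    "amount": [100.0, 100.0, 100.0, 500.0, 200.0],
})

# Average size of a single contribution.
per_contribution_mean = toy["amount"].mean()

# Total given by each named contributor; the row with a missing name is dropped.
per_contributor = toy.groupby("contributor")["amount"].sum()
per_contributor_mean = per_contributor.mean()

# dropna=False (pandas 1.1 or later) keeps the missing-name row as its own group.
with_missing = toy.groupby("contributor", dropna=False)["amount"].sum()

print(per_contribution_mean, per_contributor_mean)
print(per_contributor.sum(), with_missing.sum())
```

The grouping key is the contributor name exactly as written, so spelling variants of the same donor would be treated as separate contributors.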


Then group by year, party, office and in-vs.-out-of-state contribution status and calculate the total contributions, average contributions, median contributions and number of contributors per group.

In [11]:
contributors_by_party_office = contributors.groupby(["year", "party", "standardized_office", "in_out_state"])["amount"].agg([sum, np.average, "median", len]).reset_index()
contributors_by_party_office

Unnamed: 0,year,party,standardized_office,in_out_state,sum,average,median,len
0,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,233462186.85,1279.58,200.0,182452.0
1,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,33830741.69,1522.19,250.0,22225.0
2,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,unknown,330453.83,594.34,231.0,556.0
3,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,in-state,246518283.49,771.41,100.0,319567.0
4,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,26870281.36,1022.03,125.0,26291.0
5,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,unknown,1244914.43,429.87,100.0,2896.0
6,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,in-state,941726.0,2446.04,500.0,385.0
7,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,136753.7,2072.03,1000.0,66.0
8,2010,Nonpartisan,STATE HOUSE/ASSEMBLY/SENATE,unknown,-19688.72,-1514.52,-500.0,13.0
9,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,in-state,458345781.83,1705.9,125.0,268682.0


Drop non-major parties and unknown contribution statuses.

In [12]:
contributors_by_party_office = contributors_by_party_office[((contributors_by_party_office["party"] == "Democratic") | (contributors_by_party_office["party"] == "Republican")) & (contributors_by_party_office["in_out_state"] != "unknown")]
contributors_by_party_office

Unnamed: 0,year,party,standardized_office,in_out_state,sum,average,median,len
0,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,233462186.85,1279.58,200.0,182452.0
1,2010,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,33830741.69,1522.19,250.0,22225.0
3,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,in-state,246518283.49,771.41,100.0,319567.0
4,2010,Democratic,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,26870281.36,1022.03,125.0,26291.0
9,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,in-state,458345781.83,1705.9,125.0,268682.0
10,2010,Republican,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,33221930.04,1441.42,250.0,23048.0
12,2010,Republican,STATE HOUSE/ASSEMBLY/SENATE,in-state,204847421.32,692.59,125.0,295768.0
13,2010,Republican,STATE HOUSE/ASSEMBLY/SENATE,out-of-state,17813326.93,1191.69,200.0,14948.0
21,2014,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,in-state,194377098.85,1101.59,100.0,176452.0
22,2014,Democratic,GOVERNOR/LIEUTENANT GOVERNOR,out-of-state,43113946.23,863.92,100.0,49905.0


Pivot the dataframe so that each year and office's data sits in a single row.

In [13]:
contributors_by_party_office = pd.pivot_table(contributors_by_party_office, index=["year", "standardized_office"], columns=["party", "in_out_state"]).reset_index()
contributors_by_party_office

Unnamed: 0_level_0,year,standardized_office,average,average,average,average,len,len,len,len,median,median,median,median,sum,sum,sum,sum
party,Unnamed: 1_level_1,Unnamed: 2_level_1,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican,Democratic,Democratic,Republican,Republican
in_out_state,Unnamed: 1_level_2,Unnamed: 2_level_2,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state,in-state,out-of-state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,1279.58,1522.19,1705.9,1441.42,182452.0,22225.0,268682.0,23048.0,200.0,250.0,125.0,250.0,233462186.85,33830741.69,458345781.83,33221930.04
1,2010,STATE HOUSE/ASSEMBLY/SENATE,771.41,1022.03,692.59,1191.69,319567.0,26291.0,295768.0,14948.0,100.0,125.0,125.0,200.0,246518283.49,26870281.36,204847421.32,17813326.93
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,1101.59,863.92,1063.39,847.94,176452.0,49905.0,195862.0,58113.0,100.0,100.0,100.0,75.0,194377098.85,43113946.23,208278324.93,49276308.34
3,2014,STATE HOUSE/ASSEMBLY/SENATE,770.6,709.99,969.38,1824.16,287450.0,40589.0,267593.0,14567.0,100.0,100.0,150.0,200.0,221510080.53,28817810.64,259399530.33,26572572.55
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,1962.14,621.62,2047.36,503.1,221477.0,92790.0,238715.0,83073.0,100.0,47.0,150.0,75.0,434568154.95,57680191.75,488736635.24,41793755.71
5,2018,STATE HOUSE/ASSEMBLY/SENATE,733.36,642.38,1230.39,2197.29,337749.0,63605.0,228718.0,13071.0,100.0,100.0,200.0,250.0,247690136.42,40858790.7,281412775.95,28720743.88


Flatten the resulting dataframe's multi-index columns.

In [14]:
contributors_by_party_office.columns = ["year", "standardized_office",
                                  "avg_dem_in_state", "avg_dem_out_of_state",
                                  "avg_rep_in_state", "avg_rep_out_of_state",
                                  "num_dem_in_state", "num_dem_out_of_state",
                                  "num_rep_in_state", "num_rep_out_of_state",
                                  "med_dem_in_state", "med_dem_out_of_state",
                                  "med_rep_in_state", "med_rep_out_of_state",
                                  "sum_dem_in_state", "sum_dem_out_of_state",
                                  "sum_rep_in_state", "sum_rep_out_of_state"
                                  ]
contributors_by_party_office

Unnamed: 0,year,standardized_office,avg_dem_in_state,avg_dem_out_of_state,avg_rep_in_state,avg_rep_out_of_state,num_dem_in_state,num_dem_out_of_state,num_rep_in_state,num_rep_out_of_state,med_dem_in_state,med_dem_out_of_state,med_rep_in_state,med_rep_out_of_state,sum_dem_in_state,sum_dem_out_of_state,sum_rep_in_state,sum_rep_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,1279.58,1522.19,1705.9,1441.42,182452.0,22225.0,268682.0,23048.0,200.0,250.0,125.0,250.0,233462186.85,33830741.69,458345781.83,33221930.04
1,2010,STATE HOUSE/ASSEMBLY/SENATE,771.41,1022.03,692.59,1191.69,319567.0,26291.0,295768.0,14948.0,100.0,125.0,125.0,200.0,246518283.49,26870281.36,204847421.32,17813326.93
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,1101.59,863.92,1063.39,847.94,176452.0,49905.0,195862.0,58113.0,100.0,100.0,100.0,75.0,194377098.85,43113946.23,208278324.93,49276308.34
3,2014,STATE HOUSE/ASSEMBLY/SENATE,770.6,709.99,969.38,1824.16,287450.0,40589.0,267593.0,14567.0,100.0,100.0,150.0,200.0,221510080.53,28817810.64,259399530.33,26572572.55
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,1962.14,621.62,2047.36,503.1,221477.0,92790.0,238715.0,83073.0,100.0,47.0,150.0,75.0,434568154.95,57680191.75,488736635.24,41793755.71
5,2018,STATE HOUSE/ASSEMBLY/SENATE,733.36,642.38,1230.39,2197.29,337749.0,63605.0,228718.0,13071.0,100.0,100.0,200.0,250.0,247690136.42,40858790.7,281412775.95,28720743.88


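Calculate the relative difference between the parties for each measure: the Republican value minus the Democratic value, divided by the Democratic value.

In [15]: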
contributors_by_party_office["diff_avg_in_state"] = (contributors_by_party_office["avg_rep_in_state"] - contributors_by_party_office["avg_dem_in_state"]) / contributors_by_party_office["avg_dem_in_state"]
contributors_by_party_office["diff_avg_out_of_state"] = (contributors_by_party_office["avg_rep_out_of_state"] - contributors_by_party_office["avg_dem_out_of_state"]) / contributors_by_party_office["avg_dem_out_of_state"]
contributors_by_party_office["diff_num_in_state"] = (contributors_by_party_office["num_rep_in_state"] - contributors_by_party_office["num_dem_in_state"]) / contributors_by_party_office["num_dem_in_state"]
contributors_by_party_office["diff_num_out_of_state"] = (contributors_by_party_office["num_rep_out_of_state"] - contributors_by_party_office["num_dem_out_of_state"]) / contributors_by_party_office["num_dem_out_of_state"]
contributors_by_party_office["diff_med_in_state"] = (contributors_by_party_office["med_rep_in_state"] - contributors_by_party_office["med_dem_in_state"]) / contributors_by_party_office["med_dem_in_state"]
contributors_by_party_office["diff_med_out_of_state"] = (contributors_by_party_office["med_rep_out_of_state"] - contributors_by_party_office["med_dem_out_of_state"]) / contributors_by_party_office["med_dem_out_of_state"]
contributors_by_party_office["diff_sum_in_state"] = (contributors_by_party_office["sum_rep_in_state"] - contributors_by_party_office["sum_dem_in_state"]) / contributors_by_party_office["sum_dem_in_state"]
contributors_by_party_office["diff_sum_out_of_state"] = (contributors_by_party_office["sum_rep_out_of_state"] - contributors_by_party_office["sum_dem_out_of_state"]) / contributors_by_party_office["sum_dem_out_of_state"]
contributors_by_party_office

Unnamed: 0,year,standardized_office,avg_dem_in_state,avg_dem_out_of_state,avg_rep_in_state,avg_rep_out_of_state,num_dem_in_state,num_dem_out_of_state,num_rep_in_state,num_rep_out_of_state,med_dem_in_state,med_dem_out_of_state,med_rep_in_state,med_rep_out_of_state,sum_dem_in_state,sum_dem_out_of_state,sum_rep_in_state,sum_rep_out_of_state,diff_avg_in_state,diff_avg_out_of_state,diff_num_in_state,diff_num_out_of_state,diff_med_in_state,diff_med_out_of_state,diff_sum_in_state,diff_sum_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,1279.58,1522.19,1705.9,1441.42,182452.0,22225.0,268682.0,23048.0,200.0,250.0,125.0,250.0,233462186.85,33830741.69,458345781.83,33221930.04,0.33,-0.05,0.47,0.04,-0.38,0.0,0.96,-0.02
1,2010,STATE HOUSE/ASSEMBLY/SENATE,771.41,1022.03,692.59,1191.69,319567.0,26291.0,295768.0,14948.0,100.0,125.0,125.0,200.0,246518283.49,26870281.36,204847421.32,17813326.93,-0.1,0.17,-0.07,-0.43,0.25,0.6,-0.17,-0.34
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,1101.59,863.92,1063.39,847.94,176452.0,49905.0,195862.0,58113.0,100.0,100.0,100.0,75.0,194377098.85,43113946.23,208278324.93,49276308.34,-0.03,-0.02,0.11,0.16,0.0,-0.25,0.07,0.14
3,2014,STATE HOUSE/ASSEMBLY/SENATE,770.6,709.99,969.38,1824.16,287450.0,40589.0,267593.0,14567.0,100.0,100.0,150.0,200.0,221510080.53,28817810.64,259399530.33,26572572.55,0.26,1.57,-0.07,-0.64,0.5,1.0,0.17,-0.08
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,1962.14,621.62,2047.36,503.1,221477.0,92790.0,238715.0,83073.0,100.0,47.0,150.0,75.0,434568154.95,57680191.75,488736635.24,41793755.71,0.04,-0.19,0.08,-0.1,0.5,0.6,0.12,-0.28
5,2018,STATE HOUSE/ASSEMBLY/SENATE,733.36,642.38,1230.39,2197.29,337749.0,63605.0,228718.0,13071.0,100.0,100.0,200.0,250.0,247690136.42,40858790.7,281412775.95,28720743.88,0.68,2.42,-0.32,-0.79,1.0,1.5,0.14,-0.3


Rearrange the columns.

In [16]:
contributors_by_party_office = contributors_by_party_office[["year", "standardized_office",
                                  "num_dem_in_state", "num_rep_in_state", "diff_num_in_state",
                                  "num_dem_out_of_state", "num_rep_out_of_state", "diff_num_out_of_state",
                                  "avg_dem_in_state", "avg_rep_in_state", "diff_avg_in_state",
                                  "avg_dem_out_of_state", "avg_rep_out_of_state", "diff_avg_out_of_state",
                                  "med_dem_in_state", "med_rep_in_state", "diff_med_in_state",
                                  "med_dem_out_of_state", "med_rep_out_of_state", "diff_med_out_of_state",
                                  "sum_dem_in_state", "sum_rep_in_state", "diff_sum_in_state",
                                  "sum_dem_out_of_state", "sum_rep_out_of_state", "diff_sum_out_of_state"]]
contributors_by_party_office

Unnamed: 0,year,standardized_office,num_dem_in_state,num_rep_in_state,diff_num_in_state,num_dem_out_of_state,num_rep_out_of_state,diff_num_out_of_state,avg_dem_in_state,avg_rep_in_state,diff_avg_in_state,avg_dem_out_of_state,avg_rep_out_of_state,diff_avg_out_of_state,med_dem_in_state,med_rep_in_state,diff_med_in_state,med_dem_out_of_state,med_rep_out_of_state,diff_med_out_of_state,sum_dem_in_state,sum_rep_in_state,diff_sum_in_state,sum_dem_out_of_state,sum_rep_out_of_state,diff_sum_out_of_state
0,2010,GOVERNOR/LIEUTENANT GOVERNOR,182452.0,268682.0,0.47,22225.0,23048.0,0.04,1279.58,1705.9,0.33,1522.19,1441.42,-0.05,200.0,125.0,-0.38,250.0,250.0,0.0,233462186.85,458345781.83,0.96,33830741.69,33221930.04,-0.02
1,2010,STATE HOUSE/ASSEMBLY/SENATE,319567.0,295768.0,-0.07,26291.0,14948.0,-0.43,771.41,692.59,-0.1,1022.03,1191.69,0.17,100.0,125.0,0.25,125.0,200.0,0.6,246518283.49,204847421.32,-0.17,26870281.36,17813326.93,-0.34
2,2014,GOVERNOR/LIEUTENANT GOVERNOR,176452.0,195862.0,0.11,49905.0,58113.0,0.16,1101.59,1063.39,-0.03,863.92,847.94,-0.02,100.0,100.0,0.0,100.0,75.0,-0.25,194377098.85,208278324.93,0.07,43113946.23,49276308.34,0.14
3,2014,STATE HOUSE/ASSEMBLY/SENATE,287450.0,267593.0,-0.07,40589.0,14567.0,-0.64,770.6,969.38,0.26,709.99,1824.16,1.57,100.0,150.0,0.5,100.0,200.0,1.0,221510080.53,259399530.33,0.17,28817810.64,26572572.55,-0.08
4,2018,GOVERNOR/LIEUTENANT GOVERNOR,221477.0,238715.0,0.08,92790.0,83073.0,-0.1,1962.14,2047.36,0.04,621.62,503.1,-0.19,100.0,150.0,0.5,47.0,75.0,0.6,434568154.95,488736635.24,0.12,57680191.75,41793755.71,-0.28
5,2018,STATE HOUSE/ASSEMBLY/SENATE,337749.0,228718.0,-0.32,63605.0,13071.0,-0.79,733.36,1230.39,0.68,642.38,2197.29,2.42,100.0,200.0,1.0,100.0,250.0,1.5,247690136.42,281412775.95,0.14,40858790.7,28720743.88,-0.3


## Export the data

In [17]:
%%notify
# The context manager saves and closes the workbook on exit (writer.save() was removed in pandas 2.0).
with pd.ExcelWriter("data/contributors_analysis.xlsx") as writer:
    contributions_by_party_office.to_excel(writer, sheet_name="contributions_by_party_office", index=False)
    contributors_by_party_office.to_excel(writer, sheet_name="contributors_by_party_office", index=False)
