# Do Campaign Contributions Lead to Gun Violence?
I'm going to critique an article posted by Jonathan Morgan: http://goodattheinternet.com/2016/06/12/campaign-contributions-to-gun-violence/. He claims that campaign contributions are a leading indicator of gun violence when you group the data by campaign cycle. His article had some nice graphs but was devoid of actual statistics, so I decided to check the claim. I ran some regressions and looked at t-scores, all in the name of seeing whether his story checks out. When I have some time, I'd like to improve upon this basic model.

CliffsNotes version: I find no evidence that campaign contributions by the gun industry lead mass shooting deaths. In fact, I find the exact opposite: the number of deaths leads gun industry donations. This suggests that the gun industry is simply playing defense.

In [1]:
import pandas
import statsmodels.api as sm
import numpy as np
import matplotlib.pyplot as plt

First of all, let's load and clean the data. I'm using the same data sources as Mr. Morgan. Where we are going with this is a Granger causality study. Later on, I'd like to test the variables for cointegration and, if it exists, build an error correction model to understand the long-run and short-run dynamics between them. But for now, because I'm doing this on a Monday night after my kid went to sleep, I'm limiting this to a very simplistic Granger causality study.

In [3]:
df=pandas.read_csv('c:\\users\\ryan\\desktop\\shootings.csv')
df.head()

Unnamed: 0,in,Location,Date,Year,Summary,Fatalities,Wounded,Total victims,Venue,Prior signs of possible mental illness,...,Where obtained,Type of weapons,Weapon details,Race,Gender,Sources,Mental Health Sources,latitude,longitude,Type
0,Orlando nightclub massacre,"Orlando, Florida",6/12/2016,2016,"Omar Mateen, 29, attacked the Pulse nighclub i...",(pending),(pending),(pending),Other,,...,"Shooting center in Port St. Lucie, Florida","Semiautomatic rifle, semiautomatic handgun",Sig Sauer MCX rifle; Glock 17 9mm,,,http://www.motherjones.com/politics/2016/06/as...,,,,
1,Excel Industries mass shooting,"Hesston, Kansas",2/25/2016,2016,"Cedric L. Ford, who worked as a painter at a m...",3,14,17,Workplace,Unclear,...,,"Semiautomatic rifle, semiautomatic handgun",AK-47,Black,M,http://www.nytimes.com/2016/02/26/us/shooting-...,,,,Spree
2,Kalamazoo shooting spree,"Kalamazoo County, Michigan",2/20/2016,2016,"Jason B. Dalton, a driver for Uber, apparently...",6,2,8,Other,Unclear,...,,Semiautomatic handgun,9 mm handgun,White,M,http://www.nytimes.com/2016/02/22/us/kalamazoo...,,,,Spree
3,San Bernardino mass shooting,"San Bernardino, California",12/2/2015,2015,Syed Rizwan Farook left a Christmas party held...,14,21,35,\nWorkplace,Unclear,...,The suspects purchased their handguns in the U...,Two assault rifles and two semi-automatic pist...,Two semiautomatic AR-15-style rifles—one a DPM...,Other,Male & Female,http://www.motherjones.com/mojo/2015/12/san-be...,,,,Mass
4,Planned Parenthood clinic,"Colorado Springs, Colorado",11/27/2015,2015,"Robert Lewis Dear, 57, shot and killed a polic...",3,9,12,Workplace,Unclear,...,Unclear,Long gun,Reportedly an AK-47 style semiautomatic rifle ...,White,M,http://www.nytimes.com/2015/11/28/us/colorado-...,,,,Mass


In [14]:
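# Skip row 0 (Orlando): its fatality count is still "(pending)" and cannot be parsed as an int.
df=df.loc[1:,['Year','Fatalities']]
df['Deaths']=[int(obj) for obj in df['Fatalities']]
# Total deaths per calendar year.
deaths=df.groupby('Year').sum()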

In [23]:
deaths['mod']=[obj%2 for obj in deaths.index]
deaths['cycle']=deaths['mod']+deaths.index
death_cycle=deaths.groupby('cycle').sum()

In [22]:
df2=pandas.read_csv('c:\\users\\ryan\\desktop\\contributions.csv')
df2.head()

Unnamed: 0,year,contribution
0,1998,4498393
1,1999,5891966
2,2000,6710758
3,2001,6236161
4,2002,5684546


In [24]:
df2['mod']=[obj%2 for obj in df2['year']]
df2['cycle']=df2['year']+df2['mod']
contribution_cycle=df2.groupby('cycle').sum()
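
The `mod`/`cycle` trick used in the two cells above can be sketched in isolation. The helper name `campaign_cycle` is my own, introduced only for illustration; the arithmetic mirrors `year + year % 2` from the notebook.

```python
def campaign_cycle(year):
    # Even years map to themselves; odd years roll forward to the next even year.
    return year + year % 2

# The first five years shown in the contributions data.
years = [1998, 1999, 2000, 2001, 2002]
cycles = [campaign_cycle(y) for y in years]
print(cycles)  # [1998, 2000, 2000, 2002, 2002]
```

So 1999 and 2000 end up summed into the 2000 cycle, 2001 and 2002 into the 2002 cycle, and so on.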

In [29]:
full_df=pandas.concat([contribution_cycle,death_cycle],axis=1)
dfa=full_df[['contribution','Deaths']].dropna()

Now we have a nice clean dataset to work with. Let's perform a simple Granger causality study on it. That's where we regress one variable against the lag of the other, then reverse the process and regress the other variable against the lag of the first. Doing this lets you establish whether the two variables jointly determine each other, whether one variable Granger-causes the other, or whether they have no relationship. In essence, we are asking whether one variable always comes first.
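
To make the idea concrete before running it on the real data, here is a minimal pure-Python sketch of the same lag-and-reverse regression on made-up numbers. The helper `lagged_ols` and the toy series `a` and `b` are illustrative assumptions, not part of the analysis; `b` is built to follow `a` by one period, so only the regression of `b` on lagged `a` should show a large t-statistic.

```python
import math

def lagged_ols(y, x):
    """Regress y[t] on x[t-1] plus a constant; return (slope, t_stat)."""
    ys = y[1:]    # current-period values of the dependent variable
    xs = x[:-1]   # previous-period values of the regressor
    n = float(len(ys))
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance uses n - 2 degrees of freedom (slope and constant).
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(xs, ys))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err

# Made-up data: b[t] is roughly 2 * a[t-1]; a does not depend on b.
a = [5, 1, 4, 9, 2, 6, 8, 3, 7, 10]
noise = [0.5, -0.5, 0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4]
b = [10.0] + [2 * v + e for v, e in zip(a[:-1], noise)]  # b[0] is an arbitrary start value

slope_ab, t_ab = lagged_ols(b, a)  # does lagged a explain b?
slope_ba, t_ba = lagged_ols(a, b)  # does lagged b explain a?
print("b on lag(a): slope=%.3f, t=%.2f" % (slope_ab, t_ab))
print("a on lag(b): slope=%.3f, t=%.2f" % (slope_ba, t_ba))
```

A large t-statistic in one direction and a small one in the other is the "one variable comes first" pattern. The real regressions use `sm.OLS` instead of this hand-rolled helper.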

In [39]:
dfa['const']=1
dfa['lag']=dfa['contribution'].shift(1)
dfa=dfa.dropna()
y=dfa['Deaths']
X=dfa[['lag','const']]
results=sm.OLS(y,X).fit()
print(results.summary())

                            OLS Regression Results                            
Dep. Variable:                 Deaths   R-squared:                       0.002
Model:                            OLS   Adj. R-squared:                 -0.140
Method:                 Least Squares   F-statistic:                   0.01495
Date:                Mon, 13 Jun 2016   Prob (F-statistic):              0.906
Time:                        21:45:26   Log-Likelihood:                -41.828
No. Observations:                   9   AIC:                             87.66
Df Residuals:                       7   BIC:                             88.05
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
lag        -1.903e-07   1.56e-06     -0.122      0.9


In [40]:
full_df=pandas.concat([contribution_cycle,death_cycle],axis=1)
dfa=full_df[['contribution','Deaths']].dropna()
dfa['const']=1
dfa['lag']=dfa['Deaths'].shift(1)
dfa=dfa.dropna()
y=dfa['contribution']
X=dfa[['lag','const']]
results=sm.OLS(y,X).fit()
print(results.summary())

                            OLS Regression Results                            
Dep. Variable:           contribution   R-squared:                       0.574
Model:                            OLS   Adj. R-squared:                  0.513
Method:                 Least Squares   F-statistic:                     9.413
Date:                Mon, 13 Jun 2016   Prob (F-statistic):             0.0181
Time:                        21:54:19   Log-Likelihood:                -148.86
No. Observations:                   9   AIC:                             301.7
Df Residuals:                       7   BIC:                             302.1
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
lag         1.636e+05   5.33e+04      3.068      0.0

So let's interpret this really quickly. Campaign contributions do **not** Granger-cause deaths in mass shootings (t = -0.122 on lagged contributions, F-test p = 0.906). What we do find is that deaths in mass shootings **do** Granger-cause campaign contributions (t = 3.068 on lagged deaths, F-test p = 0.0181). To me it looks like the campaign contributions are defensive, which makes sense. My version of the story goes like this: a mass shooting occurs, politicians start talking about gun control, and the gun industry steps up its game and starts playing nice with the politicians, donating money to campaigns to stop the gun control legislation. In fact, the coefficient on the lag of deaths indicates that the gun industry spends about an additional $163,600 in the following cycle per mass shooting death.