<a href="https://colab.research.google.com/github/AdrianduPlessis/UBI_Effects/blob/master/ML_Project_Proposal.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Hidden Effects of a UBI

Universal Basic Income (UBI) is not a new idea. I myself was introduced to it when Elon Musk first mentioned it in an interview back in 2017. Recently one of the Democratic candidates, Andrew Yang, has been moving the idea closer to reality (on a massive scale, at least) than it has ever been. With more attention than ever on UBI, some interesting opinions are being aired about the social implications this massive economic disruption might have.  

While I was thinking through these issues myself, a lucky synapse fired and reminded me of the Gini Index. The Gini Index is essentially a metric of how unequally income is distributed within a certain geography, ranging from 0 (everyone has the same income) to 1 (one person has all of it). A flat UBI would lower the Gini Index, because it raises incomes at the bottom end by a far larger proportion than incomes at the top.
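
To make that claim concrete, here is a minimal sketch that computes a Gini coefficient from raw incomes and compares it before and after a flat payment. The five incomes and the payment amount are made up for illustration, and the payment is assumed to be unfunded (no offsetting tax), which is a simplification.

```python
import numpy as np

def gini(incomes):
    """Gini coefficient via the mean absolute difference: G = MAD / (2 * mean)."""
    x = np.asarray(incomes, dtype=float)
    # Mean of |x_i - x_j| over all ordered pairs (i, j)
    mad = np.abs(x[:, None] - x[None, :]).mean()
    return mad / (2 * x.mean())

# Hypothetical annual incomes for a tiny five-person "county"
incomes = np.array([10_000, 20_000, 40_000, 80_000, 250_000])
ubi = 12_000  # assumed flat annual payment, not a figure from this project

gini_before = gini(incomes)
gini_after = gini(incomes + ubi)

print(gini_before)  # before the UBI
print(gini_after)   # after the UBI: lower, because the gaps stay the same but the mean rises
```

Adding the same amount to every income leaves the pairwise gaps unchanged while raising the mean, so the ratio shrinks.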
 
'Alright, so what?' I can hear you thinking. So this: the only reason I remember anything at all from the obscure YouTube video that introduced me to the Gini Index is the powerful correlation that exists between the Gini Index and crime rates. It holds anywhere on the income spectrum: if you take two of the wealthiest neighborhoods in California, for example, and calculate the Gini Index for each of them, the neighborhood with the higher Gini Coefficient will (with some confidence) have the higher crime rate.  

'Neato, but so what?' Well, let me put it to you that, using per-county Gini Index, income, and crime rate data, a decent ML model should be able to predict the reduction in crime rates for a given UBI amount.  

**What's more:** that predicted reduction in crime rate could be translated into a reduced cost of incarceration (including the burden of legal fees) to the government, information that would be very useful to a hopeful candidate trying to explain the value of the idea that their political campaign is based on.

#Hyp: income inequality -> crime


Hypothesis A: There exists a statistically significant correlation between the Gini Coefficient (income inequality) and crime rates within a specific county/state.

Determine p-values for each individual county's Gini-to-crime correlation over time.  
Determine p-values for the Gini-to-crime correlation across counties for a given year.
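
One way such a p-value could be obtained is sketched below, using a permutation test around the Pearson correlation so that only NumPy is needed. This is a stand-in chosen for illustration, not necessarily the test this project will end up using. The function name `correlation_p_value` and the ten data points are hypothetical.

```python
import numpy as np

def correlation_p_value(x, y, n_permutations=2000, seed=0):
    """Pearson r plus a two-sided permutation p-value for 'no association'."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(x, y)[0, 1]
    hits = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real link between the two series
        shuffled_r = np.corrcoef(x, rng.permutation(y))[0, 1]
        if abs(shuffled_r) >= abs(observed):
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of exactly 0
    p_value = (hits + 1) / (n_permutations + 1)
    return observed, p_value

# Made-up values for ten counties in a single year (illustration only)
gini_values = np.array([0.38, 0.40, 0.42, 0.44, 0.45, 0.46, 0.47, 0.49, 0.51, 0.53])
crime_rates = np.array([210, 230, 260, 250, 300, 310, 305, 340, 380, 400])

r, p = correlation_p_value(gini_values, crime_rates)
print(r, p)
```

The same function would apply to the over-time question by passing one county's yearly Gini and crime-rate series instead of a cross-section of counties.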

##Tools of the trade

In [0]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

##Import Datasets

In [0]:
gini_5yr_est_2017 = pd.read_csv('https://raw.githubusercontent.com/AdrianduPlessis/UBI_Effects/master/data/gini_est_5yr_17.csv', skiprows=1)

In [80]:
crime_1 = pd.read_csv('https://raw.githubusercontent.com/AdrianduPlessis/UBI_Effects/master/data/ICPSR_36399/DS0001/36399-0001-Data.tsv', sep='\t')
crime_1.head()

Unnamed: 0,STUDYNO,EDITION,PART,IDNO,FIPS_ST,FIPS_CTY,CPOPARST,AG_ARRST,JURFLAG,COVIND,GRNDTOT,P1TOT,P1VLNT,P1PRPTY,MURDER,RAPE,ROBBERY,AGASSLT,BURGLRY,LARCENY,MVTHEFT,ARSON,OTHASLT,FRGYCNT,FRAUD,EMBEZL,STLNPRP,VANDLSM,WEAPONS,COMVICE,SEXOFF,DRUGTOT,DRGSALE,COCSALE,MJSALE,SYNSALE,OTHSALE,DRGPOSS,COCPOSS,MJPOSS,SYNPOSS,OTHPOSS,GAMBLE,BOOKMKG,NUMBERS,OTGAMBL,OFAGFAM,DUI,LIQUOR,DRUNK,DISORDR,VAGRANT,ALLOTHR,SUSPICN,CURFEW,RUNAWAY
0,9999,1,1,1,1,1,57217,3,1,0.0,974,289,23,266,1,1,10,12,10,252,3,0,125,5,17,4,10,4,8,0,1,63,0,0,0,0,0,63,8,31,1,23,0,0,0,0,0,84,21,68,12,3,260,0,0,0
1,9999,1,1,2,1,3,198843,14,0,10.4765,4480,807,93,714,1,2,27,62,45,650,16,3,464,11,60,6,45,17,25,0,15,105,0,0,0,0,0,104,13,52,1,38,0,0,0,0,5,427,87,239,36,6,2126,0,0,0
2,9999,1,1,3,1,5,27026,5,0,32.4354,435,133,15,119,0,0,6,8,12,105,1,0,67,2,7,2,4,2,3,0,0,26,0,0,0,0,0,26,3,13,0,9,0,0,0,0,0,35,9,28,6,1,110,0,0,0
3,9999,1,1,4,1,7,22491,4,1,27.9504,246,68,13,55,0,0,4,8,4,47,5,0,43,2,2,3,4,3,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,19,0,1,8,0,90,0,0,0
4,9999,1,1,5,1,9,58037,6,0,7.3561,804,79,3,76,0,0,0,3,1,72,2,0,56,0,23,0,0,8,4,0,0,2,0,0,0,0,0,2,0,1,0,1,0,0,0,0,2,52,51,23,4,0,499,0,0,0


In [0]:
#Import other crime datasets

In [82]:
gini_5yr_est_2017.head()

Unnamed: 0,Id,Id2,Geography,Estimate; Gini Index,Margin of Error; Gini Index
0,0500000US01001,1001,"Autauga County, Alabama",0.4501,0.0391
1,0500000US01003,1003,"Baldwin County, Alabama",0.4618,0.01
2,0500000US01005,1005,"Barbour County, Alabama",0.4622,0.0148
3,0500000US01007,1007,"Bibb County, Alabama",0.4518,0.0565
4,0500000US01009,1009,"Blount County, Alabama",0.4302,0.0175


In [0]:
def change_id_to_fips(geo_id):
  # The last five characters of the Census Id are the state + county FIPS code
  fips = int(geo_id[-5:])
  return fips

In [86]:
# Store the result; apply() returns a new Series and does not modify the dataframe
gini_5yr_est_2017['FIPS'] = gini_5yr_est_2017['Id'].apply(change_id_to_fips)
gini_5yr_est_2017.head()

Unnamed: 0,Id,Id2,Geography,Estimate; Gini Index,Margin of Error; Gini Index,FIPS
0,0500000US01001,1001,"Autauga County, Alabama",0.4501,0.0391,1001
1,0500000US01003,1003,"Baldwin County, Alabama",0.4618,0.01,1003
2,0500000US01005,1005,"Barbour County, Alabama",0.4622,0.0148,1005
3,0500000US01007,1007,"Bibb County, Alabama",0.4518,0.0565,1007
4,0500000US01009,1009,"Blount County, Alabama",0.4302,0.0175,1009


In [0]:
FIPS = pd.read_csv('https://raw.githubusercontent.com/AdrianduPlessis/UBI_Effects/master/data/all-geocodes-v2016.xlsx%20-%20Sheet1.csv', skiprows=4)
#State FIPS == 00 needs to be investigated
FIPS.head()


In [0]:
#Create dictionary to lookup State based on state FIPS
State_FIPS = FIPS[FIPS['County Code (FIPS)']==0]
State_FIPS = State_FIPS[State_FIPS['Summary Level']==40]

lookup_state_FIPS = dict(zip(State_FIPS['State Code (FIPS)'], State_FIPS['Area Name (including legal/statistical area description)']))

In [0]:
crime_1['state'] = crime_1['FIPS_ST'].map(lookup_state_FIPS)

In [0]:
def combine_cty_st_fips(cty, st):
  full_fips = st*1000 + cty
  return full_fips

In [0]:
FIPS['FIPS'] = combine_cty_st_fips(FIPS['County Code (FIPS)'], FIPS['State Code (FIPS)'])
crime_1['FIPS'] = combine_cty_st_fips(crime_1['FIPS_CTY'], crime_1['FIPS_ST'])
FIPS.tail()

In [0]:
crime_1.tail()
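
Before merging, it may be worth confirming that the two ways of building a FIPS key agree. The sketch below repeats the two helper functions from the cells above so that it runs on its own, and checks them on Baldwin County, Alabama, which appears in both tables. The zero-padded string form at the end is an optional extra, not something the notebook currently uses.

```python
def change_id_to_fips(geo_id):
    # Last five characters of the Census Id: 2-digit state + 3-digit county
    return int(geo_id[-5:])

def combine_cty_st_fips(cty, st):
    # Same key built from the separate integer codes in the crime data
    return st * 1000 + cty

census_id = '0500000US01003'     # Baldwin County, Alabama, from the Gini table
state_code, county_code = 1, 3   # FIPS_ST and FIPS_CTY for the same county in crime_1

fips_from_gini = change_id_to_fips(census_id)
fips_from_crime = combine_cty_st_fips(county_code, state_code)
assert fips_from_gini == fips_from_crime

# Zero-padded text form, handy if a key ever needs to match the original Census string
fips_text = f"{fips_from_crime:05d}"
print(fips_from_crime, fips_text)
```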

##Create crime rate summary dataframe
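
A possible shape for this summary is sketched below: raw counts are converted to rates per 100,000 people so that counties of different sizes become comparable. It assumes `CPOPARST` is the county population covered by the reporting agencies and that `P1VLNT` and `P1PRPTY` are violent and property counts. Those readings should be checked against the ICPSR codebook. The function name `summarize_crime` is hypothetical.

```python
import pandas as pd

# First three rows of crime_1, restricted to the columns used here
crime_sample = pd.DataFrame({
    'FIPS': [1001, 1003, 1005],
    'CPOPARST': [57217, 198843, 27026],  # assumed to be population (see codebook)
    'P1VLNT': [23, 93, 15],
    'P1PRPTY': [266, 714, 119],
})

def summarize_crime(df, per=100_000):
    """Return one row per county with violent and property rates per `per` people."""
    summary = df[['FIPS']].copy()
    summary['violent_rate'] = df['P1VLNT'] / df['CPOPARST'] * per
    summary['property_rate'] = df['P1PRPTY'] / df['CPOPARST'] * per
    return summary

crime_summary = summarize_crime(crime_sample)
print(crime_summary)
```

Counties with a population of zero would produce infinite or missing rates here, so they would need to be filtered out first on the full dataset.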

##Merge Datasets
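
The merge itself could look something like the sketch below, which joins on the integer `FIPS` key and also lists the counties that fail to match. The two small frames are stand-ins for the real tables. The Gini values come from the 2017 table above, the rates are rounded illustrative figures, and the column names `gini` and `violent_rate` are assumptions.

```python
import pandas as pd

gini_sample = pd.DataFrame({
    'FIPS': [1001, 1003, 1005],
    'gini': [0.4501, 0.4618, 0.4622],
})
crime_summary_sample = pd.DataFrame({
    'FIPS': [1001, 1003, 1007],
    'violent_rate': [40.2, 46.8, 57.8],  # rounded, for illustration
})

# Keep only counties present in both tables; validate= raises if a key is duplicated
merged = gini_sample.merge(
    crime_summary_sample, on='FIPS', how='inner', validate='one_to_one'
)

# An outer merge with indicator=True shows which counties were dropped, and from which side
audit = gini_sample.merge(crime_summary_sample, on='FIPS', how='outer', indicator=True)
unmatched = audit[audit['_merge'] != 'both']

print(merged)
print(unmatched[['FIPS', '_merge']])
```

Checking the unmatched rows is a cheap way to catch key problems, such as the state-level rows with county code 0 in the geocode file.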

##Predict crime on gini coefficient (Baseline)
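
As a starting point, the baseline could be a one-variable least-squares line fitted with `np.polyfit`. The sketch below uses synthetic data that lies exactly on an invented line, so the numbers say nothing about the real relationship. `predict_crime_rate` is a hypothetical helper name.

```python
import numpy as np

# Synthetic data: an invented linear relationship, for illustration only
gini_values = np.array([0.40, 0.42, 0.44, 0.46, 0.48])
crime_rates = 1000 * gini_values - 150

# Degree-1 polynomial fit returns (slope, intercept)
slope, intercept = np.polyfit(gini_values, crime_rates, deg=1)

def predict_crime_rate(gini):
    """Predicted crime rate for one Gini value or an array of them."""
    return slope * np.asarray(gini, dtype=float) + intercept

print(slope, intercept)
print(predict_crime_rate(0.45))
```

Any richer model added later can be judged against the error of this simple line.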

#Predict future gini values (optional)

#Apply UBI transform to Gini values
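
Only county-level Gini values are available here, not individual incomes, so the transform has to work from summary statistics. Under the simplifying assumption of a flat, unfunded payment $b$ to every person, the Gini coefficient scales as $G' = G \cdot \mu / (\mu + b)$, where $\mu$ is mean income. This follows because the Gini is the mean absolute income difference divided by $2\mu$, and adding a constant changes only $\mu$. A rough sketch follows. The mean incomes and the payment amount are hypothetical, and `gini_after_flat_ubi` is an illustrative name.

```python
import numpy as np

def gini_after_flat_ubi(gini, mean_income, ubi):
    """Gini after adding the same amount `ubi` to every income (no offsetting tax)."""
    gini = np.asarray(gini, dtype=float)
    mean_income = np.asarray(mean_income, dtype=float)
    return gini * mean_income / (mean_income + ubi)

county_gini = np.array([0.4501, 0.4618])         # Autauga and Baldwin, from the 2017 table
county_mean_income = np.array([60_000, 70_000])  # hypothetical per-person mean incomes
ubi = 12_000                                     # assumed annual payment

county_gini_ubi = gini_after_flat_ubi(county_gini, county_mean_income, ubi)
print(county_gini_ubi)
```

The formula needs a mean income measured on the same basis as the payment (per person or per household), which is one reason the income data has to be merged in alongside the Gini values.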

#Predict crime rate without UBI

#Predict crime rate with UBI

#Correlation between income metric and trend in crime rate?  
Is there a positive feedback loop?