### Import FBI Hate Crime Database
Download the zip file and read the CSV it contains into a pandas DataFrame.

Documentation:
    https://www.fbi.gov/services/cjis/ucr/hate-crime

Download Page:
    https://crime-data-explorer.fr.cloud.gov/downloads-and-docs

Download Link:
    http://s3-us-gov-west-1.amazonaws.com/cg-d4b776d0-d898-4153-90c8-8336f86bdfec/hate_crime_2017.zip

In [98]:
import io
import zipfile

import pandas as pd
import requests

download_link = 'http://s3-us-gov-west-1.amazonaws.com/cg-d4b776d0-d898-4153-90c8-8336f86bdfec/hate_crime_2017.zip'
file_name = 'hate_crime_2017.csv'

# Download the archive into memory and read the CSV inside it without writing to disk
r = requests.get(download_link)
r.raise_for_status()  # stop early if the download failed
z = zipfile.ZipFile(io.BytesIO(r.content))
df = pd.read_csv(z.open(file_name))
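
The same in-memory pattern can be tried without network access. The following is a minimal sketch, not the notebook's own code: the helper name `read_csv_from_zip_bytes`, the member name `sample.csv`, and the two-row data are all hypothetical stand-ins for the downloaded archive.

```python
import io
import zipfile

import pandas as pd


def read_csv_from_zip_bytes(zip_bytes, member_name):
    """Read one CSV member of a zip archive held in memory into a DataFrame."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member_name) as handle:
            return pd.read_csv(handle)


# Build a tiny archive in memory to stand in for the downloaded file
# (illustrative data only, not taken from the FBI dataset).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sample.csv", "INCIDENT_ID,BIAS_DESC\n1,Anti-White\n2,Anti-Jewish\n")

sample = read_csv_from_zip_bytes(buffer.getvalue(), "sample.csv")
print(sample.shape)
```

With a real download, `r.content` would take the place of `buffer.getvalue()`.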

In [99]:
df.shape

(194203, 28)

In [100]:
df.head()

Unnamed: 0,INCIDENT_ID,DATA_YEAR,ORI,PUB_AGENCY_NAME,PUB_AGENCY_UNIT,AGENCY_TYPE_NAME,STATE_ABBR,STATE_NAME,DIVISION_NAME,REGION_NAME,...,OFFENDER_RACE,OFFENDER_ETHNICITY,VICTIM_COUNT,OFFENSE_NAME,TOTAL_INDIVIDUAL_VICTIMS,LOCATION_NAME,BIAS_DESC,VICTIM_TYPES,MULTIPLE_OFFENSE,MULTIPLE_BIAS
0,3015,1991,AR0040200,Rogers,,City,AR,Arkansas,West South Central,South,...,White,,1,Intimidation,1.0,Highway/Road/Alley/Street/Sidewalk,Anti-Black or African American,Individual,S,S
1,3016,1991,AR0290100,Hope,,City,AR,Arkansas,West South Central,South,...,Black or African American,,1,Simple Assault,1.0,Highway/Road/Alley/Street/Sidewalk,Anti-White,Individual,S,S
2,43,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Black or African American,,1,Aggravated Assault,1.0,Residence/Home,Anti-Black or African American,Individual,S,S
3,44,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Black or African American,,2,Aggravated Assault;Destruction/Damage/Vandalis...,1.0,Highway/Road/Alley/Street/Sidewalk,Anti-White,Individual,M,S
4,3017,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Black or African American,,1,Aggravated Assault,1.0,Service/Gas Station,Anti-White,Individual,S,S


### Hate Crime Types
The FBI UCR Program’s Hate Crime Data Collection gathers data on the following biases:

In [101]:
Race_Ethnicity_Ancestry = [
    'American Indian or Alaska Native',
    'Arab',
    'Asian',
    'Black or African American',
    'Hispanic or Latino',
    'Multiple Races, Group',
    'Native Hawaiian or Other Pacific Islander',
    'Other Race/Ethnicity/Ancestry',
    'White'
]
Religion = [
    'Buddhist'
    ,'Catholic'
    ,'Eastern Orthodox (Russian, Greek, Other)'
    ,'Hindu'
    ,'Islamic'
    ,'Jehovah’s Witness'
    ,'Jewish'
    ,'Mormon'
    ,'Multiple Religions, Group'
    ,'Other Christian'
    ,'Other Religion'
    ,'Protestant'
    ,'Atheism/Agnosticism'
]

Sexual_Orientation = [
    'Bisexual'
    ,'Gay (Male)'
    ,'Heterosexual'
    ,'Lesbian'
    ,'Lesbian, Gay, Bisexual, or Transgender (Mixed Group)'
]

Disability = [
    'Mental Disability'
    ,'Physical Disability'
]

Gender = [
    'Male'
    ,'Female'
]

Gender_Identity = [
    'Transgender'
    ,'Gender Non-Conforming'
]
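
The `BIAS_DESC` values displayed in this notebook carry an `Anti-` prefix (for example `Anti-White`), while the lists above hold the bare group names. One way to bridge the two is a reverse lookup from the full label to its category. This is a sketch under the assumption that every label is exactly `Anti-` plus a list entry; the names `bias_categories` and `label_to_category` are hypothetical, and the lists are abbreviated for illustration.

```python
# Abbreviated copies of the bias lists, kept short for illustration.
bias_categories = {
    "Race_Ethnicity_Ancestry": ["Black or African American", "White"],
    "Religion": ["Jewish", "Protestant"],
    "Sexual_Orientation": ["Gay (Male)", "Lesbian"],
    "Gender": ["Male", "Female"],
}

# Reverse lookup from the full BIAS_DESC label to its category.
# Assumes each label is "Anti-" followed by the list entry.
label_to_category = {
    f"Anti-{bias}": category
    for category, biases in bias_categories.items()
    for bias in biases
}

print(label_to_category["Anti-Gay (Male)"])
```

Because the lookup uses whole labels, `Anti-Gay (Male)` maps only to `Sexual_Orientation` and is not confused with the `Male` entry under `Gender`.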

### Categorize Hate Crime Types
Convert each incident's bias description (`BIAS_DESC`) into higher-level bias categories, stored as 0/1 indicator columns.

In [102]:
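list_Is_type = ['Is_Race_Ethnicity_Ancestry','Is_Religion','Is_Sexual_Orientation','Is_Disability','Is_Gender','Is_Gender_Identity']
df_new = df.copy()  # work on a copy so the original df is left unchanged

# Add one indicator column per bias category, initialized to 0
for bias_type in list_Is_type:
    df_new[bias_type] = 0
df_new.head()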

Unnamed: 0,INCIDENT_ID,DATA_YEAR,ORI,PUB_AGENCY_NAME,PUB_AGENCY_UNIT,AGENCY_TYPE_NAME,STATE_ABBR,STATE_NAME,DIVISION_NAME,REGION_NAME,...,BIAS_DESC,VICTIM_TYPES,MULTIPLE_OFFENSE,MULTIPLE_BIAS,Is_Race_Ethnicity_Ancestry,Is_Religion,Is_Sexual_Orientation,Is_Disability,Is_Gender,Is_Gender_Identity
0,3015,1991,AR0040200,Rogers,,City,AR,Arkansas,West South Central,South,...,Anti-Black or African American,Individual,S,S,0,0,0,0,0,0
1,3016,1991,AR0290100,Hope,,City,AR,Arkansas,West South Central,South,...,Anti-White,Individual,S,S,0,0,0,0,0,0
2,43,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Anti-Black or African American,Individual,S,S,0,0,0,0,0,0
3,44,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Anti-White,Individual,M,S,0,0,0,0,0,0
4,3017,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Anti-White,Individual,S,S,0,0,0,0,0,0


In [103]:
df_new['BIAS_DESC']

0                            Anti-Black or African American
1                                                Anti-White
2                            Anti-Black or African American
3                                                Anti-White
4                                                Anti-White
                                ...                        
194198                       Anti-Black or African American
194199                                      Anti-Protestant
194200    Anti-Lesbian, Gay, Bisexual, or Transgender (M...
194201                Anti-American Indian or Alaska Native
194202                                      Anti-Gay (Male)
Name: BIAS_DESC, Length: 194203, dtype: object
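
Before categorizing, it can help to check which distinct `BIAS_DESC` values are not covered by any of the bias lists. The sketch below uses a small made-up Series in place of `df_new['BIAS_DESC']`; the `Anti-Unknown Label` value and the names `known_biases`, `known_labels`, and `unmatched` are hypothetical, and the `Anti-` prefix convention is assumed from the rows displayed above.

```python
import pandas as pd

# Illustrative stand-in for df_new['BIAS_DESC'].
bias_desc = pd.Series([
    "Anti-White",
    "Anti-Protestant",
    "Anti-White",
    "Anti-Gay (Male)",
    "Anti-Unknown Label",  # hypothetical value not covered by any list
])

known_biases = ["White", "Protestant", "Gay (Male)"]  # abbreviated
known_labels = {f"Anti-{bias}" for bias in known_biases}

# Frequency of each label, and the labels that no list accounts for.
counts = bias_desc.value_counts()
unmatched = sorted(set(bias_desc.unique()) - known_labels)

print(counts)
print(unmatched)
```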

In [113]:
list_Is_type

['Is_Race_Ethnicity_Ancestry',
 'Is_Religion',
 'Is_Sexual_Orientation',
 'Is_Disability',
 'Is_Gender',
 'Is_Gender_Identity']

In [124]:
# list_types is not defined in any cell shown; assumed to hold the bias lists in list_Is_type order
list_types = [Race_Ethnicity_Ancestry, Religion, Sexual_Orientation, Disability, Gender, Gender_Identity]

for row in range(df_new.shape[0]):
    for i in range(len(list_types)):
        for bias_type in list_types[i]:
            # Substring test: flag the category if any of its bias names occurs in the description
            if bias_type in df_new.BIAS_DESC[row]:
                # .loc assigns in place and avoids the chained-assignment SettingWithCopyWarning
                df_new.loc[row, list_Is_type[i]] = 1


In [125]:
df_new

Unnamed: 0,INCIDENT_ID,DATA_YEAR,ORI,PUB_AGENCY_NAME,PUB_AGENCY_UNIT,AGENCY_TYPE_NAME,STATE_ABBR,STATE_NAME,DIVISION_NAME,REGION_NAME,...,BIAS_DESC,VICTIM_TYPES,MULTIPLE_OFFENSE,MULTIPLE_BIAS,Is_Race_Ethnicity_Ancestry,Is_Religion,Is_Sexual_Orientation,Is_Disability,Is_Gender,Is_Gender_Identity
0,3015,1991,AR0040200,Rogers,,City,AR,Arkansas,West South Central,South,...,Anti-Black or African American,Individual,S,S,1,0,0,0,0,0
1,3016,1991,AR0290100,Hope,,City,AR,Arkansas,West South Central,South,...,Anti-White,Individual,S,S,1,0,0,0,0,0
2,43,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Anti-Black or African American,Individual,S,S,1,0,0,0,0,0
3,44,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Anti-White,Individual,M,S,1,0,0,0,0,0
4,3017,1991,AR0350100,Pine Bluff,,City,AR,Arkansas,West South Central,South,...,Anti-White,Individual,S,S,1,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
194198,481909,2017,WV0540100,Parkersburg,,City,WV,West Virginia,South Atlantic,South,...,Anti-Black or African American,Individual;Society/Public,M,S,1,0,0,0,0,0
194199,190213,2017,WY0010100,Laramie,,City,WY,Wyoming,Mountain,West,...,Anti-Protestant,Religious Organization,S,S,0,1,0,0,0,0
194200,193399,2017,WY0010100,Laramie,,City,WY,Wyoming,Mountain,West,...,"Anti-Lesbian, Gay, Bisexual, or Transgender (M...",Business,S,S,0,0,1,0,0,1
194201,194469,2017,WY0010200,University of Wyoming,,University or College,WY,Wyoming,Mountain,West,...,Anti-American Indian or Alaska Native,Individual,S,S,1,0,0,0,0,0
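
In the output above, row 194200 has both `Is_Sexual_Orientation` and `Is_Gender_Identity` set, because the substring test `bias_type in BIAS_DESC` finds `Transgender` inside the mixed-group label. A possible alternative is to match whole labels and let pandas do the iteration. This is a sketch only: it assumes multiple biases in one `BIAS_DESC` value are separated by `;` (as they appear to be in `OFFENSE_NAME` and `VICTIM_TYPES`), the sample rows are made up, and the lists are abbreviated.

```python
import pandas as pd

# Abbreviated category lists, keyed by the indicator column they feed.
bias_categories = {
    "Is_Sexual_Orientation": [
        "Gay (Male)",
        "Lesbian, Gay, Bisexual, or Transgender (Mixed Group)",
    ],
    "Is_Gender": ["Male", "Female"],
    "Is_Gender_Identity": ["Transgender", "Gender Non-Conforming"],
}

# Made-up rows for illustration.
sample = pd.DataFrame({
    "BIAS_DESC": [
        "Anti-Gay (Male)",
        "Anti-Lesbian, Gay, Bisexual, or Transgender (Mixed Group)",
        "Anti-Transgender",
        "Anti-Female;Anti-Gay (Male)",  # assumed ';' separator for multiple biases
    ]
})

# Split each description into its individual labels.
labels = sample["BIAS_DESC"].str.split(";")

for column, biases in bias_categories.items():
    wanted = {f"Anti-{bias}" for bias in biases}
    # 1 if any whole label belongs to this category, else 0.
    sample[column] = labels.apply(lambda parts: int(any(p in wanted for p in parts)))

print(sample)
```

With whole-label matching, `Anti-Gay (Male)` no longer sets `Is_Gender`, and the mixed-group label no longer sets `Is_Gender_Identity`. Whether that is the desired classification for the mixed-group label is a judgment call for the analysis.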


In [127]:
df_new.to_csv('hate_crime_auto_bool.csv')
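
The `Unnamed: 0` column in the loaded data is what pandas produces when it reads a CSV whose first column is an unnamed index. By default `to_csv` writes the index as such a column, so passing `index=False` may be preferable when the index carries no information. A minimal sketch with made-up data, writing to in-memory buffers rather than to disk:

```python
import io

import pandas as pd

# Made-up two-row frame for illustration.
frame = pd.DataFrame({"INCIDENT_ID": [3015, 3016], "Is_Religion": [0, 1]})

# Default: the index is written as an extra, unnamed first column.
with_index = io.StringIO()
frame.to_csv(with_index)
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()

# index=False: only the data columns are written.
without_index = io.StringIO()
frame.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = pd.read_csv(without_index).columns.tolist()

print(cols_with_index)
print(cols_without_index)
```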