# US SEX CRIMES
##### An investigation of household composition types and reported sex crimes in nine US cities

## 2.0 Crime
### Isolating crime data for 2015-2019 and differentiating between sex crimes and other crimes in all cities observed

### Data Sources
##### City Crime
##### Baltimore: https://data.baltimorecity.gov/Public-Safety/BPD-Part-1-Victim-Based-Crime-Data/wsfq-mvij  
##### Chicago (2001-Present): https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present-Dashboard/5cd6-ry5g
##### DC: 
#####    - DC 2015: https://opendata.dc.gov/datasets/crime-incidents-in-2015
#####    - DC 2016: https://opendata.dc.gov/datasets/crime-incidents-in-2016
#####    - DC 2017: https://opendata.dc.gov/datasets/crime-incidents-in-2017
#####    - DC 2018: https://opendata.dc.gov/datasets/crime-incidents-in-2018
#####    - DC 2019: https://opendata.dc.gov/datasets/crime-incidents-in-2019/data?page=3242
##### Detroit: https://data.detroitmi.gov/datasets/rms-crime-incidents?geometry=-84.117%2C42.175%2C-82.081%2C42.530&selectedAttribute=year
##### LA (2010-Present): https://data.lacity.org/A-Safe-City/Crime-Data-from-2010-to-Present/63jg-8b9z/data
##### Minneapolis: 
#####            - Minneapolis 2015: http://opendata.minneapolismn.gov/datasets/police-incidents-2015?geometry=-109.946%2C-5.468%2C16.617%2C48.789
#####            - Minneapolis 2016: http://opendata.minneapolismn.gov/datasets/police-incidents-2016/data?geometry=-109.946%2C-5.468%2C16.617%2C48.789
#####            - Minneapolis 2017: http://opendata.minneapolismn.gov/datasets/police-incidents-2017/data?geometry=-109.946%2C-5.468%2C16.617%2C48.789
#####            - Minneapolis 2018 & 2019: http://opendata.minneapolismn.gov/datasets/police-incidents-last-2years
##### Nashville: 
#####          - Nashville 2015: https://data.nashville.gov/Police/Metro-Nashville-Police-Department-Incidents-2015-/ce74-dvvv
#####          - Nashville 2016: https://data.nashville.gov/Police/Metro-Nashville-Police-Department-Incidents-2016-/tpvn-3k6v
#####          - Nashville 2017: https://data.nashville.gov/Police/Metro-Nashville-Police-Department-Incidents-2017-/ei8z-vngg
#####          - Nashville 2018: https://data.nashville.gov/Police/Metro-Nashville-Police-Department-Incidents-2018-/we5n-wkcf
#####          - Nashville 2019: https://data.nashville.gov/Police/Metro-Nashville-Police-Department-Incidents-2019-/a88c-cc2y
##### Philadelphia (2006-Present): https://www.opendataphilly.org/dataset/crime-incidents
##### SF:
#####    - SF (2003-May 2018): https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-Historical-2003/tmnf-yvry
#####    - SF (2018-Present): https://data.sfgov.org/Public-Safety/Police-Department-Incident-Reports-2018-to-Present/wg3w-h783

##### Process:
###### 1. Read in crime data for each city
###### 2. Isolate each city for crime data from 2015-2019 (i.e. 2019 being "present")
###### 3. Drop unnecessary columns (select only description of crime, latitude, longitude, and year)
###### 4. Rename columns
###### 5. Drop NA values for location (i.e. latitude and longitude) and location values of zero (0)
###### 6. Create a list of all city dataframes, 'cities_ls'
###### 7. Create a list, 'sex_str', of all sex-related crime descriptions in all cities
###### 8. Create columns of "total_crime" and "sex_crime" for cities_ls
###### 9. Convert crime dfs to geospatial objects
###### 10. Write crime dfs to shp and export to data/interim/crime/
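
Steps 2 through 5 repeat for every city, so they could be wrapped in a single helper. The sketch below is a hypothetical refactoring, not the notebook's own code: `prep_city` and its parameters are assumed names, and the toy frame only mimics Baltimore's column layout with made-up values.

```python
import pandas as pd

def prep_city(df, year_col, desc_col, lat_col, lon_col, min_year=2015):
    """Hypothetical helper covering process steps 2-5 for one city."""
    # Steps 3-4: keep only the needed columns and give them common names
    out = df[[year_col, desc_col, lat_col, lon_col]].copy()
    out.columns = ['year', 'description', 'lat', 'lon']
    # Step 2: keep rows from min_year onward
    out['year'] = pd.to_numeric(out['year'], errors='coerce')
    out = out[out['year'] >= min_year]
    # Step 5: drop missing and zero-valued coordinates
    out = out.dropna(subset=['lat', 'lon'])
    out = out[(out['lat'] != 0) & (out['lon'] != 0)]
    return out.reset_index(drop=True)

# Toy data shaped like the Baltimore extract (values are invented)
toy = pd.DataFrame({
    'year': ['2014', '2015', '2016', '2017'],
    'Description': ['RAPE', 'LARCENY', 'RAPE', 'ROBBERY'],
    'Latitude': [39.29, 39.30, None, 0.0],
    'Longitude': [-76.61, -76.62, -76.60, -76.63],
    'Post': [1, 2, 3, 4],  # an unneeded column that gets dropped
})
clean = prep_city(toy, 'year', 'Description', 'Latitude', 'Longitude')
print(clean)
```

Only the 2015 row survives here: the 2014 row fails the year filter, and the other two have a missing or zero latitude.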

In [1]:
# Call libraries
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import os

In [2]:
# Change working directory to a specified directory
os.chdir('../')
print("Directory Changes")

# Get current working directory
cwd = os.getcwd()
print("Current working directory is:", cwd)

Directory Changes
Current working directory is: C:\Users\Carol\Documents\CUSP\US_SexCrime


### 2.1 Read in Crime Data

In [3]:
# Read in crime data for each city
# Baltimore
baltimore = pd.read_csv("data/raw/crime/Baltimore/BPD_Part_1_Victim_Based_Crime_Data.csv")

# Chicago
chicago = pd.read_csv("data/raw/crime/Chicago/Crimes_-_2001_to_present.csv")

# DC
dc_2015 = pd.read_csv("data/raw/crime/DC/Crime_Incidents_in_2015.csv")
dc_2016 = pd.read_csv("data/raw/crime/DC/Crime_Incidents_in_2016.csv")
dc_2017 = pd.read_csv("data/raw/crime/DC/Crime_Incidents_in_2017.csv")
dc_2018 = pd.read_csv("data/raw/crime/DC/Crime_Incidents_in_2018.csv")
dc_2019 = pd.read_csv("data/raw/crime/DC/Crime_Incidents_in_2019.csv")
                      
# Detroit
detroit = pd.read_csv("data/raw/crime/Detroit/RMS_Crime_Incidents.csv")

# LA
la = pd.read_csv("data/raw/crime/LA/crime_Data_from_2010_to_Present.csv")

# Minneapolis
minneapolis_2015 = pd.read_csv("data/raw/crime/Minneapolis/Police_Incidents_2015.csv")
minneapolis_2016 = pd.read_csv("data/raw/crime/Minneapolis/Police_Incidents_2016.csv")
minneapolis_2017 = pd.read_csv("data/raw/crime/Minneapolis/Police_Incidents_2017.csv")
minneapolis_2018_2019 = pd.read_csv("data/raw/crime/Minneapolis/Police_Incidents_Last_2Years.csv")

# Nashville
nashville_2015 = pd.read_csv("data/raw/crime/Nashville/Metro_Nashville_Police_Department_Incidents__2015_.csv")
nashville_2016 = pd.read_csv("data/raw/crime/Nashville/Metro_Nashville_Police_Department_Incidents__2016_.csv")          
nashville_2017 = pd.read_csv("data/raw/crime/Nashville/Metro_Nashville_Police_Department_Incidents__2017_.csv")          
nashville_2018 = pd.read_csv("data/raw/crime/Nashville/Metro_Nashville_Police_Department_Incidents__2018_.csv")          
nashville_2019 = pd.read_csv("data/raw/crime/Nashville/Metro_Nashville_Police_Department_Incidents__2019_.csv")          

# Philadelphia
philadelphia = pd.read_csv("data/raw/crime/Philadelphia/incidents_part1_part2.csv")

# SF
sf_2003_2018 = pd.read_csv("data/raw/crime/SF/Police_Department_Incident_Reports__Historical_2003_to_May_2018.csv")
sf_2018_present = pd.read_csv("data/raw/crime/SF/Police_Department_Incident_Reports__2018_to_Present.csv")
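
Several cities ship one file per year, and a copy-paste slip in a file name is easy to miss. A hedged alternative is to build the paths in a loop and pass `low_memory=False`, which makes pandas infer each column's type from the whole file instead of chunk by chunk (a common source of mixed-type warnings on large CSVs). The helper name `read_yearly` and the throwaway files below are assumptions for illustration only; the real files live under data/raw/crime/.

```python
import os
import tempfile
import pandas as pd

def read_yearly(folder, template, years):
    """Hypothetical helper: read one CSV per year into a dict keyed by year."""
    frames = {}
    for year in years:
        path = os.path.join(folder, template.format(year=year))
        # low_memory=False parses the file in one pass so dtypes are inferred consistently
        frames[year] = pd.read_csv(path, low_memory=False)
    return frames

# Demonstration on temporary files whose names mimic the Minneapolis pattern
with tempfile.TemporaryDirectory() as tmp:
    for year in (2015, 2016, 2017):
        sample = pd.DataFrame({'ReportedDate': [f'{year}/01/01'],
                               'Description': ['THEFT']})
        sample.to_csv(os.path.join(tmp, f'Police_Incidents_{year}.csv'), index=False)
    frames = read_yearly(tmp, 'Police_Incidents_{year}.csv', (2015, 2016, 2017))

# Each frame should carry its own year, confirming no file was read twice
years_read = {y: df['ReportedDate'].str[0:4].iloc[0] for y, df in frames.items()}
print(years_read)
```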



### 2.2 Prepare Crime Data per City

In [4]:
# Prepping crime data per city
# Steps:
# Step 1) Isolate each city for crime data from 2015-2019 (i.e. 2019 being "present")
# Step 2) Drop unnecessary columns (select only description of crime, latitude, longitude, and year)
# Step 3) Rename columns
# Step 4) Drop NA values for location (i.e. latitude and longitude) and location values of zero (0)

# Create list 'col_names' with new column names for year, crime description, latitude, and longitude
col_names = ['year','description', 'lat','lon']
col_names_chicago = ['year','description', 'description2', 'lat','lon']

# Baltimore
# Step 1:
# Extract 'year' from 'CrimeDate'
baltimore['year'] = baltimore['CrimeDate'].str[6:10]
# Convert 'year' to int64 and keep rows from 2015 onward (>= 2015)
baltimore = baltimore[baltimore['year'].astype('int64') >= 2015]
# Step 2: Select only necessary columns
baltimore = baltimore[['year', 'Description', 'Latitude', 'Longitude']]
# Step 3: Rename columns
baltimore.columns = col_names
# Step 4: Drop NA and zero values for location
baltimore = baltimore[baltimore['lat'].notnull()]
baltimore = baltimore[baltimore['lon'].notnull()]
baltimore = baltimore[baltimore['lat']!=0]
baltimore = baltimore[baltimore['lon']!=0]

# Chicago
# Step 1:
# Convert 'year' to int64 and keep rows from 2015 onward (>= 2015)
chicago = chicago[chicago['Year'].astype('int64') >= 2015]
# Step 2: Select only necessary columns
chicago = chicago[['Year', 'Primary Type', 'Description', 'Latitude', 'Longitude']]
# Step 3: Rename columns
chicago.columns = col_names_chicago
# Step 4: Drop NA and zero values for location
chicago = chicago[chicago['lat'].notnull()]
chicago = chicago[chicago['lon'].notnull()]
chicago = chicago[chicago['lat']!=0]
chicago = chicago[chicago['lon']!=0]


# DC
# Step 1:
# Merge dc_2015, dc_2016, dc_2017, dc_2018, and dc_2019 data
dc = pd.concat([dc_2015, dc_2016, dc_2017, dc_2018, dc_2019])
# Extract 'year' from 'REPORT_DAT'
dc['year']=dc['REPORT_DAT'].str[0:4]
# Step 2: Select only necessary columns
dc = dc[['year', 'OFFENSE', 'LATITUDE', 'LONGITUDE']]
# Step 3: Rename columns
dc.columns = col_names
# Step 4: Drop NA and zero values for location
dc = dc[dc['lat'].notnull()]
dc = dc[dc['lon'].notnull()]
dc = dc[dc['lat']!=0]
dc = dc[dc['lon']!=0]


# Detroit
# Step 1:
# Convert 'year' to int64 and keep rows from 2015 onward (>= 2015)
detroit = detroit[detroit['year'].astype('int64') >= 2015]
# Step 2: Select only necessary columns
detroit = detroit[['year', 'offense_description', 'latitude', 'longitude']]
# Step 3: Rename columns
detroit.columns = col_names
# Step 4: Drop NA and zero values for location
detroit = detroit[detroit['lat'].notnull()]
detroit = detroit[detroit['lon'].notnull()]
detroit = detroit[detroit['lat']!=0]
detroit = detroit[detroit['lon']!=0]


# LA
# Step 1:
# Extract 'year' from 'DATE OCC'
la['year']=la['DATE OCC'].str[6:10]
# Convert 'year' to int64 and keep rows from 2015 onward (>= 2015)
la = la[la['year'].astype('int64') >= 2015]
# Step 2: Select only necessary columns
la = la[['year', 'Crm Cd Desc', 'LAT', 'LON']]
# Step 3: Rename columns
la.columns = col_names
# Step 4: Drop NA and zero values for location
la = la[la['lat'].notnull()]
la = la[la['lon'].notnull()]
la = la[la['lat']!=0]
la = la[la['lon']!=0]


# Minneapolis
# Minneapolis_2015
# Step 1:
# Extract 'year' from 'ReportedDate'
minneapolis_2015['year']=minneapolis_2015['ReportedDate'].str[0:4]
# Step 2: Select only necessary columns
minneapolis_2015 = minneapolis_2015[['year', 'Description', 'Lat', 'Long']]
# Step 3: Rename columns
minneapolis_2015.columns = col_names
# Step 4: Drop NA and zero values for location
minneapolis_2015 = minneapolis_2015[minneapolis_2015['lat'].notnull()]
minneapolis_2015 = minneapolis_2015[minneapolis_2015['lon'].notnull()]
minneapolis_2015 = minneapolis_2015[minneapolis_2015['lat']!=0]
minneapolis_2015 = minneapolis_2015[minneapolis_2015['lon']!=0]

# Minneapolis_2016
# Step 1:
# Extract 'year' from 'ReportedDate'
minneapolis_2016['year']=minneapolis_2016['ReportedDate'].str[0:4]
# Step 2: Select only necessary columns
minneapolis_2016 = minneapolis_2016[['year', 'Description', 'Lat', 'Long']]
# Step 3: Rename columns
minneapolis_2016.columns = col_names
# Step 4: Drop NA and zero values for location
minneapolis_2016 = minneapolis_2016[minneapolis_2016['lat'].notnull()]
minneapolis_2016 = minneapolis_2016[minneapolis_2016['lon'].notnull()]
minneapolis_2016 = minneapolis_2016[minneapolis_2016['lat']!=0]
minneapolis_2016 = minneapolis_2016[minneapolis_2016['lon']!=0]

# Minneapolis_2017
# Step 1:
# Extract 'year' from 'ReportedDate'
minneapolis_2017['year']=minneapolis_2017['ReportedDate'].str[0:4]
# Step 2: Select only necessary columns
minneapolis_2017 = minneapolis_2017[['year', 'Description', 'Lat', 'Long']]
# Step 3: Rename columns
minneapolis_2017.columns = col_names
# Step 4: Drop NA and zero values for location
minneapolis_2017 = minneapolis_2017[minneapolis_2017['lat'].notnull()]
minneapolis_2017 = minneapolis_2017[minneapolis_2017['lon'].notnull()]
minneapolis_2017 = minneapolis_2017[minneapolis_2017['lat']!=0]
minneapolis_2017 = minneapolis_2017[minneapolis_2017['lon']!=0]

# Minneapolis_2018_2019
# Step 1:
# Extract 'year' from 'reportedDate'
minneapolis_2018_2019['year']=minneapolis_2018_2019['reportedDate'].str[0:4]
# Step 2: Select only necessary columns
minneapolis_2018_2019 = minneapolis_2018_2019[['year', 'description', 'centerLat', 'centerLong']]
# Step 3: Rename columns
minneapolis_2018_2019.columns = col_names
# Step 4: Drop NA and zero values for location
minneapolis_2018_2019 = minneapolis_2018_2019[minneapolis_2018_2019['lat'].notnull()]
minneapolis_2018_2019 = minneapolis_2018_2019[minneapolis_2018_2019['lon'].notnull()]
minneapolis_2018_2019 = minneapolis_2018_2019[minneapolis_2018_2019['lat']!=0]
minneapolis_2018_2019 = minneapolis_2018_2019[minneapolis_2018_2019['lon']!=0]

# Merge minneapolis_2015, minneapolis_2016, minneapolis_2017, and minneapolis_2018_2019
minneapolis = pd.concat([minneapolis_2015, minneapolis_2016, minneapolis_2017,minneapolis_2018_2019])


# Nashville
# Step 1:
# Merge nashville_2015, nashville_2016, nashville_2017, nashville_2018, and nashville_2019
nashville = pd.concat([nashville_2015, nashville_2016, nashville_2017, nashville_2018, nashville_2019])
# Extract 'year' from 'Incident Reported'
nashville['year']=nashville['Incident Reported'].str[6:10]
# Step 2: Select only necessary columns
nashville = nashville[['year', 'Offense Description', 'Latitude', 'Longitude']]
# Step 3: Rename columns
nashville.columns = col_names
# Step 4: Drop NA and zero values for location
nashville = nashville[nashville['lat'].notnull()]
nashville = nashville[nashville['lon'].notnull()]
nashville = nashville[nashville['lat']!=0]
nashville = nashville[nashville['lon']!=0]

# Philadelphia
# Step 1:
# Extract 'year' from 'dispatch_date'
philadelphia['year']=philadelphia['dispatch_date'].str[0:4]
# Convert 'year' to int64 and keep rows from 2015 onward (>= 2015)
philadelphia = philadelphia[philadelphia['year'].astype('int64') >= 2015]
# Step 2: Select only necessary columns
philadelphia = philadelphia[['year', 'text_general_code', 'point_y', 'point_x']]
# Step 3: Rename columns
philadelphia.columns = col_names
# Step 4: Drop NA and zero values for location
philadelphia = philadelphia[philadelphia['lat'].notnull()]
philadelphia = philadelphia[philadelphia['lon'].notnull()]
philadelphia = philadelphia[philadelphia['lat']!=0]
philadelphia = philadelphia[philadelphia['lon']!=0]


# SF
# sf_2003_2018
# Step 1:
# Extract 'year' from 'Date'
sf_2003_2018['year']=sf_2003_2018['Date'].str[6:10]
# Drop NA values for year
sf_2003_2018 = sf_2003_2018[sf_2003_2018['year'].notnull()]
# Convert 'year' to int64 and keep rows from 2015 onward (>= 2015)
sf_2003_2018 = sf_2003_2018[sf_2003_2018['year'].astype('int64') >= 2015]
# Step 2: Select only necessary columns
sf_2003_2018 = sf_2003_2018[['year', 'Category', 'Y', 'X']]
# Step 3: Rename columns
sf_2003_2018.columns = col_names
# Step 4: Drop NA and zero values for location
sf_2003_2018 = sf_2003_2018[sf_2003_2018['lat'].notnull()]
sf_2003_2018 = sf_2003_2018[sf_2003_2018['lon'].notnull()]
sf_2003_2018 = sf_2003_2018[sf_2003_2018['lat']!=0]
sf_2003_2018 = sf_2003_2018[sf_2003_2018['lon']!=0]

# sf_2018_present
# Step 2: Select only necessary columns
sf_2018_present = sf_2018_present[['Incident Year', 'Incident Category', 'Latitude', 'Longitude']]
# Step 3: Rename columns
sf_2018_present.columns = col_names
# Step 4: Drop NA values and zero for location
sf_2018_present = sf_2018_present[sf_2018_present['lat'].notnull()]
sf_2018_present = sf_2018_present[sf_2018_present['lon'].notnull()]
sf_2018_present = sf_2018_present[sf_2018_present['lat']!=0]
sf_2018_present = sf_2018_present[sf_2018_present['lon']!=0]

# Merge sf_2003_2018 and sf_2018_present
sf = pd.concat([sf_2003_2018, sf_2018_present])
# Convert 'lat' to a numeric (float) type
sf['lat']= pd.to_numeric(sf['lat']).astype(float)
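
The per-city blocks leave `year` as a string in some frames and as an integer in others, and they apply only a lower bound. A minimal sketch of normalizing the column and enforcing the full 2015-2019 window follows; `clip_years` is a hypothetical name and the sample values are invented.

```python
import pandas as pd

def clip_years(df, first=2015, last=2019):
    """Hypothetical helper: coerce 'year' to integers and keep first..last inclusive."""
    out = df.copy()
    # Unparseable or missing years become NaN instead of raising
    out['year'] = pd.to_numeric(out['year'], errors='coerce')
    # between() is inclusive on both ends; NaN rows evaluate to False and drop out
    out = out[out['year'].between(first, last)].copy()
    out['year'] = out['year'].astype('int64')
    return out

mixed = pd.DataFrame({'year': ['2014', 2015, '2019', 2020, None, 'n/a'],
                      'description': list('abcdef')})
kept = clip_years(mixed)
print(kept['year'].tolist())   # [2015, 2019]
```

Applying something like this to every frame in the city list would also make the later shapefile attribute types consistent across cities.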

In [5]:
# Create list of all cities
cities_ls = [baltimore, chicago, dc, detroit, la, minneapolis, nashville, philadelphia, sf]

In [29]:
# Create a list of all sex crimes from all sex crime related descriptions in all cities

# Baltimore
# 'RAPE'

# Chicago
# 'SEX OFFENSE', 'CRIM SEXUAL ASSAULT', 'AGG CRIM SEX ABUSE FAM MEMBER', 'AGG CRIMINAL SEXUAL ABUSE','AGG SEX ASSLT OF CHILD FAM MBR', 'ATT AGG CRIM SEXUAL ABUSE',
# 'ATT AGG CRIMINAL SEXUAL ABUSE', 'ATT CRIM SEXUAL ABUSE',
# 'COMMERCIAL SEX ACTS', 'CRIM SEX ABUSE BY FAM MEMBER',
# 'CRIMINAL SEXUAL ABUSE',
# 'NON-CONSENSUAL DISSEMINATION PRIVATE SEXUAL IMAGES',
# 'SEX ASSLT OF CHILD BY FAM MBR', 'SEX OFFENDER: FAIL REG NEW ADD',
# 'SEX OFFENDER: FAIL TO REGISTER', 'SEX OFFENDER: PROHIBITED ZONE',
# 'SEX RELATION IN FAMILY', 'SEXUAL EXPLOITATION OF A CHILD'


# DC
# 'SEX ABUSE'

# Detroit
# 'SEXUAL PENETRATION NONFORCIBLE - OTHER', 'SEX OFFENSE - OTHER', 'SEXUAL PENETRATION NONFORCIBLE - BLOOD / AFFINITY'

# LA
# 'SEX,UNLAWFUL(INC MUTUAL CONSENT, PENETRATION W/ FRGN OBJ','RAPE, ATTEMPTED', 'RAPE, FORCIBLE', 
# 'BATTERY WITH SEXUAL CONTACT', 'SEXUAL PENETRATION W/FOREIGN OBJECT', 'SODOMY/SEXUAL CONTACT B/W PENIS OF ONE PERS TO ANUS OTH'

# Minneapolis
# 'Crim Sex Cond-rape', 'CSC - RAPE', 'CSC - SODOMY'

# Nashville
# 'RAPE, WOMAN', 'RAPE WITH WEAPON', 'RAPE OF A CHILD', 'RAPE ATTEMPT',
# 'RAPE - STRONGARM', 'RAPE - GUN', 'RAPE-STATUTORY- AUTHORITY FIGURE',
# 'RAPE- WITHOUT CONSENT', 'RAPE- STATUTORY', 'RAPE- SPOUSAL - WEAPON OR OBJECT',
# 'RAPE- FORCE OR COERCION', 'RAPE- AGG.- WEAPON OR OBJECT', 'SEX ASSLT - SODOMY-GIRL-GUN',
# 'SEX ASSLT - SODOMY-GIRL-STGARM', 'SEX ASSAULT-ATTEMPT', 'SEX ASSLT - SODOMY-MAN-GUN',
# 'SEX ASSLT - SODOMY-MAN-STGARM', 'SEX ASSLT - SODOMY-WOMAN-GUN', 'SEX ASSLT - SODOMY-WOMAN-WEAPON',
# 'SEXUAL ASSAULT', 'SEXUAL BATTERY- SPOUSAL, WEAPON OR OBJECT', 'SEXUAL BATTERY- WITHOUT CONSENT',
# 'SEXUAL BATTERY, AGGRAVATED, GIRL', 'SEXUAL BATTERY, AGGRAVATED, WOMAN', 'SEXUAL BATTERY, CRIMINAL ATTEMPT',
# 'SEXUAL BATTERY, GIRL', 'SEXUAL BATTERY, MAN', 'SEXUAL BATTERY, WOMAN', 'SEXUAL CONTACT BY AN AUTHORITY FIGURE'

# Philadelphia
# 'Rape', 'Other Sex Offenses (Not Commercialized)'

# SF
# 'Rape','Sex Offense'


sex_str = ['RAPE', 'SEX OFFENSE', 'CRIM SEXUAL ASSAULT', 'SEX ABUSE', 'SEXUAL PENETRATION NONFORCIBLE - OTHER', 'SEX OFFENSE - OTHER', 
           'SEXUAL PENETRATION NONFORCIBLE - BLOOD / AFFINITY', 'SEX,UNLAWFUL(INC MUTUAL CONSENT, PENETRATION W/ FRGN OBJ',
           'RAPE, ATTEMPTED', 'RAPE, FORCIBLE', 'BATTERY WITH SEXUAL CONTACT', 'SEXUAL PENETRATION W/FOREIGN OBJECT', 
           'SODOMY/SEXUAL CONTACT B/W PENIS OF ONE PERS TO ANUS OTH', 'Crim Sex Cond-rape', 'CSC - RAPE', 'CSC - SODOMY', 
           'RAPE, WOMAN', 'RAPE WITH WEAPON', 'RAPE OF A CHILD', 'RAPE ATTEMPT','RAPE - STRONGARM', 'RAPE - GUN', 
           'RAPE-STATUTORY- AUTHORITY FIGURE', 'RAPE- WITHOUT CONSENT', 'RAPE- STATUTORY', 'RAPE- SPOUSAL - WEAPON OR OBJECT', 
           'RAPE- FORCE OR COERCION', 'RAPE- AGG.- WEAPON OR OBJECT', 'SEX ASSLT - SODOMY-GIRL-GUN', 'SEX ASSLT - SODOMY-GIRL-STGARM', 
           'SEX ASSAULT-ATTEMPT', 'SEX ASSLT - SODOMY-MAN-GUN', "SEX ASSLT - SODOMY-MAN-STGARM", "SEX ASSLT - SODOMY-WOMAN-GUN", 
           'SEX ASSLT - SODOMY-WOMAN-WEAPON', 'SEXUAL ASSAULT', 'SEXUAL BATTERY- SPOUSAL, WEAPON OR OBJECT', 
           'SEXUAL BATTERY- WITHOUT CONSENT', "SEXUAL BATTERY, AGGRAVATED, GIRL", "SEXUAL BATTERY, AGGRAVATED, WOMAN", 
           'SEXUAL BATTERY, CRIMINAL ATTEMPT', 'SEXUAL BATTERY, GIRL', "SEXUAL BATTERY, MAN", "SEXUAL BATTERY, WOMAN", 
           'SEXUAL CONTACT BY AN AUTHORITY FIGURE', 'Rape', 'Other Sex Offenses (Not Commercialized)', 'Sex Offense',
           'AGG CRIM SEX ABUSE FAM MEMBER', 'AGG CRIMINAL SEXUAL ABUSE',
           'AGG SEX ASSLT OF CHILD FAM MBR', 'ATT AGG CRIM SEXUAL ABUSE',
           'ATT AGG CRIMINAL SEXUAL ABUSE', 'ATT CRIM SEXUAL ABUSE',
           'COMMERCIAL SEX ACTS', 'CRIM SEX ABUSE BY FAM MEMBER',
           'CRIMINAL SEXUAL ABUSE',
           'NON-CONSENSUAL DISSEMINATION PRIVATE SEXUAL IMAGES',
           'SEX ASSLT OF CHILD BY FAM MBR', 'SEX OFFENDER: FAIL REG NEW ADD',
           'SEX OFFENDER: FAIL TO REGISTER', 'SEX OFFENDER: PROHIBITED ZONE',
           'SEX RELATION IN FAMILY', 'SEXUAL EXPLOITATION OF A CHILD']
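
`isin` matches exact, case-sensitive strings, which is why the list carries both 'RAPE' and 'Rape'. One hedged alternative is to compare on a normalized form (upper-cased, with surrounding whitespace removed). The names `normalize` and `sex_set`, the shortened list, and the sample descriptions below are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

# Shortened stand-in for the notebook's full list
sex_str_sample = ['RAPE', 'Rape', 'Sex Offense', 'SEX ABUSE', 'CSC - RAPE']

def normalize(text):
    """Upper-case and trim a description so matching ignores case and padding."""
    return str(text).strip().upper()

# Case variants such as 'RAPE' and 'Rape' collapse to one entry in the set
sex_set = {normalize(s) for s in sex_str_sample}

descriptions = pd.Series(['rape', ' Sex Offense ', 'LARCENY', 'CSC - RAPE', None])
flags = descriptions.map(normalize).isin(sex_set).astype(int)
print(flags.tolist())   # [1, 1, 0, 1, 0]
```

Normalizing can only widen the set of matches, so it is worth comparing counts against the exact-match version before adopting it.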

In [28]:
# Exploratory check (delete later): list the Chicago descriptions that mention 'SEX'
# Uses 'description2', the renamed form of Chicago's original 'Description' column
test = chicago['description2'].dropna()
test[test.str.contains('SEX')].sort_values().unique()

array(['AGG CRIM SEX ABUSE FAM MEMBER', 'AGG CRIMINAL SEXUAL ABUSE',
       'AGG SEX ASSLT OF CHILD FAM MBR', 'ATT AGG CRIM SEXUAL ABUSE',
       'ATT AGG CRIMINAL SEXUAL ABUSE', 'ATT CRIM SEXUAL ABUSE',
       'COMMERCIAL SEX ACTS', 'CRIM SEX ABUSE BY FAM MEMBER',
       'CRIMINAL SEXUAL ABUSE',
       'NON-CONSENSUAL DISSEMINATION PRIVATE SEXUAL IMAGES',
       'SEX ASSLT OF CHILD BY FAM MBR', 'SEX OFFENDER: FAIL REG NEW ADD',
       'SEX OFFENDER: FAIL TO REGISTER', 'SEX OFFENDER: PROHIBITED ZONE',
       'SEX RELATION IN FAMILY', 'SEXUAL EXPLOITATION OF A CHILD'],
      dtype=object)

In [32]:
# Create columns of "total_crime" and "sex_crime" for cities_ls
for city in cities_ls:
    city['total_crime'] = 1
    city['sex_crime'] = 0

    # Chicago carries a second description column, so check both
    desc_cols = ['description', 'description2'] if city is chicago else ['description']
    for col in desc_cols:
        # Single .loc assignment instead of chained indexing, which may write to a copy
        city.loc[city[col].isin(sex_str), 'sex_crime'] = 1


In [34]:
len(chicago[chicago['sex_crime']==1])

16660
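
Chained assignment such as `df['sex_crime'][mask] = 1` can end up writing to a temporary copy, so the flag may not reach the frame it was meant for. A small sketch of the usual remedy, an explicit `.copy()` followed by a single `.loc` assignment, is shown on invented data; `flag_sex_crimes` is a hypothetical name and the term list is shortened.

```python
import pandas as pd

def flag_sex_crimes(df, terms, desc_cols=('description',)):
    """Hypothetical helper: add total_crime / sex_crime columns without chained indexing."""
    out = df.copy()                 # independent frame, so assignments are unambiguous
    out['total_crime'] = 1
    out['sex_crime'] = 0
    for col in desc_cols:
        # Row mask and target column are given together in one .loc call
        out.loc[out[col].isin(terms), 'sex_crime'] = 1
    return out

terms = ['RAPE', 'SEX OFFENSE', 'CRIMINAL SEXUAL ABUSE']
toy_chicago = pd.DataFrame({
    'description': ['THEFT', 'SEX OFFENSE', 'BATTERY', 'ASSAULT'],
    'description2': ['RETAIL THEFT', 'OTHER', 'CRIMINAL SEXUAL ABUSE', 'SIMPLE'],
})
flagged = flag_sex_crimes(toy_chicago, terms, desc_cols=('description', 'description2'))
print(len(flagged[flagged['sex_crime'] == 1]))   # 2
```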

In [9]:
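# Convert crime dfs to geospatial objects (WGS84 latitude/longitude)
# Note: the {'init': 'epsg:4326'} dict form is deprecated in recent geopandas/pyproj releases
for i in range(0, len(cities_ls)):
    cities_ls[i] = gpd.GeoDataFrame(
        cities_ls[i],
        crs='EPSG:4326',
        geometry=gpd.points_from_xy(cities_ls[i].lon, cities_ls[i].lat))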
baltimore = cities_ls[0]
chicago = cities_ls[1]
dc = cities_ls[2]
detroit = cities_ls[3]
la = cities_ls[4]
minneapolis = cities_ls[5]
nashville = cities_ls[6]
philadelphia = cities_ls[7]
sf = cities_ls[8]

### 2.3 Write & Export Crime Data

In [23]:
# Write crime dfs to shp and export to data/interim/crime/
baltimore.to_file("data/interim/crime/Baltimore/baltimore_crime_pois.shp")
chicago.to_file("data/interim/crime/Chicago/chicago_crime_pois.shp")
dc.to_file("data/interim/crime/DC/dc_crime_pois.shp")
detroit.to_file("data/interim/crime/Detroit/detroit_crime_pois.shp")
la.to_file("data/interim/crime/LA/la_crime_pois.shp")
minneapolis.to_file("data/interim/crime/Minneapolis/minneapolis_crime_pois.shp")
nashville.to_file("data/interim/crime/Nashville/nashville_crime_pois.shp")
philadelphia.to_file("data/interim/crime/Philadelphia/philadelphia_crime_pois.shp")
sf.to_file("data/interim/crime/SF/sf_crime_pois.shp")
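
Shapefile attributes are stored in a dBASE table whose field names are limited to 10 characters, so 'description', 'description2', and 'total_crime' would be shortened on export. Renaming explicitly beforehand keeps the result predictable. The short names below (`desc`, `desc2`, `tot_crime`) and the helper `shorten_for_shp` are assumptions, not names used by the notebook; the check is plain pandas and writes no file.

```python
import pandas as pd

SHORT_NAMES = {            # hypothetical mapping, chosen only for illustration
    'description': 'desc',
    'description2': 'desc2',
    'total_crime': 'tot_crime',
}

def shorten_for_shp(df, mapping=SHORT_NAMES, limit=10):
    """Rename columns and verify every name fits the 10-character DBF limit."""
    out = df.rename(columns=mapping)
    too_long = [c for c in out.columns if len(c) > limit]
    if too_long:
        raise ValueError(f'field names exceed {limit} characters: {too_long}')
    if out.columns.duplicated().any():
        raise ValueError('renaming produced duplicate field names')
    return out

# Column layout of the Chicago frame after the flagging step
cols = ['year', 'description', 'description2', 'lat', 'lon', 'total_crime', 'sex_crime']
ready = shorten_for_shp(pd.DataFrame(columns=cols))
print(list(ready.columns))
```

In the notebook this could be applied just before each `to_file` call, for example `shorten_for_shp(chicago).to_file(...)`, assuming the renamed columns are acceptable downstream.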