# Bail or No Bail


Author: Erick Orozco

Date: 10/28/2023

## Inspiration:
Los Angeles County plans to bring back a Zero-Bail policy for people arrested on misdemeanors and non-violent offenses. When the policy was first enacted, many in the county worried that it would cause a spike in crime, with released individuals re-offending. The policy was first implemented during the COVID-19 pandemic, starting on April 13, 2020, due to concerns about COVID-19 spreading in overcrowded jails. It was always planned as a temporary measure given the nature of the pandemic, and it ended on July 1, 2020. On October 1, 2023, Los Angeles County police departments will return to the Zero-Bail policy with no designated end date.



### Project (and possible additions)
I want to observe and analyze crime rates while this policy was in effect and before it. The original window of the zero-bail policy is very short and fell in an unusual period, with several complex factors driving fluctuations in crime; I leave a deeper analysis for possible future updates. The main focus here is to create a dashboard to visualize these crimes.

### This Notebook
Cleaning, Feature Selection, and general preparation for dashboard.

### Data Sources:

#### Used:
https://data.lacity.org/Public-Safety/Crime-Data-from-2020-to-Present/2nrs-mtv8

https://data.lacity.org/Public-Safety/Crime-Data-from-2010-to-2019/63jg-8b9z

#### Potential for Future:

https://fred.stlouisfed.org/series/PPAACA06037A156NCEN

https://fred.stlouisfed.org/series/PE5T17CA06037A647NCEN

https://fred.stlouisfed.org/series/CALOSA7URN

https://fred.stlouisfed.org/series/CALOSA7LFN

https://fred.stlouisfed.org/series/LAUCN060370000000005


### References:
    
https://engineering.stanford.edu/magazine/article/can-ai-help-judges-make-bail-system-fairer-and-safer

https://www.wired.com/story/algorithms-supposed-fix-bail-system-they-havent/

https://www.latimes.com/archives/la-xpm-2005-aug-30-me-crime30-story.html

In [125]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import OneHotEncoder

In [126]:
# Build a DataFrame summarizing, for each column, how consistent
# the Python types of its values are: the most common type, its count
# and share of the column, and every type found
def check_type(df):
    col_type_dict = {}
    # iterate through columns
    for col in list(df.columns.values):
        # create list of types of each cell for column
        type_list = list(map(str,list(df[col].map(type))))
        # find most common type in column
        common = max(set(type_list), key=type_list.count)
        # find total number of that type
        com_num = sum(t == common for t in type_list)
        # list all types found in this column
        all_types = list(set(type_list))
        #compile info into dict
        col_type_dict[col] = (common,com_num,(com_num/len(type_list)),all_types)
    
    # create a pandas dataframe, it's easier to read than a dictionary in my opinion
    temp_df = pd.DataFrame.from_dict(col_type_dict, orient='index',\
                                  columns=['most_common_type','most_common_freq',\
                                           'most_common_%','all_types_found'])
    return temp_df
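
The idea behind `check_type` can be seen on a single column without pandas. The sketch below is a plain-Python illustration, not part of the notebook's pipeline; `summarize_types` and the sample values are hypothetical names and data chosen for the example.

```python
from collections import Counter

def summarize_types(values):
    """Return (most common type name, its count, its share, all type names)."""
    type_names = [type(v).__name__ for v in values]
    counts = Counter(type_names)
    common, freq = counts.most_common(1)[0]
    return common, freq, freq / len(type_names), sorted(counts)

# Hypothetical column mixing strings with a missing value stored as a float NaN
sample_column = ["M", "F", "X", float("nan"), "M"]
summary = summarize_types(sample_column)
print(summary)
```

A share below 1.0 for the most common type is a hint that the column holds mixed types, often because missing values are stored as float NaN.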

In [127]:
# get all reported crimes available on the LAPD website and create one larger dataset starting from 2010
current_df = pd.read_csv('Crime_Data_from_2020_to_present.csv')
prev_df = pd.read_csv('Crime_Data_from_2010_to_2019.csv')
crime_df = pd.concat([prev_df, current_df])
crime_df.reset_index(drop=True,inplace=True)
crime_df.head(5)

Unnamed: 0,DR_NO,Date Rptd,DATE OCC,TIME OCC,AREA,AREA NAME,Rpt Dist No,Part 1-2,Crm Cd,Crm Cd Desc,...,Status Desc,Crm Cd 1,Crm Cd 2,Crm Cd 3,Crm Cd 4,LOCATION,Cross Street,LAT,LON,AREA.1
0,1307355,02/20/2010 12:00:00 AM,02/20/2010 12:00:00 AM,1350,13.0,Newton,1385,2,900,VIOLATION OF COURT ORDER,...,Adult Arrest,900.0,,,,300 E GAGE AV,,33.9825,-118.2695,
1,11401303,09/13/2010 12:00:00 AM,09/12/2010 12:00:00 AM,45,14.0,Pacific,1485,2,740,"VANDALISM - FELONY ($400 & OVER, ALL CHURCH VA...",...,Invest Cont,740.0,,,,SEPULVEDA BL,MANCHESTER AV,33.9599,-118.3962,
2,70309629,08/09/2010 12:00:00 AM,08/09/2010 12:00:00 AM,1515,13.0,Newton,1324,2,946,OTHER MISCELLANEOUS CRIME,...,Invest Cont,946.0,,,,1300 E 21ST ST,,34.0224,-118.2524,
3,90631215,01/05/2010 12:00:00 AM,01/05/2010 12:00:00 AM,150,6.0,Hollywood,646,2,900,VIOLATION OF COURT ORDER,...,Invest Cont,900.0,998.0,,,CAHUENGA BL,HOLLYWOOD BL,34.1016,-118.3295,
4,100100501,01/03/2010 12:00:00 AM,01/02/2010 12:00:00 AM,2100,1.0,Central,176,1,122,"RAPE, ATTEMPTED",...,Invest Cont,122.0,,,,8TH ST,SAN PEDRO ST,34.0387,-118.2488,


In [128]:
# show all columns in the dataset; see the Data Sources section at the top of the page for links with more info on each column
crime_df.columns

Index(['DR_NO', 'Date Rptd', 'DATE OCC', 'TIME OCC', 'AREA ', 'AREA NAME',
       'Rpt Dist No', 'Part 1-2', 'Crm Cd', 'Crm Cd Desc', 'Mocodes',
       'Vict Age', 'Vict Sex', 'Vict Descent', 'Premis Cd', 'Premis Desc',
       'Weapon Used Cd', 'Weapon Desc', 'Status', 'Status Desc', 'Crm Cd 1',
       'Crm Cd 2', 'Crm Cd 3', 'Crm Cd 4', 'LOCATION', 'Cross Street', 'LAT',
       'LON', 'AREA'],
      dtype='object')
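
The column listing above contains both `'AREA '` (with a trailing space) and `'AREA'`, which suggests the two source files label that column slightly differently, so `pd.concat` keeps them as separate columns. A minimal sketch of normalizing headers before combining files, using hypothetical header fragments rather than the real files:

```python
def normalize_header(name):
    """Strip surrounding whitespace so 'AREA ' and 'AREA' compare equal."""
    return name.strip()

# Hypothetical header fragments from two files
first_header = ["DR_NO", "AREA ", "AREA NAME"]
second_header = ["DR_NO", "AREA", "AREA NAME"]

first_clean = [normalize_header(c) for c in first_header]
second_clean = [normalize_header(c) for c in second_header]
print(first_clean == second_clean)
```

In pandas, the same normalization could be applied to each DataFrame with `df.rename(columns=str.strip)` before concatenating. The `AREA` column is not used later in this notebook, so this only matters if that column is needed.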

In [129]:
# save copy of original dataset
og_crime_df = crime_df.copy()

# select columns (copy so later assignments modify an independent DataFrame, not a slice)
crime_df = crime_df[['Date Rptd','DATE OCC','TIME OCC','Crm Cd','Vict Age','Vict Sex','Vict Descent','LAT','LON']].copy()

In [130]:
# split between violent crime and property crime
    #based on UCR-COMPSTAT062618.pdf on LAPD crime stats page
violent_crime = [110,113,121,122,815,820,821,210,220,230,231,235,236,250,251,761,926]
simple_assault = [435,436,437,622,623,624,625,626,627,647,763,928,930]
property_crime = [310,320,510,520,433,330,331,410,420,421,350,351,352,353,450,451,452,453,341,\
                  343,345,440,441,442,443,444,445,470,471,472,473,474,475,480,485,487,491]

# Based upon my own exploration, here are some more crimes that are not considered in the LAPD pdf

#vandalism, stolen vehicle(other), telephone damage, car theft (auto repair)
property_crime.extend([740,745,522,924,349,446])

#arson, kidnapping, battery with sexual contact, child neglect, child annoying, human trafficking
violent_crime.extend([648,910,920,860,237,813,921,812,810,922,822])

#miscellaneous
misc_crime = [946]

#identity theft, violation of court order, bunco (fraud/swindling), bomb scare, violation of restraining order,
    #failure to yield, lewd phone calls, disturbing the peace, false police report, document forgery, trespassing,
    #lewd conduct, cruelty to animals, embezzlement, discharge of firearm, unauthorized computer access, peeping tom,
    #indecent exposure, defrauding innkeeper/theft of services, extortion, contempt of court, prowler, counterfeit
misc_crime.extend([900,354,662,664,755,901,890,956,886,439,649,888,762,943,668,753,661,932,850,951,940,903,902,670,\
                  756,933,651,660,434,652,814,653,806,805,438,950,949,654,944,347,666,954,845,760,880,865,882,840,\
                  942,870,830,884,432,948,931,906,952,905,904])

#now create a dictionary to make the value replacement easier
    # V-Violent, S-Simple Assault, P-Property, M-Miscellaneous
crime_type = {}
for i in violent_crime:
    crime_type[i] = 'V'
for i in simple_assault:
    crime_type[i] = 'S'
for i in property_crime:
    crime_type[i] = 'P'    
for i in misc_crime:
    crime_type[i] = 'M'   
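
Because the loops above fill one dictionary in sequence, a code that appears in two lists silently takes the label of the last list processed. A minimal sketch of building the same kind of mapping while recording such collisions; the lists here are small illustrative subsets, and `groups` and `duplicates` are hypothetical names:

```python
# Small illustrative subsets of the code lists, not the full lists
violent = [110, 113, 121]
simple_assault = [435, 436, 624]
prop = [310, 320, 510]
misc = [946, 900]

groups = {"V": violent, "S": simple_assault, "P": prop, "M": misc}

crime_type = {}
duplicates = []  # (code, earlier label, later label) for codes listed twice
for label, codes in groups.items():
    for code in codes:
        if code in crime_type:
            duplicates.append((code, crime_type[code], label))
        crime_type[code] = label

print(crime_type[624], duplicates)
```

An empty `duplicates` list means every code belongs to exactly one category.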

In [131]:
# simplify Crime Code column to 4 types of crime: Violent, Simple Assault, Property, & Miscellaneous
crime_df.replace({"Crm Cd": crime_type},inplace=True)

In [132]:
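# Format dates (raw values look like '02/20/2010 12:00:00 AM', as in the sample rows above)
crime_df['Date Rptd'] = pd.to_datetime(crime_df['Date Rptd'], format='%m/%d/%Y %I:%M:%S %p')
crime_df['DATE OCC'] = pd.to_datetime(crime_df['DATE OCC'], format='%m/%d/%Y %I:%M:%S %p')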

In [133]:
# label reported-crime dates that fall within the zero-bail window as 1
#NOTE: not every crime qualifies for zero bail; this indicator only marks the dates the policy was in effect,
#      not that a given crime received zero bail

crime_df['no_bail'] = 0
# use a single .loc call (row mask, column) to avoid chained assignment
in_window = (crime_df['Date Rptd']>='2020-04-13')&(crime_df['Date Rptd']<'2020-07-01')
crime_df.loc[in_window, 'no_bail'] = 1
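
The date filter for the `no_bail` flag is a half-open interval: the start date is included and the end date is excluded. A plain-Python sketch of that rule, where `in_no_bail_window` is a hypothetical helper and the two dates are the ones used in the cell above:

```python
from datetime import date

POLICY_START = date(2020, 4, 13)  # first day of the window (inclusive)
POLICY_END = date(2020, 7, 1)     # end of the window (exclusive)

def in_no_bail_window(day):
    """Return 1 if the day falls in [POLICY_START, POLICY_END), else 0."""
    return 1 if POLICY_START <= day < POLICY_END else 0

sample_days = [date(2020, 4, 12), date(2020, 4, 13), date(2020, 6, 30), date(2020, 7, 1)]
flags = [in_no_bail_window(d) for d in sample_days]
print(flags)
```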


In [134]:
# some other housekeeping: ensure data is consistent and simplified

# ensure every value other than 'M'/'F' (including NaN) is 'X', meaning 'Unknown'
crime_df.loc[~crime_df['Vict Sex'].isin(['M','F']), 'Vict Sex'] = 'X'

# simplify Victim Descent column, LAPD has it as follows:
# Descent Code: A - Other Asian B - Black C - Chinese D - Cambodian F - Filipino G - Guamanian 
#               H - Hispanic/Latin/Mexican I - American Indian/Alaskan Native J - Japanese K - Korean 
#               L - Laotian O - Other P - Pacific Islander S - Samoan U - Hawaiian V - Vietnamese W - White 
#               X - Unknown Z - Asian Indian

# simplify east asian & southeast asian descents into one group
crime_df.loc[crime_df['Vict Descent'].isin(['A','C','D','F','J','K','L','V']), 'Vict Descent'] = 'A'

# simplify pacific islanders into one group
crime_df.loc[crime_df['Vict Descent'].isin(['G','P','S','U']), 'Vict Descent'] = 'P'

# make sure to map all unknowns to 'X'
# NOTE: 'O' for other is considered unknown because it gives no details of descent
crime_df.loc[~crime_df['Vict Descent'].isin(['H','W','B','A','I','P','Z']), 'Vict Descent'] = 'X'
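
The descent simplification above amounts to a lookup with a default. A plain-Python sketch of the same grouping rules, where `simplify_descent` is a hypothetical helper that is not part of the notebook:

```python
ASIAN = set("ACDFJKLV")   # collapsed into 'A'
PACIFIC = set("GPSU")     # collapsed into 'P'
KEPT = set("HWBIZ")       # kept as-is

def simplify_descent(code):
    """Map an LAPD descent code to the simplified group; anything else becomes 'X'."""
    if code in ASIAN:
        return "A"
    if code in PACIFIC:
        return "P"
    if code in KEPT:
        return code
    return "X"  # covers 'O' (Other), 'X', missing values, and unexpected codes

simplified = [simplify_descent(c) for c in ["K", "S", "H", "O", None, "Z"]]
print(simplified)
```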



In [157]:
# create new dataset to visualize per-day counts of crime
# NOTE: `sparse_output` requires scikit-learn >= 1.2; older versions use `sparse=False`
enc = OneHotEncoder(sparse_output=False)

# pd = per day
crime_pd_df = crime_df.copy()


# One-hot encode all categorical variables so they can be summed per day later.
# OneHotEncoder orders its output columns by sorted category value,
# so the column names below follow that order.

# categories: 'M', 'P', 'S', 'V'
crime_pd_df[['M Crm','P Crm','S Crm','V Crm']] = enc.fit_transform(crime_pd_df[['Crm Cd']])

# categories: 'F', 'M', 'X'
crime_pd_df[['Female','Male','Unknown Sex']] = enc.fit_transform(crime_pd_df[['Vict Sex']])

# categories: 'A', 'B', 'H', 'I', 'P', 'W', 'X', 'Z'
crime_pd_df[['Asian', 'Black', 'Latino', 'Indigenous', 'Pacific Islander', 'White', 'Unknown Desc', 'South Asian']] \
            = enc.fit_transform(crime_pd_df[['Vict Descent']])

# bin victim ages (bins are right-closed, so '25-34' covers ages 25 through 34)
crime_pd_df['Vict Age'] = pd.cut(crime_pd_df['Vict Age'], [-1000,-1,11,15,19,24,34,49,64,200],\
                                 labels=['Unknown Age','0-11','12-15','16-19','20-24','25-34','35-49','50-64','65+'])

# name the age columns from the fitted categories instead of by hand:
# sorted as strings, 'Unknown Age' comes after the numeric labels
age_encoded = enc.fit_transform(crime_pd_df[['Vict Age']].astype(str))
crime_pd_df[list(enc.categories_[0])] = age_encoded

# keep only the date, the policy flag, and the one-hot columns
crime_pd_df = crime_pd_df[['Date Rptd', 'no_bail', 'M Crm', 'P Crm','S Crm','V Crm', 'Female','Male',\
                           'Unknown Sex', 'Asian', 'Black', 'Latino','Indigenous','Pacific Islander', \
                           'White', 'Unknown Desc','South Asian', 'Unknown Age', '0-11', '12-15','16-19',\
                           '20-24', '25-34','35-49', '50-64', '65+']]

# sum per day, keeping the date as a column
crime_pd_df = crime_pd_df.groupby(['Date Rptd']).sum().reset_index()

# total crimes per day
crime_pd_df['Total Crime'] = crime_pd_df[['M Crm','P Crm','S Crm','V Crm']].sum(axis=1)

# the sum turned no_bail into a daily count; convert it back to binary
crime_pd_df['no_bail'] = (crime_pd_df['no_bail'] != 0).astype(int)
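
The age bins passed to `pd.cut` are right-closed by default, so each bin includes its upper edge and excludes its lower edge. A plain-Python sketch of that rule using `bisect`; `age_bin` is a hypothetical helper, and the labels are illustrative (`25-34` is used here because the bin (24, 34] starts at age 25):

```python
from bisect import bisect_left

EDGES = [-1000, -1, 11, 15, 19, 24, 34, 49, 64, 200]
LABELS = ["Unknown Age", "0-11", "12-15", "16-19", "20-24",
          "25-34", "35-49", "50-64", "65+"]

def age_bin(age):
    """Return the label of the right-closed bin (EDGES[i], EDGES[i+1]] containing age."""
    if not EDGES[0] < age <= EDGES[-1]:
        return None  # outside every bin, which pd.cut would report as NaN
    return LABELS[bisect_left(EDGES, age) - 1]

binned = [age_bin(a) for a in [-1, 0, 11, 12, 24, 25, 64, 65]]
print(binned)

# String labels sort with digits before uppercase letters
sorted_labels = sorted(LABELS)
print(sorted_labels)
```

The sorted order matters when one-hot columns are named by hand: an encoder that sorts its categories places `Unknown Age` last, not first, so hand-written column names need to follow that order.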


In [158]:
# write datasets to file

crime_df.to_csv('LA_County_Crime.csv')
# the per-day file must come from the per-day DataFrame, not crime_df
crime_pd_df.to_csv('LA_County_PerDay_Crime.csv')

In [161]:
list(crime_df['LAT'].unique())

[33.9825,
 33.9599,
 34.0224,
 34.1016,
 34.0387,
 34.048,
 34.0389,
 34.0435,
 34.045,
 34.0538,
 34.064,
 34.035,
 34.0409,
 34.0502,
 34.0515,
 34.0401,
 34.0428,
 34.0545,
 34.0563,
 34.0454,
 34.0472,
 34.0382,
 34.0423,
 34.046,
 34.0461,
 34.0437,
 34.0357,
 34.0394,
 34.0446,
 34.0439,
 34.0415,
 34.0559,
 34.0426,
 34.0544,
 34.0449,
 34.0481,
 34.0453,
 34.0431,
 34.0617,
 34.0445,
 34.0524,
 34.0459,
 34.0384,
 34.0485,
 34.0503,
 34.0535,
 34.0695,
 34.0416,
 34.053000000000004,
 34.0499,
 34.0452,
 34.0407,
 34.0442,
 34.0451,
 34.0424,
 34.0482,
 34.0371,
 34.0627,
 34.0635,
 34.0917,
 34.0317,
 34.0516,
 34.032,
 34.0717,
 34.0475,
 34.0474,
 34.0496,
 34.0373,
 34.0644,
 34.0537,
 34.0672,
 34.0494,
 34.0673,
 34.0467,
 34.0649,
 34.0619,
 34.0364,
 34.0522,
 34.0582,
 34.0488,
 34.0346,
 34.0656,
 34.042,
 34.0486,
 34.0411,
 34.051,
 34.054,
 34.0377,
 34.0721,
 34.0398,
 34.0308,
 34.0495,
 34.0489,
 34.0419,
 34.0333,
 34.0388,
 34.0462,
 34.0615,
 34.1779,
 34.0571