<table><tr>
<td> <img src=https://www.baltimorepolice.org/themes/custom/bpd/images/bpd_logo.png alt="Drawing" style="height: 250px;"/> </td>
<td> <img src=https://beam-images.warnermediacdn.com/BEAM_LWM_DELIVERABLES/1bc3aff5-0d6a-4c0b-8ed0-5716ca30ab3b/fbbc7a604f327cfa8a7bbe614a89be13a246d266.jpg?host=wbd-images.prod-vod.h264.io&partner=beamcom style="height: 250px;"/> </td>
</tr></table>

BALTIMORE POLICE ARRESTS (2013-2016)
=

In this project we analyze arrests in Baltimore to identify patterns and to predict the charge of an arrest from features such as the arrestee's age, sex and race, and the location and date of the arrest.

OBJECTIVES
=
    - Identify the features that most influence the charge of an arrest.
    - Build a model that predicts the charge of an arrest.
    - Understand arrest patterns in order to help improve public safety strategies.

In [5]:
import pandas as pd
import numpy as np

In [6]:
df = pd.read_csv('BPD_Arrests.csv')

In [7]:
df.head()

Unnamed: 0,Arrest,Age,Sex,Race,ArrestDate,ArrestTime,ArrestLocation,IncidentOffense,IncidentLocation,Charge,ChargeDescription,District,Post,Neighborhood,Location 1
0,16160529.0,54.0,M,B,11/12/2016,22:35,3500 PELHAM AVE,4ECOMMON ASSAULT,3500 PELHAM AVE,1 1415,COMMON ASSAULT,Northeastern,432.0,Belair-Edison,"(39.3208685519, -76.5652449141)"
1,16160490.0,22.0,M,B,11/12/2016,21:49,300 S LOUDON AVE,Unknown Offense,300 S LOUDON AVE,4 3550,POSSESSION,Southwestern,833.0,Irvington,"(39.2811486601, -76.6821278085)"
2,16160487.0,31.0,M,B,11/12/2016,21:40,,Unknown Offense,,1 0077,FAILURE TO APPEAR,,,,
3,16160485.0,31.0,M,B,11/12/2016,20:30,,Unknown Offense,,1 0077,FAILURE TO APPEAR,,,,
4,16160481.0,33.0,M,B,11/12/2016,19:45,,Unknown Offense,,2 0480,MOTOR VEH/UNLAWFUL TAKING,,,,


In [8]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 130713 entries, 0 to 130712
Data columns (total 15 columns):
 #   Column             Non-Null Count   Dtype  
---  ------             --------------   -----  
 0   Arrest             123699 non-null  float64
 1   Age                130685 non-null  float64
 2   Sex                130713 non-null  object 
 3   Race               130713 non-null  object 
 4   ArrestDate         130713 non-null  object 
 5   ArrestTime         130713 non-null  object 
 6   ArrestLocation     78595 non-null   object 
 7   IncidentOffense    130713 non-null  object 
 8   IncidentLocation   76987 non-null   object 
 9   Charge             114255 non-null  object 
 10  ChargeDescription  130211 non-null  object 
 11  District           78601 non-null   object 
 12  Post               78583 non-null   float64
 13  Neighborhood       78595 non-null   object 
 14  Location 1         78666 non-null   object 
dtypes: float64(3), object(12)
memory usage: 15.0+ MB


DATASET NOTES
=============
- The dataset comes from the Baltimore Police Department itself, via the Open Baltimore portal (https://data.baltimorecity.gov/).
- It has 15 columns and more than 130,000 rows.
- It covers arrests in Baltimore from 2013 to 2016 (through November 12, 2016).
- The columns fall into 4 groups:
    (1) Age, sex and race of the arrestee.
    (2) Date and time.
    (3) Location of the arrest.
    (4) Offense committed.

DATASET MANIPULATION
=

The 'ArrestDate' column is in M/D/Y format (e.g. 05/22/2016). We derive three columns from it: 'Year', 'Month' and 'DayOfWeek'.

In [12]:
# Convert the 'ArrestDate' column to datetime
df['ArrestDate'] = pd.to_datetime(df['ArrestDate'])

# Create additional columns for the analysis
df['Year'] = df['ArrestDate'].dt.year
df['Month'] = df['ArrestDate'].dt.month
df['DayOfWeek'] = df['ArrestDate'].dt.dayofweek  # Monday=0 ... Sunday=6
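
As an illustration of the date features, the sketch below parses a few M/D/Y strings with an explicit `format` (an assumption; the cell above lets pandas infer it) and shows that `dt.dayofweek` counts from Monday = 0. The frame `sample` and its dates are made up for the example.

```python
import pandas as pd

# Hypothetical sample in the same M/D/Y layout as 'ArrestDate'
sample = pd.DataFrame({'ArrestDate': ['11/12/2016', '01/01/2013', '05/22/2016']})

# An explicit format avoids any day/month ambiguity when parsing
sample['ArrestDate'] = pd.to_datetime(sample['ArrestDate'], format='%m/%d/%Y')

sample['Year'] = sample['ArrestDate'].dt.year
sample['Month'] = sample['ArrestDate'].dt.month
sample['DayOfWeek'] = sample['ArrestDate'].dt.dayofweek  # Monday=0 ... Sunday=6

print(sample)
```

With this encoding, November 12, 2016 (a Saturday) gets 'DayOfWeek' 5.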

The 'ArrestTime' column is in HH:MM format (e.g. 22:53). We derive an 'Hour' column that holds only the hour, as a float (22.0).

In [14]:
# Create the 'Hour' column
# Make sure 'ArrestTime' is of type string
df['ArrestTime'] = df['ArrestTime'].astype(str)

# Return the hour, or None when the value is not a valid HH:MM time
def extract_hour(time_str):
    try:
        return pd.to_datetime(time_str, format='%H:%M').hour
    except ValueError:
        return None

# Apply the function to the 'ArrestTime' column
df['Hour'] = df['ArrestTime'].apply(extract_hour)
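
The same idea can be sketched without a Python-level loop. Assuming malformed times should simply become missing values, `pd.to_datetime` with `errors='coerce'` turns them into `NaT`, and `.dt.hour` then gives `NaN` for those rows. The sample values are invented.

```python
import pandas as pd

# Two valid HH:MM strings and two values that cannot be parsed
times = pd.Series(['22:35', '09:05', 'not a time', '25:00'])

# Unparseable entries become NaT instead of raising
parsed = pd.to_datetime(times, format='%H:%M', errors='coerce')

# Hour as a float column; NaT rows become NaN
hours = parsed.dt.hour

print(hours)
```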

In [15]:
df.to_csv('BPD_Date.csv', index=False)

print("CSV file saved as 'BPD_Date.csv'")

CSV file saved as 'BPD_Date.csv'


The 'ChargeDescription' column has more than 500 distinct charge types, many of which are the same charge written with spelling errors, different capitalization, or missing hyphens or periods.

We build a dictionary that maps each variant to a broader category and use it to create the 'ChargeCategory' column, going from more than 500 types to 9.
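
One way to spot such near-duplicate spellings before writing the dictionary is to compare values after a light normalization. The sketch below is only an illustration with a hypothetical helper, `normalize_charge`, and invented sample strings; it is not the procedure used to build the mapping in this notebook.

```python
import re
from collections import defaultdict

def normalize_charge(text):
    """Lowercase, drop punctuation and collapse whitespace (hypothetical helper)."""
    text = text.lower()
    text = re.sub(r'[^a-z0-9 ]+', ' ', text)  # hyphens, periods, slashes -> space
    return re.sub(r'\s+', ' ', text).strip()

# Invented variants that differ only in case or punctuation
variants = ['FAILURE TO APPEAR', 'Failure To Appear', 'Failure-To-Appear.', 'Common Assault']

# Group the raw spellings under their normalized form
groups = defaultdict(list)
for value in variants:
    groups[normalize_charge(value)].append(value)

for key, members in groups.items():
    print(key, '->', members)
```

This catches case and punctuation differences only; genuine misspellings would still need manual review or a fuzzy-matching step.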

The 'Location 1' column holds both coordinates in a single string, e.g. (39.3208685519, -76.5652449141).

We split it into two columns: 'Latitude' (39.3208685519) and 'Longitude' (-76.5652449141).

In [18]:
# Extract latitude and longitude from the 'Location 1' column.
# Rows where 'Location 1' is NaN yield NaN in both new columns.
df[['Latitude', 'Longitude']] = df['Location 1'].str.extract(r'\(([^,]+),\s*([^)]+)\)')

# Convert to float
df['Latitude'] = df['Latitude'].astype(float)
df['Longitude'] = df['Longitude'].astype(float)

# Drop the rows that have no coordinates
df = df.dropna(subset=['Latitude', 'Longitude'])
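
To see what the regular expression captures, here is a minimal sketch on a two-element series: one coordinate string taken from the `df.head()` output and one missing value. The names `locations` and `coords` are made up for the example.

```python
import pandas as pd

locations = pd.Series(['(39.3208685519, -76.5652449141)', None])

# Group 1: everything up to the comma; group 2: everything up to the closing parenthesis
coords = locations.str.extract(r'\(([^,]+),\s*([^)]+)\)').astype(float)
coords.columns = ['Latitude', 'Longitude']

print(coords)
```

The missing entry produces NaN in both columns, which is why the cell above drops those rows afterwards.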

In [19]:
category_mapping = {
    'Asslt-Sec Degree || Common Assault': 'Violent Crimes',
    'Asslt-Sec Degree || Assault': 'Violent Crimes',
    'Asslt-Sec Degree || 2Nd Degree Assault': 'Violent Crimes',
    'Asslt-Sec Degree || Aggravated Assault': 'Violent Crimes',
    'Asslt-Sec Degree || Assault On Police': 'Violent Crimes',
    'Asslt-Sec Degree || Domestic Assault': 'Violent Crimes',
    'Asslt-Sec Degree || Agg Assault': 'Violent Crimes',
    'Asslt-Sec Degree || Assault 2Nd Degree': 'Violent Crimes',
    'Asslt-Sec Degree || Assault': 'Violent Crimes',
    'Asslt-Sec Degree || Assault 1St Degree': 'Violent Crimes',
    'Asslt-First Degree || Aggravated Assault': 'Violent Crimes',
    'Asslt-First Degree || Assault': 'Violent Crimes',
    'Asslt-First Degree || Common Assault': 'Violent Crimes',
    'Asslt-First Degree || Armed Robbery': 'Violent Crimes',
    'Asslt-Sec Degree || Assault By Threat': 'Violent Crimes',
    'Deadly Weapon-Conceal || Deadly Weapon': 'Violent Crimes',
    'Deadly Weapon-Int/Injure || Aggravated Assault': 'Violent Crimes',
    'Deadly Weapon-Int/Injure || Assault': 'Violent Crimes',
    'Deadly Weapon || Deadly Weapon': 'Violent Crimes',
    'Asslt-Sec Degree || Domestic Common Assault': 'Violent Crimes',
    'Asslt-Sec Degree || Common Assualt': 'Violent Crimes',
    'ARMED ROBBERY': 'Violent Crimes',
    'ROBBERY': 'Violent Crimes',
    'Robbery || Armed Robbery': 'Violent Crimes',
    'Robbery || Robbery': 'Violent Crimes',
    'UNARMED ROBBERY': 'Violent Crimes',
    'Asslt-Sec Degree || Unarmed Robbery': 'Violent Crimes',
    'Assault': 'Violent Crimes',
    '1St Degree Assault': 'Violent Crimes',
    '2Nd Degree Assault': 'Violent Crimes',
    'Aggravated Assault': 'Violent Crimes',
    'Assault And Robbery': 'Violent Crimes',
    'Rape Second Degree': 'Violent Crimes',
    'Attempted Murder': 'Violent Crimes',
    'Att 1St Deg. Murder': 'Violent Crimes',
    'Hindering': 'Violent Crimes',
    'Physical Child Abuse': 'Violent Crimes',
    'Child Abuse:Parent': 'Violent Crimes',
    'Sex Offense Fourth Degree': 'Violent Crimes',
    'Perverted Practice': 'Violent Crimes',
    'Burglary-4Th Degree-Store || Burglary': 'Property Crimes',
    'Burglary-4Th Degree-Store || B&E': 'Property Crimes',
    'Burglary-4Th Degree-Store || 4Th Degree Burglary': 'Property Crimes',
    'Burglary-4Th Degree-Store || Breaking And Entering': 'Property Crimes',
    'Burglary-4Th Degree-Store || Burglary 4Th Degree': 'Property Crimes',
    'Burglary-Fourth Degree || Burglary': 'Property Crimes',
    'Burglary-Fourth Degree || B&E': 'Property Crimes',
    'Burglary-Fourth Degree || 4Th Degree B&E': 'Property Crimes',
    'Burglary-Fourth Degree || Breaking And Entering': 'Property Crimes',
    'Burglary-Fourth Degree || B & E': 'Property Crimes',
    'Burglary-4Th Degree': 'Property Crimes',
    'Burglary-Fourth Degree': 'Property Crimes',
    'Burglary-4Th Degree-Store': 'Property Crimes',
    'Burglary-Third Degree': 'Property Crimes',
    'Destruction Of Property': 'Property Crimes',
    'Destruction Of Pr': 'Property Crimes',
    'Mal Dest Prop/Valu - $500': 'Property Crimes',
    'Malicious Destruction': 'Property Crimes',
    'Theft Less Than $100.00': 'Property Crimes',
    'Theft Under $500': 'Property Crimes',
    'Theft Under 100.00': 'Property Crimes',
    'Theft': 'Property Crimes',
    'Larceny From Auto': 'Property Crimes',
    'Auto Theft': 'Property Crimes',
    'Unauthorized Use': 'Property Crimes',
    'Stolen Vehicle': 'Property Crimes',
    'Theft Less Than $100.00 || Larceny': 'Property Crimes',
    'Theft: Less $1,000 Value || Larceny': 'Property Crimes',
    'Theft: $1,000 To Under $10,000 || Larceny': 'Property Crimes',
    'Theft: Less $1,000 Value || Theft': 'Property Crimes',
    'Theft Less Than $100.00 || Theft': 'Property Crimes',
    'Theft: $1,000 To Under $10,000 || Theft': 'Property Crimes',
    'Theft Less Than $100.00 || Theft Under $100': 'Property Crimes',
    'Theft Less Than $100.00 || Larceny From Auto': 'Property Crimes',
    'Theft Less Than $100.00 || Shoplifting': 'Property Crimes',
    'Cds:Possess-Not Marihuana || Cds Violation': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Cds': 'Drug Offenses',
    'Cds: Poss Marihuana L/T 10 G || Cds Violation': 'Drug Offenses',
    'Cds: Possession-Marihuana || Cds Violation': 'Drug Offenses',
    'Cds: Poss Marihuana L/T 10 G || Cds': 'Drug Offenses',
    'Cds: Possession-Marihuana || Cds': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Poss Cocaine': 'Drug Offenses',
    'Cds:P W/I Dist:Narc || Cds Violation': 'Drug Offenses',
    'Att-Cds Manuf/Dist-Narc || Cds Violation': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Poss Heroin': 'Drug Offenses',
    'Cds: Poss Marihuana L/T 10 G || Poss Marijuana': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Cds Possession': 'Drug Offenses',
    'Att-Cds Manuf/Dist-Narc || Cds': 'Drug Offenses',
    'Cds:Poss Para || Cds Violation': 'Drug Offenses',
    'CDS': 'Drug Offenses',
    'Cds: Poss Marihuana L/T 10 G || Poss Of Marijuana': 'Drug Offenses',
    'Cds Manuf/Dist-Narc || Cds': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Cds Poss Cocaine': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Poss Cds': 'Drug Offenses',
    'Cds: Poss Marihuana L/T 10 G || Poss Of Marijuana': 'Drug Offenses',
    'Cds: Poss Marihuana L/T 10 G || Cds Poss Marijuana': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Cds Poss Heroin': 'Drug Offenses',
    'Cds Manuf/Dist-Narc || Cds Violation': 'Drug Offenses',
    'Cds:P W/I Dist:Narc || Cds Pwid': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Poss Cds': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Cds': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Poss Of Heroin': 'Drug Offenses',
    'Cds: Possession-Marihuana || Cds Poss': 'Drug Offenses',
    'Cds:Possess-Not Marihuana': 'Drug Offenses',
    'Cds Manuf/Dist-Narc': 'Drug Offenses',
    'Distribution Of Heroin': 'Drug Offenses',
    'Possession Of Heroin': 'Drug Offenses',
    'Distribution Of Cocaine': 'Drug Offenses',
    'Possession Of Cocaine': 'Drug Offenses',
    'Poss W/Int': 'Drug Offenses',
    'Poss W/Int To Dist': 'Drug Offenses',
    'Pwi Heroin': 'Drug Offenses',
    'Pwi Cocaine': 'Drug Offenses',
    'Pwid Cocaine': 'Drug Offenses',
    'Pwid Marijuana': 'Drug Offenses',
    'Poss Mdma': 'Drug Offenses',
    'Poss W/Int Marijuana': 'Drug Offenses',
    'Armed Robbery || Armed Robbery': 'Robbery',
    'Robbery || Robbery': 'Robbery',
    'Robbery || Unarmed Robbery': 'Robbery',
    'Armed Robbery': 'Robbery',
    'Robbery': 'Robbery',
    'Attempt Armed Robbery': 'Robbery',
    'Commercial Armed Robbery': 'Robbery',
    'Unarmed Robbery': 'Robbery',
    'Handgun On Person || Hgv': 'Weapons Violations',
    'Handgun On Person || Handgun Violation': 'Weapons Violations',
    'Firearm Poss W/Fel Convict || Hgv': 'Weapons Violations',
    'Reg Firearm:Illegal Possession || Hgv': 'Weapons Violations',
    'Firearm/Drug Traf Crime || Hgv': 'Weapons Violations',
    'Poss Of Firearm/Ammo/Minor || Hgv': 'Weapons Violations',
    'Deadly Weapon-Conceal || Concealed Deadly Weapon': 'Weapons Violations',
    'Deadly Weapon-Int/Injure || Aggravated Assault': 'Weapons Violations',
    'Handgun Violation': 'Weapons Violations',
    'Cds-Poss Of Firearms || Hgv': 'Weapons Violations',
    'Cds Manuf/Dist-Narc || Hgv': 'Weapons Violations',
    'Cds:Possess-Not Marihuana || Hgv': 'Weapons Violations',
    'Deadly Weapon-Conceal': 'Weapons Violations',
    'Deadly Weapon-Int/Injure': 'Weapons Violations',
    'Concealed Deadly Weapon': 'Weapons Violations',
    'Handgun On Person': 'Weapons Violations',
    'Handgun': 'Weapons Violations',
    'Firearm Poss W/Fel Convict': 'Weapons Violations',
    'Firearm/Drug Traf Crime': 'Weapons Violations',
    'Rifle/Shot-Poss W/Fel Conv': 'Weapons Violations',
    'Reg Firearm:Illegal Possession': 'Weapons Violations',
    'Reg Firearm:Stolen/Sell Etc': 'Weapons Violations',
    'Dangerous Weapon': 'Weapons Violations',
    'Dis.Erly Conduct || Disorderly Conduct': 'Public Order Crimes',
    'Dis.Erly Conduct || Disorderly': 'Public Order Crimes',
    'Dis.Erly Conduct || Fto': 'Public Order Crimes',
    'Fail Obey Renble/Lawfl || Disorderly Conduct': 'Public Order Crimes',
    'Fail Obey Renble/Lawfl || Failure To Obey': 'Public Order Crimes',
    'Fail Obey Renble/Lawfl || Fto': 'Public Order Crimes',
    'Public Urination': 'Public Order Crimes',
    'Loitering': 'Public Order Crimes',
    'Violate Exparte/Prot Order || Violation Of Protective': 'Public Order Crimes',
    'Peace Order: Fail To Comply || Violation Of Peace Orde': 'Public Order Crimes',
    'Resist/Interfere With Arrest || Assault On Police': 'Public Order Crimes',
    'Resist/Interfere With Arrest || Cds': 'Public Order Crimes',
    'Resist/Interfere With Arrest || Resisting Arrest': 'Public Order Crimes',
    'Affray || Affray': 'Public Order Crimes',
    'Indecent Exposure || Indecent Exposure': 'Public Order Crimes',
    'Disorderly Conduct': 'Public Order Crimes',
    'Dis.Erly Conduct || Trespassing': 'Public Order Crimes',
    'Dis.Erly Conduct || Hindering': 'Public Order Crimes',
    'Dis.Erly Conduct': 'Public Order Crimes',
    'Disorderly Conduct': 'Public Order Crimes',
    'Open Container': 'Public Order Crimes',
    'Aggressive Panhandling': 'Public Order Crimes',
    'Fail Obey Renble/Lawfl': 'Public Order Crimes',
    'Curfew': 'Public Order Crimes',
    'Hindering': 'Public Order Crimes',
    'Driving Suspended || Driving Suspended': 'Traffic Violations',
    'Driving Suspended || Suspended': 'Traffic Violations',
    'Driving Suspended || Suspended License': 'Traffic Violations',
    'Driving Without License || Driving Without License': 'Traffic Violations',
    'Driving Without License || No Drivers License': 'Traffic Violations',
    'Driving Without License || License Violation': 'Traffic Violations',
    'Driving Suspended || Driving While Suspended': 'Traffic Violations',
    'Driving Without License || Driving On Suspended': 'Traffic Violations',
    'Driving Without License || Driving Without Permit': 'Traffic Violations',
    'Driving Without License || No Driving License': 'Traffic Violations',
    'Driving Without License || No Drivers Lic': 'Traffic Violations',
    'Driving Suspended || Driving W/Suspended Lic': 'Traffic Violations',
    'Driving Suspended || Suspended Driver': 'Traffic Violations',
    'Driving Without License || No Driver\'s License': 'Traffic Violations',
    'Driving Without License || No Drivers License': 'Traffic Violations',
    'Driving On Suspended Lic.': 'Traffic Violations',
    'Driving On Susp Lic': 'Traffic Violations',
    'Driving Without License': 'Traffic Violations',
    'Driving With Out License': 'Traffic Violations',
    'Driving Under The Influence': 'Traffic Violations',
    'Driving On Revoked Lic': 'Traffic Violations',
    'Driving W/O Lic': 'Traffic Violations',
    'Driving W/Out License': 'Traffic Violations',
    'Driving On Suspended License': 'Traffic Violations',
    'Sex Offense Fourth Degree': 'Violent Crimes',
    'Rape Second Degree': 'Violent Crimes',
    'Perverted Practice': 'Violent Crimes',
    'Failure To Appear': 'Administrative',
    'Fto': 'Administrative',
    'Failure To Stop After Accident': 'Administrative',
    'Leaving Scene Of Accident': 'Administrative',
    'Alc. Bev./Intox:Endanger': 'Violent Crimes',
    'Obstructing & Hindering': 'Obstruction of Justice',
    'Resist/Interfere With Arrest': 'Obstruction of Justice',
    'Animal Cruelty': 'Violent Crimes',
    'Trespass-Posted Property': 'Property Crimes',
    'Rogue And Vagabond': 'Other',
    'Gaming/Cards- Dice- Etc.': 'Other',
    'Panhandling': 'Other',
    'Public Intoxication': 'Other',
    'Trespassing': 'Property Crimes',
    'Confine Unattended Child': 'Other',
    'Violation Of Peace Order': 'Other',
    'Violation Of Ex Parte': 'Other',
    'Violation Of Protective Order': 'Other',
    'Loitering': 'Other',
    'Aggressive Panhandling': 'Other',
    'Prostitution-General': 'Other',
    'Prostitution': 'Other',
    'Human Traffic-Benefit Financ': 'Other',
    'Perverted Practice': 'Other',
    'Child Neglect': 'Other',
    'Curfew': 'Other',
    'False Imprisonment': 'Other',    
    'Att-Cds Manuf/Dist-Narc || Distribution Cds': 'Drug Offenses',
    'Obstructing & Hindering || Assault On Police': 'Violent Crimes',
    'Att-Cds:Possess-Not Marihuana || Cds': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Cds Possession; Not Marij': 'Drug Offenses',
    'Asslt-Sec Degree || 2 Nd Degree Assault': 'Violent Crimes',
    'Theft Less Than $100.00 || Theft Under $1000': 'Property Crimes',
    'Burglary-4Th Degree-Store || Burglary 4Th': 'Property Crimes',
    'Cds:Possess-Not Marihuana || Poss W/ Int Cocaine': 'Drug Offenses',
    'Cds: Possession-Marihuana || Cds Poss With Intent': 'Drug Offenses',
    'Burglary-4Th Degree Theft || Commercial B&E': 'Property Crimes',
    'Burglary-First Degree || 1St Degree Burglary': 'Property Crimes',
    'Cds:P W/I Dist:Narc || Dist. Heroin': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Possession Of Her': 'Drug Offenses',
    'Theft: Less $1,000 Value || Theft Under $1,000.00': 'Property Crimes',
    'False Stmt To Peace Officer || Cds Violation': 'Public Order Crimes',
    'Theft-Scheme:1K To Under 10K || Theft': 'Property Crimes',
    'Cds:P W/I Dist:Narc || P/W/I Heroin': 'Drug Offenses',
    'Cds:P W/I Dist:Narc || Dist Cds': 'Drug Offenses',
    'Mal Dest Prop/Valu - $500 || Mal Dest Of Property': 'Property Crimes',
    'Cds:Possess-Not Marihuana || Dist Of Cocaine': 'Drug Offenses',
    'Cds:Poss Para || Poss Of Heroin': 'Drug Offenses',
    'Cds:P W/I Dist:Narc || Heroin Distribution': 'Drug Offenses',
    'Con-Cds Manuf/Dist-Narc || Cds Dist Heroin': 'Drug Offenses',
    'DUI': 'Traffic Violations',
    'Affray || Disorderly Conduct': 'Public Order Crimes',
    'U/U Livestock Mv Etc || Motor Vehicle Theft': 'Property Crimes',
    'Dis.Erly Conduct || Disordely': 'Public Order Crimes',
    'Att-Cds Manuf/Dist. || Cds Poss': 'Drug Offenses',
    'Obstructing & Hindering || Fto': 'Public Order Crimes',
    'Cds:P W/I Dist:Narc || Cds W/ Intent': 'Drug Offenses',
    'Know Alter Firearm Id Number || Handgun Violation': 'Weapons Violations',
    'Resist/Interfere With Arrest || Fto': 'Public Order Crimes',
    'Cds:Possess-Not Marihuana || Cds-Poss Heroin': 'Drug Offenses',
    'Cds Poss W/Int To Dist || Dist Marijuana': 'Drug Offenses',
    'PWID': 'Drug Offenses',
    'VIOLATION OF PEACE ORDER': 'Public Order Crimes',
    'Cds: Possession-Marihuana || Cds Pwid Marijuana': 'Drug Offenses',
    'Asslt-Sec Degree || Assault 1St': 'Violent Crimes',
    'Cds:Possess-Not Marihuana || Cds Poss. Other': 'Drug Offenses',
    'Peace Order: Fail To Comply || Violation Of Protection': 'Public Order Crimes',
    'Asslt-Sec Degree || Agg Assault By Threat': 'Violent Crimes',
    'Cds Possess - Lg Amt || Pwi Heroin': 'Drug Offenses',
    'Cds: Possession-Marihuana || Poss Of Cds': 'Drug Offenses',
    'Cds Poss W/Int To Dist || Poss W/Int Marijuana': 'Drug Offenses',
    'Robbery || Common Assault': 'Robbery',
    'Asslt-Sec Degree || Affray': 'Violent Crimes',
    'Cds: Poss Marihuana L/T 10 G || Cds Pwid': 'Drug Offenses',
    'Rifle/Shotgun:Unregistered || Hgv': 'Weapons Violations',
    'Cds:P W/I Dist:Narc || Possession Of Heroin': 'Drug Offenses',
    'Theft Less Than $100.00 || Lar': 'Property Crimes',
    'Firearm/Drug Traf Crime || Hand Gun Violation': 'Weapons Violations',
    'ASSAULT 1ST DEGREE': 'Violent Crimes',
    'Cds: Poss Marihuana L/T 10 G || Cds Possession Mariju': 'Drug Offenses',
    'Driving On Suspended License A': 'Traffic Violations',
    'Asslt-Sec Degree || Aslt': 'Violent Crimes',
    'Reg Firearm:Illegal Possession || Handgun': 'Weapons Violations',
    'Theft: Less $1,000 Value || Theft Under $1,000': 'Property Crimes',
    'Cds Manuf/Dist-Narc || Pwid Heroin': 'Drug Offenses',
    'Trespass: Private Property || Common Assault': 'Violent Crimes',
    'Theft Less Than $100.00 || Cds': 'Property Crimes',
    'Theft: $1,000 To Under $10,000 || Motor Vehicle Theft': 'Property Crimes',
    'Cds: Poss Marihuana L/T 10 G || Possession Mar': 'Drug Offenses',
    'Handgun In Vehicle || Handgun': 'Weapons Violations',
    'Sex Offense Fourth Degree || Sex Offense': 'Violent Crimes',
    'Cds:Possess-Not Marihuana || Poss W/ Heroin': 'Drug Offenses',
    'Mal Dest Prop/Valu - $500 || Distruction Of Property': 'Property Crimes',
    'Cds:P W/I Dist:Narc || Att Dist Cocaine': 'Drug Offenses',
    'Asslt-Sec Degree || Assault And Robbery': 'Violent Crimes',
    'Cds:P W/I Dist:Narc || Poss With Intent Cocaine': 'Drug Offenses',
    'Mal Destr Prop Value + $500 || Destruction Of Propert': 'Property Crimes',
    'Att-Cds Manuf/Dist-Narc || Poss Cds': 'Drug Offenses',
    'Cds Manuf/Dist-Narc || Pwi Cocaine': 'Drug Offenses',
    'Cds:P W/I Dist:Narc || Cds Poss Cocaine': 'Drug Offenses',
    'Cds Manuf/Dist. || Cds Distribution': 'Drug Offenses',
    'Con-Cds Manuf/Dist. || Cds': 'Drug Offenses',
    'Cds: Possession-Marihuana || Hgv/Cds': 'Drug Offenses',
    'Possess/Issue Forged Currency || False Pretense': 'Fraud',
    'Cds:Possess-Not Marihuana || Pwid Marijuana': 'Drug Offenses',
    'Trespass: Private Property || B&E': 'Property Crimes',
    'Mal Destr Prop Value + $500 || Common Assault': 'Violent Crimes',
    'Theft: $1,000 To Under $10,000 || Theft Over $1000': 'Property Crimes',
    'Cds: Possession-Marihuana || Pwi-Marijuana': 'Drug Offenses',
    'Mal Dest Prop/Valu - $500 || Destruction Property': 'Property Crimes',
    'Asslt-Sec Degree || Common Assult': 'Violent Crimes',
    'Armed Robbery || Att. Armed Robbery': 'Robbery',
    'Rifle/Shot-Poss W/Fel Conv || Handgun Violation': 'Weapons Violations',
    'Reckless Driving': 'Traffic Violations',
    'Night Time Curfew': 'Public Order Crimes',
    'BREAKING AND ENTERING': 'Property Crimes',
    'Firearm Poss W/Fel Convict || Handgun': 'Weapons Violations',
    'Theft: Less $1,000 Value || Theft Under $1000.00': 'Property Crimes',
    'Reckless Endangerment || Child Neglect': 'Violent Crimes',
    'Armed Robbery || Att Armed Robbery': 'Robbery',
    'Cds: Poss Marihuana L/T 10 G || Cds-Poss': 'Drug Offenses',
    'Asslt-Sec Degree || Poss Marijuana': 'Violent Crimes',
    'Asslt-Sec Degree || Fto': 'Public Order Crimes',
    'Cds:Possess-Not Marihuana || Assault On Police': 'Violent Crimes',
    'Curfew Violations': 'Public Order Crimes',
    'Cds:P W/I Dist:Narc || Assault On Police': 'Violent Crimes',
    'Att-Cds Manuf/Dist. || Dist Cds': 'Drug Offenses',
    'Robbery || Attempted Unarmed Robbery': 'Robbery',
    'Driving On Suspended Out Of St': 'Traffic Violations',
    'Burglary-4Th Degree-Store || Burglary 4Th Deg': 'Property Crimes',
    'Agg Assault': 'Violent Crimes',
    'Poss Firearm w/ Fel Conv || Possess Firearm': 'Weapons Violations',
    'Possession: Cds || Drug Possession': 'Drug Offenses',
    'Reckless Endangerment || Reckless Endanger': 'Violent Crimes',
    'Theft: Less $1,000 Value || Theft Under 1000': 'Property Crimes',
    'Asslt-Sec Degree || Fleeing And Eluding': 'Public Order Crimes',
    'Cds:Possess-Not Marihuana || Possession Cds': 'Drug Offenses',
    'Robbery || Attempted Robbery': 'Robbery',
    'Mal Destr Prop Value + $500 || Vandalism': 'Property Crimes',
    'Theft: Less $1,000 Value || Theft': 'Property Crimes',
    'Cds:Possess-Not Marihuana || Possession Cocaine': 'Drug Offenses',
    'Cds: Possession-Marihuana || Pwi Cds': 'Drug Offenses',
    'Cds: Possession-Marihuana || Possession Of Heroin': 'Drug Offenses',
    'Theft: Less $1,000 Value || Theft': 'Property Crimes',
    'Robbery || Armed Robbery': 'Robbery',
    'Possession Cds: Marijuana || Drug Possession': 'Drug Offenses',
    'Theft: Less $1,000 Value || Theft': 'Property Crimes',
    'Assault: 2Nd Degree || Assault On Police': 'Violent Crimes',
    'Cds:Possess-Not Marihuana || Possession Of Heroin': 'Drug Offenses',
    'Robbery || Robbery Armed': 'Robbery',
    'Cds:Possess-Not Marihuana || Possession Marijuana': 'Drug Offenses',
    'Reckless Driving || Reckless': 'Traffic Violations',
    'Possess Firearm: Felon || Possession Of Firearm': 'Weapons Violations',
    'Theft: $1,000 To Under $10,000 || Theft 1000': 'Property Crimes',
    'Firearm-Use In Crime || Firearm': 'Weapons Violations',
    'Theft Less Than $100.00 || Theft': 'Property Crimes',
    'Asslt-Sec Degree || Assault 2Nd Degree': 'Violent Crimes',
    'Cds:Possess-Not Marihuana || Cds Possess': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Possession Of Cocaine': 'Drug Offenses',
    'Asslt-Sec Degree || Assault & Robbery': 'Violent Crimes',
    'Possession Cds: Marijuana || Pwi Marijuana': 'Drug Offenses',
    'Unlawful Possession Of Firearm || Firearm Possession': 'Weapons Violations',
    'Cds:Possess-Not Marihuana || Pwi Cds': 'Drug Offenses',
    'Burglary-First Degree || Burglary - 1St Degree': 'Property Crimes',
    'Cds:Possess-Not Marihuana || Possession Of Cds': 'Drug Offenses',
    'Possession Of Cds || Drug Possession': 'Drug Offenses',
    'Possession Of Marijuana || Drug Possession': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Possession Cds': 'Drug Offenses',
    'Driving On Suspended License || Suspended License': 'Traffic Violations',
    'Firearm-Illegal Possession || Firearm': 'Weapons Violations',
    'Cds:Possess-Not Marihuana || Cds': 'Drug Offenses',
    'Theft: Less $1,000 Value || Theft': 'Property Crimes',
    'Robbery || Robbery Armed': 'Robbery',
    'Burglary-4Th Degree-Store || Burglary 4Th': 'Property Crimes',
    'Possess Controlled Dangerous Sub || Cds Possession': 'Drug Offenses',
    'Theft: Less $1,000 Value || Theft': 'Property Crimes',
    'Cds:Possess-Not Marihuana || Cds Possession': 'Drug Offenses',
    'Possession Controlled Dangerous || Drug Possession': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Possess Cds': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Cds Distribution': 'Drug Offenses',
    'Possess Cds: Marijuana || Cds Poss': 'Drug Offenses',
    'Assault 1St || Assault': 'Violent Crimes',
    'Possess Controlled Dangerous || Drug Possession': 'Drug Offenses',
    'Possess Weapon || Weapon Possession': 'Weapons Violations',
    'Possess Controlled Dangerous || Cds Possession': 'Drug Offenses',
    'Possess Marijuana || Drug Possession': 'Drug Offenses',
    'Cds:Possess-Not Marihuana || Possession Cocaine': 'Drug Offenses',
    'Possession Marijuana || Drug Possession': 'Drug Offenses',
    'Theft Less Than $100.00 || Theft': 'Property Crimes',
    'Possess Controlled Dangerous || Drug Possession': 'Drug Offenses',
    'Theft Less Than $100.00 || Theft': 'Property Crimes',
    'Possess Firearm || Weapon Possession': 'Weapons Violations',
    'Theft: Less $1,000 Value || Theft': 'Property Crimes',
    'Possess Controlled Dangerous || Drug Possession': 'Drug Offenses',
    'Firearm Possession || Weapon Possession': 'Weapons Violations',
    'Asslt-Sec Degree || Common Assault': 'Violent Crimes',
    'Possess Marijuana || Drug Possession': 'Drug Offenses',
    'Assault 2Nd Degree || Assault': 'Violent Crimes',
}
# Map a charge description to its category (None when it is not in the dictionary)
def map_charge_description(offense):
    return category_mapping.get(offense)

# Apply the mapping function to the 'ChargeDescription' column
df['ChargeCategory'] = df['ChargeDescription'].apply(map_charge_description)
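
A Python dict literal keeps only the last value given for a repeated key. So when the same description appears twice with different categories (for example 'Hindering' or 'Loitering' in the dictionary above), the later entry is the one that applies. A minimal sketch for spotting such conflicts, assuming the entries are first written as a list of pairs; the short `pairs` list is a hand-picked excerpt:

```python
from collections import defaultdict

# Excerpt of (description, category) pairs, kept as a list so repeats survive
pairs = [
    ('Hindering', 'Violent Crimes'),
    ('Hindering', 'Public Order Crimes'),
    ('Loitering', 'Public Order Crimes'),
    ('Loitering', 'Other'),
    ('Theft', 'Property Crimes'),
]

# Collect every category assigned to each description
seen = defaultdict(set)
for description, category in pairs:
    seen[description].add(category)

# Descriptions that were assigned more than one category
conflicts = {k: sorted(v) for k, v in seen.items() if len(v) > 1}
print(conflicts)

# dict() resolves each conflict in favor of the last pair, like a dict literal
resolved = dict(pairs)
print(resolved['Hindering'])
```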


We drop the following columns:
- 'Arrest', which is the arrest ID.
- 'ArrestDate', from which we derived three new columns ('Year', 'Month' and 'DayOfWeek').
- 'ArrestTime', from which we derived the 'Hour' column (hour only, as a float).
- Of the location columns we keep 'Latitude', 'Longitude' and 'District'; the rest are dropped as redundant and/or incomplete ('ArrestLocation', 'IncidentLocation', 'Post', 'Neighborhood', 'Location 1').
- Of the offense columns we keep only the 'ChargeCategory' column we created; the rest are dropped as redundant and/or incomplete ('IncidentOffense', 'Charge', 'ChargeDescription').

In [21]:
# Drop the columns listed above.
# errors='ignore' skips any name that is not a column of df (e.g. 'OffenseCategory').
columns_to_drop = [
    'Arrest', 'ArrestLocation', 'IncidentLocation', 'IncidentOffense', 'Charge', 'ChargeDescription',
    'OffenseCategory', 'Post', 'Neighborhood', 'Location 1', 'ArrestDate', 'ArrestTime'
]
df_cleaned = df.drop(columns=columns_to_drop, errors='ignore')

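Data cleaning (duplicates and NaNs)

In [23]:
# Note: these calls clean `df`, not `df_cleaned`, so the file written
# from `df_cleaned` is not affected by them.
df.drop_duplicates(inplace=True)
df.dropna(inplace=True)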
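
For a cleaning step to reach a saved file, it has to be applied to the frame that is actually written out. A minimal sketch on a made-up frame `reduced` (standing in for a frame such as `df_cleaned`), assuming rows with any missing value should be discarded:

```python
import numpy as np
import pandas as pd

# Invented rows: one exact duplicate, one missing category, one missing age
reduced = pd.DataFrame({
    'Age': [54.0, 54.0, 22.0, np.nan],
    'Sex': ['M', 'M', 'M', 'F'],
    'ChargeCategory': ['Violent Crimes', 'Violent Crimes', None, 'Other'],
})

# Remove exact duplicate rows, then rows with any missing value
reduced = reduced.drop_duplicates().dropna().reset_index(drop=True)

print(reduced)
```

Dropping every row with a missing value is strict: applied to this dataset it would also discard all arrests without a 'ChargeCategory'.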

Save the modified DataFrame to a new CSV file

In [25]:
df_cleaned.to_csv('New_BPD.csv', index=False)

print("CSV file saved as 'New_BPD.csv'")

CSV file saved as 'New_BPD.csv'


In [26]:
df_original = pd.read_csv('BPD_Arrests.csv')

In [27]:
df_2 = pd.read_csv('New_BPD.csv')

COMPARISON OF THE ORIGINAL DATASET WITH THE ONE USED FOR VISUALIZATIONS AND PREDICTIONS
=

In [29]:
df_original.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 130713 entries, 0 to 130712
Data columns (total 15 columns):
 #   Column             Non-Null Count   Dtype  
---  ------             --------------   -----  
 0   Arrest             123699 non-null  float64
 1   Age                130685 non-null  float64
 2   Sex                130713 non-null  object 
 3   Race               130713 non-null  object 
 4   ArrestDate         130713 non-null  object 
 5   ArrestTime         130713 non-null  object 
 6   ArrestLocation     78595 non-null   object 
 7   IncidentOffense    130713 non-null  object 
 8   IncidentLocation   76987 non-null   object 
 9   Charge             114255 non-null  object 
 10  ChargeDescription  130211 non-null  object 
 11  District           78601 non-null   object 
 12  Post               78583 non-null   float64
 13  Neighborhood       78595 non-null   object 
 14  Location 1         78666 non-null   object 
dtypes: float64(3), object(12)
memory usage: 15.0+ MB


In [30]:
df_2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 78666 entries, 0 to 78665
Data columns (total 11 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Age             78651 non-null  float64
 1   Sex             78666 non-null  object 
 2   Race            78666 non-null  object 
 3   District        78599 non-null  object 
 4   Year            78666 non-null  int64  
 5   Month           78666 non-null  int64  
 6   DayOfWeek       78666 non-null  int64  
 7   Hour            76516 non-null  float64
 8   Latitude        78666 non-null  float64
 9   Longitude       78666 non-null  float64
 10  ChargeCategory  36782 non-null  object 
dtypes: float64(4), int64(3), object(4)
memory usage: 6.6+ MB


In [31]:
df_2["ChargeCategory"].unique()

array([nan, 'Drug Offenses', 'Violent Crimes', 'Public Order Crimes',
       'Traffic Violations', 'Property Crimes', 'Robbery',
       'Weapons Violations', 'Other', 'Administrative'], dtype=object)

- After the exploratory data analysis (EDA) and preparation steps above, we go from about 130,000 rows to about 78,000 (60%), which we use for the visualizations and for data analysis and modeling. Only 36,782 of those rows have a non-null 'ChargeCategory'.
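
The 60% figure follows from the `info()` outputs above; the same outputs also give the share of kept rows that have a category. The row counts below are copied from those outputs.

```python
original_rows = 130713   # rows in df_original
kept_rows = 78666        # rows in df_2
labeled_rows = 36782     # non-null 'ChargeCategory' values in df_2

# Share of the original rows that survive the preparation steps
kept_share = kept_rows / original_rows

# Share of the kept rows that have a charge category
labeled_share = labeled_rows / kept_rows

print(f"kept: {kept_share:.1%}, labeled: {labeled_share:.1%}")
```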