# **Crime Data in Mexico From 2015 to 2023**
## A Data Visualization Project


### Fernando Herrera

### **Introduction**
This project has the aim of visualizing several data points regarding crime in Mexico during the years
2015 to 2023. Diego Valle's website [elcri.men](https://elcri.men/), contains up to date data extracted directly from official
Mexican governmental [sources](https://www.gob.mx/sesnsp/acciones-y-programas/incidencia-delictiva-del-fuero-comun-nueva-metodologia)
and provides the monthly crime count for different types of crimes, sub-types and modalities, for each one
of Mexico's 32 states. An interactive web application was developed using Dash, allowing the user to select one 
or several Mexican states to analize, as well as one to several crime types, sub types, and modalities. The user 
is also able to select the time window to analyze, in order to visualize in a line graph (create using Plotly Express), 
the total count of the selected crime/s in the selected state/s. The app also offers a pie graph, also created with
Plotly Express, in order to compare the total crime count during the time window between all selected states.

### **Dependencies**

In [1]:
!pip install -r requirements.txt

You should consider upgrading via the '/Users/macbook/Desktop/Dev/fhg99.github.io/Projects/Data/DataAnalysis/MexicoCrimeDashboard/.env/bin/Python3 -m pip install --upgrade pip' command.[0m


Basic Imports

In [2]:
import pandas as pd
from pathlib import Path

PROJECT_DIR = Path().parent.resolve()
FILES_DIR = Path(PROJECT_DIR, 'Files')

### **Load Data**

The dataset used for this project was created by Diego Valle and can be accessed through his [webpage](https://elcri.men/acerca/).

In [3]:
data = pd.read_csv(Path(FILES_DIR, 'nm-fuero-comun-estados.csv'))
print(data.columns)
print(data.shape)

Index(['state_code', 'state', 'bien_juridico', 'tipo', 'subtipo', 'modalidad',
       'date', 'count', 'population'],
      dtype='object')
(326144, 9)


For this project we will allow the user to select several variables to filter the data, such as the state,
the affected legal asset (legal term), the crimes' type, subtype and modality. All of these fields will also 
have the defautl option of "Include All" in orther to include all of the incidences in that category. These fields
will be selected through a dropdown component, with the property of Multi set to True, except for the initial and
final date fields of the time window, which should only hold one value out of all the posible dates found in the
dataset.

For this, we need to create a list of dictionaries containing the posible state names for the dropdown
to contain only states mentioned in the data. We also creat a dictionary for tbe option to select 
all posible values at once and insert it at index 0 of the options list.

In [4]:
states = data['state'].unique()
states

array(['AGUASCALIENTES', 'BAJA CALIFORNIA', 'BAJA CALIFORNIA SUR',
       'CAMPECHE', 'COAHUILA', 'COLIMA', 'CHIAPAS', 'CHIHUAHUA',
       'CIUDAD DE MÉXICO', 'DURANGO', 'GUANAJUATO', 'GUERRERO', 'HIDALGO',
       'JALISCO', 'MÉXICO', 'MICHOACÁN', 'MORELOS', 'NAYARIT',
       'NUEVO LEÓN', 'OAXACA', 'PUEBLA', 'QUERÉTARO', 'QUINTANA ROO',
       'SAN LUIS POTOSÍ', 'SINALOA', 'SONORA', 'TABASCO', 'TAMAULIPAS',
       'TLAXCALA', 'VERACRUZ', 'YUCATÁN', 'ZACATECAS'], dtype=object)

In [5]:
state_options = [{'label': state, 'value': state} for state in states]
# 'INCLUIR TODOS' translates to 'INCLUDE ALL'
show_all_option = {'label': 'INCLUIR TODOS', 'value': 'INCLUIR TODOS'}
state_options.insert(0, show_all_option)
state_options

[{'label': 'INCLUIR TODOS', 'value': 'INCLUIR TODOS'},
 {'label': 'AGUASCALIENTES', 'value': 'AGUASCALIENTES'},
 {'label': 'BAJA CALIFORNIA', 'value': 'BAJA CALIFORNIA'},
 {'label': 'BAJA CALIFORNIA SUR', 'value': 'BAJA CALIFORNIA SUR'},
 {'label': 'CAMPECHE', 'value': 'CAMPECHE'},
 {'label': 'COAHUILA', 'value': 'COAHUILA'},
 {'label': 'COLIMA', 'value': 'COLIMA'},
 {'label': 'CHIAPAS', 'value': 'CHIAPAS'},
 {'label': 'CHIHUAHUA', 'value': 'CHIHUAHUA'},
 {'label': 'CIUDAD DE MÉXICO', 'value': 'CIUDAD DE MÉXICO'},
 {'label': 'DURANGO', 'value': 'DURANGO'},
 {'label': 'GUANAJUATO', 'value': 'GUANAJUATO'},
 {'label': 'GUERRERO', 'value': 'GUERRERO'},
 {'label': 'HIDALGO', 'value': 'HIDALGO'},
 {'label': 'JALISCO', 'value': 'JALISCO'},
 {'label': 'MÉXICO', 'value': 'MÉXICO'},
 {'label': 'MICHOACÁN', 'value': 'MICHOACÁN'},
 {'label': 'MORELOS', 'value': 'MORELOS'},
 {'label': 'NAYARIT', 'value': 'NAYARIT'},
 {'label': 'NUEVO LEÓN', 'value': 'NUEVO LEÓN'},
 {'label': 'OAXACA', 'value': 'OAX

Now for the affected legal-asset options:

In [6]:
legal_assets = data['bien_juridico'].unique()
legal_assets # The first 3 of them traduce to "Heritage, Family, Sexual Freedom and Safety"

array(['EL PATRIMONIO', 'LA FAMILIA', 'LA LIBERTAD Y LA SEGURIDAD SEXUAL',
       'LA SOCIEDAD', 'LA VIDA Y LA INTEGRIDAD CORPORAL',
       'LIBERTAD PERSONAL',
       'OTROS BIENES JURÍDICOS AFECTADOS (DEL FUERO COMÚN)'], dtype=object)

In [7]:
legal_asset_options = [{'label': asset, 'value': asset} for asset in legal_assets]
legal_asset_options.insert(0, show_all_option)
legal_asset_options

[{'label': 'INCLUIR TODOS', 'value': 'INCLUIR TODOS'},
 {'label': 'EL PATRIMONIO', 'value': 'EL PATRIMONIO'},
 {'label': 'LA FAMILIA', 'value': 'LA FAMILIA'},
 {'label': 'LA LIBERTAD Y LA SEGURIDAD SEXUAL',
  'value': 'LA LIBERTAD Y LA SEGURIDAD SEXUAL'},
 {'label': 'LA SOCIEDAD', 'value': 'LA SOCIEDAD'},
 {'label': 'LA VIDA Y LA INTEGRIDAD CORPORAL',
  'value': 'LA VIDA Y LA INTEGRIDAD CORPORAL'},
 {'label': 'LIBERTAD PERSONAL', 'value': 'LIBERTAD PERSONAL'},
 {'label': 'OTROS BIENES JURÍDICOS AFECTADOS (DEL FUERO COMÚN)',
  'value': 'OTROS BIENES JURÍDICOS AFECTADOS (DEL FUERO COMÚN)'}]

Now, once a legal asset is provided, the type, sub-type and modality of the crime can be determined, but
each category has it's own posible sub categories available. For example, there are only certain type
of crimes under a given legal asset, as there are only several sub-types of crimes given a certain crime
type, and the same goes for modality. For this, we can create a function for each one of these 3 new 
categories, that returns the options list depending on the passed value of the previous category.

In [8]:
# Default options: contains all possible crime types plus the "Include All" option.
crime_types = data['tipo'].unique()
crime_type_options = [{'label': crime_type, 'value': crime_type} for crime_type in crime_types]
crime_type_options.insert(0, show_all_option)

def get_type_options(selected_legal_assets):
    crime_type_options = []
    if('INCLUIR TODOS' in selected_legal_assets):
        crime_types = data['tipo'].unique()
    else:
        crime_types = data[data['bien_juridico'].isin(selected_legal_assets)]['tipo'].unique()
    crime_type_options = [{'label': crime_type, 'value': crime_type} for crime_type in crime_types]
    crime_type_options.insert(0, show_all_option)
    return crime_type_options

# We test the function passing the first value in the legal_assets list a single element list.
# We can see that for the affected legal asset of "Heritage" ("EL PATRIMONIO"), there are crime types 
# such as: "Trust Abuse" ("ABUSO DE CONFIANZA") and "Property Damage" ("DAÑO A LA PROPIEDAD")
print(legal_assets[0])
get_type_options([legal_assets[0]])

EL PATRIMONIO


[{'label': 'INCLUIR TODOS', 'value': 'INCLUIR TODOS'},
 {'label': 'ABUSO DE CONFIANZA', 'value': 'ABUSO DE CONFIANZA'},
 {'label': 'DAÑO A LA PROPIEDAD', 'value': 'DAÑO A LA PROPIEDAD'},
 {'label': 'DESPOJO', 'value': 'DESPOJO'},
 {'label': 'EXTORSIÓN', 'value': 'EXTORSIÓN'},
 {'label': 'FRAUDE', 'value': 'FRAUDE'},
 {'label': 'OTROS DELITOS CONTRA EL PATRIMONIO',
  'value': 'OTROS DELITOS CONTRA EL PATRIMONIO'},
 {'label': 'ROBO', 'value': 'ROBO'}]

A similar function is created for sub-type and modality.

In [9]:
# Default options: contains all possible crime subtypes plus the "Include All" option.
crime_sub_types = data['subtipo'].unique()
crime_sub_type_options = [{'label': crime_sub_type, 'value': crime_sub_type} for crime_sub_type in crime_sub_types]
crime_sub_type_options.insert(0, show_all_option)

def get_sub_type_options(selected_crime_types):
    crime_sub_type_options = []
    if('INCLUIR TODOS' in selected_crime_types):
        crime_sub_types = data['subtipo'].unique()
    else:
        crime_sub_types = data[data['tipo'].isin(selected_crime_types)]['subtipo'].unique()
    crime_sub_type_options = [{'label': crime_sub_type, 'value': crime_sub_type} for crime_sub_type in crime_sub_types]
    crime_sub_type_options.insert(0, show_all_option)
    return crime_sub_type_options

get_sub_type_options(['ROBO'])

[{'label': 'INCLUIR TODOS', 'value': 'INCLUIR TODOS'},
 {'label': 'OTROS ROBOS', 'value': 'OTROS ROBOS'},
 {'label': 'ROBO A CASA HABITACIÓN', 'value': 'ROBO A CASA HABITACIÓN'},
 {'label': 'ROBO A INSTITUCIÓN BANCARIA',
  'value': 'ROBO A INSTITUCIÓN BANCARIA'},
 {'label': 'ROBO A NEGOCIO', 'value': 'ROBO A NEGOCIO'},
 {'label': 'ROBO A TRANSEÚNTE EN ESPACIO ABIERTO AL PÚBLICO',
  'value': 'ROBO A TRANSEÚNTE EN ESPACIO ABIERTO AL PÚBLICO'},
 {'label': 'ROBO A TRANSEÚNTE EN VÍA PÚBLICA',
  'value': 'ROBO A TRANSEÚNTE EN VÍA PÚBLICA'},
 {'label': 'ROBO A TRANSPORTISTA', 'value': 'ROBO A TRANSPORTISTA'},
 {'label': 'ROBO DE AUTOPARTES', 'value': 'ROBO DE AUTOPARTES'},
 {'label': 'ROBO DE GANADO', 'value': 'ROBO DE GANADO'},
 {'label': 'ROBO DE MAQUINARIA', 'value': 'ROBO DE MAQUINARIA'},
 {'label': 'ROBO DE VEHÍCULO AUTOMOTOR',
  'value': 'ROBO DE VEHÍCULO AUTOMOTOR'},
 {'label': 'ROBO EN TRANSPORTE INDIVIDUAL',
  'value': 'ROBO EN TRANSPORTE INDIVIDUAL'},
 {'label': 'ROBO EN TRANSPORTE 

In [10]:
# Default options: contains all possible crime modalities plus the "Include All" option.
crime_modalities = data['modalidad'].unique()
crime_modality_options = [{'label': crime_modality, 'value': crime_modality} for crime_modality in crime_modalities]
crime_modality_options.insert(0, show_all_option)

def get_modality_options(selected_crime_sub_types):
    crime_modality_options = []
    if('INCLUIR TODOS' in selected_crime_sub_types):
        crime_modalities = data['modalidad'].unique()
    else:
        crime_modalities = data[data['subtipo'].isin(selected_crime_sub_types)]['modalidad'].unique()
    crime_modality_options = [{'label': crime_modality, 'value': crime_modality} for crime_modality in crime_modalities]
    crime_modality_options.insert(0, show_all_option)
    return crime_modality_options

get_modality_options(['ROBO A CASA HABITACIÓN'])

[{'label': 'INCLUIR TODOS', 'value': 'INCLUIR TODOS'},
 {'label': 'CON VIOLENCIA', 'value': 'CON VIOLENCIA'},
 {'label': 'SIN VIOLENCIA', 'value': 'SIN VIOLENCIA'}]

Finally, we need to create the options lists for all for the possible dates.

In [11]:
dates = data['date'].unique()
date_options = [{'label': date, 'value': date} for date in dates]
date_options

[{'label': '2015-01', 'value': '2015-01'},
 {'label': '2015-02', 'value': '2015-02'},
 {'label': '2015-03', 'value': '2015-03'},
 {'label': '2015-04', 'value': '2015-04'},
 {'label': '2015-05', 'value': '2015-05'},
 {'label': '2015-06', 'value': '2015-06'},
 {'label': '2015-07', 'value': '2015-07'},
 {'label': '2015-08', 'value': '2015-08'},
 {'label': '2015-09', 'value': '2015-09'},
 {'label': '2015-10', 'value': '2015-10'},
 {'label': '2015-11', 'value': '2015-11'},
 {'label': '2015-12', 'value': '2015-12'},
 {'label': '2016-01', 'value': '2016-01'},
 {'label': '2016-02', 'value': '2016-02'},
 {'label': '2016-03', 'value': '2016-03'},
 {'label': '2016-04', 'value': '2016-04'},
 {'label': '2016-05', 'value': '2016-05'},
 {'label': '2016-06', 'value': '2016-06'},
 {'label': '2016-07', 'value': '2016-07'},
 {'label': '2016-08', 'value': '2016-08'},
 {'label': '2016-09', 'value': '2016-09'},
 {'label': '2016-10', 'value': '2016-10'},
 {'label': '2016-11', 'value': '2016-11'},
 {'label': 

### **Initialize Application and Imports**


In [12]:
# Import packages
from dash import Dash, html, dcc, Output, Input
import plotly.express as px

# Initialize the app
app = Dash(__name__)

### **Define Layout**


In [13]:
# App layout
app.layout = html.Div([
    html.H1('Incidencia Delictiva en México / Crime Incidence in Mexico',
            style={'fontSize': '2rem',
                   'text-align': 'center'}),
    html.Div([
        html.Label("Estado / State:",
                   style={'font-size': '1.2rem'}),
        dcc.Dropdown(
            id='dropdown-state',
            options=state_options,
            value=['INCLUIR TODOS'],
            placeholder='ESTADO / STATE',
            multi=True,
            style={'margin-top': '0.2rem',
                   'margin-bottom': '0.3rem'}
        )
    ]),
    html.Div([
        html.Label("Bien Jurídico Afectado / Affected Legal Asset:",
                   style={'font-size': '1.2rem'}),
        dcc.Dropdown(
            id='dropdown-legal-asset',
            options=legal_asset_options,
            value=['INCLUIR TODOS'],
            placeholder='BIEN JURÍDICO AFECTADO / AFFECTED LEGAL ASSET',
            multi=True,
            style={'margin-top': '0.2rem',
                   'margin-bottom': '0.3rem'}
        )
    ]),
    html.Div([
        html.Label("Tipo / Type:",
                   style={'font-size': '1.2rem'}),
        dcc.Dropdown(
            id='dropdown-type',
            options=crime_type_options,
            value=['INCLUIR TODOS'],
            placeholder='TIPO / TYPE',
            multi=True,
            style={'margin-top': '0.2rem',
                   'margin-bottom': '0.3rem'}
        )
    ]),
    html.Div([
        html.Label("Subtipo / Subtype:",
                   style={'font-size': '1.2rem'}),
        dcc.Dropdown(
            id='dropdown-sub-type',
            options=crime_sub_type_options,
            value=['INCLUIR TODOS'],
            placeholder='SUBTIPO / SUBTYPE',
            multi=True,
            style={'margin-top': '0.2rem',
                   'margin-bottom': '0.3rem'}
        )
    ]),
    html.Div([
        html.Label("Modalidad / Modality:",
                   style={'font-size': '1.2rem'}),
        dcc.Dropdown(
            id='dropdown-modality',
            options=crime_modality_options,
            value=['INCLUIR TODOS'],
            placeholder='MODALIDAD / MODALITY',
            multi=True,
            style={'margin-top': '0.2rem',
                   'margin-bottom': '0.3rem'}
        )
    ]),
    html.Div([
        html.Div([
            html.Label("Fecha Inicial / Initial Date:",
                   style={'font-size': '1.2rem'}),
            dcc.Dropdown(
                id='dropdown-initial-date',
                options=date_options,
                value=dates[0],
                placeholder='FECHA INICIO / INITIAL DATE',
                style={'margin-top': '0.2rem',
                    'margin-bottom': '0.3rem',
                    'width': '20rem'}
            )
        ]),
        html.Div([
            html.Label("Fecha Final / Final Date:",
                   style={'font-size': '1.2rem'}),
            dcc.Dropdown(
                id='dropdown-final-date',
                options=date_options,
                value=dates[-1],
                placeholder='FECHA FINAL / FINAL DATE',
                style={'margin-top': '0.2rem',
                    'margin-bottom': '0.3rem',
                    'width': '20rem'}
            )
        ])
    ], style={'display': 'inline-flex'}),
    html.Div([
        html.Div(id='output-container', className='chart-grid', style={'display': 'flex'})
    ])
])

### **Callback Functions**
<br>
This firt callback function listens to changes in the affectd legal asset dropdown in order to 
change the options of the other dropdowns.


In [14]:
@app.callback(
    Output(component_id='dropdown-type', component_property='options'),
    Input(component_id='dropdown-legal-asset',component_property='value'))
def update_input_container(selected_legal_assets):
    return get_type_options(selected_legal_assets)


@app.callback(
    Output(component_id='dropdown-sub-type', component_property='options'),
    Input(component_id='dropdown-type',component_property='value'))
def update_input_container(selected_crime_types):
    return get_sub_type_options(selected_crime_types)

@app.callback(
    Output(component_id='dropdown-modality', component_property='options'),
    Input(component_id='dropdown-sub-type',component_property='value'))
def update_input_container(selected_crime_sub_types):
    return get_modality_options(selected_crime_sub_types)

Finally, we add the callback to listen to any changes in the input data and to add both graphs to the output container.
For those dropdowns that have the "Include All" option available, if that option is selected then no filtering is
done for that specific variable.

In [15]:
# Callback for plotting
@app.callback(
    Output(component_id='output-container', component_property='children'),
    [Input(component_id='dropdown-state', component_property='value'), 
     Input(component_id='dropdown-legal-asset', component_property='value'),
     Input(component_id='dropdown-type', component_property='value'),
     Input(component_id='dropdown-sub-type', component_property='value'),
     Input(component_id='dropdown-modality', component_property='value'),
     Input(component_id='dropdown-initial-date', component_property='value'),
     Input(component_id='dropdown-final-date', component_property='value')])

def update_output_container(selected_states,
                            selected_legal_assets,
                            selected_types,
                            selected_sub_types,
                            selected_modalities,
                            selected_initial_date,
                            selected_final_date):
    selected_data = data.copy()
    if('INCLUIR TODOS' not in selected_states):
        selected_data = data[data['state'].isin(selected_states)]
    if('INCLUIR TODOS' not in selected_legal_assets):
        selected_data = selected_data[selected_data['bien_juridico'].isin(selected_legal_assets)]
    if('INCLUIR TODOS' not in selected_types):
        selected_data = selected_data[selected_data['tipo'].isin(selected_types)]
    if('INCLUIR TODOS' not in selected_sub_types):
        selected_data = selected_data[selected_data['subtipo'].isin(selected_sub_types)]
    if('INCLUIR TODOS' not in selected_modalities):
        selected_data = selected_data[selected_data['modalidad'].isin(selected_modalities)]
    selected_data['date'] = pd.to_datetime(selected_data['date']).copy()
    selected_data = selected_data[(selected_data['date'] >= selected_initial_date) & (selected_data['date'] <= selected_final_date)]

# Line Graph
    monthly_data = selected_data.groupby('date')['count'].sum().reset_index()
    R_chart1 = dcc.Graph(
            figure=px.line(monthly_data, 
            x='date',
            y='count',
            title="Conteo Total de Incidencias <br>Total Incidence Count"))

# Pie Chart
    state_data = selected_data.groupby('state')['count'].sum().reset_index()
    R_chart2 = dcc.Graph(
            figure=px.pie(state_data,
            values='count',
            names='state',
                 title="Incidencias por Estado <br>Incidences by State"))

    return [
            html.Div(className='chart-item', children=[html.Div(children=R_chart1),html.Div(children=R_chart2)],style={'display': 'flex'})
            ]

### **Run the App**
<br>
We can now run the application. It can be locally accessed through -> http://localhost:8050/


In [16]:
# Ignore deprecation and future warnings
import warnings
warnings.filterwarnings('ignore', category=DeprecationWarning)
warnings.filterwarnings('ignore', category=FutureWarning)

# Run the app
if __name__ == '__main__':
    app.run(debug=True, use_reloader=False)



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/

### **Examlple of Usage**

<video controls src="Files/usage.mp4" />
