# Time series analysis
This notebook analyzes how the incidents in the dataset evolve over time, in terms of several measures such as the number of incidents or the number of people involved.

In [1]:
# Import the required libraries
import pandas as pd
import matplotlib.pyplot as plt
import plotly.graph_objects as go  # needed by every plot below
from statsmodels.tsa.seasonal import seasonal_decompose

In [2]:
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

We load the preprocessed incidents dataset. Since this is a time series analysis, we use the "date" column as the index and drop the leftover "Unnamed: 0" column.

In [3]:
df = pd.read_csv('gun_violence_dataset/incidents_dataset.csv', parse_dates=['date'], dayfirst=False, index_col='date')
df = df.drop(labels=['Unnamed: 0'], axis=1)
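
Every resampling step later in the notebook relies on the index being a `DatetimeIndex`. A minimal sketch of how that could be checked, using a tiny in-memory CSV instead of the real file (the `toy` frame and its three rows are made up for illustration):

```python
import io
import pandas as pd

# Hypothetical three-row stand-in for incidents_dataset.csv
csv_text = """date,incident_id,state
2013-01-07,3,Ohio
2013-01-01,1,Ohio
2013-02-03,2,Texas
"""

toy = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'], index_col='date')

# resample() needs a datetime-like index, so verify the parsing worked
is_datetime_index = isinstance(toy.index, pd.DatetimeIndex)

# Sorting by date keeps head() and any slicing by date range predictable
toy = toy.sort_index()
is_sorted = toy.index.is_monotonic_increasing
```

If `is_datetime_index` came out `False` on the real data, the `date` column probably contains values that pandas could not parse as dates.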

In [4]:
df.head()

date,incident_id,state,city_or_county,address,n_killed,n_injured,congressional_district,latitude,longitude,n_guns_involved,notes,average_age,n_victims,n_suspects,stolen_gun_involved,minors_involved,women_percentage,domestic violence,drive-by,gang involvement,home invasion,institution/group/business,shooting,murder,officer involved,possession,shot,suicide,drug involvement,stolen/illegally owned gun,bar/club incident,sex crime,school incident,kidnapping/abductions/hostage,defensive use,car-jacking,armed robbery,hate crime,child,accidental/negligent discharge,brandishing/flourishing/open carry/lost/found,gun(s) stolen,guns stolen,road rage,under the influence of alcohol or drugs,bb/pellet/replica gun,cleaning gun,implied weapon,house party,atf/le confiscation/raid/arrest,gun at school,thought gun was unloaded,hunting accident,stolen gun,police targeted,pistol-whipping,playing with gun,shootout,unlawful purchase/sale,non-aggression,gun shop robbery or burglary,concealed carry license,assault weapon,lockdown/alert,tsa action,terrorism,ghost gun,political violence,mistaken id,nav,gun buy back action
2013-01-01,461105,Pennsylvania,Mckeesport,1506 Versailles Avenue and Coursin Street,0,4,14.0,40.3467,-79.8559,,Julian Sims under investigation: Four Shot and...,20.0,4.0,1.0,Unknown,False,25.0,False,False,False,False,False,True,False,False,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2013-01-01,460726,California,Hawthorne,13500 block of Cerise Avenue,1,3,43.0,33.909,-118.333,,Four Shot; One Killed; Unidentified shooter in...,20.0,4.0,1.0,Unknown,False,0.0,False,False,True,False,False,True,True,False,False,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2013-01-01,478855,Ohio,Lorain,1776 East 28th Street,1,3,9.0,41.4455,-82.1377,2.0,,31.2,3.0,2.0,Unknown,False,0.0,False,False,False,False,False,False,True,False,False,True,True,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2013-01-05,478925,Colorado,Aurora,16000 block of East Ithaca Place,4,0,6.0,39.6518,-104.802,,,37.75,3.0,1.0,Unknown,False,25.0,False,False,False,False,False,True,True,True,False,True,True,True,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
2013-01-07,478959,North Carolina,Greensboro,307 Mourning Dove Terrace,2,2,6.0,36.114,-79.9569,2.0,Two firearms recovered. (Attempted) murder sui...,31.25,3.0,1.0,Unknown,True,50.0,True,False,False,False,False,False,True,False,False,True,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False


# By number of incidents

In [58]:
# Count incidents per year once and reuse the result for both axes
incidents_per_year = df.incident_id.resample('Y').count()

fig4 = go.Figure()

fig4.add_trace(
    go.Scatter(
        x=incidents_per_year.index,
        y=incidents_per_year,
        name='Number of incidents per year'
    )
)

fig4.update_layout(
    title={
        'text': "Number of incidents per year",
        'y': 0.85,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)

fig4.show()

In [57]:
# Count incidents per month once and reuse the result for both axes
incidents_per_month = df.incident_id.resample('M').count()

fig4 = go.Figure()

fig4.add_trace(
    go.Scatter(
        x=incidents_per_month.index,
        y=incidents_per_month,
        name='Number of incidents per month'
    )
)

fig4.update_layout(
    title={
        'text': "Number of incidents per month",
        'y': 0.85,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)

fig4.show()
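
The two plots above rest on counting rows per calendar period. A small sketch on made-up data shows what that counting produces; the `toy` frame is hypothetical, and the aliases `'MS'` and `'YS'` (period start) are used here instead of the notebook's `'M'` and `'Y'` only because they are accepted by both older and newer pandas releases:

```python
import pandas as pd

# Four hypothetical incidents spread over two years
dates = pd.to_datetime(['2013-01-01', '2013-01-15', '2013-03-02', '2014-01-10'])
toy = pd.DataFrame({'incident_id': [1, 2, 3, 4]}, index=dates)

# One bin per calendar month between the first and last incident;
# months without incidents still appear, with a count of 0
monthly = toy['incident_id'].resample('MS').count()

# One bin per calendar year
yearly = toy['incident_id'].resample('YS').count()
```

Here `monthly` has 13 entries (January 2013 through January 2014), most of them zero. With period-start aliases each bin is labeled by its first day rather than its last, which shifts the points on the x axis but not the counts.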

# Seasonal component (trend and seasonality)

We analyze the seasonal component of the monthly incident counts.

In [9]:
# Seasonal decomposition
seasonal_decomposition = seasonal_decompose(df.incident_id.resample('M').count(), model='multiplicative', period=12)

If we select a specific time window, we can see that the trend is upward.

In [29]:
# Trend
trend = seasonal_decomposition.trend

fig4 = go.Figure()

fig4.add_trace(
    go.Scatter(
        x=trend.index,
        y=trend,
        name='Trend'
    )
)

fig4.update_layout(
    title={
        'text': "Trend analysis",
        'y': 0.85,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)

fig4.show()

In [56]:
# Seasonality
seasonality = seasonal_decomposition.seasonal

fig4 = go.Figure()

fig4.add_trace(
    go.Scatter(
        x=seasonality.index,
        y=seasonality,
        name='Seasonality'
    )
)

fig4.update_layout(
    title={
        'text': "Seasonality analysis",
        'y': 0.85,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)

fig4.show()
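
To make the multiplicative model more concrete, the following is a simplified, hand-written sketch of a classical decomposition on synthetic data. It is not the statsmodels implementation: the function name `classical_multiplicative`, the synthetic series, and the restriction to an even `period` are all assumptions made for illustration. The idea is that the observed series is treated as the product trend × seasonal × residual.

```python
import numpy as np
import pandas as pd

def classical_multiplicative(y: pd.Series, period: int = 12):
    """Simplified classical multiplicative decomposition (even period only)."""
    # Centered moving average over period + 1 points, half weight at both ends
    weights = np.r_[0.5, np.ones(period - 1), 0.5] / period
    trend = y.rolling(window=period + 1, center=True).apply(
        lambda window: np.dot(window, weights), raw=True
    )
    # Dividing out the trend leaves seasonal factor times noise
    detrended = y / trend
    # Average the detrended values at each position within the period
    position = np.arange(len(y)) % period
    seasonal_means = detrended.groupby(position).mean()
    # Normalize so that the seasonal factors average to 1
    seasonal_means = seasonal_means / seasonal_means.mean()
    seasonal = pd.Series(seasonal_means.loc[position].to_numpy(), index=y.index)
    # Whatever trend and seasonality do not explain
    resid = y / (trend * seasonal)
    return trend, seasonal, resid

# Synthetic monthly series: rising trend times a yearly seasonal factor
index = pd.date_range('2013-01-01', periods=48, freq='MS')
t = np.arange(48)
y = pd.Series((100 + 2 * t) * (1 + 0.2 * np.sin(2 * np.pi * t / 12)), index=index)

trend, seasonal, resid = classical_multiplicative(y)
```

In this sketch the trend is undefined for the first and last six months because the centered window does not fit there, which would be consistent with gaps at both ends of a trend plot. A multiplicative model also assumes strictly positive values, so a month with zero incidents would need separate handling.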

# Number of victims, suspects, injured and killed per month

In [53]:
# Monthly totals for each measure, computed once
monthly_totals = df[['n_killed', 'n_injured', 'n_victims', 'n_suspects']].resample('M').sum()

# Dashed lines for killed and injured, solid lines for victims and suspects
line_styles = {
    'n_killed': dict(color="#FF0000", dash='dash'),
    'n_injured': dict(color="#0000FF", dash='dash'),
    'n_victims': dict(color="#FF0000"),
    'n_suspects': dict(color="#0000FF"),
}

fig4 = go.Figure()

for column, line in line_styles.items():
    fig4.add_trace(
        go.Scatter(
            x=monthly_totals.index,
            y=monthly_totals[column],
            name=column,
            line=line
        )
    )

fig4.update_layout(
    title={
        'text': "Number of victims, suspects, injured and killed per month",
        'y': 0.9,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)

fig4.show()
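
The monthly totals plotted above can be computed for all four columns in one call. A minimal sketch on made-up rows (the `toy` frame is hypothetical, and `'MS'` is used only as an alias that works across pandas versions) also shows how missing values behave:

```python
import pandas as pd

# Three hypothetical incidents; one has an unknown number of victims
dates = pd.to_datetime(['2013-01-01', '2013-01-20', '2013-02-05'])
toy = pd.DataFrame(
    {
        'n_killed': [0, 1, 2],
        'n_injured': [4, 3, 0],
        'n_victims': [4.0, 4.0, None],
        'n_suspects': [1.0, 1.0, 1.0],
    },
    index=dates,
)

columns = ['n_killed', 'n_injured', 'n_victims', 'n_suspects']

# One row per month, one column per measure
monthly_totals = toy[columns].resample('MS').sum()
```

Note that `sum()` skips missing values, so the February total for `n_victims` is 0 rather than missing. If the real columns contain gaps, the monthly totals are lower bounds rather than exact figures.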

# Number of incidents per state

In [13]:
import plotly.io as pio
import plotly.graph_objects as go

In [52]:
fig3 = go.Figure()
state_list = list(df['state'].unique())

# One trace per state; only the first state is visible initially
for i, state in enumerate(state_list):
    # Boolean series (True for incidents in this state), so the monthly sum is a count
    incidents_in_state = (df['state'] == state).resample('M').sum()

    fig3.add_trace(
        go.Scatter(
            x=incidents_in_state.index,
            y=incidents_in_state,
            name=state,
            visible=(i == 0)
        )
    )

# One dropdown button per state; each button shows only its own trace
buttons = []
for i, state in enumerate(state_list):
    visible = [False] * len(state_list)
    visible[i] = True

    buttons.append(dict(label=state,
                        method="update",
                        args=[{"visible": visible}]))

fig3.update_layout(updatemenus=[dict(active=0,
                                     type="dropdown",
                                     buttons=buttons,
                                     x=0,
                                     y=1.1,
                                     xanchor='left',
                                     yanchor='bottom')])

fig3.update_layout(
    title={
        'text': "Number of incidents per state",
        'y': 0.9,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)

fig3.update_layout(
    autosize=False,
    width=1000,
    height=800)
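
The dropdown works by giving each button a list of booleans, one per trace, with a single `True` at the position of the trace it should show. A plotly-free sketch of that construction, with a hypothetical helper `build_dropdown_buttons` and made-up state names:

```python
def build_dropdown_buttons(labels):
    """Build one button per label, each showing only the trace at its own position."""
    buttons = []
    for position, label in enumerate(labels):
        # True only at this label's position, False everywhere else
        visible = [index == position for index in range(len(labels))]
        buttons.append({
            'label': label,
            'method': 'update',
            'args': [{'visible': visible}],
        })
    return buttons

buttons = build_dropdown_buttons(['Ohio', 'Texas', 'Utah'])
```

For this to behave as intended, the traces must have been added to the figure in the same order as the labels, and only the first trace should start out visible so the initial view matches the dropdown's active entry.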

# Average age of the people involved

In [50]:
# Average the per-incident mean age within each month
average_age_per_month = df.average_age.resample('M').mean()

fig4 = go.Figure()

fig4.add_trace(
    go.Scatter(
        x=average_age_per_month.index,
        y=average_age_per_month,
        name='Average age of the people involved per month'
    )
)

fig4.update_layout(
    title={
        'text': "Average age of the people involved per month",
        'y': 0.85,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    }
)

fig4.show()
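
A monthly mean is only as reliable as the number of incidents behind it. As a rough sketch, assuming some incidents may have no recorded `average_age`, the mean can be computed alongside the count of incidents that actually contributed to it (the `toy` frame is made up, and `'MS'` is used only as an alias that works across pandas versions):

```python
import numpy as np
import pandas as pd

# Four hypothetical incidents; one has no recorded age
dates = pd.to_datetime(['2013-01-01', '2013-01-15', '2013-01-20', '2013-03-05'])
toy = pd.DataFrame({'average_age': [20.0, np.nan, 30.0, 40.0]}, index=dates)

# 'mean' ignores missing ages; 'count' reports how many ages were available
summary = toy['average_age'].resample('MS').agg(['mean', 'count'])
```

In this toy data January averages two of its three incidents, and February has no incidents at all, so its mean is missing and would show up as a gap in the line. Each incident also counts equally in the monthly mean regardless of how many people it involved.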