# Discovering traffic accident patterns in Mexico City (CDMX)

## Data preprocessing and cleaning

### Datasets 

This layer contains the territorial boundaries of the 16 alcaldías (boroughs) of Mexico City:

<a href = "https://datos.cdmx.gob.mx/explore/dataset/alcaldias/map/?location=9,19.32072,-99.15261">Alcaldías</a>

At the following link, the file "mexico-latest-free.shp.zip" contains the layers that hold the points of interest:

<a href = "http://download.geofabrik.de/north-america/mexico.html">OpenStreetMap data</a>

This layer contains the road incidents reported by the C5 since 2014, updated monthly:

<a href = "https://datos.cdmx.gob.mx/explore/dataset/incidentes-viales-c5/map/?disjunctive.incidente_c4&location=10,19.33685,-99.15797">C5 road incidents</a>


The following links provide the original files used to build the transport stations layer:
- [Metro](https://datos.cdmx.gob.mx/explore/dataset/estaciones-metro/map/?location=10,19.42256,-99.11934)
- [Metrobús](https://datos.cdmx.gob.mx/explore/dataset/estaciones-metrobus/custom/)
- [RTP](https://datos.cdmx.gob.mx/explore/dataset/paradas-de-rtp/map/?location=10,19.35506,-99.14389)
- [Trolebús](https://datos.cdmx.gob.mx/explore/dataset/paradas-de-trolebus/map/?location=10,19.38553,-99.13676)
- [Sistema de Transporte Unificado](https://datos.cdmx.gob.mx/explore/dataset/estaciones-paradas-y-terminales-del-sistema-de-transporte-unificado/map/?location=10,19.36109,-99.15163)
- [CETRAM](https://datos.cdmx.gob.mx/explore/dataset/ubicacion-de-centros-de-transferencia-modal-cetram/map/?location=10,19.38,-99.09805)



We import all the required libraries.

In [1]:
from math import radians, cos, sin, asin, sqrt, atan2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
from pylab import rcParams
import seaborn as sns
import folium
import networkx as nx
import osmnx as ox
import geopandas as gpd
from shapely.geometry import Point, Polygon, LineString, MultiLineString
import shapefile as shp
import warnings
warnings.filterwarnings('ignore')
pd.set_option('display.max_columns', None)

Accident processing: we first load the accident database, then keep only the report types that correspond to verified, real accidents.

In [2]:
url = '/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1/shp_b/original_data/ptos_C5_n/incidentes-viales-c5.geojson'
pts = gpd.read_file(url)
# Keep only reports whose closing code marks a confirmed incident (codes A and I)
codigos_confirmados = [
    '(A) La unidad de atención a emergencias fue despachada, llegó al lugar de los hechos y confirmó la emergencia reportada',
    '(I) El incidente reportado es afirmativo y se añade información adicional al evento',
]
real_pts = pts[pts['codigo_cierre'].isin(codigos_confirmados)]


We save the full set of confirmed accidents, then split it by year and save each year separately.

In [3]:
real_pts.to_file('/Users//daniel.rodriguez/Documents/ACC/ACC_PROOF//ACC1//shp_b//processed_data//ptosC5//pts_real.shp')


In [3]:
pts14 = real_pts[real_pts['ano']=='2014']
pts15 = real_pts[real_pts['ano']=='2015']
pts16 = real_pts[real_pts['ano']=='2016']
pts17 = real_pts[real_pts['ano']=='2017']
pts18 = real_pts[real_pts['ano']=='2018']
pts19 = real_pts[real_pts['ano']=='2019']

In [5]:
pts14.to_file('/Users//daniel.rodriguez/Documents/ACC/ACC_PROOF//ACC1//shp_b//processed_data//ptosC5//pts14.shp')
pts15.to_file('/Users//daniel.rodriguez/Documents/ACC/ACC_PROOF//ACC1//shp_b//processed_data//ptosC5//pts15.shp')
pts16.to_file('/Users//daniel.rodriguez/Documents/ACC/ACC_PROOF//ACC1//shp_b//processed_data//ptosC5//pts16.shp')
pts17.to_file('/Users//daniel.rodriguez/Documents/ACC/ACC_PROOF//ACC1//shp_b//processed_data//ptosC5//pts17.shp')
pts18.to_file('/Users//daniel.rodriguez/Documents/ACC/ACC_PROOF//ACC1//shp_b//processed_data//ptosC5//pts18.shp')
pts19.to_file('/Users//daniel.rodriguez/Documents/ACC/ACC_PROOF//ACC1//shp_b//processed_data//ptosC5//pts19.shp')
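
The six per-year filters and exports above follow a single pattern, so they could be collapsed into a loop. The following is a minimal sketch, assuming the `ano` column stores the year as a string (as the comparisons above suggest). The toy `DataFrame`, its `folio` column, and the `split_by_year` helper are hypothetical stand-ins for illustration.

```python
import pandas as pd

def split_by_year(df, years, column='ano'):
    """Return {year: rows of df whose `column` equals that year}."""
    return {year: df[df[column] == year] for year in years}

# Toy stand-in for real_pts; 'folio' is a made-up column, only 'ano' matters here.
toy_pts = pd.DataFrame({
    'folio': ['a', 'b', 'c', 'd'],
    'ano': ['2014', '2015', '2014', '2019'],
})

years = [str(y) for y in range(2014, 2020)]  # '2014' ... '2019'
by_year = split_by_year(toy_pts, years)

for year, subset in by_year.items():
    # Mirrors the pts14 ... pts19 file naming used above.
    print(f'pts{year[-2:]}.shp', len(subset))
```

With the real GeoDataFrame, each `subset` could then be written with `to_file` inside the same loop.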

We load the shapefile of the CDMX alcaldías, which is used to clip all the other layers.

In [2]:
alcaldias = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//original_data//alcaldias_cdmx//alcaldias.shp')

In [3]:
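# The alcaldía names arrive mis-encoded (UTF-8 bytes decoded as Latin-1); map each one to an unaccented ASCII name
alcaldias.replace(['CuauhtÃ©moc', 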
                   'Ã\x81lvaro ObregÃ³n', 
                   'Xochimilco', 
                   'TlÃ¡huac',
                   'Benito JuÃ¡rez', 
                   'Cuajimalpa de Morelos', 
                   'Gustavo A. Madero',
                   'Tlalpan', 
                   'Venustiano Carranza', 
                   'Azcapotzalco', 
                   'Iztapalapa',
                   'Iztacalco', 
                   'Miguel Hidalgo', 
                   'La Magdalena Contreras',
                   'CoyoacÃ¡n', 
                   'Milpa Alta'],
                  
                  ['Cuauhtemoc',
                   'Alvaro Obregon',
                   'Xochimilco',
                   'Tlahuac',
                   'Benito Juarez',
                   'Cuajimalpa de Morelos',
                   'Gustavo A. Madero',
                   'Tlalpan',
                   'Venustiano Carranza',
                   'Azcapotzalco',
                   'Iztapalapa',
                   'Iztacalco',
                   'Miguel Hidalgo',
                   'La Magdalena Contreras',
                   'Coyoacan',
                   'Milpa Alta'], inplace = True)

alcaldias = alcaldias[['nomgeo','geometry']]
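
The garbled names in the replacement list above (for example `CuauhtÃ©moc`) match the pattern produced when UTF-8 bytes are decoded as Latin-1. As an alternative to listing every name by hand, the repair can be sketched with the standard library alone. The helper names `repair_mojibake`, `strip_accents`, and `clean_name` are hypothetical, and the sketch assumes the damage is exactly one round of UTF-8-read-as-Latin-1.

```python
import unicodedata

def repair_mojibake(text):
    """Undo UTF-8 text that was decoded as Latin-1; return the input unchanged if that fails."""
    try:
        return text.encode('latin-1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text

def strip_accents(text):
    """Drop combining marks, so an accented letter becomes its plain base letter."""
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

def clean_name(text):
    """Repair the encoding first, then remove accents."""
    return strip_accents(repair_mojibake(text))

for raw in ['CuauhtÃ©moc', 'Ã\x81lvaro ObregÃ³n', 'TlÃ¡huac', 'Milpa Alta']:
    print(repr(raw), '->', clean_name(raw))
```

Under these assumptions, the function could be applied to the name column with something like `alcaldias['nomgeo'].map(clean_name)`, which avoids maintaining two parallel lists.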


We process the points of interest, which serve as indirect variables.

In [8]:
pois = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1/shp_b/original_data/shp_OSM/gis_osm_pois_free_1.shp')
pois_alc = gpd.sjoin(pois, alcaldias, op = 'intersects')
pois_alc = pois_alc[['fclass','nomgeo','geometry']]
pois_alc.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//shp_OSM//pois.shp')

In [9]:
traffic = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1/shp_b//original_data//shp_OSM//gis_osm_traffic_free_1.shp')
traffic_alc = gpd.sjoin(traffic, alcaldias, op = 'intersects')
traffic_alc = traffic_alc[['fclass','nomgeo','geometry']]
traffic_alc.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//shp_OSM//traffic.shp')

In [4]:
roads = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//original_data//shp_OSM//gis_osm_roads_free_1.shp')
roads_alc = gpd.sjoin(roads, alcaldias, op = 'intersects')
roads_alc = roads_alc[(roads_alc.fclass != 'service') & (roads_alc.fclass != 'footway')]
roads_alc = roads_alc[['fclass','name','nomgeo','geometry']]
roads_alc = roads_alc.reset_index(drop=True)
roads_alc.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//shp_OSM//roads.shp')

Processing of the mass transit stations layer.

In [11]:
url_metro = '/Users//daniel.rodriguez//Documents//ACC//ACC_PROOF//ACC1//shp_b//original_data//SHP_transp//estaciones-metro.geojson'
metro = gpd.read_file(url_metro)

url_metrobus = '/Users//daniel.rodriguez//Documents//ACC//ACC_PROOF//ACC1//shp_b//original_data//SHP_transp//estaciones-metrobus.geojson'
metrobus = gpd.read_file(url_metrobus)

url_rtp = '/Users//daniel.rodriguez//Documents//ACC//ACC_PROOF//ACC1//shp_b//original_data//SHP_transp//paradas-de-rtp.geojson'
rtp = gpd.read_file(url_rtp)

url_trole = '/Users//daniel.rodriguez//Documents//ACC//ACC_PROOF//ACC1//shp_b//original_data//SHP_transp//paradas-de-trolebus.geojson'
trole = gpd.read_file(url_trole)

url_cetram = '/Users//daniel.rodriguez//Documents//ACC//ACC_PROOF//ACC1//shp_b//original_data//SHP_transp//ubicacion-de-centros-de-transferencia-modal-cetram.geojson'
cetram = gpd.read_file(url_cetram)

In [12]:
# Keep only name and geometry, unify the name column as 'stop_name', and tag each layer with its class.
# .copy() / rename() return new frames, which avoids pandas' SettingWithCopyWarning when adding 'fclass'.
metro1 = metro[['stop_name','geometry']].copy()
metro1['fclass'] = 'metro'

metrobus1 = metrobus[['nombre','geometry']].rename(columns={'nombre':'stop_name'})
metrobus1['fclass'] = 'metrobus'

rtp1 = rtp[['stop_name','geometry']].copy()
rtp1['fclass'] = 'rtp'

trole1 = trole[['stop_name','geometry']].copy()
trole1['fclass'] = 'trolebus'

cetram1 = cetram[['nombre','geometry']].rename(columns={'nombre':'stop_name'})
cetram1['fclass'] = 'cetram'

trans = pd.concat([metro1, metrobus1, rtp1, trole1, cetram1], axis = 0, ignore_index = True)
trans_alc = gpd.sjoin(trans, alcaldias, op = 'intersects')
trans_alc = trans_alc[['fclass','stop_name','nomgeo','geometry']]
trans_alc.to_file('/Users//daniel.rodriguez//Documents//ACC//ACC_PROOF//ACC1//shp_b//processed_data//shp_transp//estaciones_transp.shp')
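
Each station layer above goes through the same three steps: select the name and geometry columns, rename the name column to `stop_name`, and add an `fclass` label. A minimal sketch of that pattern as one function follows. `tag_layer` is a hypothetical helper, and the toy tables (station names, the `linea` column, and string placeholders in `geometry`) are illustrative assumptions, not the real datasets.

```python
import pandas as pd

def tag_layer(df, name_col, fclass):
    """Keep name + geometry, rename the name column to 'stop_name', add an 'fclass' label."""
    out = df[[name_col, 'geometry']].rename(columns={name_col: 'stop_name'})
    return out.assign(fclass=fclass)

# Toy stand-ins for two station layers; 'geometry' holds placeholder strings here.
metro_toy = pd.DataFrame({'stop_name': ['Zócalo'], 'geometry': ['POINT (0 0)'], 'linea': ['2']})
metrobus_toy = pd.DataFrame({'nombre': ['Insurgentes'], 'geometry': ['POINT (1 1)']})

trans_toy = pd.concat(
    [tag_layer(metro_toy, 'stop_name', 'metro'),
     tag_layer(metrobus_toy, 'nombre', 'metrobus')],
    axis=0, ignore_index=True,
)
print(trans_toy)
```

Because `rename` and `assign` both return new objects, the helper never modifies a slice of the original layer in place.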

We process the intersections of the CDMX road network.

In [13]:
roads_alc.total_bounds

array([-99.4078734,  18.9685746, -98.9248062,  19.5883329])

In [14]:
G = ox.graph_from_bbox(19.5883329, 18.9685746, -98.9248062, -99.4078734, network_type='drive') # N, S, E, W

G_proj = ox.project_graph(G)
nodes_proj = ox.graph_to_gdfs(G_proj, edges=False)
graph_area_m = nodes_proj.unary_union.convex_hull.area
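
`total_bounds` returns the extent as (minx, miny, maxx, maxy), that is (west, south, east, north), while the `graph_from_bbox` call above passes the values in north, south, east, west order, as its comment notes. A small sketch of that reordering, using the bounds printed above; `bounds_to_nsew` is a hypothetical helper, not part of OSMnx.

```python
def bounds_to_nsew(bounds):
    """Reorder (minx, miny, maxx, maxy) into (north, south, east, west)."""
    west, south, east, north = bounds
    return north, south, east, west

# Values copied from the roads_alc.total_bounds output above.
bounds = (-99.4078734, 18.9685746, -98.9248062, 19.5883329)
north, south, east, west = bounds_to_nsew(bounds)
print(north, south, east, west)
```

Deriving the four values from `total_bounds` rather than typing them by hand keeps the graph extent in sync if the road layer changes.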

In [15]:
intersections = ox.clean_intersections(G_proj, tolerance=12, dead_ends=False)
gdf = gpd.GeoDataFrame(geometry=intersections)
gdf.crs = G_proj.graph['crs']
intersect = ox.project_gdf(gdf, to_latlong=True)

In [16]:
intersect['fclass'] = 'interseccion'
intersect.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//shp_OSM//intersections.shp')

We preprocess the hexagon grids.

In [6]:
# First we intersect the hexagons with the Mexico City alcaldías and drop the duplicates
hex50 = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//original_data//hex//temp50.shp')
hex50_alc = gpd.sjoin(hex50, alcaldias, op = 'intersects')
hex50_alc = hex50_alc[['left','bottom','right','top','geometry']]
hex50_alc = hex50_alc.reset_index(drop=True)
hex50_alc1 = hex50_alc[['left','bottom','right','top']]
hex50_alc2 = hex50_alc1[hex50_alc1.duplicated()]
list_index50 = list(hex50_alc2.index)
hex50_alc_f = hex50_alc[~hex50_alc.index.isin(list_index50)]

# Now that only hexagons inside the alcaldías remain, we repeat the step with the road network
road_hex50 = gpd.sjoin(hex50_alc_f, roads_alc, op = 'intersects')
road_hex50 = road_hex50[['left','bottom','right','top','fclass','name','nomgeo','geometry']]
road_hex50 = road_hex50.reset_index(drop=True)
road_hex50_1 = road_hex50[['left','bottom','right','top']]
road_hex50_2 = road_hex50_1[road_hex50_1.duplicated()]
list_index50 = list(road_hex50_2.index)
road_hex50_f = road_hex50[~road_hex50.index.isin(list_index50)]
road_hex50_f = road_hex50_f[['left','bottom','right','top','geometry']]
road_hex50_f.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//hex_roads//road_hex50.shp')

In [5]:
# First we intersect the hexagons with the Mexico City alcaldías and drop the duplicates
hex100 = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//original_data//hex//temp100_p.shp')
hex100_alc = gpd.sjoin(hex100, alcaldias, op = 'intersects')
hex100_alc = hex100_alc[['left','bottom','right','top','geometry']]
hex100_alc = hex100_alc.reset_index(drop=True)
hex100_alc1 = hex100_alc[['left','bottom','right','top']]
hex100_alc2 = hex100_alc1[hex100_alc1.duplicated()]
list_index100 = list(hex100_alc2.index)
hex100_alc_f = hex100_alc[~hex100_alc.index.isin(list_index100)]

# Now that only hexagons inside the alcaldías remain, we repeat the step with the road network
road_hex100 = gpd.sjoin(hex100_alc_f, roads_alc, op = 'intersects')
road_hex100 = road_hex100[['left','bottom','right','top','fclass','name','nomgeo','geometry']]
road_hex100 = road_hex100.reset_index(drop=True)
road_hex100_1 = road_hex100[['left','bottom','right','top']]
road_hex100_2 = road_hex100_1[road_hex100_1.duplicated()]
list_index100 = list(road_hex100_2.index)
road_hex100_f = road_hex100[~road_hex100.index.isin(list_index100)]
road_hex100_f = road_hex100_f[['left','bottom','right','top','geometry']]
road_hex100_f.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//hex_roads//road_hex100.shp')

In [7]:
# First we intersect the hexagons with the Mexico City alcaldías and drop the duplicates
hex200 = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//original_data//hex//temp200.shp')
hex200_alc = gpd.sjoin(hex200, alcaldias, op = 'intersects')
hex200_alc = hex200_alc[['left','bottom','right','top','geometry']]
hex200_alc = hex200_alc.reset_index(drop=True)
hex200_alc1 = hex200_alc[['left','bottom','right','top']]
hex200_alc2 = hex200_alc1[hex200_alc1.duplicated()]
list_index200 = list(hex200_alc2.index)
hex200_alc_f = hex200_alc[~hex200_alc.index.isin(list_index200)]

# Now that only hexagons inside the alcaldías remain, we repeat the step with the road network
road_hex200 = gpd.sjoin(hex200_alc_f, roads_alc, op = 'intersects')
road_hex200 = road_hex200[['left','bottom','right','top','fclass','name','nomgeo','geometry']]
road_hex200 = road_hex200.reset_index(drop=True)
road_hex200_1 = road_hex200[['left','bottom','right','top']]
road_hex200_2 = road_hex200_1[road_hex200_1.duplicated()]
list_index200 = list(road_hex200_2.index)
road_hex200_f = road_hex200[~road_hex200.index.isin(list_index200)]
road_hex200_f = road_hex200_f[['left','bottom','right','top','geometry']]
road_hex200_f.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//hex_roads//road_hex200.shp')

In [6]:
# First we intersect the hexagons with the Mexico City alcaldías and drop the duplicates
hex300 = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//original_data//hex//temp300_p.shp')
hex300_alc = gpd.sjoin(hex300, alcaldias, op = 'intersects')
hex300_alc = hex300_alc[['left','bottom','right','top','geometry']]
hex300_alc = hex300_alc.reset_index(drop=True)
hex300_alc1 = hex300_alc[['left','bottom','right','top']]
hex300_alc2 = hex300_alc1[hex300_alc1.duplicated()]
list_index300 = list(hex300_alc2.index)
hex300_alc_f = hex300_alc[~hex300_alc.index.isin(list_index300)]

# Now that only hexagons inside the alcaldías remain, we repeat the step with the road network
road_hex300 = gpd.sjoin(hex300_alc_f, roads_alc, op = 'intersects')
road_hex300 = road_hex300[['left','bottom','right','top','fclass','name','nomgeo','geometry']]
road_hex300 = road_hex300.reset_index(drop=True)
road_hex300_1 = road_hex300[['left','bottom','right','top']]
road_hex300_2 = road_hex300_1[road_hex300_1.duplicated()]
list_index300 = list(road_hex300_2.index)
road_hex300_f = road_hex300[~road_hex300.index.isin(list_index300)]
road_hex300_f = road_hex300_f[['left','bottom','right','top','geometry']]
road_hex300_f.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//hex_roads//road_hex300.shp')

In [7]:
hex500 = gpd.read_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//original_data//hex//temp500_p.shp')
hex500_alc = gpd.sjoin(hex500, alcaldias, op = 'intersects')
hex500_alc = hex500_alc[['left','bottom','right','top','geometry']]
hex500_alc = hex500_alc.reset_index(drop=True)
hex500_alc1 = hex500_alc[['left','bottom','right','top']]
hex500_alc2 = hex500_alc1[hex500_alc1.duplicated()]
list_index500 = list(hex500_alc2.index)
hex500_alc_f = hex500_alc[~hex500_alc.index.isin(list_index500)]

road_hex500 = gpd.sjoin(hex500_alc_f, roads_alc, op = 'intersects')
road_hex500 = road_hex500[['left','bottom','right','top','fclass','name','nomgeo','geometry']]
road_hex500 = road_hex500.reset_index(drop=True)
road_hex500_1 = road_hex500[['left','bottom','right','top']]
road_hex500_2 = road_hex500_1[road_hex500_1.duplicated()]
list_index500 = list(road_hex500_2.index)
road_hex500_f = road_hex500[~road_hex500.index.isin(list_index500)]
road_hex500_f = road_hex500_f[['left','bottom','right','top','geometry']]
road_hex500_f.to_file('/Users/daniel.rodriguez/Documents/ACC/ACC_PROOF/ACC1//shp_b//processed_data//hex_roads//road_hex500.shp')
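
The five hexagon cells above repeat the same steps for each grid size. In each one, the duplicate removal keyed on `left`, `bottom`, `right`, `top` keeps the first row of every hexagon. The sketch below, on a toy table, suggests that pandas' `drop_duplicates(subset=...)` yields the same rows as the `duplicated()` plus index-filter pattern, provided the index is unique (which `reset_index(drop=True)` ensures). The toy values and the `BBOX_COLS` name are illustrative assumptions.

```python
import pandas as pd

BBOX_COLS = ['left', 'bottom', 'right', 'top']

# Toy result of a spatial join: the hexagon (0, 0, 1, 1) matched two roads, so it appears twice.
joined = pd.DataFrame({
    'left':   [0, 0, 1],
    'bottom': [0, 0, 0],
    'right':  [1, 1, 2],
    'top':    [1, 1, 1],
    'fclass': ['primary', 'residential', 'primary'],
}).reset_index(drop=True)

# Pattern used in the cells above: find repeated rows, then exclude their index labels.
bbox_only = joined[BBOX_COLS]
dup_index = list(bbox_only[bbox_only.duplicated()].index)
manual = joined[~joined.index.isin(dup_index)]

# Compact alternative: keep the first occurrence of each bounding box.
compact = joined.drop_duplicates(subset=BBOX_COLS)

print(manual.equals(compact))
```

Wrapping either variant in a function that takes the grid file name would let one loop handle the 50, 100, 200, 300, and 500 grids.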

We process the shapefile of the dangerous pedestrian crossing index.

In [20]:
cruces_index = gpd.read_file('/Users//daniel.rodriguez//Documents//ACC//ACC_PROOF//ACC1//shp_b//original_data//cruces_indice//cruceros503Point.shp')
cruces_index1 = cruces_index[['calificaci','geometry']]
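# Keep crossings with a score of 0.6 or lower, which are labeled as dangerous below; .copy() avoids SettingWithCopyWarning
cruces_index2 = cruces_index1[cruces_index1['calificaci']<=0.6].copy()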
cruces_index2['fclass'] = 'cruce_peligroso'
cruces_index2.to_file('/Users//daniel.rodriguez//Documents//ACC//ACC_PROOF//ACC1//shp_b//processed_data//cruces//cruces_pelis.shp')