# Introduction

This notebook explores trends in recent data on motor vehicle collisions. The cleaned data is joined (inner join) to the `Motor Vehicle Collisions - Crashes Joined to Neighborhood Data.csv` file, which contains the neighborhood information.

### Tasks
- First, visualizing data by neighborhood boundary
- Second, visualizing data using H3
- Third, visualizing data using hexagon boundaries within neighborhoods
- Using Plotly for visualization


# Data Source

The data used in this notebook was obtained from: 

- [NYC Open Data's Motor Vehicle Collision-Crashes](https://data.cityofnewyork.us/Public-Safety/Motor-Vehicle-Collisions-Crashes/h9gi-nx95)
  - This dataset contains information from all police-reported motor vehicle collisions in NYC. Each row represents a crash event. The police report (MV104-AN) is required to be filled out for collisions where someone is injured or killed, or where there is at least 1000 dollars worth of damage. This notebook uses a subset of the data, which was accessed with the [Socrata Open Data (SODA) API](https://dev.socrata.com/consumers/getting-started.html). 
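
As a hedged illustration of how a date-bounded subset could be requested through the SODA API, the sketch below builds a query URL with the SoQL `$where` and `$limit` parameters. The resource endpoint pattern (`/resource/h9gi-nx95.json`), the use of `crash_date` as the filter field, and the helper name `build_soda_url` are assumptions made for this example; the notebook's own request code is not shown.

```python
from urllib.parse import urlencode

# Assumed SODA resource endpoint for the dataset linked above.
BASE_URL = "https://data.cityofnewyork.us/resource/h9gi-nx95.json"


def build_soda_url(start_date, end_date, limit=50000):
    """Build a SODA query URL for crashes in the half-open window [start_date, end_date)."""
    where = (
        f"crash_date >= '{start_date}T00:00:00' "
        f"AND crash_date < '{end_date}T00:00:00'"
    )
    # urlencode percent-escapes the "$" prefix, spaces, and quotes in the SoQL clause.
    params = {"$where": where, "$limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"


url = build_soda_url("2021-04-01", "2022-04-01")
print(url)
```

The URL is only constructed here, not fetched. Passing it to `requests.get` would be one way to retrieve the rows, and larger pulls may need paging.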
 

# Loading Dependencies

In [None]:
import json
import requests
import pandas as pd
from sodapy import Socrata
import numpy as np
import urllib.request
import plotly.express as px
import plotly.io as pio
pio.renderers.default = 'browser'
#from urllib.request import urlopen
import geopandas as gdp
from geojson import Feature, Point, FeatureCollection, Polygon

Set the Mapbox access token (a set of permissions that allows the token to make certain types of requests to Mapbox APIs)

In [38]:
# strip() removes any trailing newline stored in the token file
px.set_mapbox_access_token(open(".mapbox_token").read().strip())

In [None]:
# Suggestion received: use edit distance + agglomerative clustering + spell checking to clean up misspelled categories.

# Neighborhoods Visualization

In [61]:
url = 'https://services5.arcgis.com/GfwWNkhOj9bNBqoJ/arcgis/rest/services/NYC_Neighborhood_Tabulation_Areas_2020/FeatureServer/0/query?where=1=1&outFields=*&outSR=4326&f=pgeojson'

In [62]:
hood_json = requests.get(url)
hood_json = hood_json.json()

In [3]:
#hood_json

# Importing the Data


In [4]:

df_crash_count = pd.read_csv('Data/Borough and Neighborhood Crash Counts (2012-07-01 through 2022-03-15).csv')
df_motor_vehicle = pd.read_csv('Data/Motor Vehicle Collisions - Crashes Joined to Neighborhood Data.csv')
df_original = pd.read_csv('Data/NYC-Open-Data-Motor-Vehicle-Collision-Crashes.csv')

In [5]:
# Lowercase the column names so they match the other data frame when merging
df_motor_vehicle.columns= df_motor_vehicle.columns.str.lower()

In [6]:
df_original.columns

Index(['collision_id', 'crash_date', 'crash_time', 'number_of_persons_killed',
       'number_of_persons_injured', 'latitude', 'longitude', 'year', 'month',
       'day_of_week', 'hour'],
      dtype='object')

In [7]:
df_motor_vehicle.columns

Index(['collision_id', 'crash date', 'crash time', 'latitude', 'longitude',
       'boroname', 'ntaname', 'cdtaname', 'geometry'],
      dtype='object')

In [8]:
df_motor_vehicle.shape

(1644913, 9)

In [9]:
df_original.shape

(1657292, 11)

### Joining datasets (Original + Crashes Joined to Neighborhood Data)


In [10]:

# Join on collision id (adds injured, killed, year and month columns)
#df_motor_vehicle.merge(df_original[['number_of_persons_killed','number_of_persons_injured','latitude','longitude']])  # df2 but only with columns x, a, and b
df_motor_vehicle = pd.merge(df_motor_vehicle,df_original[['collision_id','number_of_persons_killed',
       'number_of_persons_injured','year', 'month']],on='collision_id', how='inner')

In [11]:
df_motor_vehicle.shape

(1644603, 13)
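
The row count changes from 1,644,913 before the merge to 1,644,603 after it. With an inner join, one plausible reason is that some `collision_id` values have no match in the other frame. A minimal sketch, using small made-up frames rather than the notebook's data, of how `indicator=True` on a left join could be used to count such rows before they are discarded:

```python
import pandas as pd

# Toy stand-ins for df_motor_vehicle and df_original (values are made up).
left = pd.DataFrame({"collision_id": [1, 2, 3, 4], "ntaname": ["A", "B", "C", "D"]})
right = pd.DataFrame({"collision_id": [1, 2, 4], "number_of_persons_injured": [0, 2, 1]})

# A left join with indicator=True adds a "_merge" column saying where each row matched.
audit = left.merge(right, on="collision_id", how="left", indicator=True)

# Rows marked "left_only" are the ones an inner join would silently drop.
unmatched = audit[audit["_merge"] == "left_only"]
print(len(unmatched), "row(s) would be dropped by the inner join")
print(unmatched["collision_id"].tolist())
```

Duplicate ids on either side would change the count in the other direction, so checking `collision_id` for duplicates is a reasonable companion step.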

In [12]:
df_motor_vehicle.columns

Index(['collision_id', 'crash date', 'crash time', 'latitude', 'longitude',
       'boroname', 'ntaname', 'cdtaname', 'geometry',
       'number_of_persons_killed', 'number_of_persons_injured', 'year',
       'month'],
      dtype='object')

In [13]:
# Rename Neighborhood column for clarity of map labels
df_crash_count = df_crash_count.rename(columns = {'NTAName': 'Neighborhood'}) 
# df_motor_vehicle = df_motor_vehicle.rename(columns = {'ntaname': 'Neighborhood'}) 

#### Choropleth map of collisions per NYC neighborhood

In [18]:

collition_fig = px.choropleth_mapbox(
    df_crash_count,
    locations = "Neighborhood",
    geojson = hood_json,
    color = "Crashes",
    featureidkey="properties.NTAName",
    #color_continuous_scale=px.colors.continuous.Viridis[::-1],
    color_continuous_scale=px.colors.sequential.Inferno[::-1],
    #color_continuous_scale="viridis",
    #px.colors.sequential.Viridis,
    hover_name="Neighborhood",
    #hover_data= ["Count"],
    mapbox_style="carto-positron",
    center={"lat": 40.730610, "lon": -73.9749},
    zoom=8.5,
    opacity=0.5,
    title = "NYC Neighborhood",)
collition_fig.update_layout(
    title={
        'text': "Crashes per NYC Neighborhood",
        'y':0.9,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'})

In [37]:
#collition_fig.show()

#### Create choropleth map of number of persons injured per NYC neighborhood

**Filtering**
- Previous 12 months 




In [14]:
# Filter to the most recent 12 months of data (the injury filter is kept below, commented out)
prev_12_months = df_motor_vehicle.loc[(df_motor_vehicle['crash date'] >= '2021-04-01')
                     & (df_motor_vehicle['crash date'] < '2022-03-31')]
# prev_12_months = df_motor_vehicle.loc[(df_motor_vehicle['crash date'] >= '2021-04-01')
#                      & (df_motor_vehicle['crash date'] < '2022-03-31') & (df_motor_vehicle['number_of_persons_injured']>=1)]

In [16]:
prev_12_months.shape

(96580, 13)
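
The filter above compares `crash date` values as strings. That works for ISO-formatted dates, but a value such as `2022-03-31 00:00:00` sorts after the bare string `2022-03-31`, so the upper bound as written leaves out March 31. The sketch below, on a made-up frame, shows one way to express the same idea with parsed timestamps and a half-open 12-month window. The sample rows and the choice of window end are assumptions, not the notebook's data.

```python
import pandas as pd

# Toy frame mimicking the 'crash date' strings shown in the notebook output.
df = pd.DataFrame({
    "collision_id": [1, 2, 3, 4],
    "crash date": [
        "2021-03-31 00:00:00",
        "2021-04-01 00:00:00",
        "2022-03-31 00:00:00",
        "2022-04-01 00:00:00",
    ],
})

# Parse once so comparisons are made on timestamps, not on strings.
df["crash date"] = pd.to_datetime(df["crash date"])

# Half-open window [start, end): includes 2021-04-01, excludes 2022-04-01.
start = pd.Timestamp("2021-04-01")
end = start + pd.DateOffset(months=12)
window = df[(df["crash date"] >= start) & (df["crash date"] < end)]
print(window["collision_id"].tolist())
```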

In [15]:
df_gb = prev_12_months.groupby('ntaname')['number_of_persons_injured'].sum().reset_index()

In [16]:
df_gb.head()

Unnamed: 0,ntaname,number_of_persons_injured
0,Allerton,139
1,Alley Pond Park,132
2,Annadale-Huguenot-Prince's Bay-Woodrow,131
3,Arden Heights-Rossville,47
4,Astoria (Central),191


In [17]:
# View number of persons injured in descending order
df_gb.sort_values('number_of_persons_injured',ascending=False)

Unnamed: 0,ntaname,number_of_persons_injured
32,Canarsie,681
202,South Ozone Park,673
68,East New York-New Lots,623
155,Mott Haven-Port Morris,602
17,Bedford-Stuyvesant (West),544
...,...,...
150,Miller Field,0
151,Montefiore Cemetery,0
158,Mount Olivet & All Faiths Cemeteries,0
191,Rockaway Community Park,0


In [18]:
# Rename neighborhood name column for clarity on map labels
df_gb = df_gb.rename(columns = {'ntaname': 'Neighborhood'}) 

In [63]:
injured_fig = px.choropleth_mapbox(
    df_gb,
    locations = "Neighborhood",
    geojson = hood_json,
    color = "number_of_persons_injured",
    featureidkey="properties.NTAName",
    #color_continuous_scale=px.colors.continuous.Viridis[::-1],
    color_continuous_scale=px.colors.sequential.Inferno[::-1],
    #color_continuous_scale="viridis",
    #px.colors.sequential.Viridis,
    hover_name="Neighborhood",
    hover_data= ["number_of_persons_injured"],
    mapbox_style="carto-positron",
    center={"lat": 40.730610, "lon": -73.9749},
    zoom=8.5,
    opacity=0.5,
    title = "NYC Neighborhood",)
# fig.update_layout(
#     title={
#         'text': "location of fatalities",
#         'y':0.9,
#         'x':0.5,
#         'xanchor': 'center',
#         'yanchor': 'top'})

In [64]:
injured_fig.show()

# Visualizing Collisions with H3

In [19]:
import h3
from shapely.geometry import Polygon

##### Create data frame that includes collision_id, neighborhood name, lat, lon and persons_injured columns
- Lat and lon are used to create the H3 cells and to populate the hexagons with the corresponding data
- Collision id is used to count the total number of collisions per location (H3 cell)
- Number of persons injured is summed per location
- Neighborhood column allows the visualization to be filtered by neighborhood name



In [73]:
prev_12_months.columns

Index(['collision_id', 'crash date', 'crash time', 'latitude', 'longitude',
       'boroname', 'ntaname', 'cdtaname', 'geometry',
       'number_of_persons_killed', 'number_of_persons_injured', 'year',
       'month'],
      dtype='object')

In [20]:

df_h3 = (prev_12_months[['collision_id','ntaname','latitude','longitude','number_of_persons_injured']])

In [21]:
df_h3.head()

Unnamed: 0,collision_id,ntaname,latitude,longitude,number_of_persons_injured
0,4407147,Park Slope,40.68358,-73.97617,1
1,4407702,Park Slope,40.669067,-73.9878,0
2,4407489,Park Slope,40.673008,-73.97851,0
3,4407699,Park Slope,40.681335,-73.97667,1
4,4407890,Park Slope,40.682842,-73.9766,1


- Function to map car collision points to H3 cells.
- H3 resolution 9 (hexagons of roughly 0.1 square km each)

In [22]:
H3_res = 9 # H3 Resolution (.1km^2)
def geo_to_h3(row):
  return h3.geo_to_h3(lat=row.latitude,lng=row.longitude,resolution = H3_res)

In [23]:
df_h3['h3_cell'] = df_h3.apply(geo_to_h3,axis=1)



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
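
The message above is the text of pandas' `SettingWithCopyWarning`. It appears because `df_h3` was created by selecting columns from `prev_12_months`, so pandas cannot tell whether the new `h3_cell` column is meant for an independent frame or for the original. A common remedy, sketched here on toy data, is to take an explicit `.copy()` when making the subset. The frame and the placeholder cell labels are made up.

```python
import pandas as pd

parent = pd.DataFrame({
    "collision_id": [10, 11, 12],
    "latitude": [40.68, 40.67, 40.72],
    "longitude": [-73.97, -73.98, -73.98],
    "year": [2021, 2021, 2021],
})

# .copy() makes the subset an independent frame, so adding a column is unambiguous.
subset = parent[["collision_id", "latitude", "longitude"]].copy()

# Placeholder labels standing in for the H3 cell ids computed in the notebook.
subset["h3_cell"] = ["cell_a", "cell_b", "cell_a"]

print(subset.columns.tolist())
print("h3_cell" in parent.columns)  # the parent frame is left untouched
```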



In [24]:
df_h3.shape

(96580, 6)

Visualization that groups by H3 cell and aggregates collision id (count) and number of persons injured (sum)

In [27]:

#df_h3_gb = (df_h3.groupby('h3_cell').number_of_persons_injured.sum()).reset_index()
## Grouping by h3 (counting ids, adding number of persons injured)
df_h3_gb = df_h3.groupby('h3_cell').agg(
        # Get count of id per cell
        number_of_collisions = ('collision_id', "count"),
        number_of_persons_injured = ('number_of_persons_injured', "sum")).reset_index()


In [25]:
df_h3_gb3 = df_h3.groupby(['h3_cell','ntaname']).agg(
        # Get count of id per cell
        number_of_collisions = ('collision_id', "count"),
        number_of_persons_injured = ('number_of_persons_injured', "sum")).reset_index()

In [27]:
df_h3_gb3.shape

(7749, 4)

In [53]:
df_h3_gb3.dtypes

h3_cell                      object
ntaname                      object
number_of_collisions          int64
number_of_persons_injured     int64
geometry                     object
dtype: object

In [57]:
df_h3_gb3.loc[df_h3_gb3.h3_cell.str.contains('892a100dea7ffff')] #892a100dea7ffff
#df_neiborhood = df_h3_gb3.loc[df_h3_gb3.ntaname.str.contains('astoria', case = False)]


Unnamed: 0,h3_cell,ntaname,number_of_collisions,number_of_persons_injured,geometry
3847,892a100dea7ffff,Bedford-Stuyvesant (East),29,17,POLYGON ((-73.94325245463516 40.70260990419921...
3848,892a100dea7ffff,Bedford-Stuyvesant (West),24,11,POLYGON ((-73.94325245463516 40.70260990419921...
3849,892a100dea7ffff,Bushwick (West),2,2,POLYGON ((-73.94325245463516 40.70260990419921...
3850,892a100dea7ffff,East Williamsburg,6,2,POLYGON ((-73.94325245463516 40.70260990419921...
3851,892a100dea7ffff,South Williamsburg,39,19,POLYGON ((-73.94325245463516 40.70260990419921...


In [51]:
df_h3_gb3.h3_cell.value_counts()

892a100dea7ffff    5
892a100d907ffff    5
892a1077473ffff    4
892a100de67ffff    4
892a1001e8bffff    4
                  ..
892a100d1bbffff    1
892a100d1b3ffff    1
892a100d1abffff    1
892a100d19bffff    1
892a10776d7ffff    1
Name: h3_cell, Length: 5964, dtype: int64

Needed for plotting
- Use shapely (import Polygon)
- Function to generate hexagon geometry for each hexagon


In [29]:
# Function to generate hexagon geometry for each hexagon
def add_geometry(row):
  points = h3.h3_to_geo_boundary(row['h3_cell'], True)
  return Polygon(points)
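
In the h3 v3 API used here, the second argument to `h3.h3_to_geo_boundary` asks for GeoJSON ordering: vertices come back as (longitude, latitude) pairs with the first vertex repeated at the end, which is the form a polygon ring needs for mapping. The sketch below performs that conversion by hand on made-up vertices to show what the flag is doing. The helper name `to_geojson_ring` is hypothetical and is not part of h3 or shapely.

```python
def to_geojson_ring(latlng_pairs):
    """Convert (lat, lng) vertices into a closed GeoJSON ring of [lng, lat] positions."""
    ring = [[lng, lat] for lat, lng in latlng_pairs]
    # GeoJSON linear rings must end on the same position they start with.
    if ring and ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return ring


# Made-up vertices in (lat, lng) order, roughly around Lower Manhattan.
vertices = [(40.720, -73.985), (40.718, -73.983), (40.716, -73.985), (40.718, -73.987)]
ring = to_geojson_ring(vertices)
print(ring[0], ring[-1], len(ring))
```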

In [32]:
df_h3_gb3['geometry'] = (df_h3_gb3.apply(add_geometry,axis=1))

In [33]:
df_h3_gb3.head()

Unnamed: 0,h3_cell,ntaname,number_of_collisions,number_of_persons_injured,geometry
0,892a1000047ffff,Pelham Bay-Country Club-City Island,1,0,POLYGON ((-73.79056100760786 40.85996287572233...
1,892a1000063ffff,Pelham Bay-Country Club-City Island,6,1,POLYGON ((-73.79037556521729 40.85467378813006...
2,892a1000067ffff,Pelham Bay-Country Club-City Island,8,4,POLYGON ((-73.78814034418616 40.85208351682743...
3,892a1000073ffff,Pelham Bay-Country Club-City Island,1,0,POLYGON ((-73.78832552397652 40.85737243243801...
4,892a100007bffff,Pelham Bay-Country Club-City Island,4,3,POLYGON ((-73.79261103722521 40.85726410455732...


##### For the Choropleth map
- Data frame + Geojson-dictionary
    - Create a GeoJSON-formatted dictionary using Dataframe
    - Need a location column to assign color to map


In [34]:
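def hexagons_dataframe_to_geojson(df_hex, hex_id_field, geometry_field, value_field, file_output=None):
    """Build a GeoJSON FeatureCollection with one feature per row of df_hex.

    If file_output is given, the collection is written to that path and
    nothing is returned; otherwise the collection is returned.
    """
    list_features = []

    for _, row in df_hex.iterrows():
        # The feature id is what plotly matches against the `locations` column
        feature = Feature(geometry=row[geometry_field],
                          id=row[hex_id_field],
                          properties={"value": row[value_field]})
        list_features.append(feature)

    feat_collection = FeatureCollection(list_features)

    if file_output is not None:
        with open(file_output, "w") as f:
            json.dump(feat_collection, f)
    else:
        return feat_collection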

In [35]:
# GeoJson object
# Only one needs to be created at picked resolution
geojson_obj = (hexagons_dataframe_to_geojson
                       (df_h3_gb3,
                        hex_id_field='h3_cell',
                        value_field='number_of_persons_injured',
                        geometry_field='geometry'))

In [41]:
df_h3_gb3.columns

Index(['h3_cell', 'ntaname', 'number_of_collisions',
       'number_of_persons_injured', 'geometry'],
      dtype='object')

In [43]:
df_h3_gb3.ntaname.unique()

array(['Pelham Bay-Country Club-City Island', 'Co-op City',
       'Pelham Bay Park', 'Eastchester-Edenwald-Baychester',
       'Fort Totten', 'Douglaston-Little Neck', 'Whitestone-Beechhurst',
       'Throgs Neck-Schuylerville', 'Bay Terrace-Clearview', 'Bayside',
       'Auburndale', 'Williamsbridge-Olinville', 'Pelham Gardens',
       'Allerton', 'Bronx Park', 'Norwood', 'Morris Park',
       'Hutchinson Metro Center', 'Westchester Square', 'Bedford Park',
       'Belmont', 'Kingsbridge Heights-Van Cortlandt Village',
       'University Heights (North)-Fordham', 'Van Cortlandt Park',
       'Pelham Parkway-Van Nest', 'West Farms', 'Tremont',
       'Wakefield-Woodlawn', 'Woodlawn Cemetery', 'Castle Hill-Unionport',
       'Ferry Point Park-St. Raymond Cemetery', 'Soundview-Clason Point',
       'College Point', 'Soundview Park',
       'Soundview-Bruckner-Bronx River', 'Hunts Point',
       'Crotona Park East', 'Parkchester',
       'Glen Oaks-Floral Park-New Hyde Park', 'Bellerose'

In [74]:
neighborhood = 'Rosedale'
df_neighborhood = df_h3_gb3[df_h3_gb3['ntaname'] == neighborhood]

In [65]:
#df_neiborhood = df_h3_gb3.loc[df_h3_gb3.ntaname.str.contains('canarsie', case = False)]
#df.loc[df.words.str.contains('word',case=False)]


In [75]:
# Figure colored by number of persons injured
hex_fig = (px.choropleth_mapbox(
                    df_neighborhood, # Passing selected neighborhood
                    #df_h3_gb3, # Passing dataframe covering all neighborhoods
                    geojson=geojson_obj, 
                    locations='h3_cell',
                    #featureidkey = "features.NTAName",
                    color='number_of_persons_injured',
                    #color_continuous_scale=px.colors.continuous.Viridis[::-1],
                    #color_continuous_scale=px.colors.sequential.Inferno[::-1],
                    color_continuous_scale=px.colors.sequential.Inferno[::-1],
                    #color_continuous_scale="viridis",
                    mapbox_style='carto-positron',
                    zoom=7,
                    center = {"lat": 40.730610, "lon": -73.9749},
                    opacity=0.7,
                    title = "Number of Persons Injured at Location",
                    hover_name="ntaname",
                    #hover_data= ["number_of_persons_injured"], {'h3_cell':False}
                    hover_data={'h3_cell':False,
                               'number_of_persons_injured': True,
                               'number_of_collisions': True,
                               'ntaname': False}
))


In [76]:
hex_fig.show()

### Visualization that groups by H3 cell and aggregates collision id (count) and number of persons injured (sum)


Open question: can we request only the previous 12 months of data (or request 2021 and 2022 and filter to the past 12 months on our end)?

**Filtering**
- Previous 12 months 

In [68]:
prev_12_months.head()

Unnamed: 0,collision_id,crash date,crash time,latitude,longitude,boroname,ntaname,cdtaname,geometry,number_of_persons_killed,number_of_persons_injured,year,month
0,4407147,2021-04-13 00:00:00,1900-01-01 21:35:00,40.68358,-73.97617,Brooklyn,Park Slope,BK06 Park Slope-Carroll Gardens (CD 6 Approxim...,POINT (-73.97617 40.68358),0,1,2021,4
1,4407702,2021-04-16 00:00:00,1900-01-01 06:40:00,40.669067,-73.9878,Brooklyn,Park Slope,BK06 Park Slope-Carroll Gardens (CD 6 Approxim...,POINT (-73.9878 40.669067),0,0,2021,4
2,4407489,2021-04-14 00:00:00,1900-01-01 12:37:00,40.673008,-73.97851,Brooklyn,Park Slope,BK06 Park Slope-Carroll Gardens (CD 6 Approxim...,POINT (-73.97851 40.673008),0,0,2021,4
3,4407699,2021-04-15 00:00:00,1900-01-01 16:20:00,40.681335,-73.97667,Brooklyn,Park Slope,BK06 Park Slope-Carroll Gardens (CD 6 Approxim...,POINT (-73.97667 40.681335),0,1,2021,4
4,4407890,2021-04-15 00:00:00,1900-01-01 20:50:00,40.682842,-73.9766,Brooklyn,Park Slope,BK06 Park Slope-Carroll Gardens (CD 6 Approxim...,POINT (-73.9766 40.682842),0,1,2021,4


**Filtering one neighborhood for h3 Visualization (Lower East Side in example)**

In [107]:
df_motor_vehicle.ntaname.unique()

array(['Park Slope', 'Downtown Brooklyn-DUMBO-Boerum Hill',
       'Throgs Neck-Schuylerville', 'Brooklyn Heights', 'Flatlands',
       "Annadale-Huguenot-Prince's Bay-Woodrow",
       'Pelham Parkway-Van Nest', 'Cambria Heights', 'Morris Park',
       'Bushwick (West)', 'Queens Village', 'Hunts Point',
       'Arden Heights-Rossville', 'East New York-New Lots',
       'Eastchester-Edenwald-Baychester', 'Canarsie',
       'Upper West Side (Central)', 'SoHo-Little Italy-Hudson Square',
       'Woodside', 'Flatbush', 'Bedford-Stuyvesant (East)',
       'East Williamsburg', 'Harlem (South)', 'Bay Ridge',
       'Sunset Park (West)', 'East Flatbush-Remsen Village', 'Melrose',
       'Lincoln Terrace Park', 'Parkchester', 'Williamsburg', 'Rego Park',
       'South Williamsburg', 'West Farms',
       'Springfield Gardens (South)-Brookville', 'Bayside', 'Corona',
       'Fresh Meadows-Utopia', 'Flushing-Willets Point',
       'Springfield Gardens (North)-Rochdale Village',
       'East Flatbu

In [71]:
df_one = prev_12_months.loc[(prev_12_months['ntaname']=='Lower East Side')]

In [72]:
df_one.head()

Unnamed: 0,collision_id,crash date,crash time,latitude,longitude,boroname,ntaname,cdtaname,geometry,number_of_persons_killed,number_of_persons_injured,year,month
787300,4407653,2021-04-15 00:00:00,1900-01-01 09:28:00,40.71649,-73.98484,Manhattan,Lower East Side,MN03 Lower East Side-Chinatown (CD 3 Equivalent),POINT (-73.98484 40.71649),0,0,2021,4
787301,4407400,2021-04-14 00:00:00,1900-01-01 02:30:00,40.718826,-73.98424,Manhattan,Lower East Side,MN03 Lower East Side-Chinatown (CD 3 Equivalent),POINT (-73.98424 40.718826),0,0,2021,4
787302,4407868,2021-04-16 00:00:00,1900-01-01 08:40:00,40.721985,-73.98552,Manhattan,Lower East Side,MN03 Lower East Side-Chinatown (CD 3 Equivalent),POINT (-73.98552 40.721985),0,0,2021,4
787303,4408030,2021-04-10 00:00:00,1900-01-01 11:15:00,40.71975,-73.992165,Manhattan,Lower East Side,MN03 Lower East Side-Chinatown (CD 3 Equivalent),POINT (-73.992165 40.71975),0,1,2021,4
787304,4409360,2021-04-19 00:00:00,1900-01-01 10:04:00,40.72098,-73.98696,Manhattan,Lower East Side,MN03 Lower East Side-Chinatown (CD 3 Equivalent),POINT (-73.98696 40.72098),0,0,2021,4


In [32]:
# Pick relevant columns
df_h3 = (df_one[['collision_id','latitude','longitude','number_of_persons_injured']])

In [33]:
df_h3['h3_cell'] = df_h3.apply(geo_to_h3,axis=1)



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



In [84]:
df_h3.head()

Unnamed: 0,collision_id,latitude,longitude,number_of_persons_injured,h3_cell
787300,4407653,40.71649,-73.98484,0,892a1072dd3ffff
787301,4407400,40.718826,-73.98424,0,892a100d36bffff
787302,4407868,40.721985,-73.98552,0,892a1072cb7ffff
787303,4408030,40.71975,-73.992165,1,892a1072ca3ffff
787304,4409360,40.72098,-73.98696,0,892a1072cb7ffff


In [34]:
# Grouping by h3 (counting ids, adding number of persons injured)
df_h3_gb2 = df_h3.groupby('h3_cell').agg(
        # Get count of id per cell
        number_of_collisions = ('collision_id', "count"),
        number_of_persons_injured = ('number_of_persons_injured', "sum")).reset_index()

In [113]:
df_h3_gb2.head()

Unnamed: 0,h3_cell,number_of_collisions,number_of_persons_injured
0,892a100ca43ffff,30,9
1,892a100ca47ffff,20,11
2,892a100ca4bffff,7,3
3,892a100ca4fffff,43,24
4,892a100ca53ffff,13,2


In [35]:
df_h3_gb2['geometry'] = (df_h3_gb2.apply(add_geometry,axis=1))

In [88]:
df_h3_gb2.head()

Unnamed: 0,h3_cell,number_of_collisions,number_of_persons_injured,geometry
0,892a100d32bffff,7,4,POLYGON ((-73.97376971671147 40.71764607231747...
1,892a100d363ffff,5,3,POLYGON ((-73.98028229967294 40.72011317399058...
2,892a100d367ffff,13,11,POLYGON ((-73.97804268787144 40.71753084085164...
3,892a100d36bffff,52,19,"POLYGON ((-73.98455550101015 40.7199977026327,..."
4,892a100d36fffff,38,27,POLYGON ((-73.98231565232341 40.71741545161098...


In [36]:
df_h3_gb2['injuries_per_collision'] = round((df_h3_gb2['number_of_persons_injured']/df_h3_gb2['number_of_collisions']),2)

In [76]:
#df_h3_gb2.sort_values('number_of_collisions',ascending=False)

##### Map colored by number of collisions


In [37]:
# Visualization of the filtered neighborhood
# Choropleth map colored by number of collisions per H3 cell
hex_fig = (px.choropleth_mapbox(
                    df_h3_gb2, 
                    geojson=geojson_obj, 
                    locations='h3_cell',
                    #featureidkey = "features.NTAName",
                    color='number_of_collisions',
                    #color_continuous_scale=px.colors.continuous.Viridis[::-1],
                    #color_continuous_scale=px.colors.sequential.Inferno[::-1],
                    color_continuous_scale=px.colors.sequential.Viridis[::-1],
                    #color_continuous_scale="viridis",
                    mapbox_style='carto-positron',
                    zoom=11,
                    center = {"lat": 40.730610, "lon": -73.9749},
                    opacity=0.7,
                    title = "Map Colored by Number of Collisions in the Lower East Side",
                    hover_name="number_of_collisions",
                    #hover_data= ["number_of_persons_injured"], {'h3_cell':False}
                    hover_data={'h3_cell':False,
                                'number_of_collisions':True,
                               'number_of_persons_injured': True,
                               'injuries_per_collision': True}
))

In [38]:
hex_fig.show()

In [143]:
df_h3_gb2.geometry.loc[:1]

0    POLYGON ((-73.86704796411296 40.66499839215409...
1    POLYGON ((-73.87374115313263 40.67274635914418...
Name: geometry, dtype: object


##### Map colored by number of persons injured

In [136]:
# Figure colored by number of persons injured per H3 cell
hex_fig2 = (px.choropleth_mapbox(
                    df_h3_gb2, 
                    geojson=geojson_obj, 
                    locations='h3_cell',
                    color='number_of_persons_injured',
                    # px.colors has no `continuous` attribute; the Viridis scale lives under `sequential`
                    color_continuous_scale=px.colors.sequential.Viridis[::-1],
                    mapbox_style='carto-positron',
                    zoom=11,
                    center = {"lat": 40.730610, "lon": -73.9749},
                    opacity=0.7,
                    title = "Map Colored by Number of Persons Injured at Location",
                    hover_name="number_of_persons_injured",
                    hover_data={'h3_cell':False,
                                'number_of_collisions':True,
                               'number_of_persons_injured': True,
                               'injuries_per_collision': True}
))

In [94]:
hex_fig2.show()

##### Visualize rate of injuries per collision (check locations with over 50 collisions)

In [96]:
over_50_collisions = df_h3_gb2.loc[(df_h3_gb2['number_of_collisions']>50)]

In [98]:
hex_fig_100 = (px.choropleth_mapbox(
                    over_50_collisions, 
                    geojson=geojson_obj, 
                    locations='h3_cell',
                    color='injuries_per_collision',
                    # px.colors has no `continuous` attribute; the Viridis scale lives under `sequential`
                    color_continuous_scale=px.colors.sequential.Viridis[::-1],
                    mapbox_style='carto-positron',
                    zoom=7,
                    center = {"lat": 40.730610, "lon": -73.9749},
                    opacity=0.7,
                    title = "Locations with over 50 Collisions",
                    hover_name="injuries_per_collision",
                    hover_data={'h3_cell':False,
                                'number_of_collisions':True,
                               'number_of_persons_injured': True,
                               'injuries_per_collision': True}
))

In [99]:
hex_fig_100.show()