# shst_00_processing_sharedstreets_geometry

This notebook processes the SharedStreets geometry GeoJSON file created by the 'Extract' command of the SharedStreets API (see 'how_to_use_sharedstreet_api.pdf').

The notebook has three chapters:
- 0. Filter by NYC Boundary
- 1. Extract node points
- 2. Drop duplicated segments & Calculate length of each segment (unit: feet)


## 0. Filter by NYC Boundary

This chapter filters the SharedStreets geometry (segments) by the NYC boundary, using the 'sjoin' function in GeoPandas.

In [1]:
import geopandas as gpd
import pandas as pd
import numpy as np
from shapely.geometry import Point
from fiona.crs import from_epsg

In [2]:
# import nyc boundary shapefile
gdf_nyc = gpd.read_file('../data/borough_boundaries_w_water/geo_export_b872bcc2-4115-4a61-9581-5cdb4e9449e6.shp')

In [3]:
# make sure that you have run the SharedStreets extract command with the bbox.geojson file in 'folder'
# import the SharedStreets geometry
gdf_shst_segment = gpd.read_file('../data/sharedstreets_geometry/bbox.out.geojson')

In [4]:
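# filter the sharedstreets segments: keep those that intersect any borough polygon,
# then drop the join index and the borough attributes
# (geopandas >= 0.10 renames the `op` argument to `predicate`)
gdf_shst_segment_filtered = gpd.sjoin(gdf_shst_segment, gdf_nyc, op='intersects').drop(
    ['index_right', 'boro_code', 'boro_name', 'shape_area', 'shape_leng'], axis=1)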
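
A spatial join returns one row per matching pair, so a segment that intersects two borough polygons is listed twice in the result; this is why duplicates are dropped by `id` in chapter 2. The following is a minimal sketch on made-up data (two unit squares standing in for boroughs and three hypothetical segments). The `sjoin_kwargs` lookup is an assumption added only so that the snippet runs on geopandas releases that use either `op=` or `predicate=`.

```python
import inspect

import geopandas as gpd
from shapely.geometry import LineString, Polygon

# Two adjacent squares standing in for borough polygons (hypothetical data).
toy_boroughs = gpd.GeoDataFrame(
    {"boro_name": ["West", "East"]},
    geometry=[
        Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
        Polygon([(1, 0), (2, 0), (2, 1), (1, 1)]),
    ],
    crs="EPSG:4326",
)

# Three hypothetical segments: inside one square, crossing both, and outside.
toy_segments = gpd.GeoDataFrame(
    {"id": ["inside", "crossing", "outside"]},
    geometry=[
        LineString([(0.2, 0.5), (0.8, 0.5)]),
        LineString([(0.5, 0.2), (1.5, 0.2)]),
        LineString([(5.0, 5.0), (6.0, 5.0)]),
    ],
    crs="EPSG:4326",
)

# Newer geopandas names the argument `predicate`; older releases call it `op`.
arg = "predicate" if "predicate" in inspect.signature(gpd.sjoin).parameters else "op"
sjoin_kwargs = {arg: "intersects"}

joined = gpd.sjoin(toy_segments, toy_boroughs, how="inner", **sjoin_kwargs)
joined = joined.drop(columns=["index_right", "boro_name"])

# "crossing" matched both squares, so it is listed twice until de-duplicated.
match_counts = joined["id"].value_counts().to_dict()
unique_segments = joined.drop_duplicates(subset="id")
```

The segment outside both squares is discarded by the inner join, and `drop_duplicates(subset="id")` brings the crossing segment back to a single row.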

In [5]:
gdf_shst_segment_filtered.head(3)

| | id | fromIntersectionId | toIntersectionId | forwardReferenceId | backReferenceId | roadClass | geometry |
|---|---|---|---|---|---|---|---|
| 0 | db6792075ebbddc84479fda26174ca30 | 374b01a56e64379b8d7198962eaede90 | 2922a5babc5f921116a9fed4131a5bb1 | 48b7ab8e4cbafb2c1893cd682ded6704 | a8475c8bd67f9e0ec8ce6a404aae41c1 | Residential | LINESTRING (-73.91694 40.64668, -73.91625 40.6... |
| 1 | 42ccdc2b9ebc38f98c22bb0035045628 | 37db438d57f16f92e5ba91f1ad1793bb | 374b01a56e64379b8d7198962eaede90 | febaf06db79d8a16588d1c387a62fdb2 | 9db38906c3d8ae5df463e297be4e2b9b | Residential | LINESTRING (-73.91765 40.64623, -73.91732 40.6... |
| 2 | 84afb6627019b793945a7aab1feefe77 | 374b01a56e64379b8d7198962eaede90 | 5b6e4972c82ad4eb6d24c17b94b33b59 | 3f53ec240fc39c6b6810243b5b6fc830 | fbbb71d35b794421e030d3ec9e1dcede | Residential | LINESTRING (-73.91694 40.64668, -73.91662 40.6... |


## 1. Extract node points

This chapter extracts the positions of 'from_node' and 'to_node' from the line segments: the first and the last point of each segment.
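
As a minimal sketch of the idea, assuming two hypothetical segments with made-up IDs and coordinates: the first vertex of a `LineString` is `coords[0]` and the last is `coords[-1]`. A plain pandas `DataFrame` is used here instead of the notebook's GeoDataFrame to keep the example small.

```python
import pandas as pd
from shapely.geometry import LineString, Point

# Two hypothetical segments that share the intersection "n2".
segments = pd.DataFrame(
    {
        "fromIntersectionId": ["n1", "n2"],
        "toIntersectionId": ["n2", "n3"],
        "geometry": [
            LineString([(-73.9170, 40.6460), (-73.9165, 40.6463), (-73.9160, 40.6467)]),
            LineString([(-73.9160, 40.6467), (-73.9155, 40.6470)]),
        ],
    }
)

# Work on an explicit copy so that adding columns does not trigger
# pandas' SettingWithCopyWarning.
nodes = segments[["fromIntersectionId", "toIntersectionId", "geometry"]].copy()

# coords[0] is the first vertex (from node), coords[-1] the last (to node).
nodes["fromIntersectionGeometry"] = nodes["geometry"].apply(lambda line: Point(line.coords[0]))
nodes["toIntersectionGeometry"] = nodes["geometry"].apply(lambda line: Point(line.coords[-1]))

first_from = nodes.loc[0, "fromIntersectionGeometry"]
first_to = nodes.loc[0, "toIntersectionGeometry"]
second_from = nodes.loc[1, "fromIntersectionGeometry"]
```

Because the first segment ends where the second begins, `first_to` and `second_from` describe the same intersection, which is the reason the node tables need de-duplication later.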

#### 1.1 from_node

In [6]:
# extract 'from node id' and its geometry (line segment)
# .copy() makes an independent dataframe, so the assignment below does not raise SettingWithCopyWarning
gdf_shst_from_node = gdf_shst_segment_filtered[['fromIntersectionId','geometry']].copy()

In [7]:
# from the line segment, extract the first point
gdf_shst_from_node['fromIntersectionGeometry'] = gdf_shst_from_node['geometry'].apply(lambda x: Point(x.coords[0]))


In [8]:
# drop the geometry (segment)
gdf_shst_from_node = gdf_shst_from_node.drop('geometry', axis=1)

In [9]:
# drop duplicates
gdf_shst_from_node = gdf_shst_from_node.drop_duplicates()

#### 1.2 to_node 

In [10]:
# extract 'to node id' and its geometry (line segment)
# .copy() makes an independent dataframe, so the assignment below does not raise SettingWithCopyWarning
gdf_shst_to_node = gdf_shst_segment_filtered[['toIntersectionId','geometry']].copy()

In [11]:
# from the line segment, extract the last point
gdf_shst_to_node['toIntersectionGeometry'] = gdf_shst_to_node['geometry'].apply(lambda x: Point(x.coords[-1]))


In [12]:
# drop the geometry (segment)
gdf_shst_to_node = gdf_shst_to_node.drop('geometry', axis=1)

In [13]:
# drop duplicates
gdf_shst_to_node = gdf_shst_to_node.drop_duplicates()

#### Merge dataset

Here, we merge the from-node and to-node datasets, drop duplicates, and export the result as the shst_node dataset.
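
The merge can be sketched on toy data as follows. The node IDs and points are hypothetical, and unlike the cells in this notebook, which compare whole rows, the sketch de-duplicates on `node_id` alone. That rests on the assumption that one ID always maps to one position, which the WKT check verifies.

```python
import pandas as pd
from shapely.geometry import Point

# Hypothetical from/to node tables, already renamed to a common schema.
from_nodes = pd.DataFrame(
    {"node_id": ["n1", "n2"], "geometry": [Point(0, 0), Point(1, 1)]}
)
to_nodes = pd.DataFrame(
    {"node_id": ["n2", "n3"], "geometry": [Point(1, 1), Point(2, 2)]}
)

# Stack both tables; "n2" now appears twice.
all_nodes = pd.concat([from_nodes, to_nodes], axis=0, ignore_index=True)

# Assumption: a node_id always maps to the same point. Count the distinct
# WKT strings per ID to confirm it before de-duplicating on the ID alone.
wkt_per_node = all_nodes.assign(wkt=all_nodes["geometry"].apply(lambda p: p.wkt))
max_positions_per_id = int(wkt_per_node.groupby("node_id")["wkt"].nunique().max())

unique_nodes = all_nodes.drop_duplicates(subset="node_id").reset_index(drop=True)
node_ids = unique_nodes["node_id"].tolist()
```

If `max_positions_per_id` were greater than 1, some ID would carry conflicting coordinates and de-duplicating by ID would silently discard one of them.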

In [14]:
# change the column names
gdf_shst_from_node = gdf_shst_from_node.rename(columns={
    'fromIntersectionId': 'node_id',
    'fromIntersectionGeometry': 'geometry',
})

gdf_shst_to_node = gdf_shst_to_node.rename(columns={
    'toIntersectionId': 'node_id',
    'toIntersectionGeometry': 'geometry',
})

In [16]:
# merge (concatenate) the datasets
gdf_shst_node = pd.concat([gdf_shst_from_node,gdf_shst_to_node], axis=0, ignore_index=True)

In [17]:
# drop duplicates
gdf_shst_node = gdf_shst_node.drop_duplicates()

In [18]:
# set the dataset type as geodataframe
gdf_shst_node = gpd.GeoDataFrame(gdf_shst_node, geometry='geometry')

In [19]:
# reset the index (drop=True discards the old index instead of keeping it as a column)
gdf_shst_node = gdf_shst_node.reset_index(drop=True)

In [20]:
gdf_shst_node.head()

| | node_id | geometry |
|---|---|---|
| 0 | 374b01a56e64379b8d7198962eaede90 | POINT (-73.91694 40.64668) |
| 1 | 37db438d57f16f92e5ba91f1ad1793bb | POINT (-73.91765 40.64623) |
| 2 | 5b6e4972c82ad4eb6d24c17b94b33b59 | POINT (-73.91621 40.64715) |
| 3 | c8dd8ecf9b57214609ecead610eef9cb | POINT (-73.91715 40.64387) |
| 4 | a19ad445993732ebd6b49a61801b9547 | POINT (-73.91665 40.64342) |


Export the node point geometries as a shapefile and as GeoJSON.

In [21]:
gdf_shst_node.to_file('../data/sharedstreets_geometry/node/shst_node.shp')

In [22]:
gdf_shst_node.to_file('../data/sharedstreets_geometry/node/shst_node.geojson', driver='GeoJSON')

## 2. Drop duplicated segments & Calculate length of each segment (unit: feet)

In [23]:
gdf_shst_segment_filtered.head(3)

| | id | fromIntersectionId | toIntersectionId | forwardReferenceId | backReferenceId | roadClass | geometry |
|---|---|---|---|---|---|---|---|
| 0 | db6792075ebbddc84479fda26174ca30 | 374b01a56e64379b8d7198962eaede90 | 2922a5babc5f921116a9fed4131a5bb1 | 48b7ab8e4cbafb2c1893cd682ded6704 | a8475c8bd67f9e0ec8ce6a404aae41c1 | Residential | LINESTRING (-73.91694 40.64668, -73.91625 40.6... |
| 1 | 42ccdc2b9ebc38f98c22bb0035045628 | 37db438d57f16f92e5ba91f1ad1793bb | 374b01a56e64379b8d7198962eaede90 | febaf06db79d8a16588d1c387a62fdb2 | 9db38906c3d8ae5df463e297be4e2b9b | Residential | LINESTRING (-73.91765 40.64623, -73.91732 40.6... |
| 2 | 84afb6627019b793945a7aab1feefe77 | 374b01a56e64379b8d7198962eaede90 | 5b6e4972c82ad4eb6d24c17b94b33b59 | 3f53ec240fc39c6b6810243b5b6fc830 | fbbb71d35b794421e030d3ec9e1dcede | Residential | LINESTRING (-73.91694 40.64668, -73.91662 40.6... |


In [25]:
gdf_shst_segment_filtered = gdf_shst_segment_filtered.drop_duplicates(subset='id')

In [27]:
# to calculate length in feet, we have to project the data into a coordinate system with linear units
gdf_shst_segment_filtered_2263 = gdf_shst_segment_filtered.copy()

# change the coordinate system from EPSG:4326 (degrees) to EPSG:2263 (US survey feet)
# set_crs (geopandas >= 0.8) only declares the CRS of the existing coordinates; to_crs does the reprojection
gdf_shst_segment_filtered_2263 = gdf_shst_segment_filtered_2263.set_crs(epsg=4326, allow_override=True)
gdf_shst_segment_filtered_2263 = gdf_shst_segment_filtered_2263.to_crs(epsg=2263)


In [28]:
# calculate length
gdf_shst_segment_filtered_2263['length'] = gdf_shst_segment_filtered_2263['geometry'].length
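
To illustrate why the reprojection matters, the sketch below measures one hypothetical east-west segment (made-up coordinates in Midtown) in both coordinate systems and compares the projected result with a rough haversine estimate. The helper `haversine_feet` and the spherical Earth radius are assumptions introduced for this check only; they are not part of the notebook's workflow.

```python
import math

import geopandas as gpd
from shapely.geometry import LineString

# A hypothetical east-west segment: 0.01 degrees of longitude at 40.75 N.
lon_a, lon_b, lat = -73.98, -73.97, 40.75
toy = gpd.GeoDataFrame(
    {"id": ["demo"]},
    geometry=[LineString([(lon_a, lat), (lon_b, lat)])],
    crs="EPSG:4326",
)

# In EPSG:4326 the "length" is in degrees and not physically meaningful.
length_degrees = toy.geometry.length.iloc[0]

# After reprojection to EPSG:2263 the length is in US survey feet.
length_feet = toy.to_crs(epsg=2263).geometry.length.iloc[0]


def haversine_feet(lon1, lat1, lon2, lat2, radius_m=6371008.8):
    """Great-circle distance on a sphere, converted to US survey feet."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    meters = 2 * radius_m * math.asin(math.sqrt(a))
    return meters * 3937 / 1200  # metres to US survey feet


# Independent sanity check of the projected length.
estimate_feet = haversine_feet(lon_a, lat, lon_b, lat)
relative_gap = abs(length_feet - estimate_feet) / estimate_feet
```

The degree-based value is about 0.01, while the projected length is a few thousand feet and should agree with the spherical estimate to within a fraction of a percent.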

In [29]:
# add length into the gdf_shst_segment_filtered dataframe
gdf_shst_segment_filtered = gdf_shst_segment_filtered.merge(gdf_shst_segment_filtered_2263[['id','length']], on='id')
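
This merge relies on `id` being unique on both sides, which the earlier `drop_duplicates(subset='id')` ensures. As an optional safeguard, pandas can check that property during the merge. The sketch below uses hypothetical tables and values and shows the `validate` argument of `merge`.

```python
import pandas as pd

# Hypothetical segment table and a length table keyed by the same `id`.
segments = pd.DataFrame({"id": ["a", "b", "c"], "roadClass": ["Residential"] * 3})
lengths = pd.DataFrame({"id": ["a", "b", "c"], "length": [264.0, 801.5, 120.25]})

# validate="one_to_one" makes pandas raise MergeError if `id` is duplicated
# on either side, instead of silently multiplying rows.
merged = segments.merge(lengths, on="id", validate="one_to_one")

# With a duplicated key the same call fails loudly.
lengths_with_dup = pd.concat([lengths, lengths.iloc[[0]]], ignore_index=True)
try:
    segments.merge(lengths_with_dup, on="id", validate="one_to_one")
    duplicate_detected = False
except pd.errors.MergeError:
    duplicate_detected = True
```

Without the check, a duplicated `id` in either table would produce extra rows in the merged output rather than an error.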

In [30]:
# export dataset
gdf_shst_segment_filtered.to_file('../data/sharedstreets_geometry/segment/shst_segment.shp')
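
One caveat of the Shapefile format is that attribute field names are limited to 10 characters, so longer column names are shortened on export while the GeoJSON export keeps them intact. The sketch below previews the likely result with simple prefix truncation. This is an approximation for illustration; the exact renaming is decided by the writing library.

```python
# Column names of the segment table, as used in this notebook.
columns = [
    "id", "fromIntersectionId", "toIntersectionId",
    "forwardReferenceId", "backReferenceId", "roadClass", "length",
]

# Approximation of the Shapefile limit: keep only the first 10 characters.
DBF_FIELD_NAME_LIMIT = 10
truncated = {name: name[:DBF_FIELD_NAME_LIMIT] for name in columns}

# If two names collapsed to the same prefix, the writer would have to rename
# one of them, so it is worth checking for collisions before exporting.
has_collision = len(set(truncated.values())) < len(truncated)
```

For these columns the shortened names stay distinct, but code that later reads `shst_segment.shp` should expect names such as `fromInters` rather than `fromIntersectionId`.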

In [31]:
# export dataset
gdf_shst_segment_filtered.to_file('../data/sharedstreets_geometry/segment/shst_segment.geojson', driver='GeoJSON')

In [32]:
# export dataset (a simplified dataset that contains only the id and geometry columns)
gdf_shst_segment_filtered[['id','geometry']].to_file('../data/sharedstreets_geometry/segment/shst_segment_simplified.geojson', driver='GeoJSON')