# Calculate Distances of Roadway Blocks

The DC datasets Roadway Blocks and Roadway SubBlocks contain useful information about our streets, but they don't include the length of each block: the shapefiles have a `SHAPELEN` of zero for every object.

Fortunately, each block's geometry is in the shapefile, so we can calculate the distance ourselves by summing the length of every segment in its LineString.

Source data: https://opendata.dc.gov/pages/roadway-centerlines

Author: Devin Brady, https://github.com/devinbrady/

In [1]:
import shapely
import pandas as pd
import geopandas as gpd
from geopy.distance import distance

In [2]:
def calculate_linestring_distance(row):
    """
    For one row in a GeoPandas GeoDataFrame, calculate the distance in both miles and meters
    of the LineString geography and save the distances to columns in the GeoDataFrame.
    
    Use this function in an .apply() statement, row-wise, like so:
    rb_shp = rb_shp.apply(calculate_linestring_distance, axis=1)
    
    The row's `geometry` field should be a LineString. If it is None or any other
    geometry type (there is one MultiLineString in the DC street data), the distances
    are set to zero.
    """
    
    geom = row['geometry']
    miles_in_block = 0
    meters_in_block = 0
    
    if geom and isinstance(geom, shapely.geometry.linestring.LineString):
        
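        # Iterate through every point in the LineString, calculating the distance to the previous point.
        # Shapely stores each coordinate as (x, y), i.e. (longitude, latitude), while geopy
        # expects (latitude, longitude), so the two values are swapped below.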
        for idx, pnt in enumerate(geom.coords):
            if idx > 0:
                segment_distance = distance(
                    (geom.coords[idx-1][1], geom.coords[idx-1][0])
                    , (geom.coords[idx][1], geom.coords[idx][0])
                )
                miles_in_block += segment_distance.miles
                meters_in_block += segment_distance.meters

    row['distance_miles'] = miles_in_block
    row['distance_meters'] = meters_in_block
    
    return row
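
The segment-by-segment summation above can be illustrated without GeoPandas or geopy. The following is a minimal sketch, not the method used in this notebook: it swaps geopy's `distance` for the haversine formula on a sphere with the mean Earth radius, so its results will differ slightly from the values in this notebook. The function names and the example coordinates are made up for illustration.

```python
import math

# Mean Earth radius in meters. Assumption: a spherical Earth, which is only
# an approximation of the geodesic distance that geopy computes.
EARTH_RADIUS_M = 6_371_008.8


def haversine_meters(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def polyline_length_meters(coords):
    """Sum the segment lengths of a list of (lon, lat) pairs (Shapely's order)."""
    total = 0.0
    # Pair every point with the one that follows it
    for (lon1, lat1), (lon2, lat2) in zip(coords, coords[1:]):
        total += haversine_meters(lat1, lon1, lat2, lon2)
    return total


# Hypothetical three-point block in DC (made-up coordinates, not from the dataset)
example_coords = [(-77.0300, 38.8980), (-77.0290, 38.8980), (-77.0280, 38.8980)]
length_m = polyline_length_meters(example_coords)
print(f"{length_m:.1f} meters")
```

A list with fewer than two points produces no segments, so its length is zero, which mirrors how the function above leaves the distances at zero when there is nothing to measure.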

In [3]:
rb = pd.read_csv('Roadway_Block.csv', low_memory=False)

In [4]:
rb_shp = gpd.read_file('Roadway_Block/Roadway_Block.shp')

In [5]:
# Use a sample of 100 rows to demonstrate this function. In production, leave out this line
rb_shp = rb_shp.sample(100)

In [6]:
rb_shp = rb_shp.apply(calculate_linestring_distance, axis=1)

In [7]:
rb_distance = pd.merge(rb, rb_shp[['OBJECTID', 'distance_miles', 'distance_meters']], how='inner', on='OBJECTID')

In [8]:
# Save CSV with distances appended
# rb_distance.to_csv('Roadway_Block_Distance.csv', index=False)

In [9]:
rb_distance[['OBJECTID', 'ROUTENAME', 'FROMSTREET', 'TOSTREET', 'distance_miles', 'distance_meters']].head()

| | OBJECTID | ROUTENAME | FROMSTREET | TOSTREET | distance_miles | distance_meters |
|---|---|---|---|---|---|---|
| 0 | 27558 | M ST SW | HOWISON PL SW | CANAL ST SW | 0.025701 | 41.36218 |
| 1 | 27670 | 51ST ST SE | BASS PL SE | C ST SE | 0.047512 | 76.463356 |
| 2 | 27975 | NORTH CAPITOL ST BN | FLORIDA AVE NW/FLORIDA AVE NE | LINCOLN RD NE/Q ST NE | 0.022115 | 35.590786 |
| 3 | 28061 | 44TH ST NW | CHESAPEAKE ST NW | DAVENPORT ST NW | 0.073563 | 118.387975 |
| 4 | 28246 | G ST NW | 12TH ST NW | 13TH ST NW | 0.083114 | 133.759385 |
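
As a quick sanity check on the output, the two distance columns should agree with each other. This is a small sketch, assuming the international mile of 1609.344 meters; the `meters_to_miles` helper is hypothetical, and the values are copied from the rows shown above.

```python
METERS_PER_MILE = 1609.344  # international mile


def meters_to_miles(meters):
    """Convert a distance in meters to miles."""
    return meters / METERS_PER_MILE


# (distance_miles, distance_meters) pairs copied from the displayed rows
displayed_rows = [
    (0.025701, 41.36218),
    (0.047512, 76.463356),
    (0.022115, 35.590786),
    (0.073563, 118.387975),
    (0.083114, 133.759385),
]

# Largest disagreement between the displayed miles and the converted meters
max_difference = max(abs(miles - meters_to_miles(meters)) for miles, meters in displayed_rows)

# The displayed miles are rounded to six decimals, so allow for rounding error
columns_agree = max_difference < 1e-6
print(columns_agree)
```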
