# Network analysis in Senegal

### Objectives
    1)	Use measures of road-based accessibility to identify road segments that, if rehabilitated, would improve agricultural market activities in Senegal, including during flood conditions.
    2)	Gain a better understanding of the accessibility, connectivity, and criticality of roads in Senegal in relation to agricultural origins, processing & transfer sites, and markets.

To this end, the team will develop an accessibility model which measures the travel time from sites of agricultural production to their nearest populated areas, processing centers, and markets. 

### Datasets for analysis
#### ORIGIN
    1) agriculture: MapSPAM 2017. Value measured in international dollars.
    2) agriculture: UMD Land Cover 2019 30m. MapSPAM values are assigned to the land cover cropland class for more precise origin locations.
    3) population: WorldPop 2020, UN-adjusted.
    4) settlement extent: GRID3 2020.
#### DESTINATION
    5) markets: derived from WorldPop 2020 and GRID3 2020 urban clusters.
    6) agricultural processing hubs: to be acquired.
#### TRAVEL ROUTE
    7) roads: OpenStreetMap, July 2021.
    8) elevation: 
#### OBSTACLE
    9) flood: FATHOM. 1-in-10, 20, and 50 year flood return periods. 
#### INTERVENTION
    10) upcoming road projects: AGEROUTE interventions separate from the World Bank-financed project
    11) targeted road projects: critical road segments identified by this accessibility model's baseline outputs


### Model design
#### Basic formula: 
    (a) Off-road driving time from origin to closest road node
    +
    (b) Driving time from road node in (a) to a destination (closeness measured by road segment speeds)
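
As a worked illustration of the basic formula, the sketch below adds an off-road leg to a network leg. The function name and the off-road driving speed (20 km/h) are assumptions for illustration only; the source does not state the off-road speed used.

```python
def total_travel_time_min(offroad_dist_m, network_time_s, offroad_speed_kmh=20.0):
    """Return (a) + (b) in minutes.

    offroad_dist_m: distance from the origin to its closest road node, in meters.
    network_time_s: driving time along the road network, in seconds.
    offroad_speed_kmh: hypothetical off-road driving speed (not from the source).
    """
    offroad_speed_ms = offroad_speed_kmh * 1000 / 3600
    offroad_time_s = offroad_dist_m / offroad_speed_ms
    return (offroad_time_s + network_time_s) / 60


# 500 m off-road (90 s at 20 km/h) plus 1,800 s on the network.
print(round(total_travel_time_min(500, 1800), 1))
```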

#### Model origin & destination (OD) sets:
    A)	Travel time from an area that has agricultural value/potential to the nearest processing hub (if provided).
    B)	Travel time from an area that has agricultural value/potential to the nearest larger settlement (“larger” settlements to be identified using a case-appropriate population metric, to be determined).
    C)	Travel time from an area that has agricultural value/potential to the nearest market.
    D)	Travel time from all settlements to the nearest market.
    E)	Travel time from larger settlements to the nearest market.

#### Before/after scenarios for each OD set:
    1)	Pre-project, baseline weather: No inclement weather. Road network status as of November 2021.
    2)	Pre-project, flood: 1-in-10, 1-in-20 and 1-in-50 year flood return period. Road network status as of November 2021.
    3)	Post-project, baseline weather: No inclement weather. Road network status if X number of critical road segments to high-value areas are protected (i.e., their travel times reduced).
    4)	Post-project, flood: 1-in-10 year flood return period. Road network status if X number of critical road segments to high-value areas are protected (i.e., their travel times reduced).

#### Notes:
    --Destinations are expected to be proximal to the road network, so no measure is taken between road and destination.
    --All travel times will be assigned to each model variation’s point of origin; aggregation up to admin areas is possible if desired.
    --Obstacles & interventions modify the road segment speeds. The basic formula is then applied to the modified road network.
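
The last note can be illustrated on a toy network: a flood penalty multiplies one segment's travel time, and the shortest-path search is rerun on the modified weights. This is a minimal sketch using a plain-Python Dijkstra search, not the GOSTnets routine used in the model; the node names, times, and the 5x multiplier are made up.

```python
import heapq


def shortest_time(graph, source, target):
    """Dijkstra shortest travel time; graph maps node -> {neighbor: seconds}."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        t, node = heapq.heappop(queue)
        if node == target:
            return t
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, seconds in graph.get(node, {}).items():
            cand = t + seconds
            if cand < best.get(nbr, float("inf")):
                best[nbr] = cand
                heapq.heappush(queue, (cand, nbr))
    return float("inf")  # no path found


# Toy network: farm -> market via A (fast) or via B (slow detour).
baseline = {
    "farm": {"A": 60.0, "B": 120.0},
    "A": {"market": 60.0},
    "B": {"market": 120.0},
}

# Flood scenario: segment A -> market is slowed by a hypothetical 5x penalty.
flooded = {node: dict(edges) for node, edges in baseline.items()}
flooded["A"]["market"] *= 5

print(shortest_time(baseline, "farm", "market"))  # 120.0
print(shortest_time(flooded, "farm", "market"))   # 240.0
```

In the flooded case the route switches to the detour via B, which is the behavior the before/after scenarios are meant to capture.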


### Prep workspace

In [1]:
import os, sys
GISFolder = os.getcwd()
GISFolder

'C:\\Users\\wb527163\\GEO-Cdrive-Grace'

In [2]:
# Note: needed to reinstall rtree due to geopandas import error. Did so in the console. 
# conda install -c conda-forge rtree=0.9.3

In [2]:
# load and filter osm network (step 1)
import geopandas as gpd
from geopandas import GeoDataFrame
import pandas as pd
import time
sys.path.append(r"C:\Users\wb527163\.conda\envs\geo\GOSTnets-master")
import GOSTnets as gn

In [3]:
import networkx as nx
import osmnx as ox
import numpy as np
import rasterio as rt
import shapely
from shapely.geometry import LineString, MultiLineString, Point, box
from shapely.ops import unary_union
from shapely.wkt import loads
from shapely import wkt
import peartree

In [4]:
#### Might not use these
import fiona
from osgeo import gdal
import importlib
import matplotlib.pyplot as plt
import subprocess, glob

In [5]:
pth = os.path.join(GISFolder, "SEN-Cdrive") # Personal folder system for running model.
pth

'C:\\Users\\wb527163\\GEO-Cdrive-Grace\\SEN-Cdrive'

In [6]:
out_pth = os.path.join(pth, "outputs") # For storing intermediate outputs from the model.
out_pth

'C:\\Users\\wb527163\\GEO-Cdrive-Grace\\SEN-Cdrive\\outputs'

In [7]:
team_pth = 'R:\\SEN\\GEO' # This is where the unmodified input data is stored. Finalized outputs also housed here.
team_pth

'R:\\SEN\\GEO'

### Prepare and clean the data

#### Return periods from FATHOM: 1-in-10, 20, and 50 year floods.

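Return periods are processed one at a time because joining more than one of them to the road dataframe caused file size errors. The work started with the 1-in-10 year return period; the other return periods are run in replicated scripts. The cells in this section load the 1-in-50 year layer (PFU_1in50).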

In [42]:
flood50 = gpd.read_file("C:/Users/wb527163/GEO-Cdrive-Grace/SEN-Cdrive/scratch.gdb", layer="PFU_1in50")
flood50.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 2459356 entries, 0 to 2459355
Data columns (total 4 columns):
 #   Column        Dtype   
---  ------        -----   
 0   PFU_1in50     float64 
 1   Shape_Length  float64 
 2   Shape_Area    float64 
 3   geometry      geometry
dtypes: float64(3), geometry(1)
memory usage: 75.1 MB


In [20]:
gTime = nx.read_gpickle("SEN-Cdrive/gTime.pickle")
gTime_edge = gn.edge_gdf_from_graph(gTime)
gTime_edge

Unnamed: 0,stnode,endnode,osmid,mode,highway,time,ref,length,access,oneway,bridge,lanes,junction,tunnel,width,maxspeed,name,area,geometry
0,358284990,5217543379,59618174,drive,unclassified,2.385144,D 523,33.127,,False,,,,,,,D 523,,"LINESTRING (-12.32347 12.38119, -12.32368 12.3..."
1,358284990,1888282175,178482063,drive,tertiary,0.769920,,12.832,,False,,,,,,,,,"LINESTRING (-12.32347 12.38119, -12.32351 12.3..."
2,358284990,5329792467,178482063,drive,tertiary,2.926860,,48.781,,False,,,,,,,,,"LINESTRING (-12.32347 12.38119, -12.32317 12.3..."
3,358284993,1888282575,178470940,drive,tertiary,7.594620,,126.577,,False,,,,,,,,,"LINESTRING (-12.28135 12.41380, -12.28205 12.4..."
4,358284993,1888198886,178470940,drive,tertiary,7.234680,,120.578,,False,,,,,,,,,"LINESTRING (-12.28135 12.41380, -12.28061 12.4..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4012560,9246539941,9246539942,366052716,drive,residential,6.906780,,76.742,,False,,,,,,,,,"LINESTRING (-17.48521 14.72445, -17.48451 14.7..."
4012561,9246539942,9246539941,366052716,drive,residential,6.906780,,76.742,,False,,,,,,,,,"LINESTRING (-17.48451 14.72456, -17.48521 14.7..."
4012562,9246539942,3700438702,366052716,drive,residential,0.076140,,0.846,,False,,,,,,,,,"LINESTRING (-17.48451 14.72456, -17.48451 14.7..."
4012563,9276108905,6048975958,177950649,drive,secondary,8.840674,,171.902,,True,,2,,,,,Route de l'Aeroport,,"LINESTRING (-17.50499 14.74971, -17.50357 14.7..."


In [11]:
gTime_node = os.path.join(pth, "gTime_nodes.csv")
gTime_node = pd.read_csv(gTime_node)
gTime_node.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1829568 entries, 0 to 1829567
Data columns (total 7 columns):
 #   Column      Dtype  
---  ------      -----  
 0   Unnamed: 0  int64  
 1   node_ID     int64  
 2   y           float64
 3   highway     object 
 4   x           float64
 5   ref         object 
 6   geometry    object 
dtypes: float64(2), int64(2), object(3)
memory usage: 97.7+ MB


### Update driving times based on flood intersection.

#### Join road network and vectorized flood layer into a single table.

In [26]:
gTime_edge.reset_index(inplace=True) # To create unique ID and to avoid:  ValueError: cannot reindex from a duplicate axis.
gTime_edge.rename(columns={'index': 'ID_graph'}, inplace=True)
gTime_edge.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 4012565 entries, 0 to 4012564
Data columns (total 20 columns):
 #   Column    Dtype   
---  ------    -----   
 0   ID_graph  int64   
 1   stnode    int64   
 2   endnode   int64   
 3   osmid     int64   
 4   mode      object  
 5   highway   object  
 6   time      float64 
 7   ref       object  
 8   length    float64 
 9   access    object  
 10  oneway    bool    
 11  bridge    object  
 12  lanes     object  
 13  junction  object  
 14  tunnel    object  
 15  width     object  
 16  maxspeed  object  
 17  name      object  
 18  area      object  
 19  geometry  geometry
dtypes: bool(1), float64(2), geometry(1), int64(4), object(12)
memory usage: 585.5+ MB


In [43]:
# Spatial join should be on projected GDFs.
gTime_edge = gTime_edge.to_crs("EPSG:31028")
flood50 = flood50.to_crs("EPSG:31028")
gTime_edge.crs == flood50.crs

True

In [44]:
join50 = gpd.sjoin_nearest(gTime_edge, flood50, how="left", max_distance=3) 
join50.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
Int64Index: 4195759 entries, 0 to 4012564
Data columns (total 24 columns):
 #   Column        Dtype   
---  ------        -----   
 0   ID_graph      int64   
 1   stnode        int64   
 2   endnode       int64   
 3   osmid         int64   
 4   mode          object  
 5   highway       object  
 6   time          float64 
 7   ref           object  
 8   length        float64 
 9   access        object  
 10  oneway        bool    
 11  bridge        object  
 12  lanes         object  
 13  junction      object  
 14  tunnel        object  
 15  width         object  
 16  maxspeed      object  
 17  name          object  
 18  area          object  
 19  geometry      geometry
 20  index_right   float64 
 21  PFU_1in50     float64 
 22  Shape_Length  float64 
 23  Shape_Area    float64 
dtypes: bool(1), float64(6), geometry(1), int64(4), object(12)
memory usage: 772.3+ MB


In [45]:
join50

Unnamed: 0,ID_graph,stnode,endnode,osmid,mode,highway,time,ref,length,access,...,tunnel,width,maxspeed,name,area,geometry,index_right,PFU_1in50,Shape_Length,Shape_Area
0,0,358284990,5217543379,59618174,drive,unclassified,2.385144,D 523,33.127,,...,,,,D 523,,"LINESTRING (790871.607 1370083.257, 790848.290...",,,,
1,1,358284990,1888282175,178482063,drive,tertiary,0.769920,,12.832,,...,,,,,,"LINESTRING (790871.607 1370083.257, 790867.287...",,,,
2,2,358284990,5329792467,178482063,drive,tertiary,2.926860,,48.781,,...,,,,,,"LINESTRING (790871.607 1370083.257, 790904.086...",,,,
3,3,358284993,1888282575,178470940,drive,tertiary,7.594620,,126.577,,...,,,,,,"LINESTRING (795418.079 1373739.218, 795343.307...",,,,
4,4,358284993,1888198886,178470940,drive,tertiary,7.234680,,120.578,,...,,,,,,"LINESTRING (795418.079 1373739.218, 795497.547...",,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4012560,4012560,9246539941,9246539942,366052716,drive,residential,6.906780,,76.742,,...,,,,,,"LINESTRING (232231.532 1629246.092, 232307.524...",,,,
4012561,4012561,9246539942,9246539941,366052716,drive,residential,6.906780,,76.742,,...,,,,,,"LINESTRING (232307.524 1629257.675, 232231.532...",,,,
4012562,4012562,9246539942,3700438702,366052716,drive,residential,0.076140,,0.846,,...,,,,,,"LINESTRING (232307.524 1629257.675, 232307.393...",,,,
4012563,4012563,9276108905,6048975958,177950649,drive,secondary,8.840674,,171.902,,...,,,,Route de l'Aeroport,,"LINESTRING (230131.706 1632066.089, 230285.874...",,,,


In [46]:
# How many road segment rows matched a flood polygon?
# Note: join50 has more rows than gTime_edge (4,195,759 vs. 4,012,565), so some edges appear more than once.
pc_flooded = join50["PFU_1in50"].count() / len(join50) * 100

print("No flood crossing at segment:", join50["PFU_1in50"].isnull().sum(), "rows", end="\n")
print("Flood crossing at segment:", join50["PFU_1in50"].count(), "rows", end="\n")
print("\nPercent flooded:", pc_flooded, "percent", "out of", len(join50), "possible rows")

No flood crossing at segment: 3748839 rows
Flood crossing at segment: 446920 rows

Percent flooded: 10.651708069982094 percent out of 4195759 possible rows
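
The join returns 4,195,759 rows for 4,012,565 edges, which suggests that some edges matched more than one flood polygon (geopandas documents that `sjoin_nearest` returns one row per equidistant nearest neighbor). The sketch below shows, in plain Python rather than pandas, one way to collapse such duplicates before computing the flooded share: keep the deepest flood value per `ID_graph`. The sample rows are made up, and keeping the maximum depth is an assumption, not the notebook's method.

```python
def deepest_per_edge(rows):
    """Collapse (edge_id, depth) rows to one depth per edge.

    depth is None where no flood polygon was matched; otherwise the
    maximum matched depth is kept (an assumed, conservative choice).
    """
    deepest = {}
    for edge_id, depth in rows:
        current = deepest.get(edge_id)
        if depth is None:
            deepest.setdefault(edge_id, None)
        elif current is None or depth > current:
            deepest[edge_id] = depth
    return deepest


# Made-up join output: edge 2 matched two flood polygons.
rows = [(0, None), (1, 12.0), (2, 35.0), (2, 80.0), (3, None)]
per_edge = deepest_per_edge(rows)
flooded = sum(depth is not None for depth in per_edge.values())

print(per_edge)
print(100 * flooded / len(per_edge))  # share of unique edges with a flood match
```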


In [47]:
join50 = join50[['ID_graph', 'stnode', 'endnode', 'time', 'length', 'highway', 'osmid', 'geometry', 'PFU_1in50']]
join50.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
Int64Index: 4195759 entries, 0 to 4012564
Data columns (total 9 columns):
 #   Column     Dtype   
---  ------     -----   
 0   ID_graph   int64   
 1   stnode     int64   
 2   endnode    int64   
 3   time       float64 
 4   length     float64 
 5   highway    object  
 6   osmid      int64   
 7   geometry   geometry
 8   PFU_1in50  float64 
dtypes: float64(3), geometry(1), int64(4), object(1)
memory usage: 320.1+ MB


In [48]:
join50 = join50.to_crs("EPSG:4326")
join50.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [49]:
join50

Unnamed: 0,ID_graph,stnode,endnode,time,length,highway,osmid,geometry,PFU_1in50
0,0,358284990,5217543379,2.385144,33.127,unclassified,59618174,"LINESTRING (-12.32347 12.38119, -12.32368 12.3...",
1,1,358284990,1888282175,0.769920,12.832,tertiary,178482063,"LINESTRING (-12.32347 12.38119, -12.32351 12.3...",
2,2,358284990,5329792467,2.926860,48.781,tertiary,178482063,"LINESTRING (-12.32347 12.38119, -12.32317 12.3...",
3,3,358284993,1888282575,7.594620,126.577,tertiary,178470940,"LINESTRING (-12.28135 12.41380, -12.28205 12.4...",
4,4,358284993,1888198886,7.234680,120.578,tertiary,178470940,"LINESTRING (-12.28135 12.41380, -12.28061 12.4...",
...,...,...,...,...,...,...,...,...,...
4012560,4012560,9246539941,9246539942,6.906780,76.742,residential,366052716,"LINESTRING (-17.48521 14.72445, -17.48451 14.7...",
4012561,4012561,9246539942,9246539941,6.906780,76.742,residential,366052716,"LINESTRING (-17.48451 14.72456, -17.48521 14.7...",
4012562,4012562,9246539942,3700438702,0.076140,0.846,residential,366052716,"LINESTRING (-17.48451 14.72456, -17.48451 14.7...",
4012563,4012563,9276108905,6048975958,8.840674,171.902,secondary,177950649,"LINESTRING (-17.50499 14.74971, -17.50357 14.7...",


In [50]:
# Fewer errors farther down when using dataframe instead of gdf
join50_df = pd.DataFrame(join50)
join50_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4195759 entries, 0 to 4012564
Data columns (total 9 columns):
 #   Column     Dtype   
---  ------     -----   
 0   ID_graph   int64   
 1   stnode     int64   
 2   endnode    int64   
 3   time       float64 
 4   length     float64 
 5   highway    object  
 6   osmid      int64   
 7   geometry   geometry
 8   PFU_1in50  float64 
dtypes: float64(3), geometry(1), int64(4), object(1)
memory usage: 320.1+ MB


In [51]:
join50_df

Unnamed: 0,ID_graph,stnode,endnode,time,length,highway,osmid,geometry,PFU_1in50
0,0,358284990,5217543379,2.385144,33.127,unclassified,59618174,"LINESTRING (-12.32347 12.38119, -12.32368 12.3...",
1,1,358284990,1888282175,0.769920,12.832,tertiary,178482063,"LINESTRING (-12.32347 12.38119, -12.32351 12.3...",
2,2,358284990,5329792467,2.926860,48.781,tertiary,178482063,"LINESTRING (-12.32347 12.38119, -12.32317 12.3...",
3,3,358284993,1888282575,7.594620,126.577,tertiary,178470940,"LINESTRING (-12.28135 12.41380, -12.28205 12.4...",
4,4,358284993,1888198886,7.234680,120.578,tertiary,178470940,"LINESTRING (-12.28135 12.41380, -12.28061 12.4...",
...,...,...,...,...,...,...,...,...,...
4012560,4012560,9246539941,9246539942,6.906780,76.742,residential,366052716,"LINESTRING (-17.48521 14.72445, -17.48451 14.7...",
4012561,4012561,9246539942,9246539941,6.906780,76.742,residential,366052716,"LINESTRING (-17.48451 14.72456, -17.48521 14.7...",
4012562,4012562,9246539942,3700438702,0.076140,0.846,residential,366052716,"LINESTRING (-17.48451 14.72456, -17.48451 14.7...",
4012563,4012563,9276108905,6048975958,8.840674,171.902,secondary,177950649,"LINESTRING (-17.50499 14.74971, -17.50357 14.7...",


In [52]:
join50_df.to_csv(os.path.join(out_pth, 'gTime_flood50_intermediate.csv'))

### Create speed penalties.
Note: Flood depths in this table are in centimeters. FATHOM reports depths in meters; the values were rescaled during the conversion to vector format.

In [53]:
# Assign a sentinel depth of -1 to road segments with no flood match.
join50.loc[join50['PFU_1in50'].isnull(), 'PFU_1in50'] = -1

In [54]:
join50["t50"] = 1 # Penalty (travel-time multiplier) column. The default of 1 means no penalty and also covers depths of 0-10 cm.
join50.loc[join50['PFU_1in50'] < 0, 't50'] = 1 # Where no flood crosses (sentinel depth -1), keep the default value.
join50.loc[(join50['PFU_1in50'] > 10) & (join50['PFU_1in50'] <= 30), 't50'] = 1.25
join50.loc[(join50['PFU_1in50'] > 30) & (join50['PFU_1in50'] <= 60), 't50'] = 2
join50.loc[(join50['PFU_1in50'] > 60) & (join50['PFU_1in50'] <= 90), 't50'] = 5
join50.loc[(join50['PFU_1in50'] > 90), 't50'] = 9999 # Deeper than 90 cm: effectively impassable.
join50

Unnamed: 0,ID_graph,stnode,endnode,time,length,highway,osmid,geometry,PFU_1in50,t50
0,0,358284990,5217543379,2.385144,33.127,unclassified,59618174,"LINESTRING (-12.32347 12.38119, -12.32368 12.3...",-1.0,1.0
1,1,358284990,1888282175,0.769920,12.832,tertiary,178482063,"LINESTRING (-12.32347 12.38119, -12.32351 12.3...",-1.0,1.0
2,2,358284990,5329792467,2.926860,48.781,tertiary,178482063,"LINESTRING (-12.32347 12.38119, -12.32317 12.3...",-1.0,1.0
3,3,358284993,1888282575,7.594620,126.577,tertiary,178470940,"LINESTRING (-12.28135 12.41380, -12.28205 12.4...",-1.0,1.0
4,4,358284993,1888198886,7.234680,120.578,tertiary,178470940,"LINESTRING (-12.28135 12.41380, -12.28061 12.4...",-1.0,1.0
...,...,...,...,...,...,...,...,...,...,...
4012560,4012560,9246539941,9246539942,6.906780,76.742,residential,366052716,"LINESTRING (-17.48521 14.72445, -17.48451 14.7...",-1.0,1.0
4012561,4012561,9246539942,9246539941,6.906780,76.742,residential,366052716,"LINESTRING (-17.48451 14.72456, -17.48521 14.7...",-1.0,1.0
4012562,4012562,9246539942,3700438702,0.076140,0.846,residential,366052716,"LINESTRING (-17.48451 14.72456, -17.48451 14.7...",-1.0,1.0
4012563,4012563,9276108905,6048975958,8.840674,171.902,secondary,177950649,"LINESTRING (-17.50499 14.74971, -17.50357 14.7...",-1.0,1.0
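
The depth bands above can be summarized as a single lookup. The sketch below restates the same thresholds (in centimeters) with the standard-library `bisect` module; the function and constant names are hypothetical. Depths of 10 cm or less, including the -1 sentinel, keep a multiplier of 1, and depths above 90 cm receive 9999.

```python
from bisect import bisect_left

# Inclusive upper bounds (cm) of each depth band and the matching multiplier.
BOUNDS_CM = [10, 30, 60, 90]
PENALTIES = [1, 1.25, 2, 5, 9999]


def flood_penalty(depth_cm):
    """Return the travel-time multiplier for a flood depth in centimeters."""
    return PENALTIES[bisect_left(BOUNDS_CM, depth_cm)]


for depth in (-1, 10, 10.5, 30, 60, 90, 90.1):
    print(depth, flood_penalty(depth))
```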


In [55]:
# Turn the penalty column into a flood-affected time column.
join50['t50'] = join50['t50'] * join50['time']
join50

Unnamed: 0,ID_graph,stnode,endnode,time,length,highway,osmid,geometry,PFU_1in50,t50
0,0,358284990,5217543379,2.385144,33.127,unclassified,59618174,"LINESTRING (-12.32347 12.38119, -12.32368 12.3...",-1.0,2.385144
1,1,358284990,1888282175,0.769920,12.832,tertiary,178482063,"LINESTRING (-12.32347 12.38119, -12.32351 12.3...",-1.0,0.769920
2,2,358284990,5329792467,2.926860,48.781,tertiary,178482063,"LINESTRING (-12.32347 12.38119, -12.32317 12.3...",-1.0,2.926860
3,3,358284993,1888282575,7.594620,126.577,tertiary,178470940,"LINESTRING (-12.28135 12.41380, -12.28205 12.4...",-1.0,7.594620
4,4,358284993,1888198886,7.234680,120.578,tertiary,178470940,"LINESTRING (-12.28135 12.41380, -12.28061 12.4...",-1.0,7.234680
...,...,...,...,...,...,...,...,...,...,...
4012560,4012560,9246539941,9246539942,6.906780,76.742,residential,366052716,"LINESTRING (-17.48521 14.72445, -17.48451 14.7...",-1.0,6.906780
4012561,4012561,9246539942,9246539941,6.906780,76.742,residential,366052716,"LINESTRING (-17.48451 14.72456, -17.48521 14.7...",-1.0,6.906780
4012562,4012562,9246539942,3700438702,0.076140,0.846,residential,366052716,"LINESTRING (-17.48451 14.72456, -17.48451 14.7...",-1.0,0.076140
4012563,4012563,9276108905,6048975958,8.840674,171.902,secondary,177950649,"LINESTRING (-17.50499 14.74971, -17.50357 14.7...",-1.0,8.840674


In [56]:
join50.to_csv(os.path.join(out_pth, 'join50.csv'))

### Convert back to graph object.

In [9]:
# Converting back to graph can cause memory errors. Suggested to restart the kernel and reload the nodes and revised edges at this point.
gTime_node = os.path.join(pth, "gTime_nodes.csv")
gTime_node = pd.read_csv(gTime_node)
join50 = os.path.join(out_pth, "join50.csv")
join50 = pd.read_csv(join50)
print(gTime_node.info())
print(join50.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1829568 entries, 0 to 1829567
Data columns (total 7 columns):
 #   Column      Dtype  
---  ------      -----  
 0   Unnamed: 0  int64  
 1   node_ID     int64  
 2   y           float64
 3   highway     object 
 4   x           float64
 5   ref         object 
 6   geometry    object 
dtypes: float64(2), int64(2), object(3)
memory usage: 97.7+ MB
None
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4195759 entries, 0 to 4195758
Data columns (total 11 columns):
 #   Column      Dtype  
---  ------      -----  
 0   Unnamed: 0  int64  
 1   ID_graph    int64  
 2   stnode      int64  
 3   endnode     int64  
 4   time        float64
 5   length      float64
 6   highway     object 
 7   osmid       int64  
 8   geometry    object 
 9   PFU_1in50   float64
 10  t50         float64
dtypes: float64(4), int64(5), object(2)
memory usage: 352.1+ MB
None


In [10]:
print('start: %s\n' % time.ctime())
G_flood = gn.edges_and_nodes_gdf_to_graph(gTime_node, join50, node_tag='node_ID', u_tag='stnode', v_tag='endnode', geometry_tag='geometry')
gn.example_edge(G_flood, 10)
print('\nend: %s' % time.ctime())
print('\n--- processing complete')

start: Wed Dec 22 13:12:40 2021

(358284990, 5217543379, {'geometry': <shapely.geometry.linestring.LineString object at 0x000001411CF7FF40>, 'Unnamed: 0': 0, 'ID_graph': 0, 'time': 2.3851440000000004, 'length': 33.127, 'highway': 'unclassified', 'osmid': 59618174, 'PFU_1in50': -1.0, 't50': 2.3851440000000004})
(358284990, 1888282175, {'geometry': <shapely.geometry.linestring.LineString object at 0x000001411CF7FEB0>, 'Unnamed: 0': 1, 'ID_graph': 1, 'time': 0.76992, 'length': 12.832, 'highway': 'tertiary', 'osmid': 178482063, 'PFU_1in50': -1.0, 't50': 0.76992})
(358284990, 5329792467, {'geometry': <shapely.geometry.linestring.LineString object at 0x000001411CF7FE80>, 'Unnamed: 0': 2, 'ID_graph': 2, 'time': 2.92686, 'length': 48.781, 'highway': 'tertiary', 'osmid': 178482063, 'PFU_1in50': -1.0, 't50': 2.92686})
(5217543379, 358284990, {'geometry': <shapely.geometry.linestring.LineString object at 0x0000014189C99340>, 'Unnamed: 0': 16698, 'ID_graph': 16698, 'time': 2.3851440000000004, 'len

In [11]:
print('start: %s\n' % time.ctime())
gn.save(G_flood, 'gTime_flood50', out_pth, edges = True, nodes = True)
print('\nend: %s' % time.ctime())
print('\n--- processing complete')

start: Wed Dec 22 13:21:57 2021


end: Wed Dec 22 13:34:54 2021

--- processing complete


### Create travel time values for the road nodes nearest to each service.

Travel times are computed with the GOSTnets function calculate_OD.

In [9]:
# If starting a new session, load from file.
HDurban_snap = os.path.join(out_pth, "HDurban_snap.csv")
HDurban_snap = pd.read_csv(HDurban_snap)
hamlet_snap = os.path.join(out_pth, "hamlet_snap.csv")
hamlet_snap = pd.read_csv(hamlet_snap)

In [10]:
print('start: %s\n' % time.ctime())
ag_snap = os.path.join(out_pth, "ag_snap.csv")
ag_snap = pd.read_csv(ag_snap)
print('\nend: %s' % time.ctime())
print('\n--- processing complete')

start: Mon Dec 27 19:34:06 2021


end: Mon Dec 27 19:34:53 2021

--- processing complete


In [11]:
G_flood = nx.read_gpickle("SEN-Cdrive/outputs/gTime_flood20.pickle")
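
`nx.read_gpickle` and `nx.write_gpickle` were removed in NetworkX 3.0. On newer versions, the standard `pickle` module can read and write the same kind of file. A minimal sketch, with hypothetical helper names and a plain dictionary standing in for the graph so that the example runs without NetworkX:

```python
import os
import pickle
import tempfile


def save_graph(graph, path):
    """Write any picklable object (for example a networkx graph) to disk."""
    with open(path, "wb") as f:
        pickle.dump(graph, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_graph(path):
    """Read back an object written by save_graph."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a graph: adjacency with a flood-adjusted time attribute.
toy_graph = {358284990: {5217543379: {"time": 2.385144, "t20": 2.385144}}}

path = os.path.join(tempfile.mkdtemp(), "toy_graph.pickle")
save_graph(toy_graph, path)
restored = load_graph(path)
print(restored == toy_graph)
```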

In [12]:
# We only need to find the origin-destination pairs for nodes closest to the origins and services,
# and some nodes will be the nearest for more than one service (and definitely for multiple origins).
list_hamlet = list(hamlet_snap.NN.unique())
list_ag = list(ag_snap.NN.unique())
originslist = list_hamlet + list_ag
origins = list(set(originslist))

In [13]:
dests = list(HDurban_snap.NN.unique()) 

In [14]:
len(origins)

637854

In [15]:
len(dests) 

58
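
Routing is run once per unique road node rather than once per origin, and the node-level result is then assigned back to every origin that snapped to that node. The sketch below illustrates the idea with made-up IDs and travel times; the dictionaries are stand-ins for the `NN` columns of the snapped origin tables.

```python
# Each origin carries the ID of its nearest road node (NN); values are made up.
ag_nn = {"ag_1": 101, "ag_2": 101, "ag_3": 202}
hamlet_nn = {"hamlet_1": 202, "hamlet_2": 303}

# Route once per unique node rather than once per origin.
origin_nodes = sorted(set(ag_nn.values()) | set(hamlet_nn.values()))

# Stand-in for the routing result: minutes from each node to its nearest market.
node_minutes = {101: 35.0, 202: 12.5, 303: 80.0}

# Assign node-level results back to every origin that snapped to that node.
ag_minutes = {origin: node_minutes[nn] for origin, nn in ag_nn.items()}

print(origin_nodes)
print(ag_minutes)
```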

In [16]:
fail_value = 999999999 # If there is no shortest path, the OD pair will be assigned the fail value.

In [17]:
print('start: %s\n' % time.ctime())
OD = gn.calculate_OD(G_flood, origins, dests, fail_value, weight = 't20')
print('\nend: %s' % time.ctime())
print('\n--- processing complete')

start: Mon Dec 27 19:36:14 2021


end: Mon Dec 27 19:55:43 2021

--- processing complete


In [18]:
OD_df = pd.DataFrame(OD, index = origins, columns = dests)

In [19]:
OD_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 637854 entries, 3571449893 to 8925478824
Data columns (total 58 columns):
 #   Column      Non-Null Count   Dtype  
---  ------      --------------   -----  
 0   6058226279  637854 non-null  float64
 1   6029307183  637854 non-null  float64
 2   4998093094  637854 non-null  float64
 3   2201506815  637854 non-null  float64
 4   3474499811  637854 non-null  float64
 5   1697006012  637854 non-null  float64
 6   1901689169  637854 non-null  float64
 7   6032060028  637854 non-null  float64
 8   6040927878  637854 non-null  float64
 9   3449495495  637854 non-null  float64
 10  3990543961  637854 non-null  float64
 11  8972391475  637854 non-null  float64
 12  3418418812  637854 non-null  float64
 13  1983641803  637854 non-null  float64
 14  6014451367  637854 non-null  float64
 15  6027163276  637854 non-null  float64
 16  2833577858  637854 non-null  float64
 17  4656728818  637854 non-null  float64
 18  6045659373  637854 non-null  fl

In [20]:
# Mask unreachable pairs (fail_value) as NaN, convert seconds to minutes, and save to file.
OD_min = OD_df[OD_df < fail_value] / 60
OD_min.to_csv(os.path.join(out_pth, 'OD_flood20_allorigins.csv'))
OD_min

Unnamed: 0,6058226279,6029307183,4998093094,2201506815,3474499811,1697006012,1901689169,6032060028,6040927878,3449495495,...,1968458114,1936967272,3496518021,6027615161,6027276892,6041228287,5536661253,7357630367,8178147277,6026834850
3571449893,213.645444,839.820832,701.854926,580.699756,163.539580,664.646282,704.134038,660.914796,579.768214,553.139068,...,579.518640,635.524359,156.190505,156.007176,703.371664,699.192686,698.664422,698.576883,558.990729,558.269196
3571449966,214.032355,840.207744,702.241838,581.086668,163.926491,665.033193,704.520949,661.301707,580.155126,553.525980,...,579.905551,635.911270,156.577416,156.394088,703.758576,699.579598,699.051334,698.963794,559.377640,558.656107
3405774993,85.481910,895.185232,691.910559,570.755389,153.595212,654.701914,694.189670,650.970428,569.823847,543.194701,...,569.574272,625.579991,291.516231,291.332903,693.427297,689.248319,688.720055,688.632515,549.046361,548.324828
3405774994,85.882659,895.585981,691.509810,570.354640,153.194463,654.301165,693.788921,650.569679,569.423098,542.793952,...,569.173523,625.179242,291.115482,290.932154,693.026548,688.847570,688.319306,688.231766,548.645612,547.924079
3405774995,86.216205,895.919527,691.176264,570.021094,152.860917,653.967619,693.455375,650.236133,569.089552,542.460406,...,568.839977,624.845696,290.781936,290.598608,692.693002,688.514024,687.985760,687.898220,548.312066,547.590533
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8633974649,289.766806,789.441018,896.010761,774.855591,357.695414,858.802116,898.289872,855.070630,773.924048,747.294903,...,773.674474,770.225204,201.273940,201.090611,897.527498,893.348521,892.820257,892.732717,753.146563,752.425030
8633974652,289.676390,789.350601,895.920344,774.765174,357.604997,858.711700,898.199456,854.980214,773.833632,747.204486,...,773.584058,770.134788,201.183523,201.000195,897.437082,893.258104,892.729840,892.642300,753.056146,752.334613
8633974653,289.610430,789.284642,895.854385,774.699215,357.539038,858.645741,898.133496,854.914255,773.767673,747.138527,...,773.518098,770.068829,201.117564,200.934236,897.371123,893.192145,892.663881,892.576341,752.990187,752.268654
8633974654,289.577655,789.251866,895.821609,774.666439,357.506263,858.612965,898.100721,854.881479,773.734897,747.105751,...,773.485323,770.036053,201.084789,200.901460,897.338347,893.159369,892.631105,892.543566,752.957412,752.235879
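
The conversion cell does two things at once: entries equal to the fail value are masked (they become NaN in pandas), and the remaining seconds are divided by 60. A plain-Python sketch of the same logic, with a hypothetical helper name and made-up inputs:

```python
import math

FAIL_VALUE = 999999999  # same sentinel value as in the notebook


def to_minutes(seconds, fail_value=FAIL_VALUE):
    """Convert seconds to minutes; unreachable pairs become NaN."""
    return seconds / 60 if seconds < fail_value else math.nan


row_seconds = [12818.7, 50389.2, FAIL_VALUE]
row_minutes = [to_minutes(s) for s in row_seconds]
print(row_minutes)
```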


In [21]:
# Create origin-specific matrix (agricultural origins), mask fail values, convert to minutes, and save to file.
OD_ag = OD_df.loc[list_ag, :]
OD_ag = OD_ag[OD_ag < fail_value] / 60
OD_ag.to_csv(os.path.join(out_pth, 'OD_flood20_ag.csv'))
OD_ag

Unnamed: 0,6058226279,6029307183,4998093094,2201506815,3474499811,1697006012,1901689169,6032060028,6040927878,3449495495,...,1968458114,1936967272,3496518021,6027615161,6027276892,6041228287,5536661253,7357630367,8178147277,6026834850
3507831609,83.755636,860.493854,851.261458,730.106288,312.946111,814.052814,853.540570,810.321328,729.174746,702.545600,...,728.925172,784.930891,445.758016,445.574687,852.778196,848.599218,848.070954,847.983414,708.397260,707.675727
3507831510,90.114132,866.852350,857.619955,736.464785,319.304608,820.411310,859.899066,816.679824,735.533242,708.904097,...,735.283668,791.289387,452.116512,451.933184,859.136692,854.957715,854.429451,854.341911,714.755757,714.034224
6188134127,25.237636,801.975854,792.743458,671.588288,254.428111,755.534814,795.022569,751.803328,670.656746,644.027600,...,670.407171,726.412890,387.240016,387.056687,794.260196,790.081218,789.552954,789.465414,649.879260,649.157727
8631201421,96.469734,804.245498,863.975556,742.820386,325.660209,826.766911,866.254667,823.035426,741.888844,715.259698,...,741.639269,797.644988,373.966705,373.783376,865.492294,861.313316,860.785052,860.697512,721.111358,720.389825
8598305977,53.485996,783.022540,820.991818,699.836648,282.676471,783.783173,823.270929,780.051688,698.905106,672.275960,...,698.655531,754.661250,368.286701,368.103373,822.508556,818.329578,817.801314,817.713774,678.127620,677.406087
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3651042474,140487.455803,141221.255348,139981.539985,140016.939178,140364.129068,139945.350892,139981.653020,139928.309656,139970.235088,139936.043949,...,139847.040556,139785.574025,140348.504547,140348.321219,139983.056722,139969.998602,139969.108724,139969.001946,139938.145188,139939.151306
3651042508,140479.305548,141213.105093,139973.389730,140008.788923,140355.978813,139937.200637,139973.502765,139920.159401,139962.084833,139927.893694,...,139838.890301,139777.423770,140340.354292,140340.170964,139974.906467,139961.848347,139960.958469,139960.851692,139929.994933,139931.001051
3651042501,140484.800664,141218.600208,139978.884845,140014.284039,140361.473928,139942.695753,139978.997881,139925.654516,139967.579949,139933.388810,...,139844.385417,139782.918886,140345.849408,140345.666079,139980.401583,139967.343463,139966.453585,139966.346807,139935.490049,139936.496167
3651042393,140471.270362,141205.069906,139965.354543,140000.753736,140347.943626,139929.165450,139965.467579,139912.124214,139954.049646,139919.858507,...,139830.855115,139769.388584,140332.319105,140332.135777,139966.871281,139953.813161,139952.923282,139952.816505,139921.959746,139922.965864
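
The last rows above show travel times of roughly 140,000 minutes. These values stay below `fail_value`, so the mask keeps them; they probably reflect routes that must cross a segment carrying the 9999 penalty. A minimal sketch for flagging such origins, assuming a hypothetical cutoff of one day (1,440 minutes) and made-up origin names:

```python
CUTOFF_MIN = 24 * 60  # hypothetical threshold, not from the source


def effectively_cut_off(nearest_minutes, cutoff=CUTOFF_MIN):
    """Return origin IDs whose best travel time exceeds the cutoff."""
    return [origin for origin, minutes in nearest_minutes.items() if minutes > cutoff]


# Minimum travel time per origin across all destinations (illustrative values).
nearest = {"origin_a": 83.8, "origin_b": 25.2, "origin_c": 139785.6}
print(effectively_cut_off(nearest))
```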


In [22]:
# Same steps for hamlet origins.
OD_hamlet = OD_df.loc[list_hamlet, :]
OD_hamlet = OD_hamlet[OD_hamlet < fail_value] / 60
OD_hamlet.to_csv(os.path.join(out_pth, 'OD_flood20_hamlet.csv'))
OD_hamlet

Unnamed: 0,6058226279,6029307183,4998093094,2201506815,3474499811,1697006012,1901689169,6032060028,6040927878,3449495495,...,1968458114,1936967272,3496518021,6027615161,6027276892,6041228287,5536661253,7357630367,8178147277,6026834850
7761872870,63.403221,874.359225,829.335161,708.179991,291.019814,792.126517,831.614273,788.395031,707.248449,680.619303,...,706.998875,763.004594,428.940834,428.757505,830.851899,826.672921,826.144657,826.057118,686.470963,685.749430
7761872869,63.384661,874.340665,829.316602,708.161432,291.001255,792.107957,831.595713,788.376471,707.229889,680.600744,...,706.980315,762.986034,428.922274,428.738946,830.833339,826.654362,826.126098,826.038558,686.452404,685.730871
6442044321,62.374829,873.330833,828.306770,707.151600,289.991423,791.098125,830.585881,787.366639,706.220057,679.590912,...,705.970483,761.976202,427.912442,427.729114,829.823507,825.644530,825.116266,825.028726,685.442572,684.721039
2142496418,63.828351,874.784355,829.760292,708.605122,291.444945,792.551647,832.039403,788.820161,707.673579,681.044434,...,707.424005,763.429724,429.365964,429.182636,831.277029,827.098052,826.569788,826.482248,686.896094,686.174561
2142496429,64.210725,875.166729,830.142666,708.987496,291.827319,792.934021,832.421777,789.202535,708.055953,681.426808,...,707.806379,763.812098,429.748338,429.565010,831.659403,827.480426,826.952162,826.864622,687.278468,686.556935
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9207762346,759.815291,1270.957620,441.918070,366.535448,671.915052,404.709426,444.197182,391.234121,319.831358,284.110717,...,305.620498,249.148226,398.206819,398.023490,443.434808,433.572803,432.682925,432.576147,293.401199,286.416565
8463584916,719.145603,1428.350095,230.836629,251.844536,595.818867,194.647536,233.115741,177.606300,205.140446,170.949307,...,92.201681,64.552430,555.599294,555.415966,232.353367,228.174389,227.646125,227.558585,173.050546,174.056664
8463593882,722.904808,1415.738086,234.804839,255.603741,599.578072,198.615746,237.083950,181.574510,208.899651,174.708512,...,95.960887,68.659633,542.987285,542.803956,236.321576,232.142599,231.614335,231.526795,176.809751,177.815869
9208004175,163905.632985,164639.432529,163399.717166,163435.116360,163782.306249,163363.528073,163399.830202,163346.486837,163388.412269,163354.221131,...,163265.217738,163203.751207,163766.681729,163766.498400,163401.233904,163388.175784,163387.285906,163387.179128,163356.322369,163357.328487
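
The masking and unit conversion in the cell above can be checked on a toy matrix. This is a minimal sketch, assuming the raw OD values are in seconds and that `fail_value` marks unreachable pairs; the column labels and numbers are made up for illustration.

```python
import pandas as pd

fail_value = 999999999  # same sentinel as in the notebook

# Hypothetical 2x2 OD matrix in seconds; one pair is unreachable.
od_seconds = pd.DataFrame(
    {"A": [600.0, float(fail_value)], "B": [1200.0, 90.0]},
    index=[1, 2],
)

# Keep values below the sentinel (others become NaN), then convert to minutes.
od_minutes = od_seconds.where(od_seconds < fail_value) / 60
print(od_minutes)
```

`DataFrame.where` is equivalent to the boolean-mask indexing used above; it just makes the "replace with NaN" step explicit.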


### Filter 1st nearest

#### Check each file to make sure nearest neighbor column is named correctly. If not, rename.

In [23]:
# Reload from file even if already loaded. Quickest way to ensure NN is a column rather than only the index.
OD_hamlet = os.path.join(out_pth, "OD_flood20_hamlet.csv")
OD_hamlet = pd.read_csv(OD_hamlet)
OD_ag = os.path.join(out_pth, "OD_flood20_ag.csv")
OD_ag = pd.read_csv(OD_ag)

In [25]:
OD_hamlet

Unnamed: 0,NN,6058226279,6029307183,4998093094,2201506815,3474499811,1697006012,1901689169,6032060028,6040927878,...,1968458114,1936967272,3496518021,6027615161,6027276892,6041228287,5536661253,7357630367,8178147277,6026834850
0,7761872870,63.403221,874.359225,829.335161,708.179991,291.019814,792.126517,831.614273,788.395031,707.248449,...,706.998875,763.004594,428.940834,428.757505,830.851899,826.672921,826.144657,826.057118,686.470963,685.749430
1,7761872869,63.384661,874.340665,829.316602,708.161432,291.001255,792.107957,831.595713,788.376471,707.229889,...,706.980315,762.986034,428.922274,428.738946,830.833339,826.654362,826.126098,826.038558,686.452404,685.730871
2,6442044321,62.374829,873.330833,828.306770,707.151600,289.991423,791.098125,830.585881,787.366639,706.220057,...,705.970483,761.976202,427.912442,427.729114,829.823507,825.644530,825.116266,825.028726,685.442572,684.721039
3,2142496418,63.828351,874.784355,829.760292,708.605122,291.444945,792.551647,832.039403,788.820161,707.673579,...,707.424005,763.429724,429.365964,429.182636,831.277029,827.098052,826.569788,826.482248,686.896094,686.174561
4,2142496429,64.210725,875.166729,830.142666,708.987496,291.827319,792.934021,832.421777,789.202535,708.055953,...,707.806379,763.812098,429.748338,429.565010,831.659403,827.480426,826.952162,826.864622,687.278468,686.556935
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
61225,9207762346,759.815291,1270.957620,441.918070,366.535448,671.915052,404.709426,444.197182,391.234121,319.831358,...,305.620498,249.148226,398.206819,398.023490,443.434808,433.572803,432.682925,432.576147,293.401199,286.416565
61226,8463584916,719.145603,1428.350095,230.836629,251.844536,595.818867,194.647536,233.115741,177.606300,205.140446,...,92.201681,64.552430,555.599294,555.415966,232.353367,228.174389,227.646125,227.558585,173.050546,174.056664
61227,8463593882,722.904808,1415.738086,234.804839,255.603741,599.578072,198.615746,237.083950,181.574510,208.899651,...,95.960887,68.659633,542.987285,542.803956,236.321576,232.142599,231.614335,231.526795,176.809751,177.815869
61228,9208004175,163905.632985,164639.432529,163399.717166,163435.116360,163782.306249,163363.528073,163399.830202,163346.486837,163388.412269,...,163265.217738,163203.751207,163766.681729,163766.498400,163401.233904,163388.175784,163387.285906,163387.179128,163356.322369,163357.328487


In [24]:
OD_ag.rename(columns={'Unnamed: 0': 'NN'}, inplace=True) 
OD_hamlet.rename(columns={'Unnamed: 0': 'NN'}, inplace=True) 
OD_ag

Unnamed: 0,NN,6058226279,6029307183,4998093094,2201506815,3474499811,1697006012,1901689169,6032060028,6040927878,...,1968458114,1936967272,3496518021,6027615161,6027276892,6041228287,5536661253,7357630367,8178147277,6026834850
0,3507831609,83.755636,860.493854,851.261458,730.106288,312.946111,814.052814,853.540570,810.321328,729.174746,...,728.925172,784.930891,445.758016,445.574687,852.778196,848.599218,848.070954,847.983414,708.397260,707.675727
1,3507831510,90.114132,866.852350,857.619955,736.464785,319.304608,820.411310,859.899066,816.679824,735.533242,...,735.283668,791.289387,452.116512,451.933184,859.136692,854.957715,854.429451,854.341911,714.755757,714.034224
2,6188134127,25.237636,801.975854,792.743458,671.588288,254.428111,755.534814,795.022569,751.803328,670.656746,...,670.407171,726.412890,387.240016,387.056687,794.260196,790.081218,789.552954,789.465414,649.879260,649.157727
3,8631201421,96.469734,804.245498,863.975556,742.820386,325.660209,826.766911,866.254667,823.035426,741.888844,...,741.639269,797.644988,373.966705,373.783376,865.492294,861.313316,860.785052,860.697512,721.111358,720.389825
4,8598305977,53.485996,783.022540,820.991818,699.836648,282.676471,783.783173,823.270929,780.051688,698.905106,...,698.655531,754.661250,368.286701,368.103373,822.508556,818.329578,817.801314,817.713774,678.127620,677.406087
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
625846,3651042474,140487.455803,141221.255348,139981.539985,140016.939178,140364.129068,139945.350892,139981.653020,139928.309656,139970.235088,...,139847.040556,139785.574025,140348.504547,140348.321219,139983.056722,139969.998602,139969.108724,139969.001946,139938.145188,139939.151306
625847,3651042508,140479.305548,141213.105093,139973.389730,140008.788923,140355.978813,139937.200637,139973.502765,139920.159401,139962.084833,...,139838.890301,139777.423770,140340.354292,140340.170964,139974.906467,139961.848347,139960.958469,139960.851692,139929.994933,139931.001051
625848,3651042501,140484.800664,141218.600208,139978.884845,140014.284039,140361.473928,139942.695753,139978.997881,139925.654516,139967.579949,...,139844.385417,139782.918886,140345.849408,140345.666079,139980.401583,139967.343463,139966.453585,139966.346807,139935.490049,139936.496167
625849,3651042393,140471.270362,141205.069906,139965.354543,140000.753736,140347.943626,139929.165450,139965.467579,139912.124214,139954.049646,...,139830.855115,139769.388584,140332.319105,140332.135777,139966.871281,139953.813161,139952.923282,139952.816505,139921.959746,139922.965864
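
An alternative to renaming after reload is to name the index when the matrix is written. The sketch below is illustrative only: it writes to an in-memory buffer instead of `out_pth`, and the tiny frame is invented.

```python
import io
import pandas as pd

# Hypothetical OD matrix: origin node IDs in the index, one destination column.
od = pd.DataFrame({"6058226279": [63.4, 62.3]}, index=[7761872870, 6442044321])

buf = io.StringIO()
od.to_csv(buf, index_label="NN")  # the index is written under the header "NN"
buf.seek(0)

reloaded = pd.read_csv(buf)  # NN comes back as an ordinary column
print(reloaded.columns.tolist())
```

With `index_label` set at write time, the `Unnamed: 0` column never appears and the rename step becomes unnecessary.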


#### Find first, second, and third nearest destination for each origin node. 

In [26]:
fail_value = 999999999

In [27]:
# Nearest
OD_ag["ag_HD1F20"] = 0
sub = OD_ag.iloc[:,1:-1] # Filtering out the newly created field and the node ID column. ("include everything between column 0 and the last column")
OD_ag["ag_HD1F20"] = sub.min(axis=1) # Default is axis=0, meaning min value of each column selected. We want min of each row.
ag1 = OD_ag[['NN', 'ag_HD1F20']] # Remove unnecessary OD values.


# Second nearest
# Flag every value that occurs more than once within its row (axis=1). keep=False flags all occurrences.
dupes = OD_ag.apply(pd.Series.duplicated, axis = 1, keep=False)
# On this first pass the nearest time is flagged twice (its OD column and ag_HD1F20), plus any equidistant POIs.
# NaN cells also count as duplicates of each other, so they receive fail_value as well.
dupes = OD_ag.where(~dupes, fail_value) # Replace every flagged value with fail_value.
Dsub = dupes.iloc[:,1:] # Drop the node ID column. ag_HD1F20 needs no filtering because it now holds fail_value.
OD_ag["ag_HD2F20"] = Dsub.min(axis=1)
ag2 = OD_ag.loc[:,['NN', 'ag_HD2F20']]


# Third nearest
dupes = OD_ag.apply(pd.Series.duplicated, axis = 1, keep=False)
# Since this includes both first and second nearest columns, there should be four True values per row, unless POIs are equidistant.
dupes = OD_ag.where(~dupes, fail_value)

OD_ag["ag_HD3F20"] = 0
Dsub = dupes.iloc[:,1:] # Filtering out the node ID column.
OD_ag["ag_HD3F20"] = Dsub.min(axis=1)
ag3 = OD_ag.loc[:,['NN', 'ag_HD3F20']]

# Combine and write to file
ag_all = OD_ag.loc[:,['NN', 'ag_HD1F20', 'ag_HD2F20', 'ag_HD3F20']]
ag_all.to_csv(os.path.join(out_pth, 'ag_to_HDurban_flood20.csv'))
ag_all.head()

Unnamed: 0,NN,ag_HD1F20,ag_HD2F20,ag_HD3F20
0,3507831609,82.054823,83.755636,312.946111
1,3507831510,88.413319,90.114132,319.304608
2,6188134127,23.536822,25.237636,254.428111
3,8631201421,94.76892,96.469734,325.660209
4,8598305977,51.785182,53.485996,282.676471
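
For comparison, the k smallest travel times per row can also be taken by sorting each row. This is a sketch of a different technique, not the duplicate-masking method used above: the function name `k_nearest_times` and the output column prefix are hypothetical. Unlike the masking approach, tied (equidistant) destinations are kept as separate ranks rather than dropped.

```python
import numpy as np
import pandas as pd

def k_nearest_times(od, id_col="NN", k=3, prefix="t"):
    """Return the k smallest travel times in each origin row."""
    times = od.drop(columns=[id_col]).to_numpy(dtype=float)
    # np.sort places NaN at the end of each row, so unreachable cells never rank first.
    ranked = np.sort(times, axis=1)[:, :k]
    cols = [f"{prefix}{i + 1}" for i in range(k)]
    out = pd.DataFrame(ranked, columns=cols, index=od.index)
    out.insert(0, id_col, od[id_col].to_numpy())
    return out

# Toy OD matrix (minutes): row 0 has a tie, row 1 has an unreachable destination.
demo = pd.DataFrame({
    "NN": [101, 102],
    "d1": [30.0, 12.0],
    "d2": [10.0, np.nan],
    "d3": [20.0, 15.0],
    "d4": [10.0, 40.0],
})
nearest = k_nearest_times(demo)
print(nearest)
```

Whether ties should count as one rank or two depends on the analysis, so the two approaches are not interchangeable when POIs are equidistant.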


In [28]:
# Nearest
OD_hamlet["ha_HD1F20"] = 0
sub = OD_hamlet.iloc[:,1:-1] # Filtering out the newly created field and the node ID column. ("include everything between column 0 and the last column")
OD_hamlet["ha_HD1F20"] = sub.min(axis=1) # Default is axis=0, meaning min value of each column selected. We want min of each row.
hamlet1 = OD_hamlet[['NN', 'ha_HD1F20']] # Remove unnecessary OD values.


# Second nearest
# Flag every value that occurs more than once within its row (axis=1). keep=False flags all occurrences.
dupes = OD_hamlet.apply(pd.Series.duplicated, axis = 1, keep=False)
# On this first pass the nearest time is flagged twice (its OD column and ha_HD1F20), plus any equidistant POIs.
# NaN cells also count as duplicates of each other, so they receive fail_value as well.
dupes = OD_hamlet.where(~dupes, fail_value) # Replace every flagged value with fail_value.
Dsub = dupes.iloc[:,1:] # Drop the node ID column. ha_HD1F20 needs no filtering because it now holds fail_value.
OD_hamlet["ha_HD2F20"] = Dsub.min(axis=1)
hamlet2 = OD_hamlet.loc[:,['NN', 'ha_HD2F20']]


# Third nearest
dupes = OD_hamlet.apply(pd.Series.duplicated, axis = 1, keep=False)
# Since this includes both first and second nearest columns, there should be four True values per row, unless POIs are equidistant.
dupes = OD_hamlet.where(~dupes, fail_value)
OD_hamlet["ha_HD3F20"] = 0
Dsub = dupes.iloc[:,1:] # Filtering out the node ID column.
OD_hamlet["ha_HD3F20"] = Dsub.min(axis=1)
hamlet3 = OD_hamlet.loc[:,['NN', 'ha_HD3F20']]


# Combine and write to file
hamlet_all = OD_hamlet.loc[:,['NN', 'ha_HD1F20', 'ha_HD2F20', 'ha_HD3F20']]
hamlet_all.to_csv(os.path.join(out_pth, 'hamlet_to_HDurban_flood20.csv'))
hamlet_all.head()

Unnamed: 0,NN,ha_HD1F20,ha_HD2F20,ha_HD3F20
0,7761872870,63.403221,66.771529,291.019814
1,7761872869,63.384661,66.75297,291.001255
2,6442044321,62.374829,65.743138,289.991423
3,2142496418,63.828351,67.19666,291.444945
4,2142496429,64.210725,67.579034,291.827319


### Join back to georeferenced _snap file.

In [13]:
# If starting new session, reload from file.
ag_all = os.path.join(out_pth, "ag_to_HDurban_flood20.csv")
ag_all = pd.read_csv(ag_all)
ag_all.info() # Check to make sure NN data type matches its corresponding _snap file

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 541461 entries, 0 to 541460
Data columns (total 5 columns):
 #   Column      Non-Null Count   Dtype  
---  ------      --------------   -----  
 0   Unnamed: 0  541461 non-null  int64  
 1   NN          541461 non-null  int64  
 2   ag_HD1F10   541453 non-null  float64
 3   ag_HD2F10   541461 non-null  float64
 4   ag_HD3F10   541461 non-null  float64
dtypes: float64(3), int64(2)
memory usage: 20.7 MB


In [None]:
hamlet_all = os.path.join(out_pth, "hamlet_to_HDurban_flood20.csv")
hamlet_all = pd.read_csv(hamlet_all)
hamlet_all.info()

In [10]:
ag_snap = os.path.join(out_pth, "ag_snap.csv")
ag_snap = pd.read_csv(ag_snap)
ag_snap.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6852701 entries, 0 to 6852700
Data columns (total 6 columns):
 #   Column      Dtype  
---  ------      -----  
 0   Unnamed: 0  int64  
 1   ID          int64  
 2   LC_90m      int64  
 3   geometry    object 
 4   NN          int64  
 5   NN_dist     float64
dtypes: float64(1), int64(4), object(1)
memory usage: 313.7+ MB


In [None]:
hamlet_snap = os.path.join(out_pth, "hamlet_snap.csv")
hamlet_snap = pd.read_csv(hamlet_snap)
hamlet_snap.info()

In [29]:
ag_to_HDurban = pd.merge(ag_snap, ag_all, on='NN',how='left')
ag_to_HDurban

Unnamed: 0.1,Unnamed: 0,ID_ag,ID_spam,grid_val,ID_LC,val,x,y,NSnomax,geometry,NN,NN_dist,ag_HD1F20,ag_HD2F20,ag_HD3F20
0,0,13941103,2344.0,2008851.0,0.1,2008851.0,-16.458988,12.208646,,POINT (-16.458987999999977 12.208646000000044),3507831609,397.198533,82.054823,83.755636,312.946111
1,1,13941122,2345.0,94605.0,0.1,94605.0,-16.375655,12.208646,,POINT (-16.375654999999938 12.208646000000044),3507831510,2881.578460,88.413319,90.114132,319.304608
2,2,13941017,2346.0,708413.0,0.1,708413.0,-16.208989,12.208646,,POINT (-16.208988999999974 12.208646000000044),6188134127,17711.673781,23.536822,25.237636,254.428111
3,3,13941036,2402.0,315719.0,0.1,315719.0,-16.208492,12.152815,,POINT (-16.20849199999998 12.152815000000032),3507831510,21562.215777,88.413319,90.114132,319.304608
4,4,13941044,2401.0,375288.0,0.1,375288.0,-16.294213,12.149487,,POINT (-16.294212999999957 12.149487000000022),3507831510,12874.077085,88.413319,90.114132,319.304608
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13941119,13941119,13941081,3.0,87412.0,0.1,87412.0,-15.291536,16.874047,,POINT (-15.291535999999951 16.874047000000076),3651042474,2658.421843,139785.574025,139786.528309,139792.211856
13941120,13941120,13941093,6.0,21660.0,0.1,21660.0,-15.292326,16.791961,,POINT (-15.292325999999946 16.79196100000007),3651042508,1234.146014,139777.423770,139778.378054,139784.061601
13941121,13941121,13941109,2.0,70993.0,0.1,70993.0,-15.374255,16.867753,,POINT (-15.374254999999948 16.86775300000005),3651042501,9172.299324,139782.918886,139783.873169,139789.556716
13941122,13941122,13941119,5.0,54822.0,0.1,54822.0,-15.375659,16.791961,,POINT (-15.375658999999928 16.79196100000007),3651042393,4280.425149,139769.388584,139770.342867,139776.026414


In [30]:
hamlet_to_HDurban = pd.merge(hamlet_snap, hamlet_all, on='NN',how='left')
hamlet_to_HDurban

Unnamed: 0.1,Unnamed: 0,Unnamed_ 0,mgrs_code,type,GlobalID,Shape_Leng,Shape_Area,geometry,NN,NN_dist,ha_HD1F20,ha_HD2F20,ha_HD3F20
0,0,0,28PCU1265_01,hamlet,{ED2CCDD5-C78F-40B6-A18A-3A01B61A4998},0.004314,0.000001,POINT (-16.721473282454415 12.348636090165247),7761872870,307.058089,63.403221,66.771529,291.019814
1,1,1,28PCU1365_01,hamlet,{372B104B-B208-4D14-84E2-8ABFD4D8C37A},0.009910,0.000006,POINT (-16.716456507935607 12.34788723564788),7761872869,801.450257,63.384661,66.752970,291.001255
2,2,2,28PCU1365_02,hamlet,{D03C2B85-5F35-4EE8-8346-B83494628F26},0.003754,0.000001,POINT (-16.713855008830738 12.350880992129111),6442044321,694.273717,62.374829,65.743138,289.991423
3,3,3,28PCU1566_01,hamlet,{5EAFF1C3-6EE5-4F96-99FC-78F924454480},0.004401,0.000002,POINT (-16.701275174874546 12.355585269999269),2142496418,689.246791,63.828351,67.196660,291.444945
4,4,4,28PCU1566_02,hamlet,{1D6A9E17-0D49-446D-A23B-7A47B155DC64},0.005357,0.000002,POINT (-16.698773736706396 12.356804484409668),2142496429,607.912599,64.210725,67.579034,291.827319
...,...,...,...,...,...,...,...,...,...,...,...,...,...
125881,125881,125881,28QED6412_03,hamlet,{5555A010-36B2-47D2-96C4-BDD1E59111ED},0.005397,0.000002,POINT (-14.397827933065358 16.394142941310925),8592243241,5089.086102,191.325867,193.414105,199.472924
125882,125882,125882,28QED6413_03,hamlet,{20205A44-8B9D-4FCE-B14C-53826594DB5A},0.003610,0.000001,POINT (-14.397473236932905 16.405676938062538),6375187769,3949.180081,189.396436,191.484673,197.543492
125883,125883,125883,28QED6413_04,hamlet,{AC6A169C-FD0E-4DF6-BDAB-B69FBD04BFAF},0.015471,0.000008,POINT (-14.400364192239715 16.404303134272737),8592243457,4008.851333,191.027109,193.115346,199.174166
125884,125884,125884,28QED6424_03,hamlet,{51593C65-B268-4BA1-8212-E43232C021FF},0.003883,0.000001,POINT (-14.396243884215123 16.49715066208162),3646207611,1826.174300,187.250035,189.338272,195.397091
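
A left join on `NN` assumes each node ID appears once in the travel-time table. As a hedged sketch, pandas can enforce that assumption and report unmatched origins through the `validate` and `indicator` arguments of `pd.merge`; the frames below are toy stand-ins for the `_snap` and `_all` tables.

```python
import pandas as pd

# Toy stand-ins: several origins can snap to the same road node (many-to-one).
snap = pd.DataFrame({"ID": [1, 2, 3], "NN": [10, 10, 30]})
times = pd.DataFrame({"NN": [10, 20], "t1": [5.0, 7.5]})

joined = pd.merge(
    snap, times, on="NN", how="left",
    validate="many_to_one",  # raises MergeError if NN repeats in `times`
    indicator=True,          # adds a _merge column: 'both' or 'left_only'
)

# Origins whose road node has no travel-time record.
unmatched = joined.loc[joined["_merge"] == "left_only", "ID"].tolist()
print(unmatched)
```

If the right-hand table did contain repeated `NN` values, a plain left join would silently multiply origin rows, which is the case `validate` guards against.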


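The geometry column is stored as WKT text (for example `POINT (-16.72 12.35)`, which has no comma between the coordinates), so it does not load directly as a GeoDataFrame in Python or in ArcGIS. As a quick workaround, extract the X and Y coordinates so that a shapefile can be built from the points. The same approach works for the non-flood travel times, which were also saved as CSV.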

In [32]:
# Extract X and Y from the WKT text, e.g. "POINT (-16.72 12.35)".
XY = hamlet_to_HDurban["geometry"].astype(str).str.extract(r'\(\s*(\S+)\s+(\S+)\s*\)')
hamlet_to_HDurban["X"] = XY[0].astype(float)
hamlet_to_HDurban["Y"] = XY[1].astype(float)
hamlet_to_HDurban = hamlet_to_HDurban.drop(columns=['geometry'])
hamlet_to_HDurban.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 125886 entries, 0 to 125885
Data columns (total 14 columns):
 #   Column      Non-Null Count   Dtype  
---  ------      --------------   -----  
 0   Unnamed: 0  125886 non-null  int64  
 1   Unnamed_ 0  125886 non-null  int64  
 2   mgrs_code   125886 non-null  object 
 3   type        125886 non-null  object 
 4   GlobalID    125886 non-null  object 
 5   Shape_Leng  125886 non-null  float64
 6   Shape_Area  125886 non-null  float64
 7   NN          125886 non-null  int64  
 8   NN_dist     125886 non-null  float64
 9   ha_HD1F20   125883 non-null  float64
 10  ha_HD2F20   125886 non-null  float64
 11  ha_HD3F20   125886 non-null  float64
 12  X           125886 non-null  float64
 13  Y           125886 non-null  float64
dtypes: float64(8), int64(3), object(3)
memory usage: 14.4+ MB


In [31]:
hamlet_to_HDurban.to_csv(os.path.join(out_pth, 'hamlet_to_HDurban_flood20.csv'))
ag_to_HDurban.to_csv(os.path.join(out_pth, 'ag_to_HDurban_flood20.csv'))

### Combine with cost-distance raster travel times from origins to road node.

In [8]:
# If reloading
ag_to_HDurban = os.path.join(out_pth, "ag_to_HDurban_flood20.csv")
ag_to_HDurban = pd.read_csv(ag_to_HDurban)
ag_to_HDurban.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13941124 entries, 0 to 13941123
Data columns (total 16 columns):
 #   Column        Dtype  
---  ------        -----  
 0   Unnamed: 0    int64  
 1   Unnamed: 0.1  int64  
 2   ID_ag         int64  
 3   ID_spam       float64
 4   grid_val      float64
 5   ID_LC         float64
 6   val           float64
 7   x             float64
 8   y             float64
 9   NSnomax       float64
 10  geometry      object 
 11  NN            int64  
 12  NN_dist       float64
 13  ag_HD1F20     float64
 14  ag_HD2F20     float64
 15  ag_HD3F20     float64
dtypes: float64(11), int64(4), object(1)
memory usage: 1.7+ GB


In [9]:
ag_to_HDurban['NSnomax'] = ag_to_HDurban['NSnomax'].replace(-9999, np.nan) # Treat -9999 as missing. Assign back rather than using inplace on a column selection.
ag_to_HDurban

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,ID_ag,ID_spam,grid_val,ID_LC,val,x,y,NSnomax,geometry,NN,NN_dist,ag_HD1F20,ag_HD2F20,ag_HD3F20
0,0,0,13941103,2344.0,2008851.0,0.1,2008851.0,-16.458988,12.208646,,POINT (-16.458987999999977 12.208646000000044),3507831609,397.198533,82.054823,83.755636,312.946111
1,1,1,13941122,2345.0,94605.0,0.1,94605.0,-16.375655,12.208646,,POINT (-16.375654999999938 12.208646000000044),3507831510,2881.578460,88.413319,90.114132,319.304608
2,2,2,13941017,2346.0,708413.0,0.1,708413.0,-16.208989,12.208646,,POINT (-16.208988999999974 12.208646000000044),6188134127,17711.673781,23.536822,25.237636,254.428111
3,3,3,13941036,2402.0,315719.0,0.1,315719.0,-16.208492,12.152815,,POINT (-16.20849199999998 12.152815000000032),3507831510,21562.215777,88.413319,90.114132,319.304608
4,4,4,13941044,2401.0,375288.0,0.1,375288.0,-16.294213,12.149487,,POINT (-16.294212999999957 12.149487000000022),3507831510,12874.077085,88.413319,90.114132,319.304608
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13941119,13941119,13941119,13941081,3.0,87412.0,0.1,87412.0,-15.291536,16.874047,,POINT (-15.291535999999951 16.874047000000076),3651042474,2658.421843,139785.574025,139786.528309,139792.211856
13941120,13941120,13941120,13941093,6.0,21660.0,0.1,21660.0,-15.292326,16.791961,,POINT (-15.292325999999946 16.79196100000007),3651042508,1234.146014,139777.423770,139778.378054,139784.061601
13941121,13941121,13941121,13941109,2.0,70993.0,0.1,70993.0,-15.374255,16.867753,,POINT (-15.374254999999948 16.86775300000005),3651042501,9172.299324,139782.918886,139783.873169,139789.556716
13941122,13941122,13941122,13941119,5.0,54822.0,0.1,54822.0,-15.375659,16.791961,,POINT (-15.375658999999928 16.79196100000007),3651042393,4280.425149,139769.388584,139770.342867,139776.026414


In [10]:
# mm for multi-modal: on-road time plus off-road time. The result is NaN wherever NSnomax is missing.
ag_to_HDurban['HD1mmF20'] = ag_to_HDurban['ag_HD1F20'] + ag_to_HDurban['NSnomax']

In [11]:
# Remove unnecessary data. These data are still saved in the _to_HDurban files.
ag_to_HDurban = ag_to_HDurban[['ID_ag', 'val', 'x', 'y', 'NSnomax', 'NN', 'HD1mmF20']]
ag_to_HDurban

Unnamed: 0,ID_ag,val,x,y,NSnomax,NN,HD1mmF20
0,13941103,2008851.0,-16.458988,12.208646,,3507831609,
1,13941122,94605.0,-16.375655,12.208646,,3507831510,
2,13941017,708413.0,-16.208989,12.208646,,6188134127,
3,13941036,315719.0,-16.208492,12.152815,,3507831510,
4,13941044,375288.0,-16.294213,12.149487,,3507831510,
...,...,...,...,...,...,...,...
13941119,13941081,87412.0,-15.291536,16.874047,,3651042474,
13941120,13941093,21660.0,-15.292326,16.791961,,3651042508,
13941121,13941109,70993.0,-15.374255,16.867753,,3651042501,
13941122,13941119,54822.0,-15.375659,16.791961,,3651042393,


In [12]:
print((ag_to_HDurban['HD1mmF20'] > 240).sum()) # Number of origins isolated by excessive travel time (over 240 minutes)
print(ag_to_HDurban['HD1mmF20'].isnull().sum()) # Number of origins isolated by inability to access a road
# The second value should match the non-flooded result (755160), because the cost-distance raster does not incorporate the flood penalties.

1163857
755160
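
The two counts above follow from how NaN behaves in pandas arithmetic and comparisons. The sketch below reproduces the logic on invented numbers; the column names `road_min` and `offroad_min` are placeholders for the road-network and cost-distance times.

```python
import numpy as np
import pandas as pd

# Toy origins: the third has no off-road path to a road node (-9999 placeholder).
origins = pd.DataFrame({
    "road_min": [30.0, 250.0, 45.0, 100.0],
    "offroad_min": [15.0, 20.0, -9999.0, 150.0],
})
origins["offroad_min"] = origins["offroad_min"].replace(-9999, np.nan)

# NaN propagates through addition, so unreachable origins stay NaN in the total.
origins["total_min"] = origins["road_min"] + origins["offroad_min"]

# Comparisons with NaN evaluate to False, so the two groups never overlap.
n_over = int((origins["total_min"] > 240).sum())
n_unreachable = int(origins["total_min"].isna().sum())
print(n_over, n_unreachable)
```

Because NaN never satisfies `> 240`, origins with no road access are counted only in the second figure.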


In [None]:
crs = "EPSG:4326"
geometry = [Point(xy) for xy in zip(ag_to_HDurban.x, ag_to_HDurban.y)]
ag_to_HDurban = GeoDataFrame(ag_to_HDurban, crs=crs, geometry=geometry) 
ag_to_HDurban.to_file(driver='ESRI Shapefile', filename='C:/Users/wb527163/GEO-Cdrive-Grace/SEN-Cdrive/ag_to_HDurban_mmF20.shp') 

In [None]:
hamlet_to_HDurban = os.path.join(out_pth, "hamlet_to_HDurban_flood20.csv")
hamlet_to_HDurban = pd.read_csv(hamlet_to_HDurban)
hamlet_to_HDurban.info()

In [None]:
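hamlet_to_HDurban['NSnomax'] = hamlet_to_HDurban['NSnomax'].replace(-9999, np.nan) # Treat -9999 as missing. Assign back rather than using inplace on a column selection.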
hamlet_to_HDurban

In [None]:
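# mm for multi-modal: on-road time plus off-road time.
# NSnomax is not among the hamlet columns written to hamlet_to_HDurban_flood20.csv above, so it must be joined before this cell runs.
hamlet_to_HDurban['HD1mmF20'] = hamlet_to_HDurban['ha_HD1F20'] + hamlet_to_HDurban['NSnomax']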

In [None]:
# Remove unnecessary data. These data are still saved in the _to_HDurban files.
hamlet_to_HDurban = hamlet_to_HDurban[['mgrs_code', 'GlobalID', 'NSnomax', 'NN', 'HD1mmF20', 'X', 'Y']]
hamlet_to_HDurban

In [40]:
list1 = list(hamlet_to_HDurbanF10['mgrs_code'].unique())
len(list1)

125886

In [43]:
crs = "EPSG:4326"
geometry = [Point(xy) for xy in zip(hamlet_to_HDurban.X, hamlet_to_HDurban.Y)]
hamlet_to_HDurban = GeoDataFrame(hamlet_to_HDurban, crs=crs, geometry=geometry) 
hamlet_to_HDurban.to_file(driver='ESRI Shapefile', filename='R:/SEN/GEO/Team/Projects/Sen_TransportOV/hamlet_to_HDurban_mmF20.shp') 

TypeError: Cannot interpret '<geopandas.array.GeometryDtype object at 0x000002ABCDBEFA00>' as a data type

In [16]:
ag_to_HDurban = ag_to_HDurban.drop(columns=['Unnamed: 0_x', 'Unnamed: 0_y'])
ag_to_HDurban.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 6852701 entries, 0 to 6852700
Data columns (total 9 columns):
 #   Column     Dtype  
---  ------     -----  
 0   ID         int64  
 1   LC_90m     int64  
 2   NN         int64  
 3   NN_dist    float64
 4   ag_HD1F10  float64
 5   ag_HD2F10  float64
 6   ag_HD3F10  float64
 7   X          float64
 8   Y          float64
dtypes: float64(6), int64(3)
memory usage: 522.8 MB


In [64]:
geometry = [Point(xy) for xy in zip(hamlet_to_HDurban.X, hamlet_to_HDurban.Y)]
crs = "EPSG:4326"
hamlet_to_HDurban = GeoDataFrame(hamlet_to_HDurban, crs=crs, geometry=geometry) 
hamlet_to_HDurban.to_file(driver='ESRI Shapefile', filename='SEN-Cdrive/outputs/hamlet_to_HDurban_flood20.shp') 

In [None]:
geometry = [Point(xy) for xy in zip(ag_to_HDurban.X, ag_to_HDurban.Y)]
crs = "EPSG:4326"
ag_to_HDurban = GeoDataFrame(ag_to_HDurban, crs=crs, geometry=geometry) 
ag_to_HDurban.to_file(driver='ESRI Shapefile', filename='SEN-Cdrive/outputs/ag_to_HDurban_flood20.shp') 

In [13]:
# Checking to make sure the ID fields are the same since the dataframes are sorted by different values.
print(LC_value['ID'].min())
print(LC_value['ID'].max())
print(ag_to_HDurban['ID'].min())
print(ag_to_HDurban['ID'].max())

17
7090381
17
7090381


In [33]:
# dif: difference between the two travel times (in minutes). Travel time is X minutes longer in flood conditions.
LC_value['HD1dif'] = LC_value['HD1F10mm'] - LC_value['HD1mm']
# pc: percent change. Travel time is X percent longer in flood conditions.
LC_value['HD1pc'] = LC_value['HD1dif'] / LC_value['HD1mm'] * 100
LC_value.head()

Unnamed: 0,ID,NSnomax,geom_UTM,HD1,HD1mm,HD1F10,X,Y,HD1F10mm,HD1dif,HD1pc
0,6851351,96.0536,POINT (308243.986 1391596.205),69.855921,165.909521,70.207052,-16.76351,12.58311,166.260652,0.351131,0.21164
1,6851352,33.9541,POINT (308596.067 1391593.845),51.759525,85.713625,52.110656,-16.76027,12.58311,86.064756,0.351131,0.409656
2,6851353,32.1907,POINT (308684.087 1391593.256),51.759525,83.950225,52.110656,-16.75946,12.58311,84.301356,0.351131,0.418261
3,6851354,31.1915,POINT (308771.020 1391592.675),51.759525,82.951025,52.110656,-16.75866,12.58311,83.302156,0.351131,0.423299
4,6851355,29.5983,POINT (308859.041 1391592.086),51.759525,81.357825,52.110656,-16.75785,12.58311,81.708956,0.351131,0.431588
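
The difference and percent-change columns can be wrapped in a small helper that also guards against a zero baseline. This is a sketch under assumptions: the function name `flood_delay` is hypothetical, and mapping a zero baseline to NaN is one possible convention, not something the notebook specifies.

```python
import numpy as np
import pandas as pd

def flood_delay(base, flood):
    """Absolute (minutes) and percent increase of flood travel time over baseline."""
    dif = flood - base
    # A zero baseline would give inf; map it to NaN so it is easy to filter out.
    pc = dif / base.replace(0, np.nan) * 100
    return pd.DataFrame({"dif": dif, "pc": pc})

base = pd.Series([100.0, 0.0, 50.0])   # non-flood multi-modal minutes (toy values)
flood = pd.Series([110.0, 5.0, 50.0])  # flood multi-modal minutes (toy values)
result = flood_delay(base, flood)
print(result)
```

Applied to the first row of the table above (165.909521 baseline, 166.260652 flooded), the same arithmetic gives a delay of about 0.35 minutes, or roughly 0.21 percent.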


In [34]:
crs = "EPSG:31028"
LC_value = GeoDataFrame(LC_value, crs=crs, geometry='geom_UTM') 
LC_value.to_file(driver='ESRI Shapefile', filename='R:/SEN/GEO/Team/Projects/Sen_TransportOV/LC_value/LC_value_mmF10.shp') 