# Load OSMNx graph for Kampala

## Introduction

### Network graphs

Review network graphs by working through the 1854 Cholera Outbreak Advanced 1 notebooks: https://github.com/PHI-Case-Studies/1854-Cholera-Outbreak-London-Advanced-1

### Resource stewardship

<img src="images/sustainability.jpeg" alt="sustainability" width="600"/>

In this workshop you will experience what it means to work with street networks. You will download the street network graphs for Kampala, each about 4-6 GB in size, while working in a Jupyter environment shared with other users. The resources (CPU, RAM and disk space) behind this Jupyter service are limited, so we should conserve them by doing the following:
1. Keep one notebook open at a time. The notebooks are meant to be run in sequence, so shut down earlier notebooks once you have finished with them: close the tab, then right-click the notebook in the file tree on your left and shut down its kernel.
2. This notebook (Notebook 1) loads two graphs (projected and unprojected) and saves them to disk immediately after loading. When you set `download_graph` to `True`, it uses the Overpass API, a "free" service, to download the Kampala graphs to your notebook server. If local copies of these graphs are available in your shared folder, copy them to your data folder and use them instead. This helps keep the Overpass API available for new graph download requests.
3. Watch your memory consumption on the status bar below (where it says `Mem: xx GB`). If you are consuming more than 5-6 GB, shut down other notebooks so that only one notebook is running at a time (see #1).
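
Before downloading, it can help to confirm that the notebook server has room for the graph files. The following is a minimal sketch using only the standard library. The 10 GB threshold and the `free_disk_gb` helper are assumptions chosen for illustration, not workshop requirements.

```python
import shutil

# Assumed threshold for illustration only; adjust it to your environment.
MIN_FREE_GB = 10


def free_disk_gb(path="."):
    """Return the free disk space, in GB, on the filesystem containing `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024**3


free_gb = free_disk_gb()
if free_gb < MIN_FREE_GB:
    print(f"Only {free_gb:.1f} GB free; consider cleaning up before downloading.")
else:
    print(f"{free_gb:.1f} GB free.")
```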

## Load OSMNx Graph

### Configure OSMNx

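The cell below reads the Overpass API endpoint URLs from `overpass-api.csv`, then configures OSMNx to log to the console and to a log file, to cache responses, and to use the endpoint labelled `main`.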

In [1]:
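import csv

import osmnx as ox

# Read endpoint labels and URLs from a two-column CSV (label, url).
with open('overpass-api.csv', mode='r', newline='') as infile:
    reader = csv.reader(infile)
    overpass_api = {row[0]: row[1] for row in reader}

# Log to console and file, cache responses, and use the 'main' endpoint.
ox.config(
    log_console=True,
    use_cache=True,
    log_file=True,
    overpass_endpoint=overpass_api['main']
)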
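
The dictionary comprehension above expects `overpass-api.csv` to hold one label and one URL per row, with no header row. The file's actual contents are not shown in this notebook, so the rows in the sketch below are assumptions for illustration. Only the `main` label is confirmed by the code above, and the `backup` row and its URL are hypothetical.

```python
import csv
import io

# Hypothetical file contents: a label in column 1, an endpoint URL in column 2.
sample_csv = (
    "main,https://overpass-api.de/api\n"
    "backup,https://example.org/api\n"
)

# io.StringIO stands in for the real file so the sketch runs on its own.
reader = csv.reader(io.StringIO(sample_csv))

# Skip blank or incomplete rows so a trailing newline cannot raise IndexError.
overpass_api = {row[0]: row[1] for row in reader if len(row) >= 2}

print(overpass_api["main"])
```

Filtering on `len(row) >= 2` is a small defensive choice: a stray empty line in the CSV would otherwise break the lookup.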

### Use a Bounding Box

As responsible netizens, we will use only the resources we need. In this section we do two things to that end:

1. We will use a bounding box to limit the size of the graph, and hence the amount of data to request from the Overpass API.
2. We will use the saved graph to avoid a trip to the Overpass API. 

If you are running the notebooks for the first time, set `download_graph` to `True`. Loading the graph may take 10-15 minutes.
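
One way to choose the flag's value is to check whether both saved graph files already exist. This is a sketch rather than part of the workshop code. The file paths are the ones this notebook's save and load calls use, and `needs_download` is a hypothetical helper.

```python
from pathlib import Path

# Paths used by this notebook's save and load calls.
GRAPH_FILES = [
    Path("data/g_unprojected.graphml"),
    Path("data/g_projected.graphml"),
]


def needs_download(paths):
    """Return True if any of the saved graph files is missing."""
    return not all(p.is_file() for p in paths)


# Only go to the Overpass API when a local copy is missing.
download_graph = needs_download(GRAPH_FILES)
print(f"download_graph = {download_graph}")
```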

**Optional:** If you want to track how your notebook server is interacting with OSMNx and the Overpass API, launch a terminal (from the JupyterLab Launcher) and type the following on the command line:
1. `cd notebooks/2021-HELINA-COVID-19-OSMNx-Workshop/logs`
2. `ls -la`
3. Look for today's log file (`.log` extension, with the current date in the file name) and copy the file name (highlight it with your mouse and press `Ctrl-C`).
4. Follow the log with `tail -f`, for example: `tail -f osmnx_2021-10-17.log` (see the sample log screenshot below).


<img src="images/logs.png" alt="sample log" width="800"/>
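
If you prefer to stay inside the notebook, the last few log lines can also be read with Python. The sketch below assumes the log file name follows the pattern of the example above (`osmnx_2021-10-17.log`) and that the `logs` folder sits next to the notebook. `log_name_for` and `last_lines` are hypothetical helpers.

```python
from collections import deque
from datetime import date
from pathlib import Path


def log_name_for(day):
    """Build a log file name for a date, following the example osmnx_2021-10-17.log."""
    return f"osmnx_{day:%Y-%m-%d}.log"


def last_lines(path, n=10):
    """Return the last `n` lines of a text file, or an empty list if it is missing."""
    path = Path(path)
    if not path.is_file():
        return []
    with path.open("r", encoding="utf-8", errors="replace") as f:
        # A bounded deque keeps only the final n lines while streaming the file.
        return list(deque(f, maxlen=n))


# Assumed location of the logs folder, relative to the notebook.
log_path = Path("logs") / log_name_for(date.today())
for line in last_lines(log_path):
    print(line, end="")
```

Unlike `tail -f`, this reads the file once and does not follow new entries as they arrive.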

In [2]:
%%time
# Set to False to reuse the saved GraphML files instead of querying the Overpass API.
download_graph = True

if download_graph:
    # Bounding box order: north, south, east, west (decimal degrees).
    G = ox.graph_from_bbox(0.408513, 0.218915, 32.666921, 32.509538, network_type='all', simplify=False)
    # Edge lengths are great-circle distances, so compute them before projecting.
    G = ox.distance.add_edge_lengths(G, precision=3)
    G_proj = ox.project_graph(G)
    G_proj = ox.speed.add_edge_speeds(G_proj, precision=3)
    G_proj = ox.speed.add_edge_travel_times(G_proj, precision=3)
    ox.save_graphml(G, 'data/g_unprojected.graphml')
    ox.save_graphml(G_proj, 'data/g_projected.graphml')
else:
    G = ox.load_graphml('data/g_unprojected.graphml')
    G_proj = ox.load_graphml('data/g_projected.graphml')

CPU times: user 8min 1s, sys: 19.1 s, total: 8min 20s
Wall time: 8min 21s


Upon running the notebook for the first time and with `download_graph` set to `True`, we did the following in the preceding code cell:

1. We set the value of `download_graph` so we can either download a fresh graph from the Overpass API or reuse an existing saved graph.
2. We downloaded the OSMNx graph for Kampala using a bounding box.
3. We projected the graph so we can do distance measurements.
4. We added some graph features (e.g., lengths, speeds, travel times) so we can use them later in graph computation steps.
5. We saved the graph as GraphML so we can reuse the graph later.
6. We timed the execution of code for the cell.
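
To get a feel for how much ground the bounding box in step 2 covers, its sides can be estimated with the haversine formula. This is a rough sketch using only the standard library. It assumes the four numbers passed to `graph_from_bbox` are in north, south, east, west order, which is consistent with their values. `haversine_km` is a hypothetical helper.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Bounding box from the download cell, assumed order: north, south, east, west.
north, south, east, west = 0.408513, 0.218915, 32.666921, 32.509538

mid_lat = (north + south) / 2
height_km = haversine_km(south, west, north, west)    # along the western edge
width_km = haversine_km(mid_lat, west, mid_lat, east)  # along the middle latitude

print(f"height ~ {height_km:.1f} km, width ~ {width_km:.1f} km")
print(f"area ~ {height_km * width_km:.0f} km^2")
```

The box is on the order of 20 km per side, which is far smaller than a request for the whole country would be.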

OSMNx References:
1. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.graph.graph_from_bbox
2. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.projection.project_graph
3. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.distance.add_edge_lengths
4. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.speed.add_edge_speeds
5. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.speed.add_edge_travel_times
6. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.io.save_graphml
7. https://osmnx.readthedocs.io/en/stable/osmnx.html#osmnx.io.load_graphml

## Graph Projections

In [3]:
g_crs = G.graph['crs']

g_crs

'epsg:4326'

In [4]:
g_crs_prj = G_proj.graph['crs']

g_crs_prj

<Projected CRS: +proj=utm +zone=36 +ellps=WGS84 +datum=WGS84 +unit ...>
Name: unknown
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- undefined
Coordinate Operation:
- name: UTM zone 36N
- method: Transverse Mercator
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In the code cells above we obtained the CRS values of the two graphs. What do you notice that's different about the two CRS values?

### Obtain EPSG code for the projected graph

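What is EPSG?

EPSG codes are numeric identifiers from the EPSG Geodetic Parameter Dataset, a public registry of coordinate reference systems originally compiled by the European Petroleum Survey Group. For example, EPSG:4326 identifies the WGS 84 latitude/longitude system used by the unprojected graph.

The projected graph and the geodataframes we will create for residence centroids and testing facilities must use the same planar projection. Below we read the projected graph's CRS and look up the corresponding EPSG authority code for use in Notebooks 2 and 3.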

In [5]:
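import yaml
from pyproj import CRS

# Resolve the projected graph's CRS to an (authority, code) pair.
auth, crs = CRS.from_string(str(g_crs_prj)).to_authority()
crs_dict = {auth: crs}

print(crs_dict)

# Save the code so the later notebooks can reuse the same projection.
with open('proj_crs.yml', 'w') as outfile:
    yaml.dump(crs_dict, outfile, default_flow_style=False)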

{'EPSG': '32636'}


In [6]:
# Read the saved authority and code back from disk.
with open("proj_crs.yml", "r") as stream:
    epsg_dict = yaml.safe_load(stream)

# Join the single authority/code pair as 'AUTHORITY:CODE'.
authority, code = next(iter(epsg_dict.items()))
proj_epsg_str = f"{authority}:{code}"

proj_epsg_str

'EPSG:32636'

Reference for 32636: https://epsg.io/32636
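
The code 32636 follows a regular pattern: WGS 84 / UTM codes are 32600 plus the zone number in the northern hemisphere and 32700 plus the zone number in the southern hemisphere. The sketch below derives the code from a longitude and latitude. `utm_epsg` is a hypothetical helper, the centre coordinates are approximated from the bounding box used earlier, and the special zone exceptions around Norway and Svalbard are ignored.

```python
from math import floor


def utm_epsg(lon, lat):
    """Return the EPSG code of the WGS 84 / UTM zone containing a lon/lat point.

    Ignores the special zone exceptions around Norway and Svalbard.
    """
    zone = floor((lon + 180) / 6) + 1
    zone = min(zone, 60)  # lon == 180 would otherwise give zone 61
    # 326xx codes are northern hemisphere zones, 327xx are southern.
    base = 32600 if lat >= 0 else 32700
    return base + zone


# Approximate centre of the bounding box used in this notebook.
kampala_lon, kampala_lat = 32.588, 0.314
print(utm_epsg(kampala_lon, kampala_lat))  # prints 32636
```

This agrees with the `UTM zone 36N` line in the projected CRS output above.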

## Graph Statistics

In [7]:
%%time
G_stats_dict = ox.stats.basic_stats(G, area=None, clean_int_tol=None, clean_intersects=None, tolerance=None, circuity_dist=None)

G_stats_dict

CPU times: user 2min 58s, sys: 4.23 s, total: 3min 3s
Wall time: 3min 2s


{'n': 214995,
 'm': 438476,
 'k_avg': 4.078941370729551,
 'edge_length_total': 8415315.189000037,
 'edge_length_avg': 19.192191109661728,
 'streets_per_node_avg': 2.0707969952789598,
 'streets_per_node_counts': {0: 0,
  1: 15991,
  2: 169598,
  3: 27621,
  4: 1767,
  5: 15,
  6: 3},
 'streets_per_node_proportions': {0: 0.0,
  1: 0.0743784739179981,
  2: 0.7888462522384242,
  3: 0.12847275518035303,
  4: 0.00821879578594851,
  5: 6.976906439684643e-05,
  6: 1.3953812879369287e-05},
 'intersection_count': 199004,
 'street_length_total': 4290415.052999885,
 'street_segment_count': 222473,
 'street_length_avg': 19.285104498073405,
 'circuity_avg': 1.0000000300019478,
 'self_loop_proportion': 0.0}
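
Several of the values above can be recomputed from the raw counts, which helps in reading the output. The sketch below copies the numbers from the output and checks the relationships that appear to hold between them. It is an interpretation verified only against this output, not a statement of how OSMNx computes the statistics internally.

```python
from math import isclose

# Values copied from the basic_stats output above.
n = 214995                      # number of nodes
m = 438476                      # number of edges
edge_length_total = 8415315.189000037
street_length_total = 4290415.052999885
street_segment_count = 222473
streets_per_node_counts = {0: 0, 1: 15991, 2: 169598, 3: 27621, 4: 1767, 5: 15, 6: 3}

# Average node degree: every edge contributes to the degree of two nodes.
k_avg = 2 * m / n

# Average lengths are totals divided by counts.
edge_length_avg = edge_length_total / m
street_length_avg = street_length_total / street_segment_count

# Nodes where more than one street meets (dead ends have exactly one).
intersection_count = sum(c for k, c in streets_per_node_counts.items() if k > 1)

# Weighted mean of streets per node.
streets_per_node_avg = sum(k * c for k, c in streets_per_node_counts.items()) / n

print(k_avg, edge_length_avg, street_length_avg)
print(intersection_count, streets_per_node_avg)

# Compare against the reported values.
assert isclose(k_avg, 4.078941370729551, rel_tol=1e-6)
assert intersection_count == 199004
```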

In [None]:
%%time
G_proj_stats_dict = ox.stats.basic_stats(G_proj, area=None, clean_int_tol=None, clean_intersects=None, tolerance=None, circuity_dist=None)

G_proj_stats_dict

## Data Directory

In [None]:
!ls -la data
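
The same listing can be produced in Python, which makes it easy to sort by size and spot the large graph files. This is a minimal sketch using only the standard library, and `list_files_with_sizes` is a hypothetical helper.

```python
from pathlib import Path


def list_files_with_sizes(folder="data"):
    """Return (name, size in MB) pairs for regular files in `folder`, largest first."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    files = [
        (p.name, p.stat().st_size / 1024**2)
        for p in folder.iterdir()
        if p.is_file()
    ]
    return sorted(files, key=lambda item: item[1], reverse=True)


for name, size_mb in list_files_with_sizes("data"):
    print(f"{size_mb:10.1f} MB  {name}")
```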

## Further Reading

1. Notebook Shortcuts: https://towardsdatascience.com/jypyter-notebook-shortcuts-bf0101a98330
2. OSMNx Graph Statistics: https://osmnx.readthedocs.io/en/stable/internals.html?highlight=plot#osmnx-stats-module
3. 1854 Cholera Outbreak Advanced 1 GitHub repository: https://github.com/PHI-Case-Studies/1854-Cholera-Outbreak-London-Advanced-1