<a href="https://colab.research.google.com/github/BayAreaMetro/mtc_wrangler/blob/main/momo_workshop/colab_Create_SF_network_from_OSM_GTFS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GMNS/Network Wrangler 2.0 Workshop - Building a San Francisco Network
---
This workshop workbook demonstrates how to use `network_wrangler` to create a base network from OpenStreetMap (OSM) and a regional GTFS feed.

In addition to the `network_wrangler` library, this notebook uses the script [`create_mtc_network_from_OSM.py`](https://github.com/BayAreaMetro/mtc_wrangler/blob/main/create_baseyear_network/create_mtc_network_from_OSM.py). The script also contains a main method and works as a standalone tool, which would be more practical for building an actual network; this demonstration uses a Python notebook only to illustrate the output of the major steps.


# 🏞 Setup Environment

## Fetch code from GitHub

Fetch [network_wrangler code](https://github.com/BayAreaMetro/network_wrangler/tree/centroids) and [network creation script](https://github.com/BayAreaMetro/mtc_wrangler/blob/main/create_baseyear_network/create_mtc_network_from_OSM.py).
Currently the network_wrangler code is the centroids branch of a BayAreaMetro fork, but this will be merged into [`network_wrangler`](https://github.com/network-wrangler/network_wrangler) after review.

In [None]:
!git clone https://github.com/BayAreaMetro/network_wrangler.git
!git clone https://github.com/BayAreaMetro/mtc_wrangler.git

Ensure `mtc_wrangler` code is up-to-date (we're fixing bugs!)

In [None]:
%cd /content/mtc_wrangler/
!git pull
print("Last commit for mtc_wrangler:")
!git log -1


Ensure `network_wrangler` is on the `centroids` branch and up-to-date. The package is installed in editable mode in the next section.

In [None]:
%cd /content/network_wrangler/
# check out the branch first so that the pull updates centroids
!git checkout centroids
!git pull
print("Last commit for network_wrangler on centroids branch:")
!git log -1

## Install python packages and test import
This includes:
* `scikit-learn` for nearest-neighbor searches to match transit to the roadway network,
* `pygris` to fetch county shapefiles,
* `mapclassify` for visualization,
* `us` for state lookup codes,
* and the local version of `network_wrangler`.

In [None]:
%%capture python_install_cap
!pip install scikit-learn pygris mapclassify us
# Install Tableau package if using Tableau for visualization
#!pip install tableauhyperapi
%cd /content/network_wrangler/
!pip install -e .

In [None]:
# run this if you want to see the output from the python install
print(python_install_cap)

In [None]:
import sys
print(sys.version)
# make sure we can import network_wrangler
import network_wrangler
import pprint
print(pprint.pformat(dir(network_wrangler)))

## 🚌 🚆 Fetch public GTFS input files from Google Drive


This is the regional transit feed for the San Francisco Bay Area, provided by [511 SF Bay’s Portal for Open Transit Data](https://511.org/open-data/transit). This was downloaded on October 2, 2024 via `http://api.511.org/transit/datafeeds?api_key=[my_api_key]&operator_id=RG&historic=2023-09`


In [None]:
%cd /content/
!mkdir BayArea_511gtfs_2023-09
%cd /content/BayArea_511gtfs_2023-09
!gdown 1wu-echoNNi5NzQh3BT4RwfnHYlUg8ZK5
!unzip BayArea_511gtfs_2023-09.zip
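
For reference, the download URL quoted above can be assembled from its query parameters. This is a minimal sketch that only builds the URL string and does not call the API; the function name `build_511_feed_url` and the placeholder key are illustrative assumptions, while the endpoint and parameter names are taken from the URL in the text.

```python
from urllib.parse import urlencode


def build_511_feed_url(api_key: str, operator_id: str = "RG", historic: str = "2023-09") -> str:
    """Assemble the 511 datafeeds URL for a historic regional GTFS feed.

    The helper name is hypothetical; the endpoint and query parameters
    mirror the URL quoted in this notebook.
    """
    base = "http://api.511.org/transit/datafeeds"
    query = urlencode({"api_key": api_key, "operator_id": operator_id, "historic": historic})
    return f"{base}?{query}"


# "MY_API_KEY" is a placeholder, not a working key
url = build_511_feed_url("MY_API_KEY")
print(url)
```
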

## Fetch TIGER county files from Google Drive

In [None]:
%cd /content
!mkdir tl_2010_us_county10
%cd /content/tl_2010_us_county10
!gdown 1vzmweK-QysWeVWBhNAC5R-pd4jfKlzV5
!unzip tl_2010_us_county10.zip

## Setup Logging

We have both info and debug logging. Info logs are high-level and will get reported to stdout, while debug logs are very detailed and will only be logged to the debug log file.

In [None]:
%cd /content/
from network_wrangler import WranglerLogger
import pathlib
info_log_file = pathlib.Path("create_SF_network_info.log")
debug_log_file = pathlib.Path("create_SF_network_debug.log")
network_wrangler.setup_logging(
    info_log_file,
    debug_log_file,
    std_out_level="info",
    file_mode="w"
  )

# We have custom loggers and we want to prevent their messages from being
# processed by the root logger's handlers (if any remain)
WranglerLogger.propagate = False

# this one will just go to the debug file
WranglerLogger.debug("Debug test")
# this will go to the console (stdout) and the info & debug files
WranglerLogger.info("Info test")

In [None]:
!tail /content/create_SF_network_debug.log
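
The split between info and debug output can be pictured with the standard `logging` module alone. The sketch below is not how `setup_logging` is implemented (its internals are not shown here); it only assumes the same idea: one logger, a console handler that filters at INFO, and a file handler that keeps everything from DEBUG up. The names `make_two_level_logger` and `sketch_debug.log` are hypothetical.

```python
import logging
import sys
import tempfile
from pathlib import Path


def make_two_level_logger(name: str, debug_path: Path) -> logging.Logger:
    """Return a logger that prints INFO+ to stdout and writes DEBUG+ to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)   # let every record reach the handlers
    logger.propagate = False         # keep records away from the root logger
    logger.handlers.clear()          # avoid duplicate handlers on notebook re-runs

    console = logging.StreamHandler(sys.stdout)
    console.setLevel(logging.INFO)   # console only shows high-level messages
    logger.addHandler(console)

    debug_file = logging.FileHandler(debug_path, mode="w")
    debug_file.setLevel(logging.DEBUG)  # file keeps the detailed messages too
    logger.addHandler(debug_file)
    return logger


debug_path = Path(tempfile.mkdtemp()) / "sketch_debug.log"
sketch_logger = make_two_level_logger("logging_sketch", debug_path)
sketch_logger.debug("Debug test")  # file only
sketch_logger.info("Info test")    # console and file

debug_log_lines = debug_path.read_text().splitlines()
print(debug_log_lines)
```

Setting the level on each handler rather than on the logger is what lets a single call such as `sketch_logger.debug(...)` reach the file while staying off the console.
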

## Create output directory and import script code

In [None]:
%cd /content/mtc_wrangler/create_baseyear_network
from create_mtc_network_from_OSM import (
  get_travel_model_zones,
  step1_download_osm_network,
  stepa_standardize_attributes,
  step2_simplify_network_topology,
  step3_assign_county_node_link_numbering,
  step4_add_centroids_and_connectors,
  step5_prepare_gtfs_transit_data,
  step6_create_transit_network
)
%cd /content/mtc_wrangler/momo_workshop
from visualization import *

OUTPUT_DIR = pathlib.Path("/content/output_SF_OSM")
OUTPUT_DIR.mkdir(exist_ok=True)

## Fetch travel model zone shapefiles
These are being developed here: https://github.com/BayAreaMetro/tm2py-utils/tree/main/tm2py_utils/inputs/maz_taz

In [None]:
travel_model_zones = get_travel_model_zones(OUTPUT_DIR)

# 🏗 Build the network!
---


## Step 1: Download OSM network data

This downloads the OpenStreetMap data for San Francisco County using [osmnx](https://osmnx.readthedocs.io/).

In [None]:
# Download the OSM network data for San Francisco county
# This can take a couple of minutes
osm_g = step1_download_osm_network("San Francisco", OUTPUT_DIR)
print(type(osm_g))

In [None]:
# quick plot of the network graph
fig, ax = create_osmnx_plot(osm_g)

## Step 1a: Standardize attributes for the roadway network
The standardized network is written to disk only if output formats are specified.

In [None]:
links_unsimplified_gdf, nodes_unsimplified_gdf = stepa_standardize_attributes(
    osm_g, "San Francisco", "1a_original_", OUTPUT_DIR, []
  )

## Step 2: Simplify network topology
This consolidates intersections while preserving connectivity.

In [None]:
simplified_g = step2_simplify_network_topology(osm_g, "San Francisco", OUTPUT_DIR)
print(type(simplified_g))
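
To give a feel for what consolidating intersections while preserving connectivity means, here is a toy sketch in plain Python. It is an assumption-laden illustration, not the logic inside `step2_simplify_network_topology`: nodes closer than a tolerance are merged with a union-find, merged nodes are placed at the mean of their members, and links are remapped so that anything connected before is still connected afterwards. The function name, node ids, and planar coordinates are all hypothetical.

```python
from itertools import combinations
from math import dist


def consolidate_nodes(nodes, edges, tolerance):
    """Merge nodes within `tolerance` of each other and remap the edges.

    nodes: {node_id: (x, y)} in planar units; edges: iterable of (u, v).
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the cluster representative (with path halving)
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Union every pair of nodes that sit within the tolerance
    for a, b in combinations(nodes, 2):
        if dist(nodes[a], nodes[b]) <= tolerance:
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)

    # Place each merged node at the mean position of its members
    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), []).append(n)
    merged_nodes = {
        root: (
            sum(nodes[m][0] for m in members) / len(members),
            sum(nodes[m][1] for m in members) / len(members),
        )
        for root, members in clusters.items()
    }

    # Remap edges, dropping self-loops and duplicates created by the merge
    merged_edges = sorted({(find(u), find(v)) for u, v in edges if find(u) != find(v)})
    return merged_nodes, merged_edges


# Nodes 1-3 form one complex intersection; 4 and 5 are ordinary nodes
toy_nodes = {1: (0, 0), 2: (1, 0), 3: (0, 1), 4: (100, 0), 5: (100, 100)}
toy_edges = [(1, 2), (2, 3), (3, 4), (1, 4), (4, 5)]
merged_nodes, merged_edges = consolidate_nodes(toy_nodes, toy_edges, tolerance=2)
print(merged_nodes)
print(merged_edges)
```

The pairwise comparison is quadratic in the number of nodes, which is fine for a toy example but would need a spatial index on a county-sized network.
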

Summarize some of the differences between the original and simplified networks.

In [None]:
compare_original_and_simplified_networks(osm_g, simplified_g)
# Plot node degree changes original vs simplified
plot_node_degree_changes(osm_g, simplified_g)
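
A node-degree comparison of this kind can be sketched without any graph library. The helper below is hypothetical and is not the code behind `plot_node_degree_changes`; it just counts how many nodes have each degree in a toy edge list before and after an interstitial node is removed.

```python
from collections import Counter


def degree_histogram(edges):
    """Return {degree: number_of_nodes} for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())


# Toy network: "b" is a mid-block node with exactly two neighbors
original_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e")]
# After simplification, "b" is removed and a-c becomes a single link
simplified_edges = [("a", "c"), ("c", "d"), ("c", "e")]

original_hist = degree_histogram(original_edges)
simplified_hist = degree_histogram(simplified_edges)
print(dict(original_hist))
print(dict(simplified_hist))
```

In this toy case the only change is that the single degree-2 node disappears, while the dead ends and the true intersection keep their degrees.
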

## Step 2a: Standardize attributes for the roadway network
The standardized network is written to disk only if output formats are specified.

In [None]:
links_gdf, nodes_gdf = stepa_standardize_attributes(
    simplified_g, "San Francisco",
    prefix="2a_simplified",
    output_dir=OUTPUT_DIR,
    output_formats=["geojson"]
  )


In [None]:
# clip to smaller area for visualization
orig_links_gdf_clip, links_gdf_clip = clip_original_and_simplified_links(links_unsimplified_gdf, links_gdf, travel_model_zones["TAZ"])

In [None]:
map_original_and_simplified_links(orig_links_gdf_clip, links_gdf_clip)

## Step 3: Assign county-specific numbering and create [RoadwayNetwork](https://bayareametro.github.io/network_wrangler/main/api/#network_wrangler.roadway.network.RoadwayNetwork) instance.
This also drops the OSM columns that have been translated into standard columns and writes the roadway network.

In [None]:
roadway_network = step3_assign_county_node_link_numbering(
    links_gdf, nodes_gdf,
    county="San Francisco",
    output_dir=OUTPUT_DIR,
    output_formats=["geojson"]
)
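
County-specific numbering can be thought of as giving each county its own block of ids and renumbering nodes into that block. The sketch below illustrates the idea only: the block starts in `HYPOTHETICAL_NODE_START` are invented for this example and are not the ranges used by `step3_assign_county_node_link_numbering`, and the OSM-style node ids are made up.

```python
# Invented id blocks for illustration; the real ranges are defined by the script
HYPOTHETICAL_NODE_START = {"San Francisco": 1_000_000, "San Mateo": 1_500_000}


def renumber_for_county(node_ids, links, county):
    """Map arbitrary node ids to sequential ids inside the county's block."""
    start = HYPOTHETICAL_NODE_START[county]
    # Sort first so the mapping is reproducible from run to run
    new_id = {old: start + offset for offset, old in enumerate(sorted(node_ids))}
    # Link endpoints must follow their nodes to the new ids
    new_links = [(new_id[a], new_id[b]) for a, b in links]
    return new_id, new_links


osm_node_ids = [65303561, 65281835, 65290756]
osm_links = [(65281835, 65290756), (65290756, 65303561)]
new_id, new_links = renumber_for_county(osm_node_ids, osm_links, "San Francisco")
print(new_id)
print(new_links)
```
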

## Step 4: Add centroids and centroid connectors
This modifies the roadway_network in place.

In [None]:
step4_add_centroids_and_connectors(
    roadway_network,
    county="San Francisco",
    output_dir=OUTPUT_DIR,
    output_formats=["geojson"]
)
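
Conceptually, a centroid is a point that represents a zone, and centroid connectors are links that tie it to nearby roadway nodes. The following is a minimal sketch under stated assumptions, not the algorithm in `step4_add_centroids_and_connectors`: it uses planar coordinates, the area-weighted (shoelace) centroid of a simple polygon, and connectors to the `k` nearest nodes in both directions. All names and values are hypothetical.

```python
from math import dist


def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = cx = cy = 0.0
    # Walk each edge, pairing every vertex with the next one (wrapping around)
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross            # accumulates twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


def connect_centroid(centroid_xy, nodes, k=2, centroid_id="centroid"):
    """Create two-way connector links from the centroid to its k nearest nodes."""
    nearest = sorted(nodes, key=lambda n: dist(centroid_xy, nodes[n]))[:k]
    links = []
    for n in nearest:
        links.append((centroid_id, n))  # trips leaving the zone
        links.append((n, centroid_id))  # trips entering the zone
    return links


zone = [(0, 0), (4, 0), (4, 4), (0, 4)]
road_nodes = {10: (0, 0), 11: (3, 2), 12: (2, 5), 13: (10, 10)}
centroid_xy = polygon_centroid(zone)
connectors = connect_centroid(centroid_xy, road_nodes, k=2)
print(centroid_xy)
print(connectors)
```
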

In [None]:
# Let's take a look
# bbox_name arg can specify clipping area (one of 'SF_downtown', 'SF_financial_district', 'SF_mission')
# If none passed, will show SF
create_roadway_network_map(roadway_network.links_df, bbox_name="SF_downtown")

## Step 5: Prepare GTFS transit data
This reads the feed and filters it to the service date and the relevant operators, creating a [GtfsModel](https://bayareametro.github.io/network_wrangler/main/api_transit/#network_wrangler.models.gtfs.gtfs.GtfsModel) instance.

The output is a bit noisy because many Bay Area operators get dropped. It may have been more prudent to just add SF Muni 🚍, BART 🚝 and Caltrain 🚈 directly, but the underlying code was written to scale for the region.


In [None]:
gtfs_model = step5_prepare_gtfs_transit_data(
    county="San Francisco",
    input_gtfs=pathlib.Path("/content/BayArea_511gtfs_2023-09"),
    output_dir=OUTPUT_DIR
)
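
Filtering a feed to a service date usually starts from `calendar.txt`, whose columns (`service_id`, the seven weekday flags, `start_date`, `end_date`) are defined by the GTFS specification. The sketch below shows that one piece only, using made-up rows; it ignores `calendar_dates.txt` exceptions and operator filtering, and it is not the code inside `step5_prepare_gtfs_transit_data`.

```python
import csv
import io
from datetime import date

# Made-up calendar.txt content for illustration
CALENDAR_TXT = """service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
WKDY,1,1,1,1,1,0,0,20230801,20231231
SAT,0,0,0,0,0,1,0,20230801,20231231
OLD,1,1,1,1,1,0,0,20220101,20221231
"""

WEEKDAY_COLUMNS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def active_service_ids(calendar_csv: str, service_date: date) -> list[str]:
    """Return the service_ids that run on service_date according to calendar.txt."""
    day_column = WEEKDAY_COLUMNS[service_date.weekday()]  # Monday is 0
    yyyymmdd = service_date.strftime("%Y%m%d")            # GTFS dates are YYYYMMDD strings
    return [
        row["service_id"]
        for row in csv.DictReader(io.StringIO(calendar_csv))
        if row[day_column] == "1" and row["start_date"] <= yyyymmdd <= row["end_date"]
    ]


weekday_services = active_service_ids(CALENDAR_TXT, date(2023, 9, 27))   # a Wednesday
saturday_services = active_service_ids(CALENDAR_TXT, date(2023, 9, 30))  # a Saturday
print(weekday_services, saturday_services)
```

Trips whose `service_id` is not in the returned list would then be dropped, along with the stop times and shapes that only those trips reference.
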

## Step 6: Create TransitNetwork by integrating GtfsModel with RoadwayNetwork
This updates the tables in the [GtfsModel](https://bayareametro.github.io/network_wrangler/main/api_transit/#network_wrangler.models.gtfs.gtfs.GtfsModel) instance so they're wrangler-flavored, where nodes refer to the [RoadwayNetwork](https://bayareametro.github.io/network_wrangler/main/api/#network_wrangler.roadway.network.RoadwayNetwork) instance. For bus routes, this means "snapping" stops to existing nodes; for other types of transit, this means creating transit-specific nodes and links in the roadway network. With this done, a Wrangler-flavored [Feed](https://bayareametro.github.io/network_wrangler/main/api_transit/#network_wrangler.transit.feed.feed.Feed) instance can be created and incorporated into a [TransitNetwork](https://bayareametro.github.io/network_wrangler/main/api/#network_wrangler.transit.network.TransitNetwork) instance.


In [None]:
transit_network, shape_links_gdf = step6_create_transit_network(
    gtfs_model,
    roadway_network,
    county="San Francisco",
    output_dir=OUTPUT_DIR,
    output_formats=["geojson"]
)
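
"Snapping" a bus stop amounts to a nearest-neighbor search from the stop to the roadway nodes. The sketch below uses a brute-force haversine search in plain Python to keep it self-contained; a tree-based search such as the `scikit-learn` one mentioned in the install section would scale better. The function names, the 100 m cutoff, and the stop and node coordinates are assumptions for illustration, not values from `step6_create_transit_network`.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))


def snap_stops(stops, nodes, max_distance_m=100.0):
    """Map each stop to its nearest node, or None if nothing is close enough."""
    snapped = {}
    for stop_id, (lat, lon) in stops.items():
        node_id, distance = min(
            ((n, haversine_m(lat, lon, nlat, nlon)) for n, (nlat, nlon) in nodes.items()),
            key=lambda pair: pair[1],
        )
        snapped[stop_id] = node_id if distance <= max_distance_m else None
    return snapped


# Hypothetical roadway nodes and stops as (latitude, longitude)
toy_nodes = {1: (37.7749, -122.4194), 2: (37.7793, -122.4193), 3: (37.7955, -122.3937)}
toy_stops = {"A": (37.7750, -122.4195), "B": (37.7950, -122.3940), "C": (37.9000, -122.3000)}
snapped = snap_stops(toy_stops, toy_nodes)
print(snapped)
```

Stop `C` is left unmatched because no node lies within the cutoff; a real pipeline has to decide whether such stops are dropped, flagged, or given new nodes.
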

In [None]:
# Visualize roadway network with transit links
# bbox_name arg can specify clipping area (one of 'SF_downtown', 'SF_financial_district', 'SF_mission')
# If none passed, will show SF
create_roadway_network_map(roadway_network.links_df, bbox_name="SF_downtown")

In [None]:
# Now we're ready to view the transit shape links. It's overwhelming to include all lines in the map, so choose some
# single quotes inside the f-string keep this valid on Python versions before 3.12
print(f"Route ids: {shape_links_gdf['route_id'].unique()}")
print(f"{type(shape_links_gdf)=}")
print(f"shape_links_gdf.dtypes=\n{shape_links_gdf.dtypes}")
# Again, bbox_name arg can specify clipping area (one of 'SF_downtown', 'SF_financial_district', 'SF_mission')
create_roadway_transit_map(
    roadway_network.links_df,
    shape_links_gdf,
    route_ids=["SF:30","SF:45"],
    bbox_name="SF_downtown"
)

## Step 7: Create [Scenario](https://bayareametro.github.io/network_wrangler/main/api/#network_wrangler.scenario.Scenario)

The scenario is now ready for project cards to be applied, so a future scenario (and alternative scenarios) can be created.

We didn't have time to create project cards for this network, so Sijia will demo project card creation and application from work done for UDOT.


In [None]:
my_scenario = network_wrangler.scenario.create_scenario(
    base_scenario={
        "road_net": roadway_network,
        "transit_net": transit_network,
        "applied_projects": [],
        "conflicts": {}
    },
)

# write it to disk
scenario_dir = OUTPUT_DIR / "7_wrangler_scenario"
scenario_dir.mkdir(exist_ok=True)
my_scenario.write(
    path=scenario_dir,
    name="mtc_2023",
    roadway_file_format="geojson",
    roadway_true_shape=True
)
WranglerLogger.info(f"Wrote scenario to {scenario_dir}")
