## Oslo Bus Network: Shortest Path and Redundancy Analysis

This notebook builds a directed **bus network** for Oslo from GTFS data,  
computes the **fastest path** between two hubs (the Oslo central hub and the Skøyen hub),  
and analyses the **robustness** of that path by removing stops and edges.

The pipeline:

1. Load CSV files.
2. Prepare and clean **bus edges**.
3. Filter **bus stops** and keep bus edges only.
4. Build a **directed multigraph** where:
   - nodes = stops,
   - edges = direct bus connections between two stops, with weight = travel time (seconds).
5. Create two **hubs** combining multiple physical stops:
   - OSLO HUB: Oslo S, bus terminal, Jernbanetorget area;
   - SKOYEN HUB: Skøyen area.
6. Compute the **shortest-time path** between the hubs using Dijkstra.
7. Run node-level and edge-level experiments:
   - remove one internal stop at a time,
   - remove two consecutive internal stops,
   - remove two random internal stops,
   - remove the edges of the whole current path and search for the next path.
8. Run zone-level experiments, where all quays that share a stop name are removed together.
In [1]:
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
from networkx.exception import NetworkXNoPath
import random

### 1. Load input CSV data

We load two CSV files:

- **nodes_GTFS_OSLO.csv**  
  Each row is a stop with:
  - `id`, `stopPlaceId`, `name`, `lat`, `lon`, `modes`, `stopType`.

- **edges_GTFS_OSLO_with_mondayTrips.csv**  
  Each row is a directed connection between two stops with:
  - `from`, `to`, `lineId`, `lineCode`, `mode`,
  - `authority`,  
  - `travelTimeSec`,
  - `tripsInFeed`, `tripsOn2025_11_17` - frequency info.

In [2]:
nodes = pd.read_csv("../../Extraction/out/nodes_GTFS_OSLO.csv")
edges = pd.read_csv("../../Extraction/out/edges_GTFS_OSLO_with_mondayTrips.csv")

### 2. Prepare bus edges

We first filter the `edges` to keep **only rows where `mode == "bus"`**.  
All further analysis is performed on these bus edges.

In [3]:
# Prepare edges
# Filter Bus edges only
bus_edges = edges[edges["mode"] == "bus"].copy()

print("Bus edges shape:", bus_edges.shape)
print("Bus edges dtypes:")
print(bus_edges.dtypes)

print("\nMissing values in raw bus_edges:")
print(bus_edges.isna().sum())

Bus edges shape: (3896, 9)
Bus edges dtypes:
from                  object
to                    object
lineId                object
lineCode              object
mode                  object
authority             object
travelTimeSec        float64
tripsInFeed            int64
tripsOn2025_11_17      int64
dtype: object

Missing values in raw bus_edges:
from                   0
to                     0
lineId                 0
lineCode               0
mode                   0
authority              0
travelTimeSec        144
tripsInFeed            0
tripsOn2025_11_17      0
dtype: int64


### 2.1 Convert `travelTimeSec` to numeric

The `travelTimeSec` column may contain invalid or non-numeric values.  
We use `pd.to_numeric(..., errors="coerce")` to:

- parse valid numbers,  
- replace invalid entries with `NaN`.

After this step, we count how many missing values remain in `travelTimeSec`.
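
To illustrate the coercion in isolation, here is a minimal sketch on a made-up series (the values are hypothetical and not taken from the Oslo data):

```python
import pandas as pd

# Hypothetical raw column mixing numeric strings, a missing value and junk text
raw = pd.Series(["60", "120.5", None, "n/a", "abc"])

# errors="coerce" turns every entry that cannot be parsed into NaN
parsed = pd.to_numeric(raw, errors="coerce")

n_missing = int(parsed.isna().sum())
print(parsed.tolist())
print("missing:", n_missing)
```

Only the two parseable strings survive as numbers; everything else is counted as missing.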

In [4]:
# Convert 'travelTimeSec' to numeric (invalid values - NaN)
bus_edges["travelTimeSec"] = pd.to_numeric(
    bus_edges["travelTimeSec"],
    errors="coerce",
)

print(
    "Number of missing values in travelTimeSec after to_numeric:",
    bus_edges["travelTimeSec"].isna().sum(),
)

Number of missing values in travelTimeSec after to_numeric: 144


### 2.2 Per-line median imputation (within each `lineId`)

Many missing `travelTimeSec` values can be reasonably imputed from **other edges of the same line**.

For each `lineId` group:

- compute the median `travelTimeSec`,
- fill missing values in that group with this median.

If *all* values for a given line are missing, its median is `NaN` and those rows remain missing at this step.
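
The behaviour described above can be checked on a toy frame. This is only a sketch with invented line ids (`A`, `B`) and invented travel times; the `filled` column name is an assumption made for the example:

```python
import numpy as np
import pandas as pd

# Hypothetical edges: line A has one gap, line B has no valid value at all
toy = pd.DataFrame(
    {
        "lineId": ["A", "A", "A", "B", "B"],
        "travelTimeSec": [60.0, np.nan, 120.0, np.nan, np.nan],
    }
)

# Fill each gap with the median of its own line
toy["filled"] = (
    toy.groupby("lineId")["travelTimeSec"]
    .transform(lambda s: s.fillna(s.median()))
)

print(toy)
```

Line `A` gets its gap filled with the median of 60 and 120, while both rows of line `B` stay `NaN` because that group has no valid value to take a median from.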

In [5]:
# Per-line median imputation (within each lineId)
bus_edges["travelTimeSec"] = (
    bus_edges
    .groupby("lineId")["travelTimeSec"]
    .transform(lambda x: x.fillna(x.median()))
)

print(
    "Number of missing values after per-line median fill:",
    bus_edges["travelTimeSec"].isna().sum(),
)

Number of missing values after per-line median fill: 2




### 2.3 Global median fallback

Some lines had **no valid travel times at all**, so the per-line median is `NaN`.  
To avoid leaving `NaN` values in `travelTimeSec`, we:

1. Compute the **global median** travel time over all bus edges.
2. Fill `NaN` values with this global median.

In [6]:
# Global median fallback for lines that had all NaN
global_median = bus_edges["travelTimeSec"].median()
bus_edges["travelTimeSec"] = bus_edges["travelTimeSec"].fillna(global_median)

print(
    "Number of missing values after global median fill:",
    bus_edges["travelTimeSec"].isna().sum(),
)

print("\ntravelTimeSec statistics (seconds):")
print(bus_edges["travelTimeSec"].describe())

Number of missing values after global median fill: 0

travelTimeSec statistics (seconds):
count    3896.000000
mean       94.751027
std        66.951515
min        60.000000
25%        60.000000
50%        60.000000
75%       120.000000
max       840.000000
Name: travelTimeSec, dtype: float64


### 3. Prepare bus nodes

Steps:

1. Check `nodes` for missing values and data types.
2. Filter to stops whose `modes` column contains `"bus"`.
3. Keep only stops that appear as `from` or `to` in at least one bus edge.

In [7]:
# Prepare Bus nodes
print("Missing values in nodes_oslo:")
print(nodes.isna().sum())

print("\nNodes dtypes:")
print(nodes.dtypes)

Missing values in nodes_oslo:
id             0
stopPlaceId    0
name           0
lat            0
lon            0
modes          0
stopType       0
dtype: int64

Nodes dtypes:
id              object
stopPlaceId     object
name            object
lat            float64
lon            float64
modes           object
stopType        object
dtype: object


In [8]:
# Filter nodes by mode 'bus'
nodes["modes"] = nodes["modes"].astype(str)
bus_nodes = nodes[nodes["modes"].str.contains("bus")].copy()

used_node_ids = set(bus_edges["from"]) | set(bus_edges["to"])
bus_nodes = bus_nodes[bus_nodes["id"].isin(used_node_ids)].copy()

print("Bus nodes:", bus_nodes.shape[0])

Bus nodes: 1525


### 4. Build a directed multi-graph of bus stops

- **Nodes** = bus stops from `bus_nodes`.
- **Edges** = bus edges from `bus_edges`, with:
  - direction (`from` → `to`),
  - weight = `travelTimeSec` (seconds),
  - attributes: `lineId`, `lineCode`, `mode`.
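
As a small illustration of why a multigraph is used, the sketch below builds a toy graph in which two hypothetical lines (`L1`, `L2`) serve the same stop pair. The stop ids and weights are made up:

```python
import networkx as nx

G_toy = nx.MultiDiGraph()

# Two lines run from stop A to stop B, so two parallel edges are stored
G_toy.add_edge("A", "B", weight=120, lineId="L1")
G_toy.add_edge("A", "B", weight=60, lineId="L2")
# The reverse direction is a separate directed edge
G_toy.add_edge("B", "A", weight=60, lineId="L1")

n_parallel = G_toy.number_of_edges("A", "B")
n_total = G_toy.number_of_edges()
print("parallel A->B edges:", n_parallel, "| total edges:", n_total)
```

A plain `DiGraph` would keep only one edge per ordered stop pair, so the second `add_edge` call would overwrite the attributes of the first.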

In [9]:
# Build the graph
# A MultiDiGraph keeps parallel edges, so several lines can serve the same stop pair
G = nx.MultiDiGraph()

# bus nodes 
for _, row in bus_nodes.iterrows():
    G.add_node(
        row["id"],
        name=row["name"],
        lat=row["lat"],
        lon=row["lon"],
        stopPlaceId=row["stopPlaceId"],
    )
    
# bus edges weight = travelTimeSec
for _, row in bus_edges.iterrows():
    u = row["from"]
    v = row["to"]
    t = row["travelTimeSec"]

    if G.has_node(u) and G.has_node(v):
        G.add_edge(
            u, v,
            weight=t,
            lineId=row["lineId"],
            lineCode=row["lineCode"],
            mode="bus"      
        )

print("Graph: nodes =", G.number_of_nodes(), ", edges =", G.number_of_edges())

Graph: nodes = 1525 , edges = 3896


### 5.1 Define hub areas

We are interested in trips between two **hub areas**:

- **Oslo hub**: union of stops named `"Oslo bussterminal"` and `"Jernbanetorget"`.
- **Skøyen hub**: union of stops named `"Skøyen"` and `"Skøyen stasjon"`.

Each hub consists of multiple physical stops.

In [10]:
oslo_hub_allowed = ["Oslo bussterminal", "Jernbanetorget"]

oslo_hub_nodes = bus_nodes[
    bus_nodes["name"].isin(oslo_hub_allowed)
][["id", "name", "lat", "lon"]].copy()

print("OSLO HUB nodes:")
print(oslo_hub_nodes)
print("Total OSLO HUB stops:", len(oslo_hub_nodes))

OSLO HUB nodes:
                   id               name        lat        lon
4      NSR:Quay:11969  Oslo bussterminal  59.911940  10.756710
5      NSR:Quay:12002  Oslo bussterminal  59.911593  10.759892
11     NSR:Quay:11973  Oslo bussterminal  59.911573  10.760016
53     NSR:Quay:11992  Oslo bussterminal  59.911632  10.759662
70     NSR:Quay:11983  Oslo bussterminal  59.911647  10.759543
104     NSR:Quay:7194     Jernbanetorget  59.912095  10.751538
107    NSR:Quay:11970  Oslo bussterminal  59.911537  10.759329
220    NSR:Quay:11989  Oslo bussterminal  59.911829  10.758367
234    NSR:Quay:11977  Oslo bussterminal  59.911809  10.758475
277     NSR:Quay:7158     Jernbanetorget  59.909041  10.749527
379     NSR:Quay:7159     Jernbanetorget  59.909486  10.748067
624     NSR:Quay:7203     Jernbanetorget  59.911425  10.749736
694    NSR:Quay:11980  Oslo bussterminal  59.911465  10.760298
742     NSR:Quay:7202     Jernbanetorget  59.911701  10.750412
862    NSR:Quay:11963  Oslo busstermina

In [11]:
# Select the physical stops (quays) that make up the Skøyen hub
skoyen_hub_nodes = bus_nodes[
    bus_nodes["name"].isin(["Skøyen", "Skøyen stasjon"])
][["id", "name", "lat", "lon"]].copy()

print("SKØYEN HUB nodes:")
print(skoyen_hub_nodes)
print("Total SKØYEN HUB stops:", len(skoyen_hub_nodes))

SKØYEN HUB nodes:
                   id            name        lat        lon
674    NSR:Quay:11817  Skøyen stasjon  59.922145  10.678730
1057   NSR:Quay:11819  Skøyen stasjon  59.922080  10.680596
1137   NSR:Quay:11837          Skøyen  59.923102  10.681629
1302   NSR:Quay:11824  Skøyen stasjon  59.922905  10.678470
1372  NSR:Quay:107539  Skøyen stasjon  59.922770  10.678523
1548   NSR:Quay:11818  Skøyen stasjon  59.922313  10.679381
Total SKØYEN HUB stops: 6


### 5.2 Add abstract hub nodes and transfer edges

We create two synthetic nodes:

- `OSLO_HUB`
- `SKOYEN_HUB`

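Their geographic coordinates are the mean of the coordinates of all physical stops in each hub.

Each physical stop is linked to its hub node by a pair of directed transfer edges (stop to hub and hub to stop), each with a fixed `weight` of 60 seconds. Any hub-to-hub path therefore includes at least 120 seconds of transfer time.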
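
The effect of the transfer edges on the total travel time can be seen in a minimal sketch. The node names (`HUB_A`, `HUB_B`, `q1`, `q2`) and the 300-second ride are invented for the example:

```python
import networkx as nx

TRANSFER_SEC = 60  # same fixed transfer weight as in the notebook

g = nx.MultiDiGraph()
g.add_edge("q1", "q2", weight=300, mode="bus")  # hypothetical 5-minute ride

# Connect each quay to its synthetic hub in both directions
for hub, quay in [("HUB_A", "q1"), ("HUB_B", "q2")]:
    g.add_edge(hub, quay, weight=TRANSFER_SEC, mode="transfer")
    g.add_edge(quay, hub, weight=TRANSFER_SEC, mode="transfer")

total_sec = nx.dijkstra_path_length(g, "HUB_A", "HUB_B", weight="weight")
print("hub-to-hub time (s):", total_sec)
```

The result is the ride time plus one transfer at each end, which is worth keeping in mind when reading the hub-to-hub times reported later.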

In [12]:
# Identifiers of the two synthetic hub nodes
OsloHub   = "OSLO_HUB"
SkoyenHub = "SKOYEN_HUB"


oslo_lat   = oslo_hub_nodes["lat"].mean()
oslo_lon   = oslo_hub_nodes["lon"].mean()
skoyen_lat = skoyen_hub_nodes["lat"].mean()
skoyen_lon = skoyen_hub_nodes["lon"].mean()


# Add the hub nodes, placed at the mean coordinates of their physical stops
G.add_node(
    OsloHub,
    name="Oslo HUB (Oslo S / Bussterminal / Jernbanetorget)",
    lat=oslo_lat,
    lon=oslo_lon,
    stopPlaceId="hub"
)

G.add_node(
    SkoyenHub,
    name="Skøyen HUB",
    lat=skoyen_lat,
    lon=skoyen_lon,
    stopPlaceId="hub"
)


transfer_time = 60  # seconds for each stop <-> hub transfer edge

for nid in oslo_hub_nodes["id"]:
    if G.has_node(nid):
        G.add_edge(nid, OsloHub,   weight=transfer_time, mode="transfer")
        G.add_edge(OsloHub, nid,   weight=transfer_time, mode="transfer")

for nid in skoyen_hub_nodes["id"]:
    if G.has_node(nid):
        G.add_edge(nid, SkoyenHub, weight=transfer_time, mode="transfer")
        G.add_edge(SkoyenHub, nid, weight=transfer_time, mode="transfer")

print("Hubs added. Graph now: nodes =", G.number_of_nodes(), ", edges =", G.number_of_edges())

Hubs added. Graph now: nodes = 1527 , edges = 3978


### 6.1 Compute the base (fastest) path between hubs

We compute the **fastest path (minimum travel time)** from `OSLO_HUB` to `SKOYEN_HUB`:

- Use Dijkstra’s algorithm with edge attribute `weight` (seconds); where parallel edges connect the same two stops, NetworkX uses the smallest weight.
- Convert travel time to minutes.

If no path exists, we catch `NetworkXNoPath` and handle it gracefully.
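
As a possible alternative to calling Dijkstra twice, `nx.single_source_dijkstra` returns the length and the path in one call when a target is given. The helper name `fastest_path` and the toy graph are assumptions made for this sketch, not part of the notebook:

```python
import networkx as nx
from networkx.exception import NetworkXNoPath


def fastest_path(graph, source, target):
    """Return (time_sec, path), or (None, None) if target is unreachable."""
    try:
        return nx.single_source_dijkstra(graph, source, target, weight="weight")
    except NetworkXNoPath:
        return None, None


g = nx.MultiDiGraph()
g.add_edge("A", "B", weight=120)  # slower parallel edge
g.add_edge("A", "B", weight=60)   # faster parallel edge, used by Dijkstra
g.add_edge("B", "C", weight=60)
g.add_node("D")                   # isolated stop, unreachable from A

time_ac, path_ac = fastest_path(g, "A", "C")
time_ad, path_ad = fastest_path(g, "A", "D")
print(time_ac, path_ac)
print(time_ad, path_ad)
```

The first call uses the faster of the two parallel `A -> B` edges; the second shows the "no path" case handled without an exception reaching the caller.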

In [13]:
# Base OSLO_HUB - SKOYEN_HUB path

try:
    base_path = nx.dijkstra_path(G, OsloHub, SkoyenHub, weight="weight")
    base_time_sec = nx.dijkstra_path_length(G, OsloHub, SkoyenHub, weight="weight")
    base_time_min = base_time_sec / 60

    print("BASE FASTEST PATH from OSLO_HUB to SKOYEN_HUB")
    for nid in base_path:
        print(nid, "->", G.nodes[nid].get("name"))

    print("Base travel time (min):", round(base_time_min, 1))
    print("Number of hops (edges):", len(base_path) - 1)

except NetworkXNoPath:
    print("No path between OsloHub and SkoyenHub in the graph.")
    base_path = None
    base_time_sec = None
    base_time_min = None

BASE FASTEST PATH from OSLO_HUB to SKOYEN_HUB
OSLO_HUB -> Oslo HUB (Oslo S / Bussterminal / Jernbanetorget)
NSR:Quay:7159 -> Jernbanetorget
NSR:Quay:101778 -> Kvadraturen
NSR:Quay:101777 -> Wessels plass
NSR:Quay:7350 -> Nationaltheatret
NSR:Quay:104030 -> Solli
NSR:Quay:7813 -> Frogner kirke
NSR:Quay:7835 -> Olav Kyrres plass
NSR:Quay:11844 -> Thune
NSR:Quay:11817 -> Skøyen stasjon
SKOYEN_HUB -> Skøyen HUB
Base travel time (min): 16.0
Number of hops (edges): 10


### 6.2 Extract internal stops on the base path

The **internal stops** are all nodes on `base_path` between the two hubs, i.e. every node except `OSLO_HUB` and `SKOYEN_HUB`.

These nodes are candidates for the redundancy experiments below.

In [14]:
# Collect the internal stops of the base path (hub nodes excluded)
if base_path is None:
    raise RuntimeError("No base path found – cannot run redundancy experiments.")

internal_nodes = [n for n in base_path if n not in (OsloHub, SkoyenHub)]

print("Internal stops on BASE path:")
for nid in internal_nodes:
    print(nid, "->", G.nodes[nid].get("name"))

Internal stops on BASE path:
NSR:Quay:7159 -> Jernbanetorget
NSR:Quay:101778 -> Kvadraturen
NSR:Quay:101777 -> Wessels plass
NSR:Quay:7350 -> Nationaltheatret
NSR:Quay:104030 -> Solli
NSR:Quay:7813 -> Frogner kirke
NSR:Quay:7835 -> Olav Kyrres plass
NSR:Quay:11844 -> Thune
NSR:Quay:11817 -> Skøyen stasjon


### 7.1 Redundancy experiment 1 – remove one stop at a time

Goal: **How sensitive is the Oslo–Skøyen connection to the removal of a single stop on the base path?**

For each internal node:

1. Make a copy `H` of the original graph `G`.
2. Remove that node from `H`.
3. Recompute the fastest path from `OSLO_HUB` to `SKOYEN_HUB`.
4. Record:
   - whether a path still exists,
   - new travel time,
   - time increase `Δt` compared to the base path,
   - full new path (IDs and names).

This measures **node redundancy**: if removal of a stop has little or no impact, the network is robust at that point.
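
The copy, remove, recompute steps can be sketched as a small helper. The function name `time_without` and the toy graph (two alternative routes from `S` to `T`) are hypothetical and only illustrate the procedure:

```python
import networkx as nx
from networkx.exception import NetworkXNoPath


def time_without(graph, removed, source, target):
    """Shortest travel time after removing nodes, or None if disconnected."""
    h = graph.copy()              # never modify the original graph
    h.remove_nodes_from(removed)
    try:
        return nx.dijkstra_path_length(h, source, target, weight="weight")
    except NetworkXNoPath:
        return None


g = nx.MultiDiGraph()
g.add_edge("S", "a", weight=60)
g.add_edge("a", "T", weight=60)    # fast route via a
g.add_edge("S", "b", weight=120)
g.add_edge("b", "T", weight=120)   # slower backup route via b

base = time_without(g, [], "S", "T")
detour = time_without(g, ["a"], "S", "T")
cut = time_without(g, ["a", "b"], "S", "T")
print("base:", base, "| without a:", detour, "| without a and b:", cut)
```

Removing `a` only lengthens the trip, which is the redundant case; removing both intermediate stops disconnects `S` from `T`.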

In [15]:
# EXPERIMENT 1: Remove 1 node at a time

single_results = []

for nid in internal_nodes:
    H = G.copy()
    node_name = G.nodes[nid].get("name")

    # remove 1 stop
    H.remove_node(nid)

    try:
        new_path = nx.dijkstra_path(H, OsloHub, SkoyenHub, weight="weight")
        new_time_sec = nx.dijkstra_path_length(H, OsloHub, SkoyenHub, weight="weight")
        new_time_min = new_time_sec / 60
        delta_min = new_time_min - base_time_min
        status = "path_exists"

        path_names = [H.nodes[x].get("name") for x in new_path]

    except NetworkXNoPath:
        new_path = None
        new_time_sec = None
        new_time_min = None
        delta_min = None
        status = "no_path"
        path_names = None

    single_results.append(
        {
            "experiment": "remove_1_node",
            "removed_nodes_ids": nid,
            "removed_nodes_names": node_name,
            "status": status,
            "new_time_min": new_time_min,
            "delta_time_min": delta_min,
            "path_ids": "|".join(new_path) if new_path is not None else None,
            "path_names": "|".join(path_names) if path_names is not None else None,
        }
    )

single_df = pd.DataFrame(single_results)

#single_df.to_csv("oslo_skoyen_remove1node.csv", index=False)
#print("Saved: oslo_skoyen_remove1node.csv")

single_df

Unnamed: 0,experiment,removed_nodes_ids,removed_nodes_names,status,new_time_min,delta_time_min,path_ids,path_names
0,remove_1_node,NSR:Quay:7159,Jernbanetorget,path_exists,16.0,0.0,OSLO_HUB|NSR:Quay:7203|NSR:Quay:101778|NSR:Qua...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
1,remove_1_node,NSR:Quay:101778,Kvadraturen,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
2,remove_1_node,NSR:Quay:101777,Wessels plass,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
3,remove_1_node,NSR:Quay:7350,Nationaltheatret,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
4,remove_1_node,NSR:Quay:104030,Solli,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
5,remove_1_node,NSR:Quay:7813,Frogner kirke,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
6,remove_1_node,NSR:Quay:7835,Olav Kyrres plass,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
7,remove_1_node,NSR:Quay:11844,Thune,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
8,remove_1_node,NSR:Quay:11817,Skøyen stasjon,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...


### 7.2 Redundancy experiment 2 – remove two consecutive stops

Check the impact of removing **pairs of consecutive internal stops** from the base path.

For each pair of neighbouring internal stops on `base_path`:

1. Copy the graph `G` into `H`.
2. Remove both stops from `H`.
3. Try to find a new shortest path between hubs.
4. Record status, new time, and time increase.

This simulates disruptions affecting **two adjacent stops**.
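
One way to enumerate the consecutive pairs without index arithmetic is to zip the internal stops with themselves shifted by one. The path below uses placeholder ids, not real quay ids:

```python
# Hypothetical base path: two hub nodes around three internal stops
path = ["HUB_A", "s1", "s2", "s3", "HUB_B"]

# Drop the hub at each end, then pair every stop with its successor
internal = path[1:-1]
pairs = list(zip(internal, internal[1:]))

print(pairs)
```

With `n` internal stops this yields `n - 1` pairs, and neither hub can appear in a pair.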

In [16]:
# EXPERIMENT 2: Remove 2 consecutive nodes from base path

pair_results = []

for i in range(1, len(base_path) - 2):
    n1 = base_path[i]
    n2 = base_path[i + 1]

    if n1 in (OsloHub, SkoyenHub) or n2 in (OsloHub, SkoyenHub):
        continue

    H = G.copy()
    names = [G.nodes[n1].get("name"), G.nodes[n2].get("name")]
    H.remove_nodes_from([n1, n2])

    try:
        new_path = nx.dijkstra_path(H, OsloHub, SkoyenHub, weight="weight")
        new_time_sec = nx.dijkstra_path_length(H, OsloHub, SkoyenHub, weight="weight")
        new_time_min = new_time_sec / 60
        delta_min = new_time_min - base_time_min
        status = "path_exists"
        path_names = [H.nodes[x].get("name") for x in new_path]
    except NetworkXNoPath:
        new_path = None
        new_time_min = None
        delta_min = None
        status = "no_path"
        path_names = None

    pair_results.append(
        {
            "experiment": "remove_2_consecutive",
            "removed_nodes_ids": f"{n1}|{n2}",
            "removed_nodes_names": "|".join(names),
            "status": status,
            "new_time_min": new_time_min,
            "delta_time_min": delta_min,
            "path_ids": "|".join(new_path) if new_path is not None else None,
            "path_names": "|".join(path_names) if path_names is not None else None,
        }
    )

pair_df = pd.DataFrame(pair_results)

#pair_df.to_csv("oslo_skoyen_remove2consecutive.csv", index=False)
#print("Saved: oslo_skoyen_remove2consecutive.csv")

pair_df

Unnamed: 0,experiment,removed_nodes_ids,removed_nodes_names,status,new_time_min,delta_time_min,path_ids,path_names
0,remove_2_consecutive,NSR:Quay:7159|NSR:Quay:101778,Jernbanetorget|Kvadraturen,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
1,remove_2_consecutive,NSR:Quay:101778|NSR:Quay:101777,Kvadraturen|Wessels plass,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
2,remove_2_consecutive,NSR:Quay:101777|NSR:Quay:7350,Wessels plass|Nationaltheatret,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
3,remove_2_consecutive,NSR:Quay:7350|NSR:Quay:104030,Nationaltheatret|Solli,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
4,remove_2_consecutive,NSR:Quay:104030|NSR:Quay:7813,Solli|Frogner kirke,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
5,remove_2_consecutive,NSR:Quay:7813|NSR:Quay:7835,Frogner kirke|Olav Kyrres plass,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
6,remove_2_consecutive,NSR:Quay:7835|NSR:Quay:11844,Olav Kyrres plass|Thune,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
7,remove_2_consecutive,NSR:Quay:11844|NSR:Quay:11817,Thune|Skøyen stasjon,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...


### 7.3 Redundancy experiment 3 – remove two random internal stops

In this experiment we randomly pick **two distinct internal stops** and remove them simultaneously.

We repeat this random removal 10 times to get a small Monte Carlo sample:

- For each run `k`:
  - randomly choose two different internal stops,
  - copy `G` into `H` and remove those stops,
  - compute the new shortest path and record the time increase.

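This gives an idea of **typical robustness** under random failures. No random seed is fixed, so the sampled pairs and the resulting table change from run to run, and the same pair can be drawn more than once.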
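
Because the base path has only 9 internal stops, the number of distinct pairs is small, so an exhaustive enumeration is a feasible alternative to sampling. The sketch below uses placeholder stop ids and an arbitrary seed (42), both of which are assumptions:

```python
import itertools
import random

# Placeholder ids standing in for the 9 internal stops
internal = [f"s{i}" for i in range(1, 10)]

# Every unordered pair of distinct stops
all_pairs = list(itertools.combinations(internal, 2))
print("distinct pairs:", len(all_pairs))

# A dedicated, seeded generator makes random draws repeatable
rng = random.Random(42)
sample = rng.sample(internal, 2)
repeat = random.Random(42).sample(internal, 2)
print("same draw with same seed:", sample == repeat)
```

Looping over `all_pairs` instead of drawing 10 random pairs would cover every two-stop failure on the base path exactly once.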

In [17]:
# EXPERIMENT 3: Remove 2 random internal nodes, repeat 10 times

random_results = []
internal_only = internal_nodes[:]  

for k in range(10):

    n1, n2 = random.sample(internal_only, 2)
    H = G.copy()
    names = [G.nodes[n1].get("name"), G.nodes[n2].get("name")]
    H.remove_nodes_from([n1, n2])

    try:
        new_path = nx.dijkstra_path(H, OsloHub, SkoyenHub, weight="weight")
        new_time_sec = nx.dijkstra_path_length(H, OsloHub, SkoyenHub, weight="weight")
        new_time_min = new_time_sec / 60
        delta_min = new_time_min - base_time_min
        status = "path_exists"
        path_names = [H.nodes[x].get("name") for x in new_path]
    except NetworkXNoPath:
        new_path = None
        new_time_min = None
        delta_min = None
        status = "no_path"
        path_names = None

    random_results.append(
        {
            "experiment": f"remove_2_random_run_{k+1}",
            "removed_nodes_ids": f"{n1}|{n2}",
            "removed_nodes_names": "|".join(names),
            "status": status,
            "new_time_min": new_time_min,
            "delta_time_min": delta_min,
            "path_ids": "|".join(new_path) if new_path is not None else None,
            "path_names": "|".join(path_names) if path_names is not None else None,
        }
    )

random_df = pd.DataFrame(random_results)

#random_df.to_csv("oslo_skoyen_remove2random.csv", index=False)
#print("Saved: oslo_skoyen_remove2random.csv")

random_df

Unnamed: 0,experiment,removed_nodes_ids,removed_nodes_names,status,new_time_min,delta_time_min,path_ids,path_names
0,remove_2_random_run_1,NSR:Quay:7159|NSR:Quay:101777,Jernbanetorget|Wessels plass,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
1,remove_2_random_run_2,NSR:Quay:11817|NSR:Quay:104030,Skøyen stasjon|Solli,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
2,remove_2_random_run_3,NSR:Quay:7835|NSR:Quay:104030,Olav Kyrres plass|Solli,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
3,remove_2_random_run_4,NSR:Quay:7350|NSR:Quay:11817,Nationaltheatret|Skøyen stasjon,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
4,remove_2_random_run_5,NSR:Quay:11844|NSR:Quay:101777,Thune|Wessels plass,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
5,remove_2_random_run_6,NSR:Quay:11844|NSR:Quay:7813,Thune|Frogner kirke,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
6,remove_2_random_run_7,NSR:Quay:101778|NSR:Quay:11817,Kvadraturen|Skøyen stasjon,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
7,remove_2_random_run_8,NSR:Quay:7159|NSR:Quay:11844,Jernbanetorget|Thune,path_exists,28.0,12.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
8,remove_2_random_run_9,NSR:Quay:7159|NSR:Quay:7813,Jernbanetorget|Frogner kirke,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...
9,remove_2_random_run_10,NSR:Quay:7350|NSR:Quay:104030,Nationaltheatret|Solli,path_exists,25.0,9.0,OSLO_HUB|NSR:Quay:109952|NSR:Quay:101886|NSR:Q...,Oslo HUB (Oslo S / Bussterminal / Jernbanetorg...


### 7.4 Redundancy experiment 4 – remove entire current path

Analyse **edge redundancy** by iteratively removing whole paths.

1. Start with a copy `H` of the full graph and set `current_path = base_path`.
2. For `k = 1..3`:
   - remove all directed edges that belong to `current_path`,
   - attempt to find a new shortest path between hubs in `H`,
   - set `current_path` to this new path and repeat.

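Note: in a `MultiDiGraph`, `remove_edges_from` called with `(u, v)` pairs removes only **one** parallel edge per pair (the most recently added one), not all parallel edges between `u` and `v`.  
If several lines serve the same two consecutive stops, the other parallel edges stay in `H`, so a later path can reuse the same corridor on a different line. Treating the whole corridor as unavailable would require removing every parallel edge explicitly, by key. The removed edges also include the two hub transfer edges of the current path, so each iteration forces a different entry and exit quay. Despite the `k_shortest_*` labels used in the code, this procedure is successive path removal, not a k-shortest-paths search.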
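
How `remove_edges_from` treats parallel edges is easy to verify on a toy graph. The sketch below, with invented stops and line ids, compares removal by `(u, v)` pair with removal of every key between two stops:

```python
import networkx as nx

g = nx.MultiDiGraph()
g.add_edge("A", "B", weight=60, lineId="L1")
g.add_edge("A", "B", weight=90, lineId="L2")

# Variant 1: (u, v) pairs, as in a plain zip over the path
h1 = g.copy()
h1.remove_edges_from([("A", "B")])
left_after_pair = h1.number_of_edges("A", "B")

# Variant 2: remove every parallel edge by listing its key
h2 = g.copy()
path = ["A", "B"]
for u, v in zip(path[:-1], path[1:]):
    keys = list(h2[u][v])  # copy the keys before mutating the graph
    h2.remove_edges_from([(u, v, k) for k in keys])
left_after_all = h2.number_of_edges("A", "B")

print("left after pair removal:", left_after_pair)
print("left after key-wise removal:", left_after_all)
```

Only the second variant closes the whole corridor between two stops; the first leaves one of the two lines in place.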

In [18]:
# EXPERIMENT 4: Remove whole current path (edges) and find next one

k_paths_results = []

H = G.copy()
current_path = base_path
current_time_min = base_time_min

for k in range(1, 4):  
    if current_path is None:
        break

    # (u, v) pairs: remove_edges_from drops one parallel edge per pair
    edges_to_remove = list(zip(current_path[:-1], current_path[1:]))
    H.remove_edges_from(edges_to_remove)

    try:
        new_path = nx.dijkstra_path(H, OsloHub, SkoyenHub, weight="weight")
        new_time_sec = nx.dijkstra_path_length(H, OsloHub, SkoyenHub, weight="weight")
        new_time_min = new_time_sec / 60
        delta_min = new_time_min - base_time_min
        status = "path_exists"
        path_names = [H.nodes[x].get("name") for x in new_path]
    except NetworkXNoPath:
        new_path = None
        new_time_min = None
        delta_min = None
        status = "no_path"
        path_names = None

    k_paths_results.append(
        {
            "experiment": f"k_shortest_{k}",
            "status": status,
            "new_time_min": new_time_min,
            "delta_time_min": delta_min,
            "path_ids": "|".join(new_path) if new_path is not None else None,
            "path_names": "|".join(path_names) if path_names is not None else None,
        }
    )

    current_path = new_path
    current_time_min = new_time_min

k_paths_df = pd.DataFrame(k_paths_results)

print("EDGE REDUNDANCY (remove whole path, k-shortest)")
print(k_paths_df[["experiment", "status", "new_time_min", "delta_time_min"]])

#k_paths_df.to_csv("oslo_skoyen_k_shortest_paths.csv", index=False)
#print("\nSaved: oslo_skoyen_k_shortest_paths.csv")

EDGE REDUNDANCY (remove whole path, k-shortest)
     experiment       status  new_time_min  delta_time_min
0  k_shortest_1  path_exists          28.0            12.0
1  k_shortest_2  path_exists          40.0            24.0
2  k_shortest_3  path_exists          54.0            38.0


### 8 Zone-Level Redundancy Experiments (Whole Stop Failures)

In the previous experiments (1–3), we modelled **node-level failures**:
- each failure removed a **single quay** identified by its `id`,
- but in GTFS a logical stop area may consist of multiple quays,
- therefore removing one quay does **NOT** disable the entire stop.

To model more realistic disruptions, such as the complete closure of a stop area due to
construction, road blockage, or a full station outage, we now consider **zone failures**.

A *zone* corresponds to a **stop name**.

Each zone may contain multiple physical platforms (quays, each with its own `id`).  
A zone failure removes **all quays with that `name`**, effectively shutting down the entire stop.

We run three zone-level experiments:

### **Zone Experiment 1 - Remove one full zone**
Model a complete outage of a single stop area.

### **Zone Experiment 2 - Remove two consecutive zones**
Model a corridor-level shutdown affecting two neighbouring stop areas along the base path.

### **Zone Experiment 3 - Remove two random zones**
Model random large-scale failures of stop areas (Monte Carlo style, 10 runs).

These experiments allow us to compare:
- single-quay robustness (node failures), and  
- full stop-area robustness (zone failures).
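
The difference between a quay failure and a zone failure can be sketched on a toy network. The zone names (`Alpha`, `Beta`), quay ids and weights below are invented for illustration:

```python
import networkx as nx
from networkx.exception import NetworkXNoPath

g = nx.MultiDiGraph()
# Two quays share the stop name "Alpha"; one quay belongs to "Beta"
for quay, name in {"q1": "Alpha", "q2": "Alpha", "q3": "Beta"}.items():
    g.add_node(quay, name=name)
g.add_edge("S", "q1", weight=60)
g.add_edge("q1", "T", weight=60)
g.add_edge("S", "q2", weight=90)
g.add_edge("q2", "T", weight=90)

# Group quay ids by stop name; nodes without a name (S, T) are skipped
zone_to_ids = {}
for node, data in g.nodes(data=True):
    if "name" in data:
        zone_to_ids.setdefault(data["name"], []).append(node)


def time_without(removed):
    """Travel time from S to T after removing nodes, or None if cut off."""
    h = g.copy()
    h.remove_nodes_from(removed)
    try:
        return nx.dijkstra_path_length(h, "S", "T", weight="weight")
    except NetworkXNoPath:
        return None


quay_failure = time_without(["q1"])                # only one platform closed
zone_failure = time_without(zone_to_ids["Alpha"])  # whole stop closed
print("quay failure:", quay_failure, "| zone failure:", zone_failure)
```

Closing a single quay still leaves the sibling quay of the same stop usable, whereas closing the zone removes both and disconnects the toy network.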

In [19]:
# 8 Build ordered list of unique "zones" (stop names) on the base path
internal_zone_names = []
for nid in internal_nodes:
    name = G.nodes[nid].get("name")
    if name not in internal_zone_names:
        internal_zone_names.append(name)

print("Zones on base path (in order):")
for z in internal_zone_names:
    print("  -", z)

# Map: zone name - all node IDs (quays) with that name in the whole network
zone_to_ids = {
    name: bus_nodes.loc[bus_nodes["name"] == name, "id"].tolist()
    for name in internal_zone_names
}

print("\nExample zone mapping (name - ids):")
for name, ids in zone_to_ids.items():
    print(name, ":", ids)

Zones on base path (in order):
  - Jernbanetorget
  - Kvadraturen
  - Wessels plass
  - Nationaltheatret
  - Solli
  - Frogner kirke
  - Olav Kyrres plass
  - Thune
  - Skøyen stasjon

Example zone mapping (name - ids):
Jernbanetorget : ['NSR:Quay:7194', 'NSR:Quay:7158', 'NSR:Quay:7159', 'NSR:Quay:7203', 'NSR:Quay:7202', 'NSR:Quay:109952', 'NSR:Quay:105756', 'NSR:Quay:122003', 'NSR:Quay:7224', 'NSR:Quay:104023', 'NSR:Quay:104022', 'NSR:Quay:7193']
Kvadraturen : ['NSR:Quay:7230', 'NSR:Quay:101778']
Wessels plass : ['NSR:Quay:7311', 'NSR:Quay:101777']
Nationaltheatret : ['NSR:Quay:7349', 'NSR:Quay:7350', 'NSR:Quay:7384', 'NSR:Quay:7373', 'NSR:Quay:102096', 'NSR:Quay:109539', 'NSR:Quay:109233', 'NSR:Quay:7449']
Solli : ['NSR:Quay:7747', 'NSR:Quay:104031', 'NSR:Quay:7739', 'NSR:Quay:104030']
Frogner kirke : ['NSR:Quay:7812', 'NSR:Quay:7813']
Olav Kyrres plass : ['NSR:Quay:7836', 'NSR:Quay:7835']
Thune : ['NSR:Quay:11844', 'NSR:Quay:11845']
Skøyen stasjon : ['NSR:Quay:11817', 'NSR:Quay:1181

In [20]:
# 8.1 ZONE EXPERIMENT 1: remove 1 zone (all quays with this name)

zone_single_results = []

for zone_name in internal_zone_names:
    zone_ids = zone_to_ids[zone_name]

    H = G.copy()
    H.remove_nodes_from(zone_ids)

    try:
        new_path = nx.dijkstra_path(H, OsloHub, SkoyenHub, weight="weight")
        new_time_sec = nx.dijkstra_path_length(H, OsloHub, SkoyenHub, weight="weight")
        new_time_min = new_time_sec / 60
        delta_min = new_time_min - base_time_min
        status = "path_exists"
    except NetworkXNoPath:
        new_path = None
        new_time_min = None
        delta_min = None
        status = "no_path"

    zone_single_results.append(
        {
            "experiment": "remove_1_zone",
            "removed_zone_name": zone_name,
            "removed_node_ids": "|".join(zone_ids),
            "status": status,
            "new_time_min": new_time_min,
            "delta_time_min": delta_min,
        }
    )

zone_single_df = pd.DataFrame(zone_single_results)

# zone_single_df.to_csv("oslo_skoyen_remove1zone.csv", index=False)

zone_single_df

Unnamed: 0,experiment,removed_zone_name,removed_node_ids,status,new_time_min,delta_time_min
0,remove_1_zone,Jernbanetorget,NSR:Quay:7194|NSR:Quay:7158|NSR:Quay:7159|NSR:...,path_exists,28.0,12.0
1,remove_1_zone,Kvadraturen,NSR:Quay:7230|NSR:Quay:101778,path_exists,25.0,9.0
2,remove_1_zone,Wessels plass,NSR:Quay:7311|NSR:Quay:101777,path_exists,25.0,9.0
3,remove_1_zone,Nationaltheatret,NSR:Quay:7349|NSR:Quay:7350|NSR:Quay:7384|NSR:...,path_exists,25.0,9.0
4,remove_1_zone,Solli,NSR:Quay:7747|NSR:Quay:104031|NSR:Quay:7739|NS...,path_exists,25.0,9.0
5,remove_1_zone,Frogner kirke,NSR:Quay:7812|NSR:Quay:7813,path_exists,25.0,9.0
6,remove_1_zone,Olav Kyrres plass,NSR:Quay:7836|NSR:Quay:7835,path_exists,28.0,12.0
7,remove_1_zone,Thune,NSR:Quay:11844|NSR:Quay:11845,path_exists,28.0,12.0
8,remove_1_zone,Skøyen stasjon,NSR:Quay:11817|NSR:Quay:11819|NSR:Quay:11824|N...,no_path,,


In [21]:
# 8.2 ZONE EXPERIMENT 2: remove 2 consecutive zones along the base path

zone_pair_results = []

for i in range(len(internal_zone_names) - 1):
    z1 = internal_zone_names[i]
    z2 = internal_zone_names[i + 1]

    ids_z1 = zone_to_ids[z1]
    ids_z2 = zone_to_ids[z2]
    zone_ids = list(set(ids_z1 + ids_z2))  # union without duplicates

    H = G.copy()
    H.remove_nodes_from(zone_ids)

    try:
        # One Dijkstra run returns both the travel time and the path
        new_time_sec, new_path = nx.single_source_dijkstra(
            H, OsloHub, SkoyenHub, weight="weight"
        )
        new_time_min = new_time_sec / 60
        delta_min = new_time_min - base_time_min
        status = "path_exists"
    except NetworkXNoPath:
        # Removing this pair of zones disconnects the two hubs
        new_path = None
        new_time_min = None
        delta_min = None
        status = "no_path"

    zone_pair_results.append(
        {
            "experiment": "remove_2_consecutive_zones",
            "removed_zones": f"{z1}|{z2}",
            "removed_node_ids": "|".join(zone_ids),
            "status": status,
            "new_time_min": new_time_min,
            "delta_time_min": delta_min,
        }
    )

zone_pair_df = pd.DataFrame(zone_pair_results)

# zone_pair_df.to_csv("oslo_skoyen_remove2zones_consecutive.csv", index=False)

zone_pair_df

index,experiment,removed_zones,removed_node_ids,status,new_time_min,delta_time_min
0,remove_2_consecutive_zones,Jernbanetorget|Kvadraturen,NSR:Quay:7159|NSR:Quay:7224|NSR:Quay:104023|NS...,path_exists,28.0,12.0
1,remove_2_consecutive_zones,Kvadraturen|Wessels plass,NSR:Quay:101777|NSR:Quay:101778|NSR:Quay:7311|...,path_exists,25.0,9.0
2,remove_2_consecutive_zones,Wessels plass|Nationaltheatret,NSR:Quay:109539|NSR:Quay:7384|NSR:Quay:7350|NS...,path_exists,25.0,9.0
3,remove_2_consecutive_zones,Nationaltheatret|Solli,NSR:Quay:109539|NSR:Quay:7384|NSR:Quay:7350|NS...,path_exists,25.0,9.0
4,remove_2_consecutive_zones,Solli|Frogner kirke,NSR:Quay:7812|NSR:Quay:7747|NSR:Quay:7813|NSR:...,path_exists,25.0,9.0
5,remove_2_consecutive_zones,Frogner kirke|Olav Kyrres plass,NSR:Quay:7813|NSR:Quay:7835|NSR:Quay:7812|NSR:...,path_exists,28.0,12.0
6,remove_2_consecutive_zones,Olav Kyrres plass|Thune,NSR:Quay:11844|NSR:Quay:11845|NSR:Quay:7835|NS...,path_exists,28.0,12.0
7,remove_2_consecutive_zones,Thune|Skøyen stasjon,NSR:Quay:11818|NSR:Quay:11824|NSR:Quay:11817|N...,no_path,,
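
Each iteration above calls `G.copy()`, which duplicates the whole graph before removing a handful of nodes. One possible alternative is `nx.restricted_view`, which hides nodes in a read-only view and leaves the original graph untouched. The sketch below uses a toy graph; the names `toy`, `toy_zone_to_ids`, and `zone_order` are hypothetical stand-ins for the notebook's `G`, `zone_to_ids`, and `internal_zone_names`.

```python
import networkx as nx
from networkx.exception import NetworkXNoPath

# Toy network (hypothetical quays, weights in seconds)
toy = nx.MultiDiGraph()
toy.add_edge("HUB_A", "z1_q1", weight=120)
toy.add_edge("z1_q1", "z2_q1", weight=180)
toy.add_edge("z2_q1", "HUB_B", weight=60)
toy.add_edge("HUB_A", "z3_q1", weight=400)  # slower bypass
toy.add_edge("z3_q1", "HUB_B", weight=400)

toy_zone_to_ids = {"z1": ["z1_q1"], "z2": ["z2_q1"], "z3": ["z3_q1"]}
zone_order = ["z1", "z2"]  # zones along the toy base path

rows = []
for a, b in zip(zone_order, zone_order[1:]):
    hidden = set(toy_zone_to_ids[a]) | set(toy_zone_to_ids[b])
    # Read-only view with the two zones hidden; no edges are hidden
    view = nx.restricted_view(toy, hidden, [])
    try:
        sec = nx.dijkstra_path_length(view, "HUB_A", "HUB_B", weight="weight")
        rows.append({"removed_zones": f"{a}|{b}", "status": "path_exists",
                     "new_time_min": sec / 60})
    except NetworkXNoPath:
        rows.append({"removed_zones": f"{a}|{b}", "status": "no_path",
                     "new_time_min": None})

print(rows)
```

The view is only suitable when the hubs themselves are never hidden; hiding the source node would raise `NodeNotFound` rather than `NetworkXNoPath`.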


In [22]:
# 8.3 ZONE EXPERIMENT 3: remove 2 random zones (10 runs)

zone_random_results = []

# Uncomment to make the random draws reproducible across runs:
# random.seed(42)

for k in range(10):
    z1, z2 = random.sample(internal_zone_names, 2)
    ids_z1 = zone_to_ids[z1]
    ids_z2 = zone_to_ids[z2]
    zone_ids = list(set(ids_z1 + ids_z2))

    H = G.copy()
    H.remove_nodes_from(zone_ids)

    try:
        # One Dijkstra run returns both the travel time and the path
        new_time_sec, new_path = nx.single_source_dijkstra(
            H, OsloHub, SkoyenHub, weight="weight"
        )
        new_time_min = new_time_sec / 60
        delta_min = new_time_min - base_time_min
        status = "path_exists"
    except NetworkXNoPath:
        # Removing this pair of zones disconnects the two hubs
        new_path = None
        new_time_min = None
        delta_min = None
        status = "no_path"

    zone_random_results.append(
        {
            "experiment": f"remove_2_zones_random_run_{k+1}",
            "removed_zones": f"{z1}|{z2}",
            "removed_node_ids": "|".join(zone_ids),
            "status": status,
            "new_time_min": new_time_min,
            "delta_time_min": delta_min,
        }
    )

zone_random_df = pd.DataFrame(zone_random_results)

# zone_random_df.to_csv("oslo_skoyen_remove2zones_random.csv", index=False)

zone_random_df

index,experiment,removed_zones,removed_node_ids,status,new_time_min,delta_time_min
0,remove_2_zones_random_run_1,Jernbanetorget|Olav Kyrres plass,NSR:Quay:7159|NSR:Quay:7224|NSR:Quay:104023|NS...,path_exists,31.0,15.0
1,remove_2_zones_random_run_2,Nationaltheatret|Jernbanetorget,NSR:Quay:109233|NSR:Quay:7203|NSR:Quay:105756|...,path_exists,28.0,12.0
2,remove_2_zones_random_run_3,Solli|Wessels plass,NSR:Quay:104030|NSR:Quay:7747|NSR:Quay:7739|NS...,path_exists,25.0,9.0
3,remove_2_zones_random_run_4,Solli|Kvadraturen,NSR:Quay:7230|NSR:Quay:7747|NSR:Quay:101778|NS...,path_exists,25.0,9.0
4,remove_2_zones_random_run_5,Nationaltheatret|Wessels plass,NSR:Quay:109539|NSR:Quay:7384|NSR:Quay:7350|NS...,path_exists,25.0,9.0
5,remove_2_zones_random_run_6,Nationaltheatret|Wessels plass,NSR:Quay:109539|NSR:Quay:7384|NSR:Quay:7350|NS...,path_exists,25.0,9.0
6,remove_2_zones_random_run_7,Thune|Kvadraturen,NSR:Quay:11844|NSR:Quay:11845|NSR:Quay:101778|...,path_exists,28.0,12.0
7,remove_2_zones_random_run_8,Frogner kirke|Skøyen stasjon,NSR:Quay:11818|NSR:Quay:7812|NSR:Quay:11824|NS...,no_path,,
8,remove_2_zones_random_run_9,Olav Kyrres plass|Kvadraturen,NSR:Quay:101778|NSR:Quay:7835|NSR:Quay:7836|NS...,path_exists,28.0,12.0
9,remove_2_zones_random_run_10,Jernbanetorget|Olav Kyrres plass,NSR:Quay:7159|NSR:Quay:7224|NSR:Quay:104023|NS...,path_exists,31.0,15.0
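
Random sampling can draw the same pair more than once: runs 5 and 6 above both remove Nationaltheatret and Wessels plass. With nine internal zones there are only 36 unordered pairs, so an exhaustive sweep may be a practical alternative to ten random draws. The sketch below only enumerates the pairs; the list `zone_names` is copied from the single-zone results table and stands in for `internal_zone_names`.

```python
from itertools import combinations

# Zone names as listed in the single-zone experiment output
zone_names = [
    "Jernbanetorget", "Kvadraturen", "Wessels plass", "Nationaltheatret",
    "Solli", "Frogner kirke", "Olav Kyrres plass", "Thune", "Skøyen stasjon",
]

# Every unordered pair exactly once, in a deterministic order
all_pairs = list(combinations(zone_names, 2))
print(len(all_pairs))  # 36

# Pairs that include the zone whose removal alone already gave "no_path"
with_skoyen = [p for p in all_pairs if "Skøyen stasjon" in p]
print(len(with_skoyen))  # 8
```

Each pair could then go through the same remove-and-reroute step as in the loop above, replacing `random.sample`.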
