# Network Metrics

Build the network graph, compute centrality metrics, detect communities, and save the outputs to `data/processed/`.

## Optional Prep
Run the preprocessing and temporal preprocessing scripts if the processed files are missing.

In [None]:
# Uncomment to regenerate processed data
# !python3 ../src/preprocessing.py
# !python3 ../src/temporal_preprocessing.py
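
Before uncommenting the commands above, it can help to check which processed files already exist. The sketch below is a minimal, hypothetical helper (`missing_processed_files` is not part of the project). The two file names are taken from the loader's log output further down; the relative location `../data/processed` and the idea that these two files are sufficient are assumptions, since temporal preprocessing may produce additional files.

```python
from pathlib import Path

# File names as they appear in the loader's log output; the set may be incomplete.
REQUIRED_FILES = ("processed_nodes.csv", "processed_links.csv")


def missing_processed_files(processed_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from processed_dir."""
    processed_dir = Path(processed_dir)
    return [name for name in required if not (processed_dir / name).is_file()]


# Assumed layout: the notebook runs from a directory that sits next to data/.
missing = missing_processed_files(Path("..") / "data" / "processed")
if missing:
    print(f"Missing processed files: {missing}")
else:
    print("All expected processed files found; preprocessing can be skipped")
```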

## Setup
Import the helper functions and make sure the project `src` directory is on the path.

In [2]:
import os, sys
# Resolve project root as parent of this notebook's directory
NOTEBOOK_DIR = os.path.abspath(os.getcwd())
PROJECT_ROOT = os.path.abspath(os.path.join(NOTEBOOK_DIR, os.pardir))
SRC_DIR = os.path.join(PROJECT_ROOT, "src")
if SRC_DIR not in sys.path:
    sys.path.append(SRC_DIR)

from build_network import load_and_build
from centrality_analysis import compute_centrality_measures, save_centrality_rankings
from community_detection import detect_communities, save_communities

print(f"Project root: {PROJECT_ROOT}")

Project root: /Users/masoncacurak/Downloads/CS_5483/dallas_network_analysis


## Build graph for community detection
Select a time-period weight (AM, Midday, PM, or Evening), or leave `period=None` to fall back to `weight_type` (congested/freeflow).
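
How `load_and_build` chooses between the two arguments is not shown in this notebook. The sketch below illustrates one plausible rule, in which a given `period` takes precedence over `weight_type`. The function `resolve_weight_column` is hypothetical. The `travel_time_<period>` pattern matches the column name printed by the loader for the AM period, while the fallback column name used when `period=None` is purely an assumption.

```python
VALID_PERIODS = ("AM", "Midday", "PM", "Evening")


def resolve_weight_column(weight_type="congested", period=None):
    """Hypothetical helper: choose the edge-weight column name.

    A period, when given, takes precedence over weight_type.
    """
    if period is not None:
        if period not in VALID_PERIODS:
            raise ValueError(f"Unknown period {period!r}; expected one of {VALID_PERIODS}")
        # Pattern seen in the loader's log output, e.g. 'travel_time_AM'.
        return f"travel_time_{period}"
    # Assumed naming for the non-temporal columns; the real names may differ.
    return f"travel_time_{weight_type}"


print(resolve_weight_column(period="AM"))                    # travel_time_AM
print(resolve_weight_column(weight_type="freeflow"))         # travel_time_freeflow (assumed name)
```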

In [3]:
period_main = "AM"  # options: AM, Midday, PM, Evening, or None
G = load_and_build(weight_type="congested", period=period_main)
print(f"Graph built for period={period_main}: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges")

Loading processed_nodes.csv...
Loading processed_links.csv...
Building graph using temporal weight column 'travel_time_AM'
Adding 21389 nodes...
Adding 35696 edges...
Graph build complete: 21389 nodes, 35696 edges
Graph built for period=AM: 21389 nodes, 35696 edges


## Centrality metrics for each time period
Compute centrality metrics for each time period and save the rankings per period.

In [None]:
periods = ["AM", "Midday", "PM", "Evening"]
centrality_paths = {}

for p in periods:
    print(f"\nCentrality for {p} period:")
    Gp = load_and_build(weight_type="congested", period=p)
    centrality = compute_centrality_measures(Gp)
    output_path = os.path.join(PROJECT_ROOT, "data", "processed", f"centrality_rankings_{p}.csv")
    centrality_paths[p] = save_centrality_rankings(centrality, output_path=output_path)
    print(f"Saved {p} rankings to: {output_path}")

print("\nCentrality computations finished")


Centrality for AM
Loading processed_nodes.csv...
Loading processed_links.csv...
Building graph using temporal weight column 'travel_time_AM'
Adding 21389 nodes...
Adding 35696 edges...
Graph build complete: 21389 nodes, 35696 edges
Computing degree centrality...

Top 10 nodes by degree centrality:
 1. Node 757: 0.000468
 2. Node 11593: 0.000468
 3. Node 1287: 0.000421
 4. Node 107: 0.000374
 5. Node 108: 0.000374
 6. Node 126: 0.000374
 7. Node 176: 0.000374
 8. Node 359: 0.000374
 9. Node 402: 0.000374
10. Node 470: 0.000374
Degree stats -> min: 0.000000, max: 0.000468, mean: 0.000156

Computing betweenness centrality by weight...


KeyboardInterrupt: 
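
The degree values printed above can be sanity-checked by hand. The sketch below assumes the common normalization degree / (n - 1), as used by NetworkX's `degree_centrality`; whether `compute_centrality_measures` uses exactly that formula is an assumption. Under it, the top value of 0.000468 would correspond to a node with 10 incident edges, and the reported mean is consistent with every edge being counted once at each of its two endpoints.

```python
# Node and edge counts taken from the graph build log above.
N_NODES = 21389
N_EDGES = 35696


def degree_centrality(degree, n_nodes):
    """Degree normalized by the maximum possible degree, n_nodes - 1."""
    return degree / (n_nodes - 1)


# Degrees that would reproduce the three distinct top values in the output.
for degree in (10, 9, 8):
    print(f"degree {degree}: {degree_centrality(degree, N_NODES):.6f}")
# degree 10: 0.000468
# degree 9: 0.000421
# degree 8: 0.000374

# Mean degree if each edge contributes to both of its endpoints.
mean_degree = 2 * N_EDGES / N_NODES
print(f"mean: {degree_centrality(mean_degree, N_NODES):.6f}")  # mean: 0.000156
```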

## Community detection
- Run Louvain on each period-specific graph (AM/Midday/PM/Evening) and save the community assignments to `communities_<period>.csv`
- Commented out: run Louvain on the undirected graph built for `period_main` (set `run_gn_sample=True` to also run Girvan–Newman on a sampled subgraph)
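
As a rough illustration of the summary values reported for each run (the community count and the sorted sizes), the sketch below derives them from a partition. It assumes the partition is a dict mapping each node to a community id, which is a common Louvain output format but is not confirmed for `detect_communities`. The helper `community_sizes` and the toy data are hypothetical.

```python
from collections import Counter


def community_sizes(partition):
    """Return (number of communities, sizes sorted largest first).

    partition is assumed to map node -> community id.
    """
    counts = Counter(partition.values())
    sizes = sorted(counts.values(), reverse=True)
    return len(counts), sizes


# Toy partition for illustration only; not taken from the Dallas network.
toy_partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 2}
num_comms, sizes = community_sizes(toy_partition)
print(f"Total communities: {num_comms}; top sizes: {sizes[:10]}")
# Total communities: 3; top sizes: [3, 2, 1]
```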


In [4]:
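# Single-graph variant (disabled): Louvain on the graph built for period_main,
# optionally with Girvan-Newman on a sampled subgraph.
# partition, num_comms, sizes = detect_communities(G, run_gn_sample=True)
# communities_path = save_communities(partition)
#
# print(f"Communities saved to: {communities_path}")
# print(f"Total communities: {num_comms}")
# print(f"Top community sizes: {sizes[:10]}")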
periods = ["AM", "Midday", "PM", "Evening"]
community_paths = {}

for p in periods:
    print(f"\nCommunities for {p} period:")
    Gp = load_and_build(weight_type="congested", period=p)
    partition, num_comms, sizes = detect_communities(Gp, run_gn_sample=False)
    out_path = os.path.join(PROJECT_ROOT, "data", "processed", f"communities_{p}.csv")
    community_paths[p] = save_communities(partition, output_path=out_path)
    print(f"Saved {p} communities to: {out_path}")
    print(f"Total communities: {num_comms}; top sizes: {sizes[:10]}")

print("\nAll community computations finished")


Communities for AM period:
Loading processed_nodes.csv...
Loading processed_links.csv...
Building graph using temporal weight column 'travel_time_AM'
Adding 21389 nodes...
Adding 35696 edges...
Graph build complete: 21389 nodes, 35696 edges
Running Louvain community detection...
Detected 475 communities via Louvain
Largest communities (by size): [299, 297, 290, 288, 282, 281, 276, 269, 266, 266]
Community assignments saved to /Users/masoncacurak/Downloads/CS_5483/dallas_network_analysis/data/processed/communities_AM.csv
Saved AM communities to: /Users/masoncacurak/Downloads/CS_5483/dallas_network_analysis/data/processed/communities_AM.csv
Total communities: 475; top sizes: [299, 297, 290, 288, 282, 281, 276, 269, 266, 266]

Communities for Midday period:
Loading processed_nodes.csv...
Loading processed_links.csv...
Building graph using temporal weight column 'travel_time_Midday'
Adding 21389 nodes...
Adding 35696 edges...
Graph build complete: 21389 nodes, 35696 edges
Running Louvain 