<a href="https://colab.research.google.com/github/hhhhhenanZ/dataset-test/blob/main/gmns_ready_tutorial_Tempe_City.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GMNS Ready Tutorial - Tempe Dataset


**Quick tutorial:** Prepare and validate GMNS transportation networks using the Tempe, AZ dataset.

**Time:** ~5 minutes | **Author:** ASU Trans+AI Lab

## Setup

In [1]:
# Install gmns-ready
!pip install gmns-ready -q
print("✅ Installed gmns-ready")

✅ Installed gmns-ready


In [2]:
# Download network files (node.csv, link.csv)
import urllib.request
import os

base_url = "https://raw.githubusercontent.com/hhhhhenanZ/dataset-test/main/Tempe/Tempe_tutorial/"

# Download node and link files
print("Downloading network files...")
urllib.request.urlretrieve(base_url + "node.csv", "node.csv")
urllib.request.urlretrieve(base_url + "link.csv", "link.csv")
print("✅ Network files downloaded")

Downloading network files...
✅ Network files downloaded


In [3]:
# Download shapefile components
os.makedirs('data', exist_ok=True)

shapefile_name = "Census_Tract_Boundary"
extensions = ['shp', 'shx', 'dbf', 'prj', 'cpg']

print("Downloading shapefile components...")
for ext in extensions:
    url = f"{base_url}data/{shapefile_name}.{ext}"
    try:
        urllib.request.urlretrieve(url, f"data/{shapefile_name}.{ext}")
        print(f"  ✅ {shapefile_name}.{ext}")
    except OSError:  # covers urllib's URLError and HTTPError
        print(f"  ⚠️  {shapefile_name}.{ext} not found (optional)")

print("\n✅ Setup complete!")

Downloading shapefile components...
  ✅ Census_Tract_Boundary.shp
  ✅ Census_Tract_Boundary.shx
  ✅ Census_Tract_Boundary.dbf
  ✅ Census_Tract_Boundary.prj
  ✅ Census_Tract_Boundary.cpg

✅ Setup complete!
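
The download loop above reports every missing component as optional, but a shapefile is only readable when its `.shp`, `.shx`, and `.dbf` parts are all present; `.prj` and `.cpg` are the sidecar files that can be absent. A minimal sketch of a pre-flight check follows. The helper name `missing_shapefile_parts` is hypothetical and is not part of gmns-ready.

```python
import os

REQUIRED_EXTS = ("shp", "shx", "dbf")   # mandatory parts of a shapefile
OPTIONAL_EXTS = ("prj", "cpg")          # projection and code-page sidecars


def missing_shapefile_parts(folder, stem):
    """Return (missing_required, missing_optional) extension lists."""
    def absent(exts):
        return [e for e in exts
                if not os.path.isfile(os.path.join(folder, f"{stem}.{e}"))]
    return absent(REQUIRED_EXTS), absent(OPTIONAL_EXTS)


# Hypothetical usage against the folder created by the cell above.
required, optional = missing_shapefile_parts("data", "Census_Tract_Boundary")
if required:
    print(f"Cannot continue, missing required parts: {required}")
elif optional:
    print(f"Usable, but missing optional parts: {optional}")
else:
    print("All shapefile components present")
```

Without the `.prj` file the source CRS is unknown, so it is worth treating a missing `.prj` as a warning rather than silently ignoring it.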


## Workflow

In [4]:
import gmns_ready as gr
print(f"Using gmns-ready v{gr.__version__}\n")

Using gmns-ready v0.0.9



### Step 1: Validate Spatial Alignment

In [5]:
gr.validate_basemap()

GMNS Base Map Validator

Step 1: Checking folder structure...
  Data folder found: ./data
  Found 1 shapefile(s) in data folder
  Using node file: node.csv
  Using link file: link.csv

Step 2: Loading network files...
  Loaded 2738 nodes from node.csv
  Loaded 4165 links from link.csv
  Loaded shapefile: Census_Tract_Boundary.shp (44 features)

Step 3: Detecting location...
  Detected location: Tempe, Arizona United States

Step 4: Checking node-link topology...
  Checking if links connect to valid nodes...
  [OK] All 4165 links connect to valid nodes
    - All from_node_ids exist in node.csv
    - All to_node_ids exist in node.csv

  Checking if links are in same geographic area as nodes...
  Node extent: [-111.9841, 33.3199] to [-111.8752, 33.4666]
  Link extent: [-111.9841, 33.3199] to [-111.8752, 33.4666]
  [OK] Links and nodes are in the same geographic area (overlap: 100.0%)

  Checking link geometry consistency...
  [OK] Link geometries match node locations

Step 5: Checking spa
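
The node-link topology check in Step 4 of the validator output boils down to confirming that every `from_node_id` and `to_node_id` in link.csv appears as a `node_id` in node.csv. The sketch below illustrates that idea with the standard library only; it is not the validator's actual implementation, and the inline CSV data and the function name `find_dangling_links` are made up for the example.

```python
import csv
import io

# Tiny inline stand-ins for node.csv and link.csv (illustrative data only).
NODE_CSV = """node_id,x_coord,y_coord
1,-111.95,33.40
2,-111.94,33.41
3,-111.93,33.42
"""
LINK_CSV = """link_id,from_node_id,to_node_id
1,1,2
2,2,3
3,3,9
"""


def find_dangling_links(node_file, link_file):
    """Return link_ids whose from_node_id or to_node_id is not in node.csv."""
    node_ids = {row["node_id"] for row in csv.DictReader(node_file)}
    return [
        row["link_id"]
        for row in csv.DictReader(link_file)
        if row["from_node_id"] not in node_ids
        or row["to_node_id"] not in node_ids
    ]


dangling = find_dangling_links(io.StringIO(NODE_CSV), io.StringIO(LINK_CSV))
print(dangling)  # link 3 points at node 9, which does not exist
```

To run the same check on the tutorial files, pass `open("node.csv")` and `open("link.csv")` instead of the `StringIO` objects.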

### Step 2: Extract Zones

In [6]:
gr.extract_zones()

Shapefile loaded successfully.
Current CRS: EPSG:2223
Reprojected CRS: EPSG:4326
Available columns: Index(['OBJECTID', 'STATEFP', 'COUNTYFP', 'TRACTCE', 'GEOID', 'NAME',
       'NAMELSAD', 'MTFCC', 'FUNCSTAT', 'ALAND', 'AWATER', 'INTPTLAT',
       'INTPTLON', 'Shape__Are', 'Shape__Len', 'geometry'],
      dtype='object')
Detected zone identifier column: 'TRACTCE'
Sample values: ['810000', '810100', '810400']
Total zones: 44

SUMMARY STATISTICS
Total number of zones: 44
Zone ID column: TRACTCE
CRS: EPSG:4326
Centroid coordinates calculated and boundaries preserved.
Figure(2000x800)
Centroid and boundary data saved to zone.csv

File created: zone.csv

GeoDataFrame now has two geometries:
- 'geometry' column: Point geometries (centroids) from x_coord, y_coord
- 'boundary_geometry' column: Original polygon boundaries

For subsequent steps, you can:
- Use gdf['geometry'] for centroid-based operations
- Use gdf['boundary_geometry'] for boundary-based operations
- Access both from the CSV via

In [7]:
# Preview zones
import pandas as pd
zones = pd.read_csv('zone.csv')
print(f"Extracted {len(zones)} zones\n")
zones[['zone_id', 'x_coord', 'y_coord']].head()

Extracted 44 zones



   zone_id     x_coord    y_coord
0        1 -111.958574  33.329291
1        2 -111.922345  33.329354
2        3 -111.948389  33.327312
3        4 -111.968541  33.418651
4        5 -111.932660  33.418289
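
The `x_coord` and `y_coord` columns previewed above are zone centroids. As an illustration of what a polygon centroid is, here is a minimal sketch of the area-weighted (shoelace) centroid for a simple polygon. It treats coordinates as planar and is only a teaching aid; how gmns-ready computes centroids internally is not shown in this tutorial, and the rectangle below is a made-up tract.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    ring = list(ring)
    if ring[0] == ring[-1]:
        ring = ring[:-1]               # drop the repeated closing vertex
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0      # twice the signed triangle area
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    return cx / (3 * area2), cy / (3 * area2)


# Hypothetical rectangular tract in lon/lat near Tempe.
tract = [(-111.96, 33.32), (-111.94, 33.32), (-111.94, 33.34), (-111.96, 33.34)]
x_coord, y_coord = polygon_centroid(tract)
print(round(x_coord, 6), round(y_coord, 6))
```

For irregular tracts the centroid can fall outside the polygon, which is one reason a boundary-based match (as reported in Step 3) can be more robust than a pure centroid match.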


### Step 3: Build Zone-Connected Network

In [8]:
gr.build_network()

CONNECTOR GENERATION
Configuration:
  - Activity nodes: Always connect to nearest zone
  - Zone nodes: Always connect to nearest network link (no limit)

Processing node data...
  Activity nodes: 135
  Regular nodes: 2603

Updating link node IDs...

STEP 1: Connecting activity nodes to zones...
  Using boundary-based matching
  Building spatial index...
  [OK] Connected 135 activity nodes to zones
  [OK] 27 zones have activity connectors

STEP 2: Connecting zones to physical road network...
  Zones to connect: 17
  [OK] Connected 17/17 zones to network

[OK] Total connector links generated: 304
  Saved: /content/connected_network/connector_links.csv

Merging links...
  [OK] Saved: /content/connected_network/link.csv

Creating updated node file...
  [OK] Saved: /content/connected_network/node.csv

COMPLETION SUMMARY
Total execution time: 2.69 seconds
Output directory: /content/connected_network
Files created:
  - node.csv
  - link.csv
  - activity_node.csv
  - connector_links.csv
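
To make the connector idea concrete, the sketch below links each zone centroid to its nearest network node with a pair of directed connectors. This is a simplification under stated assumptions: the log above says gmns-ready connects zones to the nearest network *link* and uses a spatial index, whereas this example does a brute-force nearest-*node* search. All names and coordinates here are hypothetical.

```python
import math


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def make_connectors(zones, nodes):
    """Link each zone centroid to its nearest node, in both directions.

    zones: {zone_id: (lon, lat)}, nodes: {node_id: (lon, lat)}.
    Returns a list of (from_node_id, to_node_id, length_m) tuples.
    """
    connectors = []
    for zone_id, (zx, zy) in zones.items():
        nearest = min(nodes, key=lambda n: haversine_m(zx, zy, *nodes[n]))
        length = haversine_m(zx, zy, *nodes[nearest])
        connectors.append((zone_id, nearest, length))   # zone -> network
        connectors.append((nearest, zone_id, length))   # network -> zone
    return connectors


# Illustrative data: one zone centroid and two candidate road nodes.
zones_demo = {1: (-111.9586, 33.3293)}
nodes_demo = {45: (-111.9590, 33.3290), 46: (-111.9000, 33.4000)}
connectors_demo = make_connectors(zones_demo, nodes_demo)
for frm, to, length in connectors_demo:
    print(frm, to, round(length, 1))
```

A brute-force search is fine for 44 zones; for larger networks a spatial index avoids comparing every zone with every node.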


In [9]:
# Check results
nodes = pd.read_csv('connected_network/node.csv')
links = pd.read_csv('connected_network/link.csv')
connectors = pd.read_csv('connected_network/connector_links.csv')

print(f" Network Statistics:")
print(f"  Zones: {len(zones)}")
print(f"  Total nodes: {len(nodes)}")
print(f"  Total links: {len(links)}")
print(f"  Connectors: {len(connectors)}")

 Network Statistics:
  Zones: 44
  Total nodes: 2782
  Total links: 4469
  Connectors: 304
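
A quick arithmetic cross-check ties these totals back to the earlier outputs. The interpretation that each zone contributes exactly one centroid node is inferred from the counts (and from the 44 centroid nodes reported in Step 6), not stated by the package.

```python
# Counts reported by the cells above.
original_nodes, zone_count = 2738, 44
original_links, connector_count = 4165, 304

expected_nodes = original_nodes + zone_count        # one centroid node per zone
expected_links = original_links + connector_count   # connectors appended to links

assert expected_nodes == 2782
assert expected_links == 4469
print(expected_nodes, expected_links)
```

If either assertion fails on your own data, some zones were probably left unconnected or some connectors were dropped during the merge.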


### Step 4: Validate Network

In [10]:
gr.validate_network()

GMNS Readiness Validator
Network: connected_network
Validation levels: 1-3

Level 1: Basic Data File Validation
------------------------------------------------------------
  Check 1.1: File existence...
    [OK] node.csv - Found
    [OK] link.csv - Found
  Check 1.2: Required fields and data types...
    [OK] node.csv - All required fields present
    [OK] link.csv - All required fields present
  Check 1.3: Data structure (sorted order)...
    [OK] node.csv - Sorted by node_id
    [OK] link.csv - Sorted by from_node_id
  Check 1.4: Link endpoint validation...
       Reading from: connected_network/node.csv
       Reading from: connected_network/link.csv
       node.csv has 2782 unique node IDs
       link.csv references 2782 unique node IDs
       node.csv range: 1 to 2928
       node.csv first 10: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
       link.csv range: 1 to 2928
       link.csv first 10: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    [OK] All from_node_id values valid
    [OK] All to_node_id val
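
Check 1.3 above verifies that node.csv is sorted by `node_id` and link.csv by `from_node_id`. A minimal sketch of such a check is shown below; the function name and the sample ID lists are illustrative and do not come from gmns-ready.

```python
def first_out_of_order(values):
    """Index of the first element smaller than its predecessor, or None."""
    for i in range(1, len(values)):
        if values[i] < values[i - 1]:
            return i
    return None


node_ids = [1, 2, 3, 45, 46]
from_node_ids = [1, 1, 2, 45, 3]
print(first_out_of_order(node_ids))        # None
print(first_out_of_order(from_node_ids))   # 4
```

Returning the offending index, rather than just a boolean, makes it easier to locate the row that breaks the ordering.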

### Step 5: Validate Accessibility

In [12]:
gr.validate_accessibility()

sample_settings.csv file created successfully!
sample_mode_type.csv file created successfully!
number_of_modes = 1
# of nodes= 2782, largest zone id (# of zones) = 44, First Through Node ID = 45, number of links = 4469
total_base_link_volume = 0.000000
 Memory allocation completes. Starting the minpath calculations.
Accessibility computing for zone 8
Accessibility computing for zone 16
Accessibility computing for zone 24
Accessibility computing for zone 32
Accessibility computing for zone 40
All OD accessibility computing: 0 hours 0 minutes 0 seconds 8 ms
Output written to od_performance.csv
---------- Summary Statistics ----------
Average path distance: 6.06152 miles
Average free flow travel time: 7.43566 minutes
Printing out OD accessibility: 0 hours 0 minutes 0 seconds 3 ms
Zone-based accessibility output written to zone_accessibility.csv
GMNS Accessibility Validator
Network: connected_network

Method: DTALite Python package
DTALite, version 0.8.1

  [OK] DTALite package found (vers

In [13]:
# Load accessibility results
accessibility = pd.read_csv('connected_network/zone_accessibility.csv')

print(f" Accessibility Results:")
print(f"  Avg origins connecting: {accessibility['origin_count'].mean():.0f}")
print(f"  Avg destinations reachable: {accessibility['destination_count'].mean():.0f}")

 Accessibility Results:
  Avg origins connecting: 39
  Avg destinations reachable: 39
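
The `origin_count` and `destination_count` columns appear to describe, for each zone, how many other zones can reach it and how many it can reach. The sketch below illustrates that reading with a breadth-first search on a toy directed graph. It is not how DTALite computes accessibility (DTALite runs shortest-path calculations), and the rule that paths may not pass through other zone centroids is an assumption based on the "First Through Node ID" line in the log.

```python
from collections import deque


def reachable_zones(adjacency, start, zone_ids):
    """Zones reachable from `start` without routing through other centroids."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node != start and node in zone_ids:
            continue  # assumption: centroids are endpoints, not through nodes
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return (seen & zone_ids) - {start}


def accessibility_counts(links, zone_ids):
    """Return {zone: (origin_count, destination_count)} from directed links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
    dest = {z: reachable_zones(adjacency, z, zone_ids) for z in zone_ids}
    return {
        z: (sum(z in dest[o] for o in zone_ids), len(dest[z]))
        for z in zone_ids
    }


# Toy network: zones 1 and 2 are mutually connected, zone 3 is isolated.
toy_links = [(1, 10), (10, 1), (10, 11), (11, 10), (11, 2), (2, 11), (3, 12)]
toy_counts = accessibility_counts(toy_links, {1, 2, 3})
print(toy_counts)
```

Zones with counts of zero, like zone 3 here, are the kind of poorly connected zones that Step 7 targets.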


### Step 6: Validate Assignment Readiness

In [14]:
gr.validate_assignment()

ASSIGNMENT-READY VALIDATOR
Network Directory: connected_network

CHECKING NODE.CSV
----------------------------------------------------------------------
[OK] Total nodes: 2782
[OK] Required columns present: node_id, x_coord, y_coord
[OK] Centroid nodes (zone_id = node_id): 44

CHECKING LINK.CSV
----------------------------------------------------------------------
[OK] Total links: 4469
  Excluding 304 connector links (type 0)
  Analyzing 4165 non-connector links

Parameter: vdf_alpha (dimensionless)
  Overall average: 0.150
  → All non-connector link types use constant value: 0.150

Parameter: vdf_beta (dimensionless)
  Overall average: 4.000
  → All non-connector link types use constant value: 4.000

Parameter: vdf_plf (dimensionless)
  Overall average: 1.000
  → All non-connector link types use constant value: 1.000

Parameter: vdf_fftt (minutes)
  Overall average: 0.138
  By Link Type:
    Type Name                 Count    Average     
    1    Motorway/Freeway     519     
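
The values reported above (`vdf_alpha` = 0.15, `vdf_beta` = 4) are the classic coefficients of the BPR volume-delay function, t = fftt × (1 + α (v/c)^β). The sketch below assumes these fields feed that standard form; it leaves out `vdf_plf`, and the capacity of 1800 is an arbitrary illustrative value rather than something taken from the Tempe network.

```python
def bpr_travel_time(fftt, volume, capacity, alpha=0.15, beta=4.0):
    """BPR volume-delay function: fftt * (1 + alpha * (v / c) ** beta)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return fftt * (1.0 + alpha * (volume / capacity) ** beta)


# Travel time on a link with the average free-flow time reported above.
for ratio in (0.0, 0.5, 1.0):
    t = bpr_travel_time(0.138, volume=ratio * 1800, capacity=1800)
    print(f"v/c = {ratio:.1f} -> {t:.4f} min")
```

With these coefficients a link at capacity is 15% slower than at free flow, which is why missing or zero capacities are a common reason for a network to fail assignment-readiness checks.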

### Step 7: Enhance Connectivity (If Needed)

In [15]:
# Enhance connectivity for poorly connected zones
gr.enhance_connectors()

print("\n" + "="*10)
print(" What just happened:")
print("="*10)
print("✓ Added more connectors for zones with low accessibility")
print("✓ New file created: connected_network/link_updated.csv")
print("\n Next steps:")
print("1. Backup original: Rename link.csv → link_v1.csv")
print("2. Use updated: Rename link_updated.csv → link.csv")
print("3. Re-validate to see improvements")
print("="*10)

CONNECTOR EDITOR - IMPROVING ZONE ACCESSIBILITY
Working directory: /content/connected_network
Threshold: 10.0% of total zones
Search radius: 8000m
Connectors per zone: 10

[1/5] Loading data...
  Loaded 44 zones
  Loaded 4469 links
  Loaded 2782 nodes

[2/5] Identifying poorly connected zones...
  Threshold: 4 zones (10.0%)
  Found 4 poorly connected zones
  Zone IDs: [3, 5, 10, 18]

[3/5] Preparing spatial data...
  Built spatial index
  Found 4/4 zone coordinates

[4/5] Analyzing existing connectors...
  Found 304 existing connectors
  Unique connections: 304

[5/5] Generating new connectors...
  Processing zone 3 (1/4)...
  Processing zone 5 (2/4)...
  Processing zone 10 (3/4)...
  Processing zone 18 (4/4)...
  [OK] Generated 80 new connector links

[6/6] Merging and saving...
  [OK] Saved: /content/connected_network/link_updated.csv

[7/7] Generating report...
  [OK] Saved: /content/connected_network/connector_editor_report.txt

EXECUTION SUMMARY
Problematic zones: 4
New connectors
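
The numbers in this log can be reproduced with a few lines of arithmetic. The sketch below rests on three assumptions that the log suggests but does not confirm: the 10% share is truncated to a whole number of zones, a zone is flagged when its reachable-destination count is below that threshold, and each new connector is written as two directed links. The `counts` dictionary is made-up data chosen to flag the same four zones.

```python
total_zones = 44
threshold_share = 0.10          # "Threshold: 10.0% of total zones"
connectors_per_zone = 10        # "Connectors per zone: 10"

# Assumption: the share is truncated to a whole number of zones (4.4 -> 4).
threshold = int(total_zones * threshold_share)


def poorly_connected(destination_counts, threshold):
    """Zone IDs whose reachable-destination count is below the threshold."""
    return sorted(z for z, n in destination_counts.items() if n < threshold)


# Hypothetical counts: four zones reach almost nothing.
counts = {z: 39 for z in range(1, 45)}
counts.update({3: 0, 5: 1, 10: 0, 18: 2})
flagged = poorly_connected(counts, threshold)

# Assumption: each new connector is written as two directed links.
new_links = len(flagged) * connectors_per_zone * 2
print(threshold, flagged, new_links)
```

Under these assumptions the 80 new links, added to the 4469 existing ones, give the 4549 links reported when accessibility is re-validated.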



In [16]:
# Back up the original link file, then promote the updated version
import os

os.rename('connected_network/link.csv', 'connected_network/link_v1.csv')
print("\n✓ Backed up: link.csv → link_v1.csv")

os.rename('connected_network/link_updated.csv', 'connected_network/link.csv')
print("✓ Updated: link_updated.csv → link.csv")


✓ Backed up: link.csv → link_v1.csv
✓ Updated: link_updated.csv → link.csv


In [17]:
# Re-validate with enhanced connectors
print("\n" + "="*70)
print("Re-validating accessibility with enhanced connectors...")
print("="*70)
gr.validate_accessibility()


Re-validating accessibility with enhanced connectors...
sample_settings.csv file created successfully!
sample_mode_type.csv file created successfully!
number_of_modes = 1
# of nodes= 2782, largest zone id (# of zones) = 44, First Through Node ID = 45, number of links = 4549
total_base_link_volume = 0.000000
 Memory allocation completes. Starting the minpath calculations.
Accessibility computing for zone 8
Accessibility computing for zone 16
Accessibility computing for zone 24
Accessibility computing for zone 32
Accessibility computing for zone 40
All OD accessibility computing: 0 hours 0 minutes 0 seconds 17 ms
Output written to od_performance.csv
---------- Summary Statistics ----------
Average path distance: 5.8999 miles
Average free flow travel time: 7.28768 minutes
Printing out OD accessibility: 0 hours 0 minutes 0 seconds 9 ms
Zone-based accessibility output written to zone_accessibility.csv
GMNS Accessibility Validator
Network: connected_network

Method: DTALite Python package
D
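
To quantify the effect of the extra connectors, the summary statistics from the two accessibility runs can be compared directly. The values below are copied from the outputs above; the dictionary keys are just labels chosen for this example.

```python
before = {"avg_distance_mi": 6.06152, "avg_fftt_min": 7.43566}
after = {"avg_distance_mi": 5.8999, "avg_fftt_min": 7.28768}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


changes = {k: percent_change(before[k], after[k]) for k in before}
for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

Both averages drop by a few percent. Part of that shift may come from newly reachable OD pairs entering the mean rather than from shorter paths alone, so the per-zone counts in zone_accessibility.csv are the more direct measure of improvement.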

## Download Results

In [18]:
# Zip and download all results (excluding sample_data)
import shutil
import os
import tempfile

print(" Creating zip file (excluding sample_data)...")

# Create temporary directory
with tempfile.TemporaryDirectory() as temp_dir:
    # Copy all items except sample_data
    for item in os.listdir('.'):
        if item != 'sample_data' and not item.startswith('.'):
            src = os.path.join('.', item)
            dst = os.path.join(temp_dir, item)

            if os.path.isdir(src):
                shutil.copytree(src, dst)
            else:
                shutil.copy2(src, dst)

    # Create zip from temp directory
    shutil.make_archive('tempe_network_results', 'zip', temp_dir)

from google.colab import files
files.download('tempe_network_results.zip')

print(" Results downloaded as tempe_network_results.zip")
print("\nIncluded in zip:")
print("  - node.csv, link.csv, zone.csv")
print("  - data/ folder (shapefile)")
print("  - connected_network/ folder (all outputs)")

 Creating zip file (excluding sample_data)...


 Results downloaded as tempe_network_results.zip

Included in zip:
  - node.csv, link.csv, zone.csv
  - data/ folder (shapefile)
  - connected_network/ folder (all outputs)
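
Outside Colab, the same archive can be built without copying everything into a temporary directory first. The sketch below is an alternative to the cell above, not a description of it: it writes files straight into the zip with the standard `zipfile` module, and unlike the original cell it skips `sample_data` and hidden entries at every depth rather than only at the top level. The function name `zip_directory` is hypothetical.

```python
import os
import zipfile


def zip_directory(root, archive_path, exclude=("sample_data",)):
    """Zip `root` recursively, skipping hidden entries and excluded names."""
    archive_abs = os.path.abspath(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for folder, dirs, files in os.walk(root):
            # Prune in place so os.walk never descends into skipped folders.
            dirs[:] = [d for d in dirs
                       if d not in exclude and not d.startswith(".")]
            for name in files:
                path = os.path.join(folder, name)
                if name.startswith(".") or os.path.abspath(path) == archive_abs:
                    continue  # skip hidden files and the archive itself
                zf.write(path, os.path.relpath(path, root))
    return archive_path
```

Calling `zip_directory(".", "tempe_network_results.zip")` from the working directory would produce an archive comparable to the one downloaded above; the check against `archive_abs` keeps the zip from trying to include itself.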


## Clean up all generated files and folders

In [19]:
# Run this cell if you want to start fresh with new data

import shutil
import os

folders_to_remove = ['data', 'connected_network', 'osm_network_connectivity_check']
files_to_remove = ['node.csv', 'link.csv', 'zone.csv', 'tempe_network_results.zip']

print(" Cleaning up generated files and folders...")
print("="*70)

# Remove folders
for folder in folders_to_remove:
    if os.path.exists(folder):
        shutil.rmtree(folder)
        print(f"✓ Removed folder: {folder}/")

# Remove files
for file in files_to_remove:
    if os.path.exists(file):
        os.remove(file)
        print(f"✓ Removed file: {file}")

print("="*70)
print(" Cleanup complete! Ready for new data.")

 Cleaning up generated files and folders...
✓ Removed folder: data/
✓ Removed folder: connected_network/
✓ Removed file: node.csv
✓ Removed file: link.csv
✓ Removed file: zone.csv
✓ Removed file: tempe_network_results.zip
 Cleanup complete! Ready for new data.


## Summary

**🎉 Complete!** You've successfully:

1. ✅ Validated spatial alignment
2. ✅ Extracted zones from shapefile
3. ✅ Built zone-connected network
4. ✅ Validated network topology
5. ✅ Analyzed zone accessibility
6. ✅ Verified assignment readiness

**Next steps:**
- Use with DTALite/TAPLite for traffic assignment
- Integrate with travel demand models
- Try with your own data!

**Learn more:**
- GitHub: https://github.com/hhhhhenanZ/gmns_ready
- PyPI: https://pypi.org/project/gmns-ready/
- Paper: *Coming soon*