# Notebook 2: Network Shade Calculation
## Shade-Optimized Pedestrian Routing to Transit

**Author:** Kavana Raju  
**Course:** MUSA 5500 - Geospatial Data Science with Python  
**Date:** December 2025

---

This notebook calculates shade scores for all street segments:
1. Calculate solar position for 8 temporal scenarios
2. Model building shadows using geometric methods
3. Extract tree canopy coverage (from LiDAR)
4. Combine building + tree shade
5. Assign shade scores to all network edges

## Setup & Imports

In [1]:
import osmnx as ox
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
from shapely.geometry import Point, LineString, Polygon, box
from shapely.ops import unary_union
import warnings
warnings.filterwarnings('ignore')

# Create output directories
for d in ['outputs/figures', 'outputs/maps']:
    Path(d).mkdir(parents=True, exist_ok=True)

print("✓ Imports successful")

✓ Imports successful


## 1. Load Data from Notebook 1

In [2]:
print("Loading processed data from Notebook 1...\n")

# Load street network
edges_gdf = gpd.read_file('data/processed/network_edges.geojson')
nodes_gdf = gpd.read_file('data/processed/network_nodes.geojson')
print(f"✓ Network loaded: {len(edges_gdf):,} edges, {len(nodes_gdf):,} nodes")

# Load buildings with heights
buildings = gpd.read_file('data/processed/buildings_with_heights.geojson')
print(f"✓ Buildings loaded: {len(buildings):,} buildings")

# Check which height column exists (shadow calculations below expect feet)
if 'height_ft' in buildings.columns:
    height_unit = 'feet'
elif 'height_m' in buildings.columns:
    # Convert meters to feet for consistency
    buildings['height_ft'] = buildings['height_m'] * 3.28084
    height_unit = 'feet (converted)'
else:
    raise ValueError("No height column found in buildings data!")

# Either branch leaves heights in 'height_ft'
height_col = 'height_ft'

print(f"  Using height column: {height_col} ({height_unit})")
print(f"  Mean height: {buildings[height_col].mean():.1f} ft")

# Load SEPTA stops
septa_stops = gpd.read_file('data/processed/septa_stops.geojson')
print(f"✓ Transit stops loaded: {len(septa_stops)} stops")

# Load study area
study_area = gpd.read_file('data/processed/study_area.geojson')
print("✓ Study area loaded")

print("\n✓ All data loaded successfully")

Loading processed data from Notebook 1...

✓ Network loaded: 23,486 edges, 7,343 nodes
✓ Buildings loaded: 16,632 buildings
  Using height column: height_ft (feet)
  Mean height: 32.4 ft
✓ Transit stops loaded: 60 stops
✓ Study area loaded

✓ All data loaded successfully


## 2. Define Temporal Scenarios

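I analyzed shade at representative dates and times of day across the seasons:
- **Summer:** June 21 (around the summer solstice, longest days)
- **Winter:** December 21 (winter solstice, shortest day)
- **Spring:** March 21 (near the spring equinox)
- **Fall:** September 21 (near the fall equinox)

Times of day (local clock time):
- **Morning:** 9:00 AM
- **Midday:** 12:00 PM
- **Evening:** 6:00 PM

Summer and winter are evaluated at all three times; spring and fall at midday only, for 8 scenarios in total.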

In [3]:
from datetime import datetime
import pytz

# Define scenarios
scenarios = {
    'summer_morning': datetime(2024, 6, 21, 9, 0),
    'summer_midday': datetime(2024, 6, 21, 12, 0),
    'summer_evening': datetime(2024, 6, 21, 18, 0),
    'winter_morning': datetime(2024, 12, 21, 9, 0),
    'winter_midday': datetime(2024, 12, 21, 12, 0),
    'winter_evening': datetime(2024, 12, 21, 18, 0),
    'spring_midday': datetime(2024, 3, 21, 12, 0),
    'fall_midday': datetime(2024, 9, 21, 12, 0),
}

# Philadelphia location
latitude = 39.9526
longitude = -75.1652
timezone = pytz.timezone('America/New_York')

print("Temporal scenarios defined:")
for name, dt in scenarios.items():
    print(f"  • {name}: {dt.strftime('%B %d, %Y at %I:%M %p')}")

# Print the longitude magnitude: the "W" suffix already encodes the sign
print(f"\nLocation: Philadelphia ({latitude:.4f}°N, {abs(longitude):.4f}°W)")

Temporal scenarios defined:
  • summer_morning: June 21, 2024 at 09:00 AM
  • summer_midday: June 21, 2024 at 12:00 PM
  • summer_evening: June 21, 2024 at 06:00 PM
  • winter_morning: December 21, 2024 at 09:00 AM
  • winter_midday: December 21, 2024 at 12:00 PM
  • winter_evening: December 21, 2024 at 06:00 PM
  • spring_midday: March 21, 2024 at 12:00 PM
  • fall_midday: September 21, 2024 at 12:00 PM

Location: Philadelphia (39.9526°N, 75.1652°W)
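
The scenario times above are local clock times, so "midday" is not necessarily the moment the sun is due south. The following is a minimal sketch of that gap, not part of the notebook's pipeline. It uses fixed UTC offsets (an assumption made for illustration; the notebook itself localizes with `pytz`) and ignores the equation of time.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets used only for this illustration (the notebook uses pytz)
EDT = timezone(timedelta(hours=-4), "EDT")  # daylight time, in effect on June 21
EST = timezone(timedelta(hours=-5), "EST")  # standard time, in effect on December 21

LONGITUDE = -75.1652  # Philadelphia, degrees east (negative = west)


def utc_hour(naive_local, tz):
    """Return the UTC clock hour (as a float) for a naive local datetime."""
    utc = naive_local.replace(tzinfo=tz).astimezone(timezone.utc)
    return utc.hour + utc.minute / 60


# Mean solar noon in UTC hours: the sun covers 15 degrees of longitude per hour.
# This ignores the equation of time, a seasonal correction of several minutes.
mean_solar_noon_utc = 12 - LONGITUDE / 15

summer_midday_utc = utc_hour(datetime(2024, 6, 21, 12, 0), EDT)
winter_midday_utc = utc_hour(datetime(2024, 12, 21, 12, 0), EST)

print(f"Mean solar noon:    {mean_solar_noon_utc:.2f} h UTC")
print(f"Summer 12:00 local: {summer_midday_utc:.2f} h UTC")
print(f"Winter 12:00 local: {winter_midday_utc:.2f} h UTC")
```

Under these assumptions, the summer midday scenario falls about an hour before solar noon, while the winter midday scenario falls very close to it.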


## 3. Calculate Solar Position for Each Scenario

In [4]:
import pvlib

print("Calculating solar position for each scenario...\n")

solar_positions = {}

for scenario_name, dt in scenarios.items():
    # Localize datetime
    dt_local = timezone.localize(dt)
    
    # Calculate solar position
    solar_pos = pvlib.solarposition.get_solarposition(
        dt_local,
        latitude,
        longitude
    )
    
    altitude = solar_pos['apparent_elevation'].values[0]
    azimuth = solar_pos['azimuth'].values[0]
    
    solar_positions[scenario_name] = {
        'altitude': altitude,
        'azimuth': azimuth,
        'datetime': dt
    }
    
    print(f"{scenario_name:20s} - Altitude: {altitude:6.2f}° | Azimuth: {azimuth:6.2f}°")

# Save solar positions
solar_df = pd.DataFrame(solar_positions).T
solar_df.to_csv('data/processed/solar_positions.csv')
print("\n✓ Solar positions calculated and saved")

Calculating solar position for each scenario...

summer_morning       - Altitude:  36.90° | Azimuth:  88.85°
summer_midday        - Altitude:  68.86° | Azimuth: 136.67°
summer_evening       - Altitude:  26.48° | Azimuth: 279.37°
winter_morning       - Altitude:  14.19° | Azimuth: 138.24°
winter_midday        - Altitude:  26.64° | Azimuth: 180.24°
winter_evening       - Altitude: -14.95° | Azimuth: 251.74°
spring_midday        - Altitude:  47.76° | Azimuth: 154.38°
fall_midday          - Altitude:  48.56° | Azimuth: 159.55°

✓ Solar positions calculated and saved
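
As a rough sanity check on these values, the sun's altitude at solar noon can be approximated as 90° minus latitude plus solar declination. The sketch below is an illustration rather than a replacement for `pvlib`. It assumes a declination of about ±23.44° at the solstices and 0° at the equinoxes, and the helper name `noon_altitude` is hypothetical.

```python
LATITUDE = 39.9526   # Philadelphia, degrees north
OBLIQUITY = 23.44    # approximate solar declination magnitude at the solstices, degrees


def noon_altitude(latitude_deg, declination_deg):
    """Approximate solar altitude (degrees) when the sun crosses the local meridian.

    Valid for sites where the noon sun is south of the zenith, as in Philadelphia.
    """
    return 90.0 - latitude_deg + declination_deg


winter_noon = noon_altitude(LATITUDE, -OBLIQUITY)
summer_noon = noon_altitude(LATITUDE, +OBLIQUITY)
equinox_noon = noon_altitude(LATITUDE, 0.0)

print(f"Winter solstice noon altitude: {winter_noon:.2f} deg")
print(f"Summer solstice noon altitude: {summer_noon:.2f} deg")
print(f"Equinox noon altitude:         {equinox_noon:.2f} deg")
```

The winter estimate lands close to the `winter_midday` altitude, whose azimuth of about 180° shows it is near solar noon. The summer, spring, and fall midday scenarios have azimuths below 180°, so they precede solar noon and their altitudes sit below these upper bounds.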


## 4. Calculate Building Shadows for Each Scenario

In [5]:
# Project data to PA State Plane (feet) for shadow calculations
CRS_PROJECTED = 'EPSG:2272'

buildings_proj = buildings.to_crs(CRS_PROJECTED)
edges_proj = edges_gdf.to_crs(CRS_PROJECTED)

print(f"Data projected to {CRS_PROJECTED}")
print(f"  Buildings: {len(buildings_proj):,}")
print(f"  Street edges: {len(edges_proj):,}")

Data projected to EPSG:2272
  Buildings: 16,632
  Street edges: 23,486


In [6]:
def calculate_building_shadow(building_geom, height_ft, altitude_deg, azimuth_deg):
    """
    Calculate shadow polygon for a building.
    
    Parameters:
    - building_geom: Building footprint geometry
    - height_ft: Building height in feet
    - altitude_deg: Solar altitude angle in degrees
    - azimuth_deg: Solar azimuth angle in degrees (0=North, 90=East)
    
    Returns:
    - Shadow polygon
    """
    # If sun is below horizon or building has no height, no shadow
    if altitude_deg <= 0 or height_ft <= 0:
        return None
    
    # Calculate shadow length
    altitude_rad = np.radians(altitude_deg)
    shadow_length = height_ft / np.tan(altitude_rad)
    
    # Calculate shadow direction (opposite of sun)
    shadow_azimuth = (azimuth_deg + 180) % 360
    shadow_azimuth_rad = np.radians(shadow_azimuth)
    
    # Calculate shadow offset
    dx = shadow_length * np.sin(shadow_azimuth_rad)
    dy = shadow_length * np.cos(shadow_azimuth_rad)
    
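    # Create shadow polygon by translating building footprint
    try:
        from shapely.affinity import translate
        shadow = translate(building_geom, xoff=dx, yoff=dy)
        
        # Union with building footprint for full shadow
        full_shadow = unary_union([building_geom, shadow])
        
        # The convex hull fills the area swept between the footprint and its
        # translated copy. For non-convex footprints (e.g. L- or U-shaped)
        # this overestimates the shadow.
        return full_shadow.convex_hull if full_shadow.is_valid else None
    except Exception:
        # Invalid or degenerate geometry: treat as casting no shadow
        return None

print("✓ Shadow calculation function defined")

✓ Shadow calculation function defined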
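
The trigonometry inside `calculate_building_shadow` can be checked in isolation. The sketch below reproduces only the offset step with the standard library; `shadow_offset` is a hypothetical helper name, and the 30 ft building is an illustrative value rather than data from the notebook.

```python
import math


def shadow_offset(height_ft, altitude_deg, azimuth_deg):
    """Return (dx, dy, length) of the shadow tip relative to the base, in feet.

    Azimuth is measured clockwise from north (0 = North, 90 = East).
    Returns None when the sun is at or below the horizon or the height is not positive.
    """
    if altitude_deg <= 0 or height_ft <= 0:
        return None
    length = height_ft / math.tan(math.radians(altitude_deg))
    away_from_sun = math.radians((azimuth_deg + 180) % 360)
    dx = length * math.sin(away_from_sun)  # east component
    dy = length * math.cos(away_from_sun)  # north component
    return dx, dy, length


# Sun due south at 45 degrees: the shadow points due north and equals the height
dx, dy, length = shadow_offset(30.0, 45.0, 180.0)
print(f"dx={dx:.2f} ft, dy={dy:.2f} ft, length={length:.2f} ft")
```

Because shadow length grows as 1/tan(altitude), low-sun scenarios such as `winter_morning` produce much longer shadows than `summer_midday` for the same building.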


In [7]:
print("\nCalculating building shadows for all scenarios...\n")
print("This will take 30-45 minutes for ~16k buildings × 8 scenarios")
print("Please be patient...\n")

# Store shadow geometries for each scenario
building_shadows = {}

for scenario_name, solar_data in solar_positions.items():
    print(f"Processing: {scenario_name}...")
    
    altitude = solar_data['altitude']
    azimuth = solar_data['azimuth']
    
    shadows = []
    
    # enumerate() supplies a positional counter, so the progress message does
    # not depend on buildings_proj having a default 0..n-1 index
    for i, (_, building) in enumerate(buildings_proj.iterrows(), start=1):
        shadow = calculate_building_shadow(
            building.geometry,
            building[height_col],
            altitude,
            azimuth
        )
        
        if shadow is not None:
            shadows.append(shadow)
        
        # Progress indicator
        if i % 2000 == 0:
            print(f"  {i:,} / {len(buildings_proj):,} buildings processed")
    
    # Create GeoDataFrame of shadows
    shadows_gdf = gpd.GeoDataFrame(
        geometry=shadows,
        crs=CRS_PROJECTED
    )
    
    building_shadows[scenario_name] = shadows_gdf
    
    print(f"  ✓ {len(shadows):,} shadows calculated\n")

print("✓ All building shadows calculated")


Calculating building shadows for all scenarios...

This will take 30-45 minutes for ~16k buildings × 8 scenarios
Please be patient...

Processing: summer_morning...
  2,000 / 16,632 buildings processed
  4,000 / 16,632 buildings processed
  6,000 / 16,632 buildings processed
  8,000 / 16,632 buildings processed
  10,000 / 16,632 buildings processed
  12,000 / 16,632 buildings processed
  14,000 / 16,632 buildings processed
  16,000 / 16,632 buildings processed
  ✓ 16,632 shadows calculated

Processing: summer_midday...
  2,000 / 16,632 buildings processed
  4,000 / 16,632 buildings processed
  6,000 / 16,632 buildings processed
  8,000 / 16,632 buildings processed
  10,000 / 16,632 buildings processed
  12,000 / 16,632 buildings processed
  14,000 / 16,632 buildings processed
  16,000 / 16,632 buildings processed
  ✓ 16,632 shadows calculated

Processing: summer_evening...
  2,000 / 16,632 buildings processed
  4,000 / 16,632 buildings processed
  6,000 / 16,632 buildings process

## 5. Extract Tree Canopy Coverage

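Tree shadows are projected from a LiDAR-derived tree height raster, using the same sun altitude and azimuth as the building shadows.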

In [9]:
# ============================================================================
# STEP 1: LOAD TREE HEIGHT RASTER
# ============================================================================

import rasterio
from rasterio.mask import mask as raster_mask
from shapely.geometry import box, mapping
from shapely.ops import unary_union
from shapely.affinity import translate
import time

print("\n" + "="*70)
print("OPTIMIZED TREE SHADOW CALCULATION (EDGE-BY-EDGE APPROACH)")
print("="*70)

tree_height_raster_path = Path('data/processed/tree_heights_from_lidar.tif')

if not tree_height_raster_path.exists():
    print("\n⚠ Tree height raster not found!")
    raise FileNotFoundError("Need tree_heights_from_lidar.tif")

print("\n✓ LiDAR tree HEIGHT raster found")
print("  Using optimized edge-by-edge approach (processes shadows on-the-fly)")
print("  Time estimate: ~2 hours total for all 8 scenarios\n")

# Load tree height raster ONCE
with rasterio.open(tree_height_raster_path) as src:
    tree_height_data = src.read(1)
    tree_transform = src.transform
    tree_crs = src.crs
    pixel_size = tree_transform[0]

print(f"Tree height raster loaded:")
print(f"  Shape: {tree_height_data.shape}")
print(f"  Mean height: {tree_height_data[tree_height_data > 0].mean():.1f} ft")
print(f"  Max height: {tree_height_data.max():.1f} ft")
print(f"  Pixel size: {pixel_size:.1f} ft")
print(f"  CRS: {tree_crs}")

# Open raster for reading (keep open during processing)
tree_raster = rasterio.open(tree_height_raster_path)

print("\n✓ Step 1 complete - raster loaded and ready")


OPTIMIZED TREE SHADOW CALCULATION (EDGE-BY-EDGE APPROACH)

✓ LiDAR tree HEIGHT raster found
  Using optimized edge-by-edge approach (processes shadows on-the-fly)
  Time estimate: ~2 hours total for all 8 scenarios

Tree height raster loaded:
  Shape: (2563, 4741)
  Mean height: 116.4 ft
  Max height: 208.2 ft
  Pixel size: 3.0 ft
  CRS: EPSG:2272

✓ Step 1 complete - raster loaded and ready
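
The tree shadow step later converts raster rows and columns into map coordinates with `rasterio.transform.xy`, which returns pixel centers by default. For a north-up raster with square pixels, the arithmetic reduces to the sketch below. The upper-left corner coordinates are hypothetical placeholders; only the 3 ft pixel size comes from the output above.

```python
def pixel_center(origin_x, origin_y, pixel_size, row, col):
    """Map (row, col) to the (x, y) of the pixel center for a north-up raster.

    origin_x, origin_y: coordinates of the raster's upper-left corner.
    pixel_size: cell width and height in map units (square pixels assumed).
    """
    x = origin_x + (col + 0.5) * pixel_size
    y = origin_y - (row + 0.5) * pixel_size  # rows increase southward
    return x, y


# Hypothetical upper-left corner in EPSG:2272 feet; 3 ft pixels as reported above
x, y = pixel_center(2_690_000.0, 240_000.0, 3.0, row=0, col=0)
print(x, y)

# The pixel footprint extends half a pixel in each direction from the center
half = 3.0 / 2
bounds = (x - half, y - half, x + half, y + half)
print(bounds)
```

This is why the pixel box in the tree shadow loop is built as the center plus or minus `pixel_size/2`.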


In [10]:
# ============================================================================
# STEP 2: INITIALIZE PROGRESS TRACKING
# ============================================================================

print("\n" + "="*70)
print("INITIALIZING PROGRESS TRACKING")
print("="*70)

# Create dictionary to track completed scenarios
completed_scenarios = {}
scenario_times = {}

# Create backup directory
import os
os.makedirs('data/processed/checkpoints', exist_ok=True)

# Check if we have any previous progress
checkpoint_file = Path('data/processed/checkpoints/shade_progress.pkl')
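if checkpoint_file.exists():
    import pickle
    with open(checkpoint_file, 'rb') as f:
        checkpoint_data = pickle.load(f)
        completed_scenarios = checkpoint_data.get('completed', {})
        # Restore timings too, so the total reported at the end covers earlier runs
        scenario_times = checkpoint_data.get('times', {})
        print(f"\n✓ Found checkpoint with {len(completed_scenarios)} completed scenarios")
        for scenario in completed_scenarios.keys():
            print(f"  - {scenario}")
    # Note: this restores only the progress tracker. Shade columns for completed
    # scenarios are saved in the edges_checkpoint_N.geojson files and must be
    # reloaded into edges_proj before resuming.
else:
    print("\n✓ Starting fresh (no previous checkpoint)")

print(f"\nScenarios to process: {len(scenarios)}")
print(f"Scenarios remaining: {len(scenarios) - len(completed_scenarios)}")

print("\n✓ Step 2 complete - tracking initialized")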


INITIALIZING PROGRESS TRACKING

✓ Starting fresh (no previous checkpoint)

Scenarios to process: 8
Scenarios remaining: 8

✓ Step 2 complete - tracking initialized
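
For a run that takes hours, the checkpoint file should not be left half-written if the kernel dies mid-save. One possible pattern, sketched below, is to write to a temporary file and rename it over the target. The helper names `save_checkpoint` and `load_checkpoint` are hypothetical and are not used elsewhere in this notebook.

```python
import os
import pickle
import tempfile
from pathlib import Path


def save_checkpoint(path, completed, times):
    """Write progress to a temporary file, then rename it over the target."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        pickle.dump({"completed": completed, "times": times}, f)
    os.replace(tmp_name, path)  # rename within one directory replaces the old file in one step


def load_checkpoint(path):
    """Return (completed, times); both are empty dicts if no checkpoint exists."""
    path = Path(path)
    if not path.exists():
        return {}, {}
    with open(path, "rb") as f:
        data = pickle.load(f)
    return data.get("completed", {}), data.get("times", {})


# Round trip in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    ckpt = Path(tmp) / "checkpoints" / "shade_progress.pkl"
    save_checkpoint(ckpt, {"summer_morning": "completed"}, {"summer_morning": 912.4})
    completed, times = load_checkpoint(ckpt)
    missing = load_checkpoint(Path(tmp) / "does_not_exist.pkl")

print(completed, times, missing)
```

Restoring both dictionaries keeps the skip logic and the time estimates consistent across restarts.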


In [None]:
# ============================================================================
# STEP 3: PROCESS SCENARIOS ONE AT A TIME
# ============================================================================

print("\n" + "="*70)
print("PROCESSING SCENARIOS (WITH AUTO-SAVE AFTER EACH)")
print("="*70)

for scenario_name, solar_data in solar_positions.items():
    
    # Skip if already completed
    if scenario_name in completed_scenarios:
        print(f"\n✓ {scenario_name} - ALREADY COMPLETED (skipping)")
        continue
    
    print(f"\n{'='*70}")
    print(f"PROCESSING: {scenario_name}")
    print(f"{'='*70}")
    
    scenario_start_time = time.time()
    
    altitude = solar_data['altitude']
    azimuth = solar_data['azimuth']
    
    # Skip if sun is below horizon
    if altitude <= 0:
        print(f"  ⚠ Sun below horizon, skipping")
        
        edges_proj[f'building_shadow_{scenario_name}'] = [0] * len(edges_proj)
        edges_proj[f'tree_shadow_{scenario_name}'] = [0] * len(edges_proj)
        edges_proj[f'shade_{scenario_name}'] = [0] * len(edges_proj)
        
        completed_scenarios[scenario_name] = 'below_horizon'
        scenario_times[scenario_name] = 0
        
        # Save checkpoint
        import pickle
        with open('data/processed/checkpoints/shade_progress.pkl', 'wb') as f:
            pickle.dump({
                'completed': completed_scenarios,
                'times': scenario_times
            }, f)
        
        print(f"  ✓ Checkpoint saved\n")
        continue
    
    # Calculate shadow parameters
    altitude_rad = np.radians(altitude)
    shadow_azimuth = (azimuth + 180) % 360
    shadow_azimuth_rad = np.radians(shadow_azimuth)
    
    print(f"  Sun altitude: {altitude:.1f}° | Shadow direction: {shadow_azimuth:.1f}°")
    print(f"  Estimated time: 15-20 minutes")
    print(f"  Processing {len(edges_proj):,} edges...\n")
    
    # Get building shadows for this scenario
    building_shadows_gdf = building_shadows[scenario_name]
    building_shadow_union = unary_union(building_shadows_gdf.geometry)
    
    building_shade_scores = []
    tree_shade_scores = []
    combined_shade_scores = []
    
    # Process each edge
    for idx, edge in edges_proj.iterrows():
        try:
            edge_geom = edge.geometry
            edge_length = edge_geom.length
            
            # ============================================================
            # BUILDING SHADOW COVERAGE
            # ============================================================
            if building_shadow_union.intersects(edge_geom):
                building_intersection = building_shadow_union.intersection(edge_geom)
                building_coverage = building_intersection.length / edge_length
            else:
                building_coverage = 0
            building_coverage = min(building_coverage, 1.0)
            
            # ============================================================
            # TREE SHADOW COVERAGE (ON-THE-FLY CALCULATION)
            # ============================================================
            
            # Create buffer around edge to capture nearby trees
            # Buffer size based on potential shadow length
            max_shadow_length = 200 / np.tan(max(altitude_rad, 0.1))
            buffer_dist = min(max_shadow_length, 500)  # Cap at 500ft
            
            edge_buffer = edge_geom.buffer(buffer_dist)
            
            # Extract tree heights in buffered area
            try:
                geom = [mapping(edge_buffer)]
                out_image, out_transform = raster_mask(
                    tree_raster,
                    geom,
                    crop=True,
                    all_touched=True,
                    nodata=0
                )
                
                tree_heights_subset = out_image[0]
                
                # Find tree pixels (height > 0)
                tree_pixels = np.argwhere(tree_heights_subset > 0)
                
                if len(tree_pixels) > 0:
                    # Create shadow polygons for tree pixels in this area
                    local_tree_shadows = []
                    
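                    # Sample pixels if too many (for performance).
                    # Note: sampling is unseeded, so results vary between runs, and
                    # dropping shadow-casting pixels can only lower the measured coverage.
                    if len(tree_pixels) > 2000:
                        indices = np.random.choice(len(tree_pixels), 2000, replace=False)
                        tree_pixels = tree_pixels[indices]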
                    
                    for pixel_row, pixel_col in tree_pixels:
                        tree_height = tree_heights_subset[pixel_row, pixel_col]
                        
                        if tree_height <= 0:
                            continue
                        
                        # Get pixel coordinates
                        px, py = rasterio.transform.xy(
                            out_transform,
                            pixel_row,
                            pixel_col
                        )
                        
                        # Create pixel box
                        pixel_box = box(
                            px - pixel_size/2,
                            py - pixel_size/2,
                            px + pixel_size/2,
                            py + pixel_size/2
                        )
                        
                        # Calculate shadow
                        shadow_length = tree_height / np.tan(altitude_rad)
                        dx = shadow_length * np.sin(shadow_azimuth_rad)
                        dy = shadow_length * np.cos(shadow_azimuth_rad)
                        
                        shadow = translate(pixel_box, xoff=dx, yoff=dy)
                        
                        try:
                            full_shadow = unary_union([pixel_box, shadow])
                            if full_shadow.is_valid:
                                local_tree_shadows.append(full_shadow)
                        except Exception:
                            # Skip pixels whose shadow geometry cannot be built
                            pass
                    
                    # Union local tree shadows
                    if len(local_tree_shadows) > 0:
                        try:
                            local_tree_shadow_union = unary_union(local_tree_shadows)
                            
                            # Calculate intersection with edge
                            if local_tree_shadow_union.intersects(edge_geom):
                                tree_intersection = local_tree_shadow_union.intersection(edge_geom)
                                tree_coverage = tree_intersection.length / edge_length
                            else:
                                tree_coverage = 0
                        except Exception:
                            tree_coverage = 0
                    else:
                        tree_coverage = 0
                else:
                    tree_coverage = 0
            
            except Exception as e:
                tree_coverage = 0
            
            tree_coverage = min(tree_coverage, 1.0)
            
            # ============================================================
            # COMBINED SHADE
            # ============================================================
            combined_shade = (0.6 * building_coverage) + (0.4 * tree_coverage)
            
            building_shade_scores.append(building_coverage)
            tree_shade_scores.append(tree_coverage)
            combined_shade_scores.append(combined_shade)
            
        except Exception as e:
            building_shade_scores.append(0)
            tree_shade_scores.append(0)
            combined_shade_scores.append(0)
        
        # Progress indicator (one score is appended per edge, so the list length
        # is a positional count that works for any index)
        done = len(combined_shade_scores)
        if done % 500 == 0:
            elapsed = time.time() - scenario_start_time
            rate = done / elapsed if elapsed > 0 else 0
            remaining = (len(edges_proj) - done) / rate if rate > 0 else 0
            print(f"    {done:,} / {len(edges_proj):,} edges ({100*done/len(edges_proj):.1f}%) | "
                  f"ETA: {remaining/60:.1f} min")
    
    # Store all shade columns for this scenario
    edges_proj[f'building_shadow_{scenario_name}'] = building_shade_scores
    edges_proj[f'tree_shadow_{scenario_name}'] = tree_shade_scores
    edges_proj[f'shade_{scenario_name}'] = combined_shade_scores
    
    # Calculate statistics
    mean_building = np.mean(building_shade_scores)
    mean_tree = np.mean(tree_shade_scores)
    mean_combined = np.mean(combined_shade_scores)
    
    scenario_elapsed = time.time() - scenario_start_time
    scenario_times[scenario_name] = scenario_elapsed
    
    print(f"\n  ✓ Scenario complete in {scenario_elapsed/60:.1f} minutes")
    print(f"  Building: {mean_building:.3f} | Tree: {mean_tree:.3f} | Combined: {mean_combined:.3f}")
    print(f"  Max combined: {max(combined_shade_scores):.3f}")
    print(f"  Segments >50%: {sum(1 for s in combined_shade_scores if s > 0.5):,}")
    
    # Mark as completed
    completed_scenarios[scenario_name] = 'completed'
    
    # ========================================================================
    # SAVE CHECKPOINT AFTER EACH SCENARIO
    # ========================================================================
    print(f"\n  💾 SAVING CHECKPOINT...")
    
    import pickle
    
    # Save progress tracker
    with open('data/processed/checkpoints/shade_progress.pkl', 'wb') as f:
        pickle.dump({
            'completed': completed_scenarios,
            'times': scenario_times
        }, f)
    
    # Save edges with current progress
    edges_proj.to_file(
        f'data/processed/checkpoints/edges_checkpoint_{len(completed_scenarios)}.geojson',
        driver='GeoJSON'
    )
    
    print(f"  ✓ Checkpoint saved ({len(completed_scenarios)}/{len(scenarios)} scenarios complete)")
    print(f"  ✓ Progress saved to: checkpoints/edges_checkpoint_{len(completed_scenarios)}.geojson")
    
    # Estimate remaining time
    if len(scenario_times) > 0:
        avg_time = np.mean(list(scenario_times.values()))
        remaining_scenarios = len(scenarios) - len(completed_scenarios)
        estimated_remaining = (avg_time * remaining_scenarios) / 60
        print(f"  📊 Estimated remaining time: {estimated_remaining:.1f} minutes\n")

# Close raster
tree_raster.close()

print("\n" + "="*70)
print("✓ ALL SCENARIOS COMPLETE!")
print("="*70)
print(f"\nTotal scenarios processed: {len(completed_scenarios)}")
print(f"Total time: {sum(scenario_times.values())/60:.1f} minutes")


PROCESSING SCENARIOS (WITH AUTO-SAVE AFTER EACH)

PROCESSING: summer_morning
  Sun altitude: 36.9° | Shadow direction: 268.8°
  Estimated time: 15-20 minutes
  Processing 23,486 edges...
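
Two pieces of arithmetic in the loop above are easy to check by hand: the 0.6/0.4 weighting of building and tree coverage, and the search buffer used to find trees that could shade an edge. The sketch below restates both with the standard library. The function names are hypothetical, and the extra lower clamp at 0 is an addition for safety that the notebook code does not have.

```python
import math

BUILDING_WEIGHT = 0.6
TREE_WEIGHT = 0.4


def combined_shade(building_coverage, tree_coverage):
    """Weighted shade score in [0, 1] from two coverage fractions."""
    b = min(max(building_coverage, 0.0), 1.0)
    t = min(max(tree_coverage, 0.0), 1.0)
    return BUILDING_WEIGHT * b + TREE_WEIGHT * t


def tree_search_buffer(altitude_deg, max_tree_ft=200.0, cap_ft=500.0):
    """Search radius (ft) around an edge for trees that could shade it."""
    # The 0.1 rad floor keeps the tangent away from zero at very low sun angles
    altitude_rad = max(math.radians(altitude_deg), 0.1)
    return min(max_tree_ft / math.tan(altitude_rad), cap_ft)


print(combined_shade(1.0, 0.0))             # edge fully covered by building shadow only
print(combined_shade(0.0, 1.0))             # edge fully covered by tree shadow only
print(round(tree_search_buffer(36.90), 1))  # summer_morning altitude
print(round(tree_search_buffer(14.19), 1))  # winter_morning altitude, limited by the cap
```

One consequence of these weights is that an edge shaded only by trees tops out at 0.4, so it can never count toward the "Segments >50%" statistic, while an edge fully shaded by buildings alone scores 0.6.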



In [None]:
# ============================================================================
# STEP 4: FINAL SAVE
# ============================================================================

print("\n" + "="*70)
print("FINAL SAVE")
print("="*70)

# Convert to WGS84
print("\nConverting to WGS84...")
edges_final = edges_proj.to_crs('EPSG:4326')

# Save final network
output_path = 'data/processed/network_edges_with_shade.geojson'
print(f"Saving to: {output_path}")
edges_final.to_file(output_path, driver='GeoJSON')

# Get file size
import os
file_size_mb = os.path.getsize(output_path) / (1024 * 1024)

print(f"\n✓ Network saved!")
print(f"  File: {output_path}")
print(f"  Size: {file_size_mb:.1f} MB")
print(f"  Columns: {len(edges_final.columns)}")

# Count shade columns
shade_cols = [c for c in edges_final.columns if c.startswith('shade_') and 'shadow' not in c]
print(f"  Shade scenarios: {len(shade_cols)}")

# Clean up checkpoints
print("\n💾 Cleaning up checkpoints...")
if Path('data/processed/checkpoints').exists():
    # Keep progress file, remove edge checkpoints
    for f in Path('data/processed/checkpoints').glob('edges_checkpoint_*.geojson'):
        f.unlink()
        print(f"  Removed: {f.name}")

print("\n✓ Step 4 complete - final save done")

In [None]:
# ============================================================================
# STEP 5: SUMMARY STATISTICS
# ============================================================================

print("\n" + "="*70)
print("SHADE ANALYSIS SUMMARY")
print("="*70)

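print(f"\nNetwork Statistics:")
print(f"  Total edges: {len(edges_final):,}")
# Measure length in the projected CRS (feet); edges_final is in WGS84,
# where .length would be in degrees
print(f"  Total length: {edges_proj.geometry.length.sum()/5280:.1f} miles")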

print(f"\nShade Score Statistics:")
print(f"{'Scenario':<20} {'Mean':<8} {'Min':<8} {'Max':<8} {'High Shade (>0.5)'}")
print("-" * 70)

for col in sorted([c for c in edges_final.columns if c.startswith('shade_') and 'shadow' not in c]):
    scenario = col.replace('shade_', '')
    values = edges_final[col].values
    mean_val = np.mean(values)
    min_val = np.min(values)
    max_val = np.max(values)
    high_count = np.sum(values > 0.5)
    high_pct = 100 * high_count / len(values)
    
    print(f"{scenario:<20} {mean_val:.3f}    {min_val:.3f}    {max_val:.3f}    "
          f"{high_count:,} ({high_pct:.1f}%)")

# Processing time summary
print(f"\nProcessing Time by Scenario:")
import pickle
with open('data/processed/checkpoints/shade_progress.pkl', 'rb') as f:
    checkpoint = pickle.load(f)
    times = checkpoint['times']

for scenario, elapsed in sorted(times.items()):
    print(f"  {scenario:<20} {elapsed/60:.1f} min")

print(f"\nTotal computation time: {sum(times.values())/60:.1f} minutes")

print("\n" + "="*70)
print("NOTEBOOK 2 COMPLETE!")
print("="*70)

print("\n✓ Building shadows from LiDAR heights (99.7% coverage)")
print("✓ Tree shadows from LiDAR heights (geometric projection)")
print("✓ Combined shade scores for all scenarios")
print("✓ Network ready for routing analysis")

print("\n📊 Ready for Notebook 3!")
print("\n" + "="*70)

## 6. Save Results

In [None]:
# ============================================================================
# SAVE NETWORK WITH SHADE SCORES
# ============================================================================

print("\n" + "="*70)
print("SAVING NETWORK WITH SHADE SCORES")
print("="*70)

# Convert back to WGS84 for saving
print("\nConverting to WGS84 for output...")
edges_final = edges_proj.to_crs('EPSG:4326')

# Save complete network with all shade scores
output_path = 'data/processed/network_edges_with_shade.geojson'
print(f"Saving network to: {output_path}")
edges_final.to_file(output_path, driver='GeoJSON')

print("\n✓ Network with shade scores saved!")
print(f"  File: {output_path}")
print(f"\n  Total columns: {len(edges_final.columns)}")

# Count shade-related columns
building_shade_cols = [c for c in edges_final.columns if 'building_shadow_' in c]
tree_shade_cols = [c for c in edges_final.columns if 'tree_shadow_' in c]
combined_shade_cols = [c for c in edges_final.columns if c.startswith('shade_') and 'shadow' not in c]

print(f"  Building shadow columns: {len(building_shade_cols)}")
print(f"  Tree shadow columns:     {len(tree_shade_cols)}")
print(f"  Combined shade columns:  {len(combined_shade_cols)}")

# Show scenario coverage
print(f"\n  Scenarios saved: {len(combined_shade_cols)}")
if len(combined_shade_cols) > 0:
    scenario_names = [c.replace('shade_', '') for c in combined_shade_cols]
    for i, name in enumerate(scenario_names, 1):
        print(f"    {i}. {name}")

# File size info
import os
file_size_mb = os.path.getsize(output_path) / (1024 * 1024)
print(f"\n  File size: {file_size_mb:.1f} MB")

# ============================================================================
# SUMMARY STATISTICS
# ============================================================================

print("\n" + "="*70)
print("SHADE ANALYSIS SUMMARY")
print("="*70)

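print(f"\nNetwork Statistics:")
print(f"  Total edges: {len(edges_final):,}")
# Measure length in the projected CRS (feet); edges_final is in WGS84,
# where .length would be in degrees
print(f"  Total length: {edges_proj.geometry.length.sum()/5280:.1f} miles")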

print(f"\nShade Score Statistics Across All Scenarios:")
print(f"{'Scenario':<20} {'Mean':<8} {'Min':<8} {'Max':<8} {'High Shade (>0.5)'}")
print("-" * 70)

for col in combined_shade_cols:
    scenario = col.replace('shade_', '')
    values = edges_final[col].values
    mean_val = np.mean(values)
    min_val = np.min(values)
    max_val = np.max(values)
    high_shade_count = np.sum(values > 0.5)
    high_shade_pct = 100 * high_shade_count / len(values)
    
    print(f"{scenario:<20} {mean_val:.3f}    {min_val:.3f}    {max_val:.3f}    "
          f"{high_shade_count:,} ({high_shade_pct:.1f}%)")

# ============================================================================
# COMPLETION MESSAGE
# ============================================================================

print("\n" + "="*70)
print("NOTEBOOK 2 COMPLETE!")
print("="*70)

print("\n✓ Building shadows calculated from LiDAR heights (99.7% coverage)")
print("✓ Tree shadows calculated from LiDAR heights (geometric projection)")
print("✓ Combined shade scores computed for all scenarios")
print("✓ Network saved with all shade attributes")

print("\n📊 Ready for Notebook 3: Routing Analysis")
print("   The network is now ready for shade-weighted pathfinding!")

print("\n" + "="*70)
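
To illustrate how the saved shade scores might feed into shade-weighted pathfinding, here is a purely hypothetical edge cost. The formula, the `alpha` parameter, and the function name are assumptions made for this sketch; the actual weighting belongs to Notebook 3 and may differ.

```python
def shade_weighted_cost(length_ft, shade, alpha=1.0):
    """Hypothetical edge cost: physical length inflated by sun exposure.

    shade: combined shade score in [0, 1].
    alpha: penalty strength; alpha=0 reduces to plain shortest-path length.
    """
    exposure = 1.0 - min(max(shade, 0.0), 1.0)
    return length_ft * (1.0 + alpha * exposure)


fully_shaded = shade_weighted_cost(100.0, 1.0)
fully_exposed = shade_weighted_cost(100.0, 0.0)
no_penalty = shade_weighted_cost(100.0, 0.0, alpha=0.0)

print(fully_shaded, fully_exposed, no_penalty)
```

With a cost of this form, a router would accept a detour only when the extra distance is smaller than the exposure penalty it avoids.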