# Audit of Community Transportation Options Score


## I. Introduction:

*   **Goal:** This document outlines the methodology used to audit and verify **Community Transportation Options** scores claimed by applicants under the **State of Georgia 2024-2025 Qualified Allocation Plan (QAP)**, Section XII.
*   **Purpose:** The primary objective is to provide a systematic and data-driven approach to assess the validity of applicant claims regarding proximity to public transit, focusing on:
    1.  The **accuracy of walking distance calculations** submitted by applicants compared to independently calculated network distances.
    2.  The preliminary assessment of **transit hub qualification** for nearby stations identified via external data sources.
*   **Methodology Summary:** The audit uses:
    *   A comprehensive dataset of potential transit locations across Georgia, generated by querying the **Google Places API** (Phase 1).
    *   Network-based walking distance calculations using **OpenStreetMap data via the `osmnx` library** (Phase 2) originating from the specific applicant site coordinates.
    *   Systematic application of the QAP Section XII distance thresholds and point values.
*   **Scope:** This audit focuses on verifying the distance-based points claimed under Sections XII.A (Transit-Oriented Development - Hubs) and XII.B (Access to Public Transportation - Stops).
*   **Limitations:** While this process automates distance verification and flags *potential* hubs based on API data, it serves as a **preliminary audit**. Final determination requires **manual verification** of specific QAP criteria not directly captured by the automated tools, including:
    *   Confirmation that a potential hub meets the QAP's 3+ distinct route definition.
    *   Verification of operational requirements (e.g., 5+ days/week service, fixed route/schedule).
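
The first audit focus, comparing an applicant's claimed walking distance with the independently calculated one, can be sketched as a tier comparison. This is a minimal illustration only: the helper names `distance_tier` and `audit_claim` and the sample distances are hypothetical, while the tier boundaries are the QAP distances used by the scoring script in this document.

```python
# Tier boundaries in miles, as configured in the Phase 2 scoring script.
TIERS_MILES = (0.25, 0.50, 1.00)


def distance_tier(miles):
    """Return the tightest tier (in miles) that contains `miles`, or None."""
    for tier in TIERS_MILES:
        if miles <= tier:
            return tier
    return None


def audit_claim(claimed_miles, calculated_miles):
    """Compare a claimed distance with the audited network distance."""
    claimed_tier = distance_tier(claimed_miles)
    audited_tier = distance_tier(calculated_miles)
    return {
        "claimed_tier": claimed_tier,
        "audited_tier": audited_tier,
        "difference_miles": round(calculated_miles - claimed_miles, 3),
        "tier_matches": claimed_tier == audited_tier,
    }


# Hypothetical claim: applicant reports 0.24 miles, the audit measures 0.31.
example = audit_claim(claimed_miles=0.24, calculated_miles=0.31)
print(example)
```

A small difference in miles matters only when it moves the site across a tier boundary, which is why the sketch reports both the raw difference and whether the tiers agree.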


## II. Phase 2: QAP Community Transportation Scoring

**1. Goal:**
   To automatically calculate a preliminary QAP Community Transportation score for a given applicant site by finding the walking distance to nearby transit stops and potential hubs identified in Phase 1.

**2. Inputs:**
*   `georgia_transit_locations_with_hub.csv` (Output from Phase 1 - list of stops/hubs with coordinates and hub flag).
*   Applicant Site Coordinates (Latitude, Longitude).

**3. Outputs:**
*   A printed console report showing:
    *   Closest stop distance & potential Section B score.
    *   Closest potential hub distance & potential Section A score.
    *   Final preliminary score (Max of A or B).
    *   Notes on required manual verifications.

**4. Method:**
1.  Load Phase 1 transit stop/hub data.
2.  Filter stops within 1.1 miles (straight-line) of the applicant site, slightly beyond the largest QAP distance threshold of 1.0 mile.
3.  For filtered stops, calculate *network walking distance* from the applicant site using `osmnx` and OpenStreetMap data.
4.  Identify the closest stop and closest potential hub based on calculated walking distances.
5.  Apply QAP distance thresholds and point values (Sections A & B) to determine the preliminary score.
6.  Format and print the results, including required verification notes.
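
Steps 4 and 5 can be worked through on a few hypothetical candidates whose walking distances are already known. The candidate names and distances below are made up for illustration; the point tables are the values configured in the scoring script, and `tier_points` is a hypothetical helper name.

```python
# Point tables as configured in the scoring script (Sections A and B).
POINTS_HUB = {0.25: 5.0, 0.50: 4.5, 1.00: 4.0}
POINTS_STOP = {0.25: 3.0, 0.50: 2.0, 1.00: 1.0}


def tier_points(distance_miles, table):
    """Return the points for the tightest threshold met, or 0.0 if none."""
    for threshold in sorted(table):
        if distance_miles <= threshold:
            return table[threshold]
    return 0.0


# Hypothetical candidates with already-computed walking distances (miles).
candidates = [
    {"name": "Stop 1", "is_potential_hub": False, "walking_distance_miles": 0.30},
    {"name": "Hub 1", "is_potential_hub": True, "walking_distance_miles": 0.80},
    {"name": "Stop 2", "is_potential_hub": False, "walking_distance_miles": 1.40},
]

# Step 4: closest stop overall and closest potential hub.
closest_stop = min(candidates, key=lambda c: c["walking_distance_miles"])
hubs = [c for c in candidates if c["is_potential_hub"]]
closest_hub = min(hubs, key=lambda c: c["walking_distance_miles"]) if hubs else None

# Step 5: apply the tier tables and keep the larger of the two scores.
score_b = tier_points(closest_stop["walking_distance_miles"], POINTS_STOP)
score_a = tier_points(closest_hub["walking_distance_miles"], POINTS_HUB) if closest_hub else 0.0
final_score = max(score_a, score_b)

print(score_a, score_b, final_score)  # 4.0 2.0 4.0
```

Here the nearest stop (0.30 miles) earns 2.0 points under Section B, but the potential hub at 0.80 miles earns 4.0 under Section A, so the preliminary score is 4.0, subject to manual confirmation that the hub qualifies.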


---
## Code


In [33]:
"""
Phase 2 QAP Transportation Scoring Script

This script analyzes a specific applicant location against a pre-compiled list
of potential transit stops (generated by Phase 1). It calculates network
walking distances using osmnx and OpenStreetMap data to determine potential
scores under the Georgia QAP Community Transportation Options criteria.

Requires: pandas, osmnx, networkx
Input: CSV file from Phase 1, Applicant Latitude/Longitude
Output: Scoring analysis printed to the console.
"""

import pandas as pd
import osmnx as ox
import networkx as nx
import math
import time
import logging
import sys

# --- Configuration ---

# Input File from Phase 1
PHASE1_CSV_FILE = "georgia_transit_locations_with_hub.csv"


# QAP Distance Thresholds (miles)
DIST_THRESHOLD_TIER1 = 0.25
DIST_THRESHOLD_TIER2 = 0.50
DIST_THRESHOLD_TIER3 = 1.00

# Max search radius for initial filtering (straight-line miles)
# Should be slightly larger than the max QAP distance
MAX_SEARCH_RADIUS_MILES = 1.1

# QAP Point Values
POINTS_A = {  # Section A (Hub)
    DIST_THRESHOLD_TIER1: 5.0,
    DIST_THRESHOLD_TIER2: 4.5,
    DIST_THRESHOLD_TIER3: 4.0,
}
POINTS_B = {  # Section B (Stop)
    DIST_THRESHOLD_TIER1: 3.0,
    DIST_THRESHOLD_TIER2: 2.0,
    DIST_THRESHOLD_TIER3: 1.0,
}

# OSMnx Configuration (Adjust buffer as needed)
# Larger buffer downloads more data but increases chance of finding a path
NETWORK_BUFFER_DEGREES = 0.02 # Approx 2.2km buffer around points

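# Logging Configuration
# The root logger is disabled, so every logging.* call in this script is
# silenced (including the fatal-error messages before sys.exit).
# Set this to False to see progress and error messages.
logging.getLogger().disabled = True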

# --- Helper Functions ---

def haversine(lat1, lon1, lat2, lon2):
    """
    Calculates the great-circle distance between two points
    on the earth (specified in decimal degrees) in miles.
    """
    if None in [lat1, lon1, lat2, lon2]:
        return float('inf') # Handle potential None coordinates

    R = 3958.8 # Earth radius in miles
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = math.radians(lat2 - lat1)
    delta_lambda = math.radians(lon2 - lon1)

    a = math.sin(delta_phi / 2)**2 + \
        math.cos(phi1) * math.cos(phi2) * \
        math.sin(delta_lambda / 2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    distance_miles = R * c
    return distance_miles

def get_network_walking_distance(orig_lat, orig_lon, dest_lat, dest_lon, retries=1):
    """
    Calculates walking distance using osmnx network analysis.

    Downloads OSM walking network data for the bounding box containing
    origin and destination points. Finds nearest network nodes and calculates
    the shortest path distance.

    Args:
        orig_lat (float): Origin latitude.
        orig_lon (float): Origin longitude.
        dest_lat (float): Destination latitude.
        dest_lon (float): Destination longitude.
        retries (int): Number of times to retry on failure (e.g., network issue).

    Returns:
        float: Walking distance in miles, or None if path not found or error occurs.
    """
    for attempt in range(retries + 1):
        try:
            # Determine bounding box with buffer
            north = max(orig_lat, dest_lat) + NETWORK_BUFFER_DEGREES
            south = min(orig_lat, dest_lat) - NETWORK_BUFFER_DEGREES
            east = max(orig_lon, dest_lon) + NETWORK_BUFFER_DEGREES
            west = min(orig_lon, dest_lon) - NETWORK_BUFFER_DEGREES

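            # Download/Cache the walking network graph
            # osmnx automatically caches downloaded data for reuse
            # NOTE: the positional north/south/east/west call below follows older
            # osmnx 1.x releases; newer releases take a single bbox tuple instead,
            # so check the installed version before running.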
            logging.debug(f"OSMnx: Requesting graph for bbox: N{north:.4f}, S{south:.4f}, E{east:.4f}, W{west:.4f}")
            G = ox.graph_from_bbox(north, south, east, west, network_type='walk',
                                   simplify=True, truncate_by_edge=True)
            logging.debug("OSMnx: Graph loaded.")

            # Find nearest nodes on the graph
            orig_node = ox.nearest_nodes(G, X=orig_lon, Y=orig_lat)
            dest_node = ox.nearest_nodes(G, X=dest_lon, Y=dest_lat)
            logging.debug(f"OSMnx: Nodes found - Orig: {orig_node}, Dest: {dest_node}")

            # Calculate shortest path length
            distance_meters = nx.shortest_path_length(G, source=orig_node, target=dest_node, weight='length')
            distance_miles = distance_meters * 0.000621371
            logging.debug(f"OSMnx: Distance calculated - {distance_miles:.4f} miles")
            return distance_miles # Success

        except nx.NetworkXNoPath:
            logging.warning(f"OSMnx: No walkable path found between ({orig_lat:.4f},{orig_lon:.4f}) and "
                         f"({dest_lat:.4f},{dest_lon:.4f}) on attempt {attempt+1}")
            # Return None immediately if no path exists on the graph
            return None
        except Exception as e:
            logging.error(f"OSMnx: Error calculating distance (attempt {attempt+1}/{retries+1}): {e}", exc_info=False) # Set exc_info=True for full traceback
            if attempt < retries:
                logging.info("Retrying distance calculation after a short delay...")
                time.sleep(3) # Wait longer before retrying potentially heavier operations
            else:
                logging.error("OSMnx: Final attempt failed.")
                return None # Return None after final retry failure
    return None # Should not be reached


def load_transit_data(filepath):
    """Loads and preprocesses transit data from the Phase 1 CSV."""
    logging.info(f"Loading transit data from {filepath}...")
    try:
        transit_df = pd.read_csv(filepath)
        # Basic validation and type conversion
        required_cols = ['latitude', 'longitude', 'is_potential_hub', 'place_id', 'name']
        if not all(col in transit_df.columns for col in required_cols):
            raise ValueError(f"CSV missing required columns: {required_cols}")

        transit_df['latitude'] = pd.to_numeric(transit_df['latitude'], errors='coerce')
        transit_df['longitude'] = pd.to_numeric(transit_df['longitude'], errors='coerce')
        # Interpret various string representations of True/False for the hub flag
        transit_df['is_potential_hub'] = transit_df['is_potential_hub'].astype(str).str.lower().isin(['true', '1', 't', 'y', 'yes'])

        initial_rows = len(transit_df)
        transit_df.dropna(subset=['latitude', 'longitude'], inplace=True)
        valid_rows = len(transit_df)
        if initial_rows > valid_rows:
            logging.warning(f"Dropped {initial_rows - valid_rows} rows with invalid coordinates.")

        logging.info(f"Loaded {valid_rows} valid transit locations.")
        if valid_rows == 0:
            logging.error("No valid transit locations loaded. Exiting.")
            sys.exit(1)
        return transit_df

    except FileNotFoundError:
        logging.error(f"Fatal: Input CSV file not found - {filepath}")
        sys.exit(1)
    except ValueError as ve:
        logging.error(f"Fatal: Error processing CSV structure: {ve}")
        sys.exit(1)
    except Exception as e:
        logging.error(f"Fatal: An unexpected error occurred loading CSV: {e}", exc_info=True)
        sys.exit(1)


def filter_candidate_stops(df, applicant_lat, applicant_lon, radius_miles):
    """Filters DataFrame for stops within a straight-line radius."""
    logging.info(f"Filtering for candidate stops within {radius_miles} miles (straight-line)...")
    candidates = []
    for _, stop in df.iterrows():
        straight_line_dist = haversine(applicant_lat, applicant_lon, stop['latitude'], stop['longitude'])
        if straight_line_dist <= radius_miles:
            candidate_info = stop.to_dict()
            candidate_info['straight_line_dist'] = straight_line_dist
            candidates.append(candidate_info)
    logging.info(f"Found {len(candidates)} candidates for network distance calculation.")
    return candidates

def calculate_all_walking_distances(candidates, applicant_lat, applicant_lon):
    """Calculates network walking distances for a list of candidate stops."""
    logging.info("Calculating network walking distances for candidates...")
    results = []
    failed_calculations = 0
    total_candidates = len(candidates)

    for i, stop_info in enumerate(candidates):
        logging.info(f"  Calculating distance to: '{stop_info['name']}' ({i+1}/{total_candidates})...")
        walking_dist_miles = get_network_walking_distance(
            applicant_lat, applicant_lon,
            stop_info['latitude'], stop_info['longitude']
        )

        if walking_dist_miles is not None:
            stop_info['walking_distance_miles'] = walking_dist_miles
            results.append(stop_info)
            logging.info(f"    -> Distance: {walking_dist_miles:.3f} miles")
        else:
            failed_calculations += 1
            logging.warning(f"    -> Failed to calculate walking distance for '{stop_info['name']}' ({stop_info['place_id']}).")
        # Optional: Add a small delay to be polite to OSM servers if running many queries rapidly
        # time.sleep(0.05)

    logging.info(f"Finished distance calculations. {len(results)} successful, {failed_calculations} failed.")
    if failed_calculations > 0:
        logging.warning("Final score will be based only on stops where distance calculation succeeded.")
    return results


def apply_qap_scoring(results_list):
    """Applies QAP scoring logic based on calculated walking distances."""
    logging.info("Applying QAP scoring logic...")
    scoring_output = {
        "score_a": 0.0,
        "score_b": 0.0,
        "final_score": 0.0,
        "closest_stop_dist": float('inf'),
        "closest_hub_dist": float('inf'),
        "closest_stop_name": "None Found within calculable range",
        "closest_hub_name": "None Found within calculable range",
        "closest_stop_type": None,
        "closest_hub_type": None,
        "calculation_failures": 0 # Will be updated if passed from previous step
    }

    if not results_list:
        logging.warning("No successful distance calculations to score.")
        return scoring_output

    # Sort by calculated walking distance
    results_list.sort(key=lambda x: x['walking_distance_miles'])

    # --- Section B Scoring (Closest Stop Overall) ---
    closest_stop = results_list[0]
    scoring_output["closest_stop_dist"] = closest_stop['walking_distance_miles']
    scoring_output["closest_stop_name"] = closest_stop['name']
    # 'source_type' is not among the required CSV columns, so use .get() to avoid a KeyError
    scoring_output["closest_stop_type"] = closest_stop.get('source_type')

    for threshold, points in sorted(POINTS_B.items()): # Check from lowest distance up
        if scoring_output["closest_stop_dist"] <= threshold:
            scoring_output["score_b"] = points
            break # Assign points for the tightest threshold met

    # --- Section A Scoring (Closest Potential Hub) ---
    potential_hubs = [r for r in results_list if r['is_potential_hub']]
    if potential_hubs:
        # Already sorted by distance, so the first hub is the closest
        closest_hub = potential_hubs[0]
        scoring_output["closest_hub_dist"] = closest_hub['walking_distance_miles']
        scoring_output["closest_hub_name"] = closest_hub['name']
        # 'source_type' is not among the required CSV columns, so use .get() to avoid a KeyError
        scoring_output["closest_hub_type"] = closest_hub.get('source_type')

        for threshold, points in sorted(POINTS_A.items()):
            if scoring_output["closest_hub_dist"] <= threshold:
                scoring_output["score_a"] = points
                break
    else:
        logging.info("No stops flagged as 'Potential Hubs' found within calculable range.")


    # --- Final Score ---
    scoring_output["final_score"] = max(scoring_output["score_a"], scoring_output["score_b"])

    logging.info(f"Scoring complete. Score A: {scoring_output['score_a']}, Score B: {scoring_output['score_b']}, Final: {scoring_output['final_score']}")
    return scoring_output


def format_output_report(applicant_lat, applicant_lon, scoring_results):
    """Formats the final scoring results into a readable report."""
    report = []
    report.append("\n" + "="*70)
    report.append(" QAP Community Transportation Options - Scoring Analysis Report")
    report.append("="*70)
    report.append(f"Applicant Location:      Latitude={applicant_lat:.6f}, Longitude={applicant_lon:.6f}")
    report.append("-"*70)
    report.append("Closest Stop (Overall):")
    report.append(f"  Name:                '{scoring_results['closest_stop_name']}'")
    report.append(f"  Place Type:            {scoring_results['closest_stop_type']}")
    report.append(f"  Walking Distance:    {scoring_results['closest_stop_dist']:.3f} miles (calculated via OSM network)")
    report.append(f"  Potential Section B Score: {scoring_results['score_b']}")
    report.append("-"*70)
    report.append("Closest Potential Hub:")
    report.append(f"  Name:                '{scoring_results['closest_hub_name']}'")
    report.append(f"  Place Type:            {scoring_results['closest_hub_type']}")
    report.append(f"  Walking Distance:    {scoring_results['closest_hub_dist']:.3f} miles (calculated via OSM network)")
    report.append(f"  Potential Section A Score: {scoring_results['score_a']}")
    report.append("-"*70)
    report.append(f"FINAL CALCULATED SCORE (Max of A or B): {scoring_results['final_score']:.1f}")
    report.append("="*70)
    report.append("\nIMPORTANT NOTES & REQUIRED VERIFICATIONS:")
    if scoring_results.get('calculation_failures', 0) > 0:
        report.append(f"* WARNING: Failed to calculate walking distance for {scoring_results['calculation_failures']} nearby candidate stops.")
        report.append("  The score is based only on stops where distance calculation was successful.")
    if scoring_results['score_a'] > 0:
        report.append(f"* Section A Score Verification: The 'Potential Hub' ('{scoring_results['closest_hub_name']}')")
        report.append("  MUST be manually verified to meet the QAP definition (3+ distinct bus/rail/mass transit routes).")
    if scoring_results['final_score'] > 0:
        report.append(f"* Operational Verification: The qualifying transit stop/hub ('{scoring_results['closest_stop_name'] if scoring_results['score_b'] >= scoring_results['score_a'] else scoring_results['closest_hub_name']}')")
        report.append("  MUST be manually verified to meet QAP operational requirements (e.g., 5+ days/week service, fixed route).")
    report.append("* Distance Method: Walking distances calculated using OpenStreetMap network data via osmnx.")
    report.append("  Accuracy depends on OSM data quality for pedestrian paths in the specific area.")
    report.append("="*70)

    return "\n".join(report)

# --- Main Execution ---

def main():
    """Main function to orchestrate the scoring process."""
    # 1. Load Data
    transit_df = load_transit_data(PHASE1_CSV_FILE)

    # 2. Filter Candidates
    candidate_stops = filter_candidate_stops(transit_df, APPLICANT_LAT, APPLICANT_LON, MAX_SEARCH_RADIUS_MILES)

    if not candidate_stops:
        logging.warning("No candidate stops found within the initial search radius. Final score is 0.")
        scoring_results = apply_qap_scoring([]) # Get default zero score output
    else:
        # 3. Calculate Distances
        results_with_distances = calculate_all_walking_distances(candidate_stops, APPLICANT_LAT, APPLICANT_LON)

        # 4. Apply Scoring
        scoring_results = apply_qap_scoring(results_with_distances)
        # Record how many candidates had no computable walking distance so the
        # report can warn about them (apply_qap_scoring itself does not use this count)
        scoring_results['calculation_failures'] = len(candidate_stops) - len(results_with_distances)


    # 5. Format and Print Report
    report = format_output_report(APPLICANT_LAT, APPLICANT_LON, scoring_results)
    print(report)
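
The script above requests a walking graph once per candidate stop. One possible alternative, shown here as a sketch and not as the author's method, is to build a single graph around the applicant site and run one single-source shortest-path search from the applicant's node, then read off the distance to each stop. The toy graph, node names, and edge lengths below are made up; on a real graph, `networkx.single_source_dijkstra_path_length` serves the same purpose.

```python
import heapq


def single_source_distances(graph, source):
    """Dijkstra: shortest distance from `source` to every reachable node."""
    distances = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances.get(node, float("inf")):
            continue  # Stale queue entry; a shorter route was already found.
        for neighbor, length in graph.get(node, {}).items():
            candidate = dist + length
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


# Hypothetical undirected walking network; edge lengths in meters.
toy_graph = {
    "site": {"a": 200.0, "b": 500.0},
    "a": {"site": 200.0, "b": 150.0, "stop_1": 400.0},
    "b": {"site": 500.0, "a": 150.0, "stop_2": 100.0},
    "stop_1": {"a": 400.0},
    "stop_2": {"b": 100.0},
    "island": {},  # Disconnected node: no walkable path from the site.
}

METERS_PER_MILE = 1609.344
dist_m = single_source_distances(toy_graph, "site")

# One search yields distances to all stops; unreachable nodes are simply absent.
walking_miles = {
    stop: dist_m[stop] / METERS_PER_MILE
    for stop in ("stop_1", "stop_2", "island")
    if stop in dist_m
}
print(walking_miles)
```

With many candidate stops near one applicant site, this replaces repeated graph requests and repeated searches with one of each.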


In [34]:
if __name__ == "__main__":
    # Applicant Site Coordinates 
    APPLICANT_LAT = 33.278968
    APPLICANT_LON = -83.965148
    main()

KeyError: 'closest_hub_id'

In [27]:
if __name__ == "__main__":
    # Applicant Site Coordinates 
    APPLICANT_LAT = 33.823971
    APPLICANT_LON = -84.616553
    main()


 QAP Community Transportation Options - Scoring Analysis Report
Applicant Location:      Latitude=33.823971, Longitude=-84.616553
----------------------------------------------------------------------
Closest Stop (Overall):
  Name:                'Cato Environmental Education Center'
  Place ID:            ChIJ6xpZr5oi9YgRjDZXcHTLPys
  Walking Distance:    0.000 miles (calculated via OSM network)
  Potential Section B Score: 3.0
----------------------------------------------------------------------
Closest Potential Hub:
  Name:                'Austell Rd at Perkerson Mill Rd SW'
  Place ID:            ChIJ6RkrTpoi9YgRH4NtjzizRdc
  Walking Distance:    0.203 miles (calculated via OSM network)
  Potential Section A Score: 5.0
----------------------------------------------------------------------
FINAL CALCULATED SCORE (Max of A or B): 5.0

IMPORTANT NOTES & REQUIRED VERIFICATIONS:
* Section A Score Verification: The 'Potential Hub' ('Austell Rd at Perkerson Mill Rd SW')
  MUST be manu
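
A walking distance of 0.000 miles, as reported for the closest stop above, can arise when the applicant site and the stop both snap to the same nearest network node. One possible audit check, sketched below, flags any result whose network distance is shorter than its straight-line distance, since a walking route should not be shorter than the great-circle distance between the same two points. The function name, tolerance, and sample records are hypothetical.

```python
def flag_suspect_distances(records, tolerance_miles=0.01):
    """Return names of records whose walking distance undercuts the straight line."""
    suspects = []
    for record in records:
        shortfall = record["straight_line_dist"] - record["walking_distance_miles"]
        if shortfall > tolerance_miles:
            suspects.append(record["name"])
    return suspects


# Hypothetical results in the same shape as the script's candidate records.
records = [
    {"name": "Hypothetical Stop A", "straight_line_dist": 0.120, "walking_distance_miles": 0.000},
    {"name": "Hypothetical Stop B", "straight_line_dist": 0.150, "walking_distance_miles": 0.203},
]

suspects = flag_suspect_distances(records)
print(suspects)  # ['Hypothetical Stop A']
```

Flagged results are candidates for manual review rather than automatic rejection, because the cause may be sparse pedestrian data in OpenStreetMap rather than an error in the stop itself.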

In [30]:
potential_hubs

[{'place_id': 'ChIJwbgsEr2K7IgReJi3-lZlGXE',
  'name': 'Ross Rd & Shelfer Rd',
  'latitude': 30.38181999999999,
  'longitude': -84.282361,
  'is_potential_hub': True,
  'straight_line_dist': 0.7479077365590184,
  'walking_distance_miles': 2.129504903697}]
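
The record above shows why the audit relies on network distance rather than straight-line distance: the stop passes the straight-line prefilter but lies well beyond 1.0 mile on foot. A short sketch using only the two distances from that record follows; `MAX_QAP_DISTANCE_MILES` and `detour_ratio` are illustrative names.

```python
# Distances copied from the potential_hubs record shown above (miles).
record = {
    "name": "Ross Rd & Shelfer Rd",
    "straight_line_dist": 0.7479077365590184,
    "walking_distance_miles": 2.129504903697,
}

MAX_QAP_DISTANCE_MILES = 1.00  # Largest distance threshold used by the script.

# Ratio of walking distance to straight-line distance.
detour_ratio = record["walking_distance_miles"] / record["straight_line_dist"]

passes_prefilter = record["straight_line_dist"] <= MAX_QAP_DISTANCE_MILES
earns_points = record["walking_distance_miles"] <= MAX_QAP_DISTANCE_MILES

print(f"detour ratio: {detour_ratio:.2f}")
print(f"passes straight-line prefilter: {passes_prefilter}")
print(f"within walking threshold: {earns_points}")
```

The walking route is nearly three times the straight-line distance, so this stop would earn no Section A points even though it is flagged as a potential hub.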