# Audit of Community Transportation Options Score

## Introduction
This JupyterBook chapter documents the audit process for verifying **Community Transportation Options** scores claimed by applicants under the **State of Georgia 2024-2025 Qualified Allocation Plan (QAP)**. The audit uses MARTA transit data to evaluate the **accuracy of distance calculations, transit hub qualification, and service frequency**.

## Data Sources

### MARTA Datasets Used

| Dataset | Description | Number of Entries | Key Columns |
|---------|-------------|-------------------|-------------|
| `MARTA_Stops.csv` | List of all MARTA transit stops | 9171 | `stop_id`, `stop_name`, `stop_lat`, `stop_lon` |
| `MARTA_Rail_Stations.csv` | List of MARTA rail stations | 38 | `Station`, `Station Code`, `x`, `y` |
| `MARTA_Routes.csv` | Information about MARTA routes | 394 | `route_id`, `route_short_name`, `route_long_name`, `route_type` |

Together, these datasets describe MARTA's **transit stops, rail stations, and routes**, which the audit needs in order to verify applicant claims.

## Audit Methodology
The audit involves the following steps:

1. **Find the nearest transit stop** to the applicant's site by geodesic (straight-line) distance.
2. **Verify whether the stop qualifies as a transit hub** (served by at least 3 bus or rail routes).
3. **Calculate the walking distance** between the site and the transit stop. The code in this chapter currently uses the geodesic distance as a proxy, which can understate the true walking distance.
4. **Check transit service frequency**: service must operate at least 5 days/week, or 7 days/week for 6-point claims.
5. **Predict the expected score** based on QAP criteria.
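
Steps 1 and 3 rest on a point-to-point distance. As a dependency-free cross-check of the `geopy` geodesic distance used later in this chapter, the sketch below uses the haversine (great-circle) formula instead. The names `haversine_miles` and `nearest_stop`, the tuple layout of `stops`, and the Earth-radius constant are assumptions made for illustration, and the result is still a straight-line distance, not a walking distance.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearest_stop(site, stops):
    """Return (stop_id, distance_miles) for the stop closest to `site`.

    `site` is a (lat, lon) pair; `stops` is a hypothetical list of
    (stop_id, lat, lon) tuples, not the notebook's DataFrame.
    """
    return min(
        ((stop_id, haversine_miles(site[0], site[1], lat, lon)) for stop_id, lat, lon in stops),
        key=lambda pair: pair[1],
    )
```

Haversine treats the Earth as a sphere, so it can differ slightly from the ellipsoidal geodesic; over distances of a mile or less the gap should be small relative to the 0.25-mile scoring tiers.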

**Known limitation:** the audit currently assumes that every transit service meets the frequency requirement, because the datasets above do not include schedule information.
- Missing information: transit service schedules
  - Whether each transit stop is served at least 5 days/week (or 7 days/week for 6-point claims).
  - The times of day at which service starts and ends.
  - Any seasonal or temporary changes to transit routes.
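
If schedule data were available in GTFS form, the days-per-week check could be sketched roughly as follows. This assumes rows shaped like a GTFS `calendar.txt` file (a `service_id` plus one 0/1 flag per weekday), which this notebook does not load; the function name, the sample rows, and the way `service_ids` would be obtained for a stop are all hypothetical.

```python
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]


def service_days_per_week(service_ids, calendar_rows):
    """Count distinct weekdays on which any of the given services runs.

    `calendar_rows` is assumed to be a list of dicts shaped like rows of a
    GTFS calendar.txt file. Date ranges and calendar_dates.txt exceptions
    are ignored in this sketch.
    """
    active_days = set()
    for row in calendar_rows:
        if row["service_id"] in service_ids:
            active_days.update(day for day in WEEKDAYS if int(row[day]) == 1)
    return len(active_days)


# Made-up illustrative rows, not MARTA data
calendar_rows = [
    {"service_id": "WKD", "monday": 1, "tuesday": 1, "wednesday": 1, "thursday": 1,
     "friday": 1, "saturday": 0, "sunday": 0},
    {"service_id": "SAT", "monday": 0, "tuesday": 0, "wednesday": 0, "thursday": 0,
     "friday": 0, "saturday": 1, "sunday": 0},
    {"service_id": "SUN", "monday": 0, "tuesday": 0, "wednesday": 0, "thursday": 0,
     "friday": 0, "saturday": 0, "sunday": 1},
]
```

A stop served only by the weekday service would count 5 days and meet the basic threshold, while a 6-point claim would need all three services in this example.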



---

## Load Data and Define Functions


In [4]:
import pandas as pd
from geopy.distance import geodesic

In [7]:
# Load MARTA datasets
stops = pd.read_csv("MARTA_data/MARTA_Stops.csv")
rail_stations = pd.read_csv("MARTA_data/MARTA_Rail_Stations.csv")
routes = pd.read_csv("MARTA_data/MARTA_Routes.csv")

In [8]:

# Function to find the nearest transit stop by geodesic (straight-line) distance
def find_nearest_transit_stop(site_lat, site_lng, stops_df):
    # Keep distances in a separate Series so the caller's DataFrame is not modified
    distances = stops_df.apply(
        lambda row: geodesic((site_lat, site_lng), (row["stop_lat"], row["stop_lon"])).miles, axis=1
    )
    nearest_stop = stops_df.loc[distances.idxmin()]
    return {
        "stop_id": nearest_stop["stop_id"],
        "stop_name": nearest_stop["stop_name"],
        "stop_lat": nearest_stop["stop_lat"],
        "stop_lon": nearest_stop["stop_lon"],
        "distance_miles": distances.min(),
    }

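# Function to check if a stop is a transit hub (served by at least 3 routes)
# NOTE: none of the key columns listed for MARTA_Routes.csv identifies a stop,
# so this filter compares route_id against stop_id. These are different
# identifiers, so the result is not a true count of routes serving the stop.
# A stop-to-route mapping (not loaded in this notebook) is needed for that.
def verify_transit_hub(stop_id, routes_df):
    stop_routes = routes_df[routes_df["route_id"] == stop_id]
    return len(stop_routes) >= 3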
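
Counting the routes that serve a stop requires a link between stops and routes. In a standard GTFS feed that link runs through `stop_times.txt` (`trip_id`, `stop_id`) and `trips.txt` (`trip_id`, `route_id`). The sketch below assumes such records are available as lists of dicts, which is not the case for the three CSV files loaded above; `routes_per_stop` and `is_transit_hub` are hypothetical helper names.

```python
from collections import defaultdict


def routes_per_stop(stop_times, trips):
    """Map each stop_id to the set of route_ids whose trips call at it.

    `stop_times` and `trips` are assumed to be lists of dicts shaped like
    rows of GTFS stop_times.txt and trips.txt.
    """
    trip_to_route = {trip["trip_id"]: trip["route_id"] for trip in trips}
    stop_routes = defaultdict(set)
    for stop_time in stop_times:
        route_id = trip_to_route.get(stop_time["trip_id"])
        if route_id is not None:  # skip stop_times rows with an unknown trip
            stop_routes[stop_time["stop_id"]].add(route_id)
    return stop_routes


def is_transit_hub(stop_id, stop_routes, min_routes=3):
    """True if at least `min_routes` distinct routes serve the stop."""
    return len(stop_routes.get(stop_id, set())) >= min_routes
```

Using a set per stop means several trips on the same route count once, which matches the "at least 3 routes" wording of the hub criterion.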

---

## Predict Transit Score


In [9]:
def predict_transit_score(
    site_lat, site_lng, dca_pool, transit_service_days, is_fixed_route, is_site_owned_by_transit_agency
):
    nearest_stop = find_nearest_transit_stop(site_lat, site_lng, stops)
    expected_score = 0.0
    distance = nearest_stop["distance_miles"]
    stop_id = nearest_stop["stop_id"]
    is_hub = verify_transit_hub(stop_id, routes)

    # Subsection A: Transit-Oriented Development
    if dca_pool == "Metro":
        if is_site_owned_by_transit_agency and is_hub and transit_service_days == 7:
            expected_score = 6.0  # A.1
        elif is_hub and transit_service_days >= 5:
            if distance <= 0.25:
                expected_score = 5.0
            elif distance <= 0.5:
                expected_score = 4.5
            elif distance <= 1.0:
                expected_score = 4.0
    
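    # Subsection B: General Public Transit Access
    # Keep the higher of the two subsections so that a Subsection A score is not
    # overwritten (assumes the subsections are alternatives, not additive).
    if dca_pool == "Metro" and is_fixed_route and transit_service_days >= 5:
        if distance <= 0.25:
            expected_score = max(expected_score, 3.0)
        elif distance <= 0.5:
            expected_score = max(expected_score, 2.0)
        elif distance <= 1.0:
            expected_score = max(expected_score, 1.0)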

    return {
        "expected_score": expected_score,
        "distance_miles": distance,
        "nearest_stop": nearest_stop["stop_name"],
        "is_hub": is_hub,
    }
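
The tier logic can also be separated from the data lookups so that it can be checked on its own. The sketch below is one possible restatement for Metro-pool sites, using the distance thresholds and point values from `predict_transit_score`; taking the higher of Subsections A and B is an assumption about how the two combine, and `tiered_points` and `score_metro_site` are hypothetical names.

```python
# (max distance in miles, points) pairs, ordered from nearest to farthest
HUB_TIERS = [(0.25, 5.0), (0.5, 4.5), (1.0, 4.0)]          # Subsection A
FIXED_ROUTE_TIERS = [(0.25, 3.0), (0.5, 2.0), (1.0, 1.0)]  # Subsection B


def tiered_points(distance_miles, tiers):
    """Return the points of the first tier whose max distance covers the value."""
    for max_distance, points in tiers:
        if distance_miles <= max_distance:
            return points
    return 0.0


def score_metro_site(distance_miles, is_hub, service_days, is_fixed_route, owned_by_transit_agency):
    """Expected score for a Metro-pool site, given already-verified inputs."""
    if owned_by_transit_agency and is_hub and service_days == 7:
        return 6.0  # A.1
    score_a = tiered_points(distance_miles, HUB_TIERS) if is_hub and service_days >= 5 else 0.0
    score_b = (
        tiered_points(distance_miles, FIXED_ROUTE_TIERS)
        if is_fixed_route and service_days >= 5
        else 0.0
    )
    # Assumption: the subsections are alternatives, so the higher one applies
    return max(score_a, score_b)
```

Keeping the thresholds in two small tables makes it easier to compare them line by line against the QAP text.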


---

## Example Prediction Execution


In [10]:
prediction_result = predict_transit_score(
    site_lat=33.7490, site_lng=-84.3880,  # Example site in Atlanta
    dca_pool="Metro",
    transit_service_days=7,
    is_fixed_route=True,
    is_site_owned_by_transit_agency=False
)

# Display results
prediction_result

{'expected_score': 3.0,
 'distance_miles': 0.07536159006798185,
 'nearest_stop': 'MARTIN L KING J DR @ COURTLAND ST',
 'is_hub': False}
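
To finish an audit, the predicted score still has to be compared with the score the applicant claimed. A minimal sketch of that comparison is shown below; `audit_claim`, its return keys, and the tolerance are assumptions for illustration and are not part of the notebook.

```python
def audit_claim(claimed_score, expected_score, tolerance=1e-9):
    """Compare an applicant's claimed score with the predicted score."""
    difference = claimed_score - expected_score
    if abs(difference) <= tolerance:
        status = "match"
    elif difference > 0:
        status = "overclaimed"
    else:
        status = "underclaimed"
    return {"status": status, "difference": difference}
```

For the example site above, a claim of 3.0 points would be reported as a match against the predicted 3.0, while a claim of 5.0 would be flagged as overclaimed by 2.0 points.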