Aidan H & Zahara M - Monthly Average Ridership by Route & Weekday - https://data.wprdc.org/dataset/prt-monthly-average-ridership-by-route/resource/12bb84ed-397e-435c-8d1b-8ce543108698

This code identifies the 20 transit routes with the highest average ridership. It averages the ridership recorded for each route, ranks the routes from most to least used, and prints each route's rank, label, ID, and average number of riders. The list includes light rail lines (for example RED and the BLUE LINE) as well as bus routes. By examining these top 20 routes, we can determine which neighborhoods they serve and evaluate the accessibility of their stops, which gives insight into how well public transportation serves the different neighborhoods of the city.

In [8]:
import pandas as pd

FILE_PATH = "AverageRiders.csv"
ROUTE_NAME_COL = "route_full_name"
ROUTE_ID_COL = "route"
RIDERS_COL = "avg_riders"
TOP_N = 20

def main():
    # Load average ridership by route from AverageRiders.csv
    df = pd.read_csv(FILE_PATH)

    # Validate required columns
    required = [ROUTE_NAME_COL, ROUTE_ID_COL, RIDERS_COL]
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise ValueError(f"Missing required column(s): {missing}. Found: {list(df.columns)}")

    # Clean text columns
    df[ROUTE_NAME_COL] = df[ROUTE_NAME_COL].astype(str).str.strip()
    df[ROUTE_ID_COL] = df[ROUTE_ID_COL].astype(str).str.strip()

    # Combine route name + ID for display
    df["Route Label"] = df[ROUTE_ID_COL] + " — " + df[ROUTE_NAME_COL]

    # Ensure avg_riders is numeric
    df[RIDERS_COL] = (
        df[RIDERS_COL]
        .astype(str)
        .str.replace(",", "", regex=False)
        .str.strip()
    )
    df[RIDERS_COL] = pd.to_numeric(df[RIDERS_COL], errors="coerce")

    # Drop rows with missing or invalid rider counts
    df = df.dropna(subset=[RIDERS_COL])

    # Average across all rows for each route (the dataset is monthly and split by day type,
    # so a route normally appears on more than one row)
    route_stats = (
        df.groupby(["Route Label", ROUTE_ID_COL, ROUTE_NAME_COL], as_index=False)[RIDERS_COL]
        .mean()
        .rename(columns={RIDERS_COL: "Average Riders"})
    )

    # Round average riders to nearest whole number
    route_stats["Average Riders"] = route_stats["Average Riders"].round(0).astype(int)

    # Sort and select top 20
    top_routes = (
        route_stats.sort_values("Average Riders", ascending=False)
        .head(TOP_N)
        .reset_index(drop=True)
    )

    # Add rank
    top_routes.insert(0, "Rank", range(1, len(top_routes) + 1))

    # Reorder columns for easier-to-read output
    top_routes = top_routes[["Rank", "Route Label", ROUTE_ID_COL, "Average Riders"]]

    # Print to console
    print(f"\nTop {TOP_N} Most Used Routes (by average riders):\n")
    print(top_routes.to_string(index=False))

if __name__ == "__main__":
    main()


Top 20 Most Used Routes (by average riders):

 Rank                                    Route Label route  Average Riders
    1       RED — RED - Castle Shannon via Beechview   RED            4596
    2                              51 — 51 - CARRICK    51            4323
    3                P1 — P1 - EAST BUSWAY-ALL STOPS    P1            4270
    4         BLSV — BLUE LINE - SOUTH HILLS VILLAGE  BLSV            3906
    5               61C — 61C - MCKEESPORT-HOMESTEAD   61C            3649
    6                     BLLB — BLUE LINE - LIBRARY  BLLB            3213
    7                       71C — 71C - POINT BREEZE   71C            2796
    8                             61D — 61D - MURRAY   61D            2745
    9                     61A — 61A - NORTH BRADDOCK   61A            2738
   10                             71A — 71A - NEGLEY   71A            2532
   11 BLUE — BLUE - SouthHills Village via Overbrook  BLUE            2359
   12                              82 — 82 - LINCOLN 

In conclusion, the data shows that the most heavily used transit lines run through the City of Pittsburgh, with the highest average ridership concentrated on the eastern side of the city. This region has access to multiple routes, making it one of the best-connected areas for public transportation. Many of the busiest lines, such as the 61 and 71 routes, run through this part of the city. Residents along these routes therefore have convenient access to amenities such as shops, schools, and community services that may not be as readily available in other parts of the city.

The eastern side of Pittsburgh is also home to several major universities, including the University of Pittsburgh, Carnegie Mellon University, and Chatham University, as well as a variety of shops, restaurants, and community spaces. This combination of accessibility, educational opportunity, and local amenities makes the area the best choice for residents seeking both convenience and connectivity.

Finally, when narrowing the focus to specific neighborhoods within this region, Regent Square stands out as one of the best examples: it offers affordable living, strong transportation links through the 61 routes, and close proximity to the city's academic and cultural centers.
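
The point about the 61 and 71 routes can be checked roughly by grouping route variants into families (61A, 61C, and 61D all become 61) and adding up their average riders. The sketch below is illustrative only. The helper names `route_family` and `family_totals` are our own, the sample rows are copied from the printed output above, and summing per-route averages gives only an approximate combined figure. Variants that do not appear in the printed output are not included.

```python
import re

import pandas as pd


def route_family(route_id: str) -> str:
    """Return the leading digits of a route ID ("61C" -> "61").

    IDs with no leading digits (such as "P1" or "RED") are returned unchanged.
    """
    match = re.match(r"\d+", route_id)
    return match.group(0) if match else route_id


def family_totals(routes: pd.DataFrame) -> pd.DataFrame:
    """Sum "Average Riders" per route family, largest first."""
    with_family = routes.assign(family=routes["route"].astype(str).map(route_family))
    return (
        with_family.groupby("family", as_index=False)["Average Riders"]
        .sum()
        .sort_values("Average Riders", ascending=False)
        .reset_index(drop=True)
    )


# Sample rows taken from the top-routes output shown earlier
sample = pd.DataFrame(
    {
        "route": ["51", "P1", "61C", "71C", "61D", "61A", "71A"],
        "Average Riders": [4323, 4270, 3649, 2796, 2745, 2738, 2532],
    }
)

totals = family_totals(sample)
print(totals.to_string(index=False))
```

With these sample rows, the 61 family adds up to 9132 average riders and the 71 family to 5328, both ahead of the single 51 route at 4323. In the notebook, the same function could be applied to the full `top_routes` table if `main()` were changed to return it.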