# [Issue #1897: Additional Visuals for PUC Analysis](https://github.com/cal-itp/data-analyses/issues/1897)

Received a list of transit operator cohorts that may be exempt from efficiency reporting, per PUC 99314.6, 99314.7 and 99314.11.
- Create visuals based on the groupings set by the list.
- Recreate visuals based on previous notebook work.

## [99314.6](https://leginfo.legislature.ca.gov/faces/codes_displaySection.xhtml?sectionNum=99314.6.&lawCode=PUC)
>`funds shall be allocated for operating or capital purpose` pursuant to Sections 99313 and 99314 to an operator `if the operator meets either of the following efficiency standards`:
>- (A) `The operator shall receive its entire allocation`, and any or all of this allocation may be used for operating purposes, if the operator’s `total operating cost per revenue vehicle hour` in the latest year for which audited data are available `does not exceed the sum of the preceding year’s total operating cost per revenue vehicle hour and an amount equal to the product of the percentage change in the Consumer Price Index for the same period multiplied by the preceding year’s total operating cost per revenue vehicle hour.`
>- (B) The operator shall receive its entire allocation, and any or all of this allocation may be used for operating purposes, `if the operator’s average total operating cost per revenue vehicle hour` in the latest three years for which audited data are available `does not exceed the sum of the average of the total operating cost per revenue vehicle hour in the three years preceding the latest year for which audited data are available and an amount equal to the product of the average percentage change in the Consumer Price Index for the same period multiplied by the average total operating cost per revenue vehicle hour in the same three years`.
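
Both standards compare a cost per revenue vehicle hour (VRH) against a prior-period value grown by the CPI change. A minimal sketch of that comparison follows. The function and argument names are hypothetical, the inputs are made-up numbers, and the CPI change is assumed to be passed as a fraction (0.03 for 3%).

```python
def meets_standard_a(cost_per_vrh_latest, cost_per_vrh_prior, cpi_pct_change):
    """Standard (A): latest-year cost/VRH must not exceed the
    preceding year's cost/VRH plus (CPI change x preceding year's cost/VRH)."""
    threshold = cost_per_vrh_prior + cpi_pct_change * cost_per_vrh_prior
    return cost_per_vrh_latest <= threshold


def meets_standard_b(avg_cost_latest_3yr, avg_cost_prior_3yr, avg_cpi_pct_change):
    """Standard (B): same test, applied to three-year averages."""
    threshold = avg_cost_prior_3yr + avg_cpi_pct_change * avg_cost_prior_3yr
    return avg_cost_latest_3yr <= threshold


# Hypothetical operator: $100/VRH last year, 3% CPI change.
passes = meets_standard_a(102.0, 100.0, 0.03)  # 102 is under the 103 threshold
fails = meets_standard_a(105.0, 100.0, 0.03)   # 105 is over the 103 threshold
```
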
## [99314.7 (mainly MTC specific)](https://leginfo.legislature.ca.gov/faces/codes_displaySection.xhtml?lawCode=PUC&sectionNum=99314.7.)
>the `Metropolitan Transportation Commission` shall apply the following eligibility standards to the operators within the region subject to its jurisdiction:

## [99314.11](https://leginfo.legislature.ca.gov/faces/codes_displaySection.xhtml?sectionNum=99314.11.&nodeTreePath=17.11.2.8&lawCode=PUC)
>`Sections 99314.6 and 99314.7 do not apply to an operator for a fiscal year in which the operator expended from local funding an amount for transit operations not less than the amount the operator expended from local funding for transit operations during the 2018–19 fiscal year.` As used in this subdivision, “local funding” means any nonstate grant funds or other revenues generated by, earned by, or distributed to, an operator.

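In other words, if a transit operator spent at least as much local funding on transit operations in a fiscal year as it did in FY 2018-19, the efficiency standards in 99314.6 and 99314.7 do not apply to it for that fiscal year (interpretation still to be confirmed).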
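
The reading above can be sketched as a per-operator comparison against a 2019 baseline. This is only an illustration: the column names and dollar figures are hypothetical, and it assumes report year 2019 corresponds to FY 2018-19.

```python
import pandas as pd

# Hypothetical local-funding figures; column names are assumptions.
local_funding = pd.DataFrame(
    {
        "ntd_id": ["A", "A", "A", "B", "B", "B"],
        "year": [2019, 2020, 2021, 2019, 2020, 2021],
        "local_funds_expended": [100.0, 90.0, 120.0, 50.0, 50.0, 40.0],
    }
)

# Baseline: each operator's local funding in the 2019 report year.
baseline = local_funding.loc[
    local_funding["year"] == 2019, ["ntd_id", "local_funds_expended"]
].rename(columns={"local_funds_expended": "local_funds_fy2019"})

# Attach the baseline to every year and flag years at or above it.
checked = local_funding.merge(baseline, on="ntd_id", how="left")
checked["exempt_99314_11"] = (
    checked["local_funds_expended"] >= checked["local_funds_fy2019"]
)
```

Operators with no 2019 row would get a missing baseline and be flagged `False`, so they would need separate handling.
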

## Data Exploration

### Categorical variables
- Underlying metric
  - Farebox Recovery Ratio
  - Local funding expended
- area type
  - urban
  - rural
- cohorts
  - A
  - B
  - C
- NTD metric
  - UPT
  - PMT
  - VRH
- year
  - 2019
  - 2020
  - 2021
  - 2022
  - 2023
  - 2024

## analyses should be split by underlying metric
resulting groups are:
1. Farebox Recovery ratio
    - urban
        - cohorts
        - ntd metric
        - year
    - rural
        - cohorts
        - ntd metrics
        - year
2. Local funding expended
    - urban
        - cohorts
        - ntd metric
        - year
    - rural
        - cohorts
        - ntd metrics
        - year




In [1]:
import pandas as pd
import altair as alt
from functools import cache
from calitp_data_analysis.gcs_pandas import GCSPandas
from calitp_data_analysis.sql import get_engine, to_snakecase, query_sql

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.options.display.float_format = '{:,.2f}'.format

@cache
def gcs_pandas():
    return GCSPandas()

# Read in cohort list data

In [2]:
# cohort_data = gcs_pandas().read_csv("gs://calitp-analytics-data/data-analyses/ntd/fbr_local_funding_by_cohorts_2019-2024_compiled.csv")

# cohort_data.columns = cohort_data.columns.str.lower()
# cohort_data["ntd_id"] = cohort_data["ntd_id"].astype("str")

# display(
#     cohort_data.info(),
#     cohort_data.head(),
#     cohort_data.value_counts(
#     subset=["urban_rural","metric","cohort","year"]
#     )
# )

# Read in analysis data from prev notebook

In [3]:
gcs_path = "gs://calitp-analytics-data/data-analyses/ntd/"
# ntd_name = "puc_analysis_data.parquet"
# ntd_analysis_data = gcs_pandas().read_parquet(f"{gcs_path}{ntd_name}")

# display(
#     ntd_analysis_data.info(),
#     ntd_analysis_data["year"].unique()
# )

# May need to requery this data to include 2024
Is 2024 NTD data in the warehouse now? The query below is copied from the initial PUC analysis notebook.

In [4]:
# metric_list = [
#     "pmt",
#     "upt",
#     "vrh",
#     # "opexp_total" # not needed for this project
# ]

# # empty list for appending DFs
# df_list = []

# # loop to query pmt, upt and vrh from 2018 to 2024
# for metric in metric_list:
#         query = f"""
#         SELECT
#           ntd_id,
#           source_agency,
#           agency_status,
#           primary_uza_name,
#           uza_population,
#           uza_area_sq_miles,
#           year,
#           mode,
#           type_of_service,
#           reporter_type,
#           SUM({metric}) AS total_{metric},
#         FROM
#           `cal-itp-data-infra.mart_ntd_funding_and_expenses.fct_service_data_and_operating_expenses_time_series_by_mode_{metric}`
#         WHERE
#           source_state = "CA"
#           AND year BETWEEN 2018 AND 2024
#         GROUP BY
#           ntd_id,
#           source_agency,
#           agency_status,
#           primary_uza_name,
#           uza_population,
#           uza_area_sq_miles,
#           year,
#           mode,
#           type_of_service,
#           reporter_type
#         """
#         # create df
#         metric = query_sql(query, as_df=True)

#         # append df to list
#         df_list.append(metric)

# # unpack list into separate DFs
# ntd_pmt, ntd_upt, ntd_vrh = df_list

# display( 
#     ntd_upt.head(3)
# )

## merge all the metrics together

In [5]:
# merge_on_col = [
#     "ntd_id",
#     "year",
#     "source_agency",
#     "agency_status",
#     "primary_uza_name",
#     "uza_population",
#     "uza_area_sq_miles",
#     "mode",
#     "type_of_service",
#     "reporter_type",
# ]

# merge_1 = ntd_vrh.merge(ntd_upt, on=merge_on_col, how="inner")
# # merge_2 = merge_1.merge(ntd_vrh, on=merge_on_col, how = "inner")

# ntd_metrics_merge = merge_1.merge(ntd_pmt, on=merge_on_col, how="inner")

# ntd_metrics_merge.head(3)

## get districts for ntd ID
- Do I still need district data for this specific analysis?

In [6]:
# for metric in metric_list:
#         query = f"""
#         SELECT
#           `mart_transit_database.dim_organizations`.`key` AS `key`,
#           `mart_transit_database.dim_organizations`.`source_record_id` AS `source_record_id`,
#           `mart_transit_database.dim_organizations`.`name` AS `name`,
#           `mart_transit_database.dim_organizations`.`ntd_id_2022` AS `ntd_id_2022`,
#           `Bridge_Organizations_X_Headquarters_County_Geography___Key`.`county_geography_name` AS `county`,
#           `Dim_County_Geography___County_Geography_Key`.`caltrans_district` AS `caltrans_district`
#         FROM
#           `mart_transit_database.dim_organizations`

#         LEFT JOIN `mart_transit_database.bridge_organizations_x_headquarters_county_geography` AS `Bridge_Organizations_X_Headquarters_County_Geography___Key` ON `mart_transit_database.dim_organizations`.`key` = `Bridge_Organizations_X_Headquarters_County_Geography___Key`.`organization_key`
#           LEFT JOIN `mart_transit_database.dim_county_geography` AS `Dim_County_Geography___County_Geography_Key` ON `Bridge_Organizations_X_Headquarters_County_Geography___Key`.`county_geography_key` = `Dim_County_Geography___County_Geography_Key`.`key`
#         WHERE
#           (
#             `mart_transit_database.dim_organizations`.`_is_current` = TRUE
#           )

#            AND (
#             `mart_transit_database.dim_organizations`.`ntd_id_2022` IS NOT NULL
#           )
#           AND (
#             (
#               `mart_transit_database.dim_organizations`.`ntd_id_2022` <> ''
#             )

#             OR (
#               `mart_transit_database.dim_organizations`.`ntd_id_2022` IS NULL
#             )
#           )
#           AND (
#             `Bridge_Organizations_X_Headquarters_County_Geography___Key`.`_is_current` = TRUE
#           )
#           AND (
#             `Dim_County_Geography___County_Geography_Key`.`_is_current` = TRUE
#           )
#         """
#         # create df
#         ntd_id_x_district = query_sql(query, as_df=True)
        
# ntd_id_x_district["caltrans_district"] = ntd_id_x_district["caltrans_district"].astype("str")

# ntd_id_x_district.head()

## merge the ntd metrics with Caltrans Districts

In [7]:
# ntd_metrics_merge = ntd_metrics_merge.merge(
#     ntd_id_x_district[["ntd_id_2022","county","caltrans_district"]],
#     left_on = "ntd_id",
#     right_on = "ntd_id_2022",
#     how="inner",
#     indicator=True
# )

# ntd_metrics_merge.head()

# merge ntd metrics with cohort data
- merge on ntd_id
- are there any unmerged rows?

In [8]:
# ntd_cohort_merge = ntd_metrics_merge.drop(columns="_merge").merge(
#     cohort_data,
#     left_on = ["ntd_id","year"],
#     right_on = ["ntd_id","year"],
#     indicator= True,
# )

# # any unmerged rows? NONE
# ntd_cohort_merge["_merge"].value_counts()
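
One caveat on the unmerged-rows check: `merge` defaults to `how="inner"`, which keeps only matched rows, so `_merge` can only ever be `both`. To see rows that fail to match, the check needs an outer join (or a left/right join). A small sketch with made-up frames:

```python
import pandas as pd

# Toy stand-ins for the metrics and cohort tables (values are made up).
metrics = pd.DataFrame(
    {"ntd_id": ["1", "2", "3"], "year": [2023, 2023, 2023], "total_upt": [10, 20, 30]}
)
cohorts = pd.DataFrame(
    {
        "ntd_id": ["1", "2", "4"],
        "year": [2023, 2023, 2023],
        "cohort": ["Group A", "Group B", "Group C"],
    }
)

# Inner join: unmatched rows are dropped before the indicator is computed.
inner = metrics.merge(cohorts, on=["ntd_id", "year"], how="inner", indicator=True)

# Outer join: unmatched rows survive and are labelled left_only / right_only.
outer = metrics.merge(cohorts, on=["ntd_id", "year"], how="outer", indicator=True)

unmatched = outer[outer["_merge"] != "both"]
```

Here `unmatched` holds `ntd_id` 3 (metrics only) and 4 (cohort list only), while the inner join reports nothing unmatched.
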

In [9]:
# # Sanity check
# # pick up a couple of NTD ID, see if the merge data tracks with the cohort data
# sample_ids = ntd_cohort_merge["ntd_id"].sample(3).to_list()
# keep_cols=[
#     "ntd_id",
#     "source_agency",
#     "mode",
#     "type_of_service",
#     "total_vrh",
#     "total_pmt",
#     "total_upt",
#     "urban_rural",
#     "cohort",
#     "metric",
#     "year"
# ]

# for sample_id in sample_ids:
#     display(
#         f"Sample NTD ID: {sample_id}",
#         "cohort data",
#         cohort_data[
#             (cohort_data["ntd_id"]== sample_id)
#             & (cohort_data["year"].isin([2023,2024]))
#             ].sort_values(by=["urban_rural","cohort","metric","year"]).head(5),
#         "merge table",
#         ntd_cohort_merge[
#             (ntd_cohort_merge["ntd_id"]== sample_id)
#             & (ntd_cohort_merge["year"].isin([2023,2024]))
#             ][keep_cols].sort_values(by=["urban_rural","cohort","metric","year"]),
        
#     )

# # Cohort data matches.
# # It looks a little odd because the NTD metrics are reported per mode and type of service (TOS), so each cohort label repeats across those rows. Good to go.

# Save merged cohort data

In [10]:
cort_merge_filname = "ntd_cohort_data_2026-01-26.parquet"
# gcs_pandas().data_frame_to_parquet(ntd_cohort_merge,f"{gcs_path}{cort_merge_filname}")

# Read in merged cohort data from GCS

In [11]:
ntd_cohort_merge = gcs_pandas().read_parquet(f"{gcs_path}{cort_merge_filname}")

# Split merged data by underlying metric (farebox recovery and local funding change)

In [12]:
cohort_merge_farebox = ntd_cohort_merge[ntd_cohort_merge["metric"]=="Farebox Recovery Ratio"]
cohort_merge_funding = ntd_cohort_merge[ntd_cohort_merge["metric"]=="Local Funding % Change vs 2019"]

In [13]:
display(
    cohort_merge_farebox.shape,
    cohort_merge_funding.shape,
    cohort_merge_farebox["metric"].unique(),
    cohort_merge_funding["metric"].unique(),
    cohort_merge_farebox.columns
)

(2460, 20)

(2505, 20)

array(['Farebox Recovery Ratio'], dtype=object)

array(['Local Funding % Change vs 2019'], dtype=object)

Index(['ntd_id', 'source_agency', 'agency_status', 'primary_uza_name',
       'uza_population', 'uza_area_sq_miles', 'year', 'mode',
       'type_of_service', 'reporter_type', 'total_vrh', 'total_upt',
       'total_pmt', 'ntd_id_2022', 'county', 'caltrans_district',
       'urban_rural', 'cohort', 'metric', '_merge'],
      dtype='object')

# Group aggregation

## melt big DF
- so the three NTD metric columns (`total_upt`, `total_vrh`, `total_pmt`) are stacked under a single column.

In [14]:
group_list_melt = [
    "source_agency",
    "year",
    "ntd_id",
    "caltrans_district",
    "mode",
    "type_of_service",
    "urban_rural",
    "cohort",
    "metric"
]

value_cols = ["total_upt", "total_vrh", "total_pmt"]

melt_farebox = pd.melt(
    cohort_merge_farebox,
    id_vars=group_list_melt,
    value_vars=value_cols,
    var_name="ntd_metric",
    value_name="ntd_metric_value",
    ignore_index=True,
)

melt_funding = pd.melt(
    cohort_merge_funding,
    id_vars=group_list_melt,
    value_vars=value_cols,
    var_name="ntd_metric",
    value_name="ntd_metric_value",
    ignore_index=True,
)

In [15]:
display(
    melt_farebox.shape,
    melt_funding.shape
)

(7380, 11)

(7515, 11)
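
The shapes are consistent with the melt: each input row becomes one row per value column, so 2,460 × 3 = 7,380 and 2,505 × 3 = 7,515. A toy example of the same wide-to-long reshape, with made-up values and a reduced set of id columns:

```python
import pandas as pd

# Two wide rows, three metric columns (values are made up).
wide = pd.DataFrame(
    {
        "ntd_id": ["1", "2"],
        "year": [2023, 2023],
        "total_upt": [10.0, 20.0],
        "total_vrh": [1.0, 2.0],
        "total_pmt": [100.0, None],
    }
)
value_cols = ["total_upt", "total_vrh", "total_pmt"]

# Stack the metric columns into (ntd_metric, ntd_metric_value) pairs.
long = pd.melt(
    wide,
    id_vars=["ntd_id", "year"],
    value_vars=value_cols,
    var_name="ntd_metric",
    value_name="ntd_metric_value",
)

# Row count grows by the number of value columns; missing values are kept.
expected_rows = len(wide) * len(value_cols)
```
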

In [16]:
sample_ids = ntd_cohort_merge["ntd_id"].sample(3).to_list()
melt_farebox[melt_farebox["ntd_id"].isin([sample_ids[1]])].sort_values(by=["year","mode","type_of_service"])

Unnamed: 0,source_agency,year,ntd_id,caltrans_district,mode,type_of_service,urban_rural,cohort,metric,ntd_metric,ntd_metric_value
630,City of Camarillo (CAT) - Public Works,2019,90163,7,DR,PT,Urban,Group C,Farebox Recovery Ratio,total_upt,97529.0
3090,City of Camarillo (CAT) - Public Works,2019,90163,7,DR,PT,Urban,Group C,Farebox Recovery Ratio,total_vrh,28280.0
5550,City of Camarillo (CAT) - Public Works,2019,90163,7,DR,PT,Urban,Group C,Farebox Recovery Ratio,total_pmt,
627,City of Camarillo (CAT) - Public Works,2019,90163,7,MB,PT,Urban,Group C,Farebox Recovery Ratio,total_upt,77029.0
3087,City of Camarillo (CAT) - Public Works,2019,90163,7,MB,PT,Urban,Group C,Farebox Recovery Ratio,total_vrh,5325.0
5547,City of Camarillo (CAT) - Public Works,2019,90163,7,MB,PT,Urban,Group C,Farebox Recovery Ratio,total_pmt,
626,City of Camarillo (CAT) - Public Works,2020,90163,7,DR,PT,Urban,Group C,Farebox Recovery Ratio,total_upt,75537.0
3086,City of Camarillo (CAT) - Public Works,2020,90163,7,DR,PT,Urban,Group C,Farebox Recovery Ratio,total_vrh,22454.0
5546,City of Camarillo (CAT) - Public Works,2020,90163,7,DR,PT,Urban,Group C,Farebox Recovery Ratio,total_pmt,
625,City of Camarillo (CAT) - Public Works,2020,90163,7,MB,PT,Urban,Group C,Farebox Recovery Ratio,total_upt,56136.0


In [17]:
melt_funding[melt_funding["ntd_id"].isin([sample_ids[1]])].sort_values(by=["year","mode","type_of_service"])

Unnamed: 0,source_agency,year,ntd_id,caltrans_district,mode,type_of_service,urban_rural,cohort,metric,ntd_metric,ntd_metric_value
605,City of Camarillo (CAT) - Public Works,2019,90163,7,DR,PT,Urban,Group C,Local Funding % Change vs 2019,total_upt,97529.0
3110,City of Camarillo (CAT) - Public Works,2019,90163,7,DR,PT,Urban,Group C,Local Funding % Change vs 2019,total_vrh,28280.0
5615,City of Camarillo (CAT) - Public Works,2019,90163,7,DR,PT,Urban,Group C,Local Funding % Change vs 2019,total_pmt,
602,City of Camarillo (CAT) - Public Works,2019,90163,7,MB,PT,Urban,Group C,Local Funding % Change vs 2019,total_upt,77029.0
3107,City of Camarillo (CAT) - Public Works,2019,90163,7,MB,PT,Urban,Group C,Local Funding % Change vs 2019,total_vrh,5325.0
5612,City of Camarillo (CAT) - Public Works,2019,90163,7,MB,PT,Urban,Group C,Local Funding % Change vs 2019,total_pmt,
601,City of Camarillo (CAT) - Public Works,2020,90163,7,DR,PT,Urban,Group B,Local Funding % Change vs 2019,total_upt,75537.0
3106,City of Camarillo (CAT) - Public Works,2020,90163,7,DR,PT,Urban,Group B,Local Funding % Change vs 2019,total_vrh,22454.0
5611,City of Camarillo (CAT) - Public Works,2020,90163,7,DR,PT,Urban,Group B,Local Funding % Change vs 2019,total_pmt,
600,City of Camarillo (CAT) - Public Works,2020,90163,7,MB,PT,Urban,Group B,Local Funding % Change vs 2019,total_upt,56136.0


## aggregation group by
- farebox melt
    - PMT, UPT, VRH totals for urban, per year
    - PMT, UPT, VRH totals for rural, per year
    - PMT, UPT, VRH totals for cohort A, per year
    - PMT, UPT, VRH totals for cohort B, per year
    - PMT, UPT, VRH totals for cohort C, per year
- funding melt
    - PMT, UPT, VRH totals for urban, per year
    - PMT, UPT, VRH totals for rural, per year
    - PMT, UPT, VRH totals for cohort A, per year
    - PMT, UPT, VRH totals for cohort B, per year
    - PMT, UPT, VRH totals for cohort C, per year
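
Each bullet above is the same operation with a different grouping dimension, so one helper could cover them all. The sketch below is an alternative to the per-metric cells, not the notebook's method: `totals_by` is a hypothetical name, and the toy frame only mimics the melted column layout.

```python
import pandas as pd


def totals_by(melted, dimension):
    """Sum each NTD metric per year and per value of `dimension`
    (for example "cohort" or "urban_rural"), one column per metric."""
    return (
        melted.groupby(["year", dimension, "ntd_metric"])["ntd_metric_value"]
        .sum()
        .unstack("ntd_metric")
        .reset_index()
    )


# Toy melted frame with made-up values.
toy = pd.DataFrame(
    {
        "year": [2023, 2023, 2023, 2023],
        "cohort": ["Group A", "Group A", "Group B", "Group B"],
        "urban_rural": ["Urban", "Rural", "Urban", "Urban"],
        "ntd_metric": ["total_upt", "total_upt", "total_upt", "total_vrh"],
        "ntd_metric_value": [1.0, 2.0, 3.0, 4.0],
    }
)

cohort_totals = totals_by(toy, "cohort")
area_totals = totals_by(toy, "urban_rural")
```

Calling `totals_by(melt_farebox, "cohort")` and `totals_by(melt_funding, "urban_rural")` would then produce the cohort and area tables for each underlying metric.
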

In [18]:
group_list_agg = [
    "source_agency",
    "year",
    "ntd_id",
    "caltrans_district",
    
]
farebox_vrh_total = (
    melt_farebox[melt_farebox["ntd_metric"] == "total_vrh"]
    .groupby(group_list_agg)["ntd_metric_value"]
    .sum()
    .reset_index()
).rename(columns={"ntd_metric_value": "total_vrh"})

farebox_upt_total = (
    melt_farebox[melt_farebox["ntd_metric"] == "total_upt"]
    .groupby(group_list_agg)["ntd_metric_value"]
    .sum()
    .reset_index()
).rename(columns={"ntd_metric_value": "total_upt"})

farebox_pmt_total = (
    melt_farebox[melt_farebox["ntd_metric"] == "total_pmt"]
    .groupby(group_list_agg)["ntd_metric_value"]
    .sum()
    .reset_index()
).rename(columns={"ntd_metric_value": "total_pmt"})

farebox_cohort_totals = (
    cohort_merge_farebox.groupby(["year","cohort"])
    .agg({"total_upt": "sum", "total_vrh": "sum", "total_pmt": "sum"})
    .reset_index()
)

farebox_area_totals = (
    cohort_merge_farebox.groupby(["year","urban_rural"])
    .agg({"total_upt": "sum", "total_vrh": "sum", "total_pmt": "sum"})
    .reset_index()
)

In [19]:
funding_vrh_total = (
    melt_funding[melt_funding["ntd_metric"] == "total_vrh"]
    .groupby(group_list_agg)["ntd_metric_value"]
    .sum()
    .reset_index()
).rename(columns={"ntd_metric_value": "total_vrh"})

funding_upt_total = (
    melt_funding[melt_funding["ntd_metric"] == "total_upt"]
    .groupby(group_list_agg)["ntd_metric_value"]
    .sum()
    .reset_index()
).rename(columns={"ntd_metric_value": "total_upt"})

funding_pmt_total = (
    melt_funding[melt_funding["ntd_metric"] == "total_pmt"]
    .groupby(group_list_agg)["ntd_metric_value"]
    .sum()
    .reset_index()
).rename(columns={"ntd_metric_value": "total_pmt"})

funding_cohort_totals = (
    cohort_merge_funding.groupby(["year","cohort"])
    .agg({"total_upt": "sum", "total_vrh": "sum", "total_pmt": "sum"})
    .reset_index()
)

funding_area_totals = (
    cohort_merge_funding.groupby(["year","urban_rural"])
    .agg({"total_upt": "sum", "total_vrh": "sum", "total_pmt": "sum"})
    .reset_index()
)

## summary stats

In [28]:
melt_farebox.columns

Index(['source_agency', 'year', 'ntd_id', 'caltrans_district', 'mode',
       'type_of_service', 'urban_rural', 'cohort', 'metric', 'ntd_metric',
       'ntd_metric_value'],
      dtype='object')

In [75]:
metric_df ={ 
    "farebox_upt_total":farebox_upt_total, 
    "farebox_pmt_total":farebox_pmt_total, 
    "farebox_vrh_total":farebox_vrh_total, 
    "funding_upt_total":funding_upt_total, 
    "funding_pmt_total":funding_pmt_total, 
    "funding_vrh_total":funding_vrh_total,

}
metric_cols = [
    "total_upt",
    "total_pmt",
    "total_vrh"
]
cohorts = [
    "Group A",
    "Group B",
    "Group C"
]
for col in metric_cols:
    display(f"""___Farebox Mean and Median for {col}___""","")
    for cohort in cohorts:
        display(
            f"Cohort {cohort}",
            melt_farebox[
                (melt_farebox["ntd_metric"]==col)
                & (melt_farebox["cohort"]==cohort)
                ]["ntd_metric_value"].agg(["mean","median"]),
        )

'___Farebox Mean and Median for total_upt___'

''

'Cohort Group A'

mean     3,996,831.21
median     178,828.00
Name: ntd_metric_value, dtype: float64

'Cohort Group B'

mean     3,265,208.70
median      90,937.00
Name: ntd_metric_value, dtype: float64

'Cohort Group C'

mean     941,629.90
median    28,617.00
Name: ntd_metric_value, dtype: float64

'___Farebox Mean and Median for total_pmt___'

''

'Cohort Group A'

mean     38,053,931.42
median    4,875,106.00
Name: ntd_metric_value, dtype: float64

'Cohort Group B'

mean     22,940,119.87
median    2,443,033.00
Name: ntd_metric_value, dtype: float64

'Cohort Group C'

mean     8,609,814.47
median     572,356.00
Name: ntd_metric_value, dtype: float64

'___Farebox Mean and Median for total_vrh___'

''

'Cohort Group A'

mean     174,088.14
median    35,494.00
Name: ntd_metric_value, dtype: float64

'Cohort Group B'

mean     136,662.45
median    20,820.00
Name: ntd_metric_value, dtype: float64

'Cohort Group C'

mean     48,037.20
median    9,079.00
Name: ntd_metric_value, dtype: float64

In [78]:
for col in metric_cols:
    display("",
            f"""___Funding Expended Mean and Median for {col}___""",
            "")
    for cohort in cohorts:
        display(
            f"Cohort {cohort}",
            melt_funding[
                (melt_funding["ntd_metric"]==col)
                & (melt_funding["cohort"]==cohort)
                ]["ntd_metric_value"].agg(["mean","median"]),
        )

''

'___Funding Expended Mean and Median for total_upt___'

''

'Cohort Group A'

mean     1,129,277.49
median      77,076.00
Name: ntd_metric_value, dtype: float64

'Cohort Group B'

mean     3,152,524.29
median      71,821.00
Name: ntd_metric_value, dtype: float64

'Cohort Group C'

mean     3,024,448.83
median      64,084.00
Name: ntd_metric_value, dtype: float64

''

'___Funding Expended Mean and Median for total_pmt___'

''

'Cohort Group A'

mean     11,292,975.51
median    1,890,074.00
Name: ntd_metric_value, dtype: float64

'Cohort Group B'

mean     27,282,575.26
median    2,435,108.00
Name: ntd_metric_value, dtype: float64

'Cohort Group C'

mean     31,592,570.83
median    2,456,568.50
Name: ntd_metric_value, dtype: float64

''

'___Funding Expended Mean and Median for total_vrh___'

''

'Cohort Group A'

mean     81,027.91
median   16,838.50
Name: ntd_metric_value, dtype: float64

'Cohort Group B'

mean     127,091.95
median    19,620.00
Name: ntd_metric_value, dtype: float64

'Cohort Group C'

mean     119,424.86
median    14,309.00
Name: ntd_metric_value, dtype: float64
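
The two loops above print one small series per cohort and metric. The same means and medians can be collected into a single table with a grouped aggregation, which makes side-by-side comparison easier. A sketch on a toy frame (made-up values, same column names as the melted data):

```python
import pandas as pd

# Toy melted frame with made-up values.
toy = pd.DataFrame(
    {
        "ntd_metric": ["total_upt"] * 4,
        "cohort": ["Group A", "Group A", "Group A", "Group B"],
        "ntd_metric_value": [1.0, 2.0, 9.0, 5.0],
    }
)

# One row per (metric, cohort) pair, with mean and median as columns.
summary = toy.groupby(["ntd_metric", "cohort"])["ntd_metric_value"].agg(
    ["mean", "median"]
)
```

In the real output the means are far above the medians in every cohort. That pattern usually points to a right-skewed distribution in which a few large operators pull the mean up, so the median is probably the better "typical operator" figure.
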

# Raw Data tables

In [23]:
display(
    farebox_cohort_totals,
    funding_cohort_totals,
)

Unnamed: 0,year,cohort,total_upt,total_vrh,total_pmt
0,2019,Group A,481794553.0,16706633.0,4436104017.0
1,2019,Group B,912340315.0,28495217.0,3939125127.0
2,2019,Group C,20087538.0,1850617.0,92484145.0
3,2020,Group A,352277028.0,13668740.0,3282966486.0
4,2020,Group B,742982204.0,27311983.0,3208061903.0
5,2020,Group C,29816757.0,2235922.0,103931595.0
6,2021,Group A,232748151.0,15751885.0,1143151643.0
7,2021,Group B,305515511.0,18526592.0,1521517535.0
8,2021,Group C,21506525.0,2299048.0,67203239.0
9,2022,Group A,339112596.0,17344827.0,1940673002.0


Unnamed: 0,year,cohort,total_upt,total_vrh,total_pmt
0,2019,Group A,148588010.0,8696556.0,1008222676.0
1,2019,Group B,1013268987.0,27355817.0,6193402337.0
2,2019,Group C,78329932.0,5536365.0,422818499.0
3,2020,Group A,99043391.0,6192452.0,546200398.0
4,2020,Group B,454096208.0,19486133.0,2408906495.0
5,2020,Group C,429594138.0,12507084.0,2925227657.0
6,2021,Group A,77094314.0,7204425.0,493327080.0
7,2021,Group B,295977041.0,18366058.0,1349474466.0
8,2021,Group C,109997160.0,6162919.0,466817866.0
9,2022,Group A,74711924.0,6127962.0,418870194.0


In [24]:
display(
    farebox_area_totals,
    funding_area_totals
)

Unnamed: 0,year,urban_rural,total_upt,total_vrh,total_pmt
0,2019,Rural,5928494.0,834029.0,0.0
1,2019,Urban,1408293912.0,46218438.0,8467713289.0
2,2020,Rural,4798067.0,770218.0,0.0
3,2020,Urban,1120277922.0,42446427.0,6594959984.0
4,2021,Rural,2326733.0,646031.0,0.0
5,2021,Urban,557443454.0,35931494.0,2731872417.0
6,2022,Rural,3535335.0,678449.0,0.0
7,2022,Urban,800388439.0,40185140.0,4104801024.0
8,2023,Rural,4226671.0,712702.0,0.0
9,2023,Urban,948815353.0,42058866.0,4962163814.0


Unnamed: 0,year,urban_rural,total_upt,total_vrh,total_pmt
0,2019,Rural,5903838.0,827864.0,0.0
1,2019,Urban,1234283091.0,40760874.0,7624443512.0
2,2020,Rural,4824834.0,776637.0,0.0
3,2020,Urban,977908903.0,37409032.0,5880334550.0
4,2021,Rural,2370192.0,662767.0,0.0
5,2021,Urban,480698323.0,31070635.0,2309619412.0
6,2022,Rural,3700019.0,723578.0,0.0
7,2022,Urban,683949382.0,34664550.0,3442444852.0
8,2023,Rural,4374666.0,747190.0,0.0
9,2023,Urban,809475174.0,36582016.0,4178216536.0
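
A note on the `0.0` values for rural `total_pmt`: pandas `sum` skips missing values and returns 0 when every value in a group is missing. The sample rows earlier show missing `total_pmt`, so these zeros likely mean "not reported" rather than zero passenger miles. Passing `min_count=1` keeps such groups as missing. A sketch with made-up rows:

```python
import numpy as np
import pandas as pd

# Made-up rows: rural PMT is entirely missing, urban is partly missing.
rows = pd.DataFrame(
    {
        "year": [2023, 2023, 2023, 2023],
        "urban_rural": ["Rural", "Rural", "Urban", "Urban"],
        "total_pmt": [np.nan, np.nan, 100.0, np.nan],
    }
)

grouped = rows.groupby(["year", "urban_rural"])["total_pmt"]

# Default: an all-missing group sums to 0.0.
default_sum = grouped.sum()

# min_count=1: a group needs at least one non-missing value, else NaN.
guarded_sum = grouped.sum(min_count=1)
```

With `min_count=1` the rural total stays `NaN` and the urban total is unchanged, which keeps unreported PMT from being charted as zero.
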
