# Revenue Calculation

This notebook takes the cleaned dataset from Notebook 01 and calculates revenue.

It:
- adds billing units (15-min chunks for time-based codes)
- applies billing rates (FY23 vs FY24/FY25)
- rolls revenue up by month
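
As a rough illustration of the billing-unit step, the sketch below converts `duration_min` into 15-minute units. It is not the code in `src.revenue_calculator`. The `TIME_BASED_CODES` set, the `billing_units` helper, and the round-up rule for partial chunks are all assumptions made for this example, and the project may round differently (for instance, with a minimum-minutes threshold). H0038 is treated as time-based only because its unit counts exceed its encounter counts in the outputs of this notebook.

```python
import math

import pandas as pd

# Hypothetical set of time-based codes (assumption, not taken from the project code).
TIME_BASED_CODES = {"H0038"}


def billing_units(cpt_code: str, duration_min: float, chunk_min: int = 15) -> int:
    """Return billable units: 15-minute chunks for time-based codes, otherwise 1."""
    if cpt_code not in TIME_BASED_CODES:
        return 1  # flat, per-encounter code
    if pd.isna(duration_min) or duration_min <= 0:
        return 0  # no usable duration, so nothing to bill
    # Assumption: a partial chunk rounds up to a full unit.
    return math.ceil(duration_min / chunk_min)


sample = pd.DataFrame(
    {
        "cpt_code": ["T1012", "H0038", "H0038"],
        "duration_min": [23.0, 31.0, 45.0],
    }
)
sample["units"] = [
    billing_units(code, minutes)
    for code, minutes in zip(sample["cpt_code"], sample["duration_min"])
]
print(sample)
```

Under these assumptions the 23-minute T1012 encounter bills 1 unit, and the 31- and 45-minute H0038 encounters bill 3 units each.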

In [1]:
import sys
from pathlib import Path

import pandas as pd

# Point to the repo root (so imports work)
ROOT = Path("..").resolve()
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))

from src.revenue_calculator import compute_revenue, get_rates  # noqa: E402

# Load the cleaned dataset created in Notebook 01
DATA_PATH = ROOT / "data" / "clean_encounters.csv"
df = pd.read_csv(DATA_PATH)

df.head()

Unnamed: 0,sex,city,county,patient_category,encounter_facility,pos,encounter_type,encounter_date,visit_time,race,patient_status,encounter___service_provider,cpt_code,duration_min,is_billable,encounter_status
0,Male,Lansing,Ingham County,,Men's Safe Harbor,Office,PRC Individual - CCAR,2023-04-05,11:00 AM,African American,X Discharged,KW,T1012,23.0,Yes,Closed
1,Male,Lansing,Ingham County,,Men's Safe Harbor,Office,PRC Individual - CCAR,2023-04-05,12:45 PM,White,X Discharged,KW,T1012,30.0,Yes,Closed
2,Male,Lansing,Ingham County,,Men's Safe Harbor,Office,PRC Individual - CCAR,2023-04-05,2:00 PM,Other Race,X Discharged,KW,T1012,31.0,Yes,Closed
3,Male,Potterville,Eaton County,,Men's Safe Harbor,Office,PRC Individual - CCAR,2023-04-05,2:45 PM,White,X Discharged,KW,T1012,19.0,Yes,Closed
4,Male,Lansing,Ingham County,,Men's Safe Harbor,Office,PRC Individual - CCAR,2023-04-05,4:15 PM,White,X Discharged,KW,T1012,30.0,Yes,Closed


## Pick which rate table to use

The original analysis used different rates for FY23 than for FY24 and FY25.

Set `FISCAL_YEAR` to one of:
- `"FY23"`
- `"FY24"`
- `"FY25"` (same rates as FY24)

In [2]:
FISCAL_YEAR = "FY24"   # change to "FY23" or "FY25"

rates = get_rates(FISCAL_YEAR)
rates

{'90832': 71.5,
 '90834': 110.0,
 '90837': 142.0,
 'H0001': 194.0,
 'H0006': 50.5,
 'T1012': 52.5,
 'T1012G': 21.0,
 'H0004': 29.5,
 'H0038': 26.5,
 'H0038G': 6.5}

## Compute revenue

`compute_revenue` returns three DataFrames:
- encounter-level data with units + revenue
- monthly revenue by CPT code
- monthly total revenue
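
The three outputs can be pictured with a small pandas sketch. This is an illustration under assumptions, not the body of `compute_revenue`: the sample rows and the `units` column name are made up for the example, and only the two rates are taken from the FY24 table shown earlier.

```python
import pandas as pd

# Two FY24 rates copied from the rate table; the encounter rows are invented.
rates = {"T1012": 52.5, "H0038": 26.5}

enc = pd.DataFrame(
    {
        "encounter_date": ["2023-08-01", "2023-08-15", "2023-09-02"],
        "cpt_code": ["T1012", "H0038", "H0038"],
        "units": [1, 3, 2],  # assumed column name for billing units
    }
)
enc["encounter_date"] = pd.to_datetime(enc["encounter_date"])

# Encounter level: revenue = units x rate; month = first day of the month.
enc["month"] = enc["encounter_date"].dt.to_period("M").dt.to_timestamp()
enc["revenue"] = enc["units"] * enc["cpt_code"].map(rates)

# Monthly revenue by CPT code.
by_code = enc.groupby(["month", "cpt_code"], as_index=False).agg(
    encounters=("units", "size"),
    total_units=("units", "sum"),
    revenue=("revenue", "sum"),
)

# Monthly total revenue across all codes.
total = by_code.groupby("month", as_index=False)[
    ["encounters", "total_units", "revenue"]
].sum()

print(by_code)
print(total)
```

A code that is missing from `rates` maps to `NaN` in this sketch, so its revenue would not be counted in the sums.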

In [3]:
encounters_with_revenue, monthly_by_code, monthly_total = compute_revenue(
    df,
    fiscal_year=FISCAL_YEAR,
    rates=rates,
)

monthly_total

Unnamed: 0,month,encounters,total_units,revenue
0,2023-04-01,80,80,4200.0
1,2023-05-01,145,145,7612.5
2,2023-06-01,141,141,7402.5
3,2023-07-01,136,136,7140.0
4,2023-08-01,165,260,7020.0
5,2023-09-01,91,324,8586.0
6,2023-10-01,94,351,9301.5
7,2023-11-01,86,338,8957.0
8,2023-12-01,77,317,8400.5
9,2024-01-01,83,365,9672.5


## Quick check (revenue by code)

This is a quick sanity check that shows the service mix by month.

In [4]:
monthly_by_code.head(30)

Unnamed: 0,month,cpt_code,encounters,total_units,revenue
0,2023-04-01,T1012,80,80,4200.0
1,2023-05-01,T1012,145,145,7612.5
2,2023-06-01,T1012,141,141,7402.5
3,2023-07-01,T1012,136,136,7140.0
4,2023-08-01,H0038,160,255,6757.5
5,2023-08-01,T1012,5,5,262.5
6,2023-09-01,H0038,91,324,8586.0
7,2023-10-01,H0038,94,351,9301.5
8,2023-11-01,H0038,86,338,8957.0
9,2023-12-01,H0038,77,317,8400.5
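
One way to take this check further is to confirm that the by-code figures add up to the monthly totals. The sketch below rebuilds two months from the outputs shown in this notebook (August and September 2023) so that it runs on its own; the variable names are illustrative. On the real data, the same steps could be applied to `monthly_by_code` and `monthly_total`.

```python
import pandas as pd

# Values copied from the by-code and total outputs in this notebook.
by_code = pd.DataFrame(
    {
        "month": ["2023-08-01", "2023-08-01", "2023-09-01"],
        "cpt_code": ["H0038", "T1012", "H0038"],
        "revenue": [6757.5, 262.5, 8586.0],
    }
)
total = pd.DataFrame(
    {
        "month": ["2023-08-01", "2023-09-01"],
        "revenue": [7020.0, 8586.0],
    }
)

# Roll the by-code table up to months and compare it with the totals table.
rolled = by_code.groupby("month", as_index=False)["revenue"].sum()
check = rolled.merge(total, on="month", suffixes=("_by_code", "_total"))
check["diff"] = (check["revenue_by_code"] - check["revenue_total"]).round(2)

mismatches = check[check["diff"] != 0]
print(check)
print("Months that do not reconcile:", len(mismatches))
```

An empty `mismatches` frame means the two tables agree for every month they share.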


## Save outputs

These files get saved into `data/` so Notebook 03 can use them.

In [5]:
OUT_ENCOUNTERS = ROOT / "data" / "encounters_with_revenue.csv"
OUT_BY_CODE = ROOT / "data" / "monthly_revenue_by_code.csv"
OUT_TOTAL = ROOT / "data" / "monthly_revenue_total.csv"

encounters_with_revenue.to_csv(OUT_ENCOUNTERS, index=False)
monthly_by_code.to_csv(OUT_BY_CODE, index=False)
monthly_total.to_csv(OUT_TOTAL, index=False)

print("Wrote:", OUT_ENCOUNTERS)
print("Wrote:", OUT_BY_CODE)
print("Wrote:", OUT_TOTAL)

Wrote: /mnt/c/Users/glopp/Python Projects/staff-productivity-analysis/data/encounters_with_revenue.csv
Wrote: /mnt/c/Users/glopp/Python Projects/staff-productivity-analysis/data/monthly_revenue_by_code.csv
Wrote: /mnt/c/Users/glopp/Python Projects/staff-productivity-analysis/data/monthly_revenue_total.csv


## Notes

If revenue is 0 for a code, it usually means one of two things:
- the code isn’t in the rate table for the selected fiscal year, or
- there’s a typo in `cpt_code`
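
Both cases can be spotted by comparing the codes in the data with the keys of the rate table. The sketch below uses a small invented frame and two FY24 rates. The normalization step (trimming whitespace and upper-casing) is an assumption about what counts as a harmless formatting difference.

```python
import pandas as pd

rates = {"T1012": 52.5, "H0038": 26.5}  # two FY24 rates from the rate table

# Invented sample: one badly formatted code and one genuine typo.
enc = pd.DataFrame({"cpt_code": ["T1012", "H0038", "h0038 ", "T1021"]})

# Normalize formatting so only real mismatches remain.
normalized = enc["cpt_code"].astype(str).str.strip().str.upper()

# Codes that have no entry in the rate table, with how often each appears.
unknown_counts = normalized[~normalized.isin(list(rates))].value_counts()
unknown = sorted(unknown_counts.index)

print(unknown)
print(unknown_counts)
```

Here only `T1021` is reported, because `h0038 ` matches `H0038` once it is trimmed and upper-cased.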

If revenue for time-based codes looks low, check `duration_min` and the resulting units: missing or zero durations can leave an encounter with too few billable units.
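
A minimal sketch of that check, assuming the encounter-level output carries a `units` column next to `duration_min` (the column name and the sample rows are assumptions for the example):

```python
import pandas as pd

# Invented encounter-level rows with assumed `duration_min` and `units` columns.
enc = pd.DataFrame(
    {
        "cpt_code": ["H0038", "H0038", "H0038", "T1012"],
        "duration_min": [45.0, None, 0.0, 30.0],
        "units": [3, 0, 0, 1],
    }
)

# Rows whose duration is missing or non-positive are the first place to look.
suspect = enc[enc["duration_min"].isna() | (enc["duration_min"] <= 0)]

# Average units per encounter by code gives a quick feel for the unit counts.
units_per_encounter = enc.groupby("cpt_code")["units"].mean()

print(suspect)
print(units_per_encounter)
```

If many rows land in `suspect`, the durations are worth fixing in Notebook 01 before revenue is recomputed.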