In [2]:
from scratch import query_high_low_quantile
import polars as pl
from polars import col as c
import polars.selectors as cs

## Purpose

Identify trends in pharmacy margin using NADAC as an estimated cost basis.

## Key questions
- Are there systematic differences in reimbursement between PBM-affiliated and non‑affiliated pharmacies?
- Which providers lie in the tails of the margin distribution (below the 1st or above the 99th percentile)?

## Data sources
- WV OIG PBM NADAC Reporting
- CMS NADAC Reporting
- Claims dataset with fields including: `total`, `nadac`, `affiliate`, `ndc`, `drug`, `date`

## Definitions
- **Pharmacy margin** = `total - nadac` (positive: reimbursed above NADAC; negative: reimbursed below NADAC).
- **affiliate** = whether the pharmacy is affiliated with the PBM (as provided in the dataset).

## Methods (summary)
1. Compute `margin = total - nadac` at the claim level and create aggregated summaries.
2. Group results by `affiliate` and compare central tendency (median), spread (IQR), and tails (1st/99th percentiles).
3. Visualize distributions (boxplots/violins), time trends, and list top outliers by margin and volume for manual review.

## Outputs
- Summary table by `affiliate`: count, median margin, 1st/99th percentiles.
- Visualizations: distribution plots and time series of margins.
- CSV with top N outlier providers for follow-up.

## Caveats
- NADAC is an estimate of acquisition cost and may not reflect discounts, rebates, or special pricing agreements.

In [3]:
(
    # 1st and 99th percentile margins, collected with the streaming engine
    query_high_low_quantile(0.01, 0.99)
    .collect(engine='streaming')
    .to_pandas()
    # Table styling, currently disabled:
    # .style
    # .tab_header('Margin Over NADAC', subtitle='Top 1% and Bottom 1% of Claims')
    # .fmt_currency(cs.matches('(?i)low|high|net$'), accounting=True)
    # .fmt_percent(cs.matches('(?i)pct'), decimals=0)
    # .cols_width({
    #     'low': '150px',
    #     'high': '150px',
    #     'net': '150px',
    #     'net_pct_change': '150px'
    # })
    # .md()
)



| affiliate | low | high | net | net_pct_change |
|---|---|---|---|---|
| False | -27.69 | 146.69 | 119.0 | |
| True | -119.5 | 445.43 | 325.93 | 1.738908 |
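
The figures in this output are consistent with `net` being `low + high` and `net_pct_change` being the relative change in `net` from the non-affiliated row to the affiliated row. That reading is inferred from the numbers alone, not from the source of `query_high_low_quantile`, so treat it as an assumption. A quick arithmetic check:

```python
import math

# Values copied from the output above, keyed by the `affiliate` flag.
rows = {
    False: {"low": -27.69, "high": 146.69},
    True: {"low": -119.5, "high": 445.43},
}

# Assumed relationship: net = low + high (rounded to cents).
net = {flag: round(r["low"] + r["high"], 2) for flag, r in rows.items()}

# Assumed relationship: relative change in net from non-affiliated to affiliated.
net_pct_change = (net[True] - net[False]) / net[False]

print(net, net_pct_change)
```

Under this reading, the affiliated group's net 1st-to-99th percentile margin is roughly 174% higher than the non-affiliated group's.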
