#### Prescription Drug Plan Formulary and Pharmacy Network Data Exploration

This analysis uses the Monthly Prescription Drug Plan Formulary and Pharmacy Network Information dataset. The [dataset](https://data.cms.gov/provider-summary-by-type-of-service/medicare-part-d-prescribers/monthly-prescription-drug-plan-formulary-and-pharmacy-network-information) can be downloaded from the [Centers for Medicare & Medicaid Services (CMS)](https://data.cms.gov/) website and is updated monthly. I use the data as of June 30, 2025 for the analysis.

This dataset contains detailed information about prescription drug plans (PDPs) and Medicare Advantage plans with drug coverage (MA-PDs). It includes plan identifiers, formulary IDs, premiums, deductibles, contract and plan names, and geographic coverage (state and county codes). The dataset also lists which drugs are included or excluded from a plan’s formulary and provides details on preferred pharmacy networks. It is primarily used by policymakers, researchers, and analysts to evaluate plan availability, pricing, and coverage variations across regions. The data supports public transparency and helps inform healthcare access and cost analyses.

I explore the dataset with PostgreSQL in the code cells below. The questions were generated using ChatGPT.

In [3]:
# Import libraries
import pandas as pd
from sqlalchemy import create_engine, text

# Create database connection
engine = create_engine('postgresql+psycopg2://tharinduabeysinghe:#####@localhost/Pharmacy')

# Run a query and load the result into a DataFrame
def execute_sql_query(sql):
    """Execute a SQL string and return the result as a pandas DataFrame."""
    with engine.connect() as conn:
        return pd.read_sql_query(text(sql), conn)

How many plans and contracts are offered?

In [4]:
sql = """
SELECT
    COUNT(DISTINCT(plan_id)) AS number_of_plans,
    COUNT(DISTINCT(contract_id)) AS number_of_contracts
FROM
    plan_information;
"""
        
# Execute query
execute_sql_query(sql)

Unnamed: 0,number_of_plans,number_of_contracts
0,509,723
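
The result above shows fewer plans than contracts (509 vs. 723), which suggests that `plan_id` values repeat across contracts. If `plan_id` is only unique within a contract, which I believe is how CMS plan files are laid out but have not verified against this table, then counting distinct (`contract_id`, `plan_id`) pairs gives the number of plans. The sketch below illustrates the difference on made-up rows in an in-memory SQLite database (used here only so the example is self-contained; the subquery form should also run on PostgreSQL).

```python
import sqlite3

# Toy rows (hypothetical): plan_id values repeat across contracts.
rows = [
    ("H0001", "001"),
    ("H0001", "002"),
    ("H0002", "001"),  # same plan_id as H0001's first plan, different contract
    ("S0003", "001"),
    ("S0003", "001"),  # duplicate row, e.g. the same plan listed for a second county
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plan_information (contract_id TEXT, plan_id TEXT)")
conn.executemany("INSERT INTO plan_information VALUES (?, ?)", rows)

# Counting plan_id alone collapses plans that share an ID across contracts.
plan_ids_only = conn.execute(
    "SELECT COUNT(DISTINCT plan_id) FROM plan_information"
).fetchone()[0]

# Counting distinct (contract_id, plan_id) pairs keeps each contract's plans separate.
contract_plan_pairs = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT DISTINCT contract_id, plan_id FROM plan_information) AS pairs
    """
).fetchone()[0]

print(plan_ids_only, contract_plan_pairs)  # prints: 2 4
```

On the real table, the second query would replace `COUNT(DISTINCT(plan_id))` if the assumption about `plan_id` holds.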


Which counties have the highest number of drug exclusions in the country?

The query below returns each state and county with its number of excluded drugs.
The state name is included because counties in different states can share a name. Results are ordered from most to fewest exclusions, and the output shows the first and last rows.

In [5]:
sql = """
SELECT
    state_name,
    county,
    COUNT(rxcui) AS exclusions_count
FROM
    excluded_drugs_formulary edf
    JOIN plan_information pi
        ON edf.plan_id = pi.plan_id
    JOIN geographic_locator gl
        ON gl.county_code = pi.county_code
GROUP BY
    state_name, county
ORDER BY
    exclusions_count DESC;
"""

# Execute query
execute_sql_query(sql)

Unnamed: 0,state_name,county,exclusions_count
0,CA,Los Angeles,70763
1,MI,Wayne,38027
2,MI,Washtenaw,35370
3,MI,Oakland,34281
4,MI,Macomb,33680
...,...,...,...
3147,OK,Harper,78
3148,OR,Malheur,76
3149,CA,Lassen,60
3150,FL,Monroe,58


Which plans exclude the most drugs overall?

In [6]:
sql = """
SELECT
    plan_name,
    COUNT(rxcui) AS exclusions_count
FROM
    excluded_drugs_formulary edf
    JOIN public.plan_information pi
        ON edf.plan_id = pi.plan_id
GROUP BY
    plan_name
ORDER BY
    exclusions_count DESC;
"""

# Execute query
execute_sql_query(sql)

Unnamed: 0,plan_name,exclusions_count
0,Wellcare Simple Open (PPO),855690
1,Wellcare Simple (HMO-POS),473572
2,Wellcare Dual Access (HMO-POS D-SNP),449018
3,Molina Medicare Complete Care (HMO D-SNP),355742
4,Wellcare Mutual of Omaha Simple Open (PPO),346182
...,...,...
3036,AARP Medicare Advantage from UHC CA-026P (HMO-...,6
3037,AARP Medicare Advantage from UHC CA-021P (HMO-...,6
3038,HumanaChoice H5216-353 (PPO),5
3039,HumanaChoice H5216-439 (PPO),4
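
The counts above may be inflated by the join. If `plan_information` holds one row per plan per county, and `plan_id` repeats across contracts, then joining on `plan_id` alone multiplies each exclusion by every matching row. The sketch below shows the effect on made-up data in an in-memory SQLite database, and a variant that de-duplicates plans and joins on both keys. It assumes `excluded_drugs_formulary` also carries a `contract_id` column, which is not shown in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plan_information (
    contract_id TEXT, plan_id TEXT, plan_name TEXT, county_code TEXT
);
CREATE TABLE excluded_drugs_formulary (
    contract_id TEXT, plan_id TEXT, rxcui TEXT
);

-- Hypothetical rows: Plan A is offered in three counties, Plan B in one.
INSERT INTO plan_information VALUES
    ('H0001', '001', 'Plan A', '06037'),
    ('H0001', '001', 'Plan A', '06059'),
    ('H0001', '001', 'Plan A', '06073'),
    ('H0002', '001', 'Plan B', '26163');

-- Plan A excludes two drugs, Plan B excludes three.
INSERT INTO excluded_drugs_formulary VALUES
    ('H0001', '001', '312950'),
    ('H0001', '001', '314228'),
    ('H0002', '001', '312950'),
    ('H0002', '001', '314228'),
    ('H0002', '001', '314229');
""")

# Join on plan_id only: every exclusion row matches every plan row with that ID.
naive = dict(conn.execute("""
    SELECT pi.plan_name, COUNT(edf.rxcui)
    FROM excluded_drugs_formulary edf
    JOIN plan_information pi ON edf.plan_id = pi.plan_id
    GROUP BY pi.plan_name
""").fetchall())

# One row per plan, joined on both keys: each exclusion is counted once.
deduped = dict(conn.execute("""
    SELECT p.plan_name, COUNT(DISTINCT edf.rxcui)
    FROM excluded_drugs_formulary edf
    JOIN (
        SELECT DISTINCT contract_id, plan_id, plan_name
        FROM plan_information
    ) AS p
        ON edf.contract_id = p.contract_id AND edf.plan_id = p.plan_id
    GROUP BY p.contract_id, p.plan_id, p.plan_name
""").fetchall())

print(naive)    # Plan A: 15, Plan B: 5
print(deduped)  # Plan A: 2, Plan B: 3
```

In the toy data the naive join ranks Plan A first only because it is sold in more counties, so a similar check on the real tables would show whether the top plans above are ranked by exclusions or by footprint.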


Which drugs are most often excluded across plans?

In [7]:
sql = """
SELECT
    rxcui,
    COUNT(rxcui) AS exclusions_count
FROM
    excluded_drugs_formulary
GROUP BY
    rxcui
ORDER BY
    exclusions_count DESC;
"""

# Execute query
execute_sql_query(sql)

Unnamed: 0,rxcui,exclusions_count
0,312950,2083
1,314228,2083
2,314229,2078
3,1367410,1522
4,310410,1435
5,309594,1286
6,2598453,406
7,402019,322
8,484814,322
9,197397,310


Are certain segments (e.g., PDP vs MA) associated with higher exclusion rates?

Is there a trend in drug exclusions based on geography?

What percentage of plans in a region exclude each top drug?
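
One possible way to approach the last question is sketched below. It uses a simplified, hypothetical schema (`plans` with a `state_name` column and `exclusions` keyed by `contract_id` and `plan_id`) and made-up rows in an in-memory SQLite database, so the table and column names would need to be mapped onto the real tables. The idea is to count distinct plans per state, count the distinct plans that exclude each drug, and divide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plans (contract_id TEXT, plan_id TEXT, state_name TEXT);
CREATE TABLE exclusions (contract_id TEXT, plan_id TEXT, rxcui TEXT);

-- Hypothetical rows: four plans in CA, two in MI.
INSERT INTO plans VALUES
    ('H0001', '001', 'CA'),
    ('H0001', '002', 'CA'),
    ('H0002', '001', 'CA'),
    ('H0003', '001', 'CA'),
    ('S0004', '001', 'MI'),
    ('S0004', '002', 'MI');

INSERT INTO exclusions VALUES
    ('H0001', '001', '312950'),
    ('H0001', '002', '312950'),
    ('H0002', '001', '312950'),
    ('H0001', '001', '314228'),
    ('S0004', '001', '312950');
""")

query = """
WITH plans_per_state AS (
    -- Denominator: distinct plans offered in each state.
    SELECT state_name, COUNT(*) AS total_plans
    FROM (SELECT DISTINCT contract_id, plan_id, state_name FROM plans) AS d
    GROUP BY state_name
),
excluding AS (
    -- Numerator: distinct plans in each state that exclude a given drug.
    SELECT p.state_name, e.rxcui, COUNT(*) AS excluding_plans
    FROM (SELECT DISTINCT contract_id, plan_id, rxcui FROM exclusions) AS e
    JOIN (SELECT DISTINCT contract_id, plan_id, state_name FROM plans) AS p
        ON e.contract_id = p.contract_id AND e.plan_id = p.plan_id
    GROUP BY p.state_name, e.rxcui
)
SELECT x.state_name,
       x.rxcui,
       ROUND(100.0 * x.excluding_plans / t.total_plans, 1) AS pct_plans_excluding
FROM excluding x
JOIN plans_per_state t ON t.state_name = x.state_name
ORDER BY x.state_name, pct_plans_excluding DESC;
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
# ('CA', '312950', 75.0)
# ('CA', '314228', 25.0)
# ('MI', '312950', 50.0)
```

Restricting `exclusions` to the most excluded `rxcui` values from the previous query would limit the output to the top drugs.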