# Healthcare & Mental Health: Medicaid Disability and Mental Illness in America

**Author:** Luke Steuber  
**Date:** February 13, 2026

This notebook analyzes two critical healthcare datasets:
1. **CMS Medicaid Disability Enrollment** (2013-2024) - Aged, Blind, and Disabled (ABD) populations
2. **SAMHSA Mental Health Data** (2008-2023) - National prevalence and treatment rates

## Key Findings

- **10.2 million** Americans with disabilities enrolled in Medicaid (2024)
- ABD enrollees represent **15% of total Medicaid enrollment** but account for **~45% of spending**
- **23.1% of adults** had any mental illness in 2023 (59.3 million people)
- **37% of adults with serious mental illness** received no treatment in 2023

In [None]:
import json
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from pathlib import Path

%matplotlib inline

# Set style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 10

## 1. Load CMS Medicaid Disability Enrollment Data

In [None]:
# Load CMS Medicaid data
with open('../cms_medicaid_disability_enrollment.json', 'r') as f:
    cms_data = json.load(f)

print("CMS Medicaid Dataset Structure:")
print(f"Top-level keys: {list(cms_data.keys())}")
print(f"\nMetadata title: {cms_data['metadata']['title']}")
print(f"Sources: {len(cms_data['metadata']['sources'])} data sources")
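
The cell above assumes the file exists and already contains a `metadata` block. A minimal sketch of a more defensive loader follows. `load_json_checked` and its `required_keys` parameter are hypothetical names introduced here for illustration and are not part of the original notebook.

```python
import json
from pathlib import Path


def load_json_checked(path, required_keys):
    """Load a JSON object and confirm the expected top-level keys exist.

    Hypothetical helper: the name and signature are illustrative only.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset not found: {path}")
    with path.open("r", encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise ValueError(f"{path.name}: expected a JSON object at the top level")
    # Collect every missing key so the error message lists them all at once
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise KeyError(f"{path.name} is missing expected keys: {missing}")
    return data
```

Called as `load_json_checked('../cms_medicaid_disability_enrollment.json', ['metadata', 'national_enrollment_by_eligibility_group_yearly'])`, this would fail early with a readable message rather than with a bare `KeyError` further down the notebook.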

In [None]:
# Create DataFrame from national enrollment trends
national_df = pd.DataFrame(cms_data['national_enrollment_by_eligibility_group_yearly'])

# Filter for ABD-related groups
abd_groups = ['Aged', 'Persons with disabilities', 'Child with disabilities']
abd_df = national_df[national_df['eligibility_group'].isin(abd_groups)].copy()

print("\nNational ABD Enrollment Data:")
print(abd_df.head(10))
print(f"\nYears covered: {abd_df['year'].min()} - {abd_df['year'].max()}")
print(f"Total records: {len(abd_df)}")
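
`isin` silently returns an empty frame when a label is misspelled or absent, so it can help to confirm which eligibility groups actually occur before filtering. The sketch below works on the raw list of records and uses only the standard library. `summarize_groups` is a hypothetical helper, and the record layout (one `eligibility_group` field per record) is assumed from the filter above.

```python
from collections import Counter


def summarize_groups(records, expected_groups):
    """Count records per eligibility group and list expected labels that never occur."""
    counts = Counter(rec.get("eligibility_group") for rec in records)
    # Counter returns 0 for unseen labels, so absent groups are easy to detect
    missing = [group for group in expected_groups if counts[group] == 0]
    return counts, missing
```

Running `summarize_groups(cms_data['national_enrollment_by_eligibility_group_yearly'], abd_groups)` and checking that `missing` is empty would catch a label mismatch before it surfaces as an empty chart.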

## 2. Load SAMHSA Mental Health Data

In [None]:
# Load SAMHSA mental health data
with open('../samhsa_mental_health.json', 'r') as f:
    samhsa_data = json.load(f)

print("SAMHSA Mental Health Dataset Structure:")
print(f"Top-level keys: {list(samhsa_data.keys())}")
print(f"\nPrevalence categories: {list(samhsa_data['prevalence'].keys())}")
print(f"Treatment categories: {list(samhsa_data['treatment'].keys())}")

In [None]:
# Extract Any Mental Illness (AMI) prevalence data
ami_data = samhsa_data['prevalence']['any_mental_illness_adults']['by_year']
ami_df = pd.DataFrame([
    {'year': int(year), 'number_thousands': data['number_thousands'], 'percent': data['percent']}
    for year, data in ami_data.items()
]).sort_values('year')

print("\nAny Mental Illness Prevalence (2008-2023):")
print(ami_df.tail(10))

# Extract Serious Mental Illness (SMI) prevalence
smi_data = samhsa_data['prevalence']['serious_mental_illness_adults']['by_year']
smi_df = pd.DataFrame([
    {'year': int(year), 'number_thousands': data['number_thousands'], 'percent': data['percent']}
    for year, data in smi_data.items()
]).sort_values('year')

print("\nSerious Mental Illness Prevalence (2008-2023):")
print(smi_df.tail(5))
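
The AMI and SMI extractions above repeat the same comprehension. One possible way to factor it out is sketched below. `by_year_to_records` is a hypothetical helper name, and the `{year_string: {field: value}}` layout is assumed from the cell above.

```python
def by_year_to_records(by_year, fields=("number_thousands", "percent")):
    """Flatten a {year_string: {field: value}} mapping into rows sorted by year."""
    rows = []
    for year, values in by_year.items():
        row = {"year": int(year)}
        for field in fields:
            row[field] = values.get(field)  # None if a field is absent for that year
        rows.append(row)
    return sorted(rows, key=lambda row: row["year"])
```

With this in place, each frame could be built in one line, for example `ami_df = pd.DataFrame(by_year_to_records(ami_data))`.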

## 3. Visualization: Medicaid ABD Enrollment Trends

In [None]:
# Aggregate ABD enrollment by year
abd_yearly = abd_df.groupby('year')['ever_enrolled'].sum().reset_index()
abd_yearly['enrollment_millions'] = abd_yearly['ever_enrolled'] / 1_000_000

# Create line chart
fig, ax = plt.subplots(figsize=(14, 7))

ax.plot(abd_yearly['year'], abd_yearly['enrollment_millions'], 
        marker='o', linewidth=2.5, markersize=8, color='#2563eb')

# Annotations for key points
latest_year = abd_yearly.iloc[-1]
# The row is upcast to float, so cast the year back to int to avoid a label like "2022.0"
ax.annotate(f"{latest_year['enrollment_millions']:.1f}M enrollees\n({int(latest_year['year'])})",
            xy=(latest_year['year'], latest_year['enrollment_millions']),
            xytext=(10, 20), textcoords='offset points',
            fontsize=11, fontweight='bold',
            bbox=dict(boxstyle='round,pad=0.5', facecolor='white', edgecolor='#2563eb', linewidth=2),
            arrowprops=dict(arrowstyle='->', color='#2563eb', linewidth=1.5))

ax.set_xlabel('Year', fontsize=12, fontweight='bold')
ax.set_ylabel('Enrollment (Millions)', fontsize=12, fontweight='bold')
ax.set_title(f"Medicaid Aged, Blind, and Disabled (ABD) Enrollment {abd_yearly['year'].min()}-{abd_yearly['year'].max()}\nRepresents 15% of Medicaid enrollees but ~45% of program spending",
             fontsize=14, fontweight='bold', pad=20)
ax.grid(True, alpha=0.3)
ax.set_ylim(bottom=0)

plt.tight_layout()
plt.show()

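# Report the most recent year in the data instead of hard-coding 2022,
# which raises IndexError if that year is absent
latest_abd_row = abd_yearly.iloc[-1]
print(f"\n{int(latest_abd_row['year'])} ABD Enrollment: {latest_abd_row['enrollment_millions']:.2f} million")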

## 4. Visualization: Mental Health Prevalence Trends

In [None]:
# Create dual-axis chart for AMI and SMI
fig, ax1 = plt.subplots(figsize=(14, 7))

# Plot AMI prevalence
color1 = '#dc2626'
ax1.plot(ami_df['year'], ami_df['percent'], marker='o', linewidth=2.5, 
         markersize=7, color=color1, label='Any Mental Illness (AMI)')
ax1.set_xlabel('Year', fontsize=12, fontweight='bold')
ax1.set_ylabel('Any Mental Illness (%)', fontsize=12, fontweight='bold', color=color1)
ax1.tick_params(axis='y', labelcolor=color1)
ax1.grid(True, alpha=0.3)

# Add second y-axis for SMI
ax2 = ax1.twinx()
color2 = '#ea580c'
ax2.plot(smi_df['year'], smi_df['percent'], marker='s', linewidth=2.5, 
         markersize=7, color=color2, label='Serious Mental Illness (SMI)', linestyle='--')
ax2.set_ylabel('Serious Mental Illness (%)', fontsize=12, fontweight='bold', color=color2)
ax2.tick_params(axis='y', labelcolor=color2)

# Add annotations for 2023
ami_2023 = ami_df[ami_df['year'] == 2023].iloc[0]
ax1.annotate(f"{ami_2023['percent']:.1f}% (2023)\n{ami_2023['number_thousands']/1000:.1f}M adults",
            xy=(2023, ami_2023['percent']),
            xytext=(-80, 20), textcoords='offset points',
            fontsize=10, fontweight='bold', color=color1,
            bbox=dict(boxstyle='round,pad=0.5', facecolor='white', edgecolor=color1, linewidth=2),
            arrowprops=dict(arrowstyle='->', color=color1, linewidth=1.5))

smi_2023 = smi_df[smi_df['year'] == 2023].iloc[0]
ax2.annotate(f"{smi_2023['percent']:.1f}% (2023)\n{smi_2023['number_thousands']/1000:.1f}M adults",
            xy=(2023, smi_2023['percent']),
            xytext=(-80, -40), textcoords='offset points',
            fontsize=10, fontweight='bold', color=color2,
            bbox=dict(boxstyle='round,pad=0.5', facecolor='white', edgecolor=color2, linewidth=2),
            arrowprops=dict(arrowstyle='->', color=color2, linewidth=1.5))

# Add COVID-19 and methodology change markers
ax1.axvline(x=2020, color='gray', linestyle=':', alpha=0.5, linewidth=2)
ax1.text(2020, ax1.get_ylim()[1] * 0.95, 'COVID-19', 
         rotation=90, verticalalignment='top', fontsize=9, color='gray')
ax1.axvline(x=2021, color='gray', linestyle=':', alpha=0.5, linewidth=2)
ax1.text(2021, ax1.get_ylim()[1] * 0.95, 'Methodology\nChange', 
         rotation=90, verticalalignment='top', fontsize=9, color='gray')

plt.title('Mental Illness Prevalence Among U.S. Adults (2008-2023)\nNational Survey on Drug Use and Health (NSDUH)',
          fontsize=14, fontweight='bold', pad=20)

# Combine legends
lines1, labels1 = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax1.legend(lines1 + lines2, labels1 + labels2, loc='upper left', fontsize=11, framealpha=0.9)

plt.tight_layout()
plt.show()

print(f"\n2023 Mental Illness Prevalence:")
print(f"  Any Mental Illness: {ami_2023['percent']}% ({ami_2023['number_thousands']/1000:.1f}M adults)")
print(f"  Serious Mental Illness: {smi_2023['percent']}% ({smi_2023['number_thousands']/1000:.1f}M adults)")

## 5. Visualization: Mental Health Treatment Gap

In [None]:
# Extract treatment data for SMI
smi_treatment_data = samhsa_data['treatment']['serious_mental_illness_adults']['by_year']
smi_treatment_df = pd.DataFrame([
    {
        'year': int(year),
        'received_treatment_percent': data.get('received_treatment_percent', None),
        'no_treatment_percent': data.get('no_treatment_percent', None)
    }
    for year, data in smi_treatment_data.items()
]).sort_values('year')

# Remove rows with missing data
smi_treatment_df = smi_treatment_df.dropna()

print("\nSMI Treatment Data:")
print(smi_treatment_df.tail(10))
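
The stacked chart in the next cell draws the "No Treatment" band as everything above `received_treatment_percent`, so its label is only accurate when the two shares sum to roughly 100%. A small consistency check along these lines could be run first. `find_inconsistent_years` and the 0.5-point default tolerance are assumptions made for this sketch.

```python
import math


def find_inconsistent_years(rows, tolerance=0.5):
    """Return years where treated + untreated shares do not sum to about 100%."""
    bad_years = []
    for row in rows:
        total = row["received_treatment_percent"] + row["no_treatment_percent"]
        if not math.isclose(total, 100.0, abs_tol=tolerance):
            bad_years.append(row["year"])
    return bad_years
```

For the frame built above, `find_inconsistent_years(smi_treatment_df.to_dict('records'))` should return an empty list if the chart's assumption holds.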

In [None]:
# Create stacked area chart
fig, ax = plt.subplots(figsize=(14, 7))

ax.fill_between(smi_treatment_df['year'], 0, smi_treatment_df['received_treatment_percent'],
                color='#059669', alpha=0.7, label='Received Treatment')
ax.fill_between(smi_treatment_df['year'], smi_treatment_df['received_treatment_percent'], 100,
                color='#dc2626', alpha=0.7, label='No Treatment')

# Add percentage labels for latest year
latest = smi_treatment_df.iloc[-1]
ax.text(latest['year'], latest['received_treatment_percent']/2, 
        f"{latest['received_treatment_percent']:.1f}%\nTreated",
        ha='center', va='center', fontsize=13, fontweight='bold', color='white')
ax.text(latest['year'], latest['received_treatment_percent'] + (100-latest['received_treatment_percent'])/2,
        f"{latest['no_treatment_percent']:.1f}%\nUntreated",
        ha='center', va='center', fontsize=13, fontweight='bold', color='white')

ax.set_xlabel('Year', fontsize=12, fontweight='bold')
ax.set_ylabel('Percentage of Adults with SMI', fontsize=12, fontweight='bold')
ax.set_title(f"Mental Health Treatment Gap: Adults with Serious Mental Illness\n{latest['no_treatment_percent']:.0f}% of adults with SMI received no treatment in {int(latest['year'])}",
             fontsize=14, fontweight='bold', pad=20)
ax.set_ylim(0, 100)
ax.legend(loc='upper left', fontsize=11, framealpha=0.9)
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()

print(f"\n{int(latest['year'])} SMI Treatment:")
print(f"  Received Treatment: {latest['received_treatment_percent']:.1f}%")
print(f"  No Treatment: {latest['no_treatment_percent']:.1f}%")

## 6. Visualization: State-Level Medicaid ABD Enrollment

In [None]:
# Load state-level data
if 'state_enrollment_by_eligibility_group_yearly' in cms_data:
    state_df = pd.DataFrame(cms_data['state_enrollment_by_eligibility_group_yearly'])
    
    # Filter for latest year and ABD groups
    latest_year = state_df['year'].max()
    state_abd = state_df[
        (state_df['year'] == latest_year) & 
        (state_df['eligibility_group'].isin(abd_groups))
    ].copy()
    
    # Aggregate by state
    state_totals = state_abd.groupby('state')['ever_enrolled'].sum().reset_index()
    state_totals = state_totals.sort_values('ever_enrolled', ascending=False)
    
    print(f"\nState ABD Enrollment ({latest_year}):")
    print(state_totals.head(10))
    print(f"\nTotal states reporting: {len(state_totals)}")
else:
    print("\nState-level data not found in dataset")
    state_totals = None

In [None]:
# Create horizontal bar chart for top/bottom states
if state_totals is not None and len(state_totals) > 0:
    # Get top 10 and bottom 10
    top_10 = state_totals.head(10)
    bottom_10 = state_totals.tail(10)
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))
    
    # Top 10 states
    ax1.barh(range(len(top_10)), top_10['ever_enrolled']/1000, color='#2563eb')
    ax1.set_yticks(range(len(top_10)))
    ax1.set_yticklabels(top_10['state'])
    ax1.set_xlabel('Enrollment (Thousands)', fontsize=11, fontweight='bold')
    ax1.set_title(f'Top 10 States by ABD Enrollment ({latest_year})', fontsize=12, fontweight='bold')
    ax1.invert_yaxis()
    ax1.grid(True, alpha=0.3, axis='x')
    
    # Add value labels
    for i, (idx, row) in enumerate(top_10.iterrows()):
        ax1.text(row['ever_enrolled']/1000, i, f" {row['ever_enrolled']/1000:.0f}K",
                va='center', fontsize=9, fontweight='bold')
    
    # Bottom 10 states
    ax2.barh(range(len(bottom_10)), bottom_10['ever_enrolled']/1000, color='#7c3aed')
    ax2.set_yticks(range(len(bottom_10)))
    ax2.set_yticklabels(bottom_10['state'])
    ax2.set_xlabel('Enrollment (Thousands)', fontsize=11, fontweight='bold')
    ax2.set_title(f'Bottom 10 States by ABD Enrollment ({latest_year})', fontsize=12, fontweight='bold')
    ax2.invert_yaxis()
    ax2.grid(True, alpha=0.3, axis='x')
    
    # Add value labels
    for i, (idx, row) in enumerate(bottom_10.iterrows()):
        ax2.text(row['ever_enrolled']/1000, i, f" {row['ever_enrolled']/1000:.0f}K",
                va='center', fontsize=9, fontweight='bold')
    
    plt.suptitle('Medicaid ABD Enrollment by State: Geographic Variation', 
                 fontsize=14, fontweight='bold', y=0.98)
    plt.tight_layout()
    plt.show()
    
    print(f"\nHighest enrollment: {top_10.iloc[0]['state']} ({top_10.iloc[0]['ever_enrolled']/1000:.0f}K)")
    print(f"Lowest enrollment: {bottom_10.iloc[-1]['state']} ({bottom_10.iloc[-1]['ever_enrolled']/1000:.0f}K)")
else:
    print("Insufficient state data for visualization")

## 7. Summary Statistics

In [None]:
print("=" * 80)
print("HEALTHCARE & MENTAL HEALTH: KEY FINDINGS")
print("=" * 80)

print("\n1. MEDICAID DISABILITY ENROLLMENT (CMS)")
print("-" * 80)
if len(abd_yearly) > 0:
    latest_abd = abd_yearly.iloc[-1]
    print(f"   • Total ABD Enrollment ({int(latest_abd['year'])}): {latest_abd['enrollment_millions']:.2f} million")
    print(f"   • ABD enrollees are 15% of Medicaid population")
    print(f"   • ABD enrollees account for ~45% of Medicaid spending")
    print(f"   • Data span: {abd_yearly['year'].min()}-{abd_yearly['year'].max()}")

print("\n2. MENTAL HEALTH PREVALENCE (SAMHSA)")
print("-" * 80)
ami_latest = ami_df.iloc[-1]
smi_latest = smi_df.iloc[-1]
print(f"   • Any Mental Illness ({int(ami_latest['year'])}): {ami_latest['percent']:.1f}% of adults ({ami_latest['number_thousands']/1000:.1f}M)")
print(f"   • Serious Mental Illness ({int(smi_latest['year'])}): {smi_latest['percent']:.1f}% of adults ({smi_latest['number_thousands']/1000:.1f}M)")
print(f"   • Trend {int(ami_df.iloc[0]['year'])}-{int(ami_latest['year'])}: AMI went from {ami_df.iloc[0]['percent']:.1f}% to {ami_latest['percent']:.1f}%")
print(f"   • Note: 2021 methodology change affects comparability")

print("\n3. TREATMENT GAP")
print("-" * 80)
if len(smi_treatment_df) > 0:
    treatment_latest = smi_treatment_df.iloc[-1]
    print(f"   • Adults with SMI who received treatment ({int(treatment_latest['year'])}): {treatment_latest['received_treatment_percent']:.1f}%")
    print(f"   • Adults with SMI who received NO treatment ({int(treatment_latest['year'])}): {treatment_latest['no_treatment_percent']:.1f}%")
    print(f"   • Approximately {smi_latest['number_thousands'] * treatment_latest['no_treatment_percent'] / 100 / 1000:.1f}M adults with SMI are untreated")

if state_totals is not None and len(state_totals) > 0:
    print("\n4. STATE VARIATION")
    print("-" * 80)
    print(f"   • Highest ABD enrollment: {top_10.iloc[0]['state']} ({top_10.iloc[0]['ever_enrolled']/1000:.0f}K)")
    print(f"   • Lowest ABD enrollment: {bottom_10.iloc[-1]['state']} ({bottom_10.iloc[-1]['ever_enrolled']/1000:.0f}K)")
    print(f"   • States reporting: {len(state_totals)}")

print("\n" + "=" * 80)
print("DATA SOURCES")
print("=" * 80)
print("CMS: Centers for Medicare & Medicaid Services (T-MSIS Analytic Files)")
print("SAMHSA: Substance Abuse and Mental Health Services Administration (NSDUH)")
print("=" * 80)

## Interpretation

### Medicaid ABD Population
The Aged, Blind, and Disabled (ABD) population is a small but high-cost segment of Medicaid. Although it makes up only about 15% of total enrollment, it accounts for approximately 45% of program spending because of its intensive care needs. Enrollment held roughly steady at 10-11 million people from 2016 to 2022.
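
The enrollment and spending shares quoted above imply a per-enrollee cost ratio that can be worked out directly. The sketch below uses only the rounded 15% and 45% figures from the text, so the result is a rough illustration rather than a value computed from the CMS dataset. `relative_cost_per_enrollee` is a hypothetical helper.

```python
def relative_cost_per_enrollee(enrollment_share, spending_share):
    """Per-enrollee spending for a group relative to everyone else in the program."""
    if not (0 < enrollment_share < 1 and 0 < spending_share < 1):
        raise ValueError("shares must be strictly between 0 and 1")
    # Spending share divided by enrollment share, for the group and for the remainder
    group_intensity = spending_share / enrollment_share
    rest_intensity = (1 - spending_share) / (1 - enrollment_share)
    return group_intensity / rest_intensity


# Rounded shares taken from the text above, not read from the dataset
ratio = relative_cost_per_enrollee(enrollment_share=0.15, spending_share=0.45)
print(f"ABD per-enrollee spending is about {ratio:.1f}x that of other enrollees")
```

Under those rounded inputs the ratio comes out near 4.6, which is one way to quantify "small but high-cost".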

### Mental Health Crisis
Measured prevalence of mental illness has risen over 2008-2023, with nearly 1 in 4 adults experiencing some form of mental illness in 2023. The rise is most pronounced after 2019, and two factors likely contribute:
- **COVID-19 pandemic** (2020) - social isolation, economic stress, health anxiety
- **Methodology changes** (2021) - web-based data collection may capture more cases
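
Because the 2021 methodology change limits comparability, one cautious way to describe the trend is to compute the change separately on each side of the break instead of across it. The sketch below does that for a `{year: percent}` mapping. `change_within_segments` is a hypothetical helper, and treating 2021 as the first year of the new series is an assumption based on the note above.

```python
def change_within_segments(percent_by_year, break_year):
    """Percentage-point change before and from a comparability break.

    Returns a dict with 'before' and 'after' entries; an entry is None
    when its segment has fewer than two observations.
    """
    years = sorted(percent_by_year)
    segments = {
        "before": [y for y in years if y < break_year],
        "after": [y for y in years if y >= break_year],
    }
    changes = {}
    for name, seg_years in segments.items():
        if len(seg_years) < 2:
            changes[name] = None
        else:
            first, last = seg_years[0], seg_years[-1]
            changes[name] = percent_by_year[last] - percent_by_year[first]
    return changes
```

For the AMI series this could be called as `change_within_segments(dict(zip(ami_df['year'], ami_df['percent'])), break_year=2021)`.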

### Treatment Gap
The most concerning finding is the **37% of adults with serious mental illness who received no treatment** in 2023. This represents approximately 4.5 million Americans with severe, functionally impairing mental illness who are not accessing care. Barriers include:
- Lack of insurance or inadequate coverage
- Provider shortages (especially psychiatrists)
- Stigma and lack of awareness
- Geographic barriers (rural mental health deserts)
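
The untreated count is the SMI population multiplied by the untreated share, so the two rounded figures quoted in this section also imply a total SMI population. The sketch below back-calculates it. The inputs are the rounded values from the text, not values read from the SAMHSA file, and `implied_total` is a hypothetical helper.

```python
def implied_total(part, share):
    """Back out a total from one part and that part's share of the total."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return part / share


untreated_millions = 4.5   # approximate untreated adults with SMI, from the text
untreated_share = 0.37     # share of adults with SMI receiving no treatment, from the text

implied_smi_millions = implied_total(untreated_millions, untreated_share)
print(f"Implied adults with SMI: about {implied_smi_millions:.1f} million")
```

The result (about 12.2 million) is worth comparing against `smi_latest['number_thousands'] / 1000` from the dataset, since the summary cell performs the forward calculation from the data itself.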

### Policy Implications
Taken together, Medicaid disability enrollment and mental health prevalence point to several implications, none of which this notebook tests directly:
1. Many people with SMI may qualify for Medicaid via disability pathways
2. Medicaid expansion may improve access to mental health treatment
3. ABD program costs may increase if mental health prevalence continues to rise
4. Integrated physical-behavioral health models are likely to be important