# Chronic Disease in America
#### Milestone Submission (Nov 18, 2024)

Applied Data Visualization (COMP5960 - Fall 2024)

Team Members: Chetan Elenki, Kalpana Simhadri, Nathaniel Masson

### 1. America's Health Landscape - National Overview

- Opening with a national choropleth map showing the overall chronic disease burden
- Key narrative points:
    * Distribution of major chronic conditions across the country
    * Identification of "hot spots" and "cold spots"
    * Initial patterns that raise questions for further exploration
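
One simple way to make "hot spots" and "cold spots" concrete is to flag states whose prevalence sits well above or below the average across states. The sketch below is illustrative only: the `flag_hot_cold_spots` helper, the one-standard-deviation threshold, and the toy values are assumptions, not part of the project code or real PLACES figures.

```python
import pandas as pd


def flag_hot_cold_spots(state_means: pd.Series, threshold: float = 1.0) -> pd.Series:
    """Label each state 'hot', 'cold', or 'typical' by its z-score.

    Hypothetical helper: `threshold` is the number of standard deviations
    from the mean across states that counts as a hot or cold spot.
    """
    z_scores = (state_means - state_means.mean()) / state_means.std()
    labels = pd.Series("typical", index=state_means.index)
    labels[z_scores >= threshold] = "hot"
    labels[z_scores <= -threshold] = "cold"
    return labels


# Toy prevalence values (made up for illustration, not real PLACES data)
toy_means = pd.Series({"AA": 8.0, "BB": 10.0, "CC": 10.0, "DD": 10.0, "EE": 14.0})
spot_labels = flag_hot_cold_spots(toy_means)
print(spot_labels)
```

A z-score cutoff is only one possible definition; a quantile cutoff or a spatial statistic could be swapped in without changing how the labels feed the map.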

### 2. Regional Stories: The Geography of Health Disparities
- Diving into distinct regional patterns
- Key narrative elements:
    * The "Stroke Belt" in the Southeast
    * Diabetes patterns in the Southwest
    * Heart disease clusters in the Rust Belt
    * Respiratory health issues in urban corridors

### 3. The Urban-Rural Divide: Two Americas?
- Exploring how health patterns shift across the urban-rural continuum
- Key narrative points:
    * Differences in access to healthcare
    * Lifestyle-related health outcomes
    * Correlations between economic factors and health outcomes

In [4]:
import sys
from pathlib import Path

# Add the project root (the parent of the working directory) to the import path so `config` can be imported
PROJECT_ROOT = Path().resolve().parents[0]
sys.path.append(str(PROJECT_ROOT))

In [5]:
import plotly.express as px
import plotly.graph_objects as go
import pandas as pd
import numpy as np
from typing import List

# Import only the helper used below rather than everything in config
from config import get_file_path

def create_health_choropleth(
    df: pd.DataFrame,
    measures: List[str] = ['DIABETES_CrudePrev', 'OBESITY_CrudePrev', 'BPHIGH_CrudePrev', 'STROKE_CrudePrev']
) -> go.Figure:
    """
    Creates an interactive choropleth map of health outcomes using Plotly
    
    Parameters:
    -----------
    df : pd.DataFrame
        PLACES dataset
    measures : List[str]
        List of health measures to visualize
    
    Returns:
    --------
    fig : go.Figure
        Interactive Plotly figure
    """
    # Aggregate to state level: unweighted mean, standard deviation, and row count per measure
    state_stats = df.groupby(['StateDesc', 'StateAbbr'])[measures].agg(['mean', 'std', 'count']).reset_index()

    # Flatten column names
    state_stats.columns = [
        f"{col[0]}_{col[1]}" if isinstance(col, tuple) and col[1] else col[0]
        for col in state_stats.columns
    ]
    
    # Create figure
    fig = go.Figure()
    
    # Add choropleth traces for each measure
    for measure in measures:
        measure_name = measure.replace('_CrudePrev', '')
        display_name = {
            'DIABETES': 'Diabetes',
            'OBESITY': 'Obesity',
            'BPHIGH': 'High Blood Pressure',
            'STROKE': 'Stroke'
        }.get(measure_name, measure_name)
        
        fig.add_trace(
            go.Choropleth(
                locations=state_stats['StateAbbr'],
                z=state_stats[f'{measure}_mean'],
                locationmode='USA-states',
                colorscale='Reds',
                name=display_name,
                zmin=state_stats[f'{measure}_mean'].min(),
                zmax=state_stats[f'{measure}_mean'].max(),
                visible=(measure == measures[0]),  # show only the first measure initially
                colorbar_title="Prevalence (%)",
                hovertemplate=(
                    "<b>%{location}</b><br>" +
                    "Prevalence: %{z:.1f}%<br>" +
                    "<extra></extra>"
                )
            )
        )
    
    # Update layout
    fig.update_layout(
        title={
            'text': 'Chronic Disease Prevalence Across the United States',
            'x': 0.5,
            'xanchor': 'center',
            'font': {'size': 24}
        },
        geo=dict(
            scope='usa',
            projection_type='albers usa',
            showlakes=True,
            lakecolor='rgb(255, 255, 255)'
        ),
        width=1000,
        height=600,
        updatemenus=[{
            'buttons': [
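                {
                    'method': 'update',
                    # Reuse each trace's display name so the menu shows readable labels
                    'label': fig.data[i].name,
                    'args': [
                        # Show only the trace for the selected measure
                        {'visible': [i == j for j in range(len(measures))]},
                        # Update only the title text so centering and font size persist
                        {'title.text': f'{fig.data[i].name} Prevalence by State'}
                    ]
                } for i in range(len(measures))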
            ],
            'direction': 'down',
            'showactive': True,
            'x': 0.1,
            'y': 1.1
        }]
    )
    
    return fig

def main():
    """Load the 2024 data file and build the state-level choropleth."""
    # Load data
    file_path = get_file_path(2024)
    df = pd.read_csv(file_path)
    
    # Create visualizations
    choropleth = create_health_choropleth(df)
    
    return choropleth

# Create and display visualizations
choropleth = main()
choropleth.show()
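
The state values mapped above are unweighted means over the rows in each state, so a sparsely populated area counts as much as a densely populated one. If a population-weighted figure is wanted instead, it could be sketched roughly as follows. The `weighted_state_mean` helper is hypothetical, the `TotalPopulation` column name is an assumption about the data file, and the toy rows are made up.

```python
import pandas as pd


def weighted_state_mean(df: pd.DataFrame, measure: str,
                        weight_col: str = "TotalPopulation") -> pd.Series:
    """Population-weighted mean of `measure` per state (hypothetical helper)."""
    # Drop rows where either the measure or the weight is missing
    valid = df.dropna(subset=[measure, weight_col])
    weighted_sum = (valid[measure] * valid[weight_col]).groupby(valid["StateAbbr"]).sum()
    total_weight = valid.groupby("StateAbbr")[weight_col].sum()
    return weighted_sum / total_weight


# Toy rows (made up for illustration, not real PLACES data)
toy = pd.DataFrame({
    "StateAbbr": ["AA", "AA", "BB"],
    "TotalPopulation": [1000, 3000, 500],
    "DIABETES_CrudePrev": [10.0, 14.0, 9.0],
})
weighted = weighted_state_mean(toy, "DIABETES_CrudePrev")
print(weighted)
```

In the toy data the weighted value for `AA` is 13.0, while the unweighted mean of the same two rows would be 12.0, which shows how much the choice of aggregation can shift a state's color.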

## County-Level Choropleth Maps
#### (Work In Progress)

![County Map](county_map.png "County Choropleth")
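
County maps are commonly keyed on five-digit county FIPS codes, and codes read from a CSV as integers lose their leading zeros, which leaves some states blank on the map. A minimal sketch of the normalization step follows; the `normalize_fips` helper is hypothetical and not taken from the project code.

```python
import pandas as pd


def normalize_fips(codes: pd.Series) -> pd.Series:
    """Return county FIPS codes as zero-padded five-character strings."""
    # Cast through int so values such as 1001.0 or "01001" are handled alike
    return codes.astype(int).astype(str).str.zfill(5)


# 1001 is Autauga County, Alabama; 49035 is Salt Lake County, Utah
raw_codes = pd.Series([1001, 49035])
fips = normalize_fips(raw_codes)
print(fips.tolist())  # ['01001', '49035']
```

Rows with a missing code would need to be dropped or filled before the integer cast, since `astype(int)` raises on missing values.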

## Parallel Coordinates Plot (Urban/Rural Analysis)
#### (Work In Progress)

![PCP Plot](pcp_plot.png "Urban-Rural patterns")