## B2. Concordance Analysis – Need vs AI Implementation

**Description**  
This section evaluates the alignment between healthcare need and AI implementation levels across hospitals. Tertiles are computed for both variables to assess patterns of concordance.

**Purpose**  
To examine whether AI implementation levels correspond with areas of greatest need. High concordance would suggest equitable distribution, while misalignment could indicate disparities in AI resource allocation.

**Method Summary**  
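- Rank-based tertiles were created for the need measures: Health Professional Shortage Area (HPSA), Medically Underserved Area (MUA), Area Deprivation Index (ADI), and Social Vulnerability Index (SVI) scores.  
- AI implementation scores were already coded as tertiles (0, 1, 2) and are relabeled Low, Medium, High.  
- Cross-tabulations of need tertile by AI implementation tertile were generated as percentages of all hospitals and visualized as heatmaps.  
- A "perfect concordance rate" was calculated as the percentage of hospitals whose AI implementation and need tertiles matched exactly (the Low/Low, Medium/Medium, and High/High cells). The code in this section reports the mismatch percentage, which is 100 minus this rate.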
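
The steps above can be sketched on a small toy table. This is a minimal illustration, not the notebook's actual data: the six hospitals, the column names `need_score` and `ai_code`, and the values are all made up for the example.

```python
import pandas as pd

# Hypothetical toy data: six hospitals with a need score and an AI tertile code
toy = pd.DataFrame({
    "need_score": [1.0, 2.0, 2.0, 5.0, 7.0, 9.0],
    "ai_code": [0, 0, 1, 1, 2, 0],
})
levels = ["Low", "Medium", "High"]

# Rank first so tied scores get distinct ranks, then cut the ranks into 3 equal bins
toy["need_tertile"] = pd.qcut(
    toy["need_score"].rank(method="first"), 3, labels=levels
)

# Relabel the AI codes and store them as an ordered categorical
toy["ai_tertile"] = pd.Categorical(
    toy["ai_code"].map({0: "Low", 1: "Medium", 2: "High"}),
    categories=levels,
    ordered=True,
)

# Cross-tabulate as percentages of all hospitals
table = pd.crosstab(toy["need_tertile"], toy["ai_tertile"], normalize=True) * 100
# Guarantee that every level appears as both a row and a column
table = table.reindex(index=levels, columns=levels, fill_value=0.0)

# Perfect concordance: sum of the cells where both tertiles carry the same label
perfect_concordance = sum(table.loc[lvl, lvl] for lvl in levels)
print(f"Perfect concordance: {perfect_concordance:.1f}%")
```

Note that `rank(method="first")` breaks ties by order of appearance, so the two hospitals with a need score of 2.0 end up in different tertiles (Low and Medium). This keeps the bins equally sized but means that tied hospitals are not always grouped together.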


### 1 Load necessary libraries, functions, and pre-processed data 

In [17]:

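# Load necessary libraries
import geopandas as gpd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os

# Project-local helper module used in section 2.
# Assumption: it is importable from the working directory under this name.
import calculate_ai_scores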

In [18]:
ai_exposures = ["ai_base_score",
"ai_base_breadth_score",
"ai_base_dev_score",
"ai_base_eval_score"]

In [19]:
AHA_master = pd.read_csv("./data/AHA_master_external_data.csv", low_memory=False)
AHA_IT = AHA_master[AHA_master.id_it.notna()]

### 2 Data engineering 

In [None]:
AHA_master2 = calculate_ai_scores.apply_ai_scores_to_dataframe(AHA_IT)

In [None]:
# Exclude territories; .copy() avoids SettingWithCopyWarning when columns are added below
AHA_IT_US = AHA_master2[AHA_master2['division'] != 'Territories'].copy()
AHA_IT_US.shape

In [None]:
# Convert numeric AI implementation scores to categorical labels
AHA_IT_US['AI_implementation_tertile'] = AHA_IT_US['ai_base_score'].map({
    0: 'Low',
    1: 'Medium',
    2: 'High'
})
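
One caveat with the `.map` call above: any score that is not a key in the dictionary becomes `NaN` and silently drops out of the later cross-tabulations. The following is a small sketch of how such values could be surfaced, using made-up scores rather than the real `ai_base_score` column.

```python
import pandas as pd

# Hypothetical scores; the value 3 is deliberately outside the expected 0/1/2 coding
scores = pd.Series([0, 1, 2, 3, 1], name="ai_base_score")

labels = scores.map({0: "Low", 1: "Medium", 2: "High"})

# Scores that were present but did not match any key in the mapping
unmapped = scores[labels.isna() & scores.notna()]
print(f"{len(unmapped)} unmapped value(s): {sorted(unmapped.unique())}")
```

If a check like this reports unmapped values on the real data, the assumption that the score is already coded as 0, 1, 2 would need revisiting before interpreting the concordance tables.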

### 3 Concordance analysis 

In [None]:
# First create tertiles for all need measures using rank-based approach

# HPSA measures
AHA_IT_US['primary_hpss_tertile'] = pd.qcut(AHA_IT_US['mean_primary_hpss'].rank(method='first'), 
                                           3, 
                                           labels=['Low', 'Medium', 'High'])

AHA_IT_US['mental_hpss_tertile'] = pd.qcut(AHA_IT_US['mean_mental_hpss'].rank(method='first'), 
                                          3, 
                                          labels=['Low', 'Medium', 'High'])

AHA_IT_US['dental_hpss_tertile'] = pd.qcut(AHA_IT_US['mean_dental_hpss'].rank(method='first'), 
                                          3, 
                                          labels=['Low', 'Medium', 'High'])

# MUA measures
AHA_IT_US['mua_score_tertile'] = pd.qcut(AHA_IT_US['mean_mua_score'].rank(method='first'), 
                                        3, 
                                        labels=['Low', 'Medium', 'High'])

# Add MUA-specific measures (elder and infant)
AHA_IT_US['mua_elder_tertile'] = pd.qcut(AHA_IT_US['mean_mua_elders_score'].rank(method='first'), 
                                        3, 
                                        labels=['Low', 'Medium', 'High'])

AHA_IT_US['mua_infant_tertile'] = pd.qcut(AHA_IT_US['mean_mua_infant_score'].rank(method='first'), 
                                         3, 
                                         labels=['Low', 'Medium', 'High'])

# Area Deprivation Index (higher score = higher need)
AHA_IT_US['adi_tertile'] = pd.qcut(AHA_IT_US['national_adi_median'].rank(method='first'), 
                                  3, 
                                  labels=['Low', 'Medium', 'High'])

# Social Vulnerability Index (higher score = higher need)
AHA_IT_US['svi_tertile'] = pd.qcut(AHA_IT_US['svi_themes_median'].rank(method='first'), 
                                  3, 
                                  labels=['Low', 'Medium', 'High'])

# Convert to categorical type with ordered categories
AHA_IT_US['AI_implementation_tertile'] = pd.Categorical(
    AHA_IT_US['AI_implementation_tertile'],
    categories=['Low', 'Medium', 'High'],
    ordered=True
)

# Create concordance tables for all measures
need_measures = {
    # Top row - HPSA measures
    'Primary HPSA': 'primary_hpss_tertile',
    'Mental HPSA': 'mental_hpss_tertile', 
    'Dental HPSA': 'dental_hpss_tertile',
    # Middle row - MUA measures
    'MUA Overall': 'mua_score_tertile',
    'MUA Elder': 'mua_elder_tertile',
    'MUA Infant': 'mua_infant_tertile',
    # Bottom row - Social indices
    'Area Deprivation Index': 'adi_tertile',
    'Social Vulnerability Index': 'svi_tertile',
    'Empty': None  # Placeholder for 3x3 grid
}

concordance_tables = {}
for name, column in need_measures.items():
    if column is not None:  # Skip the empty placeholder
        concordance_tables[name] = pd.crosstab(
            AHA_IT_US[column], 
            AHA_IT_US['AI_implementation_tertile'], 
            normalize=True
        ) * 100
        # Reorder to put High Need at TOP, Low Need at BOTTOM
        concordance_tables[name] = concordance_tables[name].reindex(['High', 'Medium', 'Low'])

# Create 3x3 visualization
fig, axes = plt.subplots(3, 3, figsize=(18, 15))

# Calculate global min and max for uniform color scale
all_values = []
for table in concordance_tables.values():
    all_values.extend(table.values.flatten())
vmin = min(all_values)
vmax = max(all_values)

layout_order = [
    'Primary HPSA', 'Mental HPSA', 'Dental HPSA',        # Top row - HPSA
    'MUA Overall', 'MUA Elder', 'MUA Infant',            # Middle row - MUA  
    'Area Deprivation Index', 'Social Vulnerability Index', 'Empty'  # Bottom row - Social indices
]

# Plot all concordance heatmaps
for i, name in enumerate(layout_order):
    row = i // 3
    col = i % 3
    
    if name == 'Empty':
        # Hide the empty subplot
        axes[row, col].axis('off')
    else:
        table = concordance_tables[name]
        sns.heatmap(table, 
                    annot=True, 
                    fmt='.1f', 
                    cmap='YlOrRd', 
                    ax=axes[row, col],
                    vmin=vmin, vmax=vmax,  # Uniform color scale
                    cbar_kws={'label': 'Percentage'})
        axes[row, col].set_title(f'{name} vs AI Implementation')
        axes[row, col].set_xlabel('AI Implementation Level')
        axes[row, col].set_ylabel(f'{name} Need Level')
        

plt.tight_layout()
plt.show()

# Display all concordance tables
for name, table in concordance_tables.items():
    print(f"\n{name} Need Level vs AI Implementation")
    print(table)

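# Calculate concordance and mismatch percentages for all measures
print("\n" + "="*50)
print("CONCORDANCE AND MISMATCH PERCENTAGES")
print("="*50)

for name, table in concordance_tables.items():
    # Perfect concordance: sum of the cells where need and AI tertiles match.
    # Cells are looked up by label so the result does not depend on the
    # High-to-Low row order used for plotting.
    perfect_concordance = (table.loc['High', 'High'] +      # High Need + High AI
                           table.loc['Medium', 'Medium'] +  # Medium Need + Medium AI
                           table.loc['Low', 'Low'])         # Low Need + Low AI
    
    # Mismatch is the complement of the perfect concordance rate
    mismatch_percentage = 100 - perfect_concordance
    
    print(f"{name}: {perfect_concordance:.2f}% concordant, {mismatch_percentage:.2f}% mismatch")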
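
As a cross-check on the cell sums above, the same rate can be computed directly from the two tertile columns without going through the cross-tabulation. The sketch below assumes a hypothetical helper named `concordance_rate` and uses toy inputs. Rows with a missing tertile in either column are dropped, which mirrors how `pd.crosstab` excludes them by default.

```python
import pandas as pd

def concordance_rate(need, ai):
    """Percentage of rows where the two tertile labels are identical.

    Hypothetical helper for illustration; rows missing either label are dropped.
    """
    pair = pd.DataFrame({
        "need": need.astype("object"),
        "ai": ai.astype("object"),
    }).dropna()
    return (pair["need"] == pair["ai"]).mean() * 100

# Toy inputs: the last hospital has no need tertile and is excluded
levels = ["Low", "Medium", "High"]
need = pd.Series(pd.Categorical(["Low", "Medium", "High", "High", None],
                                categories=levels, ordered=True))
ai = pd.Series(["Low", "High", "High", "Low", "Low"])

rate = concordance_rate(need, ai)
print(f"Concordance: {rate:.1f}%, mismatch: {100 - rate:.1f}%")
```

On the real data, a call such as `concordance_rate(AHA_IT_US['adi_tertile'], AHA_IT_US['AI_implementation_tertile'])` would be expected to agree with the corresponding table-based figure, provided both use the same set of hospitals.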