# Isolating Gamma vs Kappa Effects on AUC

This notebook prepares and evaluates prediction runs with different combinations of fixed and free gamma and kappa, to isolate which parameter causes the AUC drop.

**Configurations:**
1. **Original**: phi fixed, gamma free, kappa free (already run)
2. **Fixed gamma, free kappa**: phi fixed, gamma fixed, kappa free (NEW - run 5 batches)
3. **Fixed kappa, free gamma**: phi fixed, gamma free, kappa fixed (NEW - run 5 batches)
4. **Fixed both**: phi fixed, gamma fixed, kappa fixed (already run)

**Goal**: Compare AUCs to see which parameter (gamma or kappa) is causing the drop.
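
The four configurations above form a 2×2 grid over gamma and kappa, with phi fixed throughout. A minimal sketch of that grid follows; the `STATES` tuple and the `describe` helper are illustrative names and are not part of the prediction scripts.

```python
from itertools import product

# Each parameter is either re-estimated ("free") or held constant ("fixed").
STATES = ("free", "fixed")

def describe(gamma: str, kappa: str) -> str:
    """Return a one-line description of a configuration (phi is always fixed)."""
    return f"phi fixed, gamma {gamma}, kappa {kappa}"

# product() yields (free, free), (free, fixed), (fixed, free), (fixed, fixed).
configs = [describe(g, k) for g, k in product(STATES, repeat=2)]

for line in configs:
    print(line)
```

Reading the grid by rows and columns is what isolates the effect: if the AUC falls only in the cells where gamma is fixed, gamma is implicated, and likewise for kappa.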

In [None]:
import subprocess
import sys
import os
from pathlib import Path

# Paths
base_dir = Path('/Users/sarahurbut/aladynoulli2/claudefile')
data_dir = '/Users/sarahurbut/Library/CloudStorage/Dropbox-Personal/data_for_running/'
master_path = data_dir + 'master_for_fitting_pooled_correctedE.pt'
pooled_gk_path = data_dir + 'pooled_kappa_gamma.pt'

print("Configuration:")
print(f"  Master checkpoint: {master_path}")
print(f"  Pooled gamma/kappa: {pooled_gk_path}")
print()

## 1. Run Fixed Gamma, Free Kappa (5 batches)

In [None]:
# Fixed gamma, free kappa - NOHUP COMMAND
script_fixedg_freek = base_dir / 'run_aladyn_predict_with_master_vector_cenosrE_fixedg_freek.py'
output_dir_fixedg_freek = '/Users/sarahurbut/Library/CloudStorage/Dropbox/enrollment_predictions_fixedphi_fixedg_freek_vectorized/'

print("="*80)
print("NOHUP COMMAND: Fixed Gamma, Free Kappa (5 batches)")
print("="*80)
print("Run this in terminal:")
print()
nohup_cmd_fixedg_freek = f"""cd {base_dir} && nohup python {script_fixedg_freek.name} \\
    --trained_model_path {master_path} \\
    --pooled_gamma_path {pooled_gk_path} \\
    --output_dir {output_dir_fixedg_freek} \\
    --max_batches 5 \\
    --num_epochs 200 \\
    --learning_rate 1e-1 \\
    --lambda_reg 1e-2 \\
    > predict_fixedg_freek.log 2>&1 &"""
print(nohup_cmd_fixedg_freek)
print()
print("="*80)

## 2. Run Fixed Kappa, Free Gamma (5 batches)

In [None]:
# Fixed kappa, free gamma - NOHUP COMMAND
script_fixedk_freeg = base_dir / 'run_aladyn_predict_with_master_vector_cenosrE_fixedk_freeg.py'
output_dir_fixedk_freeg = '/Users/sarahurbut/Library/CloudStorage/Dropbox/enrollment_predictions_fixedphi_fixedk_freeg_vectorized/'

print("="*80)
print("NOHUP COMMAND: Fixed Kappa, Free Gamma (5 batches)")
print("="*80)
print("Run this in terminal:")
print()
nohup_cmd_fixedk_freeg = f"""cd {base_dir} && nohup python {script_fixedk_freeg.name} \\
    --trained_model_path {master_path} \\
    --pooled_kappa_path {pooled_gk_path} \\
    --output_dir {output_dir_fixedk_freeg} \\
    --max_batches 5 \\
    --num_epochs 200 \\
    --learning_rate 1e-1 \\
    --lambda_reg 1e-2 \\
    > predict_fixedk_freeg.log 2>&1 &"""
print(nohup_cmd_fixedk_freeg)
print()
print("="*80)
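
The two command cells differ only in the script name, the pooled-parameter flag, the output directory, and the log file. One possible way to avoid the duplication is a small helper that assembles the shell string. This is a sketch only: `build_nohup_command` is a hypothetical helper, the `/path/to/...` values are placeholders, and the flag names are copied from the commands above.

```python
import shlex

def build_nohup_command(workdir, script, log_name, **flags):
    """Assemble a `cd ... && nohup python ... > log 2>&1 &` shell string.

    Each keyword argument becomes a `--name value` pair. shlex quoting
    protects values that contain spaces or other shell metacharacters.
    """
    args = ["nohup", "python", script]
    for name, value in flags.items():
        args += [f"--{name}", str(value)]
    return (
        f"cd {shlex.quote(str(workdir))} && "
        f"{shlex.join(args)} > {shlex.quote(log_name)} 2>&1 &"
    )

# Placeholder paths (assumptions); substitute master_path, pooled_gk_path, etc.
cmd = build_nohup_command(
    "/path/to/claudefile",
    "run_aladyn_predict_with_master_vector_cenosrE_fixedk_freeg.py",
    "predict_fixedk_freeg.log",
    trained_model_path="/path/to/master.pt",
    pooled_kappa_path="/path/to/pooled_kappa_gamma.pt",
    output_dir="/path/to/output dir/",
    max_batches=5,
    num_epochs=200,
    learning_rate="1e-1",
    lambda_reg="1e-2",
)
print(cmd)
```

The output directory in this example deliberately contains a space to show the quoting. The paths used in this notebook contain none, so the hand-written commands work as they are.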

## 3. Compare AUCs Across All Configurations

After running the above, we'll compare AUCs for the first 5 batches across:
- Original (gamma free, kappa free)
- Fixed gamma, free kappa
- Fixed kappa, free gamma
- Fixed both (gamma fixed, kappa fixed)
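
For reference, an AUC with a bootstrap confidence interval can be computed roughly as sketched below. This is not the code in the evaluation script, whose internals are not shown here. It is a standard rank-based (Mann-Whitney) AUC with a percentile bootstrap, written with the standard library only, and the scores and labels are synthetic.

```python
import random

def auc(scores, labels):
    """Rank-based (Mann-Whitney) AUC; tied scores receive their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_auc(scores, labels, n_bootstraps=100, seed=0):
    """Point estimate plus an approximate 95% percentile interval."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_bootstraps:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # AUC is undefined unless both classes are present
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int(0.025 * (n_bootstraps - 1))]
    hi = stats[int(0.975 * (n_bootstraps - 1))]
    return auc(scores, labels), lo, hi

# Synthetic data (not patient data): roughly 10% events, higher scores for events.
rng = random.Random(1)
labels = [1 if rng.random() < 0.1 else 0 for _ in range(2000)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
point, lo, hi = bootstrap_auc(scores, labels, n_bootstraps=100)
print(f"AUC: {point:.3f} ({lo:.3f}-{hi:.3f})")
```

With few bootstrap iterations the interval endpoints are themselves noisy, so intervals from short runs are better treated as a quick check than as a final estimate.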

In [None]:
# TODO: Create comparison script after predictions are done
print("After running predictions, we'll compare AUCs using:")
print("  - Original: enrollment_predictions_fixedphi_correctedE_vectorized")
print("  - Fixed gamma, free kappa: enrollment_predictions_fixedphi_fixedg_freek_vectorized")
print("  - Fixed kappa, free gamma: enrollment_predictions_fixedphi_fixedk_freeg_vectorized")
print("  - Fixed both: enrollment_predictions_fixedphi_fixedgk_vectorized")
print()
print("We can use compare_fixedgk_vs_original_auc.py as a template.")

In [1]:
%run evaluate_all_4_configs_5batches_auc.py --n_bootstraps 100

EVALUATING ALL 4 CONFIGURATIONS - FIRST 5 BATCHES (50K PATIENTS)
Batch size: 10000
Number of batches: 5
Total patients: 50,000
Bootstrap iterations: 100

Loading data files...
✓ Loaded Y: torch.Size([50000, 348, 52]), E: torch.Size([50000, 348]), pce_df: 50000 rows

CONFIGURATION: FIXEDK_FREEG
Directory: /Users/sarahurbut/Library/CloudStorage/Dropbox/enrollment_predictions_fixedphi_fixedk_freeg_vectorized/
  Pooling 5 batches from enrollment_predictions_fixedphi_fixedk_freeg_vectorized...
    ✓ Batch 0: torch.Size([10000, 348, 52])
    ✓ Batch 1: torch.Size([10000, 348, 52])
    ✓ Batch 2: torch.Size([10000, 348, 52])
    ✓ Batch 3: torch.Size([10000, 348, 52])
    ✓ Batch 4: torch.Size([10000, 348, 52])
  ✓ Pooled shape: torch.Size([50000, 348, 52])

EVALUATING: FIXEDK_FREEG

Evaluating static 10-year AUC...

Evaluating ASCVD (10-Year Outcome, 1-Year Score)...
AUC: 0.732 (0.725-0.741) (calculated on 50000 individuals)
Events (10-Year in Eval Cohort): 4333 (8.7%) (from 50000 individual

In [3]:
%run evaluate_all_4_configs_5batches_auc.py --n_bootstraps 10

EVALUATING ALL 5 CONFIGURATIONS - FIRST 5 BATCHES (50K PATIENTS)
Batch size: 10000
Number of batches: 5
Total patients: 50,000
Bootstrap iterations: 10

Loading data files...
✓ Loaded Y: torch.Size([50000, 348, 52]), E: torch.Size([50000, 348]), pce_df: 50000 rows

CONFIGURATION: FIXEDK_FREEG
Directory: /Users/sarahurbut/Library/CloudStorage/Dropbox/enrollment_predictions_fixedphi_fixedk_freeg_vectorized/
  Pooling 5 batches from enrollment_predictions_fixedphi_fixedk_freeg_vectorized...
    ✓ Batch 0: torch.Size([10000, 348, 52])
    ✓ Batch 1: torch.Size([10000, 348, 52])
    ✓ Batch 2: torch.Size([10000, 348, 52])
    ✓ Batch 3: torch.Size([10000, 348, 52])
    ✓ Batch 4: torch.Size([10000, 348, 52])
  ✓ Pooled shape: torch.Size([50000, 348, 52])

EVALUATING: FIXEDK_FREEG

Evaluating static 10-year AUC...

Evaluating ASCVD (10-Year Outcome, 1-Year Score)...
AUC: 0.732 (0.728-0.736) (calculated on 50000 individuals)
Events (10-Year in Eval Cohort): 4333 (8.7%) (from 50000 individuals

In [4]:
%run /Users/sarahurbut/aladynoulli2/claudefile/reshape_auc_results.py

STATIC 10-YEAR AUC RESULTS

AUC by Disease and Method:
                                        Original            Fixed κ            Fixed γ                 Fixed γκ                   Fixed γκ
                                (free γ, free κ)           (free γ)           (free κ)            (regularized)            (unregularized)
disease
ASCVD                                      0.732              0.732              0.695                    0.695                      0.730
All_Cancers                                0.674              0.674              0.669                    0.669                      0.674
Anemia                                     0.595              0.595              0.595                    0.595                      0.594
Anxiety                                    0.515              0.515              0.509                    0.509                      0.514
Asthma                                     0.526              0.526              0.519                    0.519