# ROC Analysis Center
Welcome! This documentation covers the usage of the ROC curve analysis framework consisting of two main components:
1. `roc_analysis_project.ipynb` - A project-specific notebook for generating ROC curves that you should copy into your project-code folder
2. `wrapper_roc_analysis.py` - A centralized utility script with reusable functions

## Overview

This framework allows you to create, visualize, and statistically compare ROC curves from multiple machine learning models, scenarios, and cohorts. The system is designed with a centralized/distributed architecture:

- The **wrapper script** (`wrapper_roc_analysis.py`) contains all core functionality and is maintained centrally
- The **notebook** (`roc_analysis_project.ipynb`) can be copied to project-specific directories and customized while importing the centralized utilities

## Key Features

- Create ROC curves for multiple modeling scenarios across different cohorts
- Compare multiple estimator types (e.g., RFC vs XGB)
- Statistical comparison of ROC curves using DeLong's test
- Consistent visual styling with customizable color schemes
- Publication-ready figures in vector format (SVG)

## Getting Started

### Prerequisites

The framework requires the following Python packages: pandas, numpy, matplotlib, scikit-learn, scipy, seaborn, and pyyaml.
It also expects a project structure in which model outputs are stored in the locations defined in the notebook.
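
Before running the notebook, it can help to check that the packages listed above are installed. The snippet below is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of the framework. Note that scikit-learn and pyyaml are imported under the names `sklearn` and `yaml`.

```python
from importlib.util import find_spec

# Import names differ from install names for two packages:
# scikit-learn is imported as "sklearn" and pyyaml as "yaml".
REQUIRED_MODULES = ["pandas", "numpy", "matplotlib", "sklearn", "scipy", "seaborn", "yaml"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```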

### Basic Usage

1. Copy the `roc_analysis_project.ipynb` notebook to your project directory
2. Ensure the `wrapper_roc_analysis.py` script is accessible in your Python path
3. Update file paths in the notebook to match your project structure
4. Run the notebook cells to generate ROC curves for your specific models

### Data Structure

The framework expects TPR (True Positive Rate) data in a specific format:
- An Excel/CSV file containing TPR values from cross-validation
- Data should be structured with columns representing different model configurations
- Column names should follow a specific naming convention: `{cohort}_{fold}_{scenario}_model{model_number}`
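
To illustrate the naming convention, the following sketch splits column names into their parts and builds a small lookup table. The function `parse_tpr_column` and the example column names are hypothetical. Unlike a plain `split('_')`, this version splits off the `_model` suffix first, so scenario names that themselves contain underscores would stay intact.

```python
import pandas as pd


def parse_tpr_column(col_name, estimator):
    """Split '{cohort}_{fold}_{scenario}_model{model_number}' into its parts."""
    prefix, model_number = col_name.rsplit("_model", 1)
    cohort, fold, scenario = prefix.split("_", 2)
    return {
        "cohort": cohort,
        "fold": fold,
        "scenario": scenario,
        "model": model_number,
        "estimator": estimator,
    }


# Hypothetical column names that follow the convention
columns = ["all_0_A_model1", "all_1_A_model2", "par_0_TOP15_model1"]
mapper = pd.DataFrame([parse_tpr_column(c, "RFC") for c in columns], index=columns)
print(mapper)
```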

## Key Functions

### From `wrapper_roc_analysis.py`

#### Visualization Functions

- `plot_roc_curve()`: Creates a single ROC curve
- `plot_rocs()`: Plots multiple ROC curves with mean and standard deviation
- `plot_rocs_wrapper()`: Higher-level function to plot ROC curves for multiple scenarios
- `plot_rocs_multi_estimator()`: Compares ROC curves across different estimator types
- `plot_colorbar()`: Creates a color legend for scenario visualization
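
As a rough illustration of what "mean and standard deviation" means for cross-validated ROC curves, the sketch below summarizes a set of fold-wise TPR vectors. It assumes that every fold's TPR values are sampled on the same evenly spaced FPR grid from 0 to 1; this is an assumption about the data layout, and `summarize_fold_tprs` is a hypothetical helper, not the wrapper's own code.

```python
import numpy as np


def trapezoid_auc(fpr, tpr):
    """Area under a curve given as points (fpr, tpr), using the trapezoid rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


def summarize_fold_tprs(fold_tprs):
    """Mean curve, +/- 1 SD band, and AUC statistics for TPRs on a shared FPR grid."""
    fold_tprs = np.asarray(fold_tprs, dtype=float)
    fpr_grid = np.linspace(0.0, 1.0, fold_tprs.shape[1])
    mean_tpr = fold_tprs.mean(axis=0)
    std_tpr = fold_tprs.std(axis=0)
    fold_aucs = np.array([trapezoid_auc(fpr_grid, tpr) for tpr in fold_tprs])
    return {
        "fpr": fpr_grid,
        "mean_tpr": mean_tpr,
        "upper": np.minimum(mean_tpr + std_tpr, 1.0),  # clip the band to [0, 1]
        "lower": np.maximum(mean_tpr - std_tpr, 0.0),
        "mean_auc": float(fold_aucs.mean()),
        "std_auc": float(fold_aucs.std()),
    }


# Toy data: five made-up concave curves standing in for five CV folds
grid = np.linspace(0.0, 1.0, 100)
toy_folds = [grid ** power for power in (0.3, 0.4, 0.5, 0.6, 0.7)]
summary = summarize_fold_tprs(toy_folds)
print(f"AUC = {summary['mean_auc']:.3f} +/- {summary['std_auc']:.3f}")
```

The `upper` and `lower` arrays are what one would typically pass to `ax.fill_between` to draw the shaded band around the mean curve.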

#### Statistical Functions

- `perform_delong_test()`: Executes DeLong's test to compare AUCs
- `delong_roc_test()`: Core implementation of DeLong's test
- `delong_roc_variance()`: Computes variance for DeLong's test

# Libraries and Functions

In [None]:
import json  # used below to read and write the cohort dictionaries
import os  # used below for paths and folder creation
import sys
import yaml
sys.path.append("../../modeling_pipeline")  # The pipeline package lives in a different folder (two levels up), so we add its path to sys.path
sys.path.append("../..")

# Automatically reload the packages we are working on in the background, no "Restart Kernel" needed
%load_ext autoreload
%autoreload 2

from pipeline import *  # Load our package with classes pipeline, models, pp (preprocessing), plot, and more
from wrapper_roc_analysis import *  # Load our wrapper for the ROC analysis



############### CHANGE THIS ############
path= pp.userpath(os.environ.get("USER", os.environ.get("USERNAME")), project="hcc") # Choose your own project here, only works if you added specific project in user_settings.json
############### CHANGE THIS ############


fig_path = f"{path}/visuals"
auroc_path = f"{fig_path}/AUROCs"
os.makedirs(auroc_path, exist_ok=True)  # Create the output folder if it does not exist yet

# Load the default color dictionary
yaml_colors_path = "custom_colors.yaml"
with open(yaml_colors_path, 'r') as file:
    config = yaml.safe_load(file)

scenarios_colors = config.get("scenarios_colors", {}) # Extract the color dictionary
print("Successfully loaded color dictionary with", len(scenarios_colors), "entries")

scenario_lists = config.get("scenario_lists", {}) # Extract the scenario lists
print("Successfully loaded scenario list with", len(scenario_lists), "entries")

### Customized Color Schemes

The framework uses a YAML file (`default_colors.yaml`) to define color schemes for different scenarios:

```yaml
scenarios_colors:
  A: '#8A2BE2'  # BlueViolet
  B: '#FF7F50'  # Coral
  C: '#20B2AA'  # LightSeaGreen
  # More colors...
```

### Scenarios

You can define groups of scenarios (constellations) to plot together in one figure by editing these lists in the same YAML file:

```yaml
scenario_lists:
  incremental: ['A', 'B', 'C', 'D', 'E']
  separate: ['Demographics', 'Diagnosis', 'Blood', 'SNP', 'Metabolomics']
  # More scenario groups...
```

You can also define custom colors directly in the notebook:

```python
my_colors = {
    'A': '#8A2BE2',  # BlueViolet
    'B': '#FF7F50',  # Coral
    # More colors...
}
```
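
If you only want to override a few colors, one option is to merge your custom dictionary over the defaults and validate the hex codes before plotting. This is a minimal sketch; `merge_colors` is a hypothetical helper and not part of `wrapper_roc_analysis.py`.

```python
import re

HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def merge_colors(default_colors, custom_colors):
    """Return a new dict in which custom entries override the defaults."""
    merged = {**default_colors, **custom_colors}
    invalid = {key: value for key, value in merged.items() if not HEX_COLOR.match(str(value))}
    if invalid:
        raise ValueError(f"Invalid hex colors: {invalid}")
    return merged


default_colors = {"A": "#8A2BE2", "B": "#FF7F50", "C": "#20B2AA"}
my_overrides = {"B": "#000000"}  # only scenario B is changed
colors = merge_colors(default_colors, my_overrides)
print(colors)
```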

# Data Import for trained models

### Option A: Load and process manually 

In [None]:
# #Raw load (Optional)
# tprs_joblib_path = os.path.join(path + "/Models/Pipelines/RFC/combined_output/val/TPRS_combined.joblib")

# tprs = joblib.load(tprs_joblib_path)
# tprs

In [None]:
# # import the tprs
# tprs=pd.read_excel(path+'/Models/Pipelines/'+model_type+'/combined_output/val/TPRS_combined.xlsx')



# columns=tprs.columns.tolist()
# mapper=pd.DataFrame({'col_names':columns})
# mapper["estimator"] = model_type
# mapper['cohort']=[i.split('_')[0] for i in mapper.col_names]
# mapper['scenario']=[i.split('_')[2] for i in mapper.col_names]
# mapper['model']=[i.split('_model')[1] for i in mapper.col_names]
# mapper.set_index('col_names',inplace=True)
# tprs.transpose()
# mapped_tprs=pd.concat([mapper,tprs.transpose()],axis=1).set_index(['cohort','scenario','model', 'estimator'])
# mapped_tprs.groupby(level=['cohort','scenario']).agg('mean').transpose()
# mapped_tprs


### Option B (recommended): Automatic load via pre-configured joblib and mapper

In [None]:
#proper, processed load
model_type = "RFC"
mapped_tprs = load_tprs(path, [model_type], drop_special_models="Sensitivity")

mapped_tprs

### Data Import for literature benchmarks
You can pass the benchmarks separately to the AUROC plotting calls, but it is better to add them to the mapped TPRs at some point, especially for the DeLong tests.
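
To sit alongside the model TPRs, a benchmark score has to be expressed in the same form: TPR values on a common FPR grid. The sketch below shows one way this could be done with NumPy. It is an illustration under the assumption of a 100-point grid, and the helpers `roc_points` and `tpr_on_grid` are hypothetical; the wrapper's `add_benchmarks_to_mapped_tprs()` may work differently.

```python
import numpy as np


def roc_points(scores, labels):
    """Empirical ROC points (FPR, TPR); tied scores are treated as one threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="mergesort")  # sort by descending score, stable
    scores, labels = scores[order], labels[order]
    # Last index of every group of equal scores
    last_of_group = np.r_[np.where(np.diff(scores) != 0)[0], len(scores) - 1]
    true_pos = np.cumsum(labels)[last_of_group]
    false_pos = (last_of_group + 1) - true_pos
    tpr = np.r_[0.0, true_pos / true_pos[-1]]
    fpr = np.r_[0.0, false_pos / false_pos[-1]]
    return fpr, tpr


def tpr_on_grid(scores, labels, n_points=100):
    """Interpolate the empirical ROC curve onto an evenly spaced FPR grid."""
    fpr, tpr = roc_points(scores, labels)
    grid = np.linspace(0.0, 1.0, n_points)
    tpr_grid = np.interp(grid, fpr, tpr)
    tpr_grid[0] = 0.0  # force the curve to start at the origin
    return grid, tpr_grid


# Toy benchmark score and outcome (made-up values)
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
grid, tpr_grid = tpr_on_grid(toy_scores, toy_labels)
print(tpr_grid[:3], tpr_grid[-1])
```

The input needs at least one positive and one negative label, otherwise the rates are undefined.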

#### Optional: Create mask dictionaries 
At some point you will need dictionaries that define all eids and the PAR (patients at risk) eids, both for the whole dataset and for validation/testing only. If you have not created them yet, you can load the lists that contain them and store them as JSON for future reference.

In [None]:
# # Create a dictionary with all patients if not done so before
# par_eids = pd.read_csv(path +'/data/09_09_2024/par_eids.csv')["x"]
# par_eids

# all_eids = pd.read_csv(path+ '/data/dataframes/df_covariates.csv')["eid"]
# all_eids


# cohort_eids_dict = {
#     "all": list(all_eids),
#     "par": list(par_eids)
# }

# #Print summary
# print(f"Cohort 'all': {len(cohort_eids_dict['all'])} patients")
# print(f"Cohort 'par': {len(cohort_eids_dict['par'])} patients")
# print(f"Overlap: {len(set(cohort_eids_dict['all']).intersection(set(cohort_eids_dict['par'])))} patients")

# # Step 2: Save the dictionary to a file
# with open(path + "/data/cohort_dict_all.json", 'w') as f:
#     json.dump(cohort_eids_dict, f, indent=2)
# print("Saved cohort_eids_dict.json")

In [None]:
# Import the benchmark data (usually not in TPRS format, but a dataframe with predictions and ground truth)
#benchmark_par=pd.read_csv(path+'/Models/df_amap_par.csv') # Either as ONE Benchmark (change this or the next line, these are the default benchmarks in the plots) #PAR = Patients at risk
#benchmark_all=pd.read_csv(path+'/Models/df_amap.csv') #All = For ALL Patients
benchmarks= pd.read_csv(path+'/data/dataframes/df_benchmark.csv') #or as dataframe with multiple benchmarks

# Load the cohort dictionary for subsetting of benchmarks according to the cohort names (e.g. "All" or "PAR")
# This is needed to create the benchmark_dict, which is a dictionary of dataframes with the benchmark data for each cohort
with open(path + "/data/cohort_dict_test.json", 'r') as f:
    cohort_dict_test = json.load(f)
with open(path + "/data/cohort_dict_all.json", 'r') as f:
    cohort_dict_all = json.load(f)



#For dedicated cohorts (e.g. "PAR" or "All"), we need to create a dictionary with the filtered benchmark data for each cohort
benchmark_dict = create_benchmark_dict_from_master(benchmarks, cohort_dict_all)

benchmark_prot_dict = create_benchmark_dict_from_master(benchmarks, cohort_dict_all, required_non_na_cols='AFP')
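
Conceptually, the two calls above filter one master benchmark table into one dataframe per cohort, optionally keeping only rows with non-missing values in certain columns. The sketch below illustrates that idea on toy data. The function `subset_by_cohort`, the `eid` column name, and all values are assumptions for illustration; the actual `create_benchmark_dict_from_master()` may differ in its details.

```python
import pandas as pd


def subset_by_cohort(master, cohort_eids, id_col="eid", required_non_na_cols=None):
    """Return {cohort: rows of `master` whose id is in that cohort's id list}."""
    if isinstance(required_non_na_cols, str):
        required = [required_non_na_cols]
    else:
        required = list(required_non_na_cols or [])
    # Optionally drop rows with missing values in the required columns first
    base = master.dropna(subset=required) if required else master
    return {
        cohort: base[base[id_col].isin(eids)].copy()
        for cohort, eids in cohort_eids.items()
    }


# Toy master table with made-up values
toy_master = pd.DataFrame({
    "eid": [1, 2, 3, 4],
    "status": [0, 1, 0, 1],
    "aMAP": [50.0, 62.0, 48.0, 70.0],
    "AFP": [3.1, float("nan"), 2.4, 8.9],
})
toy_cohorts = {"all": [1, 2, 3, 4], "par": [2, 4]}

toy_dict = subset_by_cohort(toy_master, toy_cohorts)
toy_prot_dict = subset_by_cohort(toy_master, toy_cohorts, required_non_na_cols="AFP")
for name in toy_cohorts:
    print(name, len(toy_dict[name]), len(toy_prot_dict[name]))
```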

In [None]:
#only select rows where proteomics data is available (= non-NA for AFP)
benchmarks_proteomics = benchmarks[~benchmarks["AFP"].isna()]

benchmarks_proteomics

In [None]:
#Define the scores you want to use (need to be represented as column names in the benchmarks df and, accordingly in the benchmark_dict dataframes)
benchmark_names = ["aMAP", "APRI", "FIB4", "NFS", "LiverRisk", "Liver cirrhosis", "AFP"]

# Add benchmarks to mapped_tprs
mapped_tprs = add_benchmarks_to_mapped_tprs(
    mapped_tprs=mapped_tprs,
    benchmark_dict=benchmark_dict,
    benchmark_names=benchmark_names,  # Optional: specify which benchmarks to include
    n_folds=5
)

# Verify the structure
print("\nUpdated mapped_tprs index levels:")
for i, level in enumerate(mapped_tprs.index.names):
    unique_values = mapped_tprs.index.get_level_values(i).unique()
    print(f"  Level {i} ({level}): {unique_values.tolist()}")

# Check that benchmarks were added correctly
benchmark_indices = mapped_tprs.index[
    mapped_tprs.index.get_level_values('estimator') == 'linear'
]
print(f"\nAdded {len(benchmark_indices)} benchmark entries")


# Statistics

- `perform_delong_test()`: Executes DeLong's test to compare AUCs
- `delong_roc_test()`: Core implementation of DeLong's test
- `delong_roc_variance()`: Computes variance for DeLong's test
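
For readers unfamiliar with DeLong's test, the following is a compact, direct (quadratic-time) sketch of the test for two correlated AUCs measured on the same patients. It works on raw scores and labels and is meant only to illustrate the statistic. It is not the wrapper's implementation, and the function names are hypothetical. Each class needs at least two samples for the covariance estimate.

```python
import math

import numpy as np


def placement_values(pos_scores, neg_scores):
    """Return (V10, V01): per-positive and per-negative averages of the kernel psi."""
    diff = pos_scores[:, None] - neg_scores[None, :]
    psi = (diff > 0).astype(float) + 0.5 * (diff == 0)  # 1 if x > y, 0.5 if tied, else 0
    return psi.mean(axis=1), psi.mean(axis=0)


def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test for the difference between two paired AUCs."""
    labels = np.asarray(labels).astype(bool)
    scores = [np.asarray(s, dtype=float) for s in (scores_a, scores_b)]
    v10, v01 = zip(*(placement_values(s[labels], s[~labels]) for s in scores))
    v10, v01 = np.vstack(v10), np.vstack(v01)  # shapes (2, m) and (2, n)
    aucs = v10.mean(axis=1)
    m, n = v10.shape[1], v01.shape[1]
    cov = np.cov(v10) / m + np.cov(v01) / n  # 2x2 covariance of the two AUCs
    var_diff = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]
    if var_diff <= 0:  # degenerate case: the statistic is undefined
        return float(aucs[0]), float(aucs[1]), float("nan"), float("nan")
    z = (aucs[0] - aucs[1]) / math.sqrt(var_diff)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return float(aucs[0]), float(aucs[1]), float(z), float(p_value)


# Toy data (made-up values): two scores evaluated on the same six patients
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_scores_b = [0.6, 0.7, 0.3, 0.8, 0.4, 0.1]
auc_a, auc_b, z, p_value = delong_test(toy_labels, toy_scores_a, toy_scores_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}, z = {z:.3f}, p = {p_value:.3f}")
```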

In [None]:
# Use for all * all comparisons
delong_results_all = perform_delong_test(
    all_tprs=mapped_tprs,
    cohorts=['all', 'par'],
    scenarios=['C', 'TOP75', 'TOP30', 'TOP15', 'AMAP-RFC'],
    estimators=['RFC'],
    compare_all=True
)

# Usage for comparing against a reference scenario
delong_results_ref = perform_delong_test(
    all_tprs=mapped_tprs,
    cohorts=['all', 'par'],
    scenarios=['A', 'B', 'C', 'D', 'E'],
    estimators=['RFC'],
    compare_all=False,
    reference_scenario='C',
    reference_estimator='RFC'
)

#Compare TOP15-RFC specifically against linear models
linear_scenarios = ['aMAP', 'APRI', 'FIB4', 'NFS', 'Liver cirrhosis']
linear_combos = [(scenario, 'linear') for scenario in linear_scenarios]

delong_results_top15_vs_linear = perform_delong_test_custom(
    all_tprs=mapped_tprs,
    cohorts=['all', 'par'],
    ref_combo=('TOP15', 'RFC'),
    comparison_combos=linear_combos
)


def save_results_to_excel(results, file_name):
    with pd.ExcelWriter(file_name) as writer:
        for cohort, df in results.items():
            df.to_excel(writer, sheet_name=cohort, index=False)

save_results_to_excel(delong_results_all, f"{path}/tables/delong_test_results_all.xlsx")
save_results_to_excel(delong_results_ref, f"{path}/tables/delong_test_results_reference.xlsx")
save_results_to_excel(delong_results_top15_vs_linear, f"{path}/tables/delong_test_results_top15_vs_linear.xlsx")



# Print results
for result_type, delong_results in [("All Comparisons", delong_results_all), ("Reference Comparisons", delong_results_ref), ("Benchmark Comparisons", delong_results_top15_vs_linear)]:
    print(f"\n--- {result_type} ---")
    for cohort, results in delong_results.items():
        print(f"\nDeLong Test Results for {cohort}:")
        print(results)


In [None]:
# Compare AMAP-RFC against the linear benchmark scores
linear_scenarios = ['aMAP', 'APRI', 'FIB4', 'NFS', 'Liver cirrhosis']  # names must match benchmark_names above

delong_results_amap_rfc_vs_linear = perform_delong_test(
    all_tprs=mapped_tprs,
    cohorts=['all', 'par'],
    scenarios=linear_scenarios,
    estimators=['linear'],
    compare_all=False,
    reference_scenario='AMAP-RFC',
    reference_estimator='RFC'
)
save_results_to_excel(delong_results_amap_rfc_vs_linear, f"{path}/tables/delong_test_results_amap_rfc_vs_linear.xlsx")

In [None]:
# Row counts of the benchmark dataframe for each cohort
for key, df in benchmark_dict.items():
    print(f"{key}: {len(df)} rows")

# AUROCs

### Example Workflows in the Notebook

The notebook contains several example workflows, applied to a variety of model constellations: incremental modalities (i.e., from demographics to added diagnosis, added blood, and added omics), iterative feature reduction (with the alpha level indicating the number of features), and a number of different benchmarks.

1. **Basic ROC comparison**: Compare ROC curves from different model scenarios
   ```python
   fig, ax = plt.subplots(figsize=(6, 5))
   plot_roc_curve(test_scores=benchmarks["aMAP"], true_labels=benchmarks.status, ax=ax)
   plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['incremental'], 'all',
                      title='All Patients', fig_type="AUROCS_combined")
   ```

2. **Literature benchmark comparison**: Compare your models against published benchmarks
   ```python
   fig, ax = plt.subplots(figsize=(6, 5))
   plot_roc_curve(test_scores=benchmarks["aMAP"], true_labels=benchmarks.status, ax=ax, label="aMAP")
   plot_roc_curve(test_scores=benchmarks["APRI"], true_labels=benchmarks.status, ax=ax, label="APRI")
   plot_roc_curve(test_scores=benchmarks["FIB4"], true_labels=benchmarks.status, ax=ax, label="FIB4")
   plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['c'], 'all',
                      title='Literature Benchmark', fig_type="AUROCS_combined")
   ```


3. **Multiple estimator comparison**: Compare different model types (RFC vs XGB)
   ```python
   plot_rocs_multi_estimator(
       all_tprs=all_tprs,
       scenario_list=scenario_lists['incremental'],
       cohorts=['all'],
       scenarios_colors=scenarios_colors,
       n_splits=5,
       fig_path=auroc_path,
       title='All Patients',
       fig_type="AUROCS_multi_Estimator"
   )
   ```

In [None]:
# Setting up colors. By default, the colors in default_colors.yaml are used, but you can also define your own colors here. To use my_colors, pass it as the scenarios_colors argument of the AUROC function you want to adapt.

my_colors = {
    'A': '#8A2BE2',  # BlueViolet
    'B': '#FF7F50',  # Coral
    'C': '#20B2AA',  # LightSeaGreen
    'D': '#9932CC',  # DarkOrchid
    'E': '#FF8C00',  # DarkOrange
    'Demographics': '#4B0082',  # Indigo
    'Diagnosis': '#FFA07A',  # LightSalmon
    'Blood': '#00CED1',  # DarkTurquoise
    'SNP': '#BA55D3',  # MediumOrchid
    'Metabolomics': '#F08080',  # LightCoral
}

plot_colorbar(scenario_lists['incremental'])
plot_colorbar(scenario_lists['separate'])

### No Imputation Sensitivity Analysis

In [None]:

fig, ax = plt.subplots(figsize=(6, 5))
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['no_impute'], 'all',
                    title='Literature Benchmark (All UKB)', fig_type="AUROCS_no_impute", fig_path=fig_path)
# Use a fresh figure for the PAR cohort so the two cohorts are not drawn on the same axes
fig, ax = plt.subplots(figsize=(6, 5))
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['no_impute'], 'par',
                    title='Literature Benchmark (PAR UKB)', fig_type="AUROCS_no_impute", fig_path=fig_path)

### AFP Sensitivity Analysis

In [None]:

row_subsets = ["all", "par"]
for row_subset in row_subsets:

    # All AFP Benchmark
    fig, ax = plt.subplots(figsize=(6, 5))
    plot_roc_curve(test_scores=benchmark_prot_dict[row_subset]["aMAP"], true_labels=benchmark_prot_dict[row_subset].status, ax=ax, label="aMAP", linestyle="-", fig_path=fig_path)
    plot_roc_curve(test_scores=benchmark_prot_dict[row_subset]["AFP"], true_labels=benchmark_prot_dict[row_subset].status, ax=ax, label="AFP", linestyle="-", fig_path=fig_path, color='#4895ad')

    plot_roc_curve(test_scores=benchmark_prot_dict[row_subset]["Liver cirrhosis"], true_labels=benchmark_prot_dict[row_subset].status, ax=ax, label="Cirrhosis", linestyle="-", color= "#385579")
    plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['c'], row_subset,
                    title=f'AFP Benchmark (UKB) {row_subset}', fig_type="AUROCS_combined", fig_path=fig_path)
    plt.show()


In [None]:
# All AFP Benchmark
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmarks_proteomics["aMAP"], true_labels=benchmarks_proteomics.status, ax=ax, label="aMAP", linestyle="-", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks_proteomics["AFP"], true_labels=benchmarks_proteomics.status, ax=ax, label="AFP", linestyle="--", fig_path=fig_path)

plot_roc_curve(test_scores=benchmarks_proteomics["Liver cirrhosis"], true_labels=benchmarks_proteomics.status, ax=ax, label="Cirrhosis", linestyle="-", color= "#385579")
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['c'], 'par',
                   title='AFP Benchmark (UKB)', fig_type="AUROCS_combined", fig_path=fig_path)
plt.show()


### Literature Score comparison

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['benchmarks'], 'all',
                    title='Literature Benchmark (All UKB)', fig_type="AUROCS_combined", fig_path=fig_path)

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['benchmarks'], 'par',
                    title='Literature Benchmark (PAR UKB)', fig_type="AUROCS_combined", fig_path=fig_path)

In [None]:

fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmarks["aMAP"], true_labels=benchmarks.status, ax=ax, label="aMAP", linestyle="-", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks["APRI"], true_labels=benchmarks.status, ax=ax, label="APRI", linestyle="--", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks["FIB4"], true_labels=benchmarks.status, ax=ax, label="FIB4", linestyle="-.", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks["NFS"], true_labels=benchmarks.status, ax=ax, label="NFS", linestyle=":", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks["LiverRisk"], true_labels=benchmarks.status, ax=ax, label="LiverRisk", linestyle="dashed", color= "#385579")
plot_roc_curve(test_scores=benchmarks["Liver cirrhosis"], true_labels=benchmarks.status, ax=ax, label="Cirrhosis", linestyle="-", color= "#385579")
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['c'], 'all',
                title='Literature Benchmark (All UKB)', fig_type="AUROCS_combined", fig_path=fig_path)
plt.show()


In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmarks["aMAP"], true_labels=benchmarks.status, ax=ax, label="aMAP", linestyle="-", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks["APRI"], true_labels=benchmarks.status, ax=ax, label="APRI", linestyle="--", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks["FIB4"], true_labels=benchmarks.status, ax=ax, label="FIB4", linestyle="-.", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks["NFS"], true_labels=benchmarks.status, ax=ax, label="NFS", linestyle=":", fig_path=fig_path)
plot_roc_curve(test_scores=benchmarks["Liver cirrhosis"], true_labels=benchmarks.status, ax=ax, label="Cirrhosis", linestyle="-", color= "#385579")
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['c'], 'all',
                   title='Literature Benchmark (All UKB)', fig_type="AUROCS_combined", fig_path=fig_path)
plt.show()


In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmark_dict["par"]["aMAP"], true_labels=benchmark_dict["par"].status, ax=ax, label="aMAP", linestyle="-", fig_path=fig_path)
plot_roc_curve(test_scores=benchmark_dict["par"]["APRI"], true_labels=benchmark_dict["par"].status, ax=ax, label="APRI", linestyle="--", fig_path=fig_path)
plot_roc_curve(test_scores=benchmark_dict["par"]["FIB4"], true_labels=benchmark_dict["par"].status, ax=ax, label="FIB4", linestyle="-.", fig_path=fig_path)
plot_roc_curve(test_scores=benchmark_dict["par"]["NFS"], true_labels=benchmark_dict["par"].status, ax=ax, label="NFS", linestyle=":", fig_path=fig_path)
plot_roc_curve(test_scores=benchmark_dict["par"]["LiverRisk"], true_labels=benchmark_dict["par"].status, ax=ax, label="LiverRisk", linestyle="dashed", color= "#385579")
plot_roc_curve(test_scores=benchmark_dict["par"]["Liver cirrhosis"], true_labels=benchmark_dict["par"].status, ax=ax, label="Cirrhosis", linestyle="-", color= "#385579")
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['c'], 'par',
                   title='Literature Benchmark (PAR UKB)', fig_type="AUROCS_combined", fig_path=fig_path)
plt.show()


### Combined AUROCs (Incremental) for one estimator class

In [None]:

fig, ax = plt.subplots(figsize=(6, 5))
colors = plt.cm.viridis(np.linspace(0, 1, 4))  # four evenly spaced colors from the viridis colormap
plot_roc_curve(test_scores=benchmarks["aMAP"], true_labels=benchmarks.status, ax=ax, label="aMAP", color=colors[0])
plot_roc_curve(test_scores=benchmarks["APRI"], true_labels=benchmarks.status, ax=ax, label="APRI", color=colors[1])
plot_roc_curve(test_scores=benchmarks["FIB4"], true_labels=benchmarks.status, ax=ax, label="FIB4", color=colors[2])
plot_roc_curve(test_scores=benchmarks["NFS"], true_labels=benchmarks.status, ax=ax, label="NFS", color=colors[3])
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['c'], 'all',
                   title='Literature Benchmark', fig_type="AUROCS_combined")
plt.show()

In [None]:
# Greyscale colouring

fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmarks["aMAP"], true_labels=benchmarks.status, ax=ax, label="aMAP")
plot_roc_curve(test_scores=benchmarks["APRI"], true_labels=benchmarks.status, ax=ax, label="APRI")
plot_roc_curve(test_scores=benchmarks["FIB4"], true_labels=benchmarks.status, ax=ax, label="FIB4")
plot_roc_curve(test_scores=benchmarks["NFS"], true_labels=benchmarks.status, ax=ax, label="NFS")
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['incremental'], 'all',
                   title='Literature Benchmark', fig_type="AUROCS_combined", fig_path=fig_path)
plt.show()

##### PAR

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmark_dict["par"]["aMAP"], true_labels=benchmark_dict["par"].status, ax=ax)
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['incremental'], 'par', scenarios_colors=scenarios_colors,
                   title="", fig_type="AUROCS_combined", fig_path=fig_path, linewidth=1.5, font_size=16)
plt.show()

##### All

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmark_dict["all"]["aMAP"], true_labels=benchmark_dict["all"].status, ax=ax)
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['incremental'], 'all', scenarios_colors=scenarios_colors,
                   title="", fig_type="AUROCS_combined", fig_path=fig_path, linewidth=1.5, font_size=16)
plt.show()


## Separately trained Models

##### PAR

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmark_dict["par"]["aMAP"], true_labels=benchmark_dict["par"].status, ax=ax)
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['separate'], 'par',
                   title='Chronic Liver Disease', fig_type="AUROCS_separately", fig_path=fig_path)
plt.show()

##### All

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmark_dict["all"]["aMAP"], true_labels=benchmark_dict["all"].status, ax=ax)
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['separate'], 'all',
                   title='All', fig_type="AUROCS_separately", fig_path=fig_path)
plt.show()


# fig, ax = plt.subplots(figsize=(6, 5))
# plot_roc_curve(test_scores=benchmark_all["aMAP"], true_labels=benchmark_all.status, ax=ax)
# plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['small'], 'all',
#                    title="All Patients - Small Models", fig_type="AUROCS_small_models")
# plt.show()

## Small Models

##### All

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmark_dict["all"]["aMAP"], true_labels=benchmark_dict["all"].status, ax=ax)
plot_rocs_wrapper(mapped_tprs, fig, ax, scenarios=scenario_lists['small'], cohort='all', scenarios_colors=scenarios_colors,
                   title="All - Small Models", fig_type="AUROCS_small_models", fig_path=fig_path, linewidth=1.5, font_size=16)


plt.show()

##### PAR

In [None]:
scenario_lists

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
plot_roc_curve(test_scores=benchmark_dict["par"]["aMAP"], true_labels=benchmark_dict["par"].status, ax=ax)
plot_rocs_wrapper(mapped_tprs, fig, ax, scenario_lists['small_par'], 'par',
                   title="Patients at Risk - Small Models", fig_type="AUROCS_small_models", fig_path=fig_path, linewidth=1.5, font_size=16)


plt.show()

### Single Model AUROCs (5 fold)

##### For all 5 scenarios

In [None]:
n_splits = 5

# Loop through each scenario in scenario_lists["c"] and, for each, plot the five ROC curves obtained from five-fold cross-validation
# PAR
for scenario in scenario_lists["c"]:
    fig, ax = plt.subplots(figsize=(8, 6.5))
    tprs_data = mapped_tprs.loc[('par', scenario)].values
    plot_rocs(tprs=tprs_data, fig=fig, ax=ax,
              scenario=scenario,
              plot_all=True, fill_bet=True,
              title=f'Chronic liver disease - Scenario {scenario}',
              fig_type="AUROC_sep",
              individual_alpha=0.5, #for the five-fold lines
              individual_color="grey",
              individual_lw=1.5, #for the five-fold lines
              mean_lw=2.5,   #for the mean line
              fig_path=fig_path,
              col_line=scenarios_colors[scenario],
              font_size=20)




In [None]:
#ALL
n_splits = 5
for scenario in scenario_lists["incremental"]:
    fig_all, ax_all = plt.subplots()
    tprs_data = mapped_tprs.loc[('all', scenario)].values
    plot_rocs(tprs=tprs_data, fig=fig_all, ax=ax_all,
              scenario=scenario, plot_all=True, fill_bet=True,
              title=f"All - Scenario {scenario}", fig_type="AUROC_sep",
              individual_alpha=0.5, #for the five-fold lines
              individual_color="grey",
              individual_lw=1.5, #for the five-fold lines
              mean_lw=2.5, #for the mean line
              col_line=scenarios_colors[scenario],
              fig_path=fig_path)


### Multiple Estimators Comparison

### Data Import: Multiple TPRS Files (e.g. for comparison RFC vs XGB)

In [None]:
model_types = ["XGB", "RFC", "CatBoost", "neuronMLP"]
line_styles = {
    'XGB': '--',
    'RFC': '-',
    'CatBoost': ':',
    'Log_l1': (0, (5, 5)),  # example dash tuple
    'neuronMLP': (0, (3, 5, 1, 5)),  # dash-dotted; the key must match the entry in model_types
    # ... other estimators if needed
}

base_path = path + '/Models/Pipelines/'
all_tprs = pd.DataFrame()



for model_type in model_types:
    # File paths
    tprs_joblib_path = os.path.join(base_path, model_type, "combined_output/val/TPRS_combined.joblib")
    tprs_excel_path = os.path.join(base_path, model_type, "combined_output/val/TPRS_combined.xlsx")

    # Try joblib first, fallback to Excel
    if os.path.exists(tprs_joblib_path):
        tprs = joblib.load(tprs_joblib_path)
        print(f"Loaded joblib TPRs for {model_type}")
    elif os.path.exists(tprs_excel_path):
        tprs = pd.read_excel(tprs_excel_path)
        print(f"Loaded Excel TPRs for {model_type}")
    else:
        print(f"No TPR file found for {model_type} in either format.")
        continue  # Skip to next model

    # Rename columns to remove '_met' suffix if present
    try:
        columns = [col.replace('_met', '') for col in tprs.columns]
    except Exception as e:
        print(f"Error processing column names for {model_type}: {e}")
        columns = tprs.columns  # fallback

    tprs.columns = columns

    # Build the (cohort, scenario, model, estimator) index from the column names
    mapper = pd.DataFrame({'col_names': columns})
    mapper['estimator'] = model_type
    mapper['cohort'] = [i.split('_')[0] for i in mapper.col_names]
    mapper['scenario'] = [i.split('_')[2] for i in mapper.col_names]
    mapper['model'] = [i.split('_model')[1] for i in mapper.col_names]
    mapper.set_index('col_names', inplace=True)
    # Use a separate name so the benchmark-augmented mapped_tprs from above is not overwritten
    model_tprs = pd.concat([mapper, tprs.transpose()], axis=1).set_index(['cohort', 'scenario', 'model', 'estimator'])

    # Concatenate to the main DataFrame
    all_tprs = pd.concat([all_tprs, model_tprs])

#### PAR Multi-Estimator

In [None]:
plot_rocs_multi_estimator(
    all_tprs=all_tprs,
    scenario_list=scenario_lists['incremental'],
    model_types=model_types,
    cohorts=['par'],
    scenarios_colors=scenarios_colors,
    n_splits=5,
    fig_path=auroc_path,
    title='PAR - AUROC Multi-Estimator',
    line_styles=line_styles,
    font_size=20
)

#### All Multi-Estimator

In [None]:
plot_rocs_multi_estimator(
    all_tprs=all_tprs,
    scenario_list=scenario_lists['incremental'],
    model_types=model_types,
    cohorts=['all'],
    scenarios_colors=scenarios_colors,
    n_splits=5,
    fig_path=auroc_path,
    title='All - AUROC Multi-Estimator',
    line_styles=line_styles,
    font_size=20
)