# Ensemble Top-2 Analysis

This notebook summarizes the top-ranked (rank 1) 2-model ensemble recorded in `models/ensemble_results_2model/ensemble_2model_results.json`, displays its metrics and visualizations (ROC curve and confusion matrix), and closes with a short conclusion.

## 1) Environment & Dependencies

# (Optional) Install dependencies; they are typically already available in your environment.
# !pip install -r ../requirements.txt

import sys
import platform
print('Python:', sys.version.split()[0])
print('Platform:', platform.platform())
import tensorflow as tf
print('TensorFlow:', tf.__version__)
import sklearn
print('scikit-learn:', sklearn.__version__)


## 2) Imports & Global Constants

import os
import json
from IPython.display import Image, display
import matplotlib.pyplot as plt

MODELS_DIR = os.path.abspath(os.path.join('..', 'models'))
ENSEMBLE_RESULTS = os.path.join(MODELS_DIR, 'ensemble_results_2model', 'ensemble_2model_results.json')
OUTPUT_DIR = os.path.join(MODELS_DIR, 'ensemble_results_2model')


## 3) Utilities & Helper Functions

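def load_json(path):
    # Read and parse a JSON file; the explicit encoding avoids platform-dependent defaults
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)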


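def find_top_combo(results_json):
    # results_json is a list of dicts, each expected to carry a 'rank' key (1 = best).
    # Select the lowest rank explicitly instead of relying on the list order.
    if not results_json:
        return None
    return min(results_json, key=lambda r: r.get('rank', float('inf')))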


def combo_folder_name(combo_str):
    # combo_str like 'densenet121 + vgg16' -> 'densenet121_vgg16'
    return combo_str.replace(' + ', '_')
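
As a quick illustration of the naming convention, the sketch below shows the folder name this helper yields and how it combines with a rank into the `ensemble_<rank>_<models>` directory name used later in the notebook. The `example` record is made up for illustration, and `combo_folder_name_tolerant` is a hypothetical variant (not part of the original notebook) that also accepts irregular spacing around `+`.

```python
def combo_folder_name(combo_str):
    # Same rule as the notebook helper: 'a + b' -> 'a_b'
    return combo_str.replace(' + ', '_')


def combo_folder_name_tolerant(combo_str):
    # Hypothetical variant: split on '+' and ignore surrounding whitespace
    parts = [p.strip() for p in combo_str.split('+')]
    return '_'.join(p for p in parts if p)


# Illustrative record with the two keys the notebook reads: 'rank' and 'combination'
example = {'rank': 1, 'combination': 'densenet121 + vgg16'}
folder = f"ensemble_{example['rank']}_" + combo_folder_name(example['combination'])
print(folder)  # ensemble_1_densenet121_vgg16
print(combo_folder_name_tolerant('densenet121+vgg16'))  # densenet121_vgg16
```

The tolerant variant only matters if the `combination` strings in the results file are not consistently formatted; with the `' + '` separator shown above, both functions return the same name.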


## 4) Load Ensemble Results and Pick Top Pair

results = load_json(ENSEMBLE_RESULTS)
print(f"Loaded {len(results)} combinations from {ENSEMBLE_RESULTS}")

top = find_top_combo(results)
if top is None:
    print('No results found.')
else:
    print('Top-ranked combination:')
    print(json.dumps(top, indent=2))


## 5) Locate Top Combination Folder and Files

if top:
    combo_dir = os.path.join(OUTPUT_DIR, f"ensemble_{top['rank']}_" + combo_folder_name(top['combination']))
    print('Expecting combo folder at:', combo_dir)
    roc_path = os.path.join(combo_dir, 'roc_curve.png')
    cm_path = os.path.join(combo_dir, 'confusion_matrix.png')
    metrics_path = os.path.join(combo_dir, 'metrics.json')
    
    print('\nFiles present:')
    for p in [metrics_path, roc_path, cm_path]:
        print(p, '->', 'FOUND' if os.path.exists(p) else 'MISSING')


## 6) Display Metrics and Visualizations for Top Pair

if top:
    if os.path.exists(metrics_path):
        combo_metrics = load_json(metrics_path)
        print('Metrics for top combination:')
        print(json.dumps(combo_metrics, indent=2))
    else:
        print('metrics.json missing for top combo')
    
    # Display ROC and confusion matrix images if present
    if os.path.exists(roc_path):
        display(Image(roc_path, width=700))
    else:
        print('ROC image missing')
    
    if os.path.exists(cm_path):
        display(Image(cm_path, width=500))
    else:
        print('Confusion matrix image missing')


## 7) Short Conclusion

Based on the saved evaluation, the top-ranked 2-model ensemble is **densenet121 + vgg16** (rank 1) with AUC = 0.96.
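
For readers who want to see what the AUC figure measures, here is a minimal pure-Python sketch of ROC AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; they are not the notebook's evaluation code or data.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError('AUC needs at least one positive and one negative label')
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy labels and scores, invented for illustration only
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y_true, y_score))  # 0.75
```

The pairwise loop is quadratic in the number of examples, so it suits small sanity checks; for a full test set, a library routine such as scikit-learn's `roc_auc_score` is the usual choice.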

Recommendation: Use the `densenet121 + vgg16` averaging ensemble as the final 2-model result. Before deploying it, review the misclassified examples and check the calibration of its predicted probabilities.
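
The follow-up checks could be sketched roughly as below. This is an illustration under stated assumptions, not the project's pipeline: it assumes binary labels, that each model outputs a positive-class probability, that the ensemble is an unweighted average, and that predictions are thresholded at 0.5. All function names and the toy arrays are hypothetical.

```python
def average_probs(probs_a, probs_b):
    # Unweighted mean of two models' positive-class probabilities
    return [(a + b) / 2.0 for a, b in zip(probs_a, probs_b)]


def misclassified_indices(y_true, probs, threshold=0.5):
    # Indices where the thresholded prediction disagrees with the label
    return [i for i, (y, p) in enumerate(zip(y_true, probs))
            if int(p >= threshold) != y]


def expected_calibration_error(y_true, probs, n_bins=4):
    # Weighted mean gap, per probability bin, between the average predicted
    # probability and the observed positive rate
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((y, p))
    total = len(y_true)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for _, p in members) / len(members)
        pos_rate = sum(y for y, _ in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - pos_rate)
    return ece


# Toy data, invented for illustration only
y_true = [0, 1, 1, 0]
probs_a = [0.25, 1.0, 0.5, 0.75]   # hypothetical outputs of the first model
probs_b = [0.25, 0.5, 0.0, 0.5]    # hypothetical outputs of the second model

ensemble = average_probs(probs_a, probs_b)
wrong = misclassified_indices(y_true, ensemble)
ece = expected_calibration_error(y_true, ensemble)
print(wrong)  # [2, 3]
print(ece)
```

The indices in `wrong` point at the examples worth inspecting by hand, and a large calibration gap would suggest recalibrating the averaged probabilities before relying on them as confidence estimates.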
