# Pediatric Pneumonia Detection - Model Evaluation

This notebook evaluates the trained ResNet-50 model on the test split. It reports performance metrics (Accuracy, Sensitivity, Specificity), plots the results (Confusion Matrix, ROC Curve, Precision-Recall Curve), and inspects model behavior with Grad-CAM.

---

## 1. Environment Setup
Import necessary libraries and project modules.

In [None]:
import sys
import os
import matplotlib.pyplot as plt
from pathlib import Path
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Add project root to path
project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
if project_root not in sys.path:
    sys.path.append(project_root)

from model_core.evaluator import ModelEvaluator
from model_core.gradcam import GradCAMVisualizer

## 2. Configuration
Set the paths for the dataset and the trained model file.

In [None]:
# === UPDATE THESE PATHS ===
DATASET_PATH = "/path/to/chest_xray" 
MODEL_PATH = "../outputs/run_TIMESTAMP/checkpoints/stage2_best.h5"
# ===========================

IMG_SIZE = (224, 224)
BATCH_SIZE = 32

## 3. Load Resources
Load the saved model and prepare the test data generator.
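
Keras' `flow_from_directory` expects one subfolder per class and assigns class indices in alphanumeric order of the folder names. As a quick sanity check before building the generator, the sketch below counts image files per class folder. The helper name `count_images_per_class` and the suffix list are hypothetical, and the check only looks at files directly inside each class folder.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

def count_images_per_class(split_dir):
    """Map each class subfolder name to the number of image files directly inside it."""
    split_dir = Path(split_dir)
    counts = {}
    # Sorted order mirrors the alphanumeric class-index assignment.
    for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts

# Example call (requires DATASET_PATH to point at a real dataset):
# print(count_images_per_class(Path(DATASET_PATH) / "test"))
```

An empty dictionary or a class with zero images usually means `DATASET_PATH` points one level too high or too low.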

In [None]:
print(f"LOADING MODEL: {MODEL_PATH}...")
model = load_model(MODEL_PATH)
print("✅ Model loaded successfully.")

In [None]:
print("PREPARING DATA GENERATOR...")
test_datagen = ImageDataGenerator(rescale=1./255)

test_gen = test_datagen.flow_from_directory(
    str(Path(DATASET_PATH) / 'test'),
    target_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary',
    shuffle=False
)

## 4. Quantitative Evaluation
Calculate and display key performance metrics.
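
For reference, accuracy, sensitivity, and specificity are all ratios of confusion-matrix counts. The sketch below shows the definitions; it is not the implementation inside `ModelEvaluator`, the function name `summarize_counts` is hypothetical, and the counts are made up for illustration.

```python
def summarize_counts(tp, fp, tn, fn):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        # Accuracy: share of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity (recall): share of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }

# Made-up counts for illustration only, not results from this model.
example = summarize_counts(tp=90, fp=20, tn=80, fn=10)
print(example)
```

In a screening setting, sensitivity is usually the metric to watch first, since a false negative means a missed pneumonia case.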

In [None]:
# Initialize Evaluator
evaluator = ModelEvaluator(model, test_gen)

# Calculate Metrics
metrics = evaluator.calculate_metrics()

### Classification Report
Detailed breakdown of precision, recall, and F1-score for each class.

In [None]:
evaluator.generate_classification_report();

## 5. Visualizations
Plot the model's performance to see where it succeeds and where it errs.

### Confusion Matrix
Shows the number of True Positives, False Positives, True Negatives, and False Negatives.
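
The four counts come from comparing thresholded predictions with the true labels. A minimal sketch, assuming label 1 is the positive (pneumonia) class and a 0.5 decision threshold; the function name `confusion_counts` and the toy data are hypothetical.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN for binary labels (1 = positive class)."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1  # positive case correctly flagged
        elif predicted == 1 and label == 0:
            fp += 1  # negative case wrongly flagged
        elif predicted == 0 and label == 0:
            tn += 1  # negative case correctly cleared
        else:
            fn += 1  # positive case missed
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}

# Toy labels and scores, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1]
counts = confusion_counts(y_true, y_prob)
print(counts)
```

Lowering the threshold trades false negatives for false positives, which is why the matrix should always be read together with the threshold that produced it.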

In [None]:
evaluator.plot_confusion_matrix()

### ROC Curve
The Receiver Operating Characteristic curve plots the true positive rate against the false positive rate across all decision thresholds, illustrating the diagnostic ability of the classifier.
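
To make the construction concrete, the sketch below builds ROC points by sweeping the threshold from the highest score downward and integrates them with the trapezoid rule. It is an illustration only, not the code behind `plot_roc_curve`; the function names and toy data are hypothetical, and both classes must be present in `y_true`.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr) lists obtained by sweeping the threshold over all scores."""
    positives = sum(y_true)
    negatives = len(y_true) - positives  # both counts must be non-zero
    # Highest score first: lowering the threshold admits one sample at a time.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a tied score group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / negatives)
            tpr.append(tp / positives)
    return fpr, tpr

def auc_trapezoid(fpr, tpr):
    """Area under the piecewise-linear ROC curve."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
        for i in range(1, len(fpr))
    )

# Toy labels and scores, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1]
fpr, tpr = roc_points(y_true, y_score)
auc = auc_trapezoid(fpr, tpr)
print(round(auc, 4))
```

The resulting AUC equals the fraction of positive-negative pairs in which the positive sample receives the higher score.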

In [None]:
evaluator.plot_roc_curve()

### Precision-Recall Curve
Plots precision against recall across decision thresholds. It is more informative than the ROC curve when the classes are imbalanced.
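
A minimal sketch of how the curve's points arise, assuming label 1 is the positive class. It is not the code behind `plot_precision_recall_curve`; the function name and toy data are hypothetical, and `y_true` must contain at least one positive.

```python
def precision_recall_points(y_true, y_score):
    """Return (precision, recall) lists, one point per distinct score threshold."""
    positives = sum(y_true)  # must be non-zero
    # Highest score first: each step lowers the threshold by one sample.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: -pair[0])
    precision, recall = [], []
    tp = 0
    for rank, (score, label) in enumerate(ranked, start=1):
        tp += label
        # Emit a point only after the last sample of a tied score group.
        if rank == len(ranked) or ranked[rank][0] != score:
            precision.append(tp / rank)       # correct among those flagged
            recall.append(tp / positives)     # flagged among actual positives
    return precision, recall

# Toy labels and scores, invented for illustration.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1]
precision, recall = precision_recall_points(y_true, y_score)
for p, r in zip(precision, recall):
    print(f"precision={p:.2f} recall={r:.2f}")
```

At the lowest threshold every sample is flagged, so precision falls to the share of positives in the data. That baseline moves with class balance, whereas the ROC diagonal does not.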

In [None]:
evaluator.plot_precision_recall_curve()

## 6. Explainability (Grad-CAM)
Visualize which regions of each X-ray drive the model's decision. The heatmaps highlight the regions that contribute most to the predicted class.
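
The core Grad-CAM computation can be sketched in NumPy: average the gradients of the class score over the spatial positions of a convolutional layer to get one weight per channel, take the weighted sum of the feature maps, and apply a ReLU. This is a sketch of the general technique, not the internals of `GradCAMVisualizer`; the function name `gradcam_heatmap` and the tiny hand-made arrays are hypothetical. In practice the activations and gradients would come from the model's last convolutional layer.

```python
import numpy as np

def gradcam_heatmap(activations, gradients):
    """Combine feature maps (H, W, K) and class-score gradients (H, W, K) into an (H, W) map."""
    # Channel weights: gradients averaged over all spatial positions.
    weights = gradients.mean(axis=(0, 1))  # shape (K,)
    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    heatmap = np.maximum(np.tensordot(activations, weights, axes=([2], [0])), 0.0)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap

# Hand-made 2x2 example with two channels, for illustration only.
acts = np.zeros((2, 2, 2))
acts[0, 0, 0] = 2.0   # channel 0 fires in the top-left corner
acts[1, 1, 1] = 3.0   # channel 1 fires in the bottom-right corner
grads = np.zeros((2, 2, 2))
grads[..., 0] = 1.0   # channel 0 supports the class
grads[..., 1] = -1.0  # channel 1 argues against it
heatmap = gradcam_heatmap(acts, grads)
print(heatmap)
```

Only the top-left cell lights up, because the ReLU discards the region whose channel lowers the class score. For display, the low-resolution map is typically resized to the input size and overlaid on the X-ray.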

In [None]:
gradcam = GradCAMVisualizer(model)

# Visualize a batch of random samples from the test set
gradcam.visualize_batch(test_gen, num_samples=8)