# Brain Tumor Detection - Exploration and Analysis

This notebook walks through the Brain Tumor Detection system end to end: exploring the dataset, training a transfer-learning classifier, evaluating it, visualizing predictions and Grad-CAM heatmaps, and saving the model.

## Table of Contents
1. [Setup and Imports](#setup)
2. [Data Exploration](#data-exploration)
3. [Model Training](#model-training)
4. [Model Evaluation](#model-evaluation)
5. [Predictions and Visualization](#predictions)
6. [Grad-CAM Analysis](#gradcam)
7. [Save Model](#save-model)

## 1. Setup and Imports {#setup}

In [None]:
# Import necessary libraries
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path

# Add src directory to path
sys.path.append('../src')

# Import custom modules
from models.brain_tumor_model import create_model
from data.data_processor import DatasetManager, prepare_data
from utils.evaluation import ModelEvaluator, GradCAM, plot_sample_predictions

# Set plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

# Configure pandas display
pd.set_option('display.max_columns', None)

print("✅ All imports successful!")

## 2. Data Exploration {#data-exploration}

Let's explore the dataset structure and characteristics.

In [None]:
# Set data directory path
DATA_DIR = "../data"  # Update this path to your dataset

# Check if data directory exists
if not os.path.exists(DATA_DIR):
    print(f"⚠️ Data directory {DATA_DIR} not found.")
    print("Please update the DATA_DIR variable with the correct path to your dataset.")
    print("\nExpected structure:")
    print("data/")
    print("├── tumor/")
    print("│   ├── image1.jpg")
    print("│   ├── image2.jpg")
    print("│   └── ...")
    print("└── no_tumor/")
    print("    ├── image1.jpg")
    print("    ├── image2.jpg")
    print("    └── ...")
else:
    print(f"✅ Data directory found: {DATA_DIR}")
    
    # List subdirectories (classes)
    subdirs = [d for d in os.listdir(DATA_DIR) if os.path.isdir(os.path.join(DATA_DIR, d))]
    print(f"Classes found: {subdirs}")
    
    # Count images in each class
    for subdir in subdirs:
        class_path = os.path.join(DATA_DIR, subdir)
        image_count = len([f for f in os.listdir(class_path) 
                          if f.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp', '.tiff'))])
        print(f"  {subdir}: {image_count} images")
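
The same per-class count can be written with `pathlib`, which the setup cell already imports. The sketch below is a minimal, self-contained variant: `count_images_per_class` is a hypothetical helper name, not part of the project's `data_processor` module, and it builds a throwaway directory so that it runs without the real dataset.

```python
import tempfile
from pathlib import Path

# Same extension list as the cell above
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}

def count_images_per_class(data_dir):
    """Return {class_name: image_count} for each subdirectory of data_dir."""
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts

# Demo on a temporary directory that mimics the expected layout
with tempfile.TemporaryDirectory() as tmp:
    for class_name, n_images in [('tumor', 3), ('no_tumor', 2)]:
        class_path = Path(tmp) / class_name
        class_path.mkdir()
        for i in range(n_images):
            (class_path / f'image{i}.jpg').touch()
    (Path(tmp) / 'tumor' / 'notes.txt').touch()  # Non-image file, ignored
    demo_counts = count_images_per_class(tmp)

print(demo_counts)
```

Returning a dictionary rather than printing makes it easy to check class balance programmatically before training.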

In [None]:
# Create dataset manager and load data
if os.path.exists(DATA_DIR):
    dataset_manager = DatasetManager(DATA_DIR, image_size=(224, 224), batch_size=32)
    
    # Load dataset
    images, labels, class_names = dataset_manager.load_dataset_from_folder()
    
    print("Dataset loaded successfully!")
    print(f"Total images: {len(images)}")
    print(f"Image shape: {images[0].shape}")
    print(f"Classes: {class_names}")
    
    # Analyze dataset
    dataset_manager.analyze_dataset(images, labels, class_names)
else:
    print("Skipping data exploration - please provide dataset path")

## 3. Model Training {#model-training}

Create and train a brain tumor detection model.

In [None]:
# Model configuration
MODEL_CONFIG = {
    'base_model': 'resnet50',  # Options: 'resnet50', 'vgg16', 'efficientnetb0'
    'input_shape': (224, 224, 3),
    'num_classes': 2,  # Binary classification: tumor/no tumor
    'learning_rate': 0.001,
    'epochs': 10,  # Reduced for demo purposes
    'batch_size': 32
}

print("Model Configuration:")
for key, value in MODEL_CONFIG.items():
    print(f"  {key}: {value}")

In [None]:
# Create model
model = create_model(
    base_model=MODEL_CONFIG['base_model'],
    input_shape=MODEL_CONFIG['input_shape'],
    num_classes=MODEL_CONFIG['num_classes']
)

# Compile with specified learning rate
model.compile_model(learning_rate=MODEL_CONFIG['learning_rate'])

print("✅ Model created and compiled successfully!")
print("\nModel Summary:")
model.get_model_summary()

In [None]:
# Prepare data generators (if dataset is available)
if os.path.exists(DATA_DIR):
    train_gen, val_gen, class_indices = prepare_data(
        data_dir=DATA_DIR,
        image_size=MODEL_CONFIG['input_shape'][:2],  # Keep in sync with the model input
        batch_size=MODEL_CONFIG['batch_size'],
        validation_split=0.2
    )
    
    print(f"Training samples: {train_gen.samples}")
    print(f"Validation samples: {val_gen.samples}")
    print(f"Class indices: {class_indices}")
    
    # Train model
    print("\nStarting training...")
    history = model.train(
        train_data=train_gen,
        validation_data=val_gen,
        epochs=MODEL_CONFIG['epochs']
    )
    
    print("✅ Training completed!")
else:
    print("Skipping training - please provide dataset")
    history = None

## 4. Model Evaluation {#model-evaluation}

Evaluate the trained model performance.

In [None]:
# Plot training history
if history is not None:
    # Create evaluator
    evaluator = ModelEvaluator(model.model, list(class_indices.keys()))
    
    # Plot training history
    evaluator.plot_training_history(history)
else:
    print("No training history available")

In [None]:
# Evaluate model on validation data
if 'val_gen' in locals() and 'evaluator' in locals():
    # Reset validation generator
    val_gen.reset()
    
    # Evaluate model
    results = evaluator.evaluate_model(val_gen)
    
    # Plot confusion matrix
    evaluator.plot_confusion_matrix(results['y_true'], results['y_pred_classes'])
    
    # Plot ROC curve
    evaluator.plot_roc_curve(results['y_true'], results['y_pred'])
else:
    print("No validation data available for evaluation")
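
The internals of `ModelEvaluator` are not shown in this notebook. As a rough illustration of what a confusion matrix and the metrics derived from it measure, the sketch below computes them by hand for a binary problem. The label lists are made-up example values, and `binary_confusion_counts` is a hypothetical helper, not the project's API.

```python
def binary_confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

# Made-up labels for illustration: 1 = tumor, 0 = no tumor
y_true_demo = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred_demo = [1, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = binary_confusion_counts(y_true_demo, y_pred_demo)

# Guard against division by zero when a class is never predicted or absent
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
accuracy = (tp + tn) / len(y_true_demo)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.2f} recall={recall:.2f} accuracy={accuracy:.2f}")
```

For a tumor classifier, recall on the tumor class is often the number to watch, since each false negative corresponds to a missed tumor.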

## 5. Predictions and Visualization {#predictions}

Make predictions on sample images and visualize results.

In [None]:
# Show sample predictions
if 'val_gen' in locals():
    val_gen.reset()
    plot_sample_predictions(model.model, val_gen, list(class_indices.keys()), num_samples=9)
else:
    print("No validation data available for sample predictions")

In [None]:
# Make prediction on a single image (example)
# Note: set image_path in the example usage below to an actual image path

def predict_single_image(model, image_path, class_names):
    """
    Make prediction on a single image
    """
    from data.data_processor import ImagePreprocessor
    
    preprocessor = ImagePreprocessor(target_size=(224, 224))
    
    # Load and preprocess image
    image = preprocessor.load_image(image_path)
    
    if image is not None:
        # Make prediction
        predicted_class, confidence = model.predict(image)
        
        print(f"Image: {image_path}")
        print(f"Predicted class: {class_names[predicted_class]}")
        print(f"Confidence: {confidence:.3f}")
        
        # Display image
        plt.figure(figsize=(8, 6))
        plt.imshow(image)
        plt.title(f'Prediction: {class_names[predicted_class]} (Confidence: {confidence:.3f})')
        plt.axis('off')
        plt.show()
        
        return predicted_class, confidence
    else:
        print(f"Failed to load image: {image_path}")
        return None, None

# Example usage (uncomment and provide actual image path)
# image_path = "path/to/your/image.jpg"
# if os.path.exists(image_path):
#     pred_class, confidence = predict_single_image(model, image_path, list(class_indices.keys()))
# else:
#     print("Please provide a valid image path to test single image prediction")

print("Single image prediction function defined. Uncomment the example usage to test.")
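
The `predict` call above returns a class index and a confidence. How the project's model wrapper computes them is not shown here; a common convention, assumed in this sketch, is to take the arg-max of the softmax probabilities as the class and the corresponding probability as the confidence. `class_and_confidence` is a hypothetical helper and the logits are made-up values.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # Subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_and_confidence(probabilities):
    """Return (index of the most probable class, its probability)."""
    predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
    return predicted_class, probabilities[predicted_class]

# Made-up logits for a two-class model; class order is an assumption
demo_class_names = ['no_tumor', 'tumor']
demo_probs = softmax([0.3, 2.1])
demo_class, demo_conf = class_and_confidence(demo_probs)

print(f"Predicted class: {demo_class_names[demo_class]}")
print(f"Confidence: {demo_conf:.3f}")
```

A confidence computed this way is the model's own probability estimate, not a calibrated measure of how often the prediction is correct.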

## 6. Grad-CAM Analysis {#gradcam}

Generate Grad-CAM visualizations for model interpretability.

In [None]:
# Initialize Grad-CAM
gradcam = GradCAM(model.model)

print("✅ Grad-CAM initialized")
print(f"Target layer: {gradcam.layer_name}")

In [None]:
# Generate Grad-CAM for sample images
if 'val_gen' in locals():
    val_gen.reset()
    
    # Get a batch of validation images
    batch_images, batch_labels = next(val_gen)
    
    # Generate Grad-CAM for first few images
    num_samples = min(3, len(batch_images))
    
    for i in range(num_samples):
        image = batch_images[i]
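        # Labels may be one-hot vectors or scalar class indices
        label = batch_labels[i]
        true_label = int(np.argmax(label)) if np.ndim(label) > 0 else int(label)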
        
        print(f"\nSample {i+1} - True label: {list(class_indices.keys())[true_label]}")
        
        # Generate Grad-CAM visualization
        gradcam.visualize_activation(image, figsize=(15, 5))
else:
    print("No validation data available for Grad-CAM analysis")
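
How the project's `GradCAM` class is implemented is not shown in this notebook. The core Grad-CAM computation is compact, though: average the gradients of the class score over each feature map's spatial positions to get per-channel weights, take the weighted sum of the feature maps, and apply a ReLU. The NumPy sketch below applies those steps to random arrays standing in for a convolutional layer's activations and gradients; `gradcam_heatmap` and the array shapes are illustrative assumptions.

```python
import numpy as np

def gradcam_heatmap(activations, gradients):
    """Compute a Grad-CAM heatmap from (H, W, K) activations and gradients."""
    # Channel weights: global-average-pool the gradients over H and W
    weights = gradients.mean(axis=(0, 1))  # shape (K,)
    # Weighted sum of the K feature maps, giving shape (H, W)
    cam = np.tensordot(activations, weights, axes=([2], [0]))
    # ReLU keeps only regions that push the class score up
    cam = np.maximum(cam, 0.0)
    # Normalize to [0, 1] for display; leave an all-zero map unchanged
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Random stand-ins for a conv layer's output and its gradients
rng = np.random.default_rng(0)
demo_activations = rng.random((7, 7, 16))
demo_gradients = rng.standard_normal((7, 7, 16))

heatmap = gradcam_heatmap(demo_activations, demo_gradients)
print(heatmap.shape)
```

In practice the low-resolution heatmap is resized to the input image size and overlaid on the image as a color map.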

## 7. Save Model {#save-model}

Save the trained model for later use.

In [None]:
# Save trained model
MODEL_DIR = "../models"
os.makedirs(MODEL_DIR, exist_ok=True)

model_name = f"{MODEL_CONFIG['base_model']}_brain_tumor_model"
model_path = os.path.join(MODEL_DIR, f"{model_name}.h5")

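# Save model (if training was skipped, the weights are untrained)
if 'history' not in locals() or history is None:
    print("⚠️ Training was skipped; the saved model has not been trained on your data.")
model.save_model(model_path)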
print(f"✅ Model saved to: {model_path}")

# Save configuration
config = {
    'model_name': MODEL_CONFIG['base_model'],
    'input_shape': MODEL_CONFIG['input_shape'],
    'num_classes': MODEL_CONFIG['num_classes'],
    'image_size': MODEL_CONFIG['input_shape'][0],  # Derived from input_shape to stay in sync
    'class_names': list(class_indices.keys()) if 'class_indices' in locals() else ['no_tumor', 'tumor']
}

import json
config_path = os.path.join(MODEL_DIR, f"{model_name}.json")
with open(config_path, 'w') as f:
    json.dump(config, f, indent=2)

print(f"✅ Configuration saved to: {config_path}")
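
When the configuration is read back, note that JSON has no tuple type, so `input_shape` comes back as a list. The sketch below shows the round trip on a temporary file with example values; `load_config` is a hypothetical helper, not part of the project's code.

```python
import json
import os
import tempfile

def load_config(config_path):
    """Load a saved model configuration and restore input_shape as a tuple."""
    with open(config_path) as f:
        config = json.load(f)
    config['input_shape'] = tuple(config['input_shape'])
    return config

# Example values mirroring the configuration saved above
demo_config = {
    'model_name': 'resnet50',
    'input_shape': (224, 224, 3),
    'num_classes': 2,
    'image_size': 224,
    'class_names': ['no_tumor', 'tumor'],
}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'resnet50_brain_tumor_model.json')
    with open(demo_path, 'w') as f:
        json.dump(demo_config, f, indent=2)

    # Raw load: the tuple was serialized as a JSON array
    with open(demo_path) as f:
        raw = json.load(f)

    restored = load_config(demo_path)

print(type(raw['input_shape']).__name__, '->', type(restored['input_shape']).__name__)
```

Converting back to a tuple matters mainly when the value is compared against or passed to code that expects the original type.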

## Summary

This notebook demonstrated:

1. **Data Exploration**: Loading the brain tumor dataset and inspecting its class distribution
2. **Model Training**: Creating and training a CNN model with transfer learning
3. **Model Evaluation**: Assessing model performance with training curves, a confusion matrix, and an ROC curve
4. **Predictions**: Visualizing predictions on validation samples and defining a helper for single-image prediction
5. **Grad-CAM Analysis**: Generating heatmaps that show which image regions drive the prediction
6. **Model Saving**: Saving the trained model and its configuration for deployment

### Next Steps:

- **Web Application**: Use the trained model with the Streamlit web app (`streamlit run ../src/web_app.py`)
- **API Server**: Deploy the model as a REST API (`python ../src/api_server.py`)
- **Batch Processing**: Process multiple images (`python ../src/predict.py`)
- **Fine-tuning**: Improve model performance with more training data or longer training (this demo runs only 10 epochs)

### Important Notes:

⚠️ **Medical Disclaimer**: This model is for research and educational purposes only. It should not be used for actual medical diagnosis. Always consult qualified healthcare professionals for medical decisions.