## 0. Install Dependencies

First, let's make sure all required packages are installed. Run the cell below to install the necessary dependencies.

**Note**: The last lines of this cell kill the kernel process so that the newly installed packages are picked up. Once the kernel has restarted, continue with the next cell; do not rerun this one.

In [None]:
# Install required dependencies
# Run this cell first to ensure all necessary packages are installed

# Essential packages
!pip install ruptures==1.1.9  # Change point detection
!pip install git+https://github.com/Project-MONAI/MONAI.git@7c26e5af385eb5f7a813fa405c6f3fc87b7511fa  # Medical image processing
!pip install torch==2.7.0.dev20250221 torchvision==0.22.0.dev20250221  # Deep learning
!pip install numpy==1.26.4 pandas==2.2.3 matplotlib==3.10.0  # Data analysis and visualization
!pip install scikit-learn scikit-image==0.25.2 scipy==1.13.1  # Scientific computing
!pip install tqdm tensorboard==2.19.0  # Progress and visualization
!pip install nibabel==5.3.2  # Medical image I/O
!pip install statsmodels==0.14.4  # Time series analysis and statistical models
!pip install torchmetrics==1.2.1  # Additional PyTorch metrics

# Optional packages
!pip install h5py==3.13.0 SimpleITK==2.4.1 opencv-python networkx pillow

# Restart the kernel to ensure changes take effect
import os
os.kill(os.getpid(), 9)
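
After the kernel restarts, it can be useful to confirm that the pinned versions were actually installed. The following is a minimal sketch using only the standard library; the helper name `find_version_mismatches` and the `PINNED` dictionary are illustrative and not part of HYPERA, and the dictionary lists only a few of the pins from the cell above.

```python
from importlib import metadata

# A few pins copied from the install cell; extend the dictionary as needed.
PINNED = {
    "ruptures": "1.1.9",
    "numpy": "1.26.4",
    "pandas": "2.2.3",
    "nibabel": "5.3.2",
}

def find_version_mismatches(pins):
    """Return {package: (expected, installed_or_None)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches

# Report anything that differs from the pins.
for name, (expected, installed) in find_version_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {installed}")
```

An empty report means every listed pin matches the installed version.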


# HYPERA Training on RunPod

This notebook runs the HYPERA training with agent-based hyperparameter optimization on RunPod.

## 1. Setup Environment

First, let's check that we have GPU access and set up our environment.

In [None]:
import os
import sys
import torch
import matplotlib.pyplot as plt

# Check GPU availability
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")
else:
    print("WARNING: No GPU detected. Training will be very slow.")

## 2. Install Required Packages

Let's make sure we have all the required packages installed.

In [None]:
# Install required packages if not already installed
!pip install monai matplotlib scikit-learn scikit-image tensorboard stable-baselines3

## 3. Add HYPERA to Python Path

Make sure Python can find the HYPERA modules.

In [None]:
# Add HYPERA to path
hypera_path = os.path.abspath("HYPERA1")
if hypera_path not in sys.path:
    sys.path.append(hypera_path)
    print(f"Added {hypera_path} to Python path")

# List the contents of the HYPERA1 directory to confirm it's there
!ls -la HYPERA1
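
Listing the directory shows that the files exist, but not that Python can resolve the module. As a rough check, the sketch below asks the import system whether `legacy.train_bbbc039_with_agents` can be located. The helper `module_available` is a hypothetical name introduced here for illustration. Note that `find_spec` imports the parent package (`legacy`) while searching.

```python
import importlib.util

def module_available(dotted_name):
    """Return True if the import system can locate `dotted_name`."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (here: `legacy`) is missing or fails to import.
        return False

print(module_available("legacy.train_bbbc039_with_agents"))
```

If this prints `False`, check that `HYPERA1` sits in the current working directory and that the path cell above has been run.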

## 4. Prepare Dataset

Make sure the BBBC039 dataset is properly set up.

In [None]:
# Check if dataset exists
if not os.path.exists("BBBC039"):
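    print("Dataset not found. Downloading the BBBC039 dataset...")
    # Download and unpack the dataset (adjust URLs as needed).
    # BBBC039_metadata is deliberately not created here, so that the metadata
    # check below still generates the training and validation split files.
    !mkdir -p BBBC039/images BBBC039/masks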
    !wget -P BBBC039/images https://data.broadinstitute.org/bbbc/BBBC039/images.zip
    !wget -P BBBC039/masks https://data.broadinstitute.org/bbbc/BBBC039/masks.zip
    !unzip BBBC039/images/images.zip -d BBBC039/images/
    !unzip BBBC039/masks/masks.zip -d BBBC039/masks/
else:
    print("Dataset found. Checking contents...")
    !ls -la BBBC039
    
# Check if metadata exists
if not os.path.exists("BBBC039_metadata"):
    print("Metadata not found. Creating metadata directory...")
    !mkdir -p BBBC039_metadata
    
    # Create training and validation split files
    print("Creating training and validation splits...")
    !ls BBBC039/masks/*.png > BBBC039_metadata/all_masks.txt
    !head -n $(( $(wc -l < BBBC039_metadata/all_masks.txt) * 80 / 100 )) BBBC039_metadata/all_masks.txt > BBBC039_metadata/training.txt
    !tail -n $(( $(wc -l < BBBC039_metadata/all_masks.txt) * 20 / 100 )) BBBC039_metadata/all_masks.txt > BBBC039_metadata/validation.txt
    
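    # Count lines in Python; a `!` shell command cannot appear inside an f-string
    with open("BBBC039_metadata/training.txt") as f:
        print(f"Training samples: {len(f.readlines())}")
    with open("BBBC039_metadata/validation.txt") as f:
        print(f"Validation samples: {len(f.readlines())}")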
else:
    print("Metadata found. Checking contents...")
    !ls -la BBBC039_metadata
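    # Count lines in Python; a `!` shell command cannot appear inside an f-string
    with open("BBBC039_metadata/training.txt") as f:
        print(f"Training samples: {len(f.readlines())}")
    with open("BBBC039_metadata/validation.txt") as f:
        print(f"Validation samples: {len(f.readlines())}")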
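
The `head`/`tail` split above rounds both parts down, so a file can fall into neither list when the total is not a multiple of 5. A small Python sketch of an alternative is shown here: it slices the sorted list at a single cut index, so every file lands in exactly one part. The function name `split_train_val` and the file names are made up for illustration.

```python
def split_train_val(items, train_percent=80):
    """Split a list into (train, val) so that every item lands in exactly one part."""
    items = sorted(items)  # deterministic order, similar to an `ls` listing
    cut = len(items) * train_percent // 100  # integer arithmetic, no float rounding
    return items[:cut], items[cut:]

# Made-up file names, only to show the behaviour on a small list.
masks = [f"BBBC039/masks/img_{i:03d}.png" for i in range(7)]
train, val = split_train_val(masks)
print(len(train), len(val))
```

With 7 files, the shell arithmetic `7 * 80 / 100` and `7 * 20 / 100` gives 5 and 1, which leaves one file out; the slice-based split gives 5 and 2.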

## 5. Fix MONAI Error

Apply the fix for the MONAI one_hot error we encountered earlier.

In [None]:
# Check if we need to apply the MONAI fix
import re

train_file_path = os.path.join(hypera_path, "legacy", "train_bbbc039_with_agents.py")

with open(train_file_path, 'r') as file:
    content = file.read()

# Check if the fix is already applied
if "SelectItemsd(keys=[\"label\"], dim=0, index=0)" not in content:
    print("Applying MONAI fix to train_bbbc039_with_agents.py...")
    
    # Replace the AsDiscreted transform. re.sub rewrites every match, so a single
    # call covers both train_transforms and val_transforms.
    # NOTE: confirm that SelectItemsd in your MONAI version accepts `dim` and `index`.
    content = re.sub(
        r'(\s+)AsDiscreted\(keys="label", argmax=True, to_onehot=\d+\)',
        r'\1# Make sure labels have only one channel before one-hot encoding\n\1SelectItemsd(keys=["label"], dim=0, index=0),  # Select only the first channel\n\1AsDiscreted(keys="label", to_onehot=2, argmax=False)',
        content
    )
    
    with open(train_file_path, 'w') as file:
        file.write(content)
    
    print("Fix applied successfully!")
else:
    print("MONAI fix already applied.")
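
To see what the substitution does without touching the training script, the pattern can be tried on a small string first. The sketch below uses the same search pattern as the cell above with a shortened replacement (the comment line is omitted); the `sample` text is invented and only mimics the shape of the transform lists. `re.subn` also returns the number of replacements, which makes it easy to confirm that one pass handles every occurrence.

```python
import re

PATTERN = r'(\s+)AsDiscreted\(keys="label", argmax=True, to_onehot=\d+\)'
# Shortened version of the replacement used above (comment line omitted).
REPLACEMENT = (
    r'\1SelectItemsd(keys=["label"], dim=0, index=0),'
    r'\n\1AsDiscreted(keys="label", to_onehot=2, argmax=False)'
)

# Invented snippet that only mimics the structure of the training script.
sample = (
    'train_transforms = Compose([\n'
    '    AsDiscreted(keys="label", argmax=True, to_onehot=2),\n'
    '])\n'
    'val_transforms = Compose([\n'
    '    AsDiscreted(keys="label", argmax=True, to_onehot=2),\n'
    '])\n'
)

patched, count = re.subn(PATTERN, REPLACEMENT, sample)
print(f"Replacements in first pass: {count}")

# A second pass finds nothing left to replace.
_, second_count = re.subn(PATTERN, REPLACEMENT, patched)
print(f"Replacements in second pass: {second_count}")
```

Because group 1 captures the newline together with the indentation, the inserted lines are separated by blank lines. This is harmless inside a Python list literal.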

## 5b. Fix Loss Function Configuration

Apply the fix for the loss function configuration to avoid double one-hot encoding.

In [None]:
# Fix the loss function configuration in train_bbbc039_with_agents.py
import re

train_file_path = os.path.join(hypera_path, "legacy", "train_bbbc039_with_agents.py")

with open(train_file_path, 'r') as file:
    content = file.read()

# Check if the fix is already applied
if "to_onehot_y=True" in content:
    print("Applying loss function fix to train_bbbc039_with_agents.py...")
    
    # Replace to_onehot_y=True with to_onehot_y=False in the loss functions
    content = re.sub(
        r'to_onehot_y=True',
        r'to_onehot_y=False',  # Labels are already one-hot encoded
        content
    )
    
    with open(train_file_path, 'w') as file:
        file.write(content)
    
    print("Loss function fix applied successfully!")
else:
    print("Loss function fix already applied or not needed.")
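
Both patch cells rewrite the training script in place. One possible safeguard, sketched below, is to keep a one-time backup copy and report how many replacements were made. The helper `patch_file` and the `.bak` suffix are assumptions for illustration and are not part of HYPERA.

```python
import shutil
from pathlib import Path

def patch_file(path, old, new):
    """Replace every `old` with `new` in the file; return the number of replacements."""
    path = Path(path)
    text = path.read_text()
    count = text.count(old)
    if count:
        backup = path.with_name(path.name + ".bak")
        if not backup.exists():
            shutil.copy2(path, backup)  # keep the untouched original once
        path.write_text(text.replace(old, new))
    return count

# Example call (hypothetical usage, left commented so nothing is modified here):
# patch_file(train_file_path, "to_onehot_y=True", "to_onehot_y=False")
```

A return value of 0 means the file was left untouched, which corresponds to the "already applied or not needed" branch above.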


## 5c. Apply RunPod-Specific Fixes

Set the `RUNPOD_POD_ID` environment variable and check that the training script contains the RunPod detection code.

In [None]:
# Set environment variable to indicate we're on RunPod
import os
os.environ['RUNPOD_POD_ID'] = 'NOTEBOOK_ENVIRONMENT'
print("Set RunPod environment variable to ensure proper DataLoader configuration")

# Check if our RunPod detection code is present
train_file_path = os.path.join(hypera_path, "legacy", "train_bbbc039_with_agents.py")

with open(train_file_path, 'r') as file:
    content = file.read()

if "is_runpod = os.environ.get('RUNPOD_POD_ID')" not in content:
    print("WARNING: RunPod detection code not found in training script.")
    print("The script may not be configured for RunPod environment.")
    print("Please check the latest version of the code.")
else:
    print("RunPod detection code found in training script.")
    print("DataLoader will use 0 worker processes to avoid multiprocessing issues.")
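
For orientation, the detection described above could look roughly like the sketch below. This is not the code from the training script: the function name `choose_num_workers` and the default of 4 workers are assumptions. Only the `RUNPOD_POD_ID` lookup and the choice of 0 workers on RunPod come from the cell above.

```python
import os

def choose_num_workers(default=4, env=None):
    """Return 0 workers when RUNPOD_POD_ID is set, otherwise `default`."""
    env = os.environ if env is None else env
    is_runpod = env.get("RUNPOD_POD_ID") is not None
    return 0 if is_runpod else default

# The result would typically be passed as `num_workers` when building a DataLoader.
print(choose_num_workers())
```

Passing a plain dictionary as `env` makes the behaviour easy to check without changing the real environment.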


## 6. Run Training with Agent-Based Hyperparameter Optimization

Now let's run the training with the agent-based approach.

In [None]:
# Import HYPERA training module
from legacy.train_bbbc039_with_agents import main

# Run training with agent-based hyperparameter optimization
main(
    experiment_type="agent_factory",  # Use the agent-based approach
    epochs=100,                       # Number of epochs
    batch_size=16,                    # Batch size
    early_stopping=20,                # Early stopping patience
    use_cloud=False                   # Don't use cloud storage
)

## 7. Visualize Results with TensorBoard

Use TensorBoard to visualize the training metrics.

In [None]:
%load_ext tensorboard
%tensorboard --logdir=./runs

## 8. Compare with No-Agent Approach (Optional)

Run training without agents to compare performance.

In [None]:
# Run training without agents (using fixed hyperparameters)
# Uncomment to run
'''
main(
    experiment_type="no_agent",  # Use fixed hyperparameters
    epochs=100,                  # Number of epochs
    batch_size=16,               # Batch size
    early_stopping=20,           # Early stopping patience
    use_cloud=False              # Don't use cloud storage
)
'''

## 9. Analyze and Compare Results

Compare the performance of the agent-based approach vs. fixed hyperparameters.

In [None]:
# Plot comparison of training curves
# This is just a placeholder - you'll need to adapt this to your actual results
'''
import pandas as pd
import matplotlib.pyplot as plt

# Load results from CSV files (adjust paths as needed)
agent_results = pd.read_csv('results_agent/metrics.csv')
no_agent_results = pd.read_csv('results_no_agent/metrics.csv')

# Plot Dice scores
plt.figure(figsize=(12, 6))
plt.plot(agent_results['epoch'], agent_results['val_dice'], label='Agent-Based')
plt.plot(no_agent_results['epoch'], no_agent_results['val_dice'], label='Fixed Hyperparameters')
plt.xlabel('Epoch')
plt.ylabel('Validation Dice Score')
plt.title('Agent-Based vs. Fixed Hyperparameters')
plt.legend()
plt.grid(True)
plt.show()
'''
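
Besides plotting the curves, a single summary number per run is often handy. The sketch below reads a metrics CSV with the standard library and returns the epoch with the highest validation Dice score. It assumes the same hypothetical file layout as the placeholder above (columns `epoch` and `val_dice`); adjust the names to match your actual output.

```python
import csv

def best_row(csv_path, metric="val_dice"):
    """Return (epoch, value) for the row with the highest `metric`."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    best = max(rows, key=lambda row: float(row[metric]))
    return int(float(best["epoch"])), float(best[metric])

# Example calls (paths are placeholders, left commented until the files exist):
# print("Agent-based:", best_row("results_agent/metrics.csv"))
# print("Fixed hyperparameters:", best_row("results_no_agent/metrics.csv"))
```

The function raises `ValueError` on a file with no data rows, so run it only after at least one epoch has been logged.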

## 10. Save and Download Results

Make sure to save and download your results before stopping the pod.

In [None]:
# Compress results for easy download
!tar -czvf hypera_results.tar.gz runs/ results_*/ models/
print("Results compressed to hypera_results.tar.gz")
print("Download this file using the Jupyter file browser.")
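
If one of the directories does not exist yet (for example `models/` after an interrupted run), `tar` reports an error for it. A possible alternative is sketched below: it uses the standard `tarfile` module and adds only paths that are actually present. The helper name `archive_existing` is an assumption for illustration.

```python
import glob
import tarfile

def archive_existing(output, patterns):
    """Add every existing path matching `patterns` to a gzip tarball; return the added paths."""
    added = []
    with tarfile.open(output, "w:gz") as tar:
        for pattern in patterns:
            for path in sorted(glob.glob(pattern)):
                tar.add(path)  # directories are added recursively
                added.append(path)
    return added

# Example call mirroring the tar command above (left commented on purpose):
# print(archive_existing("hypera_results.tar.gz", ["runs", "results_*", "models"]))
```

The returned list shows which directories made it into the archive, so a missing one is easy to spot before stopping the pod.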