# Lab 6: nnU-Net v2 Self-Configuring Segmentation on SageMaker

This notebook demonstrates how to train a medical image segmentation model on Amazon SageMaker using nnU-Net v2. nnU-Net derives its preprocessing, network architecture, and training hyperparameters from the properties of the dataset, so none of them has to be configured by hand.

## What You'll Learn
- Running nnU-Net v2 on SageMaker with a custom Docker image
- Self-configuring preprocessing, training, and evaluation
- Comparing nnU-Net with manually configured MONAI models (Lab 1)

## Prerequisites
- Medical imaging data in S3 in nnU-Net format (`imagesTr/`, `labelsTr/`, `dataset.json`)
- SageMaker execution role with S3 access
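
Before uploading, it can help to confirm that a local copy of the dataset matches the layout listed above. The following is a minimal sketch, not part of the lab code: `check_nnunet_dataset` is a hypothetical helper, and the keys it looks for (`channel_names`, `labels`, `numTraining`, `file_ending`) are the ones nnU-Net v2 expects in `dataset.json`.

```python
import json
from pathlib import Path

# Keys that nnU-Net v2 expects to find in dataset.json.
REQUIRED_KEYS = ("channel_names", "labels", "numTraining", "file_ending")


def check_nnunet_dataset(root):
    """Return a list of problems found in a local nnU-Net v2 dataset folder.

    Hypothetical helper for illustration; an empty list means no problems found.
    """
    root = Path(root)
    problems = []

    # The two training folders must exist.
    for sub in ("imagesTr", "labelsTr"):
        if not (root / sub).is_dir():
            problems.append(f"missing directory: {sub}/")

    meta_path = root / "dataset.json"
    if not meta_path.is_file():
        problems.append("missing file: dataset.json")
        return problems

    meta = json.loads(meta_path.read_text())
    for key in REQUIRED_KEYS:
        if key not in meta:
            problems.append(f"dataset.json lacks key: {key}")

    # numTraining should match the number of label files actually present.
    labels_dir = root / "labelsTr"
    if labels_dir.is_dir() and "numTraining" in meta:
        ending = meta.get("file_ending", "")
        n_labels = sum(1 for p in labels_dir.iterdir() if p.name.endswith(ending))
        if n_labels != meta["numTraining"]:
            problems.append(
                f"numTraining={meta['numTraining']} but found {n_labels} label files"
            )
    return problems
```

Calling `check_nnunet_dataset("path/to/dataset")` on the folder you plan to upload returns an empty list when the basic layout looks right.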

## Step 1: Setup and Imports

In [None]:
import os
import sagemaker
from sagemaker.pytorch import PyTorch
from sagemaker import get_execution_role
import boto3


# Replace '<your-region>' with your AWS Region (for example, 'us-east-1').
sagemaker_session = sagemaker.Session(boto3.Session(region_name='<your-region>'))
role = get_execution_role()
# If get_execution_role() raises an error (for example, when running outside SageMaker),
# uncomment the following line and provide your role ARN.
# role = "arn:aws:iam::123456789012:role/service-role/AmazonSageMaker-ExecutionRole-20200101T000001"
region = sagemaker_session.boto_region_name
bucket = sagemaker_session.default_bucket()
print(f"SageMaker role: {role}")
print(f"Region: {region}")
print(f"Bucket: {bucket}")

## Step 2: Configure Data Paths

In [None]:
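# Overrides the default bucket from Step 1; replace with the bucket that holds your data.
bucket = 'your-sagemaker-bucket-name'
data_path = f's3://{bucket}/segmentation_data/'
# Note: this output prefix sits under data_path, so objects written here are also
# downloaded with the training channel on later runs. Use a separate prefix to avoid that.
output_path = f's3://{bucket}/segmentation_data/output'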

print(f"Training data: {data_path}")
print(f"Output path: {output_path}")
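
A quick check that the S3 prefix actually contains the nnU-Net entries can save a failed training job. The sketch below is an illustration under assumptions: `split_s3_uri` and `missing_dataset_entries` are hypothetical helpers, and the client passed in is expected to behave like `boto3.client("s3")`, whose `list_objects_v2` response includes a `KeyCount` field.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


def missing_dataset_entries(s3_client, data_uri):
    """Return the expected nnU-Net entries that have no objects under data_uri.

    s3_client is assumed to expose list_objects_v2 like a boto3 S3 client.
    """
    bucket_name, prefix = split_s3_uri(data_uri)
    if prefix and not prefix.endswith("/"):
        prefix += "/"

    missing = []
    for entry in ("imagesTr/", "labelsTr/", "dataset.json"):
        # One key is enough to prove that the entry exists.
        resp = s3_client.list_objects_v2(
            Bucket=bucket_name, Prefix=prefix + entry, MaxKeys=1
        )
        if resp.get("KeyCount", 0) == 0:
            missing.append(entry)
    return missing
```

In this notebook the call would look like `missing_dataset_entries(boto3.client("s3"), data_path)`; an empty list means all three entries were found.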

## Step 3: Build and Push Docker Image

nnU-Net is built from its own Dockerfile (`docker/Dockerfile.nnunet`) because its dependencies differ from MONAI's.

In [None]:
account_id = boto3.client('sts').get_caller_identity()['Account']
image_name = "nnunet-segmentation"
# Despite its name, ecr_repo holds the full image URI, including the ':latest' tag.
ecr_repo = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image_name}:latest"

print(f"ECR Repository: {ecr_repo}")
print("\nTo build and push the Docker image, run:")
print("  cd ../code")
print(f"  docker build -f docker/Dockerfile.nnunet -t {image_name}:latest .")
print(f"  aws ecr get-login-password --region {region} | docker login --username AWS --password-stdin {account_id}.dkr.ecr.{region}.amazonaws.com")
print(f"  aws ecr create-repository --repository-name {image_name} --region {region} || true")
print(f"  docker tag {image_name}:latest {ecr_repo}")
print(f"  docker push {ecr_repo}")

## Step 4: Create SageMaker Estimator

In [None]:
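estimator = PyTorch(
    image_uri=ecr_repo,                    # custom nnU-Net image pushed in Step 3
    entry_point="nnunet_pipeline.py",
    source_dir="../code/training/nnunet",
    role=role,
    instance_count=1,
    instance_type="ml.g5.xlarge",          # single-GPU instance
    hyperparameters={
        # Comma-separated pipeline stages passed to the entry point
        "stages": "preprocess,train,evaluate",
    },
    output_path=output_path,
    base_job_name="nnunet-pipeline",
    # Keep the instance warm for 30 minutes after the job ends (SageMaker warm pool)
    keep_alive_period_in_seconds=1800,
    sagemaker_session=sagemaker_session,
)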
print("Estimator created successfully!")
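
The `stages` hyperparameter reaches the entry point as a command-line argument (`--stages preprocess,train,evaluate`) when the image includes the SageMaker training toolkit. The contents of `nnunet_pipeline.py` are not shown in this notebook, so the following is only a sketch of how such a script might parse it; `parse_args`, `KNOWN_STAGES`, and the `--data-dir` and `--model-dir` options are assumptions for illustration. `SM_CHANNEL_TRAINING` and `SM_MODEL_DIR` are the environment variables SageMaker sets for the `training` channel and the model output directory.

```python
import argparse
import os

# Stage names taken from the estimator's "stages" hyperparameter.
KNOWN_STAGES = ("preprocess", "train", "evaluate")


def parse_args(argv=None):
    """Parse SageMaker-style arguments (hypothetical sketch of an entry point)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--stages", default=",".join(KNOWN_STAGES))
    parser.add_argument(
        "--data-dir",
        default=os.environ.get("SM_CHANNEL_TRAINING", "/opt/ml/input/data/training"),
    )
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
    )
    # Ignore any extra arguments that SageMaker may pass along.
    args, _unknown = parser.parse_known_args(argv)

    # Turn "preprocess,train" into ["preprocess", "train"] and validate it.
    stages = [s.strip() for s in args.stages.split(",") if s.strip()]
    invalid = [s for s in stages if s not in KNOWN_STAGES]
    if invalid:
        raise ValueError(f"unknown stages: {invalid}")
    args.stages = stages
    return args
```

With this shape, setting `"stages": "evaluate"` in the estimator would let a later job skip preprocessing and training, provided the script supports resuming from earlier outputs.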

## Step 5: Start Training

In [None]:
# Blocks until the job finishes and streams the job logs into the notebook.
estimator.fit({"training": data_path}, wait=True, logs="All")

## Step 6: View Training Results

In [None]:
model_data = estimator.model_data
training_job_name = estimator.latest_training_job.name
print(f"Training job: {training_job_name}")
print(f"Model artifacts: {model_data}")
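
`model_data` points at a `model.tar.gz` archive in S3. To see what the pipeline wrote, the archive can be downloaded and its contents listed. This is a minimal sketch: `list_artifact_members` is a hypothetical helper, and which files appear depends on what `nnunet_pipeline.py` saves to the model directory.

```python
import tarfile


def list_artifact_members(tar_path):
    """Return the sorted names of regular files stored in a model.tar.gz archive."""
    with tarfile.open(tar_path, "r:gz") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


# Example usage in this notebook (requires AWS access, so left commented out):
# from sagemaker.s3 import S3Downloader
# S3Downloader.download(model_data, "artifacts", sagemaker_session=sagemaker_session)
# for name in list_artifact_members("artifacts/model.tar.gz"):
#     print(name)
```

Listing the members first avoids extracting a large archive just to find out whether the evaluation outputs are there.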