# 🏥 Rural Emergency Triage AI - MedGemma Fine-Tuning Pipeline

**MedGemma Impact Challenge Submission**

This notebook fine-tunes **MedGemma 4B** for emergency radiology triage using the official Google approach:
- **QLoRA** (4-bit quantization + LoRA adapters)
- **SFTTrainer** from HuggingFace TRL
- **Conversational format** (image + text prompt → classification answer)
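
To make the LoRA bullet concrete: a rank-`r` adapter on a linear layer with `d_in` inputs and `d_out` outputs adds two small matrices holding `r * (d_in + d_out)` trainable weights in total. The sketch below is plain arithmetic. The 2560-wide layer is an illustrative assumption, not a value read from the MedGemma config; `r = 16` matches the `LoraConfig` used later in this notebook.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable weights added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


def full_param_count(d_in: int, d_out: int) -> int:
    """Weights in the frozen base linear layer (bias ignored)."""
    return d_in * d_out


# Hypothetical layer size, chosen only for illustration.
D_IN = D_OUT = 2560
R = 16  # same rank as the LoraConfig in this notebook

lora = lora_param_count(D_IN, D_OUT, R)
full = full_param_count(D_IN, D_OUT)
ratio = lora / full

print(f"Base layer weights:   {full:,}")
print(f"LoRA adapter weights: {lora:,}")
print(f"Adapter / base:       {ratio:.2%}")
```

The adapter is a small fraction of the layer it wraps, which is why only the adapter weights (plus any `modules_to_save`) need gradients and optimizer state.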

### Requirements
- **GPU**: A100 (40GB) via Colab Pro, or T4 with reduced batch size
- **HuggingFace token**: With access to `google/medgemma-4b-it`
- **Kaggle API key**: For dataset downloads

### Tasks
1. **Hemorrhage Detection** (CT head scans → 6 classes: no hemorrhage plus 5 subtypes)
2. **Pneumothorax Detection** (Chest X-rays → binary)

---

## How It Works

MedGemma is a **generative** vision-language model, not a traditional classifier.
We frame classification as a multiple-choice question:

```
User: <image> What critical finding is present in this CT scan?
A: No hemorrhage
B: Epidural hemorrhage
...
Assistant: B: Epidural hemorrhage
```

The model learns to output the correct answer (letter plus label text) via supervised fine-tuning.
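
The multiple-choice framing above can be sketched without any model: build the lettered prompt, then map generated text back to a class index. This is a minimal sketch under stated assumptions. `build_prompt` and `parse_answer` are hypothetical helper names, not functions from this notebook or from any library.

```python
import re


def build_prompt(question: str, options: list[str]) -> tuple[str, list[str]]:
    """Return the full prompt and the lettered labels ("A: ...", "B: ...")."""
    letters = [chr(ord("A") + i) for i in range(len(options))]
    labeled = [f"{letter}: {option}" for letter, option in zip(letters, options)]
    return question + "\n" + "\n".join(labeled), labeled


def parse_answer(generated: str, labeled: list[str]) -> int:
    """Map generated text to an option index, or -1 if it cannot be parsed."""
    text = generated.strip()
    # Accept a leading letter only when followed by ":", ".", ")" or end of text,
    # so ordinary sentences such as "Based on ..." are not read as answer "B".
    match = re.match(r"([A-Z])(?:\s*[:.)]|\s*$)", text)
    if match:
        idx = ord(match.group(1)) - ord("A")
        if idx < len(labeled):
            return idx
    # Fall back to finding a full label somewhere in the text.
    for idx, label in enumerate(labeled):
        if label in text:
            return idx
    return -1


prompt, labeled = build_prompt(
    "What critical finding is present in this CT scan?",
    ["No hemorrhage", "Epidural hemorrhage"],
)
print(prompt)
print(parse_answer("B: Epidural hemorrhage", labeled))
```

Returning `-1` for unparseable output keeps those cases visible, so they can be counted separately instead of being folded into a real class.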

## ⚙️ Step 1: Environment Setup

In [None]:
# Check GPU - bfloat16 needs compute capability >= 8.0 (A100 ideal); a T4 falls back to float16
!nvidia-smi
import torch
print(f"\nPyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    cap = torch.cuda.get_device_capability()
    print(f"Compute capability: {cap[0]}.{cap[1]}")
    print(f"bfloat16 supported: {cap[0] >= 8}")
    if cap[0] < 8:
        print("⚠️ GPU does not support bfloat16. Will use float16 instead.")
        print("   For best results, use an A100 GPU (Colab Pro).")

In [None]:
# Install all dependencies
!pip install -q --upgrade pip
!pip install -q "transformers>=4.50.0" accelerate peft bitsandbytes
!pip install -q trl datasets evaluate tensorboard
!pip install -q pydicom opencv-python-headless Pillow
!pip install -q scikit-learn pandas matplotlib seaborn
!pip install -q kaggle tqdm pyyaml
print("✓ All dependencies installed!")

In [None]:
# Mount Google Drive for persistent model storage
from google.colab import drive
drive.mount('/content/drive')

!mkdir -p /content/drive/MyDrive/rural_triage_ai/models
!mkdir -p /content/drive/MyDrive/rural_triage_ai/results
print("✓ Google Drive mounted!")

## 🔑 Step 2: Authentication

You need two sets of credentials:
1. **HuggingFace token**: for MedGemma model access
2. **Kaggle API key**: for dataset downloads

### Get MedGemma access:
1. Go to [google/medgemma-4b-it](https://huggingface.co/google/medgemma-4b-it)
2. Accept the usage conditions
3. Get your token from [HF Settings](https://huggingface.co/settings/tokens)
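
Before running the authentication cell, it can help to check that both credentials are in place. The sketch below is a hypothetical helper (`missing_credentials` is not part of this notebook). It assumes the token is exposed through the `HF_TOKEN` environment variable and that `kaggle.json` holds the `username` and `key` fields. A token cached by `huggingface-cli login` would not be detected by this check.

```python
import json
import os
from pathlib import Path


def missing_credentials(env: dict, kaggle_json: Path) -> list[str]:
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set")
    if not kaggle_json.is_file():
        problems.append(f"{kaggle_json} not found")
        return problems
    try:
        creds = json.loads(kaggle_json.read_text())
    except json.JSONDecodeError:
        problems.append(f"{kaggle_json} is not valid JSON")
        return problems
    if not isinstance(creds, dict):
        problems.append(f"{kaggle_json} should contain a JSON object")
        return problems
    for field in ("username", "key"):
        if not creds.get(field):
            problems.append(f"{kaggle_json} is missing the '{field}' field")
    return problems


# Check the default Kaggle location used by the cell below.
issues = missing_credentials(dict(os.environ), Path.home() / ".kaggle" / "kaggle.json")
print("Credentials look complete" if not issues else "\n".join(issues))
```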

In [None]:
import os
import sys

# --- HuggingFace Authentication ---
if "google.colab" in sys.modules:
    from google.colab import userdata
    try:
        os.environ["HF_TOKEN"] = userdata.get("HF_TOKEN")
        print("✓ HF_TOKEN loaded from Colab Secrets")
    except Exception:
        print("⚠️ HF_TOKEN not found in Colab Secrets.")
        print("   Add it: click 🔑 Secrets (left panel) → New secret → Name: HF_TOKEN")
        print("   Or run: huggingface-cli login")
        from huggingface_hub import notebook_login
        notebook_login()
else:
    from huggingface_hub import get_token
    if get_token() is None:
        from huggingface_hub import notebook_login
        notebook_login()

# --- Kaggle Authentication ---
if os.path.exists("/content/drive/MyDrive/kaggle.json"):
    !mkdir -p ~/.kaggle
    !cp /content/drive/MyDrive/kaggle.json ~/.kaggle/
    !chmod 600 ~/.kaggle/kaggle.json
    print("✓ Kaggle credentials loaded from Google Drive")
else:
    print("Upload your kaggle.json (from https://www.kaggle.com/account → Create New API Token):")
    from google.colab import files
    uploaded = files.upload()
    !mkdir -p ~/.kaggle
    !mv kaggle.json ~/.kaggle/
    !chmod 600 ~/.kaggle/kaggle.json
    print("✓ Kaggle credentials configured")

## 📥 Step 3: Download Dataset

The full-scale target is the **RSNA Intracranial Hemorrhage Detection** dataset from Kaggle (Option A, commented out below).
For quick iteration, the cell downloads a smaller public head CT dataset by default (Option B).

**Note**: You must accept the competition rules first at:
https://www.kaggle.com/competitions/rsna-intracranial-hemorrhage-detection/rules

In [None]:
import os
import pandas as pd
import numpy as np
from pathlib import Path

DATA_DIR = "/content/data"
os.makedirs(DATA_DIR, exist_ok=True)

# --- Option A: Download RSNA Hemorrhage subset from Kaggle ---
# Uncomment if you have Kaggle credentials and accepted competition rules:
# !kaggle competitions download -c rsna-intracranial-hemorrhage-detection -p {DATA_DIR}/rsna -f stage_2_train.csv
# !kaggle competitions download -c rsna-intracranial-hemorrhage-detection -p {DATA_DIR}/rsna -f stage_2_train_images.zip

# --- Option B: Download a smaller public head CT dataset ---
print("Downloading head CT hemorrhage dataset (~500MB)...")
os.makedirs(f"{DATA_DIR}/head_ct", exist_ok=True)
!kaggle datasets download -d felipekitamura/head-ct-hemorrhage -p {DATA_DIR}/head_ct
!cd {DATA_DIR}/head_ct && unzip -q -o "*.zip" 2>/dev/null; true

# --- Option C: Use chest X-ray dataset for pneumothorax ---
# print("Downloading chest X-ray pneumonia dataset (~1.2GB)...")
# os.makedirs(f"{DATA_DIR}/chest_xray", exist_ok=True)
# !kaggle datasets download -d paultimothymooney/chest-xray-pneumonia -p {DATA_DIR}/chest_xray
# !cd {DATA_DIR}/chest_xray && unzip -q -o "*.zip" 2>/dev/null; true

print("\n✓ Dataset downloaded!")
!du -sh {DATA_DIR}/*

## 🗂️ Step 4: Prepare Dataset for MedGemma

MedGemma uses a **conversational format** for fine-tuning. We convert our image classification
dataset into message-based examples:

```
User: [image] What critical finding is present?
Assistant: B: Epidural hemorrhage
```
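
A quick structural check can catch malformed examples before they reach the trainer. The sketch below assumes the message schema used in this notebook: one user turn with exactly one image part plus text, followed by one assistant turn with a single text part. `validate_messages` is a hypothetical helper, not a TRL or transformers function.

```python
def validate_messages(messages: list[dict]) -> list[str]:
    """Return a list of schema problems; an empty list means the example is valid."""
    errors = []
    roles = [m.get("role") for m in messages]
    if roles != ["user", "assistant"]:
        errors.append(f"expected roles ['user', 'assistant'], got {roles}")
    for m in messages:
        parts = m.get("content") or []
        types = [p.get("type") for p in parts if isinstance(p, dict)]
        if m.get("role") == "user":
            if types.count("image") != 1:
                errors.append("user turn must contain exactly one image part")
            if "text" not in types:
                errors.append("user turn must contain a text part")
        elif m.get("role") == "assistant" and types != ["text"]:
            errors.append("assistant turn must be a single text part")
    return errors


example = [
    {"role": "user", "content": [{"type": "image"}, {"type": "text", "text": "What critical finding is present?"}]},
    {"role": "assistant", "content": [{"type": "text", "text": "B: Epidural hemorrhage"}]},
]
print(validate_messages(example))
```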

In [None]:
import os
import glob
import random
from pathlib import Path
from typing import Any
from PIL import Image
from datasets import Dataset, DatasetDict

# ============================================================
# Configuration: change these for your task
# ============================================================
TASK = "hemorrhage"  # "hemorrhage" or "pneumothorax"

if TASK == "hemorrhage":
    CLASS_LABELS = [
        "A: No hemorrhage",
        "B: Epidural hemorrhage",
        "C: Subdural hemorrhage",
        "D: Subarachnoid hemorrhage",
        "E: Intraventricular hemorrhage",
        "F: Intraparenchymal hemorrhage",
    ]
    PROMPT = (
        "You are an emergency radiology AI assistant. "
        "Analyze this CT head scan and identify the most likely finding.\n"
        + "\n".join(CLASS_LABELS)
    )
    DATA_DIR = "/content/data/head_ct"
else:
    CLASS_LABELS = [
        "A: No pneumothorax",
        "B: Pneumothorax present",
    ]
    PROMPT = (
        "You are an emergency radiology AI assistant. "
        "Analyze this chest X-ray and determine if pneumothorax is present.\n"
        + "\n".join(CLASS_LABELS)
    )
    DATA_DIR = "/content/data/chest_xray"

print(f"Task: {TASK}")
print(f"Classes: {len(CLASS_LABELS)}")
print(f"Data dir: {DATA_DIR}")

# ============================================================
# Build dataset from image files
# ============================================================
def find_images_and_labels(data_dir: str) -> list[dict]:
    """
    Scan data directory for images and assign labels.
    Adapt this function to your specific dataset structure.
    """
    examples = []
    data_path = Path(data_dir)

    # Strategy 1: Folder-based labels (e.g., chest_xray/train/NORMAL/, chest_xray/train/PNEUMONIA/)
    for label_dir in sorted(data_path.rglob("*")):
        if label_dir.is_dir():
            images = list(label_dir.glob("*.png")) + list(label_dir.glob("*.jpg")) + list(label_dir.glob("*.jpeg"))
            if images:
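                folder_name = label_dir.name.upper()
                # Map folder names to label indices. Compare whole words so that a
                # folder such as "ABNORMAL" or "UNKNOWN" is not counted as negative.
                words = set(folder_name.replace("-", "_").replace(" ", "_").split("_"))
                if words & {"NORMAL", "NEGATIVE", "NO"}:
                    label_idx = 0
                else:
                    # Every other folder maps to class index 1; multi-class tasks
                    # need real per-class labels (for example from a CSV).
                    label_idx = min(1, len(CLASS_LABELS) - 1)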
                for img_path in images:
                    examples.append({"image_path": str(img_path), "label": label_idx})

    # Seed before any random draw so the demo labels and the shuffle are reproducible
    random.seed(42)

    # Strategy 2: If no folder-based labels found, assign random demo labels
    # (You'll need to add proper labels from a CSV or other source)
    if not examples:
        all_images = []
        for ext in ["*.png", "*.jpg", "*.jpeg", "*.dcm"]:
            all_images.extend(data_path.rglob(ext))
        print(f"Found {len(all_images)} images (no folder-based labels detected)")
        print("⚠️ Assigning random labels for demo. Replace with real labels!")
        for img_path in all_images:
            examples.append({
                "image_path": str(img_path),
                "label": random.randint(0, len(CLASS_LABELS) - 1),
            })

    random.shuffle(examples)
    return examples

raw_examples = find_images_and_labels(DATA_DIR)
print(f"\nTotal examples: {len(raw_examples)}")

# Limit dataset size for quick iteration (increase for full training)
MAX_TRAIN = 2000
MAX_VAL = 200
raw_examples = raw_examples[: MAX_TRAIN + MAX_VAL]

# ============================================================
# Convert to HuggingFace Dataset with conversational format
# ============================================================
def load_and_format(example: dict) -> dict:
    """Load image and format as MedGemma conversation."""
    try:
        img_path = example["image_path"]
        if img_path.endswith(".dcm"):
            import pydicom
            dcm = pydicom.dcmread(img_path)
            arr = dcm.pixel_array.astype(np.float32)
            arr = (arr - arr.min()) / (arr.max() - arr.min() + 1e-8) * 255
            image = Image.fromarray(arr.astype(np.uint8)).convert("RGB")
        else:
            image = Image.open(img_path).convert("RGB")

        label_idx = example["label"]
        return {
            "image": image,
            "label": label_idx,
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {"type": "image"},
                        {"type": "text", "text": PROMPT},
                    ],
                },
                {
                    "role": "assistant",
                    "content": [
                        {"type": "text", "text": CLASS_LABELS[label_idx]},
                    ],
                },
            ],
        }
    except Exception as e:
        print(f"Error loading {example['image_path']}: {e}")
        return None

# Process examples
print("Loading and formatting images...")
formatted = []
for ex in raw_examples:
    result = load_and_format(ex)
    if result is not None:
        formatted.append(result)

# Split into train/val
split_idx = int(len(formatted) * 0.9)
train_data = formatted[:split_idx]
val_data = formatted[split_idx:]

data = DatasetDict({
    "train": Dataset.from_list(train_data),
    "validation": Dataset.from_list(val_data),
})

print("\n✓ Dataset ready!")
print(f"  Train: {len(data['train'])} examples")
print(f"  Validation: {len(data['validation'])} examples")
print(f"\nSample message format:")
print(data["train"][0]["messages"])

## 🤖 Step 5: Load MedGemma with QLoRA

We load the instruction-tuned MedGemma 4B model with 4-bit quantization so that it fits in GPU memory.
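
A rough back-of-the-envelope estimate shows why 4-bit loading matters. The sketch below counts weight storage only and assumes a nominal 4 billion parameters. It ignores quantization constants, modules kept in higher precision, activations, and optimizer state, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Nominal size; the loading cell prints the exact count via model.num_parameters().
NUM_PARAMS = 4e9

for name, bits in [("bf16 / fp16", 16), ("int8", 8), ("nf4 (4-bit)", 4)]:
    print(f"{name:<12} ~{weight_memory_gb(NUM_PARAMS, bits):.1f} GB of weights")
```

Under these assumptions, 4-bit storage cuts the weight footprint to a quarter of the 16-bit figure, which leaves room on a 16 GB card for activations and adapter training state.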

In [None]:
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText, BitsAndBytesConfig

MODEL_ID = "google/medgemma-4b-it"

# Determine dtype based on GPU capability
if torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8:
    compute_dtype = torch.bfloat16
    print("Using bfloat16 (A100/H100 detected)")
else:
    compute_dtype = torch.float16
    print("Using float16 (T4/V100 detected)")

# QLoRA quantization config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_quant_storage=compute_dtype,
)

print(f"Loading {MODEL_ID}...")
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    attn_implementation="eager",
    torch_dtype=compute_dtype,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(MODEL_ID)
processor.tokenizer.padding_side = "right"

print("\n✓ MedGemma loaded!")
print(f"  Model parameters: {model.num_parameters():,}")
print(f"  Dtype: {compute_dtype}")

In [None]:
from peft import LoraConfig
from typing import Any

# ============================================================
# LoRA Configuration
# ============================================================
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.05,
    r=16,
    bias="none",
    target_modules="all-linear",
    task_type="CAUSAL_LM",
    modules_to_save=["lm_head", "embed_tokens"],
)

# ============================================================
# Custom Data Collator for multimodal inputs
# ============================================================
def collate_fn(examples: list[dict[str, Any]]):
    """Process examples with text + images into model input format."""
    texts = []
    images = []
    for example in examples:
        images.append([example["image"].convert("RGB")])
        texts.append(
            processor.apply_chat_template(
                example["messages"], add_generation_prompt=False, tokenize=False
            ).strip()
        )

    # Tokenize texts and process images
    batch = processor(text=texts, images=images, return_tensors="pt", padding=True)

    # Create labels: mask padding and image tokens in loss computation
    labels = batch["input_ids"].clone()

    # Mask image tokens
    image_token_id = processor.tokenizer.convert_tokens_to_ids(
        processor.tokenizer.special_tokens_map.get("boi_token", "<image>")
    )
    labels[labels == processor.tokenizer.pad_token_id] = -100
    if isinstance(image_token_id, int):
        labels[labels == image_token_id] = -100
    labels[labels == 262144] = -100  # Additional image placeholder token

    batch["labels"] = labels
    return batch

print("✓ LoRA config and data collator ready")

In [None]:
from trl import SFTConfig, SFTTrainer

# ============================================================
# Training Configuration
# ============================================================
OUTPUT_DIR = f"medgemma-4b-it-{TASK}"
DRIVE_SAVE_DIR = f"/content/drive/MyDrive/rural_triage_ai/models/{TASK}"
os.makedirs(DRIVE_SAVE_DIR, exist_ok=True)

# Adjust batch size based on GPU memory
# A100 (40GB): batch_size=4, grad_accum=4
# T4 (16GB):   batch_size=1, grad_accum=16
if torch.cuda.is_available() and torch.cuda.get_device_capability()[0] >= 8:
    batch_size = 4
    grad_accum = 4
else:
    batch_size = 1
    grad_accum = 16

training_args = SFTConfig(
    output_dir=OUTPUT_DIR,
    num_train_epochs=1,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    gradient_accumulation_steps=grad_accum,
    gradient_checkpointing=True,
    optim="adamw_torch_fused",
    logging_steps=10,
    save_strategy="epoch",
    eval_strategy="steps",
    eval_steps=50,
    learning_rate=2e-4,
    bf16=(compute_dtype == torch.bfloat16),
    fp16=(compute_dtype == torch.float16),
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="linear",
    push_to_hub=False,
    report_to="tensorboard",
    gradient_checkpointing_kwargs={"use_reentrant": False},
    dataset_kwargs={"skip_prepare_dataset": True},
    remove_unused_columns=False,
    label_names=["labels"],
)

# ============================================================
# Create Trainer
# ============================================================
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=data["train"],
    eval_dataset=data["validation"],
    peft_config=peft_config,
    processing_class=processor,
    data_collator=collate_fn,
)

print("✓ Trainer configured!")
print(f"  Batch size: {batch_size} × {grad_accum} grad accum = {batch_size * grad_accum} effective")
print(f"  Training steps: ~{len(data['train']) // (batch_size * grad_accum)}")
print(f"  Output: {OUTPUT_DIR}")

## Step 6: Train!

This will take approximately:
- **A100**: ~1-3 hours (2000 samples, 1 epoch)
- **T4**: ~4-8 hours (2000 samples, 1 epoch)

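Training runs on the Colab VM rather than on your laptop, but Colab may disconnect the runtime if the browser tab is closed or left idle, so keep the session connected unless your plan includes background execution.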
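
The number of optimizer steps behind these time estimates can be sketched as follows. The figure of 1,980 training samples is an assumption: it is 90% of the 2,200 examples kept by the dataset cell, and it only holds if every image loads. The count reported by the trainer may differ slightly depending on how it handles the last partial batch.

```python
import math


def optimizer_steps(num_samples: int, per_device_batch: int, grad_accum: int, epochs: int = 1) -> int:
    """Approximate optimizer updates: one per effective batch, rounded up."""
    effective_batch = per_device_batch * grad_accum
    return math.ceil(num_samples / effective_batch) * epochs


NUM_TRAIN = 1980  # assumed: 90% of 2,200 loaded examples

# Both GPU presets in this notebook give the same effective batch size of 16.
for gpu, batch, accum in [("A100", 4, 4), ("T4", 1, 16)]:
    steps = optimizer_steps(NUM_TRAIN, batch, accum)
    print(f"{gpu}: effective batch {batch * accum}, ~{steps} optimizer steps per epoch")
```

Since both presets take the same number of optimizer steps, the difference in wall-clock time comes from per-step throughput, not from the step count.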

In [None]:
print("🚀 Starting training...")
print(f"   Task: {TASK}")
print(f"   Model: {MODEL_ID}")
print(f"   Train samples: {len(data['train'])}")
print(f"   Val samples: {len(data['validation'])}")
print()

trainer.train()

print("\n✓ Training complete!")

## 💾 Step 7: Save Model to Google Drive

In [None]:
import shutil
import json

# Save LoRA adapter locally
trainer.save_model(OUTPUT_DIR)
processor.save_pretrained(OUTPUT_DIR)

# Copy to Google Drive for persistence
print(f"Copying model to Google Drive: {DRIVE_SAVE_DIR}")
if os.path.exists(DRIVE_SAVE_DIR):
    shutil.rmtree(DRIVE_SAVE_DIR)
shutil.copytree(OUTPUT_DIR, DRIVE_SAVE_DIR)

# Save training metrics
metrics = trainer.state.log_history
with open(f"{DRIVE_SAVE_DIR}/training_metrics.json", "w") as f:
    json.dump(metrics, f, indent=2)

# Save task config for inference
task_config = {
    "model_id": MODEL_ID,
    "task": TASK,
    "class_labels": CLASS_LABELS,
    "prompt": PROMPT,
    "lora_dir": DRIVE_SAVE_DIR,
}
with open(f"{DRIVE_SAVE_DIR}/task_config.json", "w") as f:
    json.dump(task_config, f, indent=2)

print("\n✓ Model saved to Google Drive!")
print(f"  Adapter: {DRIVE_SAVE_DIR}/")
print(f"  Metrics: {DRIVE_SAVE_DIR}/training_metrics.json")
print(f"  Config:  {DRIVE_SAVE_DIR}/task_config.json")

## Step 8: Evaluate on the Validation Set

In [None]:
import torch
from sklearn.metrics import classification_report, confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

# ============================================================
# Run inference on validation set
# ============================================================
model.eval()

y_true = []
y_pred = []
y_pred_text = []
num_eval = min(100, len(data["validation"]))  # Evaluate on subset for speed

print(f"Running inference on {num_eval} validation samples...\n")

for i in range(num_eval):
    example = data["validation"][i]
    image = example["image"].convert("RGB")
    true_label = example["label"]

    # Build input (user message only, no assistant response)
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": PROMPT},
            ],
        },
    ]

    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=False
    )
    model_inputs = processor(
        text=inputs, images=[image], return_tensors="pt"
    ).to(model.device, dtype=compute_dtype)

    input_len = model_inputs["input_ids"].shape[-1]

    with torch.inference_mode():
        output = model.generate(**model_inputs, max_new_tokens=50, do_sample=False)
    
    generated = processor.decode(output[0][input_len:], skip_special_tokens=True).strip()

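    # Parse predicted class: match the "<letter>:" prefix or the full label text,
    # not a bare letter anywhere in the output (which would hit ordinary words)
    pred_label = -1
    for idx, class_name in enumerate(CLASS_LABELS):
        letter = class_name.split(":")[0].strip()
        if generated == letter or generated.startswith(f"{letter}:") or class_name in generated:
            pred_label = idx
            break
    if pred_label == -1:
        pred_label = 0  # Default fallback: unparseable output counts as class A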

    y_true.append(true_label)
    y_pred.append(pred_label)
    y_pred_text.append(generated)

    if i < 5:
        print(f"  Sample {i}: True={CLASS_LABELS[true_label]} | Pred={generated}")

# ============================================================
# Classification Report
# ============================================================
short_labels = [c.split(": ")[1] for c in CLASS_LABELS]
print("\n" + "=" * 60)
print("Classification Report")
print("=" * 60)
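# Pass labels explicitly so the report does not fail when some classes are absent
print(classification_report(
    y_true, y_pred,
    labels=list(range(len(CLASS_LABELS))),
    target_names=short_labels, zero_division=0,
))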

# ============================================================
# Confusion Matrix
# ============================================================
cm = confusion_matrix(y_true, y_pred, labels=list(range(len(CLASS_LABELS))))
plt.figure(figsize=(8, 6))
sns.heatmap(
    cm, annot=True, fmt="d", cmap="Blues",
    xticklabels=short_labels, yticklabels=short_labels,
)
plt.xlabel("Predicted")
plt.ylabel("True")
plt.title(f"Confusion Matrix - {TASK.title()} Detection")
plt.tight_layout()
plt.savefig(f"{DRIVE_SAVE_DIR}/confusion_matrix.png", dpi=150)
plt.show()
print(f"✓ Confusion matrix saved to {DRIVE_SAVE_DIR}/confusion_matrix.png")

## 🔍 Step 9: Single Image Inference Demo

Use this cell to test the model on any image. Upload an image or provide a path.

In [None]:
def predict_image(image_path: str, task_prompt: str = PROMPT) -> str:
    """Run inference on a single image and return the prediction."""
    if image_path.endswith(".dcm"):
        import pydicom
        dcm = pydicom.dcmread(image_path)
        arr = dcm.pixel_array.astype(np.float32)
        arr = (arr - arr.min()) / (arr.max() - arr.min() + 1e-8) * 255
        image = Image.fromarray(arr.astype(np.uint8)).convert("RGB")
    else:
        image = Image.open(image_path).convert("RGB")

    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": task_prompt},
            ],
        },
    ]

    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=False
    )
    model_inputs = processor(
        text=inputs, images=[image], return_tensors="pt"
    ).to(model.device, dtype=compute_dtype)

    input_len = model_inputs["input_ids"].shape[-1]

    with torch.inference_mode():
        output = model.generate(**model_inputs, max_new_tokens=100, do_sample=False)

    response = processor.decode(output[0][input_len:], skip_special_tokens=True).strip()

    # Display
    plt.figure(figsize=(6, 6))
    plt.imshow(image)
    plt.axis("off")
    plt.title(f"Prediction: {response}", fontsize=12, pad=10)
    plt.tight_layout()
    plt.show()

    return response


# --- Test on a validation image ---
test_example = data["validation"][0]
test_path = test_example.get("image_path", None)
if test_path and os.path.exists(test_path):
    result = predict_image(test_path)
else:
    # Use the PIL image directly
    image = test_example["image"].convert("RGB")
    messages = [{"role": "user", "content": [{"type": "image"}, {"type": "text", "text": PROMPT}]}]
    inputs = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
    model_inputs = processor(text=inputs, images=[image], return_tensors="pt").to(model.device, dtype=compute_dtype)
    input_len = model_inputs["input_ids"].shape[-1]
    with torch.inference_mode():
        output = model.generate(**model_inputs, max_new_tokens=100, do_sample=False)
    result = processor.decode(output[0][input_len:], skip_special_tokens=True).strip()

    plt.figure(figsize=(6, 6))
    plt.imshow(image)
    plt.axis("off")
    plt.title(f"Prediction: {result}", fontsize=12, pad=10)
    plt.tight_layout()
    plt.show()

print(f"\nTrue label: {CLASS_LABELS[test_example['label']]}")
print(f"Prediction: {result}")

In [None]:
print("=" * 60)
print("✅ PIPELINE COMPLETE!")
print("=" * 60)
print(f"""
What we did:
  ✓ Loaded MedGemma 4B with QLoRA (4-bit quantization)
  ✓ Prepared {TASK} dataset in conversational format
  ✓ Fine-tuned with SFTTrainer + LoRA adapters
  ✓ Evaluated on validation set with classification metrics
  ✓ Saved LoRA adapter + config to Google Drive

Model saved at:
  {DRIVE_SAVE_DIR}/

To load the fine-tuned model later:
  from peft import PeftModel
  base_model = AutoModelForImageTextToText.from_pretrained("{MODEL_ID}", ...)
  model = PeftModel.from_pretrained(base_model, "{DRIVE_SAVE_DIR}")

Next steps:
  1. Train on full RSNA dataset (100K+ images) for better accuracy
  2. Train pneumothorax task (change TASK="pneumothorax" and re-run)
  3. Integrate into FastAPI backend (src/api/)
  4. Build React Native UI for tablet deployment
  5. Submit to MedGemma Impact Challenge!
""")
print("=" * 60)