## 1. Download Example Data from HuggingFace

First, we'll download the anonymized example dataset from HuggingFace. This data includes:
- Synthetic CT images (for spatial reference)
- Real dose distributions
- Real structure masks (organs at risk and targets)

In [None]:
from huggingface_hub import snapshot_download
from pathlib import Path

# Download the dataset (cached locally after first download)
data_path = snapshot_download(
    repo_id="contouraid/dosemetrics-data",
    repo_type="dataset"
)

data_path = Path(data_path)
print(f"✓ Data downloaded to: {data_path}")
print(f"\nAvailable datasets:")
for item in data_path.iterdir():
    if item.is_dir():
        print(f"  - {item.name}")
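
Once the snapshot is on disk, it can help to locate every folder that actually contains a dose file before loading anything. The following is a minimal sketch using only `pathlib`. The helper name `find_subject_folders` is hypothetical (it is not part of DoseMetrics), and it assumes dose files are named `Dose.nii.gz`, as in the example data used later in this notebook.

```python
from pathlib import Path


def find_subject_folders(root, dose_name="Dose.nii.gz"):
    """Return the sorted folders under `root` that directly contain a dose file.

    Hypothetical helper, not part of DoseMetrics. Assumes each subject folder
    holds one dose file named `dose_name`.
    """
    root = Path(root)
    # rglob searches recursively; keep the parent folder of every match
    return sorted(p.parent for p in root.rglob(dose_name) if p.is_file())
```

With the download above, `find_subject_folders(data_path)` would be expected to list folders such as `longitudinal/time_point_1`, provided the file naming assumption holds.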

## 2. Basic Data Loading

The simplest way to load data is to call `read_dose_and_mask_files()`. This function:
- Automatically finds the dose file
- Loads all structure masks
- Returns a dose array and a `StructureSet` object

In [None]:
from dosemetrics import read_dose_and_mask_files

# Load test subject data
subject_path = data_path / "longitudinal" / "time_point_1"
dose, structures = read_dose_and_mask_files(subject_path)

print(f"✓ Loaded dose distribution")
print(f"  Shape: {dose.shape}")
print(f"  Data type: {dose.dtype}")
print(f"  Dose range: {dose.min():.2f} - {dose.max():.2f} Gy")
print(f"\n✓ Loaded {len(structures.structure_names)} structures")
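
Before computing anything, it is worth confirming that every mask lives on the same voxel grid as the dose. The sketch below shows one way to do that with plain NumPy arrays. The function `check_mask_alignment` and the plain `dict` of masks are illustrative stand-ins (assumptions), not the DoseMetrics `StructureSet` API.

```python
import numpy as np


def check_mask_alignment(dose, masks):
    """Return the names of masks whose shape differs from the dose grid.

    `masks` is a plain dict mapping structure name -> array. This is an
    illustrative stand-in, not the DoseMetrics StructureSet.
    """
    dose_shape = np.asarray(dose).shape
    return [
        name for name, mask in masks.items()
        if np.asarray(mask).shape != dose_shape
    ]


# Synthetic example: one mask matches the 4x4x4 dose grid, one does not
dose_demo = np.zeros((4, 4, 4), dtype=np.float32)
masks_demo = {
    "PTV": np.ones((4, 4, 4), dtype=np.uint8),
    "BrainStem": np.ones((4, 4, 2), dtype=np.uint8),
}
print(check_mask_alignment(dose_demo, masks_demo))  # ['BrainStem']
```

An empty list means every mask can be used directly as a boolean index into the dose array.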

## 3. Working with the StructureSet

The `StructureSet` object provides a convenient interface for accessing structure masks.

In [None]:
# List all available structures
print("Available structures:")
for i, name in enumerate(structures.structure_names, 1):
    print(f"  {i:2d}. {name}")

# Get a specific structure mask
ptv_mask = structures.get_structure_mask("PTV")
print(f"\nPTV mask:")
print(f"  Shape: {ptv_mask.shape}")
print(f"  Data type: {ptv_mask.dtype}")
print(f"  Unique values: {len(ptv_mask.unique())} (binary mask)")

# Check which structures are available
if structures.has_structure("BrainStem"):
    print("\n✓ BrainStem structure is available")
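
Structure names often vary in capitalization between datasets (for example `BrainStem` versus `Brainstem`). Whether `has_structure()` matches names case-sensitively is not verified here, so the sketch below resolves a requested name against the list of available names first. `resolve_structure_name` is a hypothetical helper, not part of DoseMetrics.

```python
def resolve_structure_name(requested, available):
    """Return the entry of `available` that equals `requested` ignoring case.

    Hypothetical helper, not part of DoseMetrics. Returns None when no
    name matches.
    """
    lookup = {name.lower(): name for name in available}
    return lookup.get(requested.lower())


# Synthetic example with a hand-written name list
names_demo = ["PTV", "BrainStem", "Chiasm"]
print(resolve_structure_name("brainstem", names_demo))  # BrainStem
print(resolve_structure_name("Cochlea", names_demo))    # None
```

In this notebook, the resolved name could then be passed to `structures.get_structure_mask(...)`, with `structures.structure_names` as the `available` list.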

## 4. Loading Individual Files

You can also load individual NIfTI files directly.

In [None]:
from dosemetrics import read_from_nifti
import nibabel as nib

# Load dose file with metadata
dose_file = subject_path / "Dose.nii.gz"
dose_nii = nib.load(dose_file)

# Get metadata
print("Dose file metadata:")
print(f"  Dimensions: {dose_nii.shape}")
print(f"  Voxel spacing (mm): {dose_nii.header.get_zooms()}")
print(f"  Data type: {dose_nii.get_data_dtype()}")

# Load as numpy array
dose_array = read_from_nifti(str(dose_file))
print(f"\nDose array:")
print(f"  Shape: {dose_array.shape}")
print(f"  Mean dose: {dose_array.mean():.2f} Gy")

## 5. Loading Multiple Files

Load all structure masks from a folder at once.

In [None]:
from dosemetrics.io import read_structures_from_folder

# Load all structure masks
structures_dict = read_structures_from_folder(subject_path)

print(f"Loaded {len(structures_dict)} structure masks:")
for name, mask in structures_dict.items():
    voxel_count = (mask > 0).sum()
    print(f"  {name:20s} - {voxel_count:6d} voxels")
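
Voxel counts only become comparable across scans once they are converted to physical volume. A minimal sketch of that conversion follows, assuming the voxel spacing is given in millimetres, as reported by nibabel's `header.get_zooms()` in the previous section. The function name `mask_volume_cc` is hypothetical.

```python
import numpy as np


def mask_volume_cc(mask, spacing_mm):
    """Volume of a binary mask in cubic centimetres.

    `spacing_mm` is the (x, y, z) voxel spacing in millimetres, e.g. the
    first three values of nibabel's `header.get_zooms()`.
    """
    voxel_count = int(np.count_nonzero(np.asarray(mask) > 0))
    voxel_volume_mm3 = float(np.prod(spacing_mm[:3]))
    return voxel_count * voxel_volume_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3


# Synthetic example: 10 voxels of 2 x 2 x 2.5 mm each (10 mm^3 per voxel)
mask_demo = np.zeros((5, 5, 5), dtype=np.uint8)
mask_demo[0, 0, :] = 1
mask_demo[1, 0, :] = 1
print(mask_volume_cc(mask_demo, (2.0, 2.0, 2.5)))  # 0.1
```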

## 6. Comparing Two Treatment Plans

Load data from two different plans for comparison. Here, the two time points of the `longitudinal` example serve as the two plans.

In [None]:
# Load first plan
plan1_path = data_path / "longitudinal" / "time_point_1"
dose1, structures1 = read_dose_and_mask_files(plan1_path)

# Load second plan
plan2_path = data_path / "longitudinal" / "time_point_2"
dose2, structures2 = read_dose_and_mask_files(plan2_path)

print("Plan 1:")
print(f"  Dose shape: {dose1.shape}")
print(f"  Max dose: {dose1.max():.2f} Gy")
print(f"  Structures: {len(structures1.structure_names)}")

print("\nPlan 2:")
print(f"  Dose shape: {dose2.shape}")
print(f"  Max dose: {dose2.max():.2f} Gy")
print(f"  Structures: {len(structures2.structure_names)}")

# Compare dose distributions (assumes both doses are sampled on the same voxel grid)
import numpy as np
dose_diff = dose2 - dose1
print(f"\nDose difference:")
print(f"  Mean: {np.mean(dose_diff):.2f} Gy")
print(f"  Std: {np.std(dose_diff):.2f} Gy")
print(f"  Range: [{dose_diff.min():.2f}, {dose_diff.max():.2f}] Gy")
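
Direct subtraction only makes sense when both dose arrays share the same voxel grid. Mismatched shapes either raise an error or, in unlucky cases, broadcast silently. The sketch below wraps the comparison in an explicit shape check. `dose_difference_stats` is a hypothetical helper, and it deliberately does not attempt any resampling.

```python
import numpy as np


def dose_difference_stats(dose_a, dose_b):
    """Voxel-wise statistics of dose_b - dose_a after checking the grids match."""
    dose_a = np.asarray(dose_a, dtype=np.float64)
    dose_b = np.asarray(dose_b, dtype=np.float64)
    if dose_a.shape != dose_b.shape:
        raise ValueError(
            f"Dose grids differ: {dose_a.shape} vs {dose_b.shape}; "
            "resample one onto the other before comparing"
        )
    diff = dose_b - dose_a
    return {
        "mean": float(diff.mean()),
        "std": float(diff.std()),
        "min": float(diff.min()),
        "max": float(diff.max()),
        "mean_abs": float(np.abs(diff).mean()),
    }


# Synthetic example: a uniform 1.5 Gy increase on a 2x2x2 grid
stats_demo = dose_difference_stats(np.zeros((2, 2, 2)), np.full((2, 2, 2), 1.5))
print(stats_demo["mean"], stats_demo["std"])  # 1.5 0.0
```

The mean absolute difference is included because positive and negative differences can cancel in the signed mean.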

## 7. Comparing Dose Distributions

You can also compare dose distributions between time points, both over the whole volume and within a single structure such as the PTV.

In [None]:
# Load dose from time point 2
time_point_2_path = data_path / "longitudinal" / "time_point_2"
dose2 = read_from_nifti(str(time_point_2_path / "Dose.nii.gz"))

print("Time Point 1 dose:")
print(f"  Mean: {dose.mean():.2f} Gy")
print(f"  Max: {dose.max():.2f} Gy")

print("\nTime Point 2 dose:")
print(f"  Mean: {dose2.mean():.2f} Gy")
print(f"  Max: {dose2.max():.2f} Gy")

# Compare in PTV region (the time point 1 mask is applied to both doses)
ptv_mask = structures.get_structure_mask("PTV")
ptv_mask_np = ptv_mask.numpy() > 0

dose1_in_ptv = dose[ptv_mask_np]
dose2_in_ptv = dose2[ptv_mask_np]

print("\nIn PTV region:")
print(f"  Time point 1 mean dose: {dose1_in_ptv.mean():.2f} Gy")
print(f"  Time point 2 mean dose: {dose2_in_ptv.mean():.2f} Gy")
print(f"  Difference: {abs(dose2_in_ptv.mean() - dose1_in_ptv.mean()):.2f} Gy")
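
The PTV comparison above extends naturally to every structure. The sketch below loops over a plain `dict` of masks and reports the mean dose inside each one for both dose arrays. The helper `mean_dose_per_structure` is hypothetical, and it assumes the masks and both doses share one voxel grid.

```python
import numpy as np


def mean_dose_per_structure(dose_a, dose_b, masks):
    """Return {name: (mean_a, mean_b)} for every non-empty mask.

    Hypothetical helper. `masks` maps structure name -> array on the same
    grid as both dose arrays.
    """
    results = {}
    for name, mask in masks.items():
        region = np.asarray(mask) > 0
        if not region.any():
            continue  # skip empty masks: the mean of no voxels is undefined
        results[name] = (float(dose_a[region].mean()), float(dose_b[region].mean()))
    return results


# Synthetic example: second dose is the first plus 2 Gy everywhere
dose_a_demo = np.arange(8, dtype=np.float64).reshape(2, 2, 2)
dose_b_demo = dose_a_demo + 2.0
ptv_demo = np.zeros((2, 2, 2), dtype=np.uint8)
ptv_demo[0] = 1  # first slice holds the values 0, 1, 2, 3
masks_demo = {"PTV": ptv_demo, "Empty": np.zeros((2, 2, 2), dtype=np.uint8)}
print(mean_dose_per_structure(dose_a_demo, dose_b_demo, masks_demo))
# {'PTV': (1.5, 3.5)}
```

Keeping both means, rather than only their absolute difference, preserves the direction of the change.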

## 8. Inspecting Data Structure

Explore what files are available in a subject folder.

In [None]:
def inspect_subject_folder(folder_path):
    """Display the structure of a subject folder."""
    print(f"Contents of {folder_path.name}:")
    print("\nDose files:")
    for f in sorted(folder_path.glob("*Dose*.nii.gz")):
        size_mb = f.stat().st_size / (1024 * 1024)
        print(f"  {f.name:30s} ({size_mb:.2f} MB)")
    
    print("\nStructure files:")
    for f in sorted(folder_path.glob("*.nii.gz")):
        if "Dose" not in f.name and "CT" not in f.name:
            size_mb = f.stat().st_size / (1024 * 1024)
            print(f"  {f.name:30s} ({size_mb:.2f} MB)")
    
    print("\nImaging files (synthetic):")
    for f in sorted(folder_path.glob("CT*.nii.gz")):
        size_mb = f.stat().st_size / (1024 * 1024)
        print(f"  {f.name:30s} ({size_mb:.2f} MB)")

inspect_subject_folder(subject_path)
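
Note that the name-based rules above treat any file whose name contains `CT` as imaging, which would also catch a structure file such as `CTV.nii.gz` if one were present. The sketch below is a variant that returns the grouping as a dictionary and matches imaging files by exact stem instead. `classify_subject_files` and its `imaging_stems` parameter are hypothetical, and the assumption that the CT file is named exactly `CT.nii.gz` is not confirmed by the dataset.

```python
from pathlib import Path


def classify_subject_files(folder_path, imaging_stems=("CT",)):
    """Group the *.nii.gz files in a folder into dose, imaging and structures.

    Hypothetical helper. A file counts as imaging only if its name without
    the ".nii.gz" suffix is listed in `imaging_stems` (an assumption).
    """
    groups = {"dose": [], "imaging": [], "structures": []}
    for f in sorted(Path(folder_path).glob("*.nii.gz")):
        stem = f.name[: -len(".nii.gz")]
        if "Dose" in stem:
            groups["dose"].append(f.name)
        elif stem in imaging_stems:
            groups["imaging"].append(f.name)
        else:
            groups["structures"].append(f.name)
    return groups
```

Returning a dictionary instead of printing makes the result easy to reuse, for example to loop over `groups["structures"]` when loading masks one by one.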

## Summary

In this notebook, you learned how to:

1. ✓ Download data from HuggingFace datasets
2. ✓ Load dose distributions and structure masks
3. ✓ Use the StructureSet API
4. ✓ Read individual NIfTI files
5. ✓ Load all structure masks from a folder
6. ✓ Compare two treatment plans
7. ✓ Compare dose distributions between time points
8. ✓ Inspect data structure

## Next Steps

- **Computing Metrics**: Learn how to compute DVHs, quality indices, and dose constraints
- **Exporting Results**: Generate reports, plots, and export data
- **API Documentation**: Explore the full [DoseMetrics API](https://contouraid.github.io/dosemetrics/api/)

## References

- [DoseMetrics Documentation](https://contouraid.github.io/dosemetrics/)
- [Dataset on HuggingFace](https://huggingface.co/datasets/contouraid/dosemetrics-data)
- [GitHub Repository](https://github.com/contouraid/dosemetrics)