# Auto-Segmentation Inference using TotalSegmentator

This example demonstrates how to apply the open-source TotalSegmentator model to data that has been prepared using PyDicer.

In [None]:
try:
    from pydicer import PyDicer
except ImportError:
    ! pip install pydicer TotalSegmentator
    from pydicer import PyDicer


from pydicer.utils import fetch_converted_test_data

from pydicer.generate.segmentation import read_all_segmentation_logs
from pydicer.generate.models import run_total_segmentator

from pydicer.analyse.compare import (
    compute_contour_similarity_metrics,
    get_all_similarity_metrics_for_dataset,
)

## Setup PyDicer

For this example, we will use the HNSCC test data, which has already been converted using PyDicer.
We initialise our PyDicer object and prepare a subset of this data so that the segmentation runs
on only one CT image per patient.

In [None]:
working_directory = fetch_converted_test_data("./testdata_hnscc", dataset="HNSCC")
pydicer = PyDicer(working_directory)

# Name of the subset of data on which the auto-segmentation will be run
dataset_name = "totalset"
pydicer.dataset.prepare(dataset_name, "rt_latest_struct")

## Run Auto-segmentation

The [segment_dataset](https://australiancancerdatanetwork.github.io/pydicer/generate.html#pydicer.generate.segmentation.segment_dataset) function runs over all images in our dataset and passes each image to
a function we define for segmentation (here, `run_total_segmentator`). We pass in our `dataset_name`
so that only the images in this dataset will be segmented.

In [None]:
segment_id = "totalseg" # Used to generate the ID of the resulting auto-segmented structure sets

pydicer.segment_dataset(segment_id, run_total_segmentator, dataset_name=dataset_name)
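
The callback pattern described above can be sketched in plain Python. This is not PyDicer's implementation: `toy_segment_function`, `run_over_images` and the nested-list "images" are hypothetical stand-ins, and the idea that a segmentation function returns a dictionary mapping structure names to masks is an assumption that should be checked against the PyDicer documentation.

```python
def toy_segment_function(image):
    """Hypothetical segmentation function: threshold the image and return {name: mask}."""
    threshold = 100
    mask = [[1 if voxel > threshold else 0 for voxel in row] for row in image]
    return {"HighIntensity": mask}


def run_over_images(images, segment_function):
    """Hypothetical dataset loop: call the function on each image and record the outcome."""
    results = {}
    for image_id, image in images.items():
        try:
            structures = segment_function(image)
            results[image_id] = {"success_flag": True, "structures": structures}
        except Exception as error:
            # A failure on one image is recorded and does not stop the loop
            results[image_id] = {"success_flag": False, "error": str(error)}
    return results


toy_images = {
    "img_a": [[50, 150], [200, 20]],
    "img_b": None,  # a broken image, to show a failure being recorded
}
results = run_over_images(toy_images, toy_segment_function)
print(results["img_a"]["structures"])  # {'HighIntensity': [[0, 1], [1, 0]]}
print(results["img_b"]["success_flag"])  # False
```

Recording a success flag per image in this way is what makes it possible to inspect failures afterwards, rather than losing the whole run to one bad image.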

We can use PyDicer's [visualisation module](https://australiancancerdatanetwork.github.io/pydicer/_examples/VisualiseData.html) to produce snapshots of the resulting auto-segmentations.

In [None]:
pydicer.visualise.visualise()

## Read Segmentation Logs

After running the auto-segmentation across the dataset, we can fetch the logs to confirm that
everything went well using the [read_all_segmentation_logs](https://australiancancerdatanetwork.github.io/pydicer/generate.html#pydicer.generate.segmentation.read_all_segmentation_logs) function.
The logs also let us inspect the runtime of the segmentation. If something went wrong, we can
use them to help debug the issue.

In [None]:
# Read the segmentation log DataFrame
df_logs = read_all_segmentation_logs(working_directory)
df_logs

In [None]:
# Use some Pandas magic to produce some stats on the segmentation runtime
df_success = df_logs[df_logs.success_flag]
agg_stats = ["mean", "std", "max", "min", "count"]
df_success[["segment_id", "total_time_seconds"]].groupby("segment_id").agg(agg_stats)
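
To see what this filtering and aggregation does without running a model, here is a minimal sketch on a toy DataFrame. The column names `segment_id`, `success_flag` and `total_time_seconds` are taken from the cell above; the values and the `*_toy` variable names are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the log DataFrame (values are invented)
df_logs_toy = pd.DataFrame(
    {
        "segment_id": ["totalseg", "totalseg", "totalseg"],
        "success_flag": [True, True, False],
        "total_time_seconds": [100.0, 140.0, 5.0],
    }
)

# Keep only the successful runs, then summarise the runtime per segment_id
df_success_toy = df_logs_toy[df_logs_toy.success_flag]
runtime_stats = df_success_toy.groupby("segment_id")["total_time_seconds"].agg(
    ["mean", "max", "min", "count"]
)
print(runtime_stats)

# The complementary filter lists the failed runs that need investigating
df_failed_toy = df_logs_toy[~df_logs_toy.success_flag]
print(len(df_failed_toy))  # 1
```

Filtering on the success flag first keeps short, failed runs from skewing the runtime statistics.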

## Auto-segmentation Analysis

Now that our auto-segmentation model has been run, we can compare these
structures to the manual structures available on this dataset. PyDicer provides functionality to
compute similarity metrics, but we must first prepare a DataFrame containing our auto structure
sets (`df_target`) and a separate DataFrame with our manual structure sets (`df_reference`).

In [None]:
df = pydicer.read_converted_data(dataset_name=dataset_name)
df_structs = df[df.modality == "RTSTRUCT"]

# Auto-segmented structure sets have a hashed_uid prefixed with the segment_id
auto_prefix = f"{segment_id}_"
is_auto = df_structs.hashed_uid.str.startswith(auto_prefix)

df_reference = df_structs[~is_auto]
df_target = df_structs[is_auto]

In [None]:
df_reference

In [None]:
df_target

### Compute Similarity

We use the [compute_contour_similarity_metrics](https://australiancancerdatanetwork.github.io/pydicer/analyse.html#pydicer.analyse.compare.compute_contour_similarity_metrics) function to compute the metrics comparing our
target structures to our reference structures.

We can specify which metrics we want to compute. In this example, we compute the Dice Similarity
Coefficient (DSC), Hausdorff Distance, Mean Surface Distance and the Surface DSC.

> Structure names must match exactly, so we use a [structure name mapping](https://australiancancerdatanetwork.github.io/pydicer/_examples/WorkingWithStructures.html#Add-Structure-Name-Mapping) to standardise our
> structure names prior to computing the similarity metrics.
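
As a rough illustration of what a structure name mapping achieves, the sketch below resolves name variants to a single standard name. The dictionary has the same shape as the mapping used in this example (standard name to list of variants), but `standardise_name` is a hypothetical helper and not how PyDicer applies the mapping internally.

```python
# Same shape as the mapping used in this example: standard name -> list of variants
example_mapping = {"SpinalCord": ["spinal_cord", "cord", "Cord"]}


def standardise_name(name, mapping):
    """Hypothetical helper: return the standard name, or the name unchanged if unmapped."""
    for standard_name, variants in mapping.items():
        if name == standard_name or name in variants:
            return standard_name
    return name


print(standardise_name("spinal_cord", example_mapping))  # SpinalCord
print(standardise_name("Cord", example_mapping))  # SpinalCord
print(standardise_name("Brainstem", example_mapping))  # Brainstem
```

With the variants resolved, the `spinal_cord` structure from the auto-segmentation and a manually contoured `Cord` are treated as the same structure and can be compared.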

In [None]:
# Add our structure name mapping
mapping_id = "totalseg_hn"
mapping = {
    "SpinalCord": ["spinal_cord", "cord", "Cord"],
}
pydicer.add_structure_name_mapping(mapping, mapping_id)

# Specify the metrics we want to compute
compute_metrics = ["DSC", "hausdorffDistance", "meanSurfaceDistance", "surfaceDSC"]

# Compute the similarity metrics
compute_contour_similarity_metrics(
    df_target,
    df_reference,
    segment_id,
    compute_metrics=compute_metrics,
    mapping_id=mapping_id
)
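
For intuition about the DSC, it is defined as twice the overlap between two masks divided by the sum of their sizes, so 1 means perfect agreement and 0 means no overlap. The sketch below computes it for two small sets of voxel indices. The `dice_similarity` helper and the toy voxel sets are illustrative only, and returning NaN for two empty masks is a choice made for this sketch rather than documented PyDicer behaviour.

```python
def dice_similarity(mask_a, mask_b):
    """DSC = 2 * |A intersect B| / (|A| + |B|) for two sets of voxel indices."""
    if not mask_a and not mask_b:
        return float("nan")  # undefined when both masks are empty
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


# Toy 2D masks expressed as sets of (row, column) voxel indices
manual_voxels = {(0, 0), (0, 1), (1, 0), (1, 1)}
auto_voxels = {(0, 1), (0, 2), (1, 1), (1, 2)}

print(dice_similarity(manual_voxels, auto_voxels))  # 0.5
print(dice_similarity(manual_voxels, manual_voxels))  # 1.0
```

Here the two masks share 2 of their 4 voxels each, giving 2 * 2 / (4 + 4) = 0.5.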

### Fetch the similarity metrics

Here we fetch the computed metrics and output some summary statistics. Note that if a segmentation
fails, the surface metrics will be NaN and will be excluded from these statistics.

In [None]:
# Fetch the similarity metrics
df_metrics = get_all_similarity_metrics_for_dataset(
    working_directory,
    dataset_name=dataset_name,
    structure_mapping_id=mapping_id
)

# Aggregate the stats using Pandas
df_metrics[["segment_id", "structure", "metric", "value"]].groupby(
    ["segment_id", "structure", "metric"]
).agg(["mean", "std", "min", "max", "count"])
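
The exclusion of NaN values from the statistics can be checked on a toy DataFrame. The column names match those used in the cell above, while the values and the `*_toy` names are invented for illustration. Pandas skips NaN when computing `mean`, and `count` reports only the non-NaN entries.

```python
import numpy as np
import pandas as pd

# Toy metrics table: one Hausdorff distance is NaN, as if that comparison had failed
df_metrics_toy = pd.DataFrame(
    {
        "segment_id": ["totalseg"] * 4,
        "structure": ["SpinalCord"] * 4,
        "metric": ["DSC", "DSC", "hausdorffDistance", "hausdorffDistance"],
        "value": [0.75, 0.875, 4.0, np.nan],
    }
)

stats_toy = df_metrics_toy.groupby(["segment_id", "structure", "metric"])["value"].agg(
    ["mean", "count"]
)
print(stats_toy)
```

Comparing the `count` column against the number of cases in the dataset is a quick way to spot metrics that are missing for some patients.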
