# Example Notebook to run pydicer

This notebook provides a basic example of running the pydicer pipeline on some test data.

In [None]:
import sys
sys.path.insert(0, "..")

from pathlib import Path

from pydicer.input.test import TestInput
from pydicer.preprocess.data import PreprocessData
from pydicer.convert.data import ConvertData
from pydicer.visualise.data import VisualiseData
from pydicer.dataset.preparation import PrepareDataset

## Set up directories

First we'll set up some directories in which to fetch and convert our data. Change the `directory`
location to a folder on your system where you'd like to work with this data.

In [None]:

directory = Path("./data")
directory.mkdir(exist_ok=True, parents=True)

dicom_directory = directory.joinpath("dicom")
dicom_directory.mkdir(exist_ok=True, parents=True)

nifti_directory = directory.joinpath("nifti")
nifti_directory.mkdir(exist_ok=True, parents=True)

clean_directory = directory.joinpath("clean")

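If you plan to add more working folders, the repeated `joinpath`/`mkdir` pattern above can be wrapped in a small helper. The following is only a sketch: `make_working_dirs` is a hypothetical name that is not part of pydicer, and the demonstration runs in a temporary directory so it leaves nothing behind.

```python
from pathlib import Path
import tempfile


def make_working_dirs(root, names=("dicom", "nifti")):
    """Create `root` and one sub-directory per name; return them as a dict."""
    root = Path(root)
    created = {}
    for name in names:
        sub_dir = root.joinpath(name)
        # parents=True also creates `root`; exist_ok=True makes re-runs safe
        sub_dir.mkdir(exist_ok=True, parents=True)
        created[name] = sub_dir
    return created


# Demonstrate on a throwaway directory rather than the real working folder
with tempfile.TemporaryDirectory() as tmp:
    dirs = make_working_dirs(Path(tmp).joinpath("data"))
    layout_ok = all(d.is_dir() for d in dirs.values())
    print(sorted(dirs), layout_ok)
```

Because `exist_ok=True` is used, calling the helper twice on the same root does not raise an error, which is convenient when re-running notebook cells.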

## Fetch some data

pydicer provides a `TestInput` class that downloads some sample data to work with. Several other
input classes exist if you'd like to retrieve DICOM data for conversion from somewhere else; [see
the docs for information on how these work](https://australiancancerdatanetwork.github.io/pydicer/html/input.html).

In [None]:
test_input = TestInput(dicom_directory)
test_input.fetch_data()

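To confirm that a download actually put files on disk, you can count what is in a folder by file extension. This is a minimal sketch using only the standard library; `count_files_by_suffix` is a hypothetical helper, not a pydicer function, and the dummy files below (including the `.dcm` extension) are assumptions made so the example runs on its own.

```python
from pathlib import Path
import tempfile


def count_files_by_suffix(folder):
    """Map each lower-cased file suffix to the number of files found recursively."""
    counts = {}
    for path in Path(folder).rglob("*"):
        if path.is_file():
            suffix = path.suffix.lower()
            counts[suffix] = counts.get(suffix, 0) + 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    # Dummy, empty files standing in for downloaded data (names are made up)
    series_dir = Path(tmp).joinpath("patient_a", "series_1")
    series_dir.mkdir(parents=True)
    series_dir.joinpath("slice_1.dcm").touch()
    series_dir.joinpath("slice_2.dcm").touch()
    Path(tmp).joinpath("notes.json").touch()

    counts = count_files_by_suffix(tmp)
    print(counts)
```

In the notebook you could call `count_files_by_suffix(dicom_directory)` after fetching; an empty result would suggest the download did not complete.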
## Preprocess the data

Before pydicer converts the data, it first runs through all of it once to work out how the
objects are linked to each other and moves any invalid data to the quarantine folder.

In [None]:
preprocessed_data = PreprocessData(dicom_directory, nifti_directory)
preprocessed_result = preprocessed_data.preprocess()

## Convert the data

Next we convert all the DICOM data into Nifti format. Check out the nifti folder to see the files
arrive as they are converted!

A few other files are written alongside the Nifti files. The JSON files store the metadata from
the original DICOM so that you can use it later.

In [None]:
convert_data = ConvertData(preprocessed_result, output_directory=nifti_directory)
convert_data.convert()

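The JSON metadata can be read back with Python's standard `json` module. The sketch below assumes only that the metadata files end in `.json` and contain valid JSON; the file name and the `Modality` key in the demonstration are made up for illustration and may not match what pydicer actually writes. `load_json_files` is a hypothetical helper, not part of pydicer.

```python
import json
from pathlib import Path
import tempfile


def load_json_files(folder):
    """Load every *.json file under `folder`, keyed by path relative to `folder`."""
    folder = Path(folder)
    loaded = {}
    for json_path in sorted(folder.rglob("*.json")):
        with open(json_path, "r", encoding="utf-8") as f:
            loaded[json_path.relative_to(folder).as_posix()] = json.load(f)
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name and content, written here only so the sketch runs
    example = Path(tmp).joinpath("patient_a", "image.json")
    example.parent.mkdir(parents=True)
    example.write_text(json.dumps({"Modality": "CT"}), encoding="utf-8")

    metadata = load_json_files(tmp)
    print(metadata)
```

Pointing `load_json_files` at `nifti_directory` after conversion should give one entry per JSON file found, which you can then inspect to see which DICOM attributes are available.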
## Visualise the data

Nifti format is great, but it can be time-consuming to load each file into 3D Slicer or a
similar tool to look at it. This step therefore saves visualisations in PNG format alongside the
converted files, providing snapshots of the images, structures and dose.

In [None]:
visualise_data = VisualiseData(nifti_directory)
visualise_data.visualise()

## Prepare a dataset

Datasets extracted in DICOM format are often a bit messy and need some cleaning up after
conversion. Exactly which data objects to extract for the clean dataset differs by project, but
here we use a fairly common approach: extracting the latest Structure Set for each patient and
the image linked to it.


In [None]:
prepare_dataset = PrepareDataset(directory)
prepare_dataset.prepare("clean", "rt_latest_struct")
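
The selection rule described above (latest Structure Set per patient, plus the image it is linked to) can be illustrated in plain Python. This is a conceptual sketch only, not pydicer's implementation of `rt_latest_struct`: the record layout (`id`, `patient_id`, `modality`, `date`, `referenced_id`) and the function name `select_latest_struct` are assumptions made for the example.

```python
from datetime import date


def select_latest_struct(objects):
    """For each patient keep the newest structure set and the image it references."""
    latest = {}
    for obj in objects:
        if obj["modality"] != "RTSTRUCT":
            continue
        current = latest.get(obj["patient_id"])
        # Keep the structure set with the most recent date for this patient
        if current is None or obj["date"] > current["date"]:
            latest[obj["patient_id"]] = obj

    # Index the non-structure objects so linked images can be looked up by id
    images = {obj["id"]: obj for obj in objects if obj["modality"] != "RTSTRUCT"}

    selection = []
    for struct in latest.values():
        selection.append(struct)
        linked = images.get(struct["referenced_id"])
        if linked is not None:
            selection.append(linked)
    return selection


# Hypothetical records for one patient with two scans and two structure sets
objects = [
    {"id": "img1", "patient_id": "A", "modality": "CT",
     "date": date(2020, 1, 1), "referenced_id": None},
    {"id": "img2", "patient_id": "A", "modality": "CT",
     "date": date(2020, 6, 1), "referenced_id": None},
    {"id": "st1", "patient_id": "A", "modality": "RTSTRUCT",
     "date": date(2020, 1, 2), "referenced_id": "img1"},
    {"id": "st2", "patient_id": "A", "modality": "RTSTRUCT",
     "date": date(2020, 6, 2), "referenced_id": "img2"},
]

selected_ids = [obj["id"] for obj in select_latest_struct(objects)]
print(selected_ids)  # ['st2', 'img2']
```

The older structure set `st1` and its image `img1` are left out, which is the kind of pruning a "clean" dataset is meant to achieve.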