# DICOManager Tutorial

## Overview

DICOManager is designed to sort, reconstruct, process and convert DICOM files to numpy arrays for use in Machine Learning and Deep Learning.

## Sorting

DICOManager begins by sorting DICOMs into a file tree with the following hierarchy:

1. Cohort
2. Patient
3. Study
4. Frame Of Reference
5. Series 
6. Modality
7. DICOM file

File tree construction is automatic and can be invoked at any level using `groupings.<type>(files=<list_of_files>)`.

For example:


In [5]:
from glob import glob
import groupings

# Collect every DICOM file in the directory and sort it into a cohort tree
files = glob('/path/to/dicoms/*.dcm')
cohort = groupings.Cohort(name='MyCohort', files=files)
print(cohort)

Cohort: MyCohort
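
The hierarchy above can be pictured as nested grouping on header fields. The following is only a minimal sketch of that idea, not DICOManager's implementation. It assumes each file has already been reduced to a small dict of header values (the `records` list and the `build_tree` helper are hypothetical) and groups on the standard DICOM attribute keywords `PatientID`, `StudyInstanceUID`, `FrameOfReferenceUID`, `SeriesInstanceUID` and `Modality`.

```python
# Grouping keys, ordered from the top of the tree to the bottom
LEVELS = ['PatientID', 'StudyInstanceUID', 'FrameOfReferenceUID',
          'SeriesInstanceUID', 'Modality']


def build_tree(records, levels=LEVELS):
    """Nest records into dicts keyed by each level; leaves are lists of file names."""
    tree = {}
    for rec in records:
        node = tree
        for key in levels[:-1]:
            node = node.setdefault(rec[key], {})
        node.setdefault(rec[levels[-1]], []).append(rec['filename'])
    return tree


# Hypothetical header records; real values would be read from each DICOM file
shared = {'PatientID': 'P1', 'StudyInstanceUID': 'S1', 'FrameOfReferenceUID': 'F1'}
records = [
    {**shared, 'filename': 'ct_001.dcm', 'SeriesInstanceUID': 'SE1', 'Modality': 'CT'},
    {**shared, 'filename': 'ct_002.dcm', 'SeriesInstanceUID': 'SE1', 'Modality': 'CT'},
    {**shared, 'filename': 'rs_001.dcm', 'SeriesInstanceUID': 'SE2', 'Modality': 'RTSTRUCT'},
]

tree = build_tree(records)
print(tree['P1']['S1']['F1']['SE1']['CT'])
```

Both CT slices land in the same series leaf, while the RTSTRUCT shares the frame of reference but sits in its own series.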


At each grouping level, the following basic functions can be used to alter the organization of the file tree, with `tree` referring to the current tree and `->` indicating the returned object:
- `tree.merge(other) -> None`: Merge two trees 
- `tree.steal(other) -> None`: Moves a child of one tree to another tree
- `tree.append(other) -> None`: Appends a node to another tree 
- `tree.prune(childname) -> None`: Removes a child from the tree 
- `tree.adopt(child) -> None`: Adopts a child to the current tree
- `tree.flatten() -> None`: Restricts each parent to having one child within the tree 
- `tree.pop() -> <child type>`: Removes the first child from the tree and returns it 
- `save_dicoms(filepath) -> None`: Saves only the dicoms from the tree 
- `save_volumes(filepath) -> None`: Saves only the reconstructed volumes from the tree 
- `only_dicoms() -> bool`: Returns true if tree only contains dicoms
- `has_dicoms() -> bool`: Returns true if tree contains dicoms
- `has_volumes() -> bool`: Returns true if tree contains volumes
- `iter_modalities() -> iterator`: Returns iterator of modalities within the tree
- `iter_frames() -> iterator`: Returns iterator of frames of reference within the tree
- `iter_dicoms() -> iterator`: Returns iterator of lists of dicoms for each series
- `iter_volumes() -> iterator`: Returns iterator of each volume within the tree
- `iter_volumes_frames() -> iterator`: Returns an iterator of all volumes within each frame of reference
- `clear_dicoms() -> None`: Removes all dicoms from a tree
- `clear_volumes() -> None`: Removes all volumes from a tree
- `split_tree() -> tuple`: Splits the tree into a dicom only and volume only tree
- `split_dicoms() -> tree`: Returns only a tree of dicoms, removes dicoms from source tree
- `split_volumes() -> tree`: Returns only a tree of volumes, removes volumes from source tree
- `volumes_to_pointers() -> None`: Converts all volumes to pointers, writing the arrays to disk
- `pointers_to_volumes() -> None`: Converts all pointers to volumes, loading the arrays into memory
- `recon(in_memory=False, parallelize=True) -> None`: Reconstructs all DICOMs within the tree, either loading the volumes into memory or writing them to disk at `~/`

Examples of a few of these functions are below:

In [None]:
# Checking if the cohort contains dicoms
print(cohort.has_dicoms())





In [None]:
# Flatten all patients
for patient in cohort:
    patient.flatten()



In [None]:
# Adopting the first 3 patients into a new cohort
new_cohort = groupings.Cohort(name='NewCohort', files=None)

for _ in range(3):
    patient = cohort.pop()
    new_cohort.adopt(patient)

print(cohort)
print(new_cohort)
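
To make the semantics of `pop`, `adopt` and `prune` concrete, the sketch below models them on a toy tree. The `Node` class is a hypothetical stand-in written for this illustration and is not DICOManager's grouping class; it only mirrors the behavior described in the list above.

```python
class Node:
    """Toy tree node standing in for a grouping level."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def adopt(self, child):
        # Detach the child from any previous parent, then attach it here
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)

    def pop(self):
        # Remove the first child from this node and return it
        child = self.children.pop(0)
        child.parent = None
        return child

    def prune(self, childname):
        # Drop every child whose name matches
        self.children = [c for c in self.children if c.name != childname]


source = Node('MyCohort')
for name in ['P1', 'P2', 'P3', 'P4']:
    source.adopt(Node(name))

# Move the first three patients into a new tree
target = Node('NewCohort')
for _ in range(3):
    target.adopt(source.pop())

print([c.name for c in source.children])
print([c.name for c in target.children])
```

After the loop, the source tree holds only the last patient and the target tree holds the first three in their original order.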

## Reconstruction
Reconstructing DICOMs into numpy arrays suitable for AI can be time-consuming and tedious. DICOManager quickly converts DICOMs into volumes using parallelized processes. The simplest way to reconstruct a patient or cohort is the `.recon()` function, which supports CT, MR, PET, NM, RTSTRUCT and RTDOSE files. By default, the reconstructed volumes are written to disk and a pointer indicating each volume's location is placed within the tree. With `.recon(in_memory=True)`, the volumes are stored in the tree instead. This mode is slower, does not allow parallelization and can quickly consume an entire system's memory, but it is ideal for single-patient inference or for systems where read-write access to disk is restricted. In applications where parallelization across all CPU cores is not desired, `.recon(parallelize=False)` can be used, at the expense of reconstruction runtime.

An example of reconstruction of our cohort is:

In [None]:
new_cohort.recon()
print(new_cohort)

new_cohort.pointers_to_volumes()
print(new_cohort)
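
The trade-off between pointers and in-memory volumes can be sketched with plain numpy. This is an illustration only, assuming a pointer is simply a path to a `.npy` file; the helper names `volume_to_pointer` and `pointer_to_volume` are hypothetical and DICOManager's actual on-disk format may differ.

```python
import tempfile
from pathlib import Path

import numpy as np


def volume_to_pointer(volume, directory, name):
    """Write the array to disk and return the path that stands in for it."""
    path = Path(directory) / f'{name}.npy'
    np.save(path, volume)
    return path


def pointer_to_volume(path):
    """Load the array referenced by a pointer back into memory."""
    return np.load(path)


with tempfile.TemporaryDirectory() as tmp:
    volume = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
    pointer = volume_to_pointer(volume, tmp, 'ct_volume')  # only the path stays in memory
    restored = pointer_to_volume(pointer)                  # full array is loaded again

print(pointer.name, restored.shape)
```

Keeping only the path in the tree bounds memory use by the number of volumes loaded at once, which is why pointer-based storage suits large cohorts.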

## Image Manipulation
Image manipulation of reconstructed volumes is currently limited to volumes held in memory. Each manipulation is recorded in the header of the ReconstructedFile or ReconstructedVolume object. The image manipulation functions, within `dicomanager.processing.tools`, are:

- `dose_max_points(dose_array, dose_coords) -> np.ndarray`: Returns the location of the maximum of the dose array, as an index or as coordinates
- `window_level(img, window, level) -> ReconstructedVolume`: Applies a window and level to the ReconstructedVolume array
- `normalize(img) -> ReconstructedVolume`: Normalizes the ReconstructedVolume
- `standardize(img) -> ReconstructedVolume`: Standardizes the ReconstructedVolume
- `crop(img, centroid, crop_size) -> ReconstructedVolume`: Crops the ReconstructedVolume around the centroid to dimensions of crop_size. The crop is offset from the centroid where needed to maintain the crop_size dimensions. An option to pad with zeros instead is planned for a later revision of the function.
- `resample(img, ratio) -> ReconstructedVolume`: Resamples the image by the ratio provided
- `bias_field_correction(img) -> ReconstructedVolume`: Uses N4BiasFieldCorrection from SimpleITK
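
The difference between normalizing and standardizing is easy to blur, so here is a minimal sketch on a bare numpy array. It assumes the conventional meanings (min-max scaling to [0, 1] and z-scoring to zero mean and unit standard deviation); the exact behavior of the library functions is not confirmed here, and the helper names `minmax_normalize` and `zscore_standardize` are hypothetical.

```python
import numpy as np


def minmax_normalize(array):
    """Min-max scale to [0, 1]; a constant array maps to zeros."""
    array = array.astype(np.float64)
    span = array.max() - array.min()
    if span == 0:
        return np.zeros_like(array)
    return (array - array.min()) / span


def zscore_standardize(array):
    """Shift to zero mean and scale to unit standard deviation."""
    array = array.astype(np.float64)
    std = array.std()
    if std == 0:
        return np.zeros_like(array)
    return (array - array.mean()) / std


img = np.array([[0.0, 50.0], [100.0, 200.0]])
norm = minmax_normalize(img)
stand = zscore_standardize(img)
print(norm.min(), norm.max())
print(abs(stand.mean()) < 1e-9, abs(stand.std() - 1.0) < 1e-9)
```

Min-max scaling is sensitive to outliers such as metal artifacts, so it is often applied after windowing, whereas z-scoring depends on the intensity distribution of the whole volume.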

Example of window-leveling and cropping:


In [None]:
from processing import tools

# new_cohort holds the reconstructed volumes in memory after pointers_to_volumes()
for vol in new_cohort.iter_volumes():
    vol = tools.window_level(vol, window=400, level=800)
    vol = tools.crop(vol, centroid=(100, 100, 50), crop_size=(250, 250, 50))
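
What the two operations do to the underlying array can be sketched with plain numpy. This is an assumption-laden illustration rather than the library code: `window_level` here clips intensities to `[level - window/2, level + window/2]`, and the hypothetical `crop_around` helper shifts the crop box back inside the array when the centroid is too close to an edge, which is one way to keep the requested `crop_size`.

```python
import numpy as np


def window_level(array, window, level):
    """Clip intensities to [level - window/2, level + window/2]."""
    low, high = level - window / 2, level + window / 2
    return np.clip(array, low, high)


def crop_around(array, centroid, crop_size):
    """Crop to crop_size around centroid, shifting the box to stay in bounds."""
    slices = []
    for dim, center, size in zip(array.shape, centroid, crop_size):
        if size > dim:
            raise ValueError('crop_size exceeds the array along one axis')
        start = center - size // 2
        start = max(0, min(start, dim - size))  # shift the box instead of padding
        slices.append(slice(start, start + size))
    return array[tuple(slices)]


volume = np.arange(10 * 10 * 4).reshape(10, 10, 4)
windowed = window_level(volume, window=100, level=200)
# The centroid sits near the edge of axis 0, so the box is offset to start at 0
cropped = crop_around(windowed, centroid=(1, 5, 2), crop_size=(6, 6, 4))
print(windowed.min(), windowed.max(), cropped.shape)
```

Shifting the box keeps every output the same shape, which matters when volumes are later stacked into batches.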

## Deconstruction (Numpy to RTSTRUCT)
In some instances, particularly inference, conversion from a boolean numpy array to an RTSTRUCT is desired. For this to be possible, the associated image DICOM files must be contained within the tree. Deconstruction is only possible at the Frame Of Reference level or lower, because the RTSTRUCT must match the dimensions of a single CT group. If a Frame Of Reference contains more than one CT, deconstruction is not currently supported.

Deconstruction can occur as:
- `from_ct()`: Creates a new RTSTRUCT from CT files, based upon the CT header
- `from_rt()`: Creates a new RTSTRUCT from an existing RTSTRUCT file
- `to_rt()`: Appends a segmentation to an existing RTSTRUCT file

An example of deconstruction is provided below:

In [None]:
import numpy as np

frame = next(iter(new_cohort.iter_frames()))  # One frame of reference from the cohort
vol = next(iter(frame.iter_volumes()))  # One volume from the frame of reference
rtstruct = np.zeros((1, *vol.shape), dtype=bool)  # Example boolean mask for demonstration

print(f'before: {frame}')
frame.decon.from_ct(rtstruct)
print(f'after: {frame}')
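
One reason the image DICOMs must be present is that RTSTRUCT contours are stored in patient coordinates (mm), so mask indices have to be mapped through the image geometry. The sketch below shows that mapping for the simplest case only. It assumes an axial slice with `ImageOrientationPatient` equal to `(1, 0, 0, 0, 1, 0)`, uses hypothetical header values, and the helper `index_to_patient_xy` is not part of DICOManager.

```python
def index_to_patient_xy(row, col, image_position, pixel_spacing):
    """Map a (row, col) pixel index to patient x, y in mm.

    Assumes an axial slice with ImageOrientationPatient = (1, 0, 0, 0, 1, 0).
    """
    # PixelSpacing is ordered [spacing between rows, spacing between columns]
    row_spacing, col_spacing = pixel_spacing
    x = image_position[0] + col * col_spacing
    y = image_position[1] + row * row_spacing
    return x, y


# Hypothetical header values for a single CT slice
image_position = (-250.0, -250.0, 40.0)  # ImagePositionPatient
pixel_spacing = (1.0, 0.5)               # PixelSpacing

# Corners of a small rectangular mask region, as (row, col) indices
corners = [(10, 20), (10, 30), (15, 30), (15, 20)]
contour = [
    index_to_patient_xy(r, c, image_position, pixel_spacing) + (image_position[2],)
    for r, c in corners
]
print(contour[0])
```

A full deconstruction would also need to trace mask boundaries on every slice and handle arbitrary orientations, which this sketch leaves out.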