# 1. Download Decathlon Dataset with MONAI APIs

The [Medical Decathlon](http://medicaldecathlon.com/) dataset is publicly available and can be downloaded manually.
This notebook uses the MONAI API to download any of the Decathlon data sets to your data folder.
You can then use this data to train models using the SDK.
Medical Decathlon has multiple data sets for different organs:
Liver Tumours, Brain Tumours, Hippocampus, Lung Tumours, 
Prostate, Cardiac, Pancreas Tumour, Colon Cancer, Hepatic Vessels, Spleen
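
The task identifiers used by the download cells in this notebook can be kept in one place. The sketch below is only a convenience mapping; `DECATHLON_TASKS` and `task_for` are illustrative names and not part of MONAI.

```python
# Task identifiers as they appear in the download cells of this notebook.
DECATHLON_TASKS = {
    "Brain Tumours": "Task01_BrainTumour",
    "Cardiac": "Task02_Heart",
    "Liver Tumours": "Task03_Liver",
    "Hippocampus": "Task04_Hippocampus",
    "Prostate": "Task05_Prostate",
    "Lung Tumours": "Task06_Lung",
    "Pancreas Tumour": "Task07_Pancreas",
    "Hepatic Vessels": "Task08_HepaticVessel",
    "Spleen": "Task09_Spleen",
    "Colon Cancer": "Task10_Colon",
}


def task_for(name):
    """Return the task identifier for a data set name (case-insensitive)."""
    lookup = {key.lower(): value for key, value in DECATHLON_TASKS.items()}
    try:
        return lookup[name.lower()]
    except KeyError:
        raise ValueError(
            f"Unknown data set {name!r}; choose from {sorted(DECATHLON_TASKS)}"
        )


print(task_for("spleen"))  # Task09_Spleen
```

The returned string can be passed as the `task` argument in any of the download cells.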

In [1]:
DataRoot="/home/msskzx/deepc_fl/spleen_dataset"
%cd $DataRoot
!pwd


/home/msskzx/deepc_fl/spleen_dataset
/home/msskzx/deepc_fl/spleen_dataset
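
The `%cd` magic above fails if the data folder does not exist yet. A minimal sketch for creating it first is shown below; `ensure_data_root` is a hypothetical helper name, and the demonstration uses a temporary folder instead of the real `DataRoot`.

```python
import tempfile
from pathlib import Path


def ensure_data_root(path):
    """Create the data folder (and parents) if missing; return it as an absolute Path."""
    root = Path(path).expanduser().resolve()
    root.mkdir(parents=True, exist_ok=True)
    return root


# Demonstration in a temporary location; in the notebook you would pass DataRoot.
with tempfile.TemporaryDirectory() as tmp:
    root = ensure_data_root(Path(tmp) / "spleen_dataset")
    print(root.is_dir())  # True
```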


In [2]:
import monai
monai.config.print_config()


MONAI version: 0.5.3
Numpy version: 1.17.4
Pytorch version: 1.9.0+cu102
MONAI flags: HAS_EXT = False, USE_COMPILED = False
MONAI rev id: d78c669c67e38ddfbe572f6a0438e9df0b8c65d7

Optional dependencies:
Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: NOT INSTALLED or UNKNOWN VERSION.
scikit-image version: NOT INSTALLED or UNKNOWN VERSION.
Pillow version: 7.0.0
Tensorboard version: NOT INSTALLED or UNKNOWN VERSION.
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: NOT INSTALLED or UNKNOWN VERSION.
ITK version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.61.2.dev9+g4735e81
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: NOT INSTALLED or UNKNOWN VERSION.

For details about installing the optional dependencies, please visit:
    https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies
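
The output above reports Nibabel and ITK as not installed. MONAI reads `.nii.gz` volumes through one of these readers, so loading the downloaded images is likely to fail until one of them is available. The sketch below checks which optional packages can be imported; the import names in the list are assumptions based on the usual package names.

```python
import importlib.util


def missing_packages(names):
    """Return the module names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names (assumed) for some of the packages reported as not installed above.
optional = ["nibabel", "itk", "skimage", "tensorboard", "gdown"]
print("Not importable:", missing_packages(optional))
```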



## 1.1.1. Brain Tumor Data Set
This dataset has the following details:
- Target: Glioma segmentation: necrotic/active tumour and oedema
- Modality: Multimodal multisite MRI data (FLAIR, T1w, T1gd, T2w)
- Size: 750 4D volumes (484 Training + 266 Testing)
- Source: BRATS 2016 and 2017 datasets.
- Challenge: Complex and heterogeneously-located targets

In [None]:
myTask='Task01_BrainTumour'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)


## 1.1.2. Heart Data Set
This dataset has the following details:
- Target: Left Atrium
- Modality: Mono-modal MRI  
- Size: 30 3D volumes (20 Training + 10 Testing)
- Source: King’s College London
- Challenge: Small training dataset with large variability

In [None]:
myTask='Task02_Heart'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)


## 1.1.3. Liver Data Set
This dataset has the following details:
- Target: Liver and tumour
- Modality: Portal venous phase CT
- Size: 201 3D volumes (131 Training + 70 Testing)
- Source: IRCAD Hôpitaux Universitaires
- Challenge: Label imbalance with a large (liver) and small (tumour) target

In [None]:
myTask="Task03_Liver" # will take a long time (3+ hours)
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)


## 1.1.4. Hippocampus Data Set
This dataset has the following details:
- Target: Hippocampus head and body
- Modality: Mono-modal MRI 
- Size: 394 3D volumes (263 Training + 131 Testing)
- Source: Vanderbilt University Medical Center
- Challenge: Segmenting two neighbouring small structures with high precision 


In [None]:
myTask='Task04_Hippocampus'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)


## 1.1.5. Prostate Data Set
This dataset has the following details:
- Target: Prostate central gland and peripheral zone 
- Modality: Multimodal MR (T2, ADC)
- Size: 48 4D volumes (32 Training + 16 Testing)
- Source: Radboud University, Nijmegen Medical Centre
- Challenge: Segmenting two adjoining regions with large inter-subject variations

In [None]:
myTask='Task05_Prostate'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)


## 1.1.6. Lung Data Set
This dataset has the following details:
- Target: Lung and tumours
- Modality: CT
- Size: 96 3D volumes (64 Training + 32 Testing)
- Source: The Cancer Imaging Archive
- Challenge: Segmentation of a small target (cancer) in a large image

In [None]:
myTask='Task06_Lung'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)


## 1.1.7. Pancreas Data Set
This dataset has the following details:
- Target: Pancreas and tumour
- Modality: Portal venous phase CT
- Size: 420 3D volumes (282 Training + 139 Testing)
- Source: Memorial Sloan Kettering Cancer Center
- Challenge: Label imbalance with large (background), medium (pancreas) and small (tumour) structures.

In [None]:
myTask='Task07_Pancreas'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)


## 1.1.8. Hepatic Vessel Data Set
This dataset has the following details:
- Target: Hepatic vessels and tumour
- Modality: CT
- Size: 443 3D volumes (303 Training + 140 Testing)
- Source: Memorial Sloan Kettering Cancer Center
- Challenge: Tubular small structures next to heterogeneous tumour. 

In [None]:
myTask='Task08_HepaticVessel'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)


## 1.1.9. Spleen Data Set
This dataset has the following details:
- Target: Spleen
- Modality: CT  
- Size: 61 3D volumes (41 Training + 20 Testing)
- Source: Memorial Sloan Kettering Cancer Center
- Challenge: Large ranging foreground size

In [3]:
myTask= 'Task09_Spleen'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)

Task09_Spleen.tar:  42%|████▏     | 645M/1.50G [1:23:08<1:45:47, 147kB/s]

## 1.1.10. Colon Data Set
This dataset has the following details:
- Target: Colon Cancer Primaries
- Modality: CT  
- Size: 190 3D volumes (126 Training + 64 Testing)
- Source: Memorial Sloan Kettering Cancer Center
- Challenge: Heterogeneous appearance

In [None]:
myTask='Task10_Colon'
dataset = monai.apps.DecathlonDataset(root_dir=DataRoot, task=myTask, section="training", download=True)
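
The ten download cells above differ only in the task name, so several tasks can be fetched in one loop. The sketch below wraps the same `monai.apps.DecathlonDataset` call used in those cells; `download_tasks` and its `make_dataset` hook are hypothetical names introduced here so the loop can be tried without downloading anything.

```python
def download_tasks(tasks, root_dir, make_dataset=None):
    """Download the training section of each task and return {task: dataset}.

    make_dataset is a hypothetical hook for testing; by default it wraps
    monai.apps.DecathlonDataset with the same arguments used in the cells above.
    """
    if make_dataset is None:
        import monai  # imported lazily so the helper can be defined without MONAI

        def make_dataset(root, task):
            return monai.apps.DecathlonDataset(
                root_dir=root, task=task, section="training", download=True
            )

    datasets = {}
    for task in tasks:
        print("Fetching", task)
        datasets[task] = make_dataset(root_dir, task)
    return datasets


# Example call (commented out because it downloads several gigabytes):
# datasets = download_tasks(["Task02_Heart", "Task09_Spleen"], DataRoot)
```

Since `DecathlonDataset` caches loaded volumes in memory by default, keeping several data sets alive at once may use a lot of RAM; for large tasks it may be safer to process one at a time.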


## 1.2 Remove unneeded tar files 

In [None]:
! rm $DataRoot/*.tar
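
Once the tar archives are removed, re-running a download cell with `download=True` may fetch the archive again. One way to avoid that is to set the flag based on whether the extracted folder is already in place. The sketch below assumes each extracted task folder contains a `dataset.json` file, as in the standard Decathlon layout; `needs_download` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def needs_download(root_dir, task):
    """Return True when the extracted task folder (with dataset.json) is absent."""
    return not (Path(root_dir) / task / "dataset.json").is_file()


# Demonstration with a temporary folder standing in for DataRoot.
with tempfile.TemporaryDirectory() as tmp:
    print(needs_download(tmp, "Task09_Spleen"))  # True: nothing extracted yet
    task_dir = Path(tmp) / "Task09_Spleen"
    task_dir.mkdir()
    (task_dir / "dataset.json").write_text("{}")
    print(needs_download(tmp, "Task09_Spleen"))  # False: folder is in place
```

In a download cell this could be used as `download=needs_download(DataRoot, myTask)`.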

## 1.3 Check on data
For any of the data sets downloaded above, you can run the cells below to check the number of training cases and the image shape.

In [None]:
print(dataset.get_properties("numTraining"))
print(dataset.get_properties("description"))


In [None]:
print(dataset[0]['image'].shape)
print(dataset[0]['label'].shape)
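
The `numTraining` value comes from the data set's metadata. A complementary check is to count the files actually present on disk. The sketch below assumes the usual Decathlon folder layout with `imagesTr`, `labelsTr` and `imagesTs` subfolders (only `imagesTs` is named elsewhere in this notebook); `count_volumes` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def count_volumes(task_dir):
    """Count .nii.gz files per subfolder, ignoring macOS '._' resource files."""
    counts = {}
    for sub in ("imagesTr", "labelsTr", "imagesTs"):
        folder = Path(task_dir) / sub
        if not folder.is_dir():
            counts[sub] = 0  # a missing subfolder counts as empty
            continue
        files = [p for p in folder.glob("*.nii.gz") if not p.name.startswith("._")]
        counts[sub] = len(files)
    return counts


# Demonstration with a temporary folder standing in for a task directory.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "imagesTs"
    images.mkdir()
    (images / "spleen_1.nii.gz").touch()
    (images / "._spleen_1.nii.gz").touch()
    print(count_volumes(tmp))
```

For the spleen task the call would be `count_volumes(DataRoot + "/Task09_Spleen")`.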


# 2. Load images into OHIF
If you are using AIAA with OHIF integration, you need DICOM images.
We will convert the NIfTI files to DICOM and then upload them to the PACS backend.
First, let's install the needed packages.

In [None]:
! apt-get install -y plastimatch
! apt-get install -y dcmtk


Next, we set up some directories and do some cleanup.

In [None]:
spleenDir=DataRoot+"/Task09_Spleen/imagesTs/"
spleenDcmDir=DataRoot+"/Task09_Spleen/imagesTsDCM/"
cmd = "rm "+spleenDir+"._spleen*" # remove some mac files from download
! $cmd
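
The `rm` call above reports an error when no `._spleen*` files are left, for example on a second run. A rough Python equivalent that can be re-run safely is sketched below; `remove_mac_sidecars` is a hypothetical helper name, and the demonstration uses a temporary folder rather than `spleenDir`.

```python
import tempfile
from pathlib import Path


def remove_mac_sidecars(folder, prefix="._spleen"):
    """Delete regular files in folder whose names start with prefix; return their names."""
    removed = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.name.startswith(prefix):
            path.unlink()
            removed.append(path.name)
    return removed


# Demonstration with a temporary folder standing in for spleenDir.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "spleen_1.nii.gz").touch()
    (Path(tmp) / "._spleen_1.nii.gz").touch()
    print(remove_mac_sidecars(tmp))  # first call removes the sidecar file
    print(remove_mac_sidecars(tmp))  # second call finds nothing and returns []
```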

## 2.1 Convert NIfTI to DICOM

In [None]:
import os

# Convert the test volumes from NIfTI to DICOM, one output folder per patient.
for i, fileName in enumerate(os.listdir(spleenDir)):
    patName = fileName[:-7]  # strip the ".nii.gz" suffix
    print("------converting ", patName)
    PatDcmDir = spleenDcmDir + patName
    os.makedirs(PatDcmDir, exist_ok=True)
    fName = spleenDir + patName + ".nii.gz"
    cmd = "plastimatch convert --input " + fName + " --output-dicom " + PatDcmDir + " --patient-name " + patName
    cmd += " --series-description " + patName
    # Optional plastimatch flags, left disabled:
    # cmd += " --series-number 1"
    # cmd += " --series-uid <arg>"  # series UID for image metadata (string)
    # cmd += " --patient-pos ffp"  # hfs, hfp, ffs or ffp
    # cmd += " --direction-cosines rotated-3"  # identity, rotated-{1,2,3} or sheared
    !$cmd
    if i > 1:  # stop after the first three entries
        break
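
Building the command by string concatenation breaks if a path contains spaces. As a sketch of an alternative, the same `plastimatch convert` call can be assembled as an argument list and run with `subprocess`. Only the flags already used in the cell above appear here; `build_convert_cmd` and the `/data/...` paths are illustrative.

```python
import shlex
from pathlib import Path


def build_convert_cmd(nifti_path, dicom_root):
    """Return the plastimatch argument list for converting one NIfTI file."""
    nifti_path = Path(nifti_path)
    suffix = ".nii.gz"
    if not nifti_path.name.endswith(suffix):
        raise ValueError(f"Expected a {suffix} file, got {nifti_path.name!r}")
    pat_name = nifti_path.name[: -len(suffix)]
    out_dir = Path(dicom_root) / pat_name  # one DICOM folder per patient
    return [
        "plastimatch", "convert",
        "--input", str(nifti_path),
        "--output-dicom", str(out_dir),
        "--patient-name", pat_name,
        "--series-description", pat_name,
    ]


cmd = build_convert_cmd(
    "/data/Task09_Spleen/imagesTs/spleen_1.nii.gz",  # illustrative path
    "/data/Task09_Spleen/imagesTsDCM",
)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing a list to `subprocess.run` avoids shell quoting issues, and `check=True` raises an error if a conversion fails instead of continuing silently.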

## 2.2 Push DICOM to Orthanc

In [None]:
cmd='storescu -v +sd +r -xb -aet "fromtest" -aec "ORTHANC" orthanc 4242 '+spleenDcmDir
!$cmd
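
Before pushing, it can help to confirm that the conversion step actually produced files. The sketch below assembles the same `storescu` call as an argument list and checks that the DICOM folder is not empty. The default host, port and AE titles are copied from the cell above; `build_storescu_cmd`, `has_files` and the `/data/...` path are illustrative.

```python
import shlex
from pathlib import Path


def build_storescu_cmd(dicom_dir, host="orthanc", port=4242,
                       aet="fromtest", aec="ORTHANC"):
    """Assemble the storescu call from the cell above as an argument list."""
    return [
        "storescu", "-v", "+sd", "+r", "-xb",
        "-aet", aet, "-aec", aec,
        host, str(port), str(dicom_dir),
    ]


def has_files(folder):
    """Return True if folder contains at least one regular file at any depth."""
    root = Path(folder)
    return root.is_dir() and any(p.is_file() for p in root.rglob("*"))


dicom_dir = "/data/Task09_Spleen/imagesTsDCM/"  # illustrative path
if has_files(dicom_dir):
    print(shlex.join(build_storescu_cmd(dicom_dir)))
else:
    print("No files found in", dicom_dir, "- run the conversion step first")
```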