# WORC Tutorial: Simple

Welcome to the tutorial of WORC: a Workflow for Optimal Radiomics Classification! It will provide you with basic knowledge and practical skills on how to run WORC. For advanced topics and WORCflows, please see the other notebooks provided with this tutorial. For installation details, see the ReadMe.md provided with this tutorial.


This tutorial interacts with WORC through SimpleWORC and is especially suitable for first-time use.

In [1]:
# Import necessary packages
from WORC import SimpleWORC
import os

# These packages are only used in analysing the results
import pandas as pd
import json
import fastr
import glob

# If you don't want to use your own data, we use the following example set,
# see also the next code block in this example.
from WORC.exampledata.datadownloader import download_HeadAndNeck

# Define the folder this script is in, so we can easily find the example data
script_path = os.getcwd()


Module 'pyfftw' (FFTW Python bindings) could not be imported. To install it, try
running 'pip install pyfftw' from the terminal. Falling back on the slower
'fftpack' module for 2D Fourier transforms.





---------------------------------------------------------------------------
Input
---------------------------------------------------------------------------
The minimal inputs to WORC are:
  - Images
  - Segmentations
  - Labels

In SimpleWORC, we assume you have a folder "datadir" containing one folder
per patient, and that each patient folder contains an image.nii.gz and a mask.nii.gz:
          Datadir
              Patient_001
                  image.nii.gz
                  mask.nii.gz
              Patient_002
                  image.nii.gz
                  mask.nii.gz
              ...


You can skip this part if you use your own data.
In the example, we will use open-source data from the online XNAT platform
at https://xnat.bmia.nl/data/archive/projects/stwstrategyhn1. This dataset
consists of CT scans of patients with Head and Neck tumors. 

In [2]:
# Download a subset of 20 patients into the folder below. You can change these settings if you want.
nsubjects = 20  # use "all" if you want to download all patients.
data_path = os.path.join(script_path, 'Data')
download_HeadAndNeck(datafolder=data_path, nsubjects=nsubjects)

Working on subject 1/137
	Downloading patient HN1331, experiment HN1331_20190402_CT, scan 1.
resource is NIFTI


183.6 KiB |#                                                      |   4.4 MiB/s


	Downloading patient HN1331, experiment HN1331_20190402_CT, scan 1_3_6_1_4_1_40744_29_120873174302085915918976213152022016685.
resource is NIFTI


 21.4 MiB |           #                                           |  19.3 MiB/s


Working on subject 2/137
	Downloading patient HN1519, experiment HN1519_20190402_CT, scan 1.
resource is NIFTI


187.2 KiB |#                                                      |   4.7 MiB/s


	Downloading patient HN1519, experiment HN1519_20190402_CT, scan 1_3_6_1_4_1_40744_29_178045394474815118369144065452878666404.
resource is NIFTI


 19.1 MiB |        #                                              |  23.6 MiB/s


Working on subject 3/137
	Downloading patient HN1088, experiment HN1088_20190402_CT, scan 1.
resource is NIFTI


209.3 KiB |#                                                      |   2.4 MiB/s


	Downloading patient HN1088, experiment HN1088_20190402_CT, scan 1_3_6_1_4_1_40744_29_120903286350475892686849765504908597220.
resource is NIFTI


 20.7 MiB |         #                                             |  21.9 MiB/s


Working on subject 4/137
	Downloading patient HN1260, experiment HN1260_20190402_CT, scan 1.
resource is NIFTI


217.0 KiB | #                                                     |   1.5 MiB/s


	Downloading patient HN1260, experiment HN1260_20190402_CT, scan 1_3_6_1_4_1_40744_29_161797156309701878529360999104102719772.
resource is NIFTI


 19.7 MiB |          #                                            |  19.5 MiB/s


Working on subject 5/137
	Downloading patient HN1192, experiment HN1192_20190402_CT, scan 1.
resource is NIFTI


209.3 KiB | #                                                     |   1.2 MiB/s


	Downloading patient HN1192, experiment HN1192_20190402_CT, scan 1_3_6_1_4_1_40744_29_76038811511814195873478246519752470711.
resource is NIFTI


 19.8 MiB |      #                                                |  29.5 MiB/s


Working on subject 6/137
	Downloading patient HN1501, experiment HN1501_20190403_CT, scan 1.
resource is NIFTI


520.6 KiB | #                                                     |   3.3 MiB/s


	Downloading patient HN1501, experiment HN1501_20190403_CT, scan 1_3_6_1_4_1_40744_29_294587997382261494217023015100501023442.
resource is NIFTI


 20.3 MiB |       #                                               |  28.0 MiB/s


Working on subject 7/137
	Downloading patient HN1259, experiment HN1259_20190402_CT, scan 1.
resource is NIFTI


 94.5 KiB |#                                                      |   8.3 MiB/s


	Downloading patient HN1259, experiment HN1259_20190402_CT, scan 1_3_6_1_4_1_40744_29_163029519172728433236936734659203480802.
resource is NIFTI


 17.8 MiB |      #                                                |  27.1 MiB/s


Working on subject 8/137
	Downloading patient HN1372, experiment HN1372_20190402_CT, scan 1.
resource is NIFTI


488.1 KiB |#                                                      |   6.7 MiB/s


	Downloading patient HN1372, experiment HN1372_20190402_CT, scan 1_3_6_1_4_1_40744_29_144464477074766979879254219958116907520.
resource is NIFTI


 23.6 MiB |       #                                               |  31.3 MiB/s


Working on subject 9/137
	Downloading patient HN1560, experiment HN1560_20190402_CT, scan 1.
resource is NIFTI


134.9 KiB |#                                                      |  11.0 MiB/s


	Downloading patient HN1560, experiment HN1560_20190402_CT, scan 1_3_6_1_4_1_40744_29_131924080354178326530168765579293823568.
resource is NIFTI


 19.1 MiB |       #                                               |  26.6 MiB/s


Working on subject 10/137
	Downloading patient HN1748, experiment HN1748_20190403_CT, scan 1.
resource is NIFTI


474.7 KiB | #                                                     |   3.8 MiB/s


	Downloading patient HN1748, experiment HN1748_20190403_CT, scan 1_3_6_1_4_1_40744_29_33682544755686290289659265674027017323.
resource is NIFTI


 21.6 MiB |       #                                               |  29.7 MiB/s


Working on subject 11/137
	Downloading patient HN1004, experiment HN1004_20190403_CT, scan 1.
resource is NIFTI


488.4 KiB | #                                                     |   2.5 MiB/s


	Downloading patient HN1004, experiment HN1004_20190403_CT, scan 1_3_6_1_4_1_40744_29_33371661027192187491509798061184654147.
resource is NIFTI


 24.5 MiB |        #                                              |  28.6 MiB/s


Working on subject 12/137
	Downloading patient HN1491, experiment HN1491_20190402_CT, scan 1.
resource is NIFTI


469.0 KiB |  #                                                    |   1.7 MiB/s


	Downloading patient HN1491, experiment HN1491_20190402_CT, scan 1_3_6_1_4_1_40744_29_83995353912492289117434904755737246047.
resource is NIFTI


 17.5 MiB |       #                                               |  24.8 MiB/s


Working on subject 13/137
	Downloading patient HN1146, experiment HN1146_20190402_CT, scan 1.
resource is NIFTI


501.3 KiB | #                                                     |   4.0 MiB/s


	Downloading patient HN1146, experiment HN1146_20190402_CT, scan 1_3_6_1_4_1_40744_29_45084863479635845781442267453092995463.
resource is NIFTI


 21.2 MiB |       #                                               |  26.9 MiB/s


Working on subject 14/137
	Downloading patient HN1339, experiment HN1339_20190402_CT, scan 1.
resource is NIFTI


178.9 KiB |#                                                      |   7.9 MiB/s


	Downloading patient HN1339, experiment HN1339_20190402_CT, scan 1_3_6_1_4_1_40744_29_159924772318099919605643394483351782918.
resource is NIFTI


 19.6 MiB |      #                                                |  28.4 MiB/s


Working on subject 15/137
	Downloading patient HN1159, experiment HN1159_20190402_CT, scan 1.
resource is NIFTI


233.4 KiB | #                                                     |   1.6 MiB/s


	Downloading patient HN1159, experiment HN1159_20190402_CT, scan 1_3_6_1_4_1_40744_29_304948212229053443748913181579593614456.
resource is NIFTI


 19.9 MiB |       #                                               |  27.5 MiB/s


Working on subject 16/137
	Downloading patient HN1554, experiment HN1554_20190402_CT, scan 1.
resource is NIFTI


 35.3 KiB |#                                                      |  10.0 MiB/s


	Downloading patient HN1554, experiment HN1554_20190402_CT, scan 1_3_6_1_4_1_40744_29_70836216037838166025574231058696520721.
resource is NIFTI


 23.7 MiB |        #                                              |  29.1 MiB/s


Working on subject 17/137
	Downloading patient HN1077, experiment HN1077_20190403_CT, scan 1.
resource is NIFTI


 64.3 KiB |#                                                      |   3.3 MiB/s


	Downloading patient HN1077, experiment HN1077_20190403_CT, scan 1_3_6_1_4_1_40744_29_289821604949411243837449577785917441701.
resource is NIFTI


 18.1 MiB |       #                                               |  25.5 MiB/s


Working on subject 18/137
	Downloading patient HN1524, experiment HN1524_20190403_CT, scan 1.
resource is NIFTI


 84.7 KiB |#                                                      |   7.9 MiB/s


	Downloading patient HN1524, experiment HN1524_20190403_CT, scan 1_3_6_1_4_1_40744_29_217593089023368120144332181985591455451.
resource is NIFTI


 26.9 MiB |         #                                             |  29.7 MiB/s


Working on subject 19/137
	Downloading patient HN1323, experiment HN1323_20190403_CT, scan 1.
resource is NIFTI


517.4 KiB |#                                                      |   6.0 MiB/s


	Downloading patient HN1323, experiment HN1323_20190403_CT, scan 1_3_6_1_4_1_40744_29_148778682230814551705999265250460576784.
resource is NIFTI


 20.8 MiB |       #                                               |  29.6 MiB/s


Working on subject 20/137
	Downloading patient HN1342, experiment HN1342_20190404_CT, scan 1.
resource is NIFTI


 34.4 KiB |#                                                      |   3.9 MiB/s


	Downloading patient HN1342, experiment HN1342_20190404_CT, scan 1_3_6_1_4_1_40744_29_135520745570260793022859088319256240566.
resource is NIFTI


 19.5 MiB |       #                                               |  27.3 MiB/s


Done downloading!
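
Before defining the inputs, it can help to confirm that the data folder follows the layout described in the Input section. The following is a minimal sketch using only the standard library; the helper name `find_incomplete_patients` is hypothetical and not part of WORC, and it only checks that the two expected files exist in every patient folder.

```python
import os


def find_incomplete_patients(datadir, required=("image.nii.gz", "mask.nii.gz")):
    """Return {patient_folder: [missing file names]} for incomplete folders.

    Hypothetical helper, not part of WORC. It only tests for file presence
    and does not open or validate the NIfTI files themselves.
    """
    incomplete = {}
    for patient in sorted(os.listdir(datadir)):
        patient_dir = os.path.join(datadir, patient)
        if not os.path.isdir(patient_dir):
            continue  # ignore loose files next to the patient folders
        missing = [name for name in required
                   if not os.path.isfile(os.path.join(patient_dir, name))]
        if missing:
            incomplete[patient] = missing
    return incomplete
```

An empty dictionary means every patient folder holds both files. With your own data, pass the folder that directly contains the patient folders.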


Define the inputs of our network

In [3]:
# Identify our data structure: change the fields below accordingly
# if you use your own data.
imagedatadir = os.path.join(data_path, 'stwstrategyhn1')
image_file_name = 'image.nii.gz'
segmentation_file_name = 'mask.nii.gz'

# File in which the labels (i.e. the outcomes you want to predict) are stated
# Again, change this accordingly if you use your own data.
label_file = os.path.join(data_path, 'Examplefiles', 'pinfo_HN.csv')

# Name of the label you want to predict
label_name = 'imaginary_label_1'

# Determine whether we want to do a coarse quick experiment, or a full lengthy
# one. Again, change this accordingly if you use your own data.
coarse = True

# Give your experiment a name
experiment_name = 'Example_STWStrategyHN'

# Instead of the default tempdir, let's put the temporary output in a subfolder
# in the same folder as this script
tmpdir = os.path.join(script_path, 'WORC_' + experiment_name)
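
A mistyped label name is easy to miss until the experiment is already running. The sketch below checks that the chosen label appears in the header of the label file. The format of the label file is not described in this tutorial, so the sketch assumes a comma-separated file whose first row holds the column names; the helpers `label_columns` and `has_label` are hypothetical and not part of WORC.

```python
import csv


def label_columns(label_file):
    """Return the column names found in the first row of a CSV file.

    Assumption: the file is comma-separated and starts with a header row.
    """
    with open(label_file, newline="") as fh:
        header = next(csv.reader(fh), [])
    return [column.strip() for column in header]


def has_label(label_file, label_name):
    """Check whether label_name is one of the header columns."""
    return label_name in label_columns(label_file)
```

With the variables defined above, `has_label(label_file, label_name)` should return `True` if the assumption about the file format holds.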


---------------------------------------------------------------------------
The actual experiment
---------------------------------------------------------------------------

NOTE: Precomputed features can be used instead of images and masks
by calling ``experiment.features_from_this_directory()`` in a similar fashion to the calls below.

In [6]:
# Create a WORC object
experiment = SimpleWORC(experiment_name)

# Set the input data according to the variables we defined earlier
experiment.images_from_this_directory(imagedatadir,
                             image_file_name=image_file_name)
experiment.segmentations_from_this_directory(imagedatadir,
                                    segmentation_file_name=segmentation_file_name)
experiment.labels_from_this_file(label_file)
experiment.predict_labels([label_name])

# Use the standard workflow for binary classification
experiment.binary_classification(coarse=coarse)

# Set the temporary directory
experiment.set_tmpdir(tmpdir)



BigrClusterDetector detected False.
CartesiusClusterDetector detected False.
BigrClusterDetector detected False.
CartesiusClusterDetector detected False.


In [7]:
# Run the experiment!
experiment.execute()

DebugDetector detected False.
[INFO] networkrun:0517 >> ####################################
[INFO] networkrun:0518 >> #     network execution STARTED    #
[INFO] networkrun:0519 >> ####################################
[INFO] networkrun:0544 >> Running network via /home/martijn/Documents/WORC3/src/fastr/fastr/api/__init__.py (last modified Wed Oct 16 16:42:02 2019)
[INFO] networkrun:0545 >> FASTR loaded from /home/martijn/Documents/WORC3/src/fastr/fastr
[INFO] networkrun:0561 >> Network run tmpdir: /home/martijn/git/WORCTutorial/WORC_Example_STWStrategyHN
[INFO] networkchunker:0146 >> Adding classification to candidates (blocking False)
[INFO] networkchunker:0146 >> Adding performance to candidates (blocking False)
[INFO] networkchunker:0146 >> Adding features_train_CT_0 to candidates (blocking False)
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/config_classification_source s

[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/segmentations_train_CT_0 sample id <SampleId ('HN1331',)>, index <SampleIndex (9)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/segmentations_train_CT_0 sample id <SampleId ('HN1339',)>, index <SampleIndex (10)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/segmentations_train_CT_0 sample id <SampleId ('HN1342',)>, index <SampleIndex (11)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/segmentations_train_CT_0 sample id <SampleId ('HN1372',)>, index <SampleIndex (12)>
[INFO]   noderun:0576 >> Creating job

[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/convert_seg_train_CT_0 sample id <SampleId ('HN1088',)>, index <SampleIndex (2)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/convert_seg_train_CT_0 sample id <SampleId ('HN1146',)>, index <SampleIndex (3)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/convert_seg_train_CT_0 sample id <SampleId ('HN1159',)>, index <SampleIndex (4)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/convert_seg_train_CT_0 sample id <SampleId ('HN1192',)>, index <SampleIndex (5)>
[INFO]   noderun:0576 >> Creating job for node f

[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/preprocessing_train_CT_0 sample id <SampleId ('HN1524',)>, index <SampleIndex (16)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/preprocessing_train_CT_0 sample id <SampleId ('HN1554',)>, index <SampleIndex (17)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/preprocessing_train_CT_0 sample id <SampleId ('HN1560',)>, index <SampleIndex (18)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/preprocessing_train_CT_0 sample id <SampleId ('HN1748',)>, index <SampleIndex (19)>
[INFO]   noderun:0470 >> Generating 

[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/features_train_CT_0 sample id <SampleId ('HN1260',)>, index <SampleIndex (7)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/features_train_CT_0 sample id <SampleId ('HN1323',)>, index <SampleIndex (8)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/features_train_CT_0 sample id <SampleId ('HN1331',)>, index <SampleIndex (9)>
[INFO]   noderun:0576 >> Creating job for node fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/features_train_CT_0 sample id <SampleId ('HN1339',)>, index <SampleIndex (10)>
[INFO]   noderun:0576 >> Creating job for node fastr:///net

[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___features_train_CT_0___HN1146___0 with status JobState.cancelled
[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___features_train_CT_0___HN1088___0 with status JobState.cancelled
[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___features_train_CT_0___HN1077___0 with status JobState.cancelled
[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___features_train_CT_0___HN1004___0 with status JobState.cancelled
[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___calcfeatures_train_predict_CalcFeatures_1_0_CT_0___HN1748 with status JobState.cancelled
[ERROR]   noderun:0508 >> Could not find required data for fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/calcfeatures_train_predict_CalcFeatures_1_0_CT_0/outputs/features in {}!
[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___calcf

[ERROR]   noderun:0508 >> Could not find required data for fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/preprocessing_train_CT_0/outputs/image in {}!
[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___preprocessing_train_CT_0___HN1560 with status JobState.cancelled
[ERROR]   noderun:0508 >> Could not find required data for fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/preprocessing_train_CT_0/outputs/image in {}!
[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___preprocessing_train_CT_0___HN1554 with status JobState.cancelled
[ERROR]   noderun:0508 >> Could not find required data for fastr:///networks/WORC_Example_STWStrategyHN/0.0/runs/WORC_Example_STWStrategyHN_2019-10-16T17-50-17/nodelist/preprocessing_train_CT_0/outputs/image in {}!
[INFO] networkrun:0765 >> Finished job WORC_Example_STWStrategyHN___preprocessing_train_C


---------------------------------------------------------------------------
Analysis of results
---------------------------------------------------------------------------

There are two main outputs: the features for each patient/object, and the overall
performance. These are stored as .hdf5 and .json files, respectively. By
default, they are saved in the so-called "fastr output mount", in a subfolder
named after your experiment name.

In [None]:
# Locate output folder
outputfolder = fastr.config.mounts['output']
experiment_folder = os.path.join(outputfolder, 'WORC_' + experiment_name)

print(f"Your output is stored in {experiment_folder}.")

# Read the features for the first patient
# NOTE: we use the glob package for scanning a folder to find specific files
feature_files = glob.glob(os.path.join(experiment_folder,
                                       'Features',
                                       'features_*.hdf5'))
featurefile_p1 = feature_files[0]
features_p1 = pd.read_hdf(featurefile_p1)

# Read the overall performance
performance_file = os.path.join(experiment_folder, 'performance_all_0.json')
with open(performance_file, 'r') as fp:
    performance = json.load(fp)

# Print the feature values and names
print("Feature values:")
for value, label in zip(features_p1.feature_values, features_p1.feature_labels):
    print(f"\t {label} : {value}.")

# Print the output performance
print("\n Performance:")
stats = performance['Statistics']
del stats['Percentages']  # Omitted for brevity
for k, v in stats.items():
    print(f"\t {k} {v}.")

**NOTE:** the performance is probably horrible, which is expected as we ran
the experiment on coarse settings. These settings are recommended only
for testing: see also below.
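
If the run did not finish, for example because jobs were cancelled, the output files may be missing, and `feature_files[0]` would then raise an `IndexError`. The following is a small sketch of a more defensive lookup; it reuses the folder and file names from the cell above, and the helper name `locate_outputs` is hypothetical.

```python
import glob
import os


def locate_outputs(experiment_folder):
    """Return (feature_files, performance_file) for an experiment folder.

    feature_files is a possibly empty, sorted list of paths.
    performance_file is None when the JSON file does not exist.
    """
    feature_files = sorted(glob.glob(os.path.join(experiment_folder,
                                                  'Features',
                                                  'features_*.hdf5')))
    performance_file = os.path.join(experiment_folder, 'performance_all_0.json')
    if not os.path.isfile(performance_file):
        performance_file = None
    return feature_files, performance_file
```

Checking both return values before reading them allows a clear message to be printed instead of a traceback when the experiment did not produce output.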


---------------------------------------------------------------------------
Tips and Tricks
---------------------------------------------------------------------------

For tips and tricks on running a full experiment instead of this simple
example, adding more evaluation options, debugging a crashed network, and more,
please go to https://worc.readthedocs.io/en/latest/static/user_manual.html

Some things we would advise you to always do:
  - Run actual experiments on the full settings (coarse=False):
  
      ``coarse = False``
      
      ``experiment.binary_classification(coarse=coarse)``
      
  **Note**: this will result in more computation time. We therefore recommend
  running this script on either a cluster or a high-performance PC. If you do,
  you may change the execution to use multiple cores to speed up computation
  just before experiment.execute():
  
      ``experiment.set_multicore_execution()``


  - Add extensive evaluation by calling experiment.add_evaluation() before experiment.execute():
  
      ``experiment.add_evaluation()``
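
The tips above can be gathered in one place. The following is a minimal sketch that calls only the methods named in this tutorial, in the order described; the helper name `configure_full_run` and its two flags are hypothetical, and it assumes the inputs and labels have already been set on a freshly created experiment.

```python
def configure_full_run(experiment, use_multicore=True, evaluate=True):
    """Apply the recommended settings for an actual (non-coarse) experiment.

    Hypothetical helper: `experiment` is expected to be a SimpleWORC object
    whose images, segmentations and labels were set as shown earlier.
    """
    # Full settings instead of the coarse test settings
    experiment.binary_classification(coarse=False)
    if use_multicore:
        # Use multiple cores; intended for a cluster or high-performance PC
        experiment.set_multicore_execution()
    if evaluate:
        # Add the extensive evaluation to the workflow
        experiment.add_evaluation()
    return experiment
```

After calling `configure_full_run(experiment)`, set the temporary directory if desired and start the run with `experiment.execute()` as before.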