# MIST Example Notebook - MSD Dataset

## Getting started

This example is intended to run inside a Docker container with TensorFlow 2.6.0 or later. After starting this Jupyter notebook in the container, uncomment the cell below to install the necessary dependencies.

## Import the necessary scripts

Import the necessary scripts from the `mist/` directory. The three main components are the `Preprocess`, `RunTime`, and `Inference` classes; this example imports only the first two.

In [2]:
import json

# Import preprocess and runtime scripts
from mist.preprocess import Preprocess
from mist.runtime import RunTime

# Import the msd conversion tool from mist.utils
from mist.utils import convert_msd

2022-07-24 18:28:22.083084: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.


## Convert MSD data to MIST format

MIST provides a simple conversion script that converts MSD-formatted data into MIST-compatible datasets.

In [None]:
msd_source = '/tf/data/msd/original/Task02_Heart/'
mist_dest = '/tf/data/msd/mist/Task02_Heart/'
convert_msd(msd_source, mist_dest)

MIST will convert the MSD dataset and provide a sample user parameters JSON file. The parameters JSON file is written to the same destination as the newly formatted MIST data. You can run the MIST pipeline with this provided parameters file, as shown below.

In [None]:
json_file = '/tf/data/msd/mist/Task02_Heart/user_params.json'

# Preprocess training data
preprocess = Preprocess(json_file)
preprocess.run()

# Run MIST pipeline
runtime = RunTime(json_file)
runtime.run()
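
Because the generated parameters file is plain JSON, it can be inspected or adjusted before it is passed to `Preprocess` and `RunTime`. The following is a minimal sketch using only the standard `json` module. The helper names `load_user_params` and `override_params` are hypothetical and are not part of MIST, and the keys actually present in the generated file may differ from those shown later in this notebook.

```python
import json


def load_user_params(path):
    """Read a user-parameters JSON file into a dict (hypothetical helper)."""
    with open(path, 'r') as infile:
        return json.load(infile)


def override_params(params, **overrides):
    """Return a copy of params with selected keys replaced (hypothetical helper).

    The original dict is left unchanged so the generated defaults
    can still be compared against the customized values.
    """
    updated = dict(params)
    updated.update(overrides)
    return updated
```

For instance, `params = load_user_params(json_file)` followed by `override_params(params, epochs=10)` would return a modified copy, assuming the file contains an `epochs` entry. That copy can then be written back with `json.dump` before constructing `RunTime`.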

Alternatively, you can copy and paste the printed output of the MSD conversion script and customize the inputs to the MIST pipeline. Below we use multi-GPU training, specify which folds to train on, modify the number of epochs per fold, and give the pipeline a specific patch size to use.

In [3]:
user_params = {'train_data_dir': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/raw/train',
               'test_data_dir': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/raw/test',
               'processed_data_dir': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/tfrecord',
               'log_dir': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/logs',
               'base_model_name': 'LeftAtrium',
               'model_dir': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/models',
               'prediction_dir': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/predictions',
               'raw_paths_csv': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/paths.csv',
               'inferred_params': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/inferred_params.json',
               'results_csv': '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/results.csv',
               'modality': 'mr',
               'mask': ['mask.nii.gz'],
               'images': {'MRI': ['MRI.nii.gz']},
               'labels': [0, 1],
               'final_classes': {'left_atrium': [1]},
               'loss': 'dice_ce',
               'model': 'unet',
               'folds': 0,
               'gpu': 6,
               'epochs': 50, 
               'patch_size': [128, 128, 128]}

json_file = '/rsrch1/ip/aecelaya/data/msd/mist/Task02_Heart/user_params.json'
with open(json_file, 'w') as outfile:
    json.dump(user_params, outfile, indent=2)
    
# Preprocess training data
# preprocess = Preprocess(json_file)
# preprocess.run()

# Run MIST pipeline
runtime = RunTime(json_file)
runtime.run()

Setting up GPU(s)...
Starting fold 0...
Fold 0: Epoch 1/50


2022-07-24 18:28:26.902980: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-07-24 18:28:27.517570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 46286 MB memory:  -> device: 0, name: Quadro RTX 8000, pci bus id: 0000:da:00.0, compute capability: 7.5


InvalidArgumentError: Graph execution error:

2 root error(s) found.
  (0) INVALID_ARGUMENT:  Input to reshape is a tensor with 14080000 values, but the requested shape has 28160000
	 [[{{node Reshape_1}}]]
	 [[IteratorGetNext]]
	 [[IteratorGetNext/_4]]
  (1) INVALID_ARGUMENT:  Input to reshape is a tensor with 14080000 values, but the requested shape has 28160000
	 [[{{node Reshape_1}}]]
	 [[IteratorGetNext]]
0 successful operations.
0 derived errors ignored. [Op:__inference_train_function_10248]
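
The error above reports that the tensor being reshaped holds 14080000 values while the requested shape needs 28160000. A quick arithmetic check, sketched below with a hypothetical helper name, shows how the two sizes relate. It does not identify the cause. One possibility worth checking is that the preprocessing step is commented out in this cell, so the TFRecords on disk may have been written under different parameters than the ones now passed to `RunTime`.

```python
def reshape_mismatch(actual_values, requested_values):
    """Return (quotient, remainder) of requested size divided by actual size."""
    return divmod(requested_values, actual_values)


# Sizes copied from the InvalidArgumentError message above.
quotient, remainder = reshape_mismatch(14080000, 28160000)
print(quotient, remainder)  # prints "2 0"
```

The requested shape is exactly twice the stored size, which would be consistent with a factor-of-two difference in one dimension, such as the channel or class count. If the stored records are stale, re-running `Preprocess(json_file).run()` with the current parameters before training may resolve the mismatch.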