# Tutorial: Optimising a Quantitative MRI Protocol Using Simulated Data

This notebook demonstrates how to use TADRED to optimise a quantitative or diffusion MRI protocol using simulated data. The code can be adapted to optimise protocols for any model you can simulate data from, whether it is analytical or not.

To use your own model, simulate data from it, following the requirements set out in this notebook.

### Key User-Defined Quantities:

1. **Protocol Length (`n_volumes_opt`)**: Define the number of acquisition parameters or acquired volumes for your optimised protocol.

2. **Number of Divisions (`n_divisions_tadred`)**: Set the number of subset sizes, including the full superdesign, that TADRED steps through during training. The default is 5, which typically works well.

3. **Number of Training Voxels (`n_train`)**: Specify the number of training voxels. Reduce this number for faster training or increase it for more accuracy.

### Simulation Requirements:

1. **Superdesign Size**: Parameters 1 and 2 determine the size of the superdesign required for the simulations. In this tutorial, TADRED sequentially halves the subset size. For example, with `n_volumes_opt=20` and `n_divisions_tadred=5`, the subset sizes are `[320, 160, 80, 40, 20]`, so the superdesign must contain `n_volumes_opt * 2**(n_divisions_tadred - 1) = 320` acquisitions. The acquisition parameters of the superdesign should span the entire acquisition space from which the optimal parameters will be selected. For instance, if your scanner has a maximum b-value of 3000, the superdesign might include 320 b-values equally spaced between 0 and 3000. This principle generalises to higher dimensions.

2. **Simulated Voxels**: Parameter 3 defines the number of simulated voxels needed. If `n_train=1000`, then 1200 voxels are required in total, as 100 voxels are allocated for the validation dataset and 100 for the test dataset.
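
The arithmetic behind both requirements can be sketched as follows. This is an illustrative snippet rather than part of TADRED: the helper names `superdesign_schedule` and `total_voxels` are made up for this example, and the 10% validation and test fractions are the ones used in this tutorial.

```python
def superdesign_schedule(n_volumes_opt, n_divisions_tadred):
    """Return the subset sizes, from the full superdesign down to the optimised protocol."""
    superdesign_size = n_volumes_opt * 2 ** (n_divisions_tadred - 1)
    return [superdesign_size // 2**i for i in range(n_divisions_tadred)]


def total_voxels(n_train):
    """Return the number of voxels to simulate: training plus 10% validation and 10% test."""
    n_val = n_train // 10
    n_test = n_train // 10
    return n_train + n_val + n_test


schedule = superdesign_schedule(n_volumes_opt=20, n_divisions_tadred=5)
print(schedule)            # [320, 160, 80, 40, 20]
print(total_voxels(1000))  # 1200
```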

A later cell, which runs when `use_own_simulated_data = False`, provides an example of how to run the appropriate simulations for two simple models.

(c) Stefano B. Blumberg and Paddy J. Slator


In [None]:
# Define the number of acquisition volumes for the optimised protocol and the number of subset divisions in TADRED
# These values will determine the length of the superdesign required for the training set
# E.g. If n_divisions_tadred is 5 then the superdesign needs to be 16 times larger than the number of volumes in the optimised protocol

n_volumes_opt = 20  # Number of volumes in the optimised protocol

n_divisions_tadred = 5  # Number of subset divisions to run in TADRED

print("Length of desired optimised protocol: " + str(n_volumes_opt))
print("Number of TADRED subset divisions: " + str(n_divisions_tadred))


In [None]:
# Calculate the required length of the superdesign
# This depends on the number of subset divisions (always halvings) that TADRED will perform
Cbar = n_volumes_opt * 2 ** (n_divisions_tadred - 1)

print("Size of the required superdesign for the simulations: " + str(Cbar))


In [None]:
########## (1)
# Import modules, see requirements.txt for tadred requirements, make sure things are in the path, set global seed

import numpy as np
import matplotlib.pyplot as plt
import os
import yaml 
from pathlib import Path
import sys

# Replace with the top-level ED_MRI install directory
EDMRIDIR = '/Users/scmps8/repos/github.com/ED_MRI/'
# TADRED should be installed inside the ED_MRI directory
TADREDDIR = os.path.join(EDMRIDIR, 'tadred')

# Make sure both directories are on sys.path
sys.path.append(EDMRIDIR)
sys.path.append(TADREDDIR)

#import TADRED code
from tadred import tadred_main, utils

np.random.seed(0)  # Random seed for entire script


In [None]:
# Base directory in which data will be saved (defaults to the ED_MRI install directory)
basedir = EDMRIDIR
# If basedir has not been defined (e.g. the line above is commented out), use the current working directory
try:
    basedir
except NameError:
    basedir = os.getcwd()
    print('data will be saved in the current working directory')



In [None]:
########## (2)
# Data split sizes

n_train = 10**4  # No. training voxels, reduce for faster training speed
n_val = n_train // 10  # No. validation set voxels
n_test = n_train // 10  # No. test set voxels

n_samples = n_train + n_val + n_test  # total number of samples to simulate




In [None]:
#If you are using your own simulated data, follow the guidelines to 
#generate simulated data that is appropriate for input to TADRED

use_own_simulated_data = True

if use_own_simulated_data:
    print('You need to simulate the following data to run TADRED.')
    print('Ground truth parameters vector of size n_samples by n_model_parameters')
    print('Where:')
    print('n_samples is ' + str(n_samples))
    print('n_model_parameters is the number of parameters in your model')
    
    print('Simulated ground truth signals with superdesign acquisition scheme of size n_samples by Cbar')
    print('Where:')
    print('The superdesign acquisition scheme highly oversamples the available acquisition parameter space')
    print('Cbar is the superdesign length ' + str(Cbar))
    
    print('Acquisition parameters of the superdesign of size Cbar by n_acquisition_parameters')
    print('Where:')
    print('n_acquisition_parameters is the dimension of the acquisition parameter space, e.g. 4 (gx, gy, gz, b) for HCP data')
    
    print('Save the data as .npy files.')
    print('Suggested filenames:')
    print('parameter array: parameters_gt_full.npy')
    print('simulated signals: signals_super_full.npy')
    print('acquisition parameters: acq_params_super.npy')
else:
    print('Run the cell below to simulate data suitable for TADRED')
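
As a sketch of what such files might look like for a model with more than one acquisition parameter, the example below simulates a joint diffusion-relaxation signal, exp(-b D) exp(-TE / T2), on a grid of b-values and echo times. The model, parameter ranges, grid and the `_demo` variable names are assumptions chosen for illustration; only the array shapes and suggested filenames come from the guidelines above. Noise is omitted for brevity.

```python
import tempfile
from pathlib import Path

import numpy as np

rng = np.random.default_rng(0)

n_samples_demo = 1200  # n_train + n_val + n_test for n_train = 1000
Cbar_demo = 320        # superdesign length for n_volumes_opt = 20, n_divisions_tadred = 5

# Superdesign: a 20 x 16 grid over (b, TE), flattened to Cbar_demo rows
bvals = np.linspace(0, 3, 20)     # assumed b-value range
tes = np.linspace(0.05, 0.2, 16)  # assumed echo-time range
b_grid, te_grid = np.meshgrid(bvals, tes, indexing="ij")
acq_params_super_demo = np.column_stack([b_grid.ravel(), te_grid.ravel()])

# Ground truth parameters: one row per voxel, columns (D, T2)
D = rng.uniform(0.1, 3.0, size=(n_samples_demo, 1))
T2 = rng.uniform(0.04, 0.2, size=(n_samples_demo, 1))
parameters_demo = np.hstack([D, T2])

# Broadcasting (n_samples, 1) against (Cbar,) gives an (n_samples, Cbar) signal array
b = acq_params_super_demo[:, 0]
te = acq_params_super_demo[:, 1]
signals_demo = (np.exp(-b * D) * np.exp(-te / T2)).astype(np.float32)

# Save with the suggested filenames (here into a temporary directory)
out_dir = Path(tempfile.mkdtemp())
np.save(out_dir / "parameters_gt_full.npy", parameters_demo)
np.save(out_dir / "signals_super_full.npy", signals_demo)
np.save(out_dir / "acq_params_super.npy", acq_params_super_demo)

print(parameters_demo.shape, signals_demo.shape, acq_params_super_demo.shape)
```

The three printed shapes should be `n_samples` by `n_model_parameters`, `n_samples` by `Cbar`, and `Cbar` by `n_acquisition_parameters` respectively.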



In [None]:
# Define some models and generate the data
# The data that will be generated is:
# parameters - n_samples by n_model_parameters array containing the ground truth model parameters, where n_model_parameters is the number of parameters in your model
# signals - n_samples by Cbar array containing the corresponding simulated signals from the model
# acq_params_super - Cbar by n_acquisition_parameters array containing the acquisition parameters of the superdesign (a 1D array of length Cbar for the single-parameter models below)

if not use_own_simulated_data:

    model_name = 't1inv'
    # model_name = "adc"
    
    if model_name == "adc":
        # model equation for simulation
        def model(D, bvals):
            signals = np.exp(-bvals * D)
            return signals
    
        # min/max parameter values
        minD = 0.1
        maxD = 3
    
        # simulate parameter values
        parameters = np.random.uniform(low=minD, high=maxD, size=(n_samples, 1))
    
        # Generate data using the model
    
        # make the super design
        maxb = 5
        minb = 0
        acq_params_super = np.linspace(minb, maxb, Cbar)
    
        # generate the data
        raw_signals = np.zeros((n_samples, Cbar), dtype=np.float32)
        for i in range(0, n_samples):
            raw_signals[i, :] = model(parameters[i], acq_params_super)
    
    
    elif model_name == "t1inv":
    
        def model(T1, ti, tr):
            signals = abs(1 - (2 * np.exp(-ti / T1)) + np.exp(-tr / T1))
            return signals
    
        # min/max parameter values
        minT1 = 0.1
        maxT1 = 7
        # simulate parameter values
        parameters = np.random.uniform(low=minT1, high=maxT1, size=(n_samples, 1))
    
        # generate data using a T1 inversion recovery model
    
        # make the super design
        tr = 7  # repetition time
        maxti = tr
        minti = 0.1
        acq_params_super = np.linspace(minti, maxti, Cbar)
    
        # generate the data
        raw_signals = np.zeros((n_samples, Cbar), dtype=np.float32)
        for i in range(0, n_samples):
            raw_signals[i, :] = model(parameters[i], acq_params_super, tr)



    # add noise to the data
    def add_noise(data, scale=0.05):
        data_real = data + np.random.normal(scale=scale, size=np.shape(data))
        data_imag = np.random.normal(scale=scale, size=np.shape(data))
        data_noisy = np.sqrt(data_real**2 + data_imag**2)
    
        return data_noisy
    
    
    SNR = 20
    signals = add_noise(raw_signals, 1 / SNR)

    proj_name = model_name + "_simulations_" + "n_train_" + str(n_train) + "_SNR_" + str(SNR)
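
The `add_noise` function above takes the magnitude of a complex signal with independent Gaussian noise on the real and imaginary channels, so the noisy signals are Rician distributed. One consequence is a noise floor: where the true signal is zero, the expected noisy magnitude is sigma * sqrt(pi / 2) rather than zero. The standalone sketch below re-implements the same noise model with a local random generator (the name `add_rician_noise` is chosen here for illustration) and checks this numerically.

```python
import numpy as np


def add_rician_noise(data, scale, rng):
    """Magnitude of data plus complex Gaussian noise with standard deviation `scale` per channel."""
    real = data + rng.normal(scale=scale, size=np.shape(data))
    imag = rng.normal(scale=scale, size=np.shape(data))
    return np.sqrt(real**2 + imag**2)


rng = np.random.default_rng(0)
sigma = 1 / 20  # noise standard deviation for SNR = 20, as in the cell above

# Zero true signal: the magnitude follows a Rayleigh distribution
noisy_zero = add_rician_noise(np.zeros(200_000), sigma, rng)
expected_floor = sigma * np.sqrt(np.pi / 2)

print("empirical mean:  ", noisy_zero.mean())
print("theoretical mean:", expected_floor)
```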


In [None]:
if use_own_simulated_data:
    #REPLACE WITH LOCATION OF THE SIMULATED DATA
    basedir = '/Users/scmps8/repos/github.com/ED_MRI/ED_MRI/examples/t1inv_simulations_n_train_10000_SNR_20'

    #Replace with project name
    proj_name = 'TADRED_test'

    #REPLACE WITH PATH TO SIMULATION GROUND TRUTH PARAMETERS
    #array is n_samples by n_model_parameters
    simulation_gt_parameters_path = os.path.join(basedir,'parameters_gt_full.npy')
    
    #REPLACE WITH PATH TO SIMULATION GROUND TRUTH SIGNALS WITH "SUPER-DESIGN" ACQUISITION - HIGHLY OVERSAMPLING THE ACQUISITION PARAMETER SPACE
    #array is n_samples by Cbar
    simulation_gt_signals_path = os.path.join(basedir, 'signals_super_full.npy')
    
    #REPLACE WITH PATH TO SUPER-DESIGN ACQUISITION PARAMETERS
    #array is Cbar by n_acquisition_parameters, e.g. n_acquisition_parameters is 4 (gx, gy, gz, b) for HCP data
    acq_params_super_signals_path = os.path.join(basedir, 'acq_params_super.npy')
    
    #load the files
    parameters = np.load(simulation_gt_parameters_path)
    signals = np.load(simulation_gt_signals_path)
    acq_params_super = np.load(acq_params_super_signals_path)


In [None]:
########## (3-A)
# Create dummy, randomly generated (positive) data

# C_bar = 220  # Number of input measurements \bar{C}
# M = 12  # Number of target regressors
# rand = np.random.lognormal  # Generates random positive values
# train_inp, train_tar = rand(size=(n_train, C_bar)), rand(size=(n_train, M))
# val_inp, val_tar = rand(size=(n_val, C_bar)), rand(size=(n_val, M))
# test_inp, test_tar = rand(size=(n_test, C_bar)), rand(size=(n_test, M))


In [None]:
# #convert signal and parameters to pytorch float32
# import torch

# signals = torch.tensor(signals, dtype=torch.float32)
# parameters = torch.tensor(parameters, dtype=torch.float32)





In [None]:
########## (4)
# Load data into TADRED format

# Data in TADRED format, \bar{C} measurements, M target regressors
data = dict(
    train=signals[0:n_train, :],  # Shape n_train x \bar{C}
    train_tar=parameters[0:n_train, :],  # Shape n_train x M
    val=signals[n_train : (n_train + n_val), :],  # Shape n_val x \bar{C}
    val_tar=parameters[n_train : (n_train + n_val), :],  # Shape n_val x M
    test=signals[(n_train + n_val) : (n_train + n_val + n_test), :],  # Shape n_test x \bar{C}
    test_tar=parameters[(n_train + n_val) : (n_train + n_val + n_test), :],  # Shape n_test x M
)

for key, value in data.items():
    data[key] = value.astype(np.float32)

args = utils.load_base_args()
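
Before handing the data to TADRED it can help to confirm that the splits have the expected shapes. The helper below is a hypothetical convenience function written for this tutorial, not part of the TADRED package; it only assumes the dictionary keys used in the cell above.

```python
import numpy as np


def check_tadred_data(data, superdesign_size):
    """Raise ValueError if any split has inconsistent shapes; otherwise return the split sizes."""
    sizes = {}
    n_targets = data["train_tar"].shape[1]
    for split in ("train", "val", "test"):
        inputs, targets = data[split], data[split + "_tar"]
        if inputs.shape[1] != superdesign_size:
            raise ValueError(f"{split}: expected {superdesign_size} measurements, got {inputs.shape[1]}")
        if inputs.shape[0] != targets.shape[0]:
            raise ValueError(f"{split}: {inputs.shape[0]} signals but {targets.shape[0]} targets")
        if targets.shape[1] != n_targets:
            raise ValueError(f"{split}: expected {n_targets} target parameters, got {targets.shape[1]}")
        sizes[split] = inputs.shape[0]
    return sizes


# Toy example with random data: 100/10/10 voxels, 320 measurements, 1 target parameter
rng = np.random.default_rng(0)
toy = {}
for split, n in (("train", 100), ("val", 10), ("test", 10)):
    toy[split] = rng.random((n, 320), dtype=np.float32)
    toy[split + "_tar"] = rng.random((n, 1), dtype=np.float32)

print(check_tadred_data(toy, superdesign_size=320))  # {'train': 100, 'val': 10, 'test': 10}
```

In this notebook the equivalent call would be `check_tadred_data(data, Cbar)`.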

In [None]:
########## (5)

# Save data to disk so TADRED can load it
proj_dir = Path(basedir, proj_name)
proj_dir.mkdir(parents=True, exist_ok=True)
np.save(Path(proj_dir, proj_name + ".npy"), data)

print("Saving data as", Path(proj_dir, proj_name + ".npy"))
pass_data = None

args.data_norm.data_fil = Path(proj_dir, proj_name + ".npy")



In [None]:
########## (6)
# Simplest version of TADRED, modifying the most important hyperparameters
# The subset sizes halve at each TADRED step, starting from the full superdesign, so with
# n_divisions_tadred = 5 the final optimised protocol is 1/16 the size of the superdesign.
# Feel free to experiment with the subset size reduction pattern, but halving the subset size at each TADRED step seems to generally work well.


# Decreasing feature subset sizes for TADRED to consider
feature_set_sizes = [Cbar // 2**i for i in range(n_divisions_tadred)]
args.tadred_train_eval.feature_set_sizes_Ci = feature_set_sizes

# Feature subset sizes for TADRED evaluated on test data (every size except the full superdesign)
args.tadred_train_eval.feature_set_sizes_evaluated = feature_set_sizes[1:]

# Scoring net Cbar -> num_units_score[0] -> num_units_score[1] ... -> Cbar units
args.network.num_units_score = [1000, 1000]

# Task net Cbar -> num_units_task[0] -> num_units_task[1] ... -> M units
args.network.num_units_task = [1000, 1000]

args.output.out_base = basedir
args.output.proj_name = proj_name
args.output.run_name = "test"
args.other_options.save_output = True

# tadred_args["total_epochs"] = 1000

TADRED_output = tadred_main.run(args)

In [None]:
# The saved results can also be loaded from disk.
# The file holds a pickled dictionary, so .item() is needed to recover it from the loaded array.
# TADRED_output = np.load(
#     Path(
#         proj_dir,
#         "results",
#         args.output.run_name + "_all.npy",
#     ),
#     allow_pickle=True,
# ).item()

In [None]:
# Extract some useful quantities from the TADRED output
# Final subset size, i.e. the length of the optimised protocol
Clast = TADRED_output["args"]["tadred_train_eval"]["feature_set_sizes_Ci"][-1]

#index of chosen acquisition parameters
acq_params_tadred_index = TADRED_output[Clast]["measurements"]

# chosen acquisition parameters
acq_params_tadred = acq_params_super[acq_params_tadred_index]

print('TADRED chosen acquisition parameters are: ' + str(acq_params_tadred))
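
The indexing step above works the same way for one-dimensional and multi-dimensional superdesigns: the selected indices pick out rows of the acquisition parameter array. The toy example below uses made-up indices rather than real TADRED output, and sorts them first, which is optional but can make the chosen protocol easier to read.

```python
import numpy as np

# Toy superdesign with 8 acquisitions and 2 acquisition parameters per acquisition
acq_params_super_toy = np.column_stack([np.linspace(0, 7, 8), np.linspace(10, 80, 8)])

# Made-up selection standing in for the indices returned by TADRED
chosen_index_toy = np.array([6, 1, 3])

# Sorting the indices keeps the selected acquisitions in superdesign order
chosen_sorted = np.sort(chosen_index_toy)
acq_params_chosen_toy = acq_params_super_toy[chosen_sorted]

print(acq_params_chosen_toy)
```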

In [None]:
# Plot the signals for a single voxel at the superdesign and at the TADRED-chosen acquisition parameters

# If there is more than one acquisition parameter, choose which one to plot on the x axis
if acq_params_super.ndim > 1 and acq_params_super.shape[1] > 1:
    acq_param_to_plot = 1  # zero-based column index into the acquisition parameters
    acq_params_super_to_plot = acq_params_super[:, acq_param_to_plot]
    acq_params_tadred_to_plot = acq_params_tadred[:, acq_param_to_plot]
else:
    # Single acquisition parameter: flatten in case the array is Cbar by 1
    acq_params_super_to_plot = np.ravel(acq_params_super)
    acq_params_tadred_to_plot = np.ravel(acq_params_tadred)
    
voxel_to_plot = 0


plt.plot(acq_params_super_to_plot, signals[voxel_to_plot,:], 'x')
plt.plot(acq_params_tadred_to_plot, signals[voxel_to_plot,acq_params_tadred_index], 'o')

plt.title('signals from voxel ' + str(voxel_to_plot))
plt.legend(('Super design', 'TADRED chosen'))
plt.ylabel('signal')
plt.xlabel('acquisition parameter')


In [None]:
# save TADRED acquisition parameters
np.save(Path(proj_dir, "acq_params_tadred.npy"), acq_params_tadred)