# Gradient descent for MR, PET and CT
This demonstration shows how to do image reconstruction using gradient descent for different modalities. 

It builds on the notebook *acquisition_model_mr_pet_ct.ipynb*. The first part of this notebook, which creates the acquisition models and simulates data from brainweb, uses the same code but with fewer comments. If anything is unclear, please refer to the other notebook for further information.

This demo is a jupyter notebook, i.e. intended to be run step by step.
You could export it as a Python file and run it in one go, but that might
make little sense as the figures are not labelled.


Author: Christoph Kolbitsch, Edoardo Pasca  
First version: 23rd of April 2021  
Updated: 26th of June 2021  

CCP SyneRBI Synergistic Image Reconstruction Framework (SIRF).  
Copyright 2015 - 2017, 2021 Rutherford Appleton Laboratory STFC.  
Copyright 2015 - 2019 University College London.   
Copyright 2021 Physikalisch-Technische Bundesanstalt.

This is software developed for the Collaborative Computational
Project in Positron Emission Tomography and Magnetic Resonance imaging
(http://www.ccppetmr.ac.uk/).

SPDX-License-Identifier: Apache-2.0

# Initial set-up

In [None]:
# Make sure figures appear inline and animations work
%matplotlib widget

# Setup the working directory for the notebook
import notebook_setup
from sirf_exercises import cd_to_working_dir
cd_to_working_dir('Synergistic', 'GD_MR_PET_CT')

In [None]:
# Initial imports etc
import numpy
from numpy.linalg import norm
import matplotlib.pyplot as plt

import os
import sys
import shutil
import brainweb
from tqdm.auto import tqdm

# Import MR, PET and CT functionality
import sirf.Gadgetron as mr
import sirf.STIR as pet
import cil.framework as ct

from sirf.Utilities import examples_data_path
from cil.plugins.astra.operators import ProjectionOperator as ap
from cil.optimisation.functions import LeastSquares

# Utilities

In [None]:
# First define some handy helper functions
# To make subsequent code cleaner, we have a few functions here. You can
# ignore them when you first see this demo.

def plot_2d_image(idx,vol,title,clims=None,cmap="viridis"):
    """Customized version of subplot to plot 2D image"""
    plt.subplot(*idx)
    plt.imshow(vol,cmap=cmap)
    if clims is not None:
        plt.clim(clims)
    plt.colorbar()
    plt.title(title)
    plt.axis("off")

def crop_and_fill(templ_im, vol):
    """Crop volumetric image data and replace image content in template image object"""
    # Get size of template image and crop
    idim_orig = templ_im.as_array().shape
    idim = (1,)*(3-len(idim_orig)) + idim_orig
    offset = (numpy.array(vol.shape) - numpy.array(idim)) // 2
    vol = vol[offset[0]:offset[0]+idim[0], offset[1]:offset[1]+idim[1], offset[2]:offset[2]+idim[2]]
    
    # Make a copy of the template to ensure we do not overwrite it
    templ_im_out = templ_im.copy()
    
    # Fill image content 
    templ_im_out.fill(numpy.reshape(vol, idim_orig))
    return(templ_im_out)

# Get brainweb data

We will download and use data from brainweb. We will use an FDG image for PET and the PET uMap for CT. MR usually provides qualitative images with an image contrast that depends on differences in T1, T2 or T2*, according to the sequence parameters. Nevertheless, we will make our life easy by directly using the T1 map provided by brainweb for MR.

In [None]:
fname, url= sorted(brainweb.utils.LINKS.items())[0]
files = brainweb.get_file(fname, url, ".")
data = brainweb.load_file(fname)

brainweb.seed(1337)

In [None]:
for f in tqdm([fname], desc="mMR ground truths", unit="subject"):
    vol = brainweb.get_mmr_fromfile(f, petNoise=1, t1Noise=0.75, t2Noise=0.75, petSigma=1, t1Sigma=1, t2Sigma=1)

In [None]:
FDG_arr  = vol['PET']
T1_arr   = vol['T1']
uMap_arr = vol['uMap']

In [None]:
# Display it
plt.figure();
slice_show = FDG_arr.shape[0]//2
plot_2d_image([1,3,1], FDG_arr[slice_show, 100:-100, 100:-100], 'FDG', cmap="hot")
plot_2d_image([1,3,2], T1_arr[slice_show, 100:-100, 100:-100], 'T1', cmap="Greys_r")
plot_2d_image([1,3,3], uMap_arr[slice_show, 100:-100, 100:-100], 'uMap', cmap="bone")

# Acquisition Models

Here we will set up the acquisition models for __MR__, __PET__ and __CT__.

## MR

In [None]:
# 1. create MR AcquisitionData
mr_acq = mr.AcquisitionData(examples_data_path('MR') + '/grappa2_1rep.h5')

In [None]:
# 2. calculate CSM
preprocessed_data = mr.preprocess_acquisition_data(mr_acq)
preprocessed_data.sort()

csm = mr.CoilSensitivityData()
csm.smoothness = 50
csm.calculate(preprocessed_data)

In [None]:
# 3. calculate image template
recon = mr.FullySampledReconstructor()
recon.set_input(preprocessed_data)
recon.process()
im_mr = recon.get_output()

In [None]:
# 4. create AcquisitionModel
acq_mod_mr = mr.AcquisitionModel(preprocessed_data, im_mr)

# Supply csm to the acquisition model 
acq_mod_mr.set_coil_sensitivity_maps(csm)

## PET

In [None]:
# 1. create PET AcquisitionData
templ_sino = pet.AcquisitionData(examples_data_path('PET') + "/thorax_single_slice/template_sinogram.hs")

In [None]:
# 2. create a template PET ImageData
im_pet = pet.ImageData(templ_sino)

In [None]:
# 3. create AcquisitionModel

# create PET acquisition model
acq_mod_pet = pet.AcquisitionModelUsingRayTracingMatrix()
acq_mod_pet.set_up(templ_sino, im_pet)

## CT

In [None]:
# 1. define AcquisitionGeometry
angles = numpy.linspace(0, 360, 50, True, dtype=numpy.float32)
ag2d = ct.AcquisitionGeometry.create_Cone2D((0,-1000), (0, 500))\
          .set_panel(128,pixel_size=3.104)\
          .set_angles(angles)

In [None]:
# 2. get ImageGeometry
ct_ig = ag2d.get_ImageGeometry()

In [None]:
# 3. create ImageData
im_ct = ct_ig.allocate(None)

In [None]:
# 4. create AcquisitionModel
acq_mod_ct = ap(ct_ig, ag2d, device='cpu')

# Simulate raw data

Here we will use the acquisition models to create simulated raw data and then do a simple reconstruction to have some initial images (i.e. starting point) for our gradient descent algorithms. For each modality we will:

 * Fill an image template (`im_mr`, `im_pet`, `im_ct`)
 * Create raw data (`raw_mr`, `raw_pet`, `raw_ct`)
 * Reconstruct an initial guess of our image using `backward`/`adjoint`
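
The roles of `forward` and `backward` can be illustrated with a small stand-alone sketch that does not use SIRF or CIL. It assumes, purely for illustration, that the acquisition model is a small random matrix `A` (a hypothetical stand-in for `acq_mod_mr`, `acq_mod_pet` or `acq_mod_ct`), so that `forward`/`direct` becomes `A @ x` and `backward`/`adjoint` becomes `A.T @ y`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy acquisition model: 6 measurements of a 4-pixel "image"
A = rng.random((6, 4))

def forward(x):
    """Image -> raw data (plays the role of forward/direct)."""
    return A @ x

def backward(y):
    """Raw data -> image (plays the role of backward/adjoint)."""
    return A.T @ y

x_true = np.array([1.0, 2.0, 0.0, 3.0])   # "ground truth" image
raw = forward(x_true)                      # simulated raw data
bwd = backward(raw)                        # initial guess with the shape of x_true

# The adjoint is not the inverse, so bwd generally differs from x_true.
# What it does satisfy is the adjoint identity <A x, y> = <x, A^T y>.
y = rng.random(6)
lhs = np.dot(forward(x_true), y)
rhs = np.dot(x_true, backward(y))
print(bwd.shape, np.isclose(lhs, rhs))
```

Because the adjoint only maps the data back into image space without undoing the acquisition, the result is a rough starting point rather than a finished reconstruction.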

In [None]:
# MR
im_mr = crop_and_fill(im_mr, T1_arr)
raw_mr = acq_mod_mr.forward(im_mr)
bwd_mr = acq_mod_mr.backward(raw_mr)

# PET
im_pet = crop_and_fill(im_pet, FDG_arr)
raw_pet = acq_mod_pet.forward(im_pet)
bwd_pet = acq_mod_pet.backward(raw_pet)

# CT
im_ct = crop_and_fill(im_ct, uMap_arr)
raw_ct = acq_mod_ct.direct(im_ct)
bwd_ct = acq_mod_ct.adjoint(raw_ct)

# Gradient descent or ascent

We need basically two things to run a gradient descent algorithm. First, we need an objective function (`obj_fun`) which measures how well the raw data predicted from our current image estimate matches the acquired raw data. Second, we need the gradient of the objective function (`obj_fun_grad`), because it tells us how to update our current image estimate in order to decrease the value of the objective function (or to increase it, in the case of gradient ascent on a log-likelihood).
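
As a minimal sketch of these two ingredients, the following NumPy example uses a least-squares objective for a hypothetical toy acquisition model `A` (a random matrix, not one of the SIRF or CIL models). It checks the analytic gradient against finite differences and confirms that a small step against the gradient lowers the objective.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((8, 5))          # hypothetical toy acquisition model
x_true = rng.random(5)
raw = A @ x_true                # noise-free simulated raw data

def obj_fun(x):
    """Least-squares objective 0.5 * ||A x - raw||^2."""
    r = A @ x - raw
    return 0.5 * np.dot(r, r)

def obj_fun_grad(x):
    """Gradient of the objective: A^T (A x - raw)."""
    return A.T @ (A @ x - raw)

x = rng.random(5)               # some current image estimate

# Check the analytic gradient against central finite differences
eps = 1e-6
fd = np.array([(obj_fun(x + eps * e) - obj_fun(x - eps * e)) / (2 * eps)
               for e in np.eye(5)])
grad_ok = np.allclose(fd, obj_fun_grad(x), atol=1e-6)

# A small step against the gradient decreases the objective
step = 1e-3
decreased = obj_fun(x - step * obj_fun_grad(x)) < obj_fun(x)
print(grad_ok, decreased)
```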

Both `obj_fun` and `obj_fun_grad` are modality specific and so here we will go through all modalities and define them. Let's start with __PET__ .

## PET

The noise in __PET__ follows a Poisson distribution and hence we can use a Poisson log-likelihood as our objective function. Luckily enough this is already part of __SIRF__ and hence we can simply create the objective function by providing the raw __PET__ data and __PET__ image data.
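
To make the idea concrete, here is a stand-alone NumPy sketch of a Poisson log-likelihood and its gradient. It assumes a hypothetical strictly positive system matrix `A` and drops the term that does not depend on the image; the SIRF/STIR objective function may include further terms (for example background or sensitivity), so this is only an illustration of the principle. Note that a log-likelihood is maximised, so we step along the gradient.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.random((10, 4)) + 0.1            # hypothetical positive system matrix
x_true = np.array([5.0, 1.0, 3.0, 2.0])
raw = rng.poisson(A @ x_true).astype(float)   # Poisson-distributed counts

def poisson_loglik(x):
    """sum_i raw_i * log((A x)_i) - (A x)_i, up to an image-independent constant."""
    ybar = A @ x
    return np.sum(raw * np.log(ybar) - ybar)

def poisson_loglik_grad(x):
    """Gradient: A^T (raw / (A x) - 1)."""
    ybar = A @ x
    return A.T @ (raw / ybar - 1.0)

x = np.ones(4)                           # positive current estimate

# Check the analytic gradient against central finite differences
eps = 1e-6
fd = np.array([(poisson_loglik(x + eps * e) - poisson_loglik(x - eps * e)) / (2 * eps)
               for e in np.eye(4)])
grad_ok = np.allclose(fd, poisson_loglik_grad(x), rtol=1e-4, atol=1e-6)

# A small step along the gradient increases the log-likelihood (ascent)
step = 1e-3
increased = poisson_loglik(x + step * poisson_loglik_grad(x)) > poisson_loglik(x)
print(grad_ok, increased)
```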

In [None]:
obj_fun_pet_sirf = pet.make_Poisson_loglikelihood(raw_pet)
obj_fun_pet_sirf.set_up(bwd_pet)

Because `obj_fun_pet_sirf` cannot be called directly and `obj_fun_pet_sirf.value()` has to be used instead, we write a quick wrapper around it:

In [None]:
def obj_fun_pet(curr_image_estimate):
    return(obj_fun_pet_sirf.value(curr_image_estimate))

The gradient is also already implemented and can simply be calculated by calling `obj_fun_pet_sirf.gradient`. Nevertheless, to be more explicit and consistent with the other modalities we will define a new function to do the job:

In [None]:
def obj_fun_grad_pet(curr_image_estimate):
    # The 0 here means, only the gradient for subset 0 is returned. 
    # We will just accept this as is here, because subsets are too advanced for this demo.
    return(obj_fun_pet_sirf.gradient(curr_image_estimate, 0))

In __PET__ (and also in __CT__) we need to make sure that all the image values are non-negative, so we will create a small function for this:
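
In plain NumPy terms, this constraint amounts to an element-wise maximum with zero. The helper name `make_positive_numpy` below is hypothetical and operates on arrays rather than on SIRF or CIL image objects.

```python
import numpy as np

def make_positive_numpy(arr):
    """Set negative values to zero; non-negative values are left unchanged."""
    return np.maximum(arr, 0.0)

estimate = np.array([[0.5, -0.2],
                     [-1.0, 2.0]])
projected = make_positive_numpy(estimate)
print(projected)
```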

In [None]:
def make_positive(image_data):
    # The idea is to create an image with all 0s (zero_image) and then we can take the maximum over each voxel.
    # If the voxel value is larger than 0, then the original value is returned. If it is smaller than 0, then 
    # the value of the zero_image, i.e. 0, is returned.
    zero_image = image_data.clone()
    zero_image.fill(0.0)
    image_data = image_data.maximum(zero_image)
    return image_data

Great, __PET__ is all done, now we will continue with __CT__ .

## CT

For __CT__ we use a least squares objective function which is already available from __CIL__ . 

In [None]:
least_squares_cil = LeastSquares(acq_mod_ct, raw_ct)

To make sure we have the same function interface as for __PET__ and __MR__ we will also quickly wrap these functions:

In [None]:
def obj_fun_ct(curr_image_estimate):
    return(least_squares_cil(curr_image_estimate))

In [None]:
def obj_fun_grad_ct(curr_image_estimate):
    # We are returning the negative gradient, because we are going to add it later to our image estimate
    return(-least_squares_cil.gradient(curr_image_estimate))

## MR

And last but not least __MR__ . If you want to know more about the objective function of __MR__ and its gradient, then please have a look at the notebook *d_undersampled_reconstructions.ipynb*.
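
For complex-valued data such as MR, the same least-squares idea applies, with the adjoint being the conjugate transpose. The sketch below assumes a hypothetical complex encoding matrix `E` in place of the MR acquisition model and takes one step along the negative gradient with a step size of one over the largest eigenvalue of `E^H E`, which is a safe choice for this quadratic objective.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical complex-valued encoding matrix and image
E = rng.standard_normal((12, 6)) + 1j * rng.standard_normal((12, 6))
x_true = rng.standard_normal(6) + 1j * rng.standard_normal(6)
raw = E @ x_true

def obj_fun(x):
    """0.5 * ||E x - raw||^2"""
    c = E @ x - raw
    return 0.5 * np.linalg.norm(c) ** 2

def neg_grad(x):
    """Negative gradient -E^H (E x - raw); E^H is the conjugate transpose."""
    return -E.conj().T @ (E @ x - raw)

x = np.zeros(6, dtype=complex)
before = obj_fun(x)

# Step size 1 / (largest singular value of E)^2
step = 1.0 / np.linalg.norm(E, 2) ** 2
x = x + step * neg_grad(x)
after = obj_fun(x)
print(after < before)
```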

In [None]:
def obj_fun_mr(curr_image_estimate):
    c =  acq_mod_mr.forward(curr_image_estimate) - raw_mr
    return(0.5 * c.norm() ** 2)

In [None]:
def obj_fun_grad_mr(curr_image_estimate):
    # We are returning the negative gradient, because we are going to add it later to our image estimate
    return(-acq_mod_mr.backward(acq_mod_mr.forward(curr_image_estimate) - raw_mr))

Now that we have all our `obj_fun` and `obj_fun_grad` functions, we will select one modality and then implement the gradient descent/ascent approach. We also need an image `init_image` to start with. Here we will simply use the simple reconstruction which we did above.

In [None]:
curr_modality = 'pet' # pet, ct, mr

if curr_modality.lower() == 'pet':
    obj_fun = obj_fun_pet
    obj_fun_grad = obj_fun_grad_pet
    init_image = bwd_pet
elif curr_modality.lower() == 'ct':
    obj_fun = obj_fun_ct
    obj_fun_grad = obj_fun_grad_ct
    init_image = bwd_ct  
elif curr_modality.lower() == 'mr':
    obj_fun = obj_fun_mr
    obj_fun_grad = obj_fun_grad_mr
    init_image = bwd_mr  
else:
    raise NameError('{:} not recognised'.format(curr_modality,))   

Now we come to probably the most important parameter for gradient descent, the infamous __step-size__ . Unfortunately, the gradient only gives us the direction along which we need to update the image, but does not tell us by how much we have to go in this direction. Therefore, we need to define by hand how far we step along the gradient direction in each iteration. 

To make sure the step-size is adapted to each modality as much as possible, we won't define the step-size directly, but we will calculate it as `tau` times the norm of the current image estimate divided by the norm of the gradient. Nevertheless, this is still just a guess. 

If the step-size is too small, we have very slow convergence. If the step-size is too large, then we are overstepping our target image and the objective function starts to oscillate. The value below is reasonable for __PET__, but for the other modalities you will have to optimise it to ensure good convergence. 
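
The effect of the relative step-size can be tried out on a toy problem without SIRF or CIL. The sketch below assumes a hypothetical objective `0.5 * ||x - target||^2` with a made-up two-element `target`, uses the relative step-size rule described above, and runs the iteration for a small, a moderate and a very large `tau`.

```python
import numpy as np

# Hypothetical toy problem: minimise 0.5 * ||x - target||^2
target = np.array([1.0, 2.0])

def obj_fun(x):
    return 0.5 * np.linalg.norm(x - target) ** 2

def obj_fun_grad(x):
    # Negative gradient, so that it can be added to the estimate
    return -(x - target)

def run(tau, num_iters=24):
    x = 3.0 * target                       # deliberately poor starting point
    values = [obj_fun(x)]
    for _ in range(num_iters):
        grad = obj_fun_grad(x)
        grad_norm = np.linalg.norm(grad)
        if grad_norm == 0:                 # already at the optimum
            break
        # Step-size relative to the norm of the current estimate
        step_size = tau * np.linalg.norm(x) / grad_norm
        x = x + step_size * grad
        values.append(obj_fun(x))
    return np.array(values)

values_small = run(tau=0.01)   # tiny steps: slow progress
values_mid = run(tau=0.3)      # moderate steps
values_large = run(tau=2.0)    # huge steps: overshoots the target

for name, v in [("small", values_small), ("mid", values_mid), ("large", values_large)]:
    print(f"tau {name}: first {v[0]:.3f}, last {v[-1]:.3f}, max {v.max():.3f}")
```

With this normalised step, every update moves the estimate by a distance of `tau` times its norm. The iterates can therefore only approach the target to within roughly that distance, which is one reason why the objective function value may level off or oscillate rather than keep decreasing.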

In [None]:
# Relative step-size
tau = .3

The second most important parameter is of course the number of iterations. 

In [None]:
# Number of iterations
num_iters = 24

Then we need to create an image as a starting point for the gradient descent algorithm and we will also create a numpy array, where we can store the image estimate at each iteration

In [None]:
# Initial image
curr_image_estimate = init_image.clone()

# Array for images at each iteration
image_iteration = numpy.ndarray((num_iters + 1,), dtype=object)

# Store the initial image estimate (numpy.abs is applied later, when plotting, to be compatible with MR complex data)
image_iteration[0] = curr_image_estimate

# Variable to store the current value of the objective function
obj_func_values = numpy.zeros(shape=(num_iters + 1,))
obj_func_values[0] = obj_fun(curr_image_estimate)

Now we can finally write down our gradient descent algorithm

In [None]:
for i in range(1, num_iters+1):  
    # First we calculate the gradient to find out how we need to update our current image estimate
    grad = obj_fun_grad(curr_image_estimate)

    # Compute step-size relative to current image-norm
    step_size = tau * curr_image_estimate.norm() / grad.norm()

    # Perform the update step (ascent for PET; for CT and MR the negative gradient is added, i.e. descent)
    curr_image_estimate = curr_image_estimate + step_size * grad

    # In PET and CT we have to ensure values are positive. 
    if curr_modality.lower() == 'ct' or curr_modality.lower() == 'pet':
        curr_image_estimate = make_positive(curr_image_estimate)
    
    # Compute objective function value for plotting
    obj_func_values[i] = obj_fun(curr_image_estimate)
    
    # Store the current image estimate (numpy.abs is applied later, when plotting, to be compatible with MR complex data)
    image_iteration[i] = curr_image_estimate

In [None]:
# Plot objective function values
plt.figure()
plt.title('Objective function value')
plt.plot(obj_func_values, 'o-b')

In [None]:
# Plot a slice for different iterations
plt.figure();
for i in range(4):
    curr_it = i*8
    image = numpy.abs(image_iteration[curr_it].as_array())
    centre_slice = image.shape[0]//2
    if len(image.shape) == 3: # PET, MR
        plot_2d_image([2,2,i+1], image[centre_slice,:,:], 'It '+str(curr_it), cmap="viridis")
    else: # CT
        plot_2d_image([2,2,i+1], image, 'It '+str(curr_it), cmap="viridis")    