
# Demonstration of maximum-likelihood reconstruction with SIRF
This demonstration shows how to monitor progress of a SIRF reconstructor (currently using OSEM as an example) and implement a (simplistic) gradient-ascent algorithm using SIRF. This notebook can be extended to use regularised reconstruction as well.

Please complete the [OSEM_reconstruction notebook](OSEM_reconstruction.ipynb) first.

Authors: Kris Thielemans and Evgueni Ovtchinnikov  
First version: 8th of September 2016  
Second version: 17th of May 2018  
Third version: June 2021

CCP SyneRBI Synergistic Image Reconstruction Framework (SIRF).  
Copyright 2015 - 2017 Rutherford Appleton Laboratory STFC  
Copyright 2015 - 2018, 2021 University College London

This is software developed for the Collaborative Computational Project in Synergistic Reconstruction for Biomedical Imaging (http://www.ccpsynerbi.ac.uk/).

SPDX-License-Identifier: Apache-2.0

# A note on terminology

Because we are maximising the likelihood, SIRF generally wants to *maximise* the objective function. Many optimisation books, and CIL, are written for minimisation. To use such minimisation algorithms, you would therefore have to multiply the objective function (and hence its gradient) by `-1`.
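As a small illustration of this sign convention, here is a sketch in plain Python. The function `log_likelihood` is a toy stand-in (not a SIRF object), `negated` is a hypothetical helper, and a crude grid search plays the role of a minimisation routine.

```python
def log_likelihood(x):
    """Toy concave objective with its maximum at x = 2 (stand-in for a log-likelihood)."""
    return -(x - 2.0) ** 2

def negated(objective):
    """Wrap an objective so that a minimiser can be used to maximise it."""
    return lambda x: -objective(x)

cost = negated(log_likelihood)

# a crude grid search standing in for a real optimisation routine
grid = [i / 10 for i in range(0, 41)]
x_min = min(grid, key=cost)            # minimiser of the negated objective
x_max = max(grid, key=log_likelihood)  # maximiser of the original objective
print(x_min, x_max)
```

Both searches return the same point, which is all the sign flip is meant to achieve.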

OSEM and other algorithms use "subsets" of the data to compute an image update. `sirf.STIR` divides the data into subsets of "views" and uses the following terminology:
- `num_subsets`: the number of subsets
- `subset_num`: the subset that you're using now (range `0`...`num_subsets-1`)
- sub-iteration: one image update using one subset of the data
- `num_subiterations`: the total number of image updates used by the algorithm.

Therefore, a ("full") iteration updates the image `num_subsets` times, i.e. once for every subset. A full iteration is also called an "epoch".

OSEM et al. use "ordered subsets", i.e. they go through the subsets in a fixed order (currently not changeable in `sirf.STIR`). Other algorithms like "stochastic gradient ascent" use subsets in a random order. These are currently not illustrated here, but could easily be implemented based on the code in this notebook (by using `recon.set_subset_num`).
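The bookkeeping between sub-iterations, subsets and epochs can be sketched in plain Python. `subset_schedule` is a hypothetical helper and not part of SIRF. It only generates the sequence of `subset_num` values that an ordered or a randomised scheme would visit, and it assumes that the fixed order is simply `0, 1, ...`, which need not be the order used internally by `sirf.STIR`.

```python
import random

def subset_schedule(num_subsets, num_subiterations, shuffle=False, seed=None):
    """Return the subset number used at each sub-iteration (hypothetical helper).

    With shuffle=False the subsets are visited in the fixed order 0, 1, ...;
    with shuffle=True every epoch uses a fresh random permutation.
    """
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < num_subiterations:
        epoch = list(range(num_subsets))
        if shuffle:
            rng.shuffle(epoch)
        schedule.extend(epoch)
    return schedule[:num_subiterations]

ordered = subset_schedule(num_subsets=4, num_subiterations=10)
randomised = subset_schedule(num_subsets=4, num_subiterations=8, shuffle=True, seed=0)
print(ordered)  # [0, 1, 2, 3, 0, 1, 2, 3, 0, 1]
```

With 4 subsets, the 8 sub-iterations of the randomised schedule make up 2 epochs, and each epoch still visits every subset exactly once.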

# Initial set-up

In [None]:
#%% make sure figures appear inline and animations work
%matplotlib widget

# Setup the working directory for the notebook
import notebook_setup
from sirf_exercises import cd_to_working_dir
cd_to_working_dir('PET', 'ML_reconstruction')

In [None]:
#%% Initial imports etc
import numpy
from numpy.linalg import norm
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import os
import sys
import shutil
#import scipy
#from scipy import optimize
import sirf.STIR as pet
from sirf.Utilities import examples_data_path
from sirf_exercises import exercises_data_path

# define the directory with input files for this notebook
data_path = os.path.join(examples_data_path('PET'), 'thorax_single_slice')

In [None]:
# set-up redirection of STIR messages to files
msg_red = pet.MessageRedirector('info.txt', 'warnings.txt', 'errors.txt')

In [None]:
#%% some handy function definitions
def plot_2d_image(idx,vol,title,clims=None,cmap="viridis"):
    """Customized version of subplot to plot 2D image"""
    plt.subplot(*idx)
    plt.imshow(vol,cmap=cmap)
    if clims is not None:
        plt.clim(clims)
    plt.colorbar(shrink=.4)
    plt.title(title)
    plt.axis("off")

def make_positive(image_array):
    """truncate any negatives to zero"""
    image_array[image_array<0] = 0
    return image_array

def make_cylindrical_FOV(image):
    """truncate to cylindrical FOV"""
    # avoid shadowing the Python built-in `filter`
    cylinder_filter = pet.TruncateToCylinderProcessor()
    cylinder_filter.apply(image)

## Create some simulated data from ground-truth images
This repeats the code in the OSEM notebook so that the current notebook is self-contained. However, there are no explanations here.

You should be able to adapt the notebook to use your own data as well. The reconstruction exercises and their evaluation do not require the input to be a simulation.

In [None]:
#%% Read in images
image = pet.ImageData(os.path.join(data_path, 'emission.hv'))*0.05
attn_image = pet.ImageData(os.path.join(data_path, 'attenuation.hv'))
template = pet.AcquisitionData(os.path.join(data_path, 'template_sinogram.hs'))

In [None]:
#%% save max for future displays
cmax = image.max()*.6

In [None]:
# create attenuation
acq_model_for_attn = pet.AcquisitionModelUsingRayTracingMatrix()
asm_attn = pet.AcquisitionSensitivityModel(attn_image, acq_model_for_attn)
asm_attn.set_up(template)
attn_factors = asm_attn.forward(template.get_uniform_copy(1))
asm_attn = pet.AcquisitionSensitivityModel(attn_factors)

In [None]:
# create acquisition model
acq_model = pet.AcquisitionModelUsingRayTracingMatrix()
# we will increase the number of rays used for every Line-of-Response (LOR) as an example
# (it is not required for the exercise of course)
acq_model.set_num_tangential_LORs(5)
acq_model.set_acquisition_sensitivity(asm_attn)
# set-up
acq_model.set_up(template,image)

In [None]:
#%% simulate some data using forward projection
acquired_data=acq_model.forward(image)

## create the objective function and OSMAPOSL reconstructor

In [None]:
obj_fun = pet.make_Poisson_loglikelihood(acquired_data)
obj_fun.set_acquisition_model(acq_model)
# we could also add a prior, but we will not do that here (although the rest of the exercise would still work)
#obj_fun.set_prior(prior)

In [None]:
recon = pet.OSMAPOSLReconstructor()
recon.set_objective_function(obj_fun)
recon.set_num_subsets(4)

## create initial image

In the previous OSEM notebook, we just used a uniform image. Here, we will use a disk that roughly corresponds to the *Field of View (FOV)*. The reason for this is that it makes things easier for display and the gradient ascent code below.

An alternative solution would be to tell the `acq_model` to use a square FOV as opposed to a circular one, but that will slow down calculations just a little bit, so we won't do that here (feel free to try!).

In addition, the initial value is going to be a bit more important here as we're going to plot the value of the objective function. Obviously, having a decent estimate of the scale of the image will make that plot look more sensible. Feel free to experiment with the value!

In [None]:
initial_image=image.get_uniform_copy(cmax / 4)
make_cylindrical_FOV(initial_image)
# display
im_slice = initial_image.dimensions()[0] // 2
plt.figure()
plot_2d_image([1,1,1],initial_image.as_array()[im_slice,:,:], 'initial image',[0,cmax])

# Use the OSMAPOSL reconstructor to do all the work
This is the same as in the OSEM notebook.

In [None]:
# set up the reconstructor
num_subiters=100
recon.set_num_subiterations(num_subiters)
recon.set_up(initial_image)
# do actual recon
recon.set_current_estimate(initial_image)
recon.process()
reconstructed_image=recon.get_output()

In [None]:
plt.figure(figsize=(9, 4))
plot_2d_image([1,2,1],image.as_array()[im_slice,:,:,],'ground truth image',[0,cmax*1.2])
plot_2d_image([1,2,2],reconstructed_image.as_array()[im_slice,:,:,],'reconstructed image',[0,cmax*1.2])
plt.tight_layout();

# Taking control of the iteration process
We will now show how to run each sub-iteration from Python, as opposed to
letting the reconstructor do all sub-iterations at once.

The lines below are a bit complicated as we save the image at every update, as well as saving the objective function value. That way, we can display various things below.

In [None]:
#%% run same reconstruction but saving images and objective function values every sub-iteration
num_subiters = 64

# create an image object that will be updated during the iterations
current_image = initial_image.clone()

# create an array to store the values of the objective function at every
# sub-iteration (and fill in the first)
osem_objective_function_values = [obj_fun.value(current_image)]

# create an ndarray to store the images at every sub-iteration
all_osem_images = numpy.ndarray(shape=(num_subiters + 1,) + current_image.dimensions())
all_osem_images[0,:,:,:] = current_image.as_array()

# do the actual updates
for i in range(1, num_subiters+1):
    recon.update(current_image)
    # store results
    obj_fun_value = obj_fun.value(current_image)
    osem_objective_function_values.append(obj_fun_value)
    all_osem_images[i,:,:,:] =  current_image.as_array()

## Make some plots with these results

In [None]:
#%% define a function for plotting images and the updates
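def plot_progress(all_images, title, subiterations=None):
    """Plot each image in `all_images` and its update at the given sub-iterations"""
    if subiterations is None or len(subiterations) == 0:
        # default: show every sub-iteration
        num_subiters = all_images[0].shape[0] - 1
        subiterations = range(1, num_subiters + 1)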
    num_rows = len(all_images)

    for i in subiterations:
        plt.figure()
        for r in range(num_rows):
            plot_2d_image([num_rows,2,2 * r + 1],
                          all_images[r][i,im_slice,:,:],'%s at %d' % (title[r], i), [0,cmax])
            plot_2d_image([num_rows,2,2*r+2],
                          all_images[r][i,im_slice,:,:]-all_images[r][i - 1,im_slice,:,:],'update',[-cmax*.05,cmax*.05], cmap='seismic')
        #plt.pause(.05)
        plt.show()

In [None]:
#%% now call this function to see how we went along
# note that in the notebook interface, this might create a box with a vertical slider
subiterations = (1,2,4,8,16,32,64)
# close all "open" images as otherwise we will get warnings (the notebook interface keeps them "open" somehow)
plt.close('all')    
plot_progress([all_osem_images], ['OSEM'],subiterations)

In [None]:
#%% plot objective function values
plt.figure()
#plt.plot(subiterations, [ osem_objective_function_values[i] for i in subiterations])
plt.plot(osem_objective_function_values)
plt.title('Objective function values')
plt.xlabel('sub-iterations');

The above plot seems to indicate that (OS)EM converges to a stable value of the
log-likelihood very quickly. However, as we've seen, the images are still changing.

Convince yourself that the likelihood is still increasing (either by zooming into the figure, or by using `plt.ylim`).
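One way to do this check numerically is sketched below. The array `values` is a synthetic stand-in for `osem_objective_function_values` (an assumption for illustration, not real output of this notebook); in the notebook you would use the actual list instead.

```python
import numpy

# synthetic stand-in for osem_objective_function_values (not real output)
values = numpy.array([-5000.0, -900.0, -320.0, -310.0, -306.0, -304.5, -304.0])

# change of the objective function at every sub-iteration
increments = numpy.diff(values)
still_increasing = bool(numpy.all(increments > 0))

# y-limits that zoom in on the last few values, e.g. for plt.ylim(y_lo, y_hi)
tail = values[-4:]
margin = 0.1 * (tail.max() - tail.min())
y_lo, y_hi = tail.min() - margin, tail.max() + margin
print(still_increasing, y_lo, y_hi)
```

Plotting `increments` on a logarithmic axis is another option, as long as all increments are positive.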

We can compute some simple ROI values as well. Let's plot those.

You might want to convince yourself first that these ROI are in the correct place (but it doesn't matter too much for this exercise).

In [None]:
#%% ROI
ROI_lesion = all_osem_images[:,(im_slice,), 65:70, 40:45]
ROI_lung = all_osem_images[:,(im_slice,), 75:80, 45:50]

ROI_mean_lesion = ROI_lesion.mean(axis=(1,2,3))
ROI_std_lesion = ROI_lesion.std(axis=(1,2,3))

ROI_mean_lung = ROI_lung.mean(axis=(1,2,3))
ROI_std_lung = ROI_lung.std(axis=(1,2,3))

plt.figure()
#plt.hold('on')
plt.subplot(1,2,1)
plt.plot(ROI_mean_lesion,'k',label='lesion')
plt.plot(ROI_mean_lung,'r',label='lung')
plt.legend()
plt.title('ROI mean')
plt.xlabel('sub-iterations')
plt.subplot(1,2,2)
plt.plot(ROI_std_lesion, 'k',label='lesion')
plt.plot(ROI_std_lung, 'r',label='lung')
plt.legend()
plt.title('ROI standard deviation')
plt.xlabel('sub-iterations');

The above plots indicate that the log-likelihood is not very sensitive
to changes in the image. This is because it measures changes in the projected data, and is an illustration that image reconstruction is an ill-conditioned inverse problem.

# Implement gradient ascent and compare with OSEM
Here we will implement a simple version of gradient ascent using SIRF functions. We will use
the SIRF capability to return the gradient of the objective function directly.

Gradient ascent (GA) works by updating the image in the direction of the gradient

    new_image = current_image + step_size * gradient

Here we will use a fixed step-size and use "truncation" to enforce
non-negativity of the image.

In the code below, manipulations such as positivity are done via `numpy`, so we use `as_array()` and do all additions etc on `numpy` objects as well.
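To see the update rule at work without SIRF, here is a minimal NumPy sketch on a toy problem. The 3x2 matrix `A`, the vector `x_true` and the functions `log_likelihood` and `gradient` are illustrative assumptions that stand in for the acquisition model and the SIRF objective function. The relative step size and the truncation follow the description above, although the sketch uses a smaller `tau` than the notebook cell.

```python
import numpy
from numpy.linalg import norm

# toy problem (assumption: a 3x2 matrix standing in for the acquisition model)
A = numpy.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])
x_true = numpy.array([2.0, 4.0])
y = A @ x_true                      # noise-free "measured" data

def log_likelihood(x):
    """Poisson log-likelihood, dropping terms that do not depend on x."""
    ybar = A @ x
    return float(numpy.sum(y * numpy.log(ybar) - ybar))

def gradient(x):
    """Gradient of the log-likelihood with respect to x."""
    return A.T @ (y / (A @ x) - 1.0)

tau = 0.1                           # relative step size
x = numpy.ones(2)                   # initial estimate
values = [log_likelihood(x)]
for _ in range(40):
    g = gradient(x)
    # step size relative to the norm of the current estimate
    step_size = tau * norm(x) / norm(g)
    # ascent step, then truncation of negatives to zero
    x = numpy.maximum(x + step_size * g, 0.0)
    values.append(log_likelihood(x))

print(x, values[0], values[-1])
```

Because the step length never shrinks relative to the image norm, the iterates are expected to keep moving around the maximum rather than settle exactly on it.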

In [None]:
#%% Define some variables to perform gradient ascent for a few (sub)iterations
num_subiters = 32
# relative step-size
tau = .3

# set initial image and store it
# also store the value of the objective function for plotting
current_image = initial_image.clone()
GA_objective_function_values = [obj_fun.value(current_image)]
# create an array with all reconstructed images for plotting
idata = current_image.as_array()
all_images = numpy.ndarray(shape=(num_subiters + 1,) + idata.shape)
all_images[0,:,:,:] =  idata;

In [None]:
#%% perform GA iterations
# executing this cell might take a while
for i in range(1, num_subiters+1):  
    # obtain gradient for subset 0
    # with current settings, this means we will only use the data of that subset
    # (gradient ascent with subsets is too complicated for this demo)
    grad = obj_fun.gradient(current_image, 0)
    grad_array = grad.as_array()

    # compute step-size as relative to current image-norm
    step_size = tau * norm(idata) / norm(grad_array)

    # perform gradient ascent step and truncate to positive values
    idata = make_positive(idata + step_size*grad_array)
    current_image.fill(idata)

    # compute objective function value and store it (and the image) for plotting
    obj_fun_value = obj_fun.value(current_image)
    GA_objective_function_values.append(obj_fun_value)
    all_images[i,:,:,:] = idata;

In [None]:
#%% Plot objective function values
plt.figure()
#plt.hold('on')
plt.title('Objective function value vs subiterations')
plt.plot(GA_objective_function_values,'b')
plt.plot(osem_objective_function_values,'r')
plt.legend(('gradient ascent', 'OSEM'),loc='lower right');

In [None]:
#%% compare GA and OSEM images
plot_progress([all_images, all_osem_images], ['GA' ,'OSEM'],[2,4,8,16,32])

The above implementation used a fixed (relative) step-size. Experiment with different values for `tau` and see how that influences convergence.

Steepest ascent additionally uses a line search to estimate the step size at every update. There is a demo
of this in the SIRF code. You can [find the code here as well](https://github.com/CCPPETMR/SIRF/blob/master/examples/Python/PET/steepest_ascent.py). You could implement this here.
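As a starting point, here is a sketch of gradient ascent with a simple backtracking (Armijo-type) line search on a toy problem. This is one common way to choose a step size and not necessarily what the SIRF demo does. The matrix `A`, the data `y` and the helper `backtracking_step` are illustrative assumptions, not SIRF objects.

```python
import numpy

A = numpy.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy system matrix (assumption)
y = A @ numpy.array([2.0, 4.0])                         # noise-free toy data

def log_likelihood(x):
    """Poisson log-likelihood, dropping terms that do not depend on x."""
    ybar = A @ x
    return float(numpy.sum(y * numpy.log(ybar) - ybar))

def gradient(x):
    """Gradient of the log-likelihood with respect to x."""
    return A.T @ (y / (A @ x) - 1.0)

def backtracking_step(x, initial_step=1.0, shrink=0.5, c=1e-4, max_halvings=30):
    """One gradient-ascent update with a backtracking line search (hypothetical helper)."""
    f_x, g = log_likelihood(x), gradient(x)
    step = initial_step
    for _ in range(max_halvings):
        x_new = numpy.maximum(x + step * g, 0.0)        # truncate negatives
        # accept only if the data estimate stays positive and
        # the objective increased "enough"
        if numpy.all(A @ x_new > 0) and \
                log_likelihood(x_new) >= f_x + c * numpy.dot(g, x_new - x):
            return x_new
        step *= shrink                                  # otherwise try a smaller step
    return x                                            # no acceptable step found

x = numpy.ones(2)
values = [log_likelihood(x)]
for _ in range(20):
    x = backtracking_step(x)
    values.append(log_likelihood(x))

print(x, values[-1])
```

Unlike the fixed relative step size, an update is only accepted here if it does not decrease the objective function, at the cost of extra objective function evaluations per update.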

# Exercise: repeat this analysis with noisy data

In the above simulation, the `acquired_data` was "perfect", i.e. it was the output of the same acquisition model as used for the reconstruction *and* there was no noise in the data. In real life, you will never be so lucky!

Of course, performance of a reconstruction algorithm needs to be investigated in more realistic scenarios. We suggest that you use a Poisson realisation of the data, and then repeat the above cells.

We can use the `numpy.random.poisson` function to create a noisy realisation of the simulated data. Of course, we will need to convert the data to `numpy` first via `as_array()`.

One thing to watch out for is that Poisson statistics are (solely) determined by the mean (in contrast to the normal distribution, which has separate mean and standard deviation). Therefore, the "magnitude" of the simulated data will be very important to determine the noise level. The relevant formula for Poisson statistics is that
<center>variance = mean</center>

This exercise is set up such that the mean of the simulated data is "reasonable", so that you will get some noise in the data, but not too much. Obviously, if you use other data, you will have to check what happens. You can simply rescale the `acquired_data` (up for relatively less noise, down for more), which will then of course rescale the reconstructed images as well.
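The effect of the scale on the noise level can be checked with a few lines of NumPy. Since variance = mean, the relative noise (standard deviation divided by mean) is `1/sqrt(mean)`. The helper `relative_noise` and the mean values below are arbitrary choices for illustration and are unrelated to the counts in this notebook.

```python
import numpy

rng = numpy.random.default_rng(seed=1)   # seeded for reproducibility

def relative_noise(mean_counts, num_samples=100000):
    """Empirical std/mean of Poisson samples with the given mean (hypothetical helper)."""
    samples = rng.poisson(mean_counts, size=num_samples)
    return samples.std() / samples.mean()

# scaling the mean up by 100 reduces the relative noise by a factor of 10
for scale in (1, 10, 100):
    mean_counts = 5.0 * scale
    print('mean %6.1f: relative noise %.3f (theory %.3f)'
          % (mean_counts, relative_noise(mean_counts), mean_counts ** -0.5))
```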

In [None]:
#%% Generate a noisy realisation of the data

noisy_array=numpy.random.poisson(acquired_data.as_array()).astype('float64')
print(' Maximum counts in the data: %d' % noisy_array.max())
# stuff into a new AcquisitionData object
noisy_data = acquired_data.clone()
noisy_data.fill(noisy_array);

In [None]:
#%% Display bitmaps of the middle sinogram
plt.figure(figsize=(9, 4))
plot_2d_image([1,2,1],acquired_data.as_array()[0,im_slice,:,:,],'original',[0,acquired_data.max()])
plot_2d_image([1,2,2],noisy_array[0,im_slice,:,:,],'noisy',[0,acquired_data.max()])
plt.tight_layout();

Now we set the objective function to use the noisy data instead. The rest of the cells above wouldn't need any changes (but do check!).

In [None]:
obj_fun.set_acquisition_data(noisy_data)

## Things that you might discover
- Without any noise, the OSEM (or MLEM or gradient ascent) reconstructions looked pretty good, and increasing the number of updates is beneficial. However, with noise they will gradually deteriorate. This is a consequence of the ill-posedness of the reconstruction problem, and shows that regularisation is needed.
- With noise, the objective function no longer increases monotonically, but a clear pattern is seen in terms of the number of subsets. We recommend changing the number of subsets to see how this affects your images.<br>The underlying reason is that OSEM is *not* a convergent algorithm and (nearly always) results in a "limit-cycle". Other algorithms that use subsets and do converge exist.
- You might have to change the step-size for the gradient ascent algorithm, as also discussed above.