# *SuperTomo2* fusion testing

The whole image fusion code was rewritten in *SuperTomo2* to take advantage of new features in the Anaconda Python distribution. The fusion code can now process images of any size by dividing them into several smaller blocks. GPU acceleration was added as well.

In [1]:
%matplotlib inline
import os
import numpy
from ipywidgets import interact, fixed

from supertomo.ui import arguments                     # Command line arguments
from supertomo.io import image_data                    # Data structure
from supertomo.reconstruction import fusion            # Image fusion functions
from supertomo.ui import show
from supertomo.utils import itkutils

from accelerate import profiler



The same command line layout is used as in all the other *SuperTomo2* scripts. Here the parameters are simulated with a string. The *hela_full_test.hdf5* dataset contains the HeLa cell images from the first SuperTomo publication.

The full-size images are used here. The image fusion is divided into 24 blocks, and each block is padded with 125 pixels on each side to avoid fusion artefacts.
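
The block-and-pad idea can be sketched with plain NumPy. This is only an illustration under assumptions: `iter_padded_blocks` and `neighbour_sum` are hypothetical helpers, not part of *SuperTomo2*, and the regular grid layout is a guess at how such a split could be organised. Each block is processed together with its padding, and only the unpadded core is written back, so the block borders leave no seams in the result.

```python
import itertools
import numpy


def iter_padded_blocks(shape, blocks_per_axis, pad):
    """Yield (core, padded) slice tuples for a regular grid of blocks.

    Hypothetical helper for illustration only; not the SuperTomo2 code.
    """
    axis_ranges = []
    for size, count in zip(shape, blocks_per_axis):
        # Evenly spaced block edges along this axis, truncated to integers.
        edges = numpy.linspace(0, size, count + 1).astype(int)
        axis_ranges.append(list(zip(edges[:-1], edges[1:])))

    for block in itertools.product(*axis_ranges):
        core = tuple(slice(int(start), int(stop)) for start, stop in block)
        # Grow the block by `pad` pixels per side, clipped to the image bounds.
        padded = tuple(
            slice(max(int(start) - pad, 0), min(int(stop) + pad, size))
            for (start, stop), size in zip(block, shape)
        )
        yield core, padded


def neighbour_sum(volume):
    """Toy local filter: each voxel plus its two neighbours along the last axis."""
    widths = [(0, 0)] * (volume.ndim - 1) + [(1, 1)]
    padded = numpy.pad(volume, widths, mode="constant")
    return padded[..., :-2] + padded[..., 1:-1] + padded[..., 2:]


volume = numpy.arange(4 * 6 * 8, dtype=numpy.float32).reshape(4, 6, 8)
result = numpy.zeros_like(volume)
n_blocks = 0

for core, padded in iter_padded_blocks(volume.shape, (1, 2, 4), pad=1):
    processed = neighbour_sum(volume[padded])
    # Location of the core region inside the padded block.
    inner = tuple(slice(c.start - p.start, c.stop - p.start)
                  for c, p in zip(core, padded))
    result[core] = processed[inner]
    n_blocks += 1

# The block-wise result matches processing the whole volume at once.
print(n_blocks, numpy.array_equal(result, neighbour_sum(volume)))
```

The padding only has to cover the reach of the operation applied to each block. The toy filter here reaches one pixel, so `pad=1` is enough, whereas the fusion run above uses 125 pixels.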

In [2]:
fuse_args = ("hela_full_test.hdf5 --dir=/home/sami/Data/SuperTomo2/Import "  
             "--max-nof-iterations=1  --first-estimate=constant " 
             "--fusion-method=summative --scale=100 --blocks=24 --pad=125").split()
            
print(fuse_args)

options = arguments.get_fusion_script_options(fuse_args)




['hela_full_test.hdf5', '--dir=/home/sami/Data/SuperTomo2/Import', '--max-nof-iterations=1', '--first-estimate=constant', '--fusion-method=summative', '--scale=100', '--blocks=24', '--pad=125']


In [3]:
full_path = os.path.join(options.working_directory,
                         options.data_file)

# Fail early if the data file is missing or is not an HDF5 file
if not os.path.isfile(full_path):
    raise IOError("No such file: %s" % full_path)
elif not full_path.endswith(".hdf5"):
    raise ValueError("Not an HDF5 file: %s" % full_path)

data = image_data.ImageData(full_path)

## MKLFFT Fusion on CPU

The old image fusion code was rewritten to take advantage of the MKL-optimized FFT algorithms in the Anaconda package. All the algorithms from *SuperTomo1* are still available.

In [4]:
task = fusion.MultiViewFusionRL(data, options)

The original image size is 198 1024 1024




The fusion will be run with 24 blocks
The internal image size is 200 1026 1024


The image fusion can now be run on a regular computer, which was not previously possible because large images require huge amounts of memory. However, as one could expect, the going is slow: calculating one estimate takes 5470 s (about 1.5 h). The performance can be improved by reducing the number of blocks, although the amount of available memory limits how far that can go. Most of the time is spent in the FFT functions. The MKL optimizations work, but they do not work miracles.
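
To show why the FFT functions dominate the run time, here is a minimal single-view sketch of a Richardson-Lucy style update. It rests on assumptions: the class name `MultiViewFusionRL` suggests such an algorithm, but this is not the *SuperTomo2* implementation, and `gaussian_psf`, `fft_convolve` and `rl_update` are hypothetical names. Every update needs two forward and two inverse FFTs of the full block.

```python
import numpy


def gaussian_psf(shape, sigma):
    """Normalised Gaussian PSF centred in an array of the given shape."""
    grids = numpy.meshgrid(*[numpy.arange(n) - n // 2 for n in shape],
                           indexing="ij")
    psf = numpy.exp(-sum(g ** 2 for g in grids) / (2.0 * sigma ** 2))
    return psf / psf.sum()


def fft_convolve(volume, kernel_fft):
    """Circular convolution with a kernel given in the frequency domain."""
    axes = tuple(range(volume.ndim))
    spectrum = numpy.fft.rfftn(volume, axes=axes) * kernel_fft
    return numpy.fft.irfftn(spectrum, s=volume.shape, axes=axes)


def rl_update(estimate, image, psf_fft, eps=1e-6):
    """One Richardson-Lucy update (illustrative sketch, single view)."""
    # Blur the current estimate with the PSF (one forward, one inverse FFT).
    blurred = fft_convolve(estimate, psf_fft)
    ratio = image / numpy.maximum(blurred, eps)
    # Correlate the ratio with the PSF: multiply by the conjugate spectrum.
    correction = fft_convolve(ratio, numpy.conj(psf_fft))
    return estimate * correction


rng = numpy.random.RandomState(0)
truth = rng.rand(8, 16, 16) + 1.0
psf = gaussian_psf(truth.shape, sigma=1.5)
psf_fft = numpy.fft.rfftn(numpy.fft.ifftshift(psf))
image = fft_convolve(truth, psf_fft)       # simulated blurred observation

estimate = numpy.ones_like(truth)          # constant first estimate
updated = rl_update(estimate, image, psf_fft)
```

With a normalised PSF, the update starting from a constant estimate preserves the total intensity of the observed image, and the true object is a fixed point of the update. Both properties are handy sanity checks when comparing a CPU and a GPU implementation.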

In [5]:

task.estimate = numpy.ones(task.image_size, dtype=numpy.float32)
#cProfile.run('task.compute_estimate()')
p = profiler.Profile(signatures=False)
p.enable()
task.compute_estimate()
p.disable()
profiler.plot(p)

Beginning the computation of the 0. estimate
The current block is 1


  cache = block / cache


The current block is 2
The current block is 3
The current block is 4
The current block is 5
The current block is 6
The current block is 7
The current block is 8
The current block is 9
The current block is 10
The current block is 11
The current block is 12
The current block is 13
The current block is 14
The current block is 15
The current block is 16
The current block is 17
The current block is 18
The current block is 19
The current block is 20
The current block is 21
The current block is 22
The current block is 23
The current block is 24


## CUDA fusion on a GPU

Hardware acceleration features were added to make the image fusion times more reasonable. With 24 blocks, calculating a single estimate on a GPU takes 157 seconds (about 2.6 minutes), an improvement of ~35 times over the non-accelerated code. **The CUDA code can be run with as few as 8 blocks, which further reduces the time to ~110 seconds per iteration, an improvement of ~50 times**.
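
The speed-up factors can be checked directly from the timings quoted in this notebook. The snippet below is plain arithmetic on those three numbers; the variable names are only illustrative.

```python
# Measured times for one estimate, taken from the text of this notebook.
cpu_seconds = 5470.0      # MKL FFT on the CPU, 24 blocks
gpu_24_seconds = 157.0    # CUDA, 24 blocks
gpu_8_seconds = 110.0     # CUDA, 8 blocks (approximate)

speedup_24 = cpu_seconds / gpu_24_seconds
speedup_8 = cpu_seconds / gpu_8_seconds

print("GPU, 24 blocks: %.1f min, %.1fx faster" % (gpu_24_seconds / 60, speedup_24))
print("GPU,  8 blocks: %.1f min, %.1fx faster" % (gpu_8_seconds / 60, speedup_8))
```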

In [6]:
from supertomo.reconstruction import fusion_cuda

task2 = fusion_cuda.MultiViewFusionRLCuda(data, options)

The original image size is 198 1024 1024
The fusion will be run with 24 blocks
The internal image size is 200 1026 1024
kernel config: (24, 19, 38) x (32, 32, 8)


In [7]:

task2.estimate = numpy.ones(task2.image_size, dtype=numpy.float32)
p2 = profiler.Profile(signatures=False)
p2.enable()
task2.compute_estimate()
p2.disable()
profiler.plot(p2)
#cProfile.run('task2.compute_estimate()')

Beginning the computation of the 0. estimate


In [8]:
data.close()