# CAMERA Workshop 2019

Lawrence Berkeley National Laboratory - LBNL

* Support material for the tutorial _Image processing for microCT using scikit-image (Part I)_.

This tutorial introduces the analysis of three-dimensional stacked and volumetric
images in Python, mainly using scikit-image. Here we will learn how to:
  * pre-process data using filtering, binarization and segmentation techniques.
  * inspect, count and measure attributes of objects and regions of interest in the data.
  * visualize 3D data.

Please prepare for the tutorial by [installing the prerequisite software](preparation.md) beforehand.

For more info:
  * [[CAMERA Workshop 2019]](http://microct.lbl.gov/cameratomo2019/)
  * [[scikit-image]](https://scikit-image.org/)


## Using joblib to parallelize code

In [3]:
import numpy as np
from skimage import restoration, data, io

In [3]:
filename = "../data/cells.tif"

cells = io.imread(filename)

In [4]:
def bilateral_classic_loop():
    cells_bilateral = np.empty_like(cells)
    for plane, image in enumerate(cells):
        cells_bilateral[plane] = restoration.denoise_bilateral(image, multichannel=False)
    return cells_bilateral

%timeit bilateral_classic_loop()

3.55 s ± 24.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [5]:
from joblib import Parallel, delayed

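# With n_jobs=-2, all CPUs but one are used.

def bilateral_joblib_loop():
    # Denoise each plane of `cells` in a separate joblib task.
    cells_bilateral = Parallel(n_jobs=-2)(
        delayed(restoration.denoise_bilateral)(plane, multichannel=False)
        for plane in cells)
    # Stack the per-plane results into a single array and return it.
    return np.asarray(cells_bilateral)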

%timeit bilateral_joblib_loop()

1.42 s ± 42.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
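
The pattern used above (map one function over independent planes, collect the results in order) can be sketched with the standard library alone. This is a stand-in for illustration, not joblib itself: `effective_workers`, `invert_plane` and the synthetic `stack` are hypothetical names and data, and the thread pool only shows the structure of the computation, not the speed-up joblib achieves with worker processes.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def effective_workers(n_jobs, n_cpus):
    """Worker count for a given n_jobs, following joblib's documented rule
    that negative values mean n_cpus + 1 + n_jobs (so -1 is all CPUs).
    Clamping to at least one worker is an assumption of this sketch."""
    if n_jobs < 0:
        return max(1, n_cpus + 1 + n_jobs)
    return n_jobs


def invert_plane(plane):
    """Hypothetical per-plane operation standing in for a real filter."""
    return [[255 - value for value in row] for row in plane]


# A tiny synthetic "stack": 4 planes of 2 x 3 pixels (made-up values).
stack = [[[p * 10 + r * 3 + c for c in range(3)] for r in range(2)]
         for p in range(4)]

# Serial version: one plane after another.
serial = [invert_plane(plane) for plane in stack]

# Parallel version: the same function mapped over the planes by a pool.
n_workers = effective_workers(-2, os.cpu_count() or 1)
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    parallel = list(pool.map(invert_plane, stack))

print(parallel == serial)
```

Because each plane is processed independently, the parallel result matches the serial one regardless of how many workers are used.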


## Using numba to "compile" code

Numba is a just-in-time (JIT) compiler that translates a subset of Python and NumPy code into fast machine code.

The most basic way to use Numba is through the `@jit` *decorator*. If Numba is not installed yet, install it first:

In [12]:
!conda install numba --yes


In [14]:
from numba import jit

This way, Numba decides what it should optimize in your functions. To show the speed-up this **decorator** provides, let's define two versions of a dummy Fibonacci function. One, `fibonacci_no_numba`, does not use Numba:

In [15]:
def fibonacci_no_numba(elem=10):
    """Return the first `elem` Fibonacci numbers as a float array.

    Assumes elem >= 2.
    """
    fibonacci = np.zeros(elem)
    aux_1, aux_2 = 1, 1
    fibonacci[0: 2] = aux_1, aux_2

    for idx in range(2, elem):
        fibo_current = aux_1 + aux_2
        fibonacci[idx] = fibo_current
        aux_1 = aux_2
        aux_2 = fibo_current
    return fibonacci

The other, `fibonacci_numba`, uses Numba to optimize the function.

In [16]:
fibonacci_numba = jit()(fibonacci_no_numba)
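
The call `jit()(fibonacci_no_numba)` is the function-call form of a decorator: it wraps an existing function and leaves the original available for comparison. A minimal sketch of the two equivalent spellings, using hypothetical `sum_of_squares_*` functions. The `ImportError` fallback is an assumption added only so that the sketch still runs where Numba is not installed; it does no compilation.

```python
try:
    from numba import jit
except ImportError:
    # Fallback for environments without Numba: a do-nothing decorator
    # that accepts both the `@jit` and the `@jit()` spellings.
    def jit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


def sum_of_squares_plain(n):
    """Pure-Python reference implementation."""
    total = 0
    for i in range(n):
        total += i * i
    return total


@jit
def sum_of_squares_decorated(n):
    """Same loop, compiled via the decorator syntax."""
    total = 0
    for i in range(n):
        total += i * i
    return total


# Function-call form: wrap the existing function, keep the original too.
sum_of_squares_wrapped = jit()(sum_of_squares_plain)

print(sum_of_squares_plain(10),
      sum_of_squares_decorated(10),
      sum_of_squares_wrapped(10))
```

The function-call form is convenient here because both the compiled and the uncompiled version remain callable side by side.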

Check that both functions produce the same output:

In [17]:
elem = 9
print(f'* The {elem} first elements of the Fibonacci sequence, using fibonacci_no_numba: {fibonacci_no_numba(elem=elem)}')
print(f'* The {elem} first elements of the Fibonacci sequence, using fibonacci_numba: {fibonacci_numba(elem=elem)}')

* The 9 first elements of the Fibonacci sequence, using fibonacci_no_numba: [ 1.  1.  2.  3.  5.  8. 13. 21. 34.]
* The 9 first elements of the Fibonacci sequence, using fibonacci_numba: [ 1.  1.  2.  3.  5.  8. 13. 21. 34.]


Now we use the Jupyter Notebook magic `%timeit` to measure how fast these functions return a Fibonacci sequence with 1000 elements. Keep `elem` at about this size: for much longer sequences (`elem=10000`, for example), the pure-Python version raises `OverflowError: int too large to convert to float`, because its Python integers outgrow the float64 array they are stored in.

In [20]:
time_no_numba = %timeit -o fibonacci_no_numba(elem=1000)

In [21]:
time_numba = %timeit -o fibonacci_numba(elem=1000)


In [None]:
print(f'Numba version is around {int(time_no_numba.best / time_numba.best)} times faster than non-Numba one.')
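
`fibonacci_no_numba` accumulates arbitrary-precision Python integers and stores them in a float64 array, which puts an upper bound on `elem`. The sketch below locates that bound with the standard library only; `first_unconvertible_index` is a hypothetical helper written for this illustration.

```python
def first_unconvertible_index():
    """Return the 0-based index of the first Fibonacci term (sequence
    1, 1, 2, 3, ...) that is too large to be converted to a float."""
    aux_1, aux_2 = 1, 1
    idx = 1  # aux_2 currently holds the term at index 1
    while True:
        try:
            float(aux_2)
        except OverflowError:
            return idx
        aux_1, aux_2 = aux_2, aux_1 + aux_2
        idx += 1


limit = first_unconvertible_index()
print(f"Terms up to index {limit - 1} fit in a float; index {limit} does not.")
```

Any `elem` larger than `limit` makes the pure-Python function fail, since it would have to store the term at index `limit`.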

# Using Dask and scikit-image together

In [None]:
# https://nbviewer.jupyter.org/gist/mrocklin/ec745d6c2a12dddddb125ef460a4da76
# https://drive.google.com/file/d/0Byr4wsGTdf46a3NTV2V0NkZBTXc/view
# http://emmanuelle.github.io/segmentation-of-3-d-tomography-images-with-python-and-scikit-image.html

In [4]:
data = np.fromfile('../data/Al8Cu_1000_g13_t4_200_250.vol', 
                    dtype=np.float32).reshape((50, 1104, 1104))

In [5]:
from skimage.filters import gaussian

%time _ = gaussian(data, sigma=3)

CPU times: user 2.76 s, sys: 57.1 ms, total: 2.82 s
Wall time: 2.82 s


## Parallelize with Dask.array

We split this array into nine chunks (3 × 3) along the two large spatial dimensions, leaving the short dimension (50) whole.

We then map `skimage.filters.gaussian` over each block with an overlap of 9 voxels (three times `sigma`), which should give the Gaussian filter enough context to stay smooth across chunk boundaries.
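
Why the overlap matters can be seen in a one-dimensional analogue. The sketch below is plain Python, not Dask: `moving_average` and `filter_in_chunks` are hypothetical helpers, a moving average stands in for the Gaussian filter, and the input values are made up. Each chunk is extended by `depth` neighbours on both sides, filtered, and trimmed back to its original extent.

```python
def moving_average(values, radius):
    """Mean over a window of +/- radius, truncated at the ends of the input."""
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def filter_in_chunks(values, chunk, depth, radius):
    """Filter chunk by chunk, giving each chunk `depth` extra neighbours."""
    n = len(values)
    out = []
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        lo, hi = max(0, start - depth), min(n, stop + depth)
        filtered = moving_average(values[lo:hi], radius)
        # Trim the halo so only the chunk's own elements are kept.
        out.extend(filtered[start - lo:start - lo + (stop - start)])
    return out


values = [float(i * i % 7) for i in range(30)]
radius = 2

whole = moving_average(values, radius)
with_overlap = filter_in_chunks(values, chunk=10, depth=radius, radius=radius)
no_overlap = filter_in_chunks(values, chunk=10, depth=0, radius=radius)

print(with_overlap == whole)  # overlap covers the filter radius
print(no_overlap == whole)    # chunk borders see truncated windows
```

With an overlap at least as large as the filter radius, the chunked result matches filtering the whole signal at once; without it, values near the chunk borders differ.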

In [10]:
import dask.array as da

%time
x = da.from_array(data, chunks=(50, data.shape[1] // 3, data.shape[2] // 3), name=False)
y = x.map_overlap(gaussian, depth=9, sigma=3, boundary='none')

CPU times: user 14 µs, sys: 2 µs, total: 16 µs
Wall time: 29.6 µs


In [11]:
y

|       | Array            | Chunk          |
|-------|------------------|----------------|
| Bytes | 243.76 MB        | 27.08 MB       |
| Shape | (50, 1104, 1104) | (50, 368, 368) |
| Count | 86 Tasks         | 9 Chunks       |
| Type  | float32          | numpy.ndarray  |
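
The byte sizes, chunk shape and chunk count in the summary above follow directly from the array shape and the chunking used. A short arithmetic check in plain Python, with no Dask involved; `n_elements` is a hypothetical helper.

```python
from math import ceil

shape = (50, 1104, 1104)
itemsize = 4  # bytes per float32 value
chunks = (50, shape[1] // 3, shape[2] // 3)


def n_elements(dims):
    """Product of the given dimensions."""
    total = 1
    for d in dims:
        total *= d
    return total


# Number of chunks along each axis, then in total.
chunks_per_axis = [ceil(s / c) for s, c in zip(shape, chunks)]
n_chunks = n_elements(chunks_per_axis)

# Sizes in megabytes (10**6 bytes).
array_mb = n_elements(shape) * itemsize / 1e6
chunk_mb = n_elements(chunks) * itemsize / 1e6

print(chunks, n_chunks, round(array_mb, 2), round(chunk_mb, 2))
```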


In [12]:
%time _ = y.compute()

CPU times: user 4.86 s, sys: 314 ms, total: 5.18 s
Wall time: 1.71 s
