# Preprocessing data
Image processing basics.

**Acknowledgment**    
Some of the content below is adapted from the BioImage Analysis Notebooks compilation:    
https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/intro.html

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from fastplotlib.widgets import ImageWidget

In [None]:
arr_image = np.array([
   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0,], # row 0
   [0, 0, 1, 1, 0, 0, 1, 1, 0, 0,], # row 1
   [0, 0, 1, 1, 0, 0, 1, 1, 0, 0,], # row 2
   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0,], # row 3
   [0, 0, 0, 0, 1, 0, 0, 0, 0, 0,], # row 4
   [0, 1, 0, 0, 1, 1, 0, 0, 1, 0,], # row 5
   [0, 0, 1, 0, 0, 0, 0, 1, 1, 0,], # row 6
   [0, 0, 1, 1, 1, 1, 1, 1, 0, 0,], # row 7 
   [0, 0, 0, 1, 1, 1, 1, 0, 0, 0,], # row 8
   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0,], # row 9 
])

The `shape` attribute of an array is a tuple containing the length of the array along each dimension. For 2D arrays, the shape is `(rows, columns)`.

In [None]:
arr_image.shape
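
As a quick illustration of the `(rows, columns)` convention, here is a minimal sketch using a small hypothetical array (`demo` is not part of the notebook's data). It shows that indexing follows the same order as the shape: row first, then column.

```python
import numpy as np

# Hypothetical 3-row, 5-column array, used only to illustrate the convention
demo = np.arange(15).reshape(3, 5)

n_rows, n_cols = demo.shape   # shape is (rows, columns)
value = demo[2, 4]            # index as [row, column]: last row, last column
first_row = demo[0]           # a single row has n_cols elements
first_col = demo[:, 0]        # a single column has n_rows elements
```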

In [None]:
plt.imshow(arr_image)

## 3D arrays
When dealing with 3D volumes, or with movies (stacks of images over time), 3D arrays typically have dimensions `(frames, rows, columns)`. Calcium imaging applications like `caiman` thankfully deal with greyscale image data, so we'll only have to handle 3D arrays with dimensions `(num_frames, rows, columns)`.

Individual RGB images are `(rows, columns, 3)` arrays, where the three slices along the last dimension are the R, G, and B channels, respectively.

If you had a 1000-frame movie with RGB channels, and each image had 600 rows and 400 columns, you would end up with a 4D array with the following shape:

    (1000, 600, 400, 3)

But we will not deal with that case here.
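
For completeness, the dimension ordering described above can be sketched with a much smaller, hypothetical RGB movie (the name `rgb_movie` and its sizes are made up for illustration). Indexing the first axis gives one RGB image, and indexing the last axis gives one color channel across all frames.

```python
import numpy as np

# Hypothetical small RGB movie: 4 frames, 6 rows, 5 columns, 3 color channels
rgb_movie = np.zeros((4, 6, 5, 3))

one_frame = rgb_movie[0]          # a single RGB image: (rows, columns, 3)
red_channel = rgb_movie[..., 0]   # red channel of every frame: (frames, rows, columns)
```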

### RGB image example

In [None]:
rgb_image = np.random.rand(10,10,3)
rgb_image.shape

In [None]:
plt.imshow(rgb_image)

### Image stack example

In [None]:
my_stack = np.random.rand(100, 20,20)

View the stack in fastplotlib:

In [None]:
iw = ImageWidget(
    data=my_stack, 
    cmap="gnuplot2"
)
iw.show()

In [None]:
iw.plot.canvas.close()

## Cropping and subsampling
### Crop in time

In [None]:
my_stack_ct = my_stack[10:]
my_stack_ct.shape

### Crop in space

In [None]:
my_stack_cs = my_stack[:, 5:10, 15:20]
my_stack_cs.shape
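
One detail worth keeping in mind when cropping: basic NumPy slicing returns a view of the original array rather than a copy, so modifying the crop also modifies the source. The sketch below uses a stand-in array (`stack`, filled with zeros so the effect is easy to see) and shows how `.copy()` gives an independent crop.

```python
import numpy as np

stack = np.zeros((100, 20, 20))            # stand-in with the same shape as my_stack
crop_view = stack[:, 5:10, 15:20]          # basic slicing returns a view
crop_copy = stack[:, 5:10, 15:20].copy()   # an independent copy of the same region

crop_view[0, 0, 0] = 1.0   # writes through to stack[0, 5, 15]
crop_copy[0, 0, 1] = 1.0   # leaves stack untouched
```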

In [None]:
iw = ImageWidget(
    data=my_stack_cs, 
    cmap="gnuplot2"
)
iw.show()

In [None]:
iw.plot.canvas.close()

### Subsampling
You can subsample in space or time by giving a step size as the third element of a slice (`start:stop:step`). If you leave `start` and `stop` blank, the step is applied across all the frames (or rows/columns). For instance, to sample every fifth frame:

    subsampled = my_stack[::5]

In [None]:
my_stack_sst = my_stack[::5]
my_stack_sst.shape
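
The same step syntax works along the spatial axes. As a minimal sketch (using a stand-in array named `stack` with the same shape as `my_stack`), the following keeps every second row and column, and then combines temporal and spatial subsampling in one indexing expression.

```python
import numpy as np

stack = np.random.rand(100, 20, 20)   # stand-in with the same shape as my_stack

spatial_only = stack[:, ::2, ::2]     # all frames, every 2nd row and column
time_and_space = stack[::5, ::2, ::2] # every 5th frame, every 2nd row and column
```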

## Filtering (convolution)
### Smoothing with a gaussian

In [None]:
from skimage.filters import gaussian
from skimage import filters

In [None]:
test_image = np.zeros((10,10))
test_image[5,3] = 1
test_image[5,7] = 1

In [None]:
plt.imshow(test_image);

Convolve the test image with Gaussians of different widths (set by the `sigma` parameter).

In [None]:
blurred05 = gaussian(test_image, sigma=0.5)
blurred1 = gaussian(test_image, sigma=1)
blurred2 = gaussian(test_image, sigma=2)
blurred3 = gaussian(test_image, sigma=3)

fig, axs = plt.subplots(1, 4)
axs[0].imshow(blurred05)
axs[1].imshow(blurred1)
axs[2].imshow(blurred2)
axs[3].imshow(blurred3);
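
To make the role of `sigma` more concrete, here is a rough 1D sketch of Gaussian smoothing written with plain NumPy. The helper `gaussian_kernel_1d` and the choice to truncate the kernel at 4 sigma are assumptions made for illustration; this is not a reproduction of scikit-image's implementation. A larger sigma spreads the same total intensity over more samples, so the peak gets lower and wider.

```python
import numpy as np

def gaussian_kernel_1d(sigma, truncate=4.0):
    """Sampled 1D Gaussian, normalized to sum to 1 (illustrative helper)."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

signal = np.zeros(41)
signal[20] = 1.0   # a single impulse, like one bright pixel

# Convolving an impulse with a kernel reproduces the kernel, centered on the impulse
narrow = np.convolve(signal, gaussian_kernel_1d(1), mode="same")
wide = np.convolve(signal, gaussian_kernel_1d(3), mode="same")
```

Plotting `narrow` and `wide` with `plt.plot` shows the same trend as the image panels above.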

### Denoising
Median, mean, and Gaussian filters.

In [None]:
from skimage.io import imread

In [None]:
noisy_mri = imread('Haase_MRT_tfl3d1.tif')[90]

In [None]:
noisy_mri_zoom = noisy_mri[50:100, 50:100]

In [None]:
fig, axs = plt.subplots(1, 2, figsize=(8,4))
axs[0].imshow(noisy_mri)
axs[1].imshow(noisy_mri_zoom);

Now let's apply three denoising filters and compare how well they preserve contrast and lines.

In [None]:
from skimage.morphology import disk

`disk` defines the "local" neighborhood (the footprint) over which the median or mean is computed at each pixel.

In [None]:
disk1 = disk(1)
plt.imshow(disk1, cmap='gray');
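
To see what such a neighborhood looks like as numbers, a disk-shaped footprint can be sketched in plain NumPy as all pixels whose distance from the center is at most the radius. The helper `disk_footprint` is hypothetical and only illustrates the idea; it is not scikit-image's own code.

```python
import numpy as np

def disk_footprint(radius):
    """Boolean mask of pixels within `radius` of the center (illustrative helper)."""
    yy, xx = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    return (xx ** 2 + yy ** 2) <= radius ** 2

# Radius 1 gives a 3x3 plus-shaped mask: the center pixel and its 4 direct neighbors
footprint1 = disk_footprint(1).astype(int)
footprint2 = disk_footprint(2).astype(int)
```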

In [None]:
median_filtered = filters.median(noisy_mri, disk(1))
mean_filtered = filters.rank.mean(noisy_mri, disk(1))
gaussian_filtered = filters.gaussian(noisy_mri, sigma=1)

fig, axs = plt.subplots(2, 3)

# first row
axs[0, 0].imshow(median_filtered)
axs[0, 0].set_title("Median")
axs[0, 1].imshow(mean_filtered)
axs[0, 1].set_title("Mean")
axs[0, 2].imshow(gaussian_filtered)
axs[0, 2].set_title("Gaussian")

# second row
axs[1, 0].imshow(median_filtered[50:100, 50:100])
axs[1, 1].imshow(mean_filtered[50:100, 50:100])
axs[1, 2].imshow(gaussian_filtered[50:100, 50:100]);

You can see that the median filter is edge-preserving: at each location, it replaces the pixel value with the median of the values in the local neighborhood.
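
The difference between median and mean filtering is easiest to see in 1D. The sketch below uses a made-up signal (a sharp step with one outlier) and a window of three samples; both the signal and the window size are assumptions chosen for illustration. The median removes the outlier and keeps the step sharp, while the mean smears both.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical 1D signal: one outlier (8) followed by a sharp step from 0 to 10
signal = np.array([0, 0, 8, 0, 0, 10, 10, 10, 10, 10], dtype=float)

windows = sliding_window_view(signal, 3)   # every run of 3 neighboring samples
median_out = np.median(windows, axis=-1)   # median of each window
mean_out = windows.mean(axis=-1)           # mean of each window

# median_out contains only 0s and 10s; mean_out contains in-between values
```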

## Morphological Operations
Erosion, dilation, opening, closing

If this were a broader notebook on image processing relevant to caiman, we would cover these operations here. For now, see the scikit-image morphology example:

https://scikit-image.org/docs/dev/auto_examples/applications/plot_morphology.html
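
As a very rough sketch of the four operations on a binary image, the NumPy-only code below uses a fixed 3x3 square neighborhood with zero padding at the borders. The helpers `erode` and `dilate` are hypothetical and written only for illustration; in practice you would more likely use the functions in `skimage.morphology`, as in the linked example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def erode(binary):
    """3x3 erosion: a pixel stays on only if its whole 3x3 neighborhood is on."""
    padded = np.pad(binary, 1, mode="constant", constant_values=0)
    return sliding_window_view(padded, (3, 3)).min(axis=(-2, -1))

def dilate(binary):
    """3x3 dilation: a pixel turns on if any pixel in its 3x3 neighborhood is on."""
    padded = np.pad(binary, 1, mode="constant", constant_values=0)
    return sliding_window_view(padded, (3, 3)).max(axis=(-2, -1))

image = np.zeros((8, 8), dtype=int)
image[1:5, 1:5] = 1   # a 4x4 blob
image[6, 6] = 1       # an isolated single-pixel speck

opened = dilate(erode(image))   # opening: erosion followed by dilation
closed = erode(dilate(image))   # closing: dilation followed by erosion
```

In this toy image, opening removes the isolated speck while restoring the blob to its original size.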