# Synopsis

In the previous notebooks, we learned how to take advantage of the fact that images are ingested as `numpy` arrays to perform some analysis and processing of images.

Are you seeing why image processing requires `GPUs` and potentially heavy calculations? Yes, matrix operations.  
In this notebook, we will continue to play with our coins, but towards the end we will do some science.


## Authors

> Helio Tejedor
>
> Luis Amaral

## Words to remember

**Memory usage**

**Dilation**

**Erosion**

**Flood fill**

**Skeletonization**



# Read libraries

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

from colorama import Back, Fore, Style
from pathlib import Path
from sys import path

# Make the parent directory importable (needed for Amaral_libraries)
path.append( str(Path.cwd().parent) )
path

In [None]:
my_fontsize = 15

In [None]:
import os

import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np

from sys import path, getsizeof
from pylab import imread, imshow
from skimage import data, img_as_float, img_as_ubyte, measure
from skimage.filters import rank, threshold_otsu
from skimage.morphology import (
    disk, 
    binary_dilation, binary_erosion, binary_closing, binary_opening,
    remove_small_holes, remove_small_objects,
    flood_fill,
    skeletonize
)

from Amaral_libraries.my_stats import place_commas

# Awareness of memory needs 

In the previous notebook, we loaded our figures without worrying about the format in which we were getting them. Since we were dealing with only a single image, and not a very large one at that, we did not need to discuss memory needs. However, when doing heavy image processing, memory needs may become a concern.

Let's go over a couple of steps described earlier and the implications of our choices for memory usage.


In [None]:
coins = data.coins().astype( np.uint8 )
coins_int = data.coins().astype( int )
coins_float = data.coins().astype( float )


In [None]:
fig = plt.figure( figsize=(12, 10) )
ax = []

for i, im in enumerate([coins, coins_int, coins_float]):
    ax.append(fig.add_subplot(1,3,i+1))
    ax[-1].imshow(im, cmap = 'gray')
    line = f"Memory use: {place_commas(getsizeof(im))} B"
    ax[-1].text( 35, -10, line, fontsize = my_fontsize )
    
plt.tight_layout()


All three arrays hold the same pixel values; only the type used to store each value differs.
Printing each array shows what that looks like:

In [None]:
coins

In [None]:
coins_int

In [None]:
coins_float

<br>

When printing, the `int` and `uint8` formats look identical, and the `float` format looks different.  However, `uint8` uses a single byte to store each value, whereas the others typically use 8 bytes (on most platforms, `int` maps to `int64` and `float` to `float64`).

**What is the implication of this transformation?**

When considering the typical photos taken by smartphones, now around 4096 x 3072, it makes a difference:

> Using single bytes, image requires 12 MB
>
> Using ints, image requires 96 MB

If you have a stack of images from a confocal microscope (let's say 32 images):

> Using single bytes requires 384 MB
>
> Using ints requires 3 GB

Now, imagine that you are analyzing a video that contains tens of thousands to millions of frames.

> **TAKING MEMORY USAGE INTO CONSIDERATION IS CRITICAL!**
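
The arithmetic behind these numbers can be reproduced with a short sketch. The helper `image_megabytes` is a hypothetical name introduced here for illustration, and `np.int64` is used explicitly as an assumption, because the width of the plain `int` type depends on the platform.

```python
import numpy as np

def image_megabytes(height, width, dtype, n_images=1):
    """Memory needed for n_images arrays of this shape and dtype, in MB (1 MB = 1024**2 B)."""
    bytes_per_pixel = np.dtype(dtype).itemsize
    return height * width * bytes_per_pixel * n_images / 1024**2

# A 4096 x 3072 photo, alone and as a stack of 32 images
for dtype in (np.uint8, np.int64, np.float64):
    single = image_megabytes(3072, 4096, dtype)
    stack = image_megabytes(3072, 4096, dtype, n_images=32)
    print(f"{np.dtype(dtype).name:>8}: {single:6.0f} MB per image, "
          f"{stack:6.0f} MB per stack of 32")

# For an existing array, `nbytes` reports the size of the pixel data directly.
small = np.zeros((303, 384), dtype=np.uint8)
print(small.nbytes, small.astype(np.int64).nbytes)
```

Note that `nbytes` counts only the pixel data, whereas `getsizeof` also includes a small array header, so the two numbers differ slightly for small arrays.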


# Dilation and erosion

As you recall from the previous notebook, Otsu's algorithm works quite well...

In [None]:
radius = 30
vicinity = disk(radius)


fig = plt.figure( figsize = (10, 8))

plt.subplot(121)
plt.imshow(coins, cmap="gray")

plt.subplot(122)
local_threshold = rank.otsu(coins, vicinity)
foregound_mask = coins > local_threshold
plt.imshow(foregound_mask, cmap="gray")

plt.tight_layout()


<br>

However, you can see that our foreground mask includes some spots outside the true foreground, and excludes some spots within the foreground.

This is more visible if we zoom in on the bottom left corner of the image. The slicing recipe for this is:

> [220:, :85] .

For the second coin from the right in the top row, the slicing recipe is: 

>[25:85, 244:305] .

Change the slicing recipe to check how these transformations would work for the other coin.

In [None]:
im_selection = coins[220:, :85]
mask_selection = foregound_mask[220:, :85]

fig = plt.figure( figsize = (10, 10) )
ax = []

ax.append(fig.add_subplot(121))
ax[-1].imshow(im_selection, cmap = 'gray')

ax.append(fig.add_subplot(122))
ax[-1].imshow(mask_selection, cmap = 'gray')

plt.tight_layout()


We are going to make use of two important transformations that can help denoise an image. They are called  **dilation** and **erosion**.

> **dilation** applies a structuring pattern $-$ a disk or a square $-$  to all pixels in the image that belong to the foreground. This action will *dilate* the foreground. 
>  
> **erosion** also applies a structuring pattern but now to all pixels in the image that belong to the background. This action will *dilate* the background and thus *erode* the foreground. 
  
Let's see how dilation and erosion alter the foreground mask.
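
Before applying them to the coin, it may help to see both operations on a toy mask where every pixel can be checked by eye. This is a minimal sketch: the toy mask (a square with a one-pixel hole, plus an isolated speck) and the small `disk(1)` pattern are assumptions chosen for illustration.

```python
import numpy as np
from skimage.morphology import binary_dilation, binary_erosion, disk

# Toy mask: a 7 x 7 foreground square with a one-pixel hole, plus a lone speck
toy_mask = np.zeros((13, 13), dtype=bool)
toy_mask[3:10, 3:10] = True
toy_mask[6, 6] = False     # background spot inside the foreground
toy_mask[1, 11] = True     # foreground spot inside the background

toy_pattern = disk(1)      # a 3 x 3 cross

toy_dilated = binary_dilation(toy_mask, toy_pattern)
toy_eroded = binary_erosion(toy_mask, toy_pattern)

print("foreground pixels (dilated, original, eroded):",
      toy_dilated.sum(), toy_mask.sum(), toy_eroded.sum())
print("hole filled by dilation:", toy_dilated[6, 6])
print("speck removed by erosion:", not toy_eroded[1, 11])
```

Dilation fills the hole but also grows the speck and the square; erosion removes the speck but also enlarges the hole and shrinks the square.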

In [None]:
contours = measure.find_contours(mask_selection)
pattern = disk(3) 

In [None]:
fig = plt.figure( figsize = (10, 10) )
ax = []

ax.append(fig.add_subplot(131))
ax[-1].text(30, -5, 'Dilation', fontsize = my_fontsize)
dilated_mask = binary_dilation(mask_selection, pattern)
ax[-1].imshow(dilated_mask, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )


ax.append(fig.add_subplot(132))
ax[-1].text(30, -5, 'Original', fontsize = my_fontsize)
ax[-1].imshow(mask_selection, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2,
                     color = 'orange' )


ax.append(fig.add_subplot(133))
ax[-1].text(30, -5, 'Erosion', fontsize = my_fontsize)
eroded_mask = binary_erosion(mask_selection, pattern)
ax[-1].imshow(eroded_mask, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )
        
plt.tight_layout()


Cool! 

With **dilation** (left panel), we see that the black spots inside the coin were obliterated but, conversely, the spots outside expanded.

And the foreground now extends beyond the coin.

With **erosion** (right panel), we see that the white spots outside the coin were obliterated but, conversely, the black spots inside expanded.

And the background now extends into the coin.

As the phrasing we used above suggests, dilation and erosion are kind of complementary operations (a bit like multiplication and division).  It may be that using them one after the other will yield the best results...


OK, now we have solved the top right corner pixel problem, but the others... got worse!

Also, the area of the coin was reduced.

It seems that applying one algorithm solves one problem... could
the second problem be solved by applying the other algorithm? The two do seem
to be complementary, don't they?

## Chaining erosions and dilations

In [None]:
fig = plt.figure( figsize = (10, 10) )
ax = []

ax.append(fig.add_subplot(131))
ax[-1].text(10, -5, 'Dilation then Erosion', fontsize = my_fontsize)
mask_1 = binary_erosion( binary_dilation(mask_selection, pattern), pattern )
ax[-1].imshow(mask_1, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )


ax.append(fig.add_subplot(132))
ax[-1].text(30, -5, 'Original', fontsize = my_fontsize)
ax[-1].imshow(mask_selection, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2,
                     color = 'orange' )


ax.append(fig.add_subplot(133))
ax[-1].text(10, -5, 'Erosion then Dilation', fontsize = my_fontsize)
mask_2 = binary_dilation( binary_erosion(mask_selection, pattern), pattern )
ax[-1].imshow(mask_2, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )
        
plt.tight_layout()

**We are getting close to a good solution!!!**

Dilating then eroding really solves all the issues within the foreground.  The background spots inside the foreground are removed and the overall size of the foreground is left unchanged.

Eroding then dilating really solves all the issues outside the foreground. The foreground spots inside the background are removed and the overall size of the foreground is left unchanged.

**Isn't it interesting that the two operations do not commute?**  That is matrix operations for you.

<br>


## Opening and closing

How to solve the remaining issues left over by the two approaches?

The two approaches are so important that they were even named!

> an erosion followed by a dilation is denoted an **opening**
>
> a dilation followed by erosion is denoted a **closing**
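
These definitions can be checked on a toy mask. The sketch below assumes a small square with a one-pixel hole and an isolated speck (both made up for illustration) and compares `binary_opening` and `binary_closing` with the explicit chains of erosion and dilation.

```python
import numpy as np
from skimage.morphology import (binary_closing, binary_dilation,
                                binary_erosion, binary_opening, disk)

# Toy mask: a 7 x 7 foreground square with a one-pixel hole, plus a lone speck
toy_mask = np.zeros((13, 13), dtype=bool)
toy_mask[3:10, 3:10] = True
toy_mask[6, 6] = False     # hole inside the foreground
toy_mask[1, 11] = True     # speck in the background

toy_pattern = disk(1)      # a 3 x 3 cross

opened = binary_opening(toy_mask, toy_pattern)
closed = binary_closing(toy_mask, toy_pattern)

# Explicit chains, following the definitions
erode_then_dilate = binary_dilation(binary_erosion(toy_mask, toy_pattern), toy_pattern)
dilate_then_erode = binary_erosion(binary_dilation(toy_mask, toy_pattern), toy_pattern)

print("opening == erosion then dilation:", np.array_equal(opened, erode_then_dilate))
print("closing == dilation then erosion:", np.array_equal(closed, dilate_then_erode))
print("opening removes the speck:", not opened[1, 11], "| keeps the hole:", not opened[6, 6])
print("closing fills the hole:", closed[6, 6], "| keeps the speck:", closed[1, 11])
```

On this mask, opening deals with the speck and closing deals with the hole, which is why chaining the two is a natural next step.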

What if we first **open** and then **close**?


In [None]:
fig = plt.figure( figsize = (10, 10) )
ax = []

ax.append(fig.add_subplot(131))
ax[-1].text(5, -5, 'Closing after Opening', fontsize = my_fontsize)
mask_1 = binary_closing( binary_opening(mask_selection, pattern), pattern )
ax[-1].imshow(mask_1, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )


ax.append(fig.add_subplot(132))
ax[-1].text(30, -5, 'Original', fontsize = my_fontsize)
ax[-1].imshow(mask_selection, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2,
                     color = 'orange' )


ax.append(fig.add_subplot(133))
ax[-1].text(5, -5, 'Opening after Closing', fontsize = my_fontsize)
mask_2 = binary_opening( binary_closing(mask_selection, pattern), pattern )
ax[-1].imshow(mask_2, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )
        
plt.tight_layout()

Actually it seems that **opening after closing** is the way to go...


> **Exploration:** 
>
> > Subtract the eroded image from the original, 
> >
> > Subtract the dilated image from the original, 
> >
> > Subtract the dilated image from the eroded image. 
>
> What do you expect to get in return?

# Other morphology operations

Erosions, dilations, and their chaining are not the only morphology operations that we can apply to images.  In fact, there are other operations that are better suited to closing 'holes' or removing 'specks'.

Let's look at a couple of them.

## `.remove_small_holes` and `.remove_small_objects`
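
A toy mask makes the effect of each function easy to verify. This is a minimal sketch relying on the default size thresholds of both functions; the mask itself (one large object with a 4-pixel hole, plus a 4-pixel speck) is an assumption made for illustration.

```python
import numpy as np
from skimage.morphology import remove_small_holes, remove_small_objects

toy_mask = np.zeros((40, 40), dtype=bool)
toy_mask[5:30, 5:30] = True       # large object: 25 x 25 pixels
toy_mask[15:17, 15:17] = False    # small hole inside it: 4 pixels
toy_mask[35:37, 35:37] = True     # small speck outside it: 4 pixels

no_specks = remove_small_objects(toy_mask)   # speck gone, hole untouched
no_holes = remove_small_holes(toy_mask)      # hole gone, speck untouched
cleaned = remove_small_holes(remove_small_objects(toy_mask))

print("speck present (original, no_specks):", toy_mask[35, 35], no_specks[35, 35])
print("hole filled   (original, no_holes): ", toy_mask[15, 15], no_holes[15, 15])
print("pixels in the cleaned mask:", cleaned.sum())
```

Unlike erosion and dilation, neither function moves the boundary of the large object: only connected regions below the size threshold are changed.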

In [None]:
#help(remove_small_holes)
help(remove_small_objects)

In [None]:
fig = plt.figure( figsize = (10, 10) )
ax = []

ax.append(fig.add_subplot(131))
ax[-1].text(10, -5, 'remove_small_holes', fontsize = my_fontsize)
mask_1 = remove_small_holes( mask_selection )
ax[-1].imshow(mask_1, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )


ax.append(fig.add_subplot(132))
ax[-1].text(30, -5, 'Original', fontsize = my_fontsize)
ax[-1].imshow(mask_selection, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2,
                     color = 'orange' )


ax.append(fig.add_subplot(133))
ax[-1].text(5, -5, 'remove_small_objects', fontsize = my_fontsize)
mask_2 = remove_small_objects( mask_selection )
ax[-1].imshow(mask_2, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )
        
plt.tight_layout()

In [None]:
fig = plt.figure( figsize = (10, 10) )
ax = []

ax.append(fig.add_subplot(121))
ax[-1].text(15, -5, 'holes after objects', fontsize = my_fontsize)
mask_1 = remove_small_holes( remove_small_objects( mask_selection ) )
ax[-1].imshow(mask_1, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )
        
ax.append(fig.add_subplot(122))
ax[-1].text(15, -5, 'objects after holes', fontsize = my_fontsize)   
mask_2 = remove_small_objects( remove_small_holes( mask_selection ) )
ax[-1].imshow(mask_2, cmap = 'gray')
for n, contour in enumerate(contours):
    if len(contour) > 100 :
        ax[-1].plot( contour[:, 1], contour[:, 0], linewidth=2, 
                     color = 'orange' )

## `.flood_fill`

An operation that you are likely familiar with from image processing software such as `Gimp` is the `bucket tool`.  When you click with it on some pixel, it will change the color of all other pixels with the same color connected to it. 

`flood_fill` is the name of the function that provides that functionality in `skimage`. It takes a `seed pixel` that it uses to define a connected cluster of pixels with identical properties. 
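
As a minimal sketch of the call itself, consider a tiny array split in two by a solid wall (the array and the fill value are made up for illustration). The arguments are the image, the seed pixel as a `(row, column)` tuple, and the new value.

```python
import numpy as np
from skimage.morphology import flood_fill

tiny_im = np.zeros((5, 5), dtype=np.uint8)
tiny_im[:, 2] = 4                       # a solid vertical wall

# Seed in the left half: only pixels connected to the seed
# and sharing its value (0) are changed to the new value (2).
filled = flood_fill(tiny_im, (0, 0), 2)

print(tiny_im)
print(filled)
```

The fill stops at the wall, so the right half keeps its original value, and the input array is left unchanged because `flood_fill` returns a new array by default.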


In [None]:
simple_im = np.zeros((20, 20), dtype=np.uint8)
simple_im[:, 10] = 4
simple_im[10, :] = 4
simple_im[10, 15] = 0
simple_im[15, 10] = 0
simple_im[15:, 15:] = 1

fig = plt.figure( figsize = (10, 10) )
ax = []

ax.append(fig.add_subplot(131))
ax[-1].imshow( simple_im, cmap = "gray", vmin = 0, vmax = 4 )

ax.append(fig.add_subplot(132))
ax[-1].imshow( flood_fill(simple_im, (5, 5), 2), cmap = "gray", 
               vmin = 0, vmax = 4 )

ax.append(fig.add_subplot(133))
ax[-1].imshow( flood_fill(simple_im, (5, 15), 3), cmap = "gray", 
               vmin = 0, vmax = 4 )

plt.tight_layout()


## `skeletonization`

Another powerful algorithm is `skeletonization`.  It eats away at blobs until they are a single pixel wide. Since our coins are quite round, `skeletonization` would return a single point.  For this reason, it makes sense to consider some object that is less rotationally invariant.

Fortunately, science comes to the rescue in the form of the little roundworm [*Caenorhabditis elegans*](https://en.wikipedia.org/wiki/Caenorhabditis_elegans).

<img src = 'Images/caenorhabditis-elegans.png' width = '800'>

[Source](https://www.researchgate.net/publication/273324033/figure/fig1/AS:601675555430449@1520462039878/The-free-living-nematode-Caenorhabditis-elegans-1-mm-long-grows-on-an-agar-plate.png)

<img src = 'Images/crawling_c_elegans.gif' width = '300'>

[Source](https://upload.wikimedia.org/wikipedia/commons/b/be/CrawlingCelegans.gif)
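
Before working with the real frame, a toy example shows what `skeletonize` does to an elongated blob. This is only a sketch: the thick horizontal bar is an assumption standing in for a (very straight) worm.

```python
import numpy as np
from skimage.morphology import skeletonize

# A 5-pixel-thick, 30-pixel-long horizontal bar
toy_blob = np.zeros((15, 40), dtype=bool)
toy_blob[5:10, 5:35] = True

toy_skeleton = skeletonize(toy_blob).astype(bool)

print(f"blob pixels: {toy_blob.sum()}, skeleton pixels: {toy_skeleton.sum()}")

# Rows and columns that still contain skeleton pixels
print("rows:", np.flatnonzero(toy_skeleton.any(axis=1)))
print("cols:", np.flatnonzero(toy_skeleton.any(axis=0)))
```

The skeleton always lies inside the original blob and contains far fewer pixels, which is what makes it a compact description of the shape.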


In [None]:
celegans_frame_filename = Path.cwd() / 'Data' / 'celegans-frame.png'
celegans_frame = imread(celegans_frame_filename)

intens_max = celegans_frame.max()
intens_min = celegans_frame.min()

print(f"Size of image is {place_commas(getsizeof(celegans_frame))} B.")

celegans_frame

We can reduce image size by casting as `np.float16`...

In [None]:
celegans_frame = np.float16( celegans_frame )

print(f"Size of image is now {place_commas(getsizeof(celegans_frame))} B.")

plt.figure( figsize = (10, 10) )
plt.imshow( celegans_frame, cmap = "gray", 
            vmin = intens_min, vmax = intens_max );


Clearly, there is a large section of the image that is of no importance to us.

We can easily remove it just by providing the coordinates of the area of interest, but we could also use more automated approaches, right?
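
One possible automated approach is to crop to the bounding box of the pixels that pass some test. The sketch below is an illustration only: the helper `crop_to_bounding_box`, the toy frame, and the 0.5 cutoff are all assumptions, and a real frame would need a test suited to its own intensities.

```python
import numpy as np

def crop_to_bounding_box(image, keep):
    """Crop `image` to the smallest rectangle containing every True pixel of `keep`."""
    rows = np.flatnonzero(keep.any(axis=1))   # rows with at least one pixel to keep
    cols = np.flatnonzero(keep.any(axis=0))   # columns with at least one pixel to keep
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Toy frame: dark surroundings (0.0) around a brighter region of interest (0.8)
toy_frame = np.zeros((8, 10))
toy_frame[2:6, 3:9] = 0.8

toy_cut = crop_to_bounding_box(toy_frame, toy_frame > 0.5)
print(toy_frame.shape, "->", toy_cut.shape)
```

Because slicing returns a view, the cropped array shares memory with the original frame rather than duplicating it.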


In [None]:
celegans_frame_cut = celegans_frame[150:1350, 200:1825]

plt.figure( figsize = (10, 10) )
plt.imshow( celegans_frame_cut, cmap = "gray", 
            vmin = intens_min, vmax = intens_max );


Also, we can invert the image, since the worms are black and the background is white. We
will also convert it to an array of bytes:

In [None]:
celegans_frame_cut_inv = intens_max - celegans_frame_cut
frame = img_as_ubyte(celegans_frame_cut_inv) 

print(f"Size of image is now {place_commas(getsizeof(frame))} B.")

frame

Now, let's binarize the image using a local thresholding algorithm. We saw earlier that we can use Otsu's algorithm locally to detect the foreground.

The worms are about 50 to 100 pixels long, so we could set the radius of the region over which we find the local threshold to 50.

In [None]:
vicinity = disk(50)

fig = plt.figure( figsize = (18, 12) )
ax = []

# We do not set the values of vmin and vmax to their true values 
# because the region of interest is very dark
#
ax.append( fig.add_subplot(121) )
ax[-1].imshow( frame, cmap = "gray")

ax.append( fig.add_subplot(122) )
local_threshold = rank.otsu(frame, vicinity)
binarized_frame = frame > local_threshold
ax[-1].imshow( binarized_frame, cmap = "gray" )

plt.tight_layout()

**Not what we expected, right??**

Because most of the image is background, we are seeing background noise as foreground in extensive areas of the image.

Maybe, in this case, it is best to go with a global value. We can again use Otsu's algorithm for finding a global threshold value, but it turns out that this would cause us to lose some of the foreground.

By trial and error, it turns out that subtracting 10 from the Otsu threshold provides a good compromise between getting rid of the noise and not losing foreground.


In [None]:
vicinity = disk(50)

fig = plt.figure( figsize = (18, 12) )
ax = []

# We do not set the values of vmin and vmax to their true values 
# because the region of interest is very dark
#
ax.append( fig.add_subplot(121) )
ax[-1].imshow( frame, cmap = "gray")

ax.append( fig.add_subplot(122) )
global_threshold = threshold_otsu(frame)
binarized_frame = frame > global_threshold - 10
ax[-1].imshow( binarized_frame, cmap = "gray" )

plt.tight_layout()

This looks pretty nice and was so much faster. 

We can remove the frame around the image using `flood_fill` from any of the corners of the image.

And, we can remove the small specks using `remove_small_objects`. 


In [None]:
binarized_frame = flood_fill(binarized_frame, (0, 0), 0)

binarized_clean = remove_small_objects(binarized_frame)

fig = plt.figure( figsize = (18, 12) )
ax = []

ax.append( fig.add_subplot(121) )
ax[-1].imshow( frame, cmap = "gray")

ax.append( fig.add_subplot(122) )
ax[-1].imshow( binarized_clean, cmap = "gray" )

plt.tight_layout()


We still have a few problems left over (two bright circles in the bottom left corner, and two small specks, one in the middle of the image and another toward the bottom on the mid-right).  

Let's ignore them for now and do the `skeletonization`.

In [None]:
help(skeletonize)

In [None]:
skeleton_frame = skeletonize(binarized_clean)

# shape is (rows, columns), i.e. (height, width)
h, w = binarized_clean.shape
final_image = np.zeros((h, w, 3), dtype = np.uint8)
final_image[binarized_clean, :] = [128, 128, 128]
final_image[skeleton_frame, :] = [255, 0, 0]

fig = plt.figure( figsize = (18, 12) )
ax = []

ax.append( fig.add_subplot(121) )
ax[-1].imshow( final_image )

ax.append( fig.add_subplot(122) )
ax[-1].imshow( final_image[900:1150, 50:400, :] )

plt.tight_layout()


Very nice!!!


# Segmentation

Finally, let's do the segmentation and label all the worms we find, like we did in the previous notebooks:

In [None]:
fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(111)
ax.imshow(binarized_clean, cmap = "gray", vmin = 0, vmax = 1)

contours = measure.find_contours(binarized_clean)
k = 0
for n, contour in enumerate(contours):
    if len(contour) > 100:
        ax.plot(contour[:, 1], contour[:, 0], linewidth = 1)
        k += 1
    
print(f"There are {len(contours)} contours in the image, "
      f"{k} of which are good. ")

You can now use some of the different functions in the module `skimage.measure` to obtain some interesting measurements about each worm $-$ area, perimeter, center of mass, ...
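
As a starting point, here is a minimal sketch using `measure.label` and `measure.regionprops` on a toy mask. The two rectangles are stand-ins for worms, made up for illustration; with the real data you would pass `binarized_clean` instead.

```python
import numpy as np
from skimage import measure

toy_mask = np.zeros((12, 20), dtype=bool)
toy_mask[1:4, 1:5] = True       # 3 x 4 rectangle
toy_mask[6:11, 10:18] = True    # 5 x 8 rectangle

# Give each connected foreground region its own integer label,
# then measure the properties of every labeled region.
toy_labels = measure.label(toy_mask)
toy_regions = measure.regionprops(toy_labels)

for region in toy_regions:
    row, col = region.centroid
    print(f"label {region.label}: area = {region.area}, "
          f"perimeter = {region.perimeter:.1f}, "
          f"centroid = ({row:.1f}, {col:.1f})")
```

Each region object exposes many more properties (bounding box, eccentricity, orientation, ...) that can be used to tell real worms from leftover artifacts.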
