# Synopsis

So far we've covered the basics of what constitutes an image, how color is encoded, and how we can manipulate it. However, when you conduct research, the tasks you need to perform are typically more complex (although they always seem easy to do before you start coding!). Some examples of common tasks are:

* Automatically identify regions
* Identify the borders of said regions
* Find bright spots/blobs
* Skeletonize shapes (i.e. find the backbone)

We'll go over some basic methods to perform some of these tasks using [`scikit-image`](https://scikit-image.org/). The package is a sister to the `scikit-learn` package. Both are scientific toolkits for Python built on NumPy and SciPy, but whereas `scikit-learn` focuses on machine learning, `scikit-image`, as you probably guessed, is geared towards algorithms that can be applied to images.

The functions of `scikit-image` are imported from the package `skimage`.

## Authors

> Helio Tejedor 
>
> Luis Amaral


## Words to remember

**Background**

**Foreground**

**Contours**

**Otsu's algorithm**

# Read libraries

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

from colorama import Back, Fore, Style
from copy import copy, deepcopy
from pathlib import Path
from sys import path

path.append( str(Path.cwd().parent) )

In [None]:
import os

import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np

from pylab import imread, imshow
from skimage import data, img_as_ubyte, measure, transform
from skimage.filters import rank, threshold_otsu
from skimage.morphology import disk 

from Amaral_libraries.my_stats import half_frame, place_commas
from Amaral_libraries.my_image_library import grayscale_zoom

In [None]:
my_fontsize = 15

# Foreground and background

In many situations we want to identify some specific regions in an image:  

* Identifying the borders of cells in a microscope image
* Finding animals, or faces, or some other 'thing' in a photograph

In those contexts, we are trying to separate foreground from background and to locate the boundaries between the two. This can be achieved by identifying the background (if uninteresting) or by finding the contours of objects in the image.

We will work through some of those tasks in this notebook using some algorithms implemented in `scikit-image`.

We will make use of some data examples provided with `scikit-image`.  We first will consider a photograph of some Greek coins found at Pompeii.

In [None]:
coins_original = img_as_ubyte( data.coins() )

print(Style.BRIGHT, 'Shape:', Style.RESET_ALL, coins_original.shape)
print()

intens_max = coins_original.max()
intens_min = coins_original.min()
print(f" Maximum in image is {intens_max}, minimum is "
      f"{intens_min}.\n")


imshow(coins_original, cmap = 'gray', vmin = intens_min, vmax = intens_max);


## Zooming in

To help us with image processing it is often useful to be able to magnify parts of an image. The function `grayscale_zoom` in `my_image_library` does exactly that.

**As an exercise, create a similar function for magnifying portions of `RGB` images.**

In [None]:
help(grayscale_zoom)

In [None]:
# You can change the values of these three parameters
zoom_factor = 4
x = 43
y = 51

zoomed_image, x0, y0 = grayscale_zoom(coins_original, x, y, zoom_factor)

fig = plt.figure( figsize = (10, 6))
ax = []

ax.append(fig.add_subplot(121))
ax[-1].imshow( coins_original[:150, :150], cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )
ax[-1].plot([x], [y], 'ro');

ax.append(fig.add_subplot(122))

ax[-1].imshow( zoomed_image, cmap = 'gray', 
              vmin = intens_min, vmax = intens_max )
ax[-1].plot([zoom_factor * (x-x0)], [zoom_factor * (y-y0)], 'ro');

plt.tight_layout()


## Why it is important to set limits of intensity consistently

If you do not set the limits of intensity in grayscale images, then the function `imshow` will automatically use the maximum and minimum values in the array it is given to scale intensities. This means that a section cut from a larger image will be displayed with different gray levels than the same pixels have in the full image.
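
To make the re-scaling concrete, here is a minimal numerical sketch. It assumes the default linear mapping of intensities to gray levels; the helper `to_display_range` and the tiny `crop` array are hypothetical names introduced only for this illustration.

```python
import numpy as np

def to_display_range(pixels, vmin=None, vmax=None):
    """Mimic a linear gray-level mapping: vmin -> 0 (black), vmax -> 1 (white).

    When a limit is missing, fall back to the min/max of the pixels shown,
    which is what happens when no limits are passed explicitly.
    """
    pixels = np.asarray(pixels, dtype=float)
    vmin = pixels.min() if vmin is None else vmin
    vmax = pixels.max() if vmax is None else vmax
    return (pixels - vmin) / (vmax - vmin)

# A small crop whose intensities span only part of the 0-255 range
crop = np.array([[100, 120], [140, 150]])

auto = to_display_range(crop)                     # limits taken from the crop
fixed = to_display_range(crop, vmin=0, vmax=255)  # limits of the full range

print("Automatic limits:\n", auto)
print("Fixed limits:\n", fixed)
```

With automatic limits the darkest pixel of the crop is drawn black and the brightest white, even though neither is extreme in the full image. With fixed limits the same pixels keep their mid-gray appearance.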

As shown below...

In [None]:
fig = plt.figure( figsize = (10, 6))
ax = []

ax.append(fig.add_subplot(121))
ax[-1].imshow( zoomed_image, cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )
ax[-1].plot([zoom_factor * (x-x0)], [zoom_factor * (y-y0)], 'ro');
ax[-1].text(100, -10, 'Original', fontsize = 1.3 * my_fontsize)

ax.append(fig.add_subplot(122))
ax[-1].imshow( zoomed_image, cmap = 'gray' )
ax[-1].plot([zoom_factor * (x-x0)], [zoom_factor * (y-y0)], 'ro');
ax[-1].text(100, -10, 'Re-scaled', fontsize = 1.3 * my_fontsize)

plt.tight_layout()

# Contours

The `scikit-image` library includes a function, `find_contours`, that implements the [marching squares algorithm](http://users.polytech.unice.fr/~lingrand/MarchingCubes/algo.html). The `find_contours()` function takes two arguments from us: the image array and a value for the `level` parameter.

> Parameters
>
> ----------
>
> image : 2D ndarray of double
>
>    Input image in which to find contours.
>
> level : float, optional
 

The `level` controls the pixel intensity around which the algorithm should attempt to find the contours $-$ it is our free parameter.  

The default value for the threshold is 

> $ \frac{ {\rm max} ~{\bf A} + {\rm min}~{\bf A}}{2} $,

where **A** is the image array. 

The algorithm looks for pixels in the image array whose values transition from above to below the predefined `level`. In this manner, the image gets segmented between foreground (inside the contours) and background (outside the contours).  

Contour pixels get organized into connected sets and the algorithm returns a list of arrays with the pixel coordinates of connected contour pixels.
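
Before applying the function to the coins, it may help to try it on an image where the answer is obvious. The sketch below uses a synthetic image (an assumption for illustration, not part of the coin data) with a single bright square, and passes the midpoint level explicitly.

```python
import numpy as np
from skimage import measure

# Synthetic image: one bright square on a dark background
image = np.zeros((20, 20), dtype=float)
image[5:15, 5:15] = 1.0

# Midpoint between the extreme intensities, written out explicitly
level = (image.max() + image.min()) / 2

contours = measure.find_contours(image, level)
outline = contours[0]

print(f"Number of contours found: {len(contours)}")
print(f"Points in the contour: {len(outline)}")
print("Closed contour:", np.allclose(outline[0], outline[-1]))
print("Row range:", outline[:, 0].min(), "to", outline[:, 0].max())
```

Each contour is an array with one `(row, column)` pair per point, which is why the plotting cells use `contour[:, 1]` as the horizontal coordinate and `contour[:, 0]` as the vertical one. The coordinates are not integers because the contour is placed between a pixel above the level and its neighbor below it.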


In [None]:
contours = measure.find_contours(coins_original)
print(f"The method .find_contours returns a {type(contours)}.\n"
      f"\nEach contour is an array.\n")

print(contours[0])


print(f"\n\n--> The algorithm found {len(contours)} contours. "
      f"There are only 24 coins in the image.\n" )




That seems a bit much...  So let us see what this did... 

In [None]:
fig = plt.figure( figsize = (11, 10) )
ax = fig.add_subplot(111)

ax.imshow( coins_original, cmap = 'gray', 
           vmin = intens_min, vmax = intens_max )

for n, contour in enumerate(contours):
    ax.plot(contour[:, 1], contour[:, 0], linewidth=2)


**Holy &%(^&!!!!**

Yes, there are some good contours there but...

If you look at the image attentively, it becomes clear that the background in the top left corner is brighter than in the bottom right corner.

Moreover, no contours are found in the top left corner, whereas many absurd looking contours are found in the transition region between dark and light background. 

We accepted the default value for `level`. **Is there a better value?**

**Let's explore!**

In [None]:
#Try level from 50 to 200
#
level = 120
minimum_contour_len = 40 

fig = plt.figure( figsize = (11, 10) )
ax = fig.add_subplot(111)

ax.imshow( coins_original, cmap = 'gray', 
           vmin = intens_min, vmax = intens_max )

contours = measure.find_contours(coins_original, level)
for n, contour in enumerate(contours):
    
    # We exclude very short contours, which likely are just noise
    #
    if len(contour) > minimum_contour_len:
        ax.plot(contour[:, 1], contour[:, 0], linewidth=2)


Clearly, there is no good single value. The problem is that the background has different properties in different regions of the image $-$ compare background inside and outside the **very long contour**. 

So maybe the solution is to try to make the background uniform...

But first we need to find out what the background properties are...


# Identifying the background


In [None]:
w, h = coins_original.shape
intensities = coins_original.reshape((w*h, 1)) # Make it 1D for plotting

print(f"There are {place_commas(len(intensities[:,0]))} pixels in the image.")
print(f"\nTheir median intensity is {int(np.median(intensities))} "
      f"(maximum in the image is {intens_max}).\n")

fig = plt.figure(figsize = (10, 4))
ax = fig.add_subplot(111)

half_frame(ax, 'Intensity', 'Frequency', font_size = my_fontsize)
# Bin edges run from 0 to 255 so that the brightest pixels are not dropped
ax.hist(intensities, bins = np.arange(0, 256, 5), rwidth = 0.9)

ax.vlines(np.median(intensities), 0, 6000, color = 'black', lw = 4)

plt.tight_layout()

**About 50% of the image is background, so we could set all values below the median intensity to zero and see what that does to the image...**


In [None]:
image_mask = coins_original > np.median(intensities)

fig = plt.figure(figsize = (10, 4))
ax = []

ax.append( fig.add_subplot(121) )
ax[-1].imshow( image_mask, cmap = 'gray' )

ax.append( fig.add_subplot(122) )
ax[-1].imshow( coins_original, cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )

plt.tight_layout()

It is visually apparent that some of the background is brighter than the median intensity and that some pixels within the coins are darker.
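
Part of the problem is built into the definition of the median: roughly half of the pixels always lie above it, whatever fraction of the image is actually foreground. The sketch below checks this on a synthetic image (an illustrative assumption) in which only 10% of the pixels are foreground.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic image: dim noisy background with a bright band
# covering only 10% of the pixels
toy = rng.normal(50, 5, size=(100, 100))
toy[:10, :] += 100     # 10 of the 100 rows are foreground

true_fraction = 0.10
median_mask = toy > np.median(toy)

print(f"True foreground fraction:  {true_fraction:.2f}")
print(f"Fraction above the median: {median_mask.mean():.2f}")
```

The mask keeps every foreground pixel, but it also keeps a large share of the background, simply because the median splits the pixels in half.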

In [None]:
print(Style.BRIGHT, f"{'Original':>20} {'With mask':>35}", Style.RESET_ALL)

fig = plt.figure( figsize = (8, 8))
ax = []

ax.append(fig.add_subplot(221))
zoomed_image, x0, y0 = grayscale_zoom(coins_original, 50, 50, 4)
ax[-1].imshow( zoomed_image, cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )

ax.append(fig.add_subplot(222))
zoomed_image, x0, y0 = grayscale_zoom(coins_original * image_mask, 50, 50, 4)
ax[-1].imshow( zoomed_image, cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )

ax.append(fig.add_subplot(223))
zoomed_image, x0, y0 = grayscale_zoom(coins_original, 200, 275, 4)
ax[-1].imshow( zoomed_image, cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )

ax.append(fig.add_subplot(224))
zoomed_image, x0, y0 = grayscale_zoom(coins_original * image_mask, 200, 275, 4)
ax[-1].imshow( zoomed_image, cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )

plt.tight_layout()

## Adaptive level for background selection.

So, it is not great that some regions inside the foreground were set to zero or that the background in the top left corner remained unchanged.

A solution to this issue is to set a threshold for the background that depends on the specific region of the image.

[**Nobuyuki Otsu**](https://en.wikipedia.org/wiki/Nobuyuki_Otsu) proposed a more sophisticated algorithm for thresholding. His algorithm, [Otsu's method](https://en.wikipedia.org/wiki/Otsu%27s_method), performs an optimization by exhaustively searching for the threshold $-$ an intensity value, between 0 and 255 for our 8-bit image $-$ that minimizes the intra-class variance, defined as a weighted sum of the variances of foreground and background. You can see a nice animation of the algorithm in action on the *Wikipedia* page.
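
The exhaustive search can be written in a few lines. The following is a minimal sketch of the idea, not the implementation used by `scikit-image` (whose `threshold_otsu` works from a histogram and may return a slightly different value). The function name `otsu_threshold` and the toy data are hypothetical.

```python
import numpy as np

def otsu_threshold(image):
    """Exhaustive search for the threshold that minimizes the weighted
    sum of the background and foreground variances.

    Returns None if the image contains a single intensity value.
    """
    values = np.asarray(image).ravel()
    best_t, best_score = None, np.inf

    # Every observed intensity except the largest is a candidate,
    # so both classes always contain at least one pixel
    for t in np.unique(values)[:-1]:
        background = values[values <= t]
        foreground = values[values > t]
        w_b = background.size / values.size
        w_f = foreground.size / values.size
        score = w_b * background.var() + w_f * foreground.var()
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Toy data: two well-separated groups of intensities
rng = np.random.default_rng(seed=0)
toy = np.concatenate([rng.normal(50, 10, 500), rng.normal(200, 10, 500)])
toy = np.clip(toy, 0, 255).astype(np.uint8)

t = otsu_threshold(toy)
print(f"Threshold found: {t}")
```

For well-separated groups like these, the threshold falls in the gap between them, so `toy > t` recovers the brighter group.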


In [None]:
otsu_value = threshold_otsu(coins_original)
change = (otsu_value / np.median(intensities) - 1) * 100

print(f"Otsu's threshold is {otsu_value}. "
      f"This is a {int(change)}% difference to the median.\n")


otsu_binarized_im = coins_original > otsu_value

fig = plt.figure( figsize = (10, 8))
ax = []

ax.append(fig.add_subplot(121))
ax[-1].imshow( otsu_binarized_im, cmap = "gray" )

ax.append(fig.add_subplot(122))
ax[-1].imshow( coins_original, cmap = "gray", 
               vmin = intens_min, vmax = intens_max )

plt.tight_layout()


Clearly this approach improves the output, but it cannot fully handle the lighting issues with our image. In the previous notebook, we used a local threshold, computing a separate value for each segment of the image (it was cut into 4x4 rectangles).

Now, we will do something more sophisticated and systematic.  We will define a circle around every pixel with some pre-determined radius, and find the local threshold within that circle. 

If we apply this procedure to every pixel, we will be using a local threshold that is able to handle the image's lighting issues. Specifically, we slide the disk over the image, much as a kernel is moved in a convolution, and apply the thresholding procedure at every pixel position.

For learning purposes, we will select a 70 x 70 image and set the circle radius to a number of different values.
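
As a warm-up, the sketch below shows what the disk looks like as an array and applies the local threshold to a small synthetic image. The image, with a background that brightens from left to right plus one bright square, is an assumption made for illustration.

```python
import numpy as np
from skimage.filters import rank
from skimage.morphology import disk

# A disk is a square array of 0s and 1s; the 1s approximate a circle
vicinity = disk(2)
print(vicinity)

# Synthetic 8-bit image: background brightening from left to right,
# plus one bright square
ramp = np.tile(np.linspace(20, 120, 40), (40, 1))
ramp[15:25, 15:25] += 100
toy = ramp.astype(np.uint8)

# One threshold per pixel, computed from the pixels inside the disk
local_threshold = rank.otsu(toy, disk(8))
binarized = toy > local_threshold

print(local_threshold.shape, binarized.dtype)
```

The array returned by `rank.otsu` has the same shape as the image, so comparing the two element by element gives a binary image directly. The disk has side `2 * radius + 1`, which is why the plotting code needs `radius + 1` when it pastes the disk into the center of a blank image.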


In [None]:
im_selection = coins_original[20:90, 10:80]

fig = plt.figure( figsize = (10, 8))
fig.text(0.12, 1., 'Original', fontsize = 1.3* my_fontsize)
fig.text(0.36, 1., 'Vicinity', fontsize = 1.3* my_fontsize)
fig.text(0.56, 1., 'Threshold', fontsize = 1.3* my_fontsize)
fig.text(0.82, 1., 'Binarized', fontsize = 1.3* my_fontsize)

# We explore the impact of changing the radius to understand how 
# things work!
#
for i, radius in enumerate( [8, 16, 32] ):
    vicinity = disk(radius)

    plt.subplot(3, 4, i*4 + 1)
    plt.imshow( im_selection, cmap = "gray", 
                vmin = intens_min, vmax = intens_max )

    plt.subplot(3, 4, i*4 + 2)
    w, h = im_selection.shape
    pretty_d_im = np.zeros(im_selection.shape)
    pretty_d_im[(w//2 - radius):(w//2 + radius+1), 
                (h//2 - radius):(h//2 + radius+1)] = vicinity
    plt.imshow( pretty_d_im, cmap="gray" )
    
    # Compute the local Otsu threshold once and reuse it in both panels
    #
    local_threshold = rank.otsu(im_selection, vicinity)

    # We do not set intensity limits in order to exaggerate differences
    # so they are more visible
    #
    plt.subplot(3, 4, i*4 + 3)
    plt.imshow( local_threshold, cmap = "gray" )
    
    plt.subplot(3, 4, i*4 + 4)
    local_binarized_im = im_selection > local_threshold
    plt.imshow( local_binarized_im, cmap = "gray" )
    
plt.tight_layout()


<br>

It is visually apparent that this works quite well when the radius is on the order of the size of the features in the foreground.

Let us look at what the contours can do for us now that we know how to identify the background.



# Contour length as a clue to relevance

Considering the contours identified earlier, it is clear that not all of them are similar. Some are very small and are likely identifying similar regions within the background or regions of the foreground. Others are very long and may be related to a background with changing properties.
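
The cells in this section use the number of points in a contour, `len(contour)`, as a stand-in for its length. If a geometric length is preferred, it can be computed by summing the distances between consecutive points. The sketch below is one way to do that; the function name `contour_length` and the square contour are hypothetical.

```python
import numpy as np

def contour_length(contour):
    """Sum of the distances between consecutive contour points."""
    steps = np.diff(contour, axis=0)
    return np.linalg.norm(steps, axis=1).sum()

# Hypothetical closed contour: a 3 x 3 square traced with only 5 points
square = np.array([[0, 0], [0, 3], [3, 3], [3, 0], [0, 0]], dtype=float)

print(f"Number of points: {len(square)}")
print(f"Geometric length: {contour_length(square)}")
```

The two measures can differ a lot when points are far apart, as in this hand-made square. For contours returned by `find_contours` the points are closely spaced, so the point count is a reasonable proxy.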

In [None]:
radius = 32
vicinity = disk(radius)

local_threshold = rank.otsu( coins_original, vicinity )
binarized_im = coins_original > local_threshold

fig = plt.figure( figsize = (10, 8))
ax = []

ax.append(fig.add_subplot(121))
ax[-1].imshow(binarized_im, cmap="gray")

ax.append(fig.add_subplot(122))
ax[-1].imshow( coins_original, cmap="gray", 
               vmin = intens_min, vmax = intens_max )

plt.tight_layout()



In [None]:
fig = plt.figure( figsize = (11, 10) )
ax = fig.add_subplot(111)

ax.imshow( coins_original, cmap = 'gray', 
           vmin = intens_min, vmax = intens_max )

contours = measure.find_contours(binarized_im)

clean_contours = []
for n, contour in enumerate(contours):
    
    # We keep only contours of intermediate length: very short ones are
    # likely just noise, and very long ones tend to follow the background
    #
    if 150 < len(contour) < 300:
        ax.plot(contour[:, 1], contour[:, 0], linewidth=2)
        clean_contours.append( contour )
        
print(f"There are {len(clean_contours)} good contours in image.")

<br> 

Well, not fully. The second coin from the right on the top row is not great.

# Properties of foreground features

Now that we have our contours, if we could retrieve the pixels inside each of the contours we could calculate some properties of the objects inside.

Fortunately, there is a function that returns a mask array for points inside a polygon (or a contour):

> `measure.grid_points_in_poly`( shape, verts )

The documentation tells us the following:

> Test whether points on a specified grid are inside a polygon.
>
> For each ``(r, c)`` coordinate on a grid, i.e. ``(0, 0)``, ``(0, 1)`` etc.,
test whether that point lies inside a polygon.
>
> **Parameters**
>
> ----------
> shape : tuple (M, N)
> >
> > Shape of the grid.
>
> verts : (V, 2) array
> >
> > Specify the V vertices of the polygon, sorted either clockwise
    or anti-clockwise. The first point may (but does not need to be)
    duplicated.



In [None]:
# Choose a coin
#
#k, x, y = 0, 330, 45
k, x, y = 1, 150, 50
 

contour_mask = measure.grid_points_in_poly( coins_original.shape, 
                                            clean_contours[k] )

masked_coin = coins_original * contour_mask
zoomed_image, x0, y0 = grayscale_zoom( masked_coin, x, y, 4 )

fig = plt.figure( figsize = (12, 6))
ax = []

ax.append( fig.add_subplot(121) )
ax[-1].imshow( masked_coin, cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )

ax.append( fig.add_subplot(122) )
ax[-1].imshow( zoomed_image, cmap = 'gray', 
               vmin = intens_min, vmax = intens_max )

plt.tight_layout()

By counting the number of values that are `True` in the mask, we can determine the area of our object.

We can also extract the intensities of the image inside the contour and find their distribution.
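
Both steps can be tried on a toy example first. The sketch below builds a mask from a hypothetical square polygon on a 6 x 6 grid, counts its area, and pulls out the intensities inside it. The toy image and the vertices are assumptions chosen so the result is easy to check by hand.

```python
import numpy as np
from skimage import measure

# Toy 6 x 6 image whose pixel values simply count 0..35
toy = np.arange(36).reshape(6, 6)

# Hypothetical square contour in (row, column) coordinates; the half-integer
# vertices keep grid points off the polygon boundary
verts = np.array([[0.5, 0.5], [0.5, 3.5], [3.5, 3.5], [3.5, 0.5]])

mask = measure.grid_points_in_poly(toy.shape, verts)

area = mask.sum()       # number of grid points inside the polygon
inside = toy[mask]      # 1D array with the intensities of those pixels

print(mask.astype(int))
print(f"Area: {area} pixels; median intensity inside: {np.median(inside)}")
```

Indexing the image with the boolean mask returns only the pixels inside the polygon, including any that happen to have intensity zero.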


In [None]:
print(f"The area inside contour {k} contains {contour_mask.sum()} pixels.\n" )

# Select the pixels inside the contour directly with the boolean mask;
# unlike filtering on `> 0`, this keeps any zero-intensity pixels in the coin
#
intensities = coins_original[contour_mask]

fig = plt.figure(figsize = (12, 4))
ax = fig.add_subplot(111)

half_frame(ax, 'Intensity', 'Frequency', font_size = my_fontsize)
# Bin edges run from 0 to 255 so that the brightest pixels are not dropped
ax.hist(intensities, bins = np.arange(0, 256, 5), rwidth = 0.9)
ax.set_xlim(0, 255)

ax.vlines(np.median(intensities), 0, 300, color = 'black', lw = 4)

plt.tight_layout()

## Properties of all the coins 

Now, you can go create a mask that covers all the coins at the same time.  Then repeat the analysis for the intensities of the coins in the image.

# Identifying cells in microscopy images


Microscopy offers incredible windows into biological systems at the cellular and molecular level. To experience this wonderful world, we will look at images of cells on plates from the Cell Image Library.

We will write code to identify the contour of the cells, and then measure how well our code performs.

In [None]:
cells_folder = Path.cwd() / 'Data' / 'Cell_images' / 'BBBC022_v1_images_20585w1'
cell_images = list( cells_folder.glob('*') )
print(f"There are {len(cell_images)} images in folder.")
print()

for i, file_path in enumerate( cell_images ):
    # Print only the file name, which works wherever the folder is located
    print(f"{i:>2} - ...{file_path.name}")

In [None]:
k = 1
plate = imread(cell_images[k])

print(f"Image with index {k} has shape {plate.shape}.\n\n"
      f"Its maximum and minimum intensities are {plate.max()} "
      f"and {plate.min()}, respectively.\n")

plate_min = plate.min()
delta = plate.max() - plate_min

# We have to be very careful with operations on np arrays 
# involving several steps. It is important to use parentheses liberally. 
# 
plate_u = ( 255 * ((plate - plate_min) / delta) ).astype(np.uint8)

print(f"After re-scaling, maximum and minimum intensities are "
      f"{plate_u.max()} and {plate_u.min()}, respectively.")


fig = plt.figure( figsize = (10, 10) )
ax = []

ax.append( fig.add_subplot(121) )
ax[-1].imshow( plate[200:300,200:300], cmap = 'gray', 
               vmin = plate.min(), vmax = plate.max() )

ax.append( fig.add_subplot(122) )
ax[-1].imshow( plate_u[200:300,200:300], cmap = 'gray', 
               vmin = 0, vmax = 255 )

plt.tight_layout()

This image is actually nicer than the one with the coins.  The background appears to be much more uniform and more distinct from the foreground.
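
The warning about parentheses in the re-scaling cell is worth a concrete illustration. The sketch below assumes a 16-bit image (the three intensity values are made up) and compares dividing before multiplying with multiplying before dividing.

```python
import numpy as np

# Hypothetical 16-bit intensities
plate = np.array([1000, 3000, 5000], dtype=np.uint16)
plate_min = plate.min()
delta = plate.max() - plate_min

# Divide first: the ratio is a float between 0 and 1, so 255 * ratio is safe
safe = ( 255 * ((plate - plate_min) / delta) ).astype(np.uint8)

# Multiply first: the product stays in 16 bits and silently wraps around
wrapped = ( 255 * (plate - plate_min) / delta ).astype(np.uint8)

print("divide first:  ", safe)
print("multiply first:", wrapped)
```

No error or warning is raised in the second case, which is what makes this kind of mistake hard to spot. Printing the minimum and maximum after re-scaling, as the cell above does, is a cheap way to catch it.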

## Function for calculating histogram of intensities

Let's put the code that calculates the distribution of intensities in a picture into a function and use it to generate histograms for all cell plates.


## Functions for identifying relevant contours

Re-write some of the code above in order to identify the contours of the cells in the plates.  Does the code need to be different based on whether the background is uniform or not?

## Functions for calculating properties of the cells

Write functions that generate a mask for the pixels within a contour.

Write a function that calculates the total number of pixels within a contour.

Write a function that calculates a histogram of the intensities within a contour and descriptive statistics of those values.
