In [None]:
%matplotlib inline

# Webvalley 2017 - Imaging Pt. 1

The goal of this lab is to become familiar with basic image manipulation in the Jupyter notebook environment.

## Basic image operations

In this lab we will work with two-dimensional signals, i.e. images. A handy Python module for image manipulation is <a href="http://scikit-image.org/"><font style="TrueType">scikit-image</font></a>.

Our first task will be to load and display some pictures.

In [None]:
from __future__ import division
import matplotlib.pyplot as plt
import numpy as np
from skimage import data, io

astro = data.astronaut() # scikit-image ships with some sample images (try pressing <TAB> after "data.")
fish = io.imread('lionfish.jpg') # or you can load a custom one (data.imread is no longer available in recent scikit-image releases)

plt.subplot(1,2,1)
plt.imshow(astro)

plt.subplot(1,2,2)
plt.imshow(fish);

Can you guess the data type and shape of the images?

In [None]:
print("image data type is: {}".format('...'))
print("image shape is: {}".format('...'))

As expected, an *RGB* image is stored in memory as a three-dimensional array ($\text{shape} = \text{rows} \times \text{columns} \times \text{channels}$). The intensities of each color channel are saved in a separate matrix.

Do you know how to slice a ``numpy.array``? Check the next image.

![idx](numpy_indexing.png)
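
As a minimal sketch of these slicing rules, the snippet below builds a tiny synthetic array and extracts a row, a plane along the last axis, and a small corner. The name `toy` and its values are illustrative assumptions and are not part of the lab data.

```python
import numpy as np

# A hypothetical 2 x 3 x 3 array standing in for a tiny RGB image:
# 2 rows, 3 columns, 3 channels.
toy = np.arange(18).reshape(2, 3, 3)

first_row = toy[0]            # shape (3, 3): all columns and channels of row 0
last_plane = toy[:, :, -1]    # shape (2, 3): only the last plane along axis 2
corner = toy[:1, :2, 0]       # shape (1, 2): first row, first two columns, plane 0

print(first_row.shape, last_plane.shape, corner.shape)
```

Slicing with `:` keeps an axis, while indexing with a single integer drops it. That is why a single plane of a three-dimensional array is two-dimensional.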

Try to extract the three color channels from an image of choice and print their shape.

In [None]:
R = astro[:,:,0]
G = '...'
B = '...'
print("R shape is: {}".format('...'))
print("G shape is: {}".format('...'))
print("B shape is: {}".format('...'))

Let's inspect the content of a channel, for instance by printing the first 5 elements of the first 3 rows of the red one.

In [None]:
print(R['...'])

It looks like an image is stored in memory as a matrix filled with integer numbers. Can you guess the right range? Check it out in the next box.

Hint: you can use <font face="TrueType">np.max</font> and <font face="TrueType">np.min</font> on any input <font face="TrueType">numpy.array</font>.

In [None]:
print("Max value for the red channel: {}".format('...'))
print("Min value for the red channel: {}".format('...'))

8-bit images are stored in memory as matrices filled with integer numbers ranging from $0$ to $255$. However, it is sometimes useful to represent an image as a matrix of floats ranging from $0$ to $1$. Write a Python function that implements this normalization and test it on an input matrix.

In [None]:
def my_uint2float(img):
    return '...'
plt.imshow(my_uint2float(G), cmap='Blues');

Well done. However, ``skimage`` can do that for us. Check the documentation of <a href="http://scikit-image.org/docs/dev/api/skimage.html#skimage.img_as_float">``skimage.img_as_float``</a>.

In [None]:
import skimage
skimage.img_as_float??

Now, apply your new function to the three channels and try to visualize them in separate sections of the same figure. In which channel do you expect the astronaut's suit to have the highest values?

Hint 1: use <a href="http://matplotlib.org/api/colorbar_api.html"><font face="TrueType">plt.colorbar</font></a> to see the color mapping.

Hint 2: stick to the same colormap used before

In [None]:
plt.figure(figsize=(15,5))

plt.subplot(1,3,1)
plt.imshow(my_uint2float(R), cmap='gray');
plt.colorbar(orientation='vertical')
plt.title('R')

plt.subplot(1,3,2)
plt.imshow(my_uint2float(G), cmap='gray');
plt.title('G')
plt.colorbar(orientation='vertical')

plt.subplot(1,3,3)
plt.imshow(my_uint2float(B), cmap='gray');
plt.title('B')
plt.colorbar(orientation='vertical')

plt.tight_layout() # a handy command that adjusts the spacing so that subplots and their labels do not overlap

Did you guess the right color channel? Bravo! Let's move on.

## RGB to grayscale conversion

So, a color image is a collection of three matrices, each representing a different color channel. How can we represent a grayscale image? How many *color channels* do we need? A color image can be encoded in grayscale using the following linear transformation:

$Y = 0.2125 \cdot R + 0.7154 \cdot G + 0.0721 \cdot B$

*The coefficients represent the measured intensity perception of typical trichromat humans, depending on the primaries being used; in particular, human vision is most sensitive to green and least sensitive to blue.* [cit. <a href="https://en.wikipedia.org/wiki/Grayscale">Wikipedia</a>]
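
To get a feel for these coefficients, here is a small sketch that applies them to three hypothetical pixels whose channels are already scaled to $[0, 1]$. The pixel names and values are illustrative assumptions. The weights are copied from the formula above.

```python
import numpy as np

# Luminance weights copied from the formula above (R, G, B order).
weights = np.array([0.2125, 0.7154, 0.0721])

# Hypothetical pixels with channels already scaled to [0, 1].
white = np.array([1.0, 1.0, 1.0])
pure_green = np.array([0.0, 1.0, 0.0])
pure_blue = np.array([0.0, 0.0, 1.0])

# The dot product computes 0.2125*R + 0.7154*G + 0.0721*B for each pixel.
y_white = weights @ white
y_green = weights @ pure_green
y_blue = weights @ pure_blue

print(y_white, y_green, y_blue)
```

The weights sum to one, so a white pixel keeps full brightness. A saturated green pixel comes out much brighter than a saturated blue one, in line with the quoted sentence.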

Write a Python function that converts the three channels of an input RGB image to float and then combines them in a grayscale encoding.

In [None]:
def my_rgb2gray(img):
    return '...'

Now test it on one of the images above (to obtain more pleasant results try to specify the option ``cmap='gray'`` for the function ``plt.imshow``).

In [None]:
plt.imshow(my_rgb2gray(fish), cmap='gray');

Very good. We developed our grayscale conversion utility, but as you can imagine ``skimage`` can do that for us. Import the ``color`` module from the main library and check the help function for <a href="http://scikit-image.org/docs/dev/api/skimage.color.html#skimage.color.rgb2gray">``color.rgb2gray``</a>.

In [None]:
from skimage import color
color.rgb2gray??

In [None]:
plt.imshow(color.rgb2gray(fish), cmap='gray');

Apparently the two functions return the same thing. But, how can we be sure of that? In other words, can we measure the similarity between two images (matrices)? 

The answer is: of course we can, and there are several ways to do that. Let's introduce here the most basic image distance measure.

The key idea here is to unroll the two images $A$ and $B$ with shape <font face="TrueType">(m, n)</font>, in two vectors $a$ and $b$ shaped as <font face="TrueType">(m $\cdot$ n, 1)</font>. Then a simple distance between them can be evaluated as follows.

$$\text{RSS} = \sum_{i=1}^{m\cdot n} (a_i - b_i)^2$$

This measure is known as <a href="https://en.wikipedia.org/wiki/Residual_sum_of_squares">Residual Sum of Squares</a> and it is going to be useful in the next classes.
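
As a worked example of the formula, the sketch below unrolls two hypothetical $2 \times 2$ matrices and sums the squared differences by hand. The names `img_a` and `img_b` and their values are made up for illustration. Note that `ravel` returns a flat vector of shape `(m * n,)` rather than `(m * n, 1)`, which makes no difference to the sum.

```python
import numpy as np

# Two hypothetical 2 x 2 "images" that differ in two positions.
img_a = np.array([[1.0, 2.0], [3.0, 4.0]])
img_b = np.array([[1.0, 0.0], [3.0, 5.0]])

a = img_a.ravel()   # unrolled to shape (4,)
b = img_b.ravel()

diff = a - b                          # [0, 2, 0, -1]
toy_rss = float(np.sum(diff ** 2))    # 0 + 4 + 0 + 1
print(toy_rss)
```

Identical inputs give zero, and the value grows with both the number of differing pixels and the size of each difference.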

Implement a Python function that calculates the RSS between two input images and test it on the output obtained from <font face="TrueType">color.rgb2gray</font> and <font face="TrueType">my_rgb2gray</font> on the same image.

In [None]:
def RSS(a, b):
    return '...'

In [None]:
fish1 = color.rgb2gray(fish)
fish2 = my_rgb2gray(fish)
print("RSS(fish1, fish2) = {}".format(RSS(fish1, fish2)))

Did you get $RSS=0$ (or a value very close to zero, which can happen because of floating-point rounding)? Good.

## Histograms

Another strategy to check the distance between images is to take advantage of the distribution of their color intensities over the three channels. Let's try to visualize these distributions as histograms, using the image of the lionfish.

Hint: check the documentation of  <font face="TrueType">plt.hist</font>.

In [None]:
plt.hist??

In [None]:
R = '...'
G = '...'
B = '...'

plt.figure(figsize=(8,3))

plt.subplot(1,3,1)
plt.hist(R.ravel(), density=True, color='r'); # density=True replaces the older normed=True option
plt.ylim([0,0.03])

plt.subplot(1,3,2)
plt.hist(G.ravel(), density=True, color='g');
plt.ylim([0,0.03])

plt.subplot(1,3,3)
plt.hist(B.ravel(), density=True, color='b');
plt.ylim([0,0.03])

plt.tight_layout() # a handy command that adjusts the spacing so that subplots and their labels do not overlap

From those histograms, it looks like the blue channel can be used to discriminate the foreground (a lionfish) from the background. This is going to be the goal of the next section.

## Background suppression

A <i>binary mask</i> is a simple but effective way to perform a fast background suppression. You can obtain a binary mask in several ways. Let's see an example: create $A$, a simple $3\times 3$ matrix, and then print the binary mask corresponding to the positions where its values are greater than a certain threshold.

In [None]:
A = np.array([[2,2,2], [2,3,2], [2,4,2]])
print(A>2)

Easy, right? Now you can perform a simple background suppression by identifying two thresholds from the histograms above and then building the binary mask of the values lying between them.

Hint: check the documentation for <font face="TrueType">np.multiply</font>.
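
A minimal sketch of the two-threshold idea on a toy matrix follows. The values and the thresholds `low` and `high` are arbitrary assumptions, not taken from the lionfish histograms. Multiplying two boolean masks elementwise acts as a logical AND.

```python
import numpy as np

# A hypothetical 3 x 3 single-channel "image".
toy = np.array([[10, 50, 90],
                [120, 160, 200],
                [30, 140, 250]])
low, high = 40, 150   # illustrative thresholds

above = toy > low     # True where the value exceeds the lower threshold
below = toy < high    # True where the value is under the upper threshold

# Elementwise product of two boolean arrays: True only where both are True.
band = np.multiply(above, below)
print(band)
```

The same mask can be written as `above & below` or `np.logical_and(above, below)`, so the choice here is mostly a matter of taste.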

In [None]:
np.multiply??

In [None]:
mask = '...'
plt.imshow(mask, cmap='gray');

We can improve our mask a bit by applying some morphological operations (**optional**).

In [None]:
from skimage import morphology

selem = morphology.disk(8)
mask2 = morphology.dilation(mask, selem)

plt.imshow(mask2);

Now we can finally suppress the background in our image.

Hint 1: ``numpy`` arrays can be indexed with ``bool`` arrays (boolean indexing).

Hint 2: to recompose an image that was previously decomposed in its three channels you can use ``np.dstack``.
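
The two hints above can be sketched on toy data as follows. The channel names `c0`, `c1`, `c2` and the mask `keep` are hypothetical and stand in for real image channels and a real mask.

```python
import numpy as np

# Three hypothetical 2 x 2 channels and a boolean mask of pixels to keep.
c0 = np.array([[1, 2], [3, 4]])
c1 = c0 * 10
c2 = c0 * 100
keep = np.array([[True, False], [False, True]])

# Boolean indexing: in a copy, set the pixels outside the mask to zero.
c0_masked = c0.copy()
c0_masked[~keep] = 0

# np.dstack stacks 2D arrays along a new third axis (rows x columns x channels).
stacked = np.dstack((c0_masked, c1, c2))
print(stacked.shape)
```

Working on a copy leaves the original channel untouched, which is convenient when you want to compare the result with the input afterwards.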

In [None]:
np.dstack??

Hint 3: <font face="TrueType">numpy</font> has some utilities to initialize matrices and vectors, check for instance <font face="TrueType">np.ones</font> or <font face="TrueType">np.zeros</font>.

In [None]:
np.ones??

In [None]:
'...'
fish2 = '...'

plt.imshow(fish2);

The result is a bit ugly, yeah. But it's reasonable given the extreme simplicity of the method.

## Image noise

As we did before for 1D signals, we can modify an image by applying some transformations to the intensity of its pixels. Let's try, for instance, to write a function that converts an image to grayscale and then adds some Gaussian noise to it.
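
As a rough sketch of the noise step alone, the snippet below adds Gaussian noise to a flat synthetic grayscale array. The seed, the array `gray`, the noise parameters, and the final clipping to $[0, 1]$ are all illustrative assumptions. The lab does not prescribe any of them.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator, used here only for reproducibility

gray = np.full((4, 5), 0.5)      # hypothetical flat grayscale image in [0, 1]
mu, sigma = 0.0, 0.1             # illustrative noise mean and standard deviation

# Draw one noise sample per pixel and add it to the image.
noise = rng.normal(loc=mu, scale=sigma, size=gray.shape)

# Clipping keeps the result in [0, 1]; this is an optional design choice.
noisy = np.clip(gray + noise, 0.0, 1.0)
print(noisy.shape, noisy.min(), noisy.max())
```

For an image scaled to $[0, 1]$, the standard deviation directly controls how visible the noise is. Values comparable to the full intensity range mostly hide the underlying picture.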

In [None]:
from skimage import color

def rgb2noisygray(img, mu, sigma):
    return '...'

noisy_img = rgb2noisygray(fish, 0, 1)
plt.imshow(noisy_img, cmap='gray');

### Exercises

Write some custom functions that implement the following basic operations.

<ol>
<li>Decompose an image into its three channels, add some Gaussian random noise to each one of them, recompose and visualize the noisy image.</li>
<li>Pick an image you like and select the foreground, then suppress one of its three channels. The result may look like the following image.
<img src="ex2.png" width=600 height=450></img></li>
<li>Generate and visualize some 2D sinusoids.</li>
<li>Convert a color image to grayscale and then add the 2D sinusoidal noise you just generated. The output should look like the following image.
<img src="ex1.png" width=600 height=450></img>
Hint: check this out <a href="http://docs.scipy.org/doc/numpy/reference/generated/numpy.meshgrid.html"><font style="TrueType">np.meshgrid</font></a>
</li>
</ol>
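
As a starting point for the sinusoid exercises, here is one possible sketch of how `np.meshgrid` can build a 2D sinusoid. The grid size and the spatial frequencies `fx` and `fy` are arbitrary assumptions. Other parameterizations work equally well.

```python
import numpy as np

rows, cols = 64, 96              # hypothetical image size
x = np.arange(cols)              # column coordinates
y = np.arange(rows)              # row coordinates

# With the default indexing, both outputs have shape (rows, cols).
X, Y = np.meshgrid(x, y)

fx, fy = 4 / cols, 2 / rows      # illustrative frequencies, in cycles per pixel
wave = np.sin(2 * np.pi * (fx * X + fy * Y))
print(wave.shape)
```

The array `wave` takes values in $[-1, 1]$, so it usually needs rescaling before being added to an image stored in $[0, 1]$.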