# Image Analysis and Featurization

Chris Reger

Credit to: J. Gartner, D. Rupp, and F. Burkholder 

## Objectives:
1. Describe how images are represented in computers
2. Use `numpy` methods to manipulate images
3. Identify issues with applying models to images
4. Understand basic feature extraction
5. Manipulate images using the `scikit-image` library

Bonus: Show you a nice example of an image-based capstone, and make the case that figuring out and explaining **why** a machine learning model works can be **more interesting** and **more fun** than just optimizing it to get a good score.

## How are images stored in computers?


Images are saved as a matrix of numbers.  Pixel intensities are quantified using various scales.  In each scale, the higher the number, the brighter the pixel (or, for a single color channel, the more of that color is present).

* `uint8` is an unsigned, 8 bit integer.  2^8 = 256, so `uint8` values go from 0 - 255.  In a grayscale image, 0 will be black and 255 white.  This is the most common format.
* `uint16` is also unsigned, 16 bit.  2^16 = 65536, so `uint16` values go from 0 - 65535.  Each value takes twice the memory of a `uint8` value.
* `float` is a floating-point value (usually 64 bit) between 0 and 1.  0 is black, 1 is white.
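The ranges and memory costs listed above can be checked directly with `numpy`. This is a small sketch on a made-up 2x2 image (the array `tiny_uint8` is hypothetical, not one of the notebook's images):

```python
import numpy as np

# Value ranges of the two unsigned integer formats.
uint8_range = (np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)
uint16_range = (np.iinfo(np.uint16).min, np.iinfo(np.uint16).max)
print(uint8_range, uint16_range)

# A tiny hypothetical 2x2 grayscale image stored as uint8.
tiny_uint8 = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Rescale to floats in [0, 1] by dividing by the largest uint8 value.
tiny_float = tiny_uint8 / 255.0

# Bytes used per pixel by each representation.
bytes_per_pixel = (tiny_uint8.itemsize,
                   tiny_uint8.astype(np.uint16).itemsize,
                   tiny_float.itemsize)
print(bytes_per_pixel)
```

Dividing by 255 is one simple way to move between the integer and float scales; libraries may offer their own conversion helpers.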

![grayscale](imgs/grey_image.jpg)    

What do you notice about this image?  How is the quality?

![deer](imgs/deer_pixeled.jpg)   

### What do we change for color images?

Color images are typically 3 equally-sized matrices (rows x columns), where each matrix specifies how much **red**, **green**, and **blue** is present in the image.

![RGB](imgs/RGB_channels_separation.png)

Color images are stored in _three dimensional_ matrices. 
* The first dimension is usually the height dimension starting from the _top_ of the image. 
* The second dimension is usually the width, starting from the left.
* The third dimension is usually the color. This is the "channel" dimension.

For example, index `[0, 0, 0]` selects the 0th row (the top), the 0th column (the left), and the 0th color (usually red).

Images may sometimes have a fourth channel for transparency ("alpha channel"), or store the colors in an order other than the standard red-green-blue (for example, `OpenCV` uses a blue-green-red ordering for color images).
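The indexing convention, the alpha channel, and the channel-order issue can be sketched with plain `numpy` slicing. The array `rgba` below is a made-up 2x3 image used only for illustration:

```python
import numpy as np

# Hypothetical RGBA image: 2 rows (height), 3 columns (width), 4 channels.
rgba = np.zeros((2, 3, 4), dtype=np.uint8)
rgba[..., 3] = 255               # fully opaque alpha channel
rgba[0, 0] = [255, 0, 0, 255]    # top-left pixel is pure red

# Row 0 (top), column 0 (left), channel 0 (red).
top_left_red = rgba[0, 0, 0]

# Drop the alpha channel to get a plain RGB image.
rgb = rgba[..., :3]

# Reverse the channel axis to switch between RGB and BGR ordering.
bgr = rgb[..., ::-1]
print(rgb[0, 0], bgr[0, 0])
```

Reversing the last axis is its own inverse, so the same slice converts BGR data back to RGB.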

### Dealing with images in Python  
  
There are several Python libraries you can use when working with images:
 1. **Numpy** - this is a standard way to store and work with the image matrix in Python
 2. **scikit-image** - included with Anaconda and developed by the SciPy community.  It can be used to do many of the things we will cover shortly
 3. **OpenCV** - a C++ library for computer vision with a Python wrapper.  It is a very powerful tool, as it ships with several pretrained models and the structure to allow training of your own classical image models.
 4. **PIL** and **Pillow** - PIL is the Python Imaging Library, and Pillow is intended to be a more user-friendly version of it. It adds image processing capabilities to Python.

This notebook will use **skimage** (the import name of scikit-image).  It has very nice [examples](https://scikit-image.org/docs/dev/auto_examples/) and [documentation](https://scikit-image.org/docs/dev/index.html).

In [None]:
import numpy as np
import matplotlib.pyplot as plt

from skimage import io, color, filters
from skimage.transform import resize, rotate
from skimage import data

## Working with an image

### Loading the Image

In [None]:
coin = io.imread('data/coin.jpg')

### What kind of data is `coin`?

In [None]:
type(coin)

### What are its shape and size?

In [None]:
print(f'Shape : {coin.shape}')
print(f'Size  : {coin.size}')

This shows that coin is a 316-by-316 pixel image with three channels (red, green, and blue).

### How do we look at the image?

In [None]:
io.imshow(coin)
io.show()

The above code applies `imshow()` and `show()` functions:
- `imshow()` displays an image
- `show()` displays the pending images queued by `imshow()`


**Take-Away**: *You will need to use* `show()` *when displaying images from non-interactive shells; however, it is not necessary for notebooks.*

### Manipulating the image with Numpy

Let's try and zoom in on the face.

In [None]:
face = coin[70:190,120:240,:]  #[rows, columns, channels]
print(face.shape)
io.imshow(face);

Now let's set the pixels in the 50th row to "black".

In [None]:
coin2 = coin.copy()
coin2[50] = 0
io.imshow(coin2);

And let's color an area "green".

In [None]:
coin2[148:154, 129:135] = [0, 255, 0]  # [red, green, blue]
io.imshow(coin2);

Let's do some boolean indexing!

In [None]:
low_red = coin[:, :, 0] < 100  # mask of pixels whose red channel value is below 100
coin2[low_red] = [255, 0, 0]   # make those pixels pure red
io.imshow(coin2);

Lastly, let's do a vertical flip.

In [None]:
io.imshow(coin2[::-1]);

## Image transformations

Often you need to modify the image before working with it, like **resizing** it or **rotating** it.

In [None]:
io.imshow(resize(coin, (100,100)));

In [None]:
io.imshow(rotate(coin, 90));  #skimage to the rescue - try other angles besides 90 degrees

## Breakout time
### In your own code-blocks, try each of the following:

- Zoom in on the baby's face
- Perform a horizontal flip
- Rotate 90 degrees
- Make the area between the 50th and 60th row black *(=0)*
- Resize the image to double its size
- Use boolean indexing to make the white areas black

### Let's look at a new image.

In [None]:
sunset = io.imread('data/mich_1.jpg')

In [None]:
print(f'Type  : {type(sunset)}')
print(f'Shape : {sunset.shape}')
io.imshow(sunset);

Let's look at the sunset's top left 100 pixel intensities (10 rows x 10 columns) for the first layer (red).

In [None]:
sunset[:10,:10, 0].reshape(10,10)  # channel 0 is red, 1 is green, 2 is blue

### Let's look at the pixel intensities of each of the channels

In [None]:
sunset_red = sunset[:, :, 0]

In [None]:
print(f'Shape of sunset_red : {sunset_red.shape}') 
print(f'Shape of sunset : {sunset.shape}')

Looking at just sunset_red, what do you think we will see?

In [None]:
io.imshow(sunset_red);

Because there is only one channel, the image is displayed in grayscale. By displaying the pixel intensities for each channel in this way, we can get an idea of how much saturation each channel has.

In [None]:
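def plot_channel_saturations():
    fig, axes = plt.subplots(2,2, figsize=(14,10))
    # `name` rather than `color`, so the skimage `color` module is not shadowed inside this function
    for i, name, ax in zip(range(0, 4), ['Original', 'Red', 'Green', 'Blue'], axes.flatten()):
        if i == 0:
            ax.imshow(sunset)
            ax.set_title("Original image")
            continue
        ax.imshow(sunset[:, :, i-1], cmap='gray')  # channel i-1: 0 is red, 1 is green, 2 is blue
        ax.set_title(f"{name} channel saturation")
    fig.tight_layout()  # must be called (with parentheses), once, after all panels are drawn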

In [None]:
plot_channel_saturations()

### Adjusting Exposure Levels

It can be helpful to adjust the exposure.  There is a fairly straightforward method to accomplish this.

In [None]:
from skimage import exposure

In [None]:
def show_exposure_example():
    sunset_bright = exposure.adjust_gamma(sunset, gamma=0.5) # gamma < 1 brightens the image (default = 1 leaves it unchanged)
    sunset_dark = exposure.adjust_gamma(sunset, gamma=2)     # gamma > 1 darkens the image

    fig, axes = plt.subplots(1,3, figsize=(14,10))
    axes[0].imshow(sunset)
    axes[0].set_title("Original")
    axes[1].imshow(sunset_bright)
    axes[1].set_title("Bright")
    axes[2].imshow(sunset_dark)
    axes[2].set_title("Dark");

In [None]:
show_exposure_example()
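To see why a gamma below 1 brightens and a gamma above 1 darkens, here is a minimal sketch of the power-law mapping that gamma adjustment is based on, assuming intensities are first scaled to [0, 1]. The helper `gamma_uint8` and the sample values are hypothetical; this is not the library's implementation, and its rounding may differ from `exposure.adjust_gamma`.

```python
import numpy as np

def gamma_uint8(image, gamma):
    """Apply out = 255 * (in / 255) ** gamma to a uint8 array (illustrative only)."""
    scaled = image.astype(np.float64) / 255.0   # map intensities to [0, 1]
    adjusted = scaled ** gamma                  # gamma < 1 raises mid-tones, gamma > 1 lowers them
    return np.round(adjusted * 255).astype(np.uint8)

pixels = np.array([0, 64, 128, 255], dtype=np.uint8)  # hypothetical intensities
brighter = gamma_uint8(pixels, 0.5)
darker = gamma_uint8(pixels, 2)
print(pixels, brighter, darker)
```

Pure black and pure white stay fixed under this mapping; only the values in between move.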

### What if we don't care about color in an image?
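Make it grayscale: a single channel instead of three, so one third as many values per image.  Note that `rgb2gray` returns floats between 0 and 1, so the memory saving only materializes if the result is stored in a compact dtype such as `uint8`.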

In [None]:
from skimage.color import rgb2gray
sunset_gray = rgb2gray(sunset)

Looking at their shapes:

In [None]:
print("Image shapes:")
print(f"Sunset RGB (3 channel): {sunset.shape}")
print(f"Sunset (gray): {sunset_gray.shape}")
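How much memory grayscale saves depends on the dtype as well as the shape. The sketch below uses a random stand-in image (not the sunset photo) and a weighted channel sum with the luminance weights that, to my understanding, the `rgb2gray` documentation lists; treat the weights as an assumption.

```python
import numpy as np

# Hypothetical stand-in for a 480x640 RGB photo stored as uint8.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)

# Weighted sum of the three channels (weights assumed from the rgb2gray docs).
weights = np.array([0.2125, 0.7154, 0.0721])
gray_float = (rgb / 255.0) @ weights                       # float64 values in [0, 1]
gray_uint8 = np.round(gray_float * 255).astype(np.uint8)   # compact 8-bit version

# Bytes used by each representation.
print(rgb.nbytes, gray_float.nbytes, gray_uint8.nbytes)
```

The float version has a third as many values but uses more bytes than the original, while the `uint8` version is a third of the original size.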

And intensity ranges:

In [None]:
print("\nMinimum and maximum pixel intensities:")
print("Original sunset RGB: ", sunset.min(), ",", sunset.max())
print("Sunset gray (grayscale):", sunset_gray.min(), ",", sunset_gray.max())

And rendering:

In [None]:
io.imshow(sunset_gray);

### A common featurization approach for images: flattening/ raveling

`np.ravel()` can be used to turn the image into one long 1-D vector, with the rows laid end to end.

In [None]:
sunset_gray_values = np.ravel(sunset_gray)
sunset_gray_values.shape
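A small sketch of what raveling does to pixel positions, using a made-up 3x4 array in place of a real image:

```python
import numpy as np

image = np.arange(12).reshape(3, 4)    # hypothetical 3x4 grayscale image

flat = np.ravel(image)                 # rows laid end to end, shape (12,)
restored = flat.reshape(image.shape)   # the original shape is needed to undo it

# The pixel at (row, col) lands at index row * n_cols + col in the flat vector.
row, col = 2, 1
flat_index = row * image.shape[1] + col
print(flat.shape, flat_index, flat[flat_index])
```

Vertically adjacent pixels end up a full row-width apart in the flat vector, which is one way the spatial relationship between pixels gets obscured.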

In [None]:
def plot_sunset_histogram():
    fig = plt.figure(figsize=(10,6))
    ax = fig.add_subplot(111)
    ax.hist(sunset_gray_values, bins=256)
    ax.set_xlabel('Pixel Intensities', fontsize=14)
    ax.set_ylabel('Frequency in Image', fontsize=14)
    ax.set_title("Sunset Image Histogram", fontsize=16);

In [None]:
plot_sunset_histogram()

Looking at the pixel intensities above, can you think of a way to segment the image that would only show the setting sun?

Try playing with the threshold in your notebooks.

In [None]:
sun_threshold_intensity = 0.9 # play with this
setting_sun = (sunset_gray >= sun_threshold_intensity).astype(int)
io.imshow(setting_sun, cmap='gray');

### Featurization: how big was the setting sun in our image?

In [None]:
size_sun_pixels = setting_sun.sum()
print(f'The setting sun was represented by {size_sun_pixels} pixels.')
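The same thresholded mask can yield other simple features, such as the share of the image it covers and its bounding box. This sketch uses a synthetic image with a bright square patch (all values are made up) so the expected numbers are easy to verify by hand:

```python
import numpy as np

gray = np.full((100, 200), 0.2)    # dim background
gray[40:60, 90:110] = 0.95         # bright 20x20 patch

mask = gray >= 0.9                 # boolean mask of bright pixels
n_pixels = int(mask.sum())         # True counts as 1
fraction = mask.mean()             # share of the image covered by the mask

# Bounding box of the bright region: first/last row and column containing True.
rows, cols = np.nonzero(mask)
bbox = (rows.min(), rows.max(), cols.min(), cols.max())
print(n_pixels, fraction, bbox)
```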

### Let's look at the coin again

In [None]:
coin_gray = rgb2gray(coin)
coin_gray_values = np.ravel(coin_gray)
print(coin_gray_values.shape)
io.imshow(coin_gray);

In [None]:
fig = plt.figure(figsize=(12,6))
ax = fig.add_subplot(111)
ax.hist(coin_gray_values, bins=256)
ax.set_xlabel('Pixel Intensities', fontsize=14)
ax.set_ylabel('Frequency in Image', fontsize=14)
ax.set_title("Coin Image Histogram", fontsize=16)
ax.set_ylim([0, 10000]);

### Breakout!
- Figure out a way to segment the coin in the image.
- Determine how many pixels are used to represent the coin.
- Assuming the radius of the coin is 150 pixels, if the coin was perfectly round, how many pixels would have been expected?
- How close was your answer?

## What are some things we could learn about the image?

If we have a color image, how could we find out about aspects of that image?
 - We can take the mean of each color channel to get the overall "mood" of the image
 - For something more complex, we can use K-means and look at the centroids (*we will look at clustering later this week*)
   - K-means clustering is a common way to extract the dominant colors in an image; see [The PyImageSearch blog](https://www.pyimagesearch.com/2014/05/26/opencv-python-k-means-color-clustering/)

#### The average color:

In [None]:
sunset.reshape(-1,3).mean(axis=0)  # the average r,g,b value

#### K-means:

In [None]:
from sklearn.cluster import KMeans  # unsupervised learning

Fit the model and look at what it accomplishes.

In [None]:
X = sunset.reshape(-1,3)
clf = KMeans(n_clusters=3).fit(X)  # looking for the 3 dominant colors
print(f'Dominant Colors :\n {clf.cluster_centers_}\n')
print(f'Shape of Labels: {clf.labels_.shape}\n')
labels = set(clf.labels_)
print(f'Unique labels: {labels}')

What does this look like?

In [None]:
plt.imshow( clf.labels_.reshape(sunset.shape[:2]) );  # reshape to (rows, columns); the colors shown are arbitrary and only distinguish cluster labels
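One way to make the clusters easier to interpret is to repaint every pixel with the centroid color of its cluster. The sketch below does this on a small synthetic two-color image (made up for illustration) rather than the sunset, and fixes `random_state` and `n_init` only to keep the result repeatable:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 20x30 image: left half bluish, right half orange, plus a little noise.
rng = np.random.default_rng(0)
image = np.empty((20, 30, 3))
image[:, :15] = [30, 60, 200]
image[:, 15:] = [240, 150, 40]
image += rng.normal(0, 5, size=image.shape)

pixels = image.reshape(-1, 3)      # one row per pixel, one column per channel
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

# Replace every pixel with the centroid of its cluster, then restore the image shape.
quantized = model.cluster_centers_[model.labels_].reshape(image.shape)
label_image = model.labels_.reshape(image.shape[:2])
print(model.cluster_centers_.round(0))
```

For display with `plt.imshow`, a quantized image on a 0 to 255 scale would typically be cast to `uint8` first.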

## What does ML data look like?

What does our data typically look like in data science?

| &nbsp;| feature_1   | feature_2   | feature_3   | feature_4    | 
| :---- | :---- | :--   | :--   | :--   | 
| sample_1    |   0  | A  |  3.3  |  1  | 
| sample_2    |   1  | A  |  2.3  |  1  | 
| sample_3    |   0  | A  |  2.7  |  0  | 
| sample_4    |   0  | B  |  3.0  |  1  |    

### Why is this an issue with images?

 - We have to unravel the image to make it flat, losing the relationship between surrounding pixels <br/>
 - Lighting will affect the pixel values (a car will have a very different set of pixel values if it is cloudy vs sunny)<br/>
 - Humans are very good at finding shapes and using those to classify an image<br/>
 - You can think of shapes/edges as the differences between adjoining pixels <br/>
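The lighting point and the edges point can be illustrated together. In this sketch a uniform brightness offset stands in for "cloudy vs sunny", which is an idealization, and the pixel strip is made up:

```python
import numpy as np

# Hypothetical 1x6 strip of pixels with an edge in the middle.
cloudy = np.array([[10, 10, 10, 60, 60, 60]], dtype=float)
sunny = cloudy + 100               # same scene, uniformly brighter

# Raw pixel features differ a lot between the two lighting conditions...
raw_gap = np.abs(sunny - cloudy).max()

# ...but the differences between adjoining pixels are identical.
cloudy_diff = np.diff(cloudy, axis=1)
sunny_diff = np.diff(sunny, axis=1)
print(raw_gap, cloudy_diff, sunny_diff)
```

Real lighting changes are rarely a pure offset, so differences reduce the problem rather than remove it.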

## Featurization - Looking for edges in images

It *may* help our ML model to give it edges, instead of pixel intensities, to train and predict on.

This is called [edge detection](https://en.wikipedia.org/wiki/Edge_detection) and there are a variety of ways to do it.

A straight-forward approach is to use the gradient (rate of change) of the pixel intensities.

We are looking for the size and direction of the change in "color".  (We will be using grayscale from here on out.)

The change in intensity around a given pixel, measured between the pixels immediately above and below it and immediately to its left and right, can be described with the following function.  

![gradient](imgs/gradient_fromula.png)

![grid](imgs/image_grid.png)

This gives us 

![solved](imgs/solved.png)

With these values we can complete the calculations:
 - Magnitude (L2-norm): 
   * $ g = \sqrt{g_x^2 + g_y^2}$
 - Direction is the arctangent of the ratio of the partial derivatives:  
   * $ \theta = \arctan(g_y/g_x)$
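The magnitude and direction formulas can be tried on a single pixel. The 3x3 `patch` below is hypothetical (it is not the grid in the figure), and the sign convention (right minus left, below minus above) is an assumption made for this sketch:

```python
import numpy as np

# Hypothetical 3x3 neighbourhood of grayscale intensities around a centre pixel.
patch = np.array([[10, 20, 30],
                  [10, 20, 30],
                  [50, 60, 70]], dtype=float)

# Differences across the centre pixel: right minus left, below minus above.
g_x = patch[1, 2] - patch[1, 0]
g_y = patch[2, 1] - patch[0, 1]

magnitude = np.sqrt(g_x**2 + g_y**2)

# arctan2 copes with g_x == 0 and keeps the correct quadrant.
theta = np.degrees(np.arctan2(g_y, g_x))
print(g_x, g_y, magnitude, theta)
```

Using `np.arctan2(g_y, g_x)` rather than `np.arctan(g_y / g_x)` avoids a division by zero where the intensity does not change horizontally.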

This process is computationally slow when done pixel by pixel, so in practice it is carried out by applying what is called a **convolution**: sliding a small kernel (a vector or matrix of weights of the needed form) across the image.  For the x direction that kernel would be `[-1,0,1]`, which for the example above gives the value `-50`.  This is sometimes referred to as edge detection.
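Sliding the `[-1,0,1]` kernel along a row can be sketched with `np.correlate`, which applies the kernel as written (`np.convolve` would flip it first). The row of pixels is made up and is not the grid from the figure:

```python
import numpy as np

row = np.array([100, 100, 100, 50, 50, 50], dtype=float)  # hypothetical row of pixels
kernel = np.array([-1, 0, 1])

# Each output value is (right neighbour) - (left neighbour) for one interior pixel.
# mode='valid' keeps only positions where the kernel fits entirely inside the row.
g_x = np.correlate(row, kernel, mode='valid')
print(g_x)
```

The output is zero in the flat regions and non-zero only where the intensity steps down, which is why this acts as an edge detector.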

### Sobel operator

Convolutions are used to gain information about the pixels surrounding a single pixel.  You can slide a convolution kernel over an image and end up with a new image where every pixel numerically summarizes its surrounding pixels.   
One of the better-known kernels is the Sobel operator.  It has the mathematical form below:

![sobel](imgs/sobel.png)

The final result is the magnitude of the combined outputs of these two edge detectors (one horizontal, one vertical).
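A hand-written sketch of the idea, assuming the standard 3x3 Sobel kernels (taken here to match the figure) and a made-up 5x5 test image. The helper `sobel_magnitude` is hypothetical and unoptimized; library implementations such as `filters.sobel` may scale their output differently, so the numbers are not expected to match.

```python
import numpy as np

# Standard 3x3 Sobel kernels: KX responds to horizontal change, KY to vertical change.
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]], dtype=float)
KY = KX.T

def sobel_magnitude(gray):
    """Gradient magnitude for the interior pixels of a 2-D grayscale array."""
    h, w = gray.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            window = gray[i:i + 3, j:j + 3]    # 3x3 neighbourhood
            g_x = (window * KX).sum()
            g_y = (window * KY).sum()
            out[i, j] = np.hypot(g_x, g_y)     # sqrt(g_x**2 + g_y**2)
    return out

# Hypothetical 5x5 image: dark on the left, bright on the right.
step = np.zeros((5, 5))
step[:, 3:] = 1.0
edges = sobel_magnitude(step)
print(edges)
```

The output is two pixels smaller in each dimension because border pixels, which lack a full neighbourhood, are skipped.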

In [None]:
sobel_img = filters.sobel(coin_gray)
io.imshow(sobel_img);

We are left with an image that shows the presence of edges as distinct numeric values.

In [None]:
sobel_img[0:10,140:150].round(2)

There are many different filters with varying mathematics behind them, but they share the idea of capturing the difference between one pixel and its neighbors.  Working with the differences between pixels is important because it reduces the effect of different lighting.  Normalizing images first is also usually recommended.
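"Normalization" can mean several things; one simple option is min-max rescaling, sketched below under that assumption. The helper `min_max_normalize` and the two tiny images are hypothetical.

```python
import numpy as np

def min_max_normalize(image):
    """Rescale intensities linearly so the darkest pixel is 0 and the brightest is 1."""
    image = image.astype(np.float64)
    lo, hi = image.min(), image.max()
    if hi == lo:                       # flat image: avoid dividing by zero
        return np.zeros_like(image)
    return (image - lo) / (hi - lo)

dim = np.array([[20, 40], [60, 80]], dtype=np.uint8)   # hypothetical under-exposed image
bright = dim + 100                                     # same scene, uniformly brighter

dim_norm = min_max_normalize(dim)
bright_norm = min_max_normalize(bright)
print(dim_norm, bright_norm)
```

After rescaling, the two versions of the scene are identical, so a filter applied afterwards sees the same input regardless of the brightness offset.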

In [None]:
from skimage.filters import gaussian

In [None]:
#Example of a blur filter
io.imshow(gaussian(coin_gray, sigma=3));

## Objectives:
1. Describe how images are represented in computers
2. Identify issues with applying models to images
3. Understand basic feature extraction

### Review
Let's check out an amazing image-based capstone 1:  [What the fish?!(betta)](https://github.com/joeshull/what_the_fish_betta)