# Image Analysis and Featurization

Frank Burkholder  
Adapted from Austin campus - Joe Gartner and Dan Rupp

Objectives:
1. Describe how images are represented in computers
2. Issues of applying models to images
3. Understand basic feature extraction
4. Show you a nice example of an image-based capstone, and make the case that figuring out and explaining **why** a machine learning model works can be **more interesting** and **more fun** than just optimizing it to get a good score.

## How are images stored in computers?


Images are saved as a matrix of numbers.  Pixel intensities are quantified using various scales.  In each scale, the higher the number, the brighter (more intense) the pixel.  
* `uint8` is an unsigned, 8 bit integer.  2^8 = 256, so `uint8` values go from 0 - 255.  In a grayscale image, 0 will be black and 255 white.  This is the most common format.
* `uint16` is also unsigned, 16 bit.  2^16 = 65536, so `uint16` values go from 0 - 65535.  Twice the memory of a `uint8` value.
* `float` is a floating-point value (usually 64 bit) between 0 and 1.  0 is black, 1 is white.
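
As a quick illustration of these scales, here is a minimal sketch in NumPy. The 2x2 image is made up for the example, and dividing by 255 is one simple way (assumed here, not the only one) to move from `uint8` to a 0-1 float scale.

```python
import numpy as np

# Value ranges and storage size of the two unsigned integer types
for dtype in (np.uint8, np.uint16):
    info = np.iinfo(dtype)
    print(dtype.__name__, info.min, info.max, np.dtype(dtype).itemsize, "byte(s) per value")

# A tiny hypothetical 2x2 grayscale image: black, dark gray, light gray, white
gray_u8 = np.array([[0, 64], [191, 255]], dtype=np.uint8)

# Rescale to floats in [0, 1]; dividing by 255 maps 0 -> 0.0 and 255 -> 1.0
gray_float = gray_u8.astype(np.float64) / 255.0

# uint8 array arithmetic wraps around silently, a common source of bugs
wrapped = gray_u8 + np.uint8(10)   # 255 + 10 wraps around to 9
```

The wrap-around in the last line is a good reason to convert to float before doing arithmetic on pixel values.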


![grayscale](imgs/grey_image.jpg)    

    
What do you notice about this image?  How is the quality?


![deer](imgs/deer_pixeled.jpg)   
    
What do we change for color images?  

Color images are typically 3 equally-sized matrices (rows x columns), where each matrix specifies how much **red**, **green**, and **blue** is present in the image.

![RGB](imgs/RGB_channels_separation.png)
    
<br>

Color images are stored in _three dimensional_ matrices. 
* The first dimension is usually the height dimension starting from the _top_ of the image. 
* The second dimension is usually the width, starting from the left.
* The third dimension is usually the color. This is the "channel" dimension.

For example, `image[0, 0, 0]` is the intensity at the 0th row (the top), the 0th column (the left), and the 0th color (usually red).   
    
Images may sometimes have a fourth channel for transparency ("alpha channel"), or store the colors in an order other than the standard red-green-blue.
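
The indexing conventions above can be sketched with a small hand-built array. The image values here are hypothetical; the blue-green-red order shown is the one OpenCV uses.

```python
import numpy as np

# Hypothetical 2x3 RGB image (2 rows, 3 columns, 3 channels), all black to start
img = np.zeros((2, 3, 3), dtype=np.uint8)
img[0, 0] = [255, 0, 0]      # top-left pixel: pure red
img[0, 2] = [0, 0, 255]      # top-right pixel: pure blue
img[1, :, 1] = 128           # bottom row: medium green in every pixel

top_left_red = img[0, 0, 0]  # row 0, column 0, channel 0 (red)

# Reverse the channel axis to get blue-green-red order
bgr = img[:, :, ::-1]

# Append a fully opaque alpha channel to make an RGBA image
alpha = np.full(img.shape[:2] + (1,), 255, dtype=np.uint8)
rgba = np.concatenate([img, alpha], axis=2)
```

Checking `img.shape` after loading an image is a quick way to see whether you have 1, 3, or 4 channels.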

### Dealing with images in Python  
  
There are several tools you can use when working with images in Python:
 1. **NumPy** - the standard way to store and work with the image matrix in Python.
 2. **scikit-image** - included with Anaconda and developed by the SciPy community.  It can do many of the things we will cover shortly.
 3. **OpenCV** - a C++ library for computer vision that has a Python wrapper.  It is a very powerful tool: it includes several pretrained models and the structure to train your own classical image models.
 4. **PIL** and **Pillow** - PIL is the Python Imaging Library, and Pillow is its friendly fork. It adds image processing capabilities to Python.


This notebook will use **skimage**.  It has very nice [examples](https://scikit-image.org/docs/dev/auto_examples/) and [documentation](https://scikit-image.org/docs/dev/index.html).

In [None]:
import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline 

from skimage import io, color, filters
from skimage.transform import resize, rotate

### Working with an image

In [None]:
coin = io.imread('data/coin.jpg')

In [None]:
print('Type: {}'.format(type(coin)))
print('Shape: {}'.format(coin.shape))

In [None]:
io.imshow(coin);

In [None]:
type(coin)

In [None]:
# let's zoom in
face = coin[75:175,140:240,:]  #it's just an array - you can slice it
print(face.shape)
io.imshow(face);

In [None]:
sunset = io.imread('data/mich_1.jpg')

In [None]:
print('Type: {}'.format(type(sunset)))
print('Shape: {}'.format(sunset.shape))
io.imshow(sunset);

Let's look at the sunset's top left 100 pixel intensities (10 rows x 10 columns) for the first layer (red).

In [None]:
sunset[:10,:10, 0].reshape(10,10)  # channel 0 is red, 1 is green, 2 is blue

### Let's look at the pixel intensities of each of the channels

In [None]:
sunset_red = sunset[:, :, 0]
sunset_green = sunset[:, :, 1]
sunset_blue = sunset[:, :, 2]

In [None]:
print(sunset_red.shape) 
print(sunset.shape)

In [None]:
fig, ax = plt.subplots(1,2, figsize=(10,10))
ax[0].imshow(sunset_red, cmap='gray')
ax[0].set_title("Red channel intensity")
ax[1].imshow(sunset)
ax[1].set_title("Original image");

In [None]:
fig, ax = plt.subplots(1,2, figsize=(10,10))
ax[0].imshow(sunset_blue, cmap='gray')
ax[0].set_title("Blue channel intensity")
ax[1].imshow(sunset)
ax[1].set_title("Original image");

In [None]:
fig, ax = plt.subplots(1,2, figsize=(10,10))
ax[0].imshow(sunset_green, cmap='gray')
ax[0].set_title("Green channel intensity")
ax[1].imshow(sunset)
ax[1].set_title("Original image");

### What if we don't care about color in an image?
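Make it grayscale: one channel instead of three, so one third as many values as the original image.  (Note that `rgb2gray` returns floats between 0 and 1, so the array is not necessarily smaller in memory unless you convert it back to `uint8`.)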

In [None]:
from skimage.color import rgb2gray

In [None]:
sunset_gray = rgb2gray(sunset)
print("Image shapes:")
print("Sunset RGB (3 channel): ", sunset.shape)
print("Sunset (gray): ", sunset_gray.shape)

print("\nMinimum and maximum pixel intensities:")
print("Original sunset RGB: ", sunset.min(), ",", sunset.max())
print("Sunset gray (grayscale):", sunset_gray.min(), ",", sunset_gray.max())
io.imshow(sunset_gray);
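
To make the conversion less of a black box, here is a minimal NumPy sketch of a weighted-sum grayscale conversion. The weights are the luminance weights given in the scikit-image documentation for `rgb2gray`; the 1x3 image is hypothetical.

```python
import numpy as np

# Hypothetical 1x3 RGB image: one pure red, one pure green, one pure blue pixel
rgb = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# Luminance weights for the red, green, and blue channels
weights = np.array([0.2125, 0.7154, 0.0721])

# Scale to [0, 1], then take the weighted sum over the channel axis
gray = (rgb / 255.0) @ weights

print(gray)                      # green contributes most, blue least
print(rgb.nbytes, gray.nbytes)   # float64 gray uses more bytes than uint8 RGB here
```

The unequal weights reflect that human vision is most sensitive to green, so a plain average of the three channels would look different.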

### A common featurization approach for images: flattening/raveling

In [None]:
# make it into one long row vector
sunset_gray_values = np.ravel(sunset_gray)
sunset_gray_values.shape
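
Flattening becomes a featurization once several images are stacked into one matrix with a row per image. The sketch below uses random arrays as stand-ins for real grayscale images of identical size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five hypothetical 4x6 grayscale images with random intensities in [0, 1)
images = rng.random((5, 4, 6))

# Flatten each image into one row: shape (n_samples, n_pixels)
X = images.reshape(len(images), -1)

# ravel on a single image gives the same row, read left to right, top to bottom
first_row = np.ravel(images[0])

print(X.shape)
```

This only works if every image has the same shape, which is one reason resizing is a common preprocessing step.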

In [None]:
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111)
ax.hist(sunset_gray_values, bins=256)
ax.set_xlabel('pixel intensities', fontsize=14)
ax.set_ylabel('frequency in image', fontsize=14)
ax.set_title("Sunset image histogram", fontsize=16);

### Looking at the pixel intensities above, can you think of a way to segment the image that would only show the setting sun?

In [None]:
sun_threshold_intensity = 0 # play with this
setting_sun = (sunset_gray >= sun_threshold_intensity).astype(int)
io.imshow(setting_sun, cmap='gray');

### Featurization: how big was the setting sun in our image?

In [None]:
size_sun_pixels = setting_sun.sum()
print(f"The setting sun was represented by {size_sun_pixels} pixels.")
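
The same threshold-then-count idea can be checked on a synthetic scene where the answer is known. Everything in this sketch (the scene, the intensities, the 0.5 threshold) is made up for illustration.

```python
import numpy as np

# Hypothetical 100x100 scene: dim background (0.2) with a bright disk (0.9)
rows, cols = np.mgrid[0:100, 0:100]
scene = np.full((100, 100), 0.2)
disk = (rows - 50) ** 2 + (cols - 50) ** 2 <= 20 ** 2
scene[disk] = 0.9

# Threshold: True where the pixel is bright enough to count as the object
mask = scene >= 0.5

area_pixels = mask.sum()      # size of the object in pixels
area_fraction = mask.mean()   # share of the image it covers

print(area_pixels, round(np.pi * 20 ** 2))  # pixel count vs. ideal disk area
```

Both numbers are usable features; the fraction has the advantage of not depending on image resolution.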

### Let's look at the coin again

In [None]:
coin_gray = rgb2gray(coin)
coin_gray_values = np.ravel(coin_gray)
print(coin_gray_values.shape)
io.imshow(coin_gray);

In [None]:
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111)
ax.hist(coin_gray_values, bins=256)
ax.set_xlabel('pixel intensities', fontsize=14)
ax.set_ylabel('frequency in image', fontsize=14)
ax.set_title("Coin image histogram", fontsize=16)
ax.set_ylim([0, 10000]);

### Breakout on your own
Figure out a way to segment the coin in the image.

### What are some things we could learn about the image?

If we have a color image, how could we find out about aspects of that image?
 - We can take the mean of each color channel to get the overall mood of the image.
 - For something more complex, we can use K-means and look at the centroids.
 
KMeans clustering is a common way to extract the dominant colors in an image; see [the PyImageSearch blog](https://www.pyimagesearch.com/2014/05/26/opencv-python-k-means-color-clustering/) for an example.

In [None]:
sunset.reshape(-1,3).mean(axis=0)  # the average r,g,b value

In [None]:
from sklearn.cluster import KMeans  # unsupervised learning

In [None]:
X = sunset.reshape(-1,3)
clf = KMeans(n_clusters=3).fit(X)  # looking for the 3 dominant colors
clf.cluster_centers_

In [None]:
clf.labels_.shape

In [None]:
labels = set(clf.labels_)
print(labels)

In [None]:
plt.imshow(clf.labels_.reshape(sunset.shape[:2]));  # color here means nothing, only labels
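
One way to make the clusters easier to interpret is to repaint each pixel with the color of its cluster centroid. This is a minimal sketch on a hypothetical 4x4 two-color image; `n_init` and `random_state` are set only to keep the result reproducible.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 4x4 RGB image: left half reddish, right half bluish
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :2] = [200, 30, 30]
img[:, 2:] = [20, 40, 210]

pixels = img.reshape(-1, 3).astype(float)   # one row per pixel
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

# Replace every pixel by the centroid of its cluster, then restore the image shape
quantized = np.rint(km.cluster_centers_[km.labels_]).reshape(img.shape).astype(np.uint8)

# 2-D map of cluster labels, reshaped using the image's own height and width
label_image = km.labels_.reshape(img.shape[:2])
```

On a real photo, `quantized` would look like a poster-style version of the original, using only as many colors as there are clusters.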

## Image transformations

Often you need to modify the image before working with it, like **resizing** it or **rotating** it.

In [None]:
io.imshow(resize(coin, (100,100)));

In [None]:
io.imshow(rotate(coin, 90));  #skimage to the rescue - try other angles besides 90 degrees
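
Two details of these transforms are easy to miss, so here is a small sketch on a hypothetical noise image: with default settings `resize` returns floats on the 0-1 scale even for `uint8` input, and `rotate` with `resize=True` enlarges the canvas so corners are not cropped.

```python
import numpy as np
from skimage.transform import resize, rotate

# Hypothetical 40x60 RGB image of uint8 noise
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(40, 60, 3), dtype=np.uint8)

small = resize(img, (20, 30), anti_aliasing=True)   # floats in [0, 1]
turned = rotate(img, 45, resize=True)               # canvas grows to fit the rotated image

print(img.dtype, small.dtype, small.shape, turned.shape)
```

If a later step expects 0-255 integers, the output needs converting back after resizing.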

## What does ML data look like?

What does our data typically look like in data science?

| &nbsp;| feature   | feature   | feature   | feature    | 
| :---- | :---- | :--   | :--   | :--   | 
| sample 1     |  A    | B | A| B    | 
| sample 2    |  B   | A     |B     | A| 
| sample 3    | A| A| A| A     | 
| sample 4    | B| B    | B| B    |    
    


<br/><br/>

<details><summary><font size='4'>
Q: Why is this an issue with images? (Click)
    </font>
</summary>
    
<font size='3'>
    <br/>
 - We have to unravel the image to make it flat, losing the relationship between neighboring pixels <br/>
 - Lighting will affect the pixel values (a car will have a very different set of pixel values if it is cloudy vs. sunny)<br/>
 - Humans are very good at finding shapes and using those to classify an image<br/>
 - You can think of shapes/edges as the differences between adjoining pixels <br/>
    
</details>
<br/><br/><br/>
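
The lighting issue can be sketched with a toy example. It assumes the simplest possible lighting change, a uniform brightness offset, which real scenes only roughly follow; the pixel values are hypothetical.

```python
import numpy as np

# Hypothetical 3x4 grayscale patch with a dark-to-bright step in the middle
cloudy = np.array([[10, 10, 60, 60],
                   [10, 10, 60, 60],
                   [10, 10, 60, 60]], dtype=float)

# The same scene with uniformly brighter lighting
sunny = cloudy + 100

# Raw pixel features differ everywhere ...
raw_equal = np.array_equal(cloudy.ravel(), sunny.ravel())

# ... but differences between horizontally adjacent pixels are unchanged
diff_cloudy = np.diff(cloudy, axis=1)
diff_sunny = np.diff(sunny, axis=1)

print(raw_equal, np.array_equal(diff_cloudy, diff_sunny))
```

A model trained on raw pixels sees two different inputs here, while a model trained on pixel differences sees the same one.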


## Featurization - Looking for edges in images

It *may* help our ML model to give it edges, instead of pixel intensities, to train and predict on.

This is called [edge detection](https://en.wikipedia.org/wiki/Edge_detection) and there are a variety of ways to do it.

A straightforward approach is to use the gradient (rate of change) of the pixel intensities: we look for the direction and size of the "color" change.  (I will be using grayscale from here on out.)

For a given pixel, the gradient can be estimated from the pixels immediately to its left and right and immediately above and below it, as described by the following function.  
   

![gradient](imgs/gradient_fromula.png)

![grid](imgs/image_grid.png)

For the calculations:
 - Magnitude (L2-norm): 
   * $ g = \sqrt{g_x^2 + g_y^2}$
 - Direction is the arctangent of the ratio of the partial derivatives:  
   * $ \theta = \arctan(g_y/g_x)$

This gives us:

![solved](imgs/solved.png)
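
The same calculation can be sketched in NumPy for a single pixel. The 3x3 patch below is hypothetical (it is not the grid from the figure), and the sign convention (right minus left, below minus above) is an assumption that may differ from the figure's.

```python
import numpy as np

# Hypothetical 3x3 neighborhood of grayscale intensities
patch = np.array([[ 50,  80, 110],
                  [ 60,  90, 120],
                  [ 70, 100, 130]], dtype=float)

r, c = 1, 1  # the center pixel

# Differences between the neighbors on either side of the center
g_x = patch[r, c + 1] - patch[r, c - 1]   # right minus left:  120 - 60 = 60
g_y = patch[r + 1, c] - patch[r - 1, c]   # below minus above: 100 - 80 = 20

magnitude = np.hypot(g_x, g_y)             # sqrt(g_x**2 + g_y**2)
theta = np.degrees(np.arctan2(g_y, g_x))   # direction in degrees

print(g_x, g_y, magnitude, theta)
```

`np.arctan2` is used instead of a plain arctangent of the ratio so that the case `g_x = 0` does not cause a division by zero.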

Computing this pixel by pixel is slow.  It can instead be done by applying what is called a **convolution**: sliding a small vector (or matrix) of weights, called a kernel, across the image.  For the x direction the kernel would be `[-1,0,1]`, which for the example above gives a value of `-50`.  Using a kernel in this way is one form of edge detection.
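
Here is a minimal sketch of sliding the `[-1,0,1]` kernel along one row of pixels. The row values are hypothetical. Note the distinction NumPy makes: `np.correlate` slides the kernel as written, while `np.convolve` flips it first, which for this kernel only changes the sign.

```python
import numpy as np

# Hypothetical row of intensities with a dark-to-bright step
row = np.array([10, 10, 10, 60, 60, 60], dtype=float)
kernel = np.array([-1, 0, 1], dtype=float)

# Sliding the kernel without flipping it (cross-correlation): right minus left
g_x = np.correlate(row, kernel, mode="valid")

# True convolution flips the kernel first, so the sign is reversed
g_x_conv = np.convolve(row, kernel, mode="valid")

print(g_x)       # nonzero only around the step
print(g_x_conv)
```

In image processing the word "convolution" is often used loosely for both operations, so it is worth checking the sign convention of whatever library you use.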

### Sobel operator

Convolutions are used to gain information about the pixels surrounding a single pixel.  You can slide a convolution kernel over an image and end up with a new image in which every pixel numerically summarizes its neighborhood.   
One of the better-known kernels is the Sobel operator.  Its mathematical makeup is shown below:


![sobel](imgs/sobel.png)


The results of these two edge detectors (one for each direction) are combined by taking their magnitude to get the final result.
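
For intuition, here is a minimal NumPy sketch of the idea using the classic un-normalized 3x3 Sobel kernels on a hypothetical image with one vertical edge. The helper `slide3x3` is written for this example only. The library function `filters.sobel` handles image borders and scaling in its own way, so its numbers will not match this sketch exactly.

```python
import numpy as np

# Classic 3x3 Sobel kernels (un-normalized)
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)   # responds to left-right changes
SOBEL_Y = SOBEL_X.T                              # responds to top-bottom changes

def slide3x3(img, kernel):
    """Slide a 3x3 kernel over img (no flipping, no padding)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

# Hypothetical 5x6 image: dark left half, bright right half
img = np.zeros((5, 6))
img[:, 3:] = 1.0

g_x = slide3x3(img, SOBEL_X)
g_y = slide3x3(img, SOBEL_Y)
edges = np.hypot(g_x, g_y)   # gradient magnitude per pixel

print(edges)
```

The output is nonzero only in the columns next to the step, and `g_y` is zero everywhere because nothing changes from top to bottom.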


In [None]:
sobel_img = filters.sobel(coin_gray)
io.imshow(sobel_img);

We are left with an image that shows the presence of edges as distinct numeric values.

In [None]:
sobel_img[0:10,140:150].round(2)

There are many different filters with varying mathematics behind them, but they share the same idea: each captures the difference between one pixel and its neighbors.  Using the difference between pixels is important because it reduces the effect of different lighting.  Normalizing images first is also usually recommended.
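
Normalization can mean several things. Two common choices, sketched here on a hypothetical image, are min-max scaling and standardization; which one suits a given model is a judgment call.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 8x8 grayscale image with a limited intensity range (40 to 199)
img = rng.integers(40, 200, size=(8, 8), dtype=np.uint8).astype(float)

# Min-max scaling: stretch intensities to span [0, 1]
min_max = (img - img.min()) / (img.max() - img.min())

# Standardization: zero mean, unit variance
standardized = (img - img.mean()) / img.std()

print(min_max.min(), min_max.max())
print(standardized.mean().round(6), standardized.std().round(6))
```

Both assume the image is not a single flat color; otherwise the denominators are zero.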

In [None]:
from skimage.filters import gaussian

In [None]:
#Example of a blur filter
io.imshow(gaussian(coin_gray, sigma=3));

## Objectives:
1. Describe how images are represented in computers
2. Issues of applying models to images
3. Understand basic feature extraction

### Review
Let's check out an amazing image-based capstone 1:  [What the fish?!(betta)](https://github.com/joeshull/what_the_fish_betta)