# Synopsis

An enormous amount of data is stored in images and is available for analysis. On the web, images are everywhere, and being able to filter them algorithmically (say, for a search engine or to identify infringement) is an essential task. Scientifically, many studies rely on images to ascertain the presence or absence of some behavior (remember, a video is really just a series of images in time!).

To start we're going to work on:
* The basics of what an image is
* How to read an image into code
* How to manipulate an image in Python



# Read libraries and functions

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

my_fontsize = 15

In [None]:
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np

from pathlib import Path
from pylab import imread

In [None]:
def half_frame(sub, xaxis_label, yaxis_label, font_size = 15, padding = -0.02):
    """Formats frame, axes, and ticks for matplotlib made graphic 
       with half frame.
       
    """

    # Format graph frame and tick marks
    sub.yaxis.set_ticks_position('left')
    sub.xaxis.set_ticks_position('bottom')
    sub.tick_params(axis = 'both', which = 'major', length = 7, width = 1.5, 
                    direction = 'out', pad = 10, labelsize = font_size)
    sub.tick_params(axis = 'both', which = 'minor', length = 5, width = 1.5, 
                    direction = 'out', labelsize = 10)
    for axis in ['bottom','left']:
        sub.spines[axis].set_linewidth(1.5)
        sub.spines[axis].set_position(("axes", padding))
    for axis in ['top','right']:
        sub.spines[axis].set_visible(False)

    # Format axes
    sub.set_xlabel(xaxis_label, fontsize = 1.6 * font_size)
    sub.set_ylabel(yaxis_label, fontsize = 1.6 * font_size)


# Working with images

Images are what made the Web.  Cats, dogs, porn.

Naturally, programmers wanted to work with images. But images come in all sorts of formats. Considering the strain of misogyny amongst many male programmers, you will not be surprised that a [classic image of a woman](https://en.wikipedia.org/wiki/Lenna) used in explaining image compression approaches is actually of an 'adult entertainment' model.

The kind of inside joke that helped create a hostile environment for women in CS.

Moving on, `pylab` (a module that ships with `matplotlib`) has a function -- `imread()` -- that enables us to easily and reliably import images from a multitude of formats.

We downloaded a bunch of Picasso paintings for you:


In [None]:
picasso_folder = Path.cwd() / 'Data' / 'Picasso'

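# Print only the file name; slicing the full path at a fixed character
# position would depend on where this notebook lives on your machine
for i, file in enumerate( picasso_folder.glob('*') ):
    print(f"{i:>3}--{file.name}")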


Let's select an example to play with

In [None]:
self_portrait_07 = imread( picasso_folder / '1907-Self-Portrait.-13.jpg' )
self_portrait_07

## Images are ingested as `numpy` arrays!

Cool, ha? 

An image is ingested as a multi-dimensional array.

Can you see a lot of `numpy` in the near future?

The `numpy` arrays have stereotypical shapes 


In [None]:
self_portrait_07.shape

Two big numbers and a little one.

The image is actually a rectangle of $n_x$ by $n_y$ pixels.  Because the image is in color, we then need three numbers -- does `RGB` ring a bell? -- to define each pixel's color.


Each pixel has three values: the Red value, the Green value, and the Blue value. 

You should be aware that there are several [color encoding schemes](https://en.wikipedia.org/wiki/List_of_color_spaces_and_their_uses) besides `RGB`. 


It is nice and reassuring to see that an image gets encoded in a manner comprehensible to us. However, it would definitely be nice to **see it** too.

`Matplotlib` to the rescue

In [None]:
plt.imshow(self_portrait_07)
plt.show()

We wrote that the image is actually a rectangle of $n_x$ by $n_y$ pixels.

And `self_portrait_07.shape` returned:

> (766, 597, 3)

So we can see that there are 766 rows and 597 columns.

An interesting point to notice is that **the origin of the picture is in the top-left of the image and the y axis grows in the downward direction**. When the first cathode ray tubes were being developed, it was decided  $-$ arbitrarily? because of European writing convention? $-$ to start the rastering of the rows at the top and move down, instead of starting from the bottom and moving up, as was the style in mathematical notation. Somehow that convention is petrified in image processing even though we no longer use electromagnetic fields to control electrons hitting a screen. 
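To make the row, column, and channel layout concrete, here is a minimal sketch on a small synthetic array. The name `toy_image` is made up for illustration and is not one of the paintings; the sketch simply assumes the same rows-by-columns-by-channels layout that `.shape` reported above.

```python
import numpy as np

# A made-up image with 4 rows and 6 columns, all black to start with
toy_image = np.zeros((4, 6, 3), dtype=np.uint8)

# Unpack the shape: rows first, then columns, then color channels
n_rows, n_cols, n_channels = toy_image.shape
print(n_rows, n_cols, n_channels)

# Row 0 is the TOP row, so [0, 0] is the top-left pixel
toy_image[0, 0] = (255, 0, 0)            # paint the top-left pixel red

# The last row is the BOTTOM row, so [-1, -1] is the bottom-right pixel
toy_image[-1, -1] = (0, 0, 255)          # paint the bottom-right pixel blue

print(toy_image[0, 0], toy_image[n_rows - 1, n_cols - 1])
```

Passing `toy_image` to `plt.imshow()` should show the red pixel in the top-left corner, which is a quick way to convince yourself of the orientation.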


## RGB

If we go to [Wikipedia](https://en.wikipedia.org/wiki/RGB_color_model), we find:

    The RGB color model is an additive color model in which red, green and blue light are added together in various ways to reproduce a broad array of colors. The name of the model comes from the initials of the three additive primary colors, red, green and blue.
    
So the question is, what scale is used for each color value? With numpy we can find that out easily

In [None]:
self_portrait_07.max()

In [None]:
self_portrait_07.min()
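Alongside the extremes, it can help to look at how the values are stored. The sketch below assumes the image holds 8-bit unsigned integers (`uint8`), which is typical for a JPEG file; the `values` array is made up for illustration. It shows the limits of that type and why arithmetic on such values needs some care.

```python
import numpy as np

# Limits of the 8-bit unsigned integer type
info = np.iinfo(np.uint8)
print(info.min, info.max)

# Hypothetical channel values, stored as uint8
values = np.array([10, 200, 250], dtype=np.uint8)

# Adding two uint8 arrays wraps around past 255 instead of saturating
wrapped = values + values
print(wrapped)

# Converting to a wider integer type first avoids the wrap-around
wide = values.astype(np.int64)
safe = wide + wide
print(safe)
```

On your own image, `self_portrait_07.dtype` tells you which type you actually have.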

So each of the color elements has a value from `0` to `255`, and the mixture of the R, G, and B values produces the final color.

The color values are actually stored in that order in the matrix and we can easily check that by plotting.

In [None]:
# Create figure and subplots

fig = plt.figure( figsize = (15, 6)) 
ax = []

# Create color maps
cmaps = [cm.Reds, cm.Greens, cm.Blues]
labels = ['red', 'green', 'blue']

# print each color component separately
for i in range(3):
    ax.append(fig.add_subplot(1, 4, i+1))
    ax[i].imshow(self_portrait_07[:,:,i], cmap=cmaps[i])

# print full image
ax.append( fig.add_subplot(1, 4, 4) )
ax[3].imshow(self_portrait_07)

plt.tight_layout()
plt.show()


Some information that will help you make sense of the `RGB` color scheme.

It is an **additive** scheme.  Adding the maximum value of every channel (i.e., primary color) yields **white**.  Conversely, adding the minimum value of every channel yields **black**.

You can see this property at play in the shirt collar, which has very saturated (i.e., large) values in every channel. 

In contrast, the background has almost no blue in it, but has a saturated red in many parts.
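The additive rule can also be checked with plain arithmetic on `(R, G, B)` triplets. This is only a sketch on hand-written triplets, not on the painting itself, and the variable names are chosen for illustration.

```python
import numpy as np

# Primary colors as (R, G, B) triplets on the 0-255 scale
red = np.array([255, 0, 0])
green = np.array([0, 255, 0])
blue = np.array([0, 0, 255])

# Every channel at its maximum gives white
white = red + green + blue
print(white)

# Every channel at its minimum gives black
black = np.zeros(3, dtype=int)
print(black)

# Mixing only red and green light gives yellow
yellow = red + green
print(yellow)
```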

## Creating our own images.

Since you now understand `RGB` and how images are encoded, you can create your own `RGB` images. 

**Yes, it is AI again!**


In [None]:
# Select your color by specifying rgb values
#
r, g, b = 122, 0, 0

# Create a numpy array filled with INTEGER ones and the desired shape
#
color_patch = np.ones( dtype= 'int64',  shape = (20, 20, 3) ) 

# Set the desired values in each of the channels by 
# multiplying by rgb values

color_patch[:,:,0] *= r   # red
color_patch[:,:,1] *= g   # green
color_patch[:,:,2] *= b   # blue

plt.imshow(color_patch)
plt.show()

The best way to understand how these colors mix is to play a bit with it. 

You could use a [tool](http://www.colortools.net/color_mixer.html) online to get a basic sense or just play with the code above. 
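If you want to try many colors quickly, one option is to wrap the patch creation in a small function. The helper below, `make_color_patch`, is hypothetical and not part of any library; it is a sketch that stores the patch as `uint8` and lets `numpy` broadcasting copy the triplet to every pixel.

```python
import numpy as np

def make_color_patch(r, g, b, size=20):
    """Return a size-by-size RGB patch of a single color (hypothetical helper)."""
    patch = np.zeros((size, size, 3), dtype=np.uint8)
    patch[:, :] = (r, g, b)   # the triplet is broadcast to every pixel
    return patch

dark_red = make_color_patch(122, 0, 0)
print(dark_red.shape, dark_red[0, 0])
```

The result can be shown with `plt.imshow(dark_red)` just like the patch above.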

## Slicing images

You will notice as we work through the examples below that slicing and indexing of `numpy` arrays look very similar to how the `.iloc` approach works in `pandas` dataframes.

Indeed, we can slice an array on any dimension that we want. For example, if we wanted just 20 columns of data we could do that with one slice like so:

In [None]:
print( self_portrait_07[:, :20, :].shape )

plt.imshow(self_portrait_07[:, :20, :])
plt.show()

So we could easily plot only a portion of the image using the built-in slicing

In [None]:
plt.imshow(self_portrait_07[500:, :200, :])
plt.show()

And we can change an entire channel easily. For example, imagine I want to remove the red channel from the image.

**ALERT: To avoid overwriting the array, we must truly copy it!**

In [None]:
# ALERT:
# To avoid overwriting the array, we must truly copy it
#
self_portrait_wo_red = np.copy(self_portrait_07)

self_portrait_wo_red[:, :, 0] = 0

plt.imshow(self_portrait_wo_red)
plt.show()
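Why the copy matters: plain assignment and slicing give you another window onto the same memory, so writing through them changes the original. A minimal sketch on a made-up array (`original` and the other names are illustrative only):

```python
import numpy as np

original = np.full((2, 2, 3), 100, dtype=np.uint8)

# Plain assignment and slicing do NOT copy the data
alias = original
view = original[:, :, 0]
print(alias is original, np.shares_memory(original, view))

# np.copy() allocates new, independent memory
independent = np.copy(original)
print(np.shares_memory(original, independent))

# Zeroing the red channel of the copy leaves the original intact
independent[:, :, 0] = 0
print(original[0, 0], independent[0, 0])
```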

## Playing with slicing

1. Choose a painting and reverse the x axis on the image

In [None]:

plt.show()

2. Load the three musicians painting and cut out just the rightmost musician.

In [None]:

plt.show()

3. Choose a painting and switch the red and blue channels

In [None]:

plt.show()

4. Choose an image and reduce its resolution by a factor of 4 (2 along each dimension).

In [None]:

plt.show()

# Array Methods

## Column and row operations

Many `numpy` functions -- especially those involving summary statistics -- allow you to specify if the operation should be performed on the rows or columns with the `axis` keyword.

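> axis = 0 <-- collapses the rows, giving one value per column

> axis = 1 <-- collapses the columns, giving one value per row

> axis = 2 <-- collapses the depth (for an image, the color channels)

> ...

The same pattern continues for arrays with more dimensions.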


In [None]:
my_array = np.array([[19.72, 20.34], 
                     [21.30, 17.26]])


In [None]:
print(my_array)
print()

print(my_array.shape)
print()


In [None]:
print(f"The mean of my_array with no axis specified is {my_array.mean()}\n")

print(f"The mean of my_array with axis 0 specified is {my_array.mean(axis = 0)}\n")

print(f"The mean of my_array with axis 1 specified is {my_array.mean(axis = 1)}\n")

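# my_array only has two dimensions (axes 0 and 1), so asking for axis 2
# raises an AxisError, which is a subclass of ValueError
try:
    print(f"The mean of my_array with axis 2 specified is {my_array.mean(axis = 2)}\n")
except ValueError as err:
    print(f"Asking for axis 2 fails: {err}\n")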
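For an image-shaped array the same keyword works on three axes. The sketch below uses a made-up array (`toy_image`) to show which dimension disappears for each choice of `axis`, and how a tuple of axes averages over the whole picture for each channel.

```python
import numpy as np

# Made-up "image": 4 rows, 5 columns, 3 channels
toy_image = np.arange(4 * 5 * 3).reshape(4, 5, 3)

# Rows collapsed: one value per column and channel
print(toy_image.mean(axis=0).shape)

# Columns collapsed: one value per row and channel
print(toy_image.mean(axis=1).shape)

# Channels collapsed: one value per pixel
print(toy_image.mean(axis=2).shape)

# Rows and columns collapsed together: one mean per channel
channel_means = toy_image.mean(axis=(0, 1))
print(channel_means)
```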


## Scanning the rows of an image

Using these functions we can profile the usage of color throughout an image. 

One example would be, how does the color usage change as we scan through the rows of an image? This can be useful for numerically identifying different portions of an image that may be of interest. 

Let's use a painting from Picasso's blue period to see if there is any blue signal...

In [None]:
old_guitarist_03 = imread(picasso_folder / '1903-The_Old_Guitarist.-7.jpg')
print(old_guitarist_03.shape)

plt.imshow(old_guitarist_03)
plt.show()

And now let's plot the row average (or said another way, what is color usage as a function of the row)

In [None]:
# Specify color maps
cmaps = [cm.Reds, cm.Greens, cm.Blues]
labels = ['red', 'green', 'blue']

# Create figure and subplots
fig = plt.figure( figsize = (10, 4)) 
ax = fig.add_subplot(111)

half_frame(ax, 'Row index', 'Channel\nmean intensity', font_size= my_fontsize)

# Calculate means by row
for i in range(len(labels)):
    ax.plot( old_guitarist_03.mean(axis=1)[:, i], 
             color = labels[i], linewidth = 1.5 , 
             label = labels[i])
    
ax.legend(loc = 'best', frameon = False, fontsize = my_fontsize)    
# ax.set_ylim(0, 255)

plt.tight_layout()
plt.show()



Isn't that special? as the [Church Lady](https://www.youtube.com/watch?v=puwoUKhZQbg) would say. 

We have some data but it is difficult to match it to the image.  So let's work a little bit more on the visualization.

Add the actual image, rotate the plot so it aligns with the rows, show the full possible range of the channel intensities, that sort of thing... 


In [None]:
# Create figure and subplots
fig = plt.figure( figsize = (10, 4)) 
ax = []

ax.append( fig.add_subplot(121) )
ax[0].imshow(old_guitarist_03)

ax.append( fig.add_subplot(122) )
# The plot is rotated: intensity runs along x and the row index along y
half_frame(ax[1], 'Channel\nmean intensity', 'Row index', font_size= 10)

# Calculate means by row
for i in range(len(labels)):
    ax[1].plot( old_guitarist_03.mean(axis=1)[:, i], 
                range(0, -old_guitarist_03.shape[0], -1),
                color = labels[i], linewidth = 1.5 , 
                label = labels[i])
    
ax[1].legend(loc = 'best', frameon = False, fontsize = my_fontsize)    
ax[1].set_xlim(0, 255)
# Use the image height rather than a hard-coded number of rows
ax[1].set_ylim(-old_guitarist_03.shape[0], 0)

plt.tight_layout()


plt.show()

Ok.

So, we can confirm that the figure is quite dark -- notice how the maximum for every channel is below 100.

Blue is the most intense channel in the top segment above the head. That may be a window or the view of the darkening sky.  

Skin is painted in a greenish tone. So, green is the most intense channel in the rows showing the head and the right arm.

On the other hand, the eye is attracted to the brightest regions of the image, and those do have blue overtones.  


We could repeat this analysis but for axis 0 (i.e., columns instead of rows). And I recommend that you try it in order to figure out how to modify the visualization to best effect.

However, the issue is that we do not look at paintings by rows or by columns, we look at them by **patches**.
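As a first taste of what working by patches could look like, here is a rough sketch that averages the color inside non-overlapping square blocks. The function `patch_means` is a hypothetical helper written for this illustration, and the reshape trick is just one of several ways to do it.

```python
import numpy as np

def patch_means(image, patch_size):
    """Mean color of each non-overlapping square patch (hypothetical helper).

    Rows and columns that do not fill a complete patch are dropped.
    """
    n_rows = image.shape[0] // patch_size
    n_cols = image.shape[1] // patch_size
    trimmed = image[:n_rows * patch_size, :n_cols * patch_size, :]
    # Split each spatial axis into (patch index, position inside the patch)
    blocks = trimmed.reshape(n_rows, patch_size, n_cols, patch_size, -1)
    # Average over the two "position inside the patch" axes
    return blocks.mean(axis=(1, 3))

# Made-up 4x4 image: left half black, right half white
toy_image = np.zeros((4, 4, 3), dtype=np.uint8)
toy_image[:, 2:, :] = 255

means = patch_means(toy_image, patch_size=2)
print(means.shape)
print(means[:, :, 0])
```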



# Filtering image regions

In order to move us toward the ability to analyze patches in an image, let's start by considering how to filter regions of the image that fulfill some criteria.

For example, what if we are only interested in parts of an image where the blue channel's intensity exceeds 50?

In order to do this, we will introduce the concept of masks.  Are you having `pandas` *deja vu*?

In [None]:
woman_w_crow_04 = imread(picasso_folder / '1904-Woman_with_a_Crow.-4.jpg')

plt.imshow(woman_w_crow_04)
plt.show()


In [None]:
woman_w_crow_reds = woman_w_crow_04[:,:, 0]
plt.imshow(woman_w_crow_reds, cmap='gray')
plt.show()

Notice that, as usual, low intensities are shown as black, and high intensities as white.

Notice also, that slicing off the first layer has created an array with a different shape.

In [None]:
print('original shape', woman_w_crow_04.shape)
print('new shape', woman_w_crow_reds.shape)

## Creating simple masks

This new two dimensional array consists of many numbers between 0 and 255

In [None]:
woman_w_crow_reds

To create a Boolean mask array, we can just write down a logical expression as we did with `pandas`

In [None]:
red_mask = woman_w_crow_reds > 100
red_mask

When plotting the mask, `False` will be seen as 0 and `True` as 1.

In [None]:
print('type:', type(red_mask))
print('dtype', red_mask.dtype)
print('shape', red_mask.shape)

plt.imshow(woman_w_crow_reds, cmap='gray')
plt.show()

plt.imshow(red_mask, cmap='gray')
plt.show()

plt.imshow(red_mask[500:600, 200:400], cmap='gray')
plt.show()
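A mask is more than something to look at. Because `True` counts as 1, you can count or index with it directly. A minimal sketch on a made-up single-channel array (`channel` is illustrative, not the painting):

```python
import numpy as np

# Made-up single-channel "image"
channel = np.array([[10, 150, 200],
                    [90, 101, 30]], dtype=np.uint8)

mask = channel > 100
print(mask)

# sum() counts the selected pixels and mean() gives their fraction
n_selected = mask.sum()
fraction = mask.mean()
print(n_selected, fraction)

# Boolean indexing pulls out just the selected values, as a flat array
selected = channel[mask]
print(selected)
```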


A cool property of `numpy` arrays is that we can easily multiply arrays pixel by pixel. 

If we multiply the original image by the mask array, we can obliterate anything that we want to ignore. 



In [None]:
plt.imshow(woman_w_crow_reds * red_mask, cmap='gray')
plt.show()


In [None]:
# uint8 matches the usual dtype of a JPEG image, so the product stays uint8
image_red_mask = np.ones( shape = woman_w_crow_04.shape, dtype = 'uint8' )

image_red_mask[:,:,0] = red_mask
image_red_mask[:,:,1] = red_mask
image_red_mask[:,:,2] = red_mask

In [None]:
plt.imshow(woman_w_crow_04)
plt.show()

plt.imshow(woman_w_crow_04 * image_red_mask)
plt.show()

With these operations, we set to black every single pixel for which the red channel's intensity is 100 or below. 
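Copying the mask into three channels by hand works, but `numpy` broadcasting can do the same job in one line. The sketch below is an alternative under that assumption, shown on a made-up 2x2 image (`toy_image`), and it also shows the equivalent `np.where` form.

```python
import numpy as np

# Made-up 2x2 RGB image
toy_image = np.array([[[200, 10, 10], [50, 60, 70]],
                      [[120, 130, 140], [100, 255, 255]]], dtype=np.uint8)

# Two-dimensional mask from the red channel, shape (2, 2)
mask = toy_image[:, :, 0] > 100

# A trailing axis gives shape (2, 2, 1), which broadcasts over the channels
masked = toy_image * mask[:, :, np.newaxis]
print(masked[:, :, 0])

# np.where says the same thing: keep the pixel where the mask is True, else 0
also_masked = np.where(mask[:, :, np.newaxis], toy_image, 0)
print(np.array_equal(masked, also_masked))
```

Note that the pixel whose red value is exactly 100 is blacked out, because the test is strictly greater than 100.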



## Creating complex masks

We can string together multiple boolean operations as we did with `pandas` thus creating more complex masks. 



In [None]:
first_communion_1895 = imread(picasso_folder / '1895-First_Communion.-11.jpg')
plt.imshow(first_communion_1895)
plt.show()


Imagine we want to focus on all the white or nearly-white stuff in this painting.

A reasonable hypothesis is that the intensity in every channel must be greater than 200...

In [None]:
red_mask = first_communion_1895[:,:,0] > 200
green_mask = first_communion_1895[:,:,1] > 200
blue_mask = first_communion_1895[:,:,2] > 200

combination_mask = red_mask & green_mask & blue_mask

plt.imshow(combination_mask, cmap = 'gray')
plt.show()

# uint8 matches the usual dtype of a JPEG image, so the product stays uint8
image_mask = np.ones( shape = first_communion_1895.shape, dtype = 'uint8' )
image_mask[:,:,0] = combination_mask
image_mask[:,:,1] = combination_mask
image_mask[:,:,2] = combination_mask


plt.imshow( first_communion_1895 * image_mask )
plt.show()
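When every channel has to pass the same test, the three separate masks can be collapsed into one comparison followed by `.all(axis=2)`. A minimal sketch on a made-up 1x3 image (`toy_image`), checking that both routes agree:

```python
import numpy as np

# Made-up 1x3 RGB image: near-white, dark blue, mid-gray
toy_image = np.array([[[250, 240, 230], [10, 10, 120], [128, 128, 128]]],
                     dtype=np.uint8)

# Channel-by-channel masks combined with &
by_channel = ((toy_image[:, :, 0] > 200)
              & (toy_image[:, :, 1] > 200)
              & (toy_image[:, :, 2] > 200))

# Same test in one step: compare every value, then require all channels to pass
all_at_once = (toy_image > 200).all(axis=2)

print(by_channel)
print(np.array_equal(by_channel, all_at_once))
```

The channel-by-channel form is still the one to reach for when each channel needs a different threshold.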

If you wanted to highlight anything with pure red, what would you do?