# Image analysis

This week let's consider some basic image analysis problems: transformations, blurring, sharpening, etc.

## Topic 1 - Image representation

When loaded into Python, computer images (in particular **.png** files) become $r \times c \times 3$ arrays of floating-point numbers (or $r \times c \times 4$ if the file carries a transparency channel).  $r$ represents the rows, $c$ the columns and the $3$ is for the *red*, *green* and *blue* colour fields.

Representing a picture in this format is good for adjusting colour levels, converting to grayscale, etc.  But it is not particularly useful for resizing the image, or for extracting features such as transitions and edges.
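As a small illustration of why the array format suits colour adjustments, here is a minimal sketch on a synthetic array.  The array `demo`, the choice of luma weights and the red-channel scaling are assumptions made for the example; they are not taken from the platypus image used later.

```python
import numpy as np

# Synthetic 2 x 3 RGB "image" with float values in [0, 1] (stand-in for a loaded PNG).
rng = np.random.default_rng(0)
demo = rng.random((2, 3, 3))

# Grayscale as a weighted sum of the colour channels (ITU-R BT.601 luma weights).
weights = np.array([0.299, 0.587, 0.114])
gray = demo @ weights            # shape (2, 3)

# Colour-level adjustment: halve the red channel, leave green and blue alone.
dimmed = demo * np.array([0.5, 1.0, 1.0])

print(demo.shape, gray.shape, dimmed.shape)
```

Both operations act pixel by pixel, which is exactly what this representation makes easy.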

More afield, the subject of *computer vision* looks for techniques of automatically extracting information from images.  Think of products like *Google Glass*: it is something of an attempt at a heads-up display like the one in the Iron Man movie.  Its goal is to be an assistant that identifies products, people, dangers, etc.

There are many techniques for representing real-valued functions, for example,

* Polynomials and power series 
$$f(x) \simeq a_0 + a_1 x + a_2 x^2 + \cdots $$

* Trigonometric functions and Fourier series 
$$f(x) \simeq a_0 + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + b_2 \sin 2x + \cdots$$

For **polynomial representation** of functions, we have two primary techniques that could be useful for us.  There is the [Stone-Weierstrass (Bernstein)](https://en.wikipedia.org/wiki/Bernstein_polynomial) theorem, which states that every continuous function on a closed bounded interval can be uniformly approximated by polynomials.  Bernstein's approach is constructive, so it gives an algorithm that we could implement.  The power series technique is also potentially available, but it is riddled with problems:

 * Data does not always come equipped with derivatives. 
 * Raw data is rarely *analytic* i.e. the power series often will not converge to the data. 
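Bernstein's approach mentioned above needs only sampled values of the function, not derivatives.  A minimal sketch, assuming the standard formula $B_n(f)(x) = \sum_{k=0}^n f(k/n) \binom{n}{k} x^k (1-x)^{n-k}$ on $[0,1]$; the function name `bernstein_approx` and the test function $|t - 1/2|$ are choices made for this illustration.

```python
import numpy as np
from math import comb

def bernstein_approx(f, n, x):
    """Evaluate the degree-n Bernstein polynomial of f at points x in [0, 1]."""
    x = np.asarray(x, dtype=float)
    k = np.arange(n + 1)
    coeffs = np.array([comb(n, j) for j in k], dtype=float)
    # basis[i, j] = C(n, j) * x_i^j * (1 - x_i)^(n - j)
    basis = coeffs * x[:, None] ** k * (1.0 - x[:, None]) ** (n - k)
    return basis @ f(k / n)

f = lambda t: np.abs(t - 0.5)      # continuous, but not differentiable at 0.5
xs = np.linspace(0.0, 1.0, 201)
for n in (10, 100, 400):
    err = np.max(np.abs(bernstein_approx(f, n, xs) - f(xs)))
    print(n, err)
```

The maximum error shrinks as $n$ grows, but slowly, which is one reason this is rarely the method of choice in practice.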

For **trigonometric representation** of functions, there is a beautiful tool called [Fourier series](https://en.wikipedia.org/wiki/Fourier_series).  This theory provides a tool that allows one to write a (continuous) $2\pi$-periodic function

$$f(x+2\pi) = f(x)$$

as a sum of trig functions, specifically:

$$f(x) = c + a_1\cos(x) + b_1\sin(x) + a_2\cos(2x) + b_2\sin(2x) + a_3\cos(3x) + b_3\sin(3x) + \cdots$$

and the coefficients $c, a_i, b_i$ are computable via integration. 

Let's start by picking a random image from the internet. 

<img src="platypus_on_a_rock.png">

In [None]:
%matplotlib nbagg
#%matplotlib inline
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np

img = mpimg.imread('platypus_on_a_rock.smaller.png')
print(type(img))
print(img.shape)

## Mechanics of Fourier series

Given a continuous $2\pi$-periodic function $f : \mathbb R \to \mathbb C$ the (complex) Fourier series for $f$ is

$$f(x) = \sum_{k \in \mathbb Z} c_k e^{ikx} $$

where the coefficients are given by

$$c_k = \frac{1}{2\pi}  \int_{-\pi}^\pi f(x) e^{-ikx} dx. $$

In the case of real-valued functions we rewrite the above in terms of trig functions.  The idea is to observe that $f$ being real forces $c_{-k} = \overline{c_k}$, so one can fold the series for $f$ to

$$f(x) = c_0 + \sum_{k \in \mathbb N} \left( c_k e^{ikx} + \overline{c_k e^{ikx}} \right)$$

Further writing $e^{ikx} = \cos(kx) + i\sin(kx)$ and $c_k = \frac{1}{2}(a_k - i b_k)$ in terms of their real and imaginary parts gives

$$f(x) = c_0 + \sum_{k \in \mathbb N} a_k \cos(kx) + b_k \sin(kx) $$

where $c_0$ is as before $c_0 = \frac{1}{2\pi} \int_{-\pi}^\pi f(x) dx$ and

$$a_k = \frac{1}{\pi} \int_{-\pi}^\pi f(x) \cos(kx) dx$$

$$b_k = \frac{1}{\pi} \int_{-\pi}^\pi f(x) \sin(kx) dx$$
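The integrals above can be checked numerically.  The following is a minimal sketch that replaces each integral with a Riemann sum on a uniform grid; the helper name `fourier_coefficients`, the grid size and the test function are assumptions chosen for the example.

```python
import numpy as np

def fourier_coefficients(f, kmax, n=2048):
    """Approximate c_0, a_k, b_k (k = 1..kmax) with an n-point Riemann sum on [-pi, pi)."""
    x = -np.pi + 2.0 * np.pi * np.arange(n) / n
    fx = f(x)
    c0 = fx.mean()                                   # (1/2pi) * integral of f
    k = np.arange(1, kmax + 1)[:, None]
    a = 2.0 * (fx * np.cos(k * x)).mean(axis=1)      # (1/pi) * integral of f cos(kx)
    b = 2.0 * (fx * np.sin(k * x)).mean(axis=1)      # (1/pi) * integral of f sin(kx)
    return c0, a, b

f = lambda x: 1.0 + 2.0 * np.cos(3 * x) - 0.5 * np.sin(x)
c0, a, b = fourier_coefficients(f, kmax=4)
print(np.round(c0, 6), np.round(a, 6), np.round(b, 6))
```

For this trigonometric polynomial the recovered values should match the coefficients written into `f`, up to rounding: $c_0 = 1$, $a_3 = 2$, $b_1 = -0.5$ and the rest zero.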

## Multi-variable Fourier series

There is an analogous theory for multi-variable $2\pi$-periodic continuous functions in the plane.  By periodic we mean functions that satisfy

$$f(x + 2\pi, y) = f(x,y) = f(x,y+2\pi)$$

The Fourier expansion for them is a series expression

$$f(x,y) = \sum_{j,k \in \mathbb Z} c_{jk} e^{i(jx+ky)} $$

where

$$c_{jk} = \frac{1}{4\pi^2} \int_{-\pi}^\pi \int_{-\pi}^\pi f(x,y) e^{-i(jx+ky)} dx dy$$

an analogous argument shows us that

$$c_{-j,-k} = \overline{c_{jk}}$$

provided $f$ is real-valued, giving us

$$f(x,y) = c_{00} + \sum_{j,k} a_{jk} \cos(jx+ky) + b_{jk} \sin(jx+ky)$$  

with 
$$a_{jk} = \frac{1}{2\pi^2} \int_{-\pi}^\pi \int_{-\pi}^{\pi} f(x,y) \cos(jx+ky) dxdy$$
$$b_{jk} = \frac{1}{2\pi^2} \int_{-\pi}^\pi \int_{-\pi}^{\pi} f(x,y) \sin(jx+ky) dxdy$$

where the sum is over integer pairs $(j,k) \in \mathbb Z^2 \setminus \{0\}$, taking only one representative up to negation $\pm (j,k)$.  Perhaps the simplest way to do this would be to consider only the pairs $(j,k)$ with $j>0$, or $j=0$ with $k>0$. 

In [None]:
## Let's code up the integrals for the a_jk and b_jk coefficients.  Let's store the coefficients
## as defaultdicts. Probably the simplest. 

from collections import defaultdict
import itertools as it

## img.shape[0] and img.shape[1] are the extents of the x and y fields, respectively. For the purpose of integration
## x and y run over the integers 0..img.shape[0]-1 and 0..img.shape[1]-1, rescaled to one period: cos(2*pi*j*x/img.shape[0]), etc.

def fourierSeries(img, jr, kr):
    ## indexed by (j,k,c) j and k integers, c = 0,1,2 color channel. 
    ajk = defaultdict(float)
    bjk = defaultdict(float)
    for j,k,c in it.product(range(-jr, jr), range(kr), range(3)):
        apsum = 0.0
        bpsum = 0.0
        for x,y in it.product( range(img.shape[0]), range(img.shape[1]) ):
            apsum += (2.0/(img.shape[0]*img.shape[1]))*img[x,y,c]*\
                            np.cos(2*np.pi*(j*x/img.shape[0]+k*y/img.shape[1]))
            bpsum += (2.0/(img.shape[0]*img.shape[1]))*img[x,y,c]*\
                            np.sin(2*np.pi*(j*x/img.shape[0]+k*y/img.shape[1]))
        ajk[(j,k,c)] = apsum
        bjk[(j,k,c)] = bpsum
    for c in range(3): ajk[(0,0,c)] *= 0.5
    return ajk, bjk

## given the ajk and bjk dictionaries and the x and y resolution, rebuild the image
def undoFourier(ajk, bjk, xr, yr):
    imgarray = np.zeros( (xr,yr,3) ) ## float array; every entry is filled in below
    for x,y in it.product( range(xr), range(yr) ):
        lev = [0.0,0.0,0.0]
        
        for I,v in ajk.items(): 
            if I[1]==0 and I[0]<0: continue
            lev[I[2]] += v*np.cos(2*np.pi*(I[0]*x/xr+I[1]*y/yr))
        for I,v in bjk.items():
            if I[1]==0 and I[0]<0: continue
            lev[I[2]] += v*np.sin(2*np.pi*(I[0]*x/xr+I[1]*y/yr))
            
        ## clamp each colour level to the displayable range [0,1]
        for c in range(3): 
            if lev[c]>1.0: lev[c]=1.0
            if lev[c]<0.0: lev[c]=0.0
            imgarray[x,y,c] = lev[c]
        
    return imgarray

In [None]:
xs = img.shape[0]
ys = img.shape[1]
print(xs, ys)

In [None]:
af, bf = fourierSeries(img, 3,3)

In [None]:
img2 = undoFourier(af, bf, xs, ys)

In [None]:
plt.imshow(img2)

### Okay, that's really slow!  Let's try something else. 

The [Fast Fourier Transform](https://en.wikipedia.org/wiki/Fast_Fourier_transform) is an alternative family of algorithms to compute Fourier series.  These algorithms are (usually) based on matrix algebra factorizations. 
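To connect this with the series coefficients $c_k$, here is a minimal one-dimensional sketch.  It assumes the function is sampled at $n$ equally spaced points over one period, in which case `np.fft.fft` divided by $n$ approximates $c_k$; the sample count and test function are illustrative choices.

```python
import numpy as np

n = 64
x = 2.0 * np.pi * np.arange(n) / n               # sample points in [0, 2pi)
f = 1.0 + 2.0 * np.cos(3 * x) - 0.5 * np.sin(x)

# Direct O(n^2) evaluation of the discrete Fourier transform.
k = np.arange(n)
dft_matrix = np.exp(-2j * np.pi * np.outer(k, k) / n)
direct = dft_matrix @ f

# The FFT computes the same sums in O(n log n).
fast = np.fft.fft(f)

# Dividing by n approximates the series coefficients c_k; negative k wrap around.
c = fast / n
print(np.allclose(direct, fast))
print(np.round(c[[0, 1, 3, -3]], 6))
```

Here `c[3]` and `c[-3]` are complex conjugates of each other, as expected for a real-valued function.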

Lucky for us, **numpy** has an [implementation](https://docs.scipy.org/doc/numpy/reference/routines.fft.html) of the Fast Fourier Transform.  Let's try it.

**What we need to know:** the numpy 2-dimensional Fourier transform is called **fft2**.  It takes as input a numpy 2-dimensional array of floats and returns a complex array of the same shape.

The first thing we will need to do is split our **y by x by 3** numpy array into three **y by x** arrays, one for each color.  The [**numpy.moveaxis**](https://docs.scipy.org/doc/numpy/reference/generated/numpy.moveaxis.html#numpy.moveaxis) command is useful for this step. 
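A minimal sketch of this step on a small synthetic array, so the shapes are easy to follow.  The array `demo` is a hypothetical stand-in for the image; the sketch also shows where `np.fft.fftshift` places the zero-frequency coefficient.

```python
import numpy as np

# Hypothetical 4 x 6 RGB image (rows x columns x colours), random values in [0, 1].
rng = np.random.default_rng(1)
demo = rng.random((4, 6, 3))

channels = np.moveaxis(demo, 2, 0)       # colour axis moved to the front
print(channels.shape)                     # (3, 4, 6)

ft = np.fft.fft2(channels[0])            # complex array, zero frequency at [0, 0]
shifted = np.fft.fftshift(ft)            # zero frequency moved to [rows//2, cols//2]

# The zero-frequency coefficient is the sum of all pixel values in the channel.
print(np.isclose(shifted[2, 3], channels[0].sum()))

# The inverse shift followed by the inverse transform recovers the channel.
back = np.fft.ifft2(np.fft.ifftshift(shifted)).real
print(np.allclose(back, channels[0]))
```

After the shift, low frequencies sit near the centre of the array, which is why a filter can be written as a window around the centre indices.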

In [None]:
img = mpimg.imread('platypus_on_a_rock.png')
print(type(img))
print(img.shape)

img2 = np.moveaxis(img, 2, 0)
print(img2.shape)

## blue part
print("Blue field of image")
plt.imshow(img2[2], cmap='gray')

In [None]:
plt.close()

## Good, we have our three colour slices of the image now.
imgFTS = [np.fft.fftshift(np.fft.fft2(img2[k])) for k in range(3)]

ctr = img.shape[0]//2 ## centre row (zero frequency after fftshift)
ctc = img.shape[1]//2 ## centre column

## grab low frequencies from Fourier transform
##  all fields zero, except low frequency.

## technical detail disguised -- these need to be complex arrays, not
## real floats!  
imgFTSLF = [np.zeros_like(imgFTS[k]) for k in range(3)]
## take the LowF smallest frequencies.
LowF = 40
for i in range(ctr-LowF, ctr+LowF+1):
    for j in range(ctc-LowF, ctc+LowF+1):
        for k in range(3):
            imgFTSLF[k][i, j] = imgFTS[k][i,j]

imgFTLF = [np.fft.ifftshift(imgFTSLF[k]) for k in range(3)]
imgLowF = [np.abs(np.fft.ifft2(imgFTLF[k])) for k in range(3)]
imgLowF = np.array(imgLowF)
imgLowF = np.moveaxis(imgLowF, 0, 2)

plt.imshow(imgLowF)

In [None]:
from matplotlib import animation as ani

plt.close()

fig = plt.figure(figsize=(10,10))

## Good, we have our three colour slices of the image now.
imgFTS = [np.fft.fftshift(np.fft.fft2(img2[k])) for k in range(3)]

ctr = img.shape[0]//2 ## centre row (zero frequency after fftshift)
ctc = img.shape[1]//2 ## centre column

pic = plt.imshow(img)

def init():
    pic.set_data(img)
    return pic

## grab low frequencies from Fourier transform
##  all fields zero, except low frequency.
def animate(res):
    imgFTSLF = [np.zeros_like(imgFTS[k]) for k in range(3)]
    ## take the LowF smallest frequencies.
    LowF = res
    for i in range(ctr-LowF, ctr+LowF+1):
        for j in range(ctc-LowF, ctc+LowF+1):
            for k in range(3):
                imgFTSLF[k][i, j] = imgFTS[k][i,j]

    imgFTLF = [np.fft.ifftshift(imgFTSLF[k]) for k in range(3)]
    imgLowF = [np.abs(np.fft.ifft2(imgFTLF[k])) for k in range(3)]
    imgLowF = np.array(imgLowF)
    imgLowF = np.moveaxis(imgLowF, 0, 2)

    pic.set_data(imgLowF)
    return pic

anim = ani.FuncAnimation(fig, animate, init_func=init,\
                         frames=[i+2 for i in range(100)], interval=1)
plt.show()

### Let's try the opposite, looking at the high frequencies. 
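The low-frequency and high-frequency parts are complementary: added together they give back the original.  A minimal sketch on a synthetic single-channel array, where `channel`, `cutoff` and `low_mask` are names chosen for this illustration and the real part of the inverse transform is taken rather than the absolute value:

```python
import numpy as np

rng = np.random.default_rng(3)
channel = rng.random((64, 48))                   # hypothetical single colour channel

ft = np.fft.fftshift(np.fft.fft2(channel))
ctr, ctc = channel.shape[0] // 2, channel.shape[1] // 2
cutoff = 10                                      # illustrative cutoff

# Boolean mask selecting the central (low-frequency) square of the shifted transform.
low_mask = np.zeros(channel.shape, dtype=bool)
low_mask[ctr - cutoff:ctr + cutoff + 1, ctc - cutoff:ctc + cutoff + 1] = True

low = np.fft.ifft2(np.fft.ifftshift(np.where(low_mask, ft, 0))).real
high = np.fft.ifft2(np.fft.ifftshift(np.where(low_mask, 0, ft))).real

print(np.allclose(low + high, channel))
print(abs(high.mean()) < 1e-12)
```

Because the zero frequency is removed, the high-frequency part averages to zero and takes negative values, so its absolute value highlights where the image changes quickly.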

In [None]:
plt.close()

## Good, we have our three colour slices of the image now.
imgFTS = [np.fft.fftshift(np.fft.fft2(img2[k])) for k in range(3)]

ctr = img.shape[0]//2 ## centre row (zero frequency after fftshift)
ctc = img.shape[1]//2 ## centre column

## zero out the low frequencies of the Fourier transform,
##  keeping all the high-frequency fields.

LowF = 40
for i in range(ctr-LowF, ctr+LowF+1):
    for j in range(ctc-LowF, ctc+LowF+1):
        for k in range(3):
            imgFTS[k][i, j] = 0.0

imgFTLF = [np.fft.ifftshift(imgFTS[k]) for k in range(3)]
imgHighF = [np.abs(np.fft.ifft2(imgFTLF[k])) for k in range(3)]
imgHighF = np.array(imgHighF)
imgHighF = np.moveaxis(imgHighF, 0, 2)

plt.imshow(imgHighF)

### And again, let's animate it.

In [None]:
plt.close()

fig = plt.figure(figsize=(10,10))

## Good, we have our three colour slices of the image now.
imgFTS = [np.fft.fftshift(np.fft.fft2(img2[k])) for k in range(3)]

ctr = img.shape[0]//2 ## centre row (zero frequency after fftshift)
ctc = img.shape[1]//2 ## centre column

pic = plt.imshow(img)

def init():
    pic.set_data(img)
    return pic

## zero out the low frequencies of the Fourier transform,
##  keeping all the high-frequency fields.
def animate(res):
    imgFTSt = np.copy(imgFTS)
    
    ## remove the LowF smallest frequencies.
    LowF = res
    for i in range(ctr-LowF, ctr+LowF+1):
        for j in range(ctc-LowF, ctc+LowF+1):
            for k in range(3):
                imgFTSt[k][i, j] = 0.0

    imgFTLF = [np.fft.ifftshift(imgFTSt[k]) for k in range(3)]
    imgHighF = [np.abs(np.fft.ifft2(imgFTLF[k])) for k in range(3)]
    imgHighF = np.array(imgHighF)
    imgHighF = np.moveaxis(imgHighF, 0, 2)

    pic.set_data(imgHighF)
    return pic

anim = ani.FuncAnimation(fig, animate, init_func=init,\
                         frames=[i+2 for i in range(100)], interval=1)
plt.show()

### The scikit-image library

The scikit-image library is a useful resource for robustly-coded image manipulation algorithms.

In [None]:
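from skimage import data, io, segmentation, color, transform
try:
    from skimage.future import graph ## older scikit-image releases
except ImportError:
    from skimage import graph ## newer releases moved the RAG tools here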

plt.close()
img = mpimg.imread('platypus_on_a_rock.png')
imgr = transform.resize(img, (400, 600))
plt.imshow(imgr)

In [None]:
from scipy import ndimage as ndi


plt.close()

im = ndi.rotate(img, 15, mode='constant')
plt.imshow(im)

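The [Canny edge detector](https://en.wikipedia.org/wiki/Canny_edge_detector) smooths the image with a Gaussian filter, computes intensity gradients, and keeps thin connected ridges of large gradient as edges.  The `sigma` parameter below sets the width of the Gaussian: larger values suppress fine detail.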


In [None]:
from skimage.color import rgb2gray
from skimage import feature

imG = rgb2gray(img) ## grayscale version 

edg1 = feature.canny(imG)
edg2 = feature.canny(imG, sigma=3)

plt.close()

fig, (ax1, ax2, ax3) = plt.subplots(nrows=1, ncols=3)
ax1.imshow(imG, cmap = plt.get_cmap('gray'))
ax2.imshow(edg1, cmap = plt.get_cmap('gray'))
ax3.imshow(edg2, cmap = plt.get_cmap('gray'))
plt.show()

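The [CENSURE feature detector](http://scikit-image.org/docs/dev/auto_examples/features_detection/plot_censure.html) finds keypoints in an image.  Below we compare the keypoints found in the grayscale image with those found in an affinely warped copy.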

In [None]:
## CENSURE feature detector
from skimage import data
from skimage import transform as tf
from skimage.feature import CENSURE

tform = tf.AffineTransform(scale=(1.2, 1.2), rotation=0.1,
                           translation=(40, -80))
img_warp = tf.warp(imG, tform)

detector = CENSURE()

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 6))
plt.tight_layout()

detector.detect(imG)

ax[0].imshow(imG, cmap=plt.cm.gray)
ax[0].axis('off')
ax[0].scatter(detector.keypoints[:, 1], detector.keypoints[:, 0],
              2 ** detector.scales, facecolors='none', edgecolors='r')

detector.detect(img_warp)

ax[1].imshow(img_warp, cmap=plt.cm.gray)
ax[1].axis('off')
ax[1].scatter(detector.keypoints[:, 1], detector.keypoints[:, 0],
              2 ** detector.scales, facecolors='none', edgecolors='r')

plt.show()

A suite of image processing examples:

[Scikit-image examples](http://scikit-image.org/docs/dev/auto_examples/)

[Scipy-lectures](http://www.scipy-lectures.org/advanced/image_processing/)

[Pythonvision](http://pythonvision.org/basic-tutorial/)

In [None]:
import matplotlib
import matplotlib.pyplot as plt

from skimage import data, filters, io, segmentation, color, img_as_float
from skimage.filters import sobel, threshold_otsu, rank
from skimage.segmentation import slic, join_segmentations
from skimage.color import label2rgb
from skimage.morphology import disk
from skimage.util import img_as_ubyte

## Some names moved between scikit-image releases; try the old location first.
try:
    from skimage.morphology import watershed
except ImportError:
    from skimage.segmentation import watershed
try:
    from skimage.future import graph
except ImportError:
    from skimage import graph
try:
    from skimage.util.colormap import viridis
except ImportError:
    from matplotlib.cm import viridis

### Otsu's method

Otsu's method uses a clustering technique to turn grayscale images into *binary* images (color full 1.0 or color off 0.0). 
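To make the idea concrete, here is a minimal sketch of the global version of the method, assuming the usual formulation: try every histogram split and keep the one that maximises the variance between the two classes.  The function `otsu_threshold` and the synthetic two-cluster data are written for this illustration and are not the scikit-image implementation.

```python
import numpy as np

def otsu_threshold(gray, nbins=256):
    """Return the threshold that maximises the between-class variance."""
    hist, edges = np.histogram(gray, bins=nbins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    hist = hist.astype(float)

    w0 = np.cumsum(hist)                         # pixels at or below each bin
    w1 = np.cumsum(hist[::-1])[::-1]             # pixels at or above each bin
    m0 = np.cumsum(hist * centers) / np.maximum(w0, 1e-12)
    m1 = (np.cumsum((hist * centers)[::-1]) / np.maximum(w1[::-1], 1e-12))[::-1]

    # Split between bin i and bin i+1: class 0 is bins 0..i, class 1 is bins i+1 onward.
    between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(between)]

# Synthetic "image": half dark pixels near 0.2, half bright pixels near 0.8.
rng = np.random.default_rng(2)
gray = np.concatenate([rng.normal(0.2, 0.05, 5000), rng.normal(0.8, 0.05, 5000)])
gray = np.clip(gray, 0.0, 1.0).reshape(100, 100)

t = otsu_threshold(gray)
binary = gray > t
print(t, binary.mean())
```

On data with two well-separated clusters the threshold lands in the gap between them, and about half the pixels end up on each side.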

In [None]:
matplotlib.rcParams['font.size'] = 9
img = img_as_ubyte(imG)

radius = 15
selem = disk(radius)

local_otsu = rank.otsu(img, selem)
threshold_global_otsu = threshold_otsu(img)
global_otsu = img >= threshold_global_otsu

fig, ax = plt.subplots(2, 2, figsize=(8, 5), sharex=True, sharey=True,
                       subplot_kw={'adjustable': 'box'})  ## 'box-forced' was removed from newer matplotlib
ax0, ax1, ax2, ax3 = ax.ravel()
plt.tight_layout()

fig.colorbar(ax0.imshow(img, cmap=plt.cm.gray),
             ax=ax0, orientation='horizontal')
ax0.set_title('Original')
ax0.axis('off')

fig.colorbar(ax1.imshow(local_otsu, cmap=plt.cm.gray),
             ax=ax1, orientation='horizontal')
ax1.set_title('Local Otsu (radius=%d)' % radius)
ax1.axis('off')

ax2.imshow(img >= local_otsu, cmap=plt.cm.gray)
ax2.set_title('Original >= Local Otsu')
ax2.axis('off')

ax3.imshow(global_otsu, cmap=plt.cm.gray)
ax3.set_title('Global Otsu (threshold = %d)' % threshold_global_otsu)
ax3.axis('off')

plt.show()

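Segmentation example.  The image is split into small segments with SLIC, and each segment is filled with its average colour.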

In [None]:
img = mpimg.imread('platypus_on_a_rock.png')

labels1 = segmentation.slic(img, compactness=20, n_segments=800)
out1 = color.label2rgb(labels1, img, kind='avg')

plt.figure()
io.imshow(out1)
io.show()

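Region adjacency graphs.  Starting from a segmentation of the image, build a graph with one node per region and an edge joining each pair of regions that share a boundary.  This is a technique for triangulating the image; it goes under the name *primary segmentation* in the literature.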

In [None]:
plt.close()
plt.figure()

gimg = color.rgb2gray(img)

labels = segmentation.slic(img, compactness=30, n_segments=400)
edges = filters.sobel(gimg)
edges_rgb = color.gray2rgb(edges)

g = graph.rag_boundary(labels, edges)

out = graph.draw_rag(labels, g, edges_rgb, node_color="#999999",
                     colormap=viridis)

io.imshow(out)
io.show()