<center>
    <tr>
    <td><img src="images/Quansight_Logo_Lockup_1.png" width="25%"></img></td>
    </tr>
</center>

# Image Pyramids

## Outline

- Gaussian image pyramids
- Laplacian image pyramids
- Laplacian blending

## Gaussian Image Pyramid

The basic idea for constructing a Gaussian image pyramid is as follows:

1. Blur the image with a Gaussian filter
2. Reduce the image dimensions by half by discarding every other row and every other column
3. Repeat this process until the desired number of levels is reached or the image is reduced to size $1 \times 1$.
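
The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the workshop's reference solution: it assumes a single-channel image, uses the binomial weights $[1, 4, 6, 4, 1]/16$ as a stand-in for the Gaussian blur, and the helper names `blur` and `gaussian_pyramid` are made up for this example.

```python
import numpy as np

# 1-D binomial approximation of a Gaussian (an assumption for this sketch).
KERNEL_1D = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur(image):
    """Separable 5-tap blur of a 2-D array, replicating the border pixels."""
    h, w = image.shape
    padded = np.pad(image, 2, mode='edge')
    # Filter horizontally (along axis 1), then vertically (along axis 0).
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL_1D))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL_1D))

def gaussian_pyramid(image, num_levels):
    """Return [g_0, g_1, ...]; stops early once a level is 1 pixel wide or tall."""
    levels = [image.astype(np.float64)]
    while len(levels) < num_levels and min(levels[-1].shape) > 1:
        # Step 1: blur. Step 2: keep every other row and column.
        levels.append(blur(levels[-1])[::2, ::2])
    return levels

image = np.random.default_rng(0).random((64, 48))
pyramid = gaussian_pyramid(image, num_levels=4)
shapes = [level.shape for level in pyramid]
print(shapes)
```

Each level has half the rows and columns of the one before it, so the whole pyramid costs only about a third more storage than the original image.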

<figure style="margin-left:auto; margin-right: auto; text-align: center; display: block; max-width: 700px;">
<img src="images/zebra-gaussian-pyramid.png" alt="Drawing" style="width: 300px;"/>
<figcaption style="text-align: center; margin-bottom: 10px; font-style: italic;">Courtesy D. Forsyth</figcaption>
</figure>

Image pyramids are often used for scale-invariant image processing:

- template matching;
- image registration;
- image enhancement;
- interest point detection; and
- object detection.

#### Exercise: construct a 2-level Gaussian pyramid in Python

- Load image `data/apple.jpg`
- Blur each channel with a 5-by-5 Gaussian kernel
- Construct the next level of the Gaussian pyramid by discarding every other row and column

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import cv2 as cv

In [None]:
# %load solutions/04/solution_01.py
I = cv.imread('data/apple.jpg')
I = cv.cvtColor(I, cv.COLOR_BGR2RGB)

print('Shape of I = {}'.format(I.shape))
I[:,:,0] = cv.GaussianBlur(I[:,:,0], (5,5), 2, 2)
I[:,:,1] = cv.GaussianBlur(I[:,:,1], (5,5), 2, 2)
I[:,:,2] = cv.GaussianBlur(I[:,:,2], (5,5), 2, 2)

I2 = I[::2,::2,:]
print('Shape of I2 = {}'.format(I2.shape))

plt.imshow(I2)

#### Implementation (Gaussian Image Pyramid)

The OpenCV function `pyrDown` automates the steps above: blurring with a Gaussian filter and downsampling. It uses the following kernel to blur the image (in step 1):

$$
\frac{1}{256}
\left[
\begin{array}{ccccc}
1 & 4 & 6 & 4 & 1 \\
4 & 16 & 24 & 16 & 4 \\
6 & 24 & 36 & 24 & 6 \\
4 & 16 & 24 & 16 & 4 \\
1 & 4 & 6 & 4 & 1
\end{array}
\right]
$$
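
As a quick check, the kernel above is separable: it is the outer product of the 1-D weights $[1, 4, 6, 4, 1]/16$ with themselves, and its entries sum to 1, so blurring preserves the average brightness. The variable names below are chosen for this sketch only.

```python
import numpy as np

# 1-D binomial weights; their outer product gives the 5x5 integer table above.
w = np.array([1, 4, 6, 4, 1])
kernel = np.outer(w, w) / 256.0

print(np.outer(w, w))   # the integer entries of the matrix
print(kernel.sum())     # total weight of the normalized kernel
```

Separability means the 2-D blur can be done as two 1-D passes (5 + 5 multiplications per pixel instead of 25).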

In [None]:
# %load solutions/04/gen_gaussian_pyramid.py

In [None]:
# load 2 images to decompose into Gaussian pyramids
A = cv.imread('data/apple.jpg')
B = cv.imread('data/orange.jpg')
A = cv.cvtColor(A, cv.COLOR_BGR2RGB)
B = cv.cvtColor(B, cv.COLOR_BGR2RGB)

In [None]:
gpA = gen_gaussian_pyramid(A)
gpB = gen_gaussian_pyramid(B)

In [None]:
gp = gpB
num_levels = len(gp)
for i in range(num_levels):
    rows = gp[i].shape[0]
    cols = gp[i].shape[1]
    print('level={}: size={}x{}'.format(i, rows, cols))

In [None]:
gp = gpA
level = 3
plt.figure(figsize=(10,10))
plt.title('Level {}'.format(level))
plt.imshow(gp[level]);

---

## Laplacian Image Pyramid

The Laplacian pyramid, proposed by Burt and Adelson, is a bandpass image decomposition derived from the Gaussian pyramid. Each level encodes information in a particular band of spatial frequencies. The basic steps for constructing a Laplacian pyramid are:

1. Convolve the original image $g_0$ with a lowpass filter $w$ (e.g., the Gaussian filter) and subsample it by two to create a reduced lowpass version of the image, $g_1$. Recall that this is what the function `pyrDown` does.
2. Upsample this image (i.e., $g_1$) by inserting zeros between the rows and columns, then interpolate the missing values by convolving it with the same filter $w$. This gives the expanded lowpass image $g'_1$, which is subtracted pixel by pixel from the original to give the detail image $L_0$. Specifically, $L_0 = g_0 - g'_1$.
3. Repeat steps 1 and 2, starting from $g_1$, to obtain $L_1$, and so on for the remaining levels.
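
The construction above can be sketched with NumPy alone. The helpers `blur`, `pyr_reduce`, `pyr_expand`, and `laplacian_pyramid` are hypothetical names for this example, the image is assumed to be single-channel, and $w$ is taken to be the binomial weights $[1, 4, 6, 4, 1]/16$ applied along each axis. The expanded image is multiplied by 4 to make up for the inserted zeros, which is also how OpenCV documents `pyrUp`.

```python
import numpy as np

# 1-D lowpass weights w; an assumed binomial approximation of a Gaussian.
W = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur(image):
    """Apply w horizontally, then vertically, replicating the border pixels."""
    h, w = image.shape
    padded = np.pad(image, 2, mode='edge')
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(W))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(W))

def pyr_reduce(image):
    """Step 1: lowpass filter, then keep every other row and column."""
    return blur(image)[::2, ::2]

def pyr_expand(image, shape):
    """Step 2: insert zeros, then interpolate with the same filter (scaled by 4)."""
    up = np.zeros(shape, dtype=np.float64)
    up[::2, ::2] = image
    return 4.0 * blur(up)

def laplacian_pyramid(g0, num_levels):
    """Return the detail images [L_0, ..., L_{num_levels-1}] and the final lowpass image."""
    g = g0.astype(np.float64)
    details = []
    for _ in range(num_levels):
        g_next = pyr_reduce(g)
        details.append(g - pyr_expand(g_next, g.shape))  # L_i = g_i - g'_{i+1}
        g = g_next
    return details, g

g0 = np.random.default_rng(0).random((32, 20))
details, lowpass = laplacian_pyramid(g0, num_levels=3)
print([d.shape for d in details], lowpass.shape)
```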

<figure style="margin-left:auto; margin-right: auto; text-align: center; display: block; max-width: 700px;">
<img src="images/zebra-laplacian-pyramid.png" alt="Drawing" style="width: 300px;"/>
<figcaption style="text-align: center; margin-bottom: 10px; font-style: italic;">Courtesy D. Forsyth</figcaption>
</figure>

We define the Laplacian operator as follows:

$$
\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}
$$
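
On a pixel grid the derivatives are usually replaced by finite differences. A minimal sketch, assuming unit grid spacing and the standard five-point stencil (the function name `discrete_laplacian` is just for illustration): for $f(x, y) = x^2 + y^2$ the exact Laplacian is 4 everywhere, and the stencil reproduces that on the interior of the grid.

```python
import numpy as np

def discrete_laplacian(f, spacing=1.0):
    """Five-point finite-difference Laplacian on the interior of a 2-D grid."""
    return (f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
            - 4.0 * f[1:-1, 1:-1]) / spacing**2

# Sample f(x, y) = x^2 + y^2 on a 7x7 integer grid.
y, x = np.mgrid[-3:4, -3:4].astype(np.float64)
f = x**2 + y**2

lap = discrete_laplacian(f)
print(lap)
```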

In addition, we can approximate the Laplacian of a Gaussian as follows:

<figure style="margin-left:auto; margin-right: auto; text-align: center; display: block; max-width: 700px;">
<img src="images/laplacian-of-a-gaussian.png" alt="Drawing" style="width: 300px;"/>
<figcaption style="text-align: center; margin-bottom: 10px; font-style: italic;">Source: Lazebnik</figcaption>
</figure>

We use this property when constructing Laplacian image pyramids above.

### Reconstructing the original image

It is possible to reconstruct the original image $g_0$ from its Laplacian image pyramid, which consists of the $N$ detail images $L_i$, where $i \in [0, N-1]$, and the low-pass image $g_N$.

1. $g_N$ is upsampled by inserting zeros between the sample values and interpolating the missing values by convolving it with the filter $w$ to obtain the image $g'_N$.
2. The image $g'_N$ is added to the coarsest detail image $L_{N-1}$ to obtain $g_{N-1}$, the approximation image at the next finer level.
3. Steps 1 and 2 are repeated for each remaining detail image, $g_i = L_i + g'_{i+1}$, until the original image $g_0$ is obtained.
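
The reconstruction can be checked numerically. The sketch below is NumPy-only and makes the same assumptions as a hand-rolled pyramid would: a single-channel image, the binomial weights $[1, 4, 6, 4, 1]/16$ as the filter $w$, and illustrative helper names (`blur`, `pyr_expand`, `laplacian_pyramid`, `reconstruct`) that are not part of any library.

```python
import numpy as np

W = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0  # assumed lowpass weights

def blur(image):
    """Apply w horizontally, then vertically, replicating the border pixels."""
    h, w = image.shape
    padded = np.pad(image, 2, mode='edge')
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(W))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(W))

def pyr_expand(image, shape):
    """Insert zeros between samples, then interpolate with the filter (scaled by 4)."""
    up = np.zeros(shape, dtype=np.float64)
    up[::2, ::2] = image
    return 4.0 * blur(up)

def laplacian_pyramid(g0, num_levels):
    """Return the detail images (finest first) and the coarsest lowpass image."""
    g, details = g0.astype(np.float64), []
    for _ in range(num_levels):
        g_next = blur(g)[::2, ::2]
        details.append(g - pyr_expand(g_next, g.shape))
        g = g_next
    return details, g

def reconstruct(details, g_n):
    """Collapse the pyramid, coarsest level first: g_i = L_i + expand(g_{i+1})."""
    g = g_n
    for detail in reversed(details):
        g = detail + pyr_expand(g, detail.shape)
    return g

g0 = np.random.default_rng(0).random((33, 20))  # odd height on purpose
details, g_n = laplacian_pyramid(g0, num_levels=3)
restored = reconstruct(details, g_n)
print(np.allclose(restored, g0))
```

The reconstruction is exact up to floating-point rounding because each step adds back exactly the expanded image that was subtracted during construction.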

### Uses

Laplacian image pyramids are often used for compression. Instead of encoding $g_0$ directly, we encode the detail images $L_i$ together with the small low-pass image $g_N$. The $L_i$ are largely decorrelated and can be represented using far fewer bits.

### Implementation (Laplacian Image Pyramid)

In [None]:
# %load solutions/04/gen_laplacian_pyramid.py

In [None]:
lpA = gen_laplacian_pyramid(gpA)
lpB = gen_laplacian_pyramid(gpB)

In [None]:
lp = lpA
num_levels = len(lp)
for i in range(num_levels):
    rows = lp[i].shape[0]
    cols = lp[i].shape[1]
    print('level={}: size={}x{}'.format(i, rows, cols))

### Laplacian Blending Example
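
The idea behind Laplacian blending is to join two images band by band instead of pixel by pixel: build a Laplacian pyramid for each image, join the left half of one with the right half of the other at every level, and then collapse the joined pyramid. The sketch below is a NumPy-only illustration of that idea on two synthetic single-channel images. It is not necessarily what `solutions/04/laplacian_blending.py` contains, and all helper names are made up for this example.

```python
import numpy as np

W = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0  # assumed lowpass weights

def blur(image):
    """Apply w horizontally, then vertically, replicating the border pixels."""
    h, w = image.shape
    padded = np.pad(image, 2, mode='edge')
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(W))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(W))

def pyr_expand(image, shape):
    """Insert zeros between samples, then interpolate with the filter (scaled by 4)."""
    up = np.zeros(shape, dtype=np.float64)
    up[::2, ::2] = image
    return 4.0 * blur(up)

def laplacian_pyramid(g0, num_levels):
    """Return the detail images (finest first) and the coarsest lowpass image."""
    g, details = g0.astype(np.float64), []
    for _ in range(num_levels):
        g_next = blur(g)[::2, ::2]
        details.append(g - pyr_expand(g_next, g.shape))
        g = g_next
    return details, g

def join_halves(left, right):
    """Left half of `left` next to the right half of `right`."""
    mid = left.shape[1] // 2
    return np.hstack((left[:, :mid], right[:, mid:]))

def blend(a, b, num_levels=3):
    """Join the two pyramids level by level, then collapse the result."""
    details_a, low_a = laplacian_pyramid(a, num_levels)
    details_b, low_b = laplacian_pyramid(b, num_levels)
    result = join_halves(low_a, low_b)
    for la, lb in zip(reversed(details_a), reversed(details_b)):
        joined = join_halves(la, lb)
        result = joined + pyr_expand(result, joined.shape)
    return result

a = np.zeros((32, 32))   # stand-in for the first image
b = np.ones((32, 32))    # stand-in for the second image
blended = blend(a, b)
direct = join_halves(a, b)   # hard seam, for comparison
print(blended[16].round(2))
```

The direct composite jumps from 0 to 1 at the seam, while the blended row passes through intermediate values because the coarse levels spread the transition over a wider region.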

In [None]:
lp = lpA
level = 3
plt.figure(figsize=(10,10))
plt.title('Level {}'.format(level))
plt.imshow(lp[level]);

In [None]:
# %load solutions/04/laplacian_blending.py

In [None]:
# %load solutions/04/reconstruct_from_laplacian.py

In [None]:
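# Direct (unblended) composite for comparison: left half of A, right half of B.
# Take the width from A itself rather than reusing `cols` from an earlier loop.
cols = A.shape[1]
real = np.hstack((A[:,:cols//2],B[:,cols//2:]))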

In [None]:
plt.figure(figsize=(10,10))
plt.imshow(ls_);

In [None]:
plt.figure(figsize=(10,10))
plt.imshow(real);

## Conclusions and Summary

- Gaussian pyramid
    - Coarse-to-fine search
    - Multi-scale image analysis (*hold this thought*)
- Laplacian pyramid
    - More compact image representation
    - Can be used for image compositing (computational photography)
- Downsampling
    - *Nyquist limit*: a signal whose highest frequency is $f_{\max}$ must be sampled at a rate above $2 f_{\max}$ to be captured faithfully. If we sample below that rate, we not only lose the high frequencies, but they also reappear in the samples as spurious lower frequencies. These artifacts are called *aliases*.
    - Need to low-pass filter sufficiently before downsampling
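
A small 1-D illustration of aliasing, with the sampling rate and frequencies picked arbitrarily for this sketch: at 10 samples per second the Nyquist limit is 5 Hz, so a 9 Hz cosine produces the same samples as a 1 Hz cosine.

```python
import numpy as np

sample_rate = 10.0                      # samples per second; Nyquist limit is 5 Hz
t = np.arange(20) / sample_rate         # two seconds of sample times

high = np.cos(2 * np.pi * 9.0 * t)      # 9 Hz, above the Nyquist limit
alias = np.cos(2 * np.pi * 1.0 * t)     # 1 Hz, below the Nyquist limit

# The two sampled signals cannot be told apart.
print(np.allclose(high, alias))
```

This is why each pyramid level is blurred before rows and columns are discarded: the blur removes the frequencies that the coarser grid could not represent.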

### Various image representations

- Pixels: great for spatial processing, poor access to frequency
- Fourier transform: great for frequency analysis, poor spatial info
- Pyramids: trade-off between spatial and frequency information

---
Based on materials from Prof. Faisal Qureshi (Faculty of Science, Ontario Tech University, Oshawa ON, Canada, http://vclab.science.ontariotechu.ca)
