# 6. The Laplacian Pyramid

In [107]:
%matplotlib nbagg
import warnings
import inspect
import matplotlib.pyplot as plt
import IPython.display
import numpy as np
import math
from scipy import optimize
from cued_sf2_lab.familiarisation import load_mat_img, plot_image
from cued_sf2_lab.rl631_laplacian import *

## 6.1 The basic concept

<figure id="figure-2">
<div style="background-color: white">

![](figures/laplacian.svg)</div>
    
<figcaption style="text-align: center">Figure 2: A typical Laplacian pyramid with 2 levels; forward (analysis) part only</figcaption>
</figure>

The Laplacian pyramid is an energy compaction technique, based on
the observation:

> For most real-world images, the high-frequency energy is much less than the low-frequency energy.

If the image is lowpass filtered, the resulting lowpass image has much lower bandwidth than the
original image, so it can be subsampled (*decimated*) 2:1 in both
horizontal and vertical directions without significant loss of
information, to give a quarter-size lowpass image.  We have
provided a row filter/decimation function, `rowdec(X, h)`, to
filter and decimate the rows of a matrix by 2:1.

In [108]:
from cued_sf2_lab.laplacian_pyramid import rowdec

Use this twice
(on the image and its transpose) to generate a quarter-size
lowpass image `X1` of Lighthouse.  For simplicity use a 3-tap (length 3)
filter `h` with coefficients: $\frac{1}{4} [1\ 2\ 1]$.
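
As a rough illustration of what "filter and decimate the rows, then repeat on the transpose" means, here is a minimal NumPy sketch. The name `rowdec_sketch` is hypothetical and this is not the lab's `rowdec`; it assumes an odd-length symmetric filter and uses `np.pad(..., mode="reflect")` for the symmetric extension at the image edges.

```python
import numpy as np

def rowdec_sketch(X, h):
    """Filter each row of X with odd-length h, keeping every second output."""
    X = np.asarray(X, dtype=float)
    m = (len(h) - 1) // 2                      # filter half-length
    Xe = np.pad(X, ((0, 0), (m, m)), mode="reflect")  # symmetric extension
    c = X.shape[1]
    Y = np.zeros((X.shape[0], (c + 1) // 2))
    for k, hk in enumerate(h):
        # Only the outputs kept by the 2:1 decimation are ever computed.
        Y += hk * Xe[:, k:k + c:2]
    return Y

h = 0.25 * np.array([1, 2, 1])

# One row: outputs are centred on input samples 0 and 2.
row = rowdec_sketch(np.array([[0.0, 4.0, 8.0, 12.0]]), h)

# Two passes (rows, then columns via the transpose) give a quarter-size image.
X_demo = np.full((8, 8), 5.0)
X1_demo = rowdec_sketch(rowdec_sketch(X_demo, h).T, h).T
```

Because the filter coefficients sum to 1, a constant image stays constant after decimation, which is a quick sanity check on the DC gain.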

In [109]:
X, cmaps_dict = load_mat_img(img='lighthouse.mat', img_info='X', cmap_info={'map', 'map2'})

In [110]:
h = 0.25*np.array([1, 2, 1])
X1 = imgFD(rowdec, X, h)
# X1 = rowdec(rowdec(X, h).T, h).T

fig, ax = plt.subplots()
plot_image(X1, ax=ax)
plt.title("Quarter-size low pass image of X")


The image `X1` can be interpolated 2:1 in each direction to
generate a lowpass image of the original size, which can then be
subtracted from the original to yield a full-size highpass image.
We have also provided a row interpolation/filter function,
`rowint(X, h)`, which doubles the length of each row of the image, by
inserting zeros between alternate samples and then lowpass
filtering the result with the filter `h`. Note that to undo the decimation
performed by `Y = rowdec(X, h)`, this should be called as `Z = rowint(Y, 2*h)`,
where a DC gain of 2 is added, as shown in [Figure 2](#figure-2).

In [111]:
from cued_sf2_lab.laplacian_pyramid import rowint

Apply this twice (as before) to generate
a full-size lowpass image, and subtract this from `X` to
give a highpass image `Y0`. Display `X1` and `Y0` to see these effects.
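
To see why the interpolation filter needs a DC gain of 2, the following sketch inserts zeros between samples and then filters. `rowint_sketch` is a hypothetical stand-in, not the lab's `rowint`; it assumes an odd-length filter and reflect-style edge extension.

```python
import numpy as np

def rowint_sketch(X, h):
    """Double the length of each row by zero insertion, then filter with h."""
    X = np.asarray(X, dtype=float)
    r, c = X.shape
    m = (len(h) - 1) // 2
    X2 = np.zeros((r, 2 * c))
    X2[:, ::2] = X                              # load every second sample
    Xe = np.pad(X2, ((0, 0), (m, m)), mode="reflect")
    Y = np.zeros((r, 2 * c))
    for k, hk in enumerate(h):
        Y += hk * Xe[:, k:k + 2 * c]
    return Y

h = 0.25 * np.array([1, 2, 1])
flat = np.full((2, 4), 3.0)

with_gain = rowint_sketch(flat, 2 * h)          # constant level is preserved
without_gain = rowint_sketch(flat, h)           # level is halved
ramp = rowint_sketch(np.array([[0.0, 4.0]]), 2 * h)
```

Half of the samples fed to the filter are zeros, so without the factor of 2 the interpolated image would have half the original mean level.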

In [112]:
# your code here
Z = imgFD(rowint, X1, 2*h)
# Z = rowint(rowint(X1, 2*h).T, 2*h).T

Y0 = X-Z

fig, ax = plt.subplots(1, 2, figsize = (10,5))
ax[0].set_title("Quarter-size low pass image of X")
plot_image(X1, ax=ax[0])
ax[1].set_title("Full-size high pass image of X")
ax[1].imshow(Y0)



The highpass images will
always have approximately zero mean (since any DC component is
removed). Hence for the remainder of this project, we recommend
that you also make all lowpass images approximately zero mean
by subtracting 128 from them before you start any processing. This
makes the 8-bit pixel values cover the range -128 to 127,
instead of 0 to 255. The advantages of having a zero-mean input image are not very obvious here, but they
will become clearer as the project progresses. If you use `plot_image` the images will still be displayed correctly.

In [113]:
X, cmaps_dict = load_mat_img(img='lighthouse.mat', img_info='X', cmap_info={'map', 'map2'})
X = X - 128.0

Examine the functions ```rowdec``` and ```rowint``` and check that you understand how they work. Decimation is achieved by selecting every second element of a vector using selection indices `i:i+c:2`, and filtering is performed with symmetric extension as in ```convse```, except that we do not bother to calculate the filter outputs that are to be discarded by the decimation process. Interpolation is achieved by loading every second element of a double-size vector using `X2 = np.zeros((r, c2), dtype=X.dtype)`, `X2[:, ::2] = X`. The intermediate samples of the interpolated vector x must be zero before the vector is passed through the interpolation filter.

In [114]:
# IPython.display.Code(inspect.getsource(rowdec), language="python")

In [115]:
# IPython.display.Code(inspect.getsource(rowint), language="python")

If the small lowpass image `X1` and the full-size highpass image `Y0`
are transmitted to a distant decoder, then the decoder can exactly reconstruct
the original image by interpolating `X1` up to full size and adding in
`Y0` (which represents the error between the original and the interpolated
`X1`).  We have achieved image compression if `X1` and `Y0` can be
transmitted with fewer bits than `X`.  Usually this will be the case
because `Y0` contains so much less energy than `X`, and `X1` is
only one quarter of the size of `X`.  However we do start at a
disadvantage because there are 25% more samples to code.  Many of the `Y0` samples may be represented by zero, and we shall show later that runs of
zeros may be coded with relatively few bits.
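
The exact-reconstruction property and the 25% sample overhead can be checked on a small example. The sketch below deliberately swaps in much simpler stand-ins for the lab's filters: `down2` (2x2 block mean) and `up2` (pixel replication) are hypothetical helpers, chosen only to show that reconstruction is exact whatever decimator and interpolator are used, as long as the decoder uses the same interpolator as the encoder.

```python
import numpy as np

def down2(x):
    """Stand-in decimator: mean of each 2x2 block (not the lab's rowdec)."""
    r, c = x.shape
    return x.reshape(r // 2, 2, c // 2, 2).mean(axis=(1, 3))

def up2(x):
    """Stand-in interpolator: repeat each pixel 2x2 (not the lab's rowint)."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(8, 8)).astype(float) - 128.0

X1_demo = down2(X_demo)            # quarter-size lowpass image
Y0_demo = X_demo - up2(X1_demo)    # full-size highpass (prediction error) image
Z0_demo = up2(X1_demo) + Y0_demo   # what the decoder computes

max_err = np.max(np.abs(X_demo - Z0_demo))
sample_ratio = (X1_demo.size + Y0_demo.size) / X_demo.size
```

Here `sample_ratio` is the number of transmitted samples relative to the original image, which is where the 25% overhead comes from.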

The quarter-size lowpass image `X1` may be further subsampled, using the
same process as was applied to `X`, so that it may be transmitted as a
one-sixteenth-size lowpass image `X2` and a quarter-size highpass image
`Y1`.  This usually achieves further data compression and may be repeated
as many times as is desired (until, for typical images, no further compression
is achieved).  This leads to a pyramid of highpass images and a final tiny
lowpass image.  Usually three or four layers of the pyramid are sufficient to
give maximum compression.

Write a `py4enc(X, h)` function to generate a 4-layer pyramid, so that `X` is split into four
highpass images, `Y0 Y1 Y2 Y3`, each a quarter of the size of its predecessor, plus a tiny
lowpass image `X4`, which is a quarter of the size of `Y3`.
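
One way to avoid writing out four near-identical stages is a loop over layers. The following is only a structural sketch: `encode_sketch` and `decode_sketch` are hypothetical names, and `down2` / `up2` are simple stand-ins (block mean and pixel replication) rather than the `rowdec` / `rowint` filtering used in this lab.

```python
import numpy as np

def down2(x):
    """Stand-in decimator: mean of each 2x2 block."""
    r, c = x.shape
    return x.reshape(r // 2, 2, c // 2, 2).mean(axis=(1, 3))

def up2(x):
    """Stand-in interpolator: repeat each pixel 2x2."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def encode_sketch(X, n_layers):
    """Split X into n_layers highpass images plus one small lowpass image."""
    Ys = []
    for _ in range(n_layers):
        X_low = down2(X)
        Ys.append(X - up2(X_low))   # highpass image at this layer
        X = X_low                   # continue with the lowpass image
    return Ys, X

def decode_sketch(Ys, X_low):
    """Rebuild every lowpass image; Zs[0] is the full-size one."""
    Zs = []
    Z = X_low
    for Y in reversed(Ys):
        Z = up2(Z) + Y
        Zs.append(Z)
    return Zs[::-1]

rng = np.random.default_rng(1)
X_demo = rng.integers(0, 256, size=(16, 16)).astype(float) - 128.0
Ys_demo, X4_demo = encode_sketch(X_demo, 4)
Zs_demo = decode_sketch(Ys_demo, X4_demo)
shapes = [Y.shape for Y in Ys_demo] + [X4_demo.shape]
```

Each highpass image has a quarter of the samples of the one before it, and the last lowpass image is a quarter of the size of the smallest highpass image.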

In [116]:
# def py4enc(X, h):
#     # your code here
#     # warnings.warn("you need to write this!")
    
#     X1 = imgFD(rowdec, X, h)
#     Z1 = imgFD(rowint, X1, 2*h)
#     Y0 = X-Z1
    
#     X2 = imgFD(rowdec, X1, h)
#     Z2 = imgFD(rowint, X2, 2*h)
#     Y1 = X1-Z2
    
#     X3 = imgFD(rowdec, X2, h)
#     Z3 = imgFD(rowint, X3, 2*h)
#     Y2 = X2-Z3
    
#     X4 = imgFD(rowdec, X3, h)
#     Z4 = imgFD(rowint, X4, 2*h)
#     Y3 = X3-Z4
    
#     # remove these lines when implementing your code - they are here to show you what shape the plots below should be
# #     Y0 = np.full(X.shape, np.nan)
# #     Y1 = np.full(X.shape >> np.int_(1), np.nan)
# #     Y2 = np.full(X.shape >> np.int_(2), np.nan)
# #     Y3 = np.full(X.shape >> np.int_(3), np.nan)
# #     X4 = np.full(X.shape >> np.int_(4), np.nan)

#     return Y0, Y1, Y2, Y3, X4

We can then plot the results to check them using the `beside` helper function:

In [117]:
from cued_sf2_lab.laplacian_pyramid import beside

In [118]:
# Y0, Y1, Y2, Y3, X4 = py4enc(X, h)
Ys, X4 = encode(X, h, 4)

Y0, Y1, Y2, Y3 = Ys
fig, ax = plt.subplots(figsize=(8, 4))
plot_image(beside(Y0, beside(Y1, beside(Y2, beside(Y3, X4)))), ax=ax);


If we wish to see the images separately from each other, instead of using our `beside` function that was written to match the old Matlab version, we can draw this directly with `matplotlib` using [`plt.subplots`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.subplots.html) to create a set of subplots, and the `width_ratios` argument to [`GridSpec`](https://matplotlib.org/stable/api/_as_gen/matplotlib.gridspec.GridSpec.html):

In [119]:
titles = ["Y_0", "Y_1", "Y_2", "Y_3", "X_4"]
imgs = [Y0, Y1, Y2, Y3, X4]
fig, axs = plt.subplots(1, 5, figsize=(8, 4),
                        gridspec_kw=dict(width_ratios=[img.shape[0] for img in imgs]))

for ax, img, title in zip(axs, imgs, titles):
    plot_image(img, ax=ax)
    ax.set(yticks=[], title=f'${title}$')


If we wish to see the images at the same scale as each other, we can use:

In [120]:
titles = ["Y_0", "Y_1", "Y_2", "Y_3", "X_4"]
imgs = [Y0, Y1, Y2, Y3, X4]
fig, axs = plt.subplots(1, 5, figsize=(8, 2))

for ax, img, title in zip(axs, imgs, titles):
    plot_image(img, ax=ax)
    ax.set(yticks=[], title=f'${title}$')


Get a demonstrator to check that your images look correct, and
then write another function `py4dec` to decode `X4` and
`Y3 Y2 Y1 Y0` into a set of lowpass images `Z3 Z2 Z1 Z0`.
(`Z3` is obtained by interpolating `X4` and adding
`Y3`, and then `Z2` is obtained from `Z3` and `Y2`, and
so on.)

In [121]:
# def py4dec(Y0, Y1, Y2, Y3, X4, h):
#     # your code here
#     # warnings.warn("you need to write this!")
#     Z3 = imgFD(rowint, X4, 2*h) + Y3
#     Z2 = imgFD(rowint, Z3, 2*h) + Y2
#     Z1 = imgFD(rowint, Z2, 2*h) + Y1
#     Z0 = imgFD(rowint, Z1, 2*h) + Y0
       

#     # remove these lines when implementing your code - they are here to show you what shape the plots below should be
# #     Z0 = np.full(Y0.shape, np.nan)
# #     Z1 = np.full(Y0.shape >> np.int_(1), np.nan)
# #     Z2 = np.full(Y0.shape >> np.int_(2), np.nan)
# #     Z3 = np.full(Y0.shape >> np.int_(3), np.nan)

#     return Z3, Z2, Z1, Z0

If all is correct, `Z0` should be identical to
`X`.  You can check that your function is correct by calculating `np.max(abs(X - Z0))` and also displaying your pyramid of decoded images, `Z3` to `Z0`.

In [122]:
h = 0.25*np.array([1, 2, 1])
# Z3, Z2, Z1, Z0 = py4dec(Y0, Y1, Y2, Y3, X4, h)
Zs = decode(Ys, X4, h)

encode_decode_err = np.max(np.abs(X - Zs[0]))
print(f'Encode-Decode Error = {encode_decode_err}')

# your code here to plot the images

titles = ["Z_0", "Z_1", "Z_2", "Z_3"]
imgs = Zs
fig, axs = plt.subplots(1, 4, figsize=(8, 2))

for ax, img, title in zip(axs, imgs, titles):
    plot_image(img, ax=ax)
    ax.set(yticks=[], title=f'${title}$')

Encode-Decode Error = 0.0



For further information on the Laplacian Pyramid see Burt and Adelson [IEEE Trans. on Communications, 1983, vol 31, no 4, pp 532-540, "The Laplacian Pyramid as a compact image code"].

## 6.2 Quantisation and Coding Efficiency
To see whether data compression is possible using the above pyramid decomposition, we must calculate the approximate number of bits required to code the image pyramid. This may be done using the *entropy* of the quantised image data. 

The entropy of a single data sample, which may randomly take one of $Q$ possible quantised values, such that value $q(i)$ occurs with probability $p(i)$ for $i=1\to Q$, is given by:

$$ \text{Entropy (bits/sample)} = \sum_{i=1}^Qp(i) \log_2\frac{1}{p(i)} = -\sum_{i=1}^Q p(i) \log_2p(i)$$

The entropy represents the minimum average number of bits per sample needed to code samples with the given probability distribution $p(i)$, assuming that an ideal variable-length entropy code is used, and that the samples are uncorrelated with each other. Arithmetic codes can get arbitrarily close to this bit rate, and simpler Huffman codes can also get very close with many typical signals (you will be using Huffman codes later on). It is possible to code signals at bit rates less than the entropy, if the samples are correlated, but for simplicity we shall ignore this here.

To demonstrate the validity of the above formula, first consider a signal with 8 quantised values of equal probability $p(i) = 1/8$ for all $i$. The entropy is then $8 \times 1/8 \times \log_2(8) = 3$ bits per sample, as expected. 

Now consider a signal with only 3 values, with probabilities $p(1)=1/2$,  $p(2) = p(3) = 1/4$. The entropy is then $\frac{1}{2} \log_2(2) + 2 \times \frac{1}{4} \log_2(4) = 1.5$ bits per sample. This is consistent with using a single bit '0' to represent state 1 and two bits, '10' and '11' to represent states 2 and 3.
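
The two worked examples above can be reproduced numerically. This is a small sketch with a hypothetical helper `entropy_bits`; it simply evaluates the entropy formula for a given list of probabilities.

```python
import numpy as np

def entropy_bits(p):
    """Entropy in bits/sample of a discrete distribution p (zeros ignored)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

uniform8 = entropy_bits([1 / 8] * 8)          # 8 equally likely values
skewed3 = entropy_bits([0.5, 0.25, 0.25])     # the 3-value example

# Average length of the code '0', '10', '11' for the 3-value example.
avg_code_len = 0.5 * 1 + 0.25 * 2 + 0.25 * 2
```

For the 3-value signal the average code length equals the entropy, so that simple prefix code is already optimal.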

The function `bpp` has been written to calculate the entropy in bits per pixel of an image matrix `X`. First the function computes a histogram of `X` to determine the probabilities $p(i)$ and then it calculates the entropy, using the above formula.

In [123]:
from cued_sf2_lab.laplacian_pyramid import bpp

In the Laplacian Pyramid, the total number of bits is obtained by multiplying each of the sub-image entropies by the number of pixels in each corresponding sub-image. However, in order to compress the data, we also need to quantise the images. We have also provided a function `quantise` which will quantise `X` in steps centred on integer multiples of `step`. Hence `bpp(quantise(X, step))` will return the entropy of image `X` quantised in steps of `step`.
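
A rough sketch of the two steps involved is given below. `quantise_sketch` and `bpp_sketch` are hypothetical stand-ins written for illustration; the lab's `quantise` and `bpp` may differ in detail (for example in how the histogram bins are chosen).

```python
import numpy as np

def quantise_sketch(x, step):
    """Round each value to the nearest integer multiple of step."""
    return step * np.round(np.asarray(x, dtype=float) / step)

def bpp_sketch(x):
    """Entropy in bits per sample, estimated from the value histogram."""
    _, counts = np.unique(x, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

# Intensities from -127 to 127 quantised with a step of 17.
levels = np.unique(quantise_sketch(np.arange(-127, 128), 17))

# Total bits for a sub-image = entropy per pixel * number of pixels.
img = np.array([[0.0, 3.0, 15.0, 20.0],
                [1.0, 2.0, 18.0, 16.0]])
q = quantise_sketch(img, 17)
total_bits = bpp_sketch(q) * q.size
```

In this toy image half of the pixels quantise to 0 and half to 17, so the entropy is 1 bit per pixel and the total is 8 bits.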

In [124]:
from cued_sf2_lab.laplacian_pyramid import quantise

<div class="alert alert-block alert-danger">

Calculate the entropies of images `X`, `X1` and `Y0`, and hence the total number of bits to encode `X`, or `X1` and `Y0`, when quantised to a step size of 17 (which gives 15 distinct grey levels if applied to a lowpass image with intensities from -127 to 127). Find the data compression for this simple one-stage pyramid, and then investigate the improvements from using more layers.
</div>

Entropy (bits per pixel): <br> X: 3.48 <br> X1: 3.41 <br> Y0: 1.62 <br>

Total number of bits: <br> X: 228119.03651868744 <br> X1: 55929.71067909868 <br> Y0: 106223.16163959922 <br> X1 + Y0: 162152.8723186979 <br>

Compression ratio (0 to 7 layers): 1, 1.40681465, 1.59924733, 1.64988645, 1.66082141, 1.66210453, 1.66204739, 1.66200458 <br>
Increases with more layers, but plateaus quickly at around 1.66

In [125]:
# Write your code to calculate compression ratios here.
step = 17
X1 = imgFD(rowdec, X, h)
Z = imgFD(rowint, X1, 2*h)
Y0 = X-Z

bX = bpp(quantise(X, step))
bX1 = bpp(quantise(X1, step))
bY0 = bpp(quantise(Y0, step))

print("Bits per pixel:")
print("X: ", bX)
print("X1: ", bX1)
print("Y0: ", bY0)

print("\nTotal bits to encode:")
print("X: ", bX*X.shape[0]*X.shape[1])
print("X1: ", bX1*X1.shape[0]*X1.shape[1])
print("Y0: ", bY0*Y0.shape[0]*Y0.shape[1])
print("X1+Y0: ", bX1*X1.shape[0]*X1.shape[1]+bY0*Y0.shape[0]*Y0.shape[1], end= "\n\n")

print("X: ", gettbit(X, h, 0, [17]))
print("X1+Y0: ", gettbit(X, h, 1, [17]*2))
print("Compression ratio:\n", gettbit(X, h, 0, [17])/gettbit(X, h, 1, [17]*2))

assert(gettbit(X, h, 0, [17]) == bX*X.shape[0]*X.shape[1])
assert(gettbit(X, h, 1, [17]*2) == bX1*X1.shape[0]*X1.shape[1]+bY0*Y0.shape[0]*Y0.shape[1])

layers = np.arange(0, 8, 1)
stepsizes = [[17]*(i+1) for i in layers]
tbits = np.empty(len(layers))
for i in layers:
    tbit = gettbit(X, h, layers[i], stepsizes[i])
    tbits[i] = tbit

print(tbits[0]/np.array(tbits))

Bits per pixel:
X:  3.480820259379386
X1:  3.4136786303160815
Y0:  1.620836817010486

Total bits to encode:
X:  228119.03651868744
X1:  55929.71067909868
Y0:  106223.16163959922
X1+Y0:  162152.8723186979

X:  228119.03651868744
X1+Y0:  162152.8723186979
Compression ratio:
 1.406814651240581
[1.         1.40681465 1.59924733 1.64988645 1.66082141 1.66210453
 1.66204739 1.66200458]


Since compressing an image will generally result in a reduction in quality, we also need a way to measure this quality reduction. It is actually quite hard to find a quality measure which matches individual perceptions of how an image has been changed due to compression, and for that reason it is important to always judge and comment on an image visually. However, we also need a quantitative measure, and the most obvious is the rms error (standard deviation) between the input and compressed image (i.e. using ```np.std(X - Z)``` where `Z` is the compressed image).

<div class="alert alert-block alert-danger">
Quantise the Laplacian pyramid with a step size of 17, and reconstruct the output image from the decoding pyramid. Look at the visual features, and calculate the rms error (standard deviation) between the input image and the decoding pyramid output image. Repeat this for more layers in the pyramid.
</div>

RMS: <br>
Layer 0 : 4.861168497356846 <br>
Layer 1 : 5.382782204619935 <br>
Layer 2 : 6.067419036722591 <br>
Layer 3 : 6.751868881610953 <br>
Layer 4 : 7.629817744060247 <br>
Layer 5 : 8.45124264534953 <br>
Layer 6 : 9.203373685774112 <br>
Layer 7 : 9.405961872018967 <br>

Visual features of pyramid: <br>
Very patchy, blurry, loss of resolution even though most features can still be seen (fence, roof, tower, lamppost)

In [126]:
# IPython.display.Code(inspect.getsource(encode), language="python")

In [127]:
# IPython.display.Code(inspect.getsource(decode), language="python")

In [128]:
# # Write your code to explore rms error with the laplacian pyramid here.

# def rmsError(img, filt, layer, step, indiv = False):
#     """
#     Returns RMS error for quantised laplacian pyramid of specified level (rms), and decoded image (Zs[0])
    
#     Parameters:
#         img: Image matrix
#         filt: Filter coefficients
#         layers: Number of layers in Laplacian Pyramid
#         step: List of quantisation step sizes
#         indiv: False: Return total RMS, True: Return individual RMS of layers
        
#     Returns:
#         rms: Root Mean Squared error of quantized image at desired layer and step size
#              Type: Integer if indiv ==  False, Array if indiv == True
#         Zs[0]: Decoded image (to original size)
        
#     """
#     assert len(step)==layer+1, "Size of step array not compatible with number of layers"
    
#     # Calculate total RMS value across all layers
#     if not indiv:
#         Ys, Xn = encode(img, filt, layer)
#         Xq = quantise(Xn, step[-1])
#         Qs = [] # Quantised images of Ys
#         for j in range (0, layer):
#             Qs.append(quantise(Ys[j], step[j]))
#         Zs = decode(Qs, Xq, filt)
#         rms = np.std(img-Zs[0])
    
#     # Calculate RMS contribution by individual layers
#     # Decoded image not relevant
#     else:
#         rms = np.zeros(layer+1)
#         # Quantize individual Yn layer and calculate RMS
#         for i in range(0, layer):
#             Ys, Xn = encode(img, filt, layer)
#             Ys[i] = quantise(Ys[i], step[i])
#             Zs = decode(Ys, Xn, filt)
#             rms[i] = np.std(img-Zs[0])
            
#         # Quantize Xn only and calculate RMS
#         Ys, Xn = encode(img, filt, layer)
#         Xq = quantise(Xn, step[-1])
#         Zs = decode(Ys, Xq, filt)
#         rms[-1] = np.std(img-Zs[0])
        
#     return rms, Zs[0]


# # # Return RMS errors for quantised laplacian of levels 0 - layers (rmsErr, list), and decoded images (Ds, list)
# # def rmsError(X, h, layers = 4, step = 17):
# #     rmsErr = np.empty(layers+1)
# #     Ds = []
# #     for i in range (0, layers+1):
# #         Ys, Xn = encode(X, h, i)
# #         Xq = quantise(Xn, step)
# #         Qs = []
# #         for j in range (0, i):
# #             Qs.append(quantise(Ys[j], step))
# #         Zs = decode(Qs, Xq, h)
# #         rmsErr[i] = np.std(X-Zs[0])
# #         Ds.append(Zs[0])
# #     return rmsErr, Ds

In [129]:
layers = np.arange(0,8,1)

Zs = []
for i in range (0, len(layers)):
    step = [17]*(layers[i]+1)
    rms, Z0 = rmsError(X, h, layers[i], step)
    print("Layer", i, ":", rms)
    Zs.append(Z0)


Layer 0 : 4.861168497356846
Layer 1 : 5.382782204619935
Layer 2 : 6.067419036722591
Layer 3 : 6.751868881610953
Layer 4 : 7.629817744060247
Layer 5 : 8.45124264534953
Layer 6 : 9.203373685774112
Layer 7 : 9.405961872018967


Note that we call this error the rms error, but in fact we calculate the standard deviation, which only equals the true rms error if the mean error is zero. However the eye is very insensitive to small errors in the mean level of images, so the standard deviation (which ignores the mean) is a better measure of image quality. 
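
The difference between the two measures shows up as soon as the error has a non-zero mean. A minimal sketch with made-up error values:

```python
import numpy as np

def true_rms(err):
    """Root mean square of the error, including any mean offset."""
    return float(np.sqrt(np.mean(np.square(err))))

# A pure brightness offset: every pixel is wrong by the same amount.
offset_err = np.array([2.0, 2.0, 2.0, 2.0])
# A zero-mean error of the kind quantisation typically produces.
zero_mean_err = np.array([1.0, -1.0, 1.0, -1.0])

offset_rms, offset_std = true_rms(offset_err), float(np.std(offset_err))
zm_rms, zm_std = true_rms(zero_mean_err), float(np.std(zero_mean_err))
```

The constant offset has an rms of 2 but a standard deviation of 0, whereas for the zero-mean error the two measures agree.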

<div class="alert alert-block alert-danger">
Quantise the original image with the same step size (17) and note the visual features and rms error. Compare these results from the pyramid scheme above. Why are the rms errors larger in the pyramid scheme? 
</div>

Visual features: <br>
Direct quantisation makes the image quite patchy (posterised), which is most obvious in the sky. <br>
With more pyramid layers the decoded image becomes patchier and appears lower in resolution, because more information is lost during quantisation.

RMS errors: <br>
The rms error increases with the number of layers in the pyramid. <br>
Quantisation introduces errors in every layer, and these are carried through the decoder: the deeper the layer, the more times its error is interpolated (each stage doubles the rows and columns), which scales the error up in the reconstructed image.


In [130]:
images = [X] + Zs
indices = ['Original'] + list(np.arange(0, len(images)-1, 1))


image = [X, Zs[0], Zs[4], Zs[7]]
index = ['Original', 0, 4, 7]

plotImg(image, cols = 4, title = 'Level: ', index = index, cmap = 'gray', save = True, name = r'Report 1 Figures\Quantised.png')


Comparisons of the number of bits with different coding strategies are only valid if they result in approximately the same image quantisation error. Write a function which will optimise the step size (resulting in a non-integer value) in the Laplacian scheme until the rms error is the same as for direct quantisation; you will find this optimisation useful for later investigations too.
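
One possible shape for such an optimiser is a simple bisection on the step size. The sketch below is self-contained and therefore uses toy ingredients: a one-layer pyramid built from stand-in filters (`down2`, `up2`), a synthetic random image, and hypothetical helper names throughout. It assumes the rms error grows roughly monotonically with the step size over the search interval, which is only approximately true.

```python
import numpy as np

def quantise_sketch(x, step):
    """Round each value to the nearest integer multiple of step."""
    return step * np.round(x / step)

def down2(x):
    """Stand-in decimator: mean of each 2x2 block."""
    r, c = x.shape
    return x.reshape(r // 2, 2, c // 2, 2).mean(axis=(1, 3))

def up2(x):
    """Stand-in interpolator: repeat each pixel 2x2."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def pyramid_rms(X, step):
    """Error (std) after quantising a one-layer pyramid with a single step."""
    X1 = down2(X)
    Y0 = X - up2(X1)
    Z0 = up2(quantise_sketch(X1, step)) + quantise_sketch(Y0, step)
    return float(np.std(X - Z0))

def match_step(X, target_rms, lo=1.0, hi=40.0, iters=40):
    """Bisect on the step size until the pyramid error meets target_rms."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pyramid_rms(X, mid) > target_rms:
            hi = mid      # error too large: try a smaller step
        else:
            lo = mid      # error small enough: try a larger step
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
X_demo = rng.normal(0.0, 40.0, size=(64, 64))        # synthetic test image
target = float(np.std(X_demo - quantise_sketch(X_demo, 17.0)))
step = match_step(X_demo, target)
achieved = pyramid_rms(X_demo, step)
```

The same loop can wrap any function that maps a step size (or an overall scale applied to a fixed set of step ratios) to an rms error, so it can be reused when the per-layer steps are no longer equal.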
<div class="alert alert-block alert-danger">
Investigate what step size of the quantisers for the pyramid scheme you need, in order to get approximately the same error as for direct quantisation at a step size of 17.
</div>

Quantisation level: <br>
0: 17 <br>
1: 15.186 <br>
2: 13.265 <br>
3: 11.607<br>
4: 10.303<br>
5: 9.429<br>
6: 8.869<br>
7: 8.675<br>

Obtained using a one-dimensional line search. I also tried `scipy.optimize.minimize`, but it is very sensitive to the initial position, although it can reach very low errors (which is not really required in this case).

In [131]:
# Write your optimisation and step size selection here

# RMS error for direct quantisation
rms0, D0 = rmsError(X, h, 0, [17])
# print(rms)
# plotImg([D0], 1)

levels = np.arange(0, 8, 1)
initGuess = [17.00, 15.186, 13.27, 11.62, 10.33, 9.45, 8.89, 8.72]
stepratio = [[1.0]*(i+1) for i in levels]
# print(levels, stepratio)

steps, Ds = findSteps(X, h, levels, stepratio, initGuess, rms0)

# # Print result
# for level in steps:
#     print('Level', level, ':', steps[level]['Ratio'])

# plotImg([X]+Ds, cols = 3, cmap = 'gray', title = "Level: ", save = True, name = 'Constant Step Lap 3.png')
# Error visualisation
# Dminus = Ds - D0
# plotImg(Dminus, cols = 5, cmap = 'gray')
Z_const_step_3 = Ds[2]

In [132]:
for level in steps:
    print('Level', level, ':', steps[level]['Step'])

# plotImg(Ds, cols = 4, cmap = 'gray', title = 'Level: ', save = True, name = 'Constant MSE Lap 3.png')
index = [0, 1, 2, 3]
plotImg(Ds[:4], cols = 2, cmap = 'gray', index = index, title = 'Level: ', save = True, name = r'Report 1 Figures\Constant Step Lap 3.png')

Level 0 : [17.]
Level 1 : [15.186 15.186]
Level 2 : [13.265 13.265 13.265]
Level 3 : [11.607 11.607 11.607 11.607]
Level 4 : [10.3035 10.3035 10.3035 10.3035 10.3035]
Level 5 : [9.4295 9.4295 9.4295 9.4295 9.4295 9.4295]
Level 6 : [8.8695 8.8695 8.8695 8.8695 8.8695 8.8695 8.8695]
Level 7 : [8.675 8.675 8.675 8.675 8.675 8.675 8.675 8.675]



In many of your results from now on you will need to express the performance of your algorithms in terms of *compression ratio*, which is normally defined as:

$$ \text{Compression Ratio} = \frac{\text{Total bits for reference scheme}}{\text{Total bits for compressed scheme}} $$

Usually the *reference* scheme is the direct pixel quantisation method with its quantiser adjusted to give the same rms error as the scheme being evaluated (the *compressed* scheme). For good schemes we try to make the compression ratio as large as possible.

We now investigate the effect of using different step sizes for the different levels of the pyramid. There are many schemes for varying the step sizes between different levels: in this project we shall look at the *equal MSE* criterion. In the *equal MSE* scheme, step sizes are chosen such that quantisers in each layer contribute equally to the Mean Squared Error of the reconstructed image. In general the step sizes will depend on the image signal being coded. However this can be achieved approximately by choosing a separate step size for each layer such that:
* A single impulse of that step size will give a filtered pulse in the reconstructed image which has the same energy, whichever layer of the decoder the impulse excites.

### Impulse Response Measurement
Investigate the effect of a single impulse in a particular layer (e.g. **Y0,Y1** etc.) as it appears in the reconstructed image **Z0**. This can be done by first generating a test pyramid image, which is zero everywhere. Then place an impulse (e.g. of amplitude 100) in the centre of one layer, reconstruct the entire pyramid to give **Z0**, then measure the total energy of **Z0**. If this is repeated for each layer, you will have measured how much energy a fixed size impulse at each layer contributes to the decoded image.

We actually want to arrange for the impulse sizes to vary and the energy to stay the same. Therefore the impulse sizes (and hence chosen quantisation steps) we require in each level will be inversely proportional to the square root of the energies measured above. The important result is the *ratio* of the step sizes between layers. If this ratio is maintained for any overall quantiser scaling, we will get an (approximately) *equal MSE* scheme.
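
For an impulse far from the image edges, the measurement can be predicted from the filter alone, because the 2-D interpolation is separable. The sketch below is a 1-D calculation under that assumption (edge effects are ignored, so it does not apply to the deepest layers of a small image); `impulse_energy` is a hypothetical helper, and the interpolation filter is taken to be `2*h` as in this notebook.

```python
import numpy as np

def impulse_energy(h, level, amplitude=100.0):
    """Energy in Z0 of an impulse placed `level` interpolation stages deep."""
    g = 2.0 * np.asarray(h, dtype=float)   # interpolation filter, DC gain 2
    p = np.array([1.0])                    # 1-D response, unit impulse
    for _ in range(level):
        up = np.zeros(2 * len(p) - 1)
        up[::2] = p                        # insert zeros between samples
        p = np.convolve(up, g)             # then lowpass filter
    # 2-D response is amplitude * outer(p, p), so its energy factorises.
    return amplitude ** 2 * np.sum(p ** 2) ** 2

h = 0.25 * np.array([1, 2, 1])
energies = np.array([impulse_energy(h, k) for k in range(4)])

# Step sizes inversely proportional to the square root of the energy.
ratios = 1.0 / np.sqrt(energies)
ratios = ratios / ratios[0]
```

For the 3-tap filter this gives energies of 1e4, 2.25e4, 7.5625e4 and about 2.889e5 for the first four layers, and normalised step ratios of roughly 1, 0.667, 0.364 and 0.186.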

<div class="alert alert-block alert-danger">
Find new values for the data compression achievable when all schemes (i.e. constant step size or equal MSE, both with varying layer depth) produce the same rms error between the decoded image and the original image. Comment on the differences in compression, visual quality and rms error, and the optimum choice of layer depth in each case.
</div>

Compression ratios for equal RMS, constant step size (0 to 7 layers):<br>
1, 1.32894792, 1.38898136, 1.32482148, 1.24861567, 1.19249693, 1.15477753, 1.14206435

Compression ratio best for 2 layer pyramid, then decreases as more layers are involved. Likely due to increase in number of bits to encode as a result of more layers, but overall still better than directly quantising.
<br>

Compression ratios for equal RMS, equal MSE (0 to 7 layers):<br>
1, 1.39420491, 1.54140208, 1.54887682, 1.54983322, 1.54861791, 1.54812392, 1.54807039

Compression ratio best for 4 layer pyramid, then decreases with more layers involved. Same reason as before. Generally performs better than equal stepsize quantisation, due to less emphasis (smaller stepsize) in deeper layers

In [133]:
# Find the compression ratios for constant step sizes 
levels = list(steps.keys())
stepsizes = [steps[level]["Step"] for level in levels]
# print(levels, stepsizes)

ratios = compRatio(X, h, levels, stepsizes)
print("Compression ratios for equal RMS, constant step size:\n", ratios)

# levels = [0, 1]
# stepsizes = [17, 17]
# ratios = compRatio(X, h, levels, stepsizes)
# print(ratios)

Compression ratios for equal RMS, constant step size:
 [1.         1.32894792 1.38898136 1.32482148 1.24861567 1.19249693
 1.15477753 1.14206435]


In [134]:
# Impulse response measurement
test = np.zeros((X.shape[0], X.shape[1]))
layers = 7

Ze = impRes(test, h, layers)
assert len(Ze) == layers+1, "Error in length of impulse energy"
print(Ze)

[1.00000000e+04 2.25000000e+04 7.56250000e+04 2.88906250e+05
 1.14222656e+06 4.55555664e+06 1.82088892e+07 2.89571376e+08]


In [135]:
# Find the step size ratios for equal MSE
ssratio = 1/np.sqrt(Ze)
print(ssratio)

print("Normalised: ", np.round(ssratio/ssratio[0],3))

[1.00000000e-02 6.66666667e-03 3.63636364e-03 1.86046512e-03
 9.35672515e-04 4.68521230e-04 2.34346393e-04 5.87654661e-05]
Normalised:  [1.    0.667 0.364 0.186 0.094 0.047 0.023 0.006]


In [136]:
# Find the quantisation step size for equal mse, matching the reference RMS error
level = 7
levels = np.arange(0, level+1, 1)
# stepratio sum to 1
# stepratio = [ssratio[:i+1]/sum(ssratio[:i+1]) for i in levels]

# stepratio normalised
stepratio = [ssratio[:i+1]/np.linalg.norm(ssratio[:i+1]) for i in levels]

# print(stepratio)
# stepratio sum to 1
# initGuess = [17.0, 31, 37.7, 40.2, 41.9, 42.7, 43.1, 43.2]

# stepratio normalised
initGuess = [17.0, 22.35, 23.27, 23.05]
initGuess += [23.07]*(len(levels)-len(initGuess))


assert len(initGuess) == level+1 == len(levels)

rms0, D0 = rmsError(X, h, levels[0], stepratio[0]*initGuess[0])
print(rms0)
steps, Ds = findSteps(X, h, levels, stepratio, initGuess, rms0, disp = False)
Z_const_MSE_3 = Ds[4]

4.861168497356846


In [137]:
for level in steps:
    print('Level', level, ':', steps[level]['Step'])

# plotImg(Ds, cols = 4, cmap = 'gray', title = 'Level: ', save = True, name = 'Constant MSE Lap 3.png')
index = [2, 3, 4, 5, 6, 7]
plotImg(Ds[2:8], cols = 3, cmap = 'gray', index = index, title = 'Level: ', save = True, name = r'Report 1 Figures\Constant MSE Lap 3.png')

Level 0 : [17.]
Level 1 : [18.57885102 12.38590068]
Level 2 : [18.51340777 12.34227185  6.73214828]
Level 3 : [18.10392458 12.06928306  6.5832453   3.36817202]
Level 4 : [18.0922599  12.0615066   6.5790036   3.36600184  1.69284303]
Level 5 : [18.08001505 12.05334337  6.57455093  3.36372373  1.69169731  0.84708709]
Level 6 : [18.08127297 12.05418198  6.57500835  3.36395776  1.69181502  0.84714603
  0.42372811]
Level 7 : [18.07911811 12.05274541  6.57422477  3.36355686  1.69161339  0.84704507
  0.42367761  0.10624278]



In [138]:
# Check equal mse
levels = list(steps.keys())
stepsizes = [steps[level]["Step"] for level in levels]

Zs = []
for i in range (0, len(levels)):
    level = levels[i]
    step = stepsizes[i]
    rms, Z0 = rmsError(X, h, level, step)
    print("Layer", i, ":", rms)
    Zs.append(Z0)

Layer 0 : 4.861168497356846
Layer 1 : 4.865807529435982
Layer 2 : 4.865024815699314
Layer 3 : 4.866504176482655
Layer 4 : 4.864813919452669
Layer 5 : 4.865861499965295
Layer 6 : 4.865997137174737
Layer 7 : 4.864960495315342


In [139]:
# Find the compression ratios for equal mse
levels = list(steps.keys())
stepsizes = [steps[level]['Step'] for level in levels]
# print(levels, stepsizes)

level = 7
testimg = np.ones(X.shape)*100
testimg[128][128] = 200

# img = testimg
# rms, D = rmsError(img, h, level, stepsizes[level], indiv = True)
# print(rms)
# print(np.sum(rms[1:]))
# rms, D = rmsError(img, h, level, stepsizes[level], indiv = False)
# print (rms)

ratios = compRatio(X, h, levels, stepsizes)
print("Compression ratios for equal RMS, equal MSE:\n", ratios)

Compression ratios for equal RMS, equal MSE:
 [1.         1.39420491 1.54140208 1.54887682 1.54983322 1.54861791
 1.54812392 1.54807039]


## 6.3 Changing the decimation / interpolation filter
It is also worth investigating whether a more complicated decimation / interpolation filter **h** can improve the compression. The z-transfer function of the filter used so far is

$$ h(z) = \left(\frac{1 + z^{-1}}{2}\right)^m $$

where $m = 2$. If $m$ is increased to 4 (so the number of taps remains odd), the vector representation of **h** is $\frac{1}{16} [1\ 4\ 6\ 4\ 1]$. This filter has a lower cut-off frequency than when $m=2$, so you should find that each interpolated lowpass image in the pyramid is a little more blurred, and there is a little more energy left in each highpass image.
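
The filter coefficients for any $m$ follow from expanding the transfer function, which amounts to convolving $[\frac{1}{2}\ \frac{1}{2}]$ with itself $m$ times. A small sketch (the helper names `binomial_filter` and `gain` are hypothetical):

```python
import numpy as np

def binomial_filter(m):
    """Coefficients of ((1 + z^-1) / 2)^m, found by repeated convolution."""
    h = np.array([1.0])
    for _ in range(m):
        h = np.convolve(h, [0.5, 0.5])
    return h

def gain(h, omega):
    """Magnitude of the frequency response of h at angular frequency omega."""
    n = np.arange(len(h))
    return float(np.abs(np.sum(h * np.exp(-1j * omega * n))))

h2 = binomial_filter(2)   # 3-tap filter used so far
h4 = binomial_filter(4)   # 5-tap filter

# Gain at a quarter of the sampling frequency (omega = pi / 2).
g2 = gain(h2, np.pi / 2)
g4 = gain(h4, np.pi / 2)
```

The magnitude response is $\cos^m(\omega/2)$, so at $\omega = \pi/2$ the gain falls from 0.5 for $m=2$ to 0.25 for $m=4$, which is the lower cut-off frequency mentioned above.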

<div class="alert alert-block alert-danger">
Optimise the step sizes for the pyramid with this new filter and determine the new compression performance and visual features.
</div>

In [140]:
# Repeat the previous experiments with a new h here
X, cmaps_dict = load_mat_img(img='lighthouse.mat', img_info='X', cmap_info={'map', 'map2'})
h1 = np.array([1, 4, 6, 4, 1]) / 16.0
layers = np.arange(0, 8, 1)

In [141]:
rms0, D0 = rmsError(X, h, 0, [17])
print(rms0)
rms0 = np.std(X - quantise(X,17))
print(rms0)

4.933982181978006
4.933982181978006


In [142]:
# Compression ratios for step size 17
stepsizes = [[17]*(i+1) for i in layers]
tbits = np.empty(len(layers))
for i in layers:
    tbit = gettbit(X, h1, layers[i], stepsizes[i])
    tbits[i] = tbit

print(tbits[0]/np.array(tbits))

levels = list(steps.keys())
stepsizes = [steps[level]["Step"] for level in levels]
# print(levels, stepsizes)

ratios = compRatio(X, h1, levels, stepsizes)
print("Compression ratios for equal RMS, constant step size:\n", ratios)

[1.         1.31118347 1.46812891 1.50732269 1.5138746  1.51459033
 1.51448262 1.5145178 ]
Compression ratios for equal RMS, constant step size:
 [1.         1.30136401 1.42052995 1.4194909  1.41842907 1.41672147
 1.4165609  1.41637448]
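
The ratios above compare the bits needed for the directly quantised image with the bits needed for the whole pyramid. A minimal sketch of that bookkeeping follows, assuming the bit count of each sub-image is estimated as its first-order entropy times its number of pixels. The function names are hypothetical and are not the `gettbit` or `compRatio` helpers used in this notebook.

```python
import numpy as np

def entropy_bits_per_pixel(q):
    """First-order entropy (bits per sample) of an array of quantised values."""
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def total_bits(subimages):
    """Sum of entropy times pixel count over every sub-image."""
    return sum(entropy_bits_per_pixel(s) * s.size for s in subimages)

def compression_ratio(reference, subimages):
    """Bits for the reference image divided by bits for the pyramid."""
    return total_bits([reference]) / total_bits(subimages)

# Toy example: four equiprobable symbols against a two-symbol layer plus a constant
reference = np.array([[0, 1], [2, 3]])
pyramid = [np.array([[0, 0], [1, 1]]), np.array([[5]])]
print(compression_ratio(reference, pyramid))
```

A ratio above one means the pyramid needs fewer bits than direct quantisation at the chosen step sizes.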


In [143]:
# RMS error for step size 17
Zs = []
for i in range (0, len(layers)):
    step = [17]*(layers[i]+1)
    rms, Z0 = rmsError(X, h1, layers[i], step)
    print("Layer", i, ":", rms)
    Zs.append(Z0)

Layer 0 : 4.933982181978006
Layer 1 : 5.117620910247526
Layer 2 : 5.582032083104553
Layer 3 : 6.393026408274269
Layer 4 : 7.067992007632795
Layer 5 : 7.348970121839558
Layer 6 : 8.119803898464646
Layer 7 : 7.661901518373323


In [144]:
# Decoded images for step size 17
image = [X] + Zs
index = ['Original'] + list(np.arange(0, len(image)-1, 1))


# image = [X, Zs[0], Zs[4], Zs[7]]
# index = ['Original', 0, 4, 7]

plotImg(image, cols = 4, title = 'Level: ', index = index, cmap = 'gray', save = True, name = r'Report 1 Figures\Quantised5.png')


In [145]:
# # rmsError function wrong?? fix it

# # initGuess = [17.00, 15.186, 13.27, 11.62, 10.33, 9.45, 8.89, 8.72]
# levels = np.arange(0,8,1)
# # Prob for the wrong rms0 (4.86 one)
# # initGuess = [16.82, 16.20, 14.52, 12.82, 12.11, 11.35, 10.53, 9.36]

# # For the new rms0 (4.93)
# # initGuess = [17.02, 16.40, 14.80, 13.27, 12.17, 11.12, 10.01, 8.75]

# # From dhilen
# initGuess = [17.02, 16.90, 16.90, 16.52, 16.52, 16.52, 16.52, 16.52]

# # initGuess = [10.00, 9.5]
# stepratio = [[1.0]*(i+1) for i in levels]

# rms0, D0 = rmsError(X, h1, 0, [17])
# # rms0, D0 = rmsError(X, h1, levels[0], stepratio[0]*initGuess[0])

# print(rms0)

# steps, Ds = findSteps(X, h1, levels, stepratio, initGuess, rms0, disp = True)

# # Print result
# for level in steps:
#     print('Level', level, ':', steps[level]['Ratio'], steps[level]['RMS'])

# # plotImg(Ds[:3], cols = 3, cmap = 'gray', title = "Level: ", save = True, name = 'Constant Step Lap 3.png')

In [146]:
# Impulse response measurement
test = np.zeros((X.shape[0], X.shape[1]))
layers = 7

Ze = impRes(test, h1, layers)
assert len(Ze) == layers+1, "Error in length of impulse energy"
print(Ze)

# Find the step size ratios for equal MSE
ssratio = 1/np.sqrt(Ze)
print(ssratio)

print("Normalised: ", np.round(ssratio/ssratio[0],3))

[1.00000000e+04 1.19628906e+04 3.90293980e+04 1.49228199e+05
 5.90402949e+05 2.35519106e+06 9.41436541e+06 2.50385627e+08]
[1.00000000e-02 9.14285714e-03 5.06178942e-03 2.58865724e-03
 1.30144477e-03 6.51608781e-04 3.25915094e-04 6.31968312e-05]
Normalised:  [1.    0.914 0.506 0.259 0.13  0.065 0.033 0.006]
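
The ratios printed above follow from a simple argument. A uniform quantiser with step $\Delta_k$ adds noise of variance roughly $\Delta_k^2/12$ to layer $k$, and the measured impulse energy $E_k$ describes how strongly an error in that layer is amplified on reconstruction. Making every layer contribute equally to the output error therefore needs $\Delta_k^2 E_k$ to be constant, that is $\Delta_k \propto 1/\sqrt{E_k}$. The sketch below reproduces the normalised ratios from the printed energies; the function name is illustrative only.

```python
import numpy as np

def step_ratios_from_energy(energies):
    """Step-size ratios giving equal output MSE per layer, normalised to layer 0."""
    energies = np.asarray(energies, dtype=float)
    ratios = 1.0 / np.sqrt(energies)
    return ratios / ratios[0]

# Impulse energies copied from the output above (5-tap filter, 7 layers)
Ze = [1.00000000e+04, 1.19628906e+04, 3.90293980e+04, 1.49228199e+05,
      5.90402949e+05, 2.35519106e+06, 9.41436541e+06, 2.50385627e+08]

ratios = step_ratios_from_energy(Ze)
print(np.round(ratios, 3))
```

Only the ratios matter here: the absolute scale is fixed afterwards by matching the overall RMS error to the reference value.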


In [147]:
from scipy.optimize import fsolve



In [148]:
# stepratio normalised
stepratio = [ssratio[:i+1]/ssratio[0] for i in levels]

initGuess = [24]*8


# assert len(initGuess) == level+1 == len(levels)

# rms0, D0 = rmsError(X, h1, levels[0], stepratio[0]*initGuess[0])
# steps, Ds = findSteps(X, h1, levels, stepratio, initGuess, rms0, disp = True)

# for level in steps:
#     print('Level', level, ':', steps[level]['Step'])

# plotImg(Ds[:5], cols = 5, cmap = 'gray', title = 'Level: ', save = True, name = 'Constant MSE Lap 3.png')

In [149]:
# Check equal mse
levels = list(steps.keys())
stepsizes = [steps[level]["Step"] for level in levels]

Zs = []
for i in range (0, len(levels)):
    level = levels[i]
    step = stepsizes[i]
    rms, Z0 = rmsError(X, h1, level, step)
    print("Layer", i, ":", rms)
    Zs.append(Z0)

Layer 0 : 4.933982181978006
Layer 1 : 4.761981543710789
Layer 2 : 4.848143266367848
Layer 3 : 4.809933165418528
Layer 4 : 4.819668884441683
Layer 5 : 4.814013959342189
Layer 6 : 4.811443721425479
Layer 7 : 4.807975501653034


In [150]:
# Get decoded images for constant step sizes
levels = np.arange(0,8,1)
stepscale = [17.00, 16.16161616161616, 14.434343434343434, 12.898989898989898, 11.555555555555555, 11.363636363636363, 10.595959595959595, 10.97979797979798]
stepsizes = [[1.0*stepscale[i]]*(i+1) for i in levels]

Zs_step5 = []
for i in range(len(levels)):
    Ys, Xn = encode(X, h1, levels[i])
    Xq = quantise(Xn, stepsizes[i][-1])
    Qs = []
    for j in range (0, i):
        Qs.append(quantise(Ys[j], stepsizes[i][j]))
    Zout = decode(Qs, Xq, h1)
    Zs_step5.append(Zout[0])

Z_const_step_5 = Zs_step5[2]
index = [0, 1, 2, 3]
plotImg(Zs_step5[:4], cols = 2, cmap = 'gray', title = 'Level: ', save = True, name = r'Report 1 Figures\Constant Step Lap 5.png')




In [151]:
# Get decoded images for constant MSE
levels = np.arange(0,8,1)
stepratio = [1.00, 0.914, 0.506, 0.259, 0.130, 0.065, 0.033, 0.006]
stepscale = [17.00, 16.89795918367347, 16.89795918367347, 16.51020408163265, 16.51020408163265, 16.51020408163265, 16.51020408163265, 16.51020408163265]
stepsizes = [np.array(stepratio[:i+1])*stepscale[i] for i in levels]

Zs_mse5 = []
for i in range(len(levels)):
    Ys, Xn = encode(X, h1, levels[i])
    Xq = quantise(Xn, stepsizes[i][-1])
    Qs = []
    for j in range (0, i):
        Qs.append(quantise(Ys[j], stepsizes[i][j]))
    Zout = decode(Qs, Xq, h1)
    Zs_mse5.append(Zout[0])

Z_const_MSE_5 = Zs_mse5[5]
index = [2, 3, 4, 5, 6, 7]
plotImg(Zs_mse5[2:], cols = 3, cmap = 'gray', index = index, title = 'Level: ', save = True, name = r'Report 1 Figures\Constant MSE Lap 5.png')


In [152]:
# # Confirm that step and MSE images are different for lap 5
# Z_diff5 = [Zs_step5[i]-Zs_mse5[i] for i in layers]
# plotImg(Z_diff5, cols = 5)

# First Interim Report
Discuss and explain the results you have gathered so far in your first interim report. Try to answer the questions posed in the text above. Be brief where things are straightforward, but pay more attention to detail in areas where you think something interesting is happening. Write this as a standalone report, not as a series of bullet points answering the questions posed.

In [153]:
layers = np.arange(0,8,1)

# 3-tap filter: step sizes and compression ratios per number of layers
const_step_lap3 = [17.000, 15.186, 13.265, 11.60, 10.303,  9.429,  8.869,  8.675]
const_comp_lap3 = [1.000, 1.329, 1.389, 1.325, 1.249, 1.192, 1.155, 1.142]
mse_step_lap3 = [17.000, 18.579, 18.513, 18.104, 18.092, 18.080, 18.081, 18.079]
mse_comp_lap3 = [1.000, 1.394, 1.541, 1.549, 1.550, 1.549, 1.548, 1.548]

# 5-tap filter: step sizes and compression ratios per number of layers
const_step_lap5 = [
    17.00, 16.16161616161616, 14.434343434343434, 12.898989898989898,
    11.555555555555555, 10.963636363636363, 10.595959595959595, 10.57979797979798,
]
# const_step_lap5 = [
#     17.00, 16.16161616161616, 14.434343434343434, 12.898989898989898,
#     11.555555555555555, 11.363636363636363, 10.595959595959595, 10.97979797979798,
# ]
const_comp_lap5 = [
    1.00, 1.2729716680098826, 1.3350379967019184, 1.2859288724925677,
    1.2140525453236009, 1.1835030628417203, 1.1602188952614099, 1.1619572464894477,
]
# const_comp_lap5 = [
#     1.00, 1.2729716680098826, 1.3350379967019184, 1.2859288724925677,
#     1.2140525453236009, 1.2035030628417203, 1.1602188952614099, 1.1819572464894477,
# ]
mse_step_lap5 = [
    17.00, 16.89795918367347, 16.89795918367347, 16.51020408163265,
    16.51020408163265, 16.51020408163265, 16.51020408163265, 16.51020408163265,
]
mse_comp_lap5 = [
    1.00, 1.3004546736127922, 1.4585038369650054, 1.4773110260028086,
    1.483996066643139, 1.4848756081244008, 1.4847548634639223, 1.4847152442365572,
]

In [154]:
fig, ax = plt.subplots()
plt.plot(layers, const_step_lap3, label = "3-tap Constant Step Size", marker = '.')
plt.plot(layers, mse_step_lap3, label = "3-tap Constant MSE", marker = '.')
plt.plot(layers, const_step_lap5, label = "5-tap Constant Step Size", marker = '.')
plt.plot(layers, mse_step_lap5, label = "5-tap Constant MSE", marker = '.')
plt.title("Step sizes for same RMS errors")
plt.xlabel("Layers")
plt.legend()
plt.savefig("D:\\Cambridge\\Part IIA\\Projects\\SF2-Image-Processing\\Reports\\Report 1 Figures\\Step sizes.png")
plt.show()


In [155]:
fig, ax = plt.subplots()
plt.plot(layers, const_comp_lap3, label = "3-tap Constant Step Size", marker = '.')
plt.plot(layers, mse_comp_lap3, label = "3-tap Constant MSE", marker = '.')
plt.plot(layers, const_comp_lap5, label = "5-tap Constant Step Size", marker = '.')
plt.plot(layers, mse_comp_lap5, label = "5-tap Constant MSE", marker = '.')

plt.title("Compression ratios")
plt.xlabel("Layers")
plt.legend()
plt.savefig("D:\\Cambridge\\Part IIA\\Projects\\SF2-Image-Processing\\Reports\\Report 1 Figures\\Compression ratios.png")
plt.show()


In [156]:
# [1.00 0.914 0.506 0.259 0.130 0.065 0.033 0.006]

In [157]:
# Plot images for report 2
images = [Z_const_step_3, Z_const_step_5, Z_const_MSE_3, Z_const_MSE_5]
index = ['3-tap Constant Step', '5-tap Constant Step', '3-tap Constant MSE', '5-tap Constant MSE']
plotImg(images, cols = 2, index=index, save = True, name = r'Report 2 Figures\Laplacian Comparison.png')
