Osnabrück University - Computer Vision (Winter Term 2020/21) - Prof. Dr.-Ing. G. Heidemann, Ulf Krumnack, Axel Schaffland, Ludwig Schallner, Artem Petrov

# Exercise Sheet 09: Filter, Sampling, and Template Matching

## Introduction

This week's sheet should be solved and handed in before the end of **Saturday, January 16, 2021**. If you need help (and Google and other resources were not enough), feel free to contact your group's designated tutor or whomever of us you run into first. Please upload your results to your group's Stud.IP folder.

## Exercise 0: Math recap (Covariance) [0 Points]

This exercise is supposed to be very easy, does not give any points, and is voluntary. There will be a similar exercise on every sheet. It is intended to revise some basic mathematical notions that are assumed throughout this class and to allow you to check if you are comfortable with them. Usually you should have no problem answering these questions offhand, but if you feel unsure, this is a good time to look them up again. You are always welcome to discuss questions with the tutors or in the practice session. Also, if you have a (math) topic you would like to recap, please let us know.

**a)** What does *covariance* express?

YOUR ANSWER HERE

**b)** Provide a formula to compute the covariance of two 1-dimensional datasets. How can it be generalized to the $n$-dimensional case?

YOUR ANSWER HERE

**c)** Create and plot two (1-dimensional) datasets with low covariance (use `plt.scatter`). Then do the same for two datasets with high covariance.

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# YOUR CODE HERE

## Exercise 1: Filter design [6 points]

**a)** Create and plot the kernels of box filter and binomial filter in frequency space. Vary the kernel size. What do you observe?

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import binom

kernel_size = 13 # vary this
image_size = 100

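# compute the kernels, embedded in the top-left corner of an otherwise empty image
box_kernel = np.zeros((image_size, image_size))
box_kernel[:kernel_size, :kernel_size] = 1 / kernel_size ** 2

# 1D binomial coefficients (one row of Pascal's triangle), normalized to sum to 1
binomial_1d = binom(kernel_size - 1, np.arange(kernel_size))
binomial_1d /= binomial_1d.sum()
binomial_kernel = np.zeros((image_size, image_size))
binomial_kernel[:kernel_size, :kernel_size] = np.outer(binomial_1d, binomial_1d)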

box_frequencies = np.fft.fftshift(np.fft.fft2(box_kernel))
binomial_frequencies = np.fft.fftshift(np.fft.fft2(binomial_kernel))

# plot kernel in frequency space
plt.figure(figsize=(12, 6))
plt.gray()
plt.subplot(1, 2, 1)
plt.title(f"Box filter (size={kernel_size})")
plt.imshow(np.abs(box_frequencies))
plt.plot(np.arange(image_size), np.abs(box_frequencies)[image_size // 2] * image_size)
plt.ylim(0, image_size - 1)
plt.subplot(1, 2, 2)
plt.title(f"Binomial filter (size={kernel_size})")
plt.imshow(np.abs(binomial_frequencies))
plt.plot(np.arange(image_size), np.abs(binomial_frequencies)[image_size // 2] * image_size)
plt.ylim(0, image_size - 1)
plt.show()

YOUR ANSWER HERE

**b)** Implement a low pass filter and apply it to the given image with different cutoff frequencies $F_\max$. What do you observe? Explain that observation and discuss how to improve the result.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import imageio

f_max = 30  # cutoff frequency (radius in the centered spectrum), vary this
image = imageio.imread('imageio:camera.png')

def low_pass_filter(image, f_max):
    # 2D discrete Fourier transform, shifted so that frequency 0 lies in the center
    ft = np.fft.fftshift(np.fft.fft2(image))

    # the distance of a coefficient from the center corresponds to its frequency
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    distance = np.sqrt((y - rows // 2) ** 2 + (x - cols // 2) ** 2)

    # ideal low pass: keep frequencies up to the cutoff, remove all higher ones
    ft[distance > f_max] = 0

    # transform back with the inverse FT; the imaginary part is only numerical noise
    return np.real(np.fft.ifft2(np.fft.ifftshift(ft)))

filtered_image = low_pass_filter(image, f_max=f_max)

# plot the original image, the filtered image, and their difference
plt.figure(figsize=(18,5))
plt.gray()
plt.subplot(1,3,1); plt.title("Original image")
plt.imshow(image)
plt.subplot(1,3,2); plt.title(rf"Lowpass-filtered image ($F_\max$={f_max})")
plt.imshow(filtered_image)
plt.subplot(1,3,3); plt.title("Difference")
plt.imshow(filtered_image-image)
plt.show()

YOUR ANSWER HERE

**c)** What is a good kernel size for a Gaussian filter? Justify your answer.

YOUR ANSWER HERE

**d)** Describe impulse ("salt and pepper") noise and explain what kind of filter should be used to remove such noise.

YOUR ANSWER HERE

## Exercise 2: Sampling theorem [4 points]

**a)** Express the statement of the sampling theorem in your own words. Explain its relevance.

The **sampling theorem** specifies the sampling rate at which a discrete sequence of samples captures all the information in a continuous signal.  
It therefore tells us how to sample a signal without losing information.

The sampling theorem states that a signal has to be sampled at a frequency larger than twice the maximum frequency 
contained in the signal in order not to lose any information. If a system uniformly samples an analog signal at a rate that exceeds the signal’s highest 
frequency by more than a factor of two, the original analog signal can be perfectly recovered from the discrete values produced by sampling.
In contrast, if we sample an analog signal at a lower rate, we will in general not be able to perfectly reconstruct the original signal.

In modern technology, we constantly have to deal with analog signals (e.g. sound picked up by microphone or light entering a digital camera),  
but in order to perform computations with them, we need digital values. Therefore, the sampling theorem is of great relevance if we don't want to lose information.

$\rightarrow$ A signal with a highest frequency of $f_{max}$ can be exactly reconstructed if the sampling frequency is $> 2 f_{max}$
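
The statement can be checked numerically. The following is a minimal sketch, not part of the exercise: the helper `dominant_frequency` and the chosen frequencies (a 10 Hz sine sampled at 50 Hz and at 12 Hz) are assumptions made for illustration. It samples a sine wave and reports the strongest frequency found in the samples.

```python
import numpy as np

def dominant_frequency(signal_frequency, sampling_rate, duration=1.0):
    """Sample a sine wave and return the strongest frequency in the samples."""
    n_samples = int(sampling_rate * duration)
    t = np.arange(n_samples) / sampling_rate
    samples = np.sin(2 * np.pi * signal_frequency * t)
    # magnitude spectrum of the real-valued samples and the matching frequency axis
    spectrum = np.abs(np.fft.rfft(samples))
    frequencies = np.fft.rfftfreq(n_samples, d=1 / sampling_rate)
    return frequencies[np.argmax(spectrum)]

# 50 Hz > 2 * 10 Hz: the sampling theorem is satisfied
well_sampled = dominant_frequency(signal_frequency=10, sampling_rate=50)
# 12 Hz < 2 * 10 Hz: the sampling theorem is violated
under_sampled = dominant_frequency(signal_frequency=10, sampling_rate=12)

print(well_sampled, under_sampled)
```

With sufficient sampling the 10 Hz component is recovered, while the undersampled version appears as a 2 Hz signal: the original frequency is no longer distinguishable from a lower one.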

**b)** Assume you are given a document printed on a 600 dpi (dots per inch) printer. If you want to scan this document, what resolution should you choose to avoid aliasing effects?

$f_{max} = 600$  
Based on the sampling theorem, the resolution (sampling rate) should be $> 2 \cdot 600 = 1200$ dpi.

**c)** What is aliasing? Explain the Moiré effect shown in the following cell.

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interactive, fixed
from IPython.display import display

def moire(exp):
    return np.sin(d ** exp)

nx, ny = (600, 300)
x = np.linspace(0, nx * np.pi / ny, nx)
y = np.linspace(0, np.pi, ny)
xv, yv = np.meshgrid(x, y)
d = np.sqrt(xv ** 2 + yv ** 2)

plt.figure(figsize=(30, 6))

plt.subplot(2, 4, 1); plt.imshow(moire(1))
plt.subplot(2, 4, 2); plt.imshow(moire(2))
plt.subplot(2, 4, 3); plt.imshow(moire(3))
plt.subplot(2, 4, 4); plt.imshow(moire(4))
plt.subplot(2, 4, 5); plt.imshow(moire(5))
plt.subplot(2, 4, 6); plt.imshow(moire(6))
plt.subplot(2, 4, 7); plt.imshow(moire(7))
plt.subplot(2, 4, 8); plt.imshow(moire(8))

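**Aliasing effects** are errors that occur when a signal containing frequencies higher than half of the sampling rate is sampled, which violates the sampling theorem. Such frequencies cannot be represented and show up as spurious lower frequencies instead.

The **moiré effect** is an optical effect and an example of aliasing. It is a visual pattern that appears when a set of lines or dots  
is superimposed on another set of lines or dots, where the sets differ in relative size, angle, or spacing. In the cell above, the rings of $\sin(d^{exp})$ get finer with increasing distance $d$ from the upper left corner when the exponent is larger than 1. Where they become finer than the pixel grid can resolve, the rings and the pixel grid act as two such superimposed patterns, and additional ring structures appear that are not part of the function.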
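
For the cell above, one can roughly estimate where aliasing sets in. The sketch below is an illustration under simplifying assumptions: it only looks at the radial direction, treats the pixel spacing as `np.pi / (ny - 1)` in both directions, and uses the local frequency $exp \cdot d^{exp-1} / (2\pi)$ of $\sin(d^{exp})$. The helper name `aliasing_radius` is hypothetical.

```python
import numpy as np

ny = 300
pixel_spacing = np.pi / (ny - 1)                 # step of np.linspace(0, np.pi, ny)
nyquist = 1 / (2 * pixel_spacing)                # highest representable frequency
d_max = np.sqrt((2 * np.pi) ** 2 + np.pi ** 2)   # largest distance d in the image

def aliasing_radius(exp):
    """Distance d beyond which sin(d ** exp) is undersampled, or None if never."""
    if exp == 1:
        # constant frequency 1 / (2 * pi), far below the Nyquist limit
        return None
    # solve exp * d ** (exp - 1) / (2 * pi) = nyquist for d
    radius = (2 * np.pi * nyquist / exp) ** (1 / (exp - 1))
    return radius if radius < d_max else None

for exp in range(1, 9):
    print(exp, aliasing_radius(exp))
```

Under these assumptions the plots for small exponents stay below the limit everywhere, while for larger exponents the estimated radius moves toward the upper left corner, which matches where the spurious ring patterns start to appear.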

**d)** Gaussian pyramid: motivate the construction process of a Gaussian pyramid with the sampling theorem.

YOUR ANSWER HERE

## Exercise 3: Template Matching [4 points]

**a)** Explain in your own words the idea of *template matching*. Is it a data or model based approach? What are the advantages and disadvantages? In what situations would you apply template matching?

The idea is to take a prototypical small image of what you are looking for (template) in the image and move that template  
across the image just as in convolution to compare it to the underlying image patch with the goal of finding the part of the image that matches the template.  

It's a **model-based** approach - the template is a model of what we are looking for in the image.

**Advantages**:
- robust against noise
- efficient implementation as convolution

**Disadvantages**:
- little robustness against variation of viewpoint / illumination
- gray value scaling can cause problems

It is best used in situations where little variation of viewpoint and illumination is to be  
expected, for example as part of quality control in manufacturing.
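
The sliding comparison described above can be sketched in a few lines. This is a deliberately naive illustration, not the method used later on this sheet: the function name `match_template_mad`, the random test image, and the choice of the mean absolute difference as similarity measure are assumptions for the example.

```python
import numpy as np

def match_template_mad(image, template):
    """Slide the template over the image and return the mean absolute
    difference at every position where the template fits completely."""
    rows, cols = image.shape
    m, n = template.shape
    scores = np.empty((rows - m + 1, cols - n + 1))
    for y in range(scores.shape[0]):
        for x in range(scores.shape[1]):
            patch = image[y:y + m, x:x + n]
            scores[y, x] = np.mean(np.abs(patch - template))
    return scores

# synthetic test: cut the template out of a random image and find it again
rng = np.random.default_rng(0)
image = rng.random((40, 50))
template = image[12:20, 30:41].copy()

scores = match_template_mad(image, template)
# the best match is the position with the smallest difference
best_y, best_x = np.unravel_index(np.argmin(scores), scores.shape)
print(best_y, best_x)
```

The explicit double loop makes the idea visible but is slow for large images; library routines compute the same kind of similarity map much more efficiently.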

**b)** Explain the ideas of *mean absolute difference* and *correlation coefficient*. Name pros and cons.

**MAD**
- measure for similarity between template $T(i, j)$ and image $g(x, y)$
- idea: mean difference of gray values: $MAD(x, y) = \frac{1}{mn} \cdot \sum_{ij} | g(x+i, y+j) - T(i, j)|$
- **advantages:** robust to noise, easy to compute
- **disadvantages:** gray value scaling can cause problems, sensitive to rotation

**Correlation Coefficient**
- computes a correlation coefficient to measure similarity between the image and the template
- $C_{g, T}(x, y) = \frac{\sigma_{g, T}}{\sigma_g \cdot \sigma_T}$ where $\sigma_{g, T}$ is the covariance between the image patch of $g$ at $(x, y)$ and the template $T$,
  and $\sigma_g, \sigma_T$ are the standard deviations of that patch and of $T$
- the possible values range from $-1$ to $1$, where $+1$ and $-1$ indicate the strongest possible positive and negative correlation and $0$ means that patch and template do not correlate
- **advantages:** robust to gray value scaling and noise
- **disadvantages:** not as efficient to compute as MAD
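
The different behavior of the two measures under gray value scaling can be illustrated on a single patch. The following is a small sketch under assumed test data: the function names, the random template, and the scaling factors 0.5 and 0.3 are chosen for the example only.

```python
import numpy as np

def mad(patch, template):
    """Mean absolute difference: 0 means identical, larger means less similar."""
    return np.mean(np.abs(patch - template))

def correlation_coefficient(patch, template):
    """Covariance of patch and template divided by both standard deviations."""
    patch_centered = patch - patch.mean()
    template_centered = template - template.mean()
    covariance = np.mean(patch_centered * template_centered)
    return covariance / (patch.std() * template.std())

rng = np.random.default_rng(1)
template = rng.random((8, 8))
# same pattern as the template, but with lower contrast and higher brightness
rescaled = 0.5 * template + 0.3

print(mad(template, template), correlation_coefficient(template, template))
print(mad(rescaled, template), correlation_coefficient(rescaled, template))
```

For the rescaled patch the MAD is clearly larger than zero although the pattern is unchanged, whereas the correlation coefficient stays at 1 up to rounding, because subtracting the mean and dividing by the standard deviations removes brightness and contrast changes.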


## Exercise 4: Where is Waldo [6 points]

In the two images `wheresWaldo1.jpg` and `wheresWaldo2.jpg`, Waldo is hiding in the midst of a busy crowd. He always wears the same red and white striped sweater and hat. However, he may be carrying something that varies from scene to scene. Use template matching with the given Waldo templates (`waldo*.jpg`) to locate Waldo. Highlight Waldo in the scene and indicate which template was matched.

**Hints:**
* You may use built-in functions to solve this exercise.
* The images are quite large! You may start by testing your code on a small image patch before applying it to the full scene.
* You may not achieve a perfect match. Analyse the problems you encounter and think how you can improve your result.

If you intend to use the [OpenCV library](https://opencv.org/) in this task, use the following command to install an appropriate version (we will also need this at some later exercise sheet):
```sh
conda install --channel conda-forge opencv
```

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import cv2 

waldos = [plt.imread('images/waldo/wheresWaldo{}.jpg'.format(i)) for i in range(1, 3)]
templates = [plt.imread('images/waldo/waldo{}.jpg'.format(i)) for i in range(0, 6)]
    
# YOUR CODE HERE
thresh = 0.5
for i, img in enumerate(waldos):
    plt.figure(figsize=(40, 18))
    # top row: the scene with the matches of each template marked
    for j, template in enumerate(templates):
        # TM_CCOEFF_NORMED worked best, the other methods were too slow or detected too much
        # --> returns a similarity map (map of correlation coefficients)
        matching = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
        # only strong positive correlations count as a match
        matched_points = np.where(matching >= thresh)
        h, w = template.shape[:2]
        last_point = None
        tmp = img.copy()

        # mark every match; pt is the top-left corner (x, y) of the matched patch
        for pt in zip(*matched_points[::-1]):
            cv2.rectangle(tmp, pt, (pt[0] + w, pt[1] + h), (128, 0, 128), 2)
            last_point = pt
        plt.subplot(2, len(templates), j + 1)
        if i == 0:
            plt.imshow(tmp[500:1000, 1400:1600])
        elif last_point is not None:
            # zoom in on the last match, clamping the window at the image border
            x0, y0 = max(last_point[0] - 100, 0), max(last_point[1] - 100, 0)
            plt.imshow(tmp[y0:last_point[1] + 100, x0:last_point[0] + 100])
        else:
            plt.imshow(tmp)
    # bottom row: the templates themselves
    for j, temp in enumerate(templates):
        plt.subplot(2, len(templates), len(templates) + j + 1)
        plt.imshow(temp)