Osnabrück University - Computer Vision (Winter Term 2022/23) - Prof. Dr.-Ing. G. Heidemann, Ulf Krumnack

# Exercise Sheet 07: Hough Transform/Fourier

## Introduction

This week's sheet should be solved and handed in before the end of **Sunday, January 8, 2023**. If you need help (and Google and other resources were not enough), feel free to use the Stud.IP forum. Please upload your results to your group's Stud.IP folder.

## Assignment 1: Implementing Hough transform (6 points)

In [None]:
# Auxiliary functions for this exercise
import numpy as np
from skimage.transform import hough_line, resize

steps = lambda p,q : max(map(lambda x,y: abs(x-y), p, q))+1
coords = lambda p,q,s : tuple([np.linspace(x,y,s,dtype=np.uint16) for x,y in zip(p,q)])

def point(img, p):
    """Insert a point in the black/white image at position p
    
    Args:
        img (ndarray): Input image.
        p (tuple): Coordinate of point.
    
    Returns:
    
    """
    img[p] = 1

def line(img, p, q):
    """Insert a line from p to q in the black/white image
    
    Args:
        img (ndarray): Input image.
        p (tuple): Coordinate of start position.
        q (tuple): Coordinate of end position.
    
    Returns:
    
    """
    img[coords(p,q,steps(p,q))] = 1

def polygon(img, vertices):
    """Insert a (closed) polygon given by a list of points into the black/white image
    
    Args:
        img (ndarray): Input image.
        vertices (list): List of coordinate tuples.
    
    Returns:
    
    """
    for p, q in zip(vertices, vertices[1:]+vertices[0:1]):
        line(img,p,q)

**a)** Implement your own version of linear Hough transform and compare the results with the function `skimage.transform.hough_line`.
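
Before writing the full transform, the voting idea can be tried on a handful of points. The following is only a minimal sketch, not the reference solution: it assumes the Hesse normal form $d = x\cos\theta + y\sin\theta$ with $x$ as column and $y$ as row index, and the helper name `hough_votes` is made up for this illustration.

```python
import numpy as np

def hough_votes(points, thetas):
    """Return d = x*cos(theta) + y*sin(theta) for every (row, col) point.

    Hypothetical helper; the result has shape (n_points, n_thetas).
    """
    pts = np.asarray(points, dtype=float)
    rows, cols = pts[:, 0:1], pts[:, 1:2]
    return cols * np.cos(thetas) + rows * np.sin(thetas)

# Five pixels on the horizontal line row = 20.
points = [(20, c) for c in (10, 25, 40, 55, 70)]
thetas = np.deg2rad(np.arange(180))  # 0, 1, ..., 179 degrees
votes = np.round(hough_votes(points, thetas)).astype(int)

# For each angle, count how many distinct distance bins the points vote for.
distinct_bins = np.array([len(np.unique(votes[:, k])) for k in range(len(thetas))])
best = int(np.argmin(distinct_bins))
print(best, votes[0, best])  # angle index and distance where all votes coincide
```

All five points fall into the same distance bin only at the angle of the line's normal, which is why collinear pixels produce a peak in the accumulator.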

In [None]:
from skimage.transform import hough_line
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import warnings
warnings.filterwarnings('ignore')

def my_hough_line(img, angles=180):
    """
    Apply linear Hough transform to the given image.
    
    Args:
        img (numpy.ndarray): The binary image to transform
        angles (int): The number of angles to consider.
    Result:
        accumulator (numpy.ndarray): The accumulator space.
    """
    # BEGIN SOLUTION
    # Calculate maximal length in the array.
    d_max = int(np.ceil(np.linalg.norm(img.shape)))
    # Initialize accumulator space.
    accumulator = np.zeros((2 * d_max, angles), dtype=np.uint64)
    # Create array for all angles we want to look at.
    theta = np.arange(angles, dtype='int')
    # Convert degrees to radians.
    theta_radiant = np.deg2rad(theta)
    # For each white pixel in the image, 
    # calculate the distance to the origin 
    # and increment accumulator space for the distance and all angles.
    for r, c in np.argwhere(img):
        d = np.floor(c * np.cos(theta_radiant) + r * np.sin(theta_radiant))
        d = d.astype(int)
        accumulator[d + d_max, theta] += 1
    # END SOLUTION
    return accumulator


img = np.zeros((100,100))

# You may try different paintings here:
#point(img, (10,10))
#line(img,(10,20),(70,20))
polygon(img,[(20,10),(80,50),(80,80),(30,60)])


fig, [ax1, ax2, ax3] = plt.subplots(1,3, figsize=(12,4))
plt.gray()

# original image
ax1.set_title('Image'); 
ax1.imshow(img, origin = 'lower')

# skimage
out, angles, d = hough_line(img)

out_resized = resize(out, (out.shape[0], out.shape[0]), anti_aliasing=True, preserve_range=True)
ax2.set_title('Hough transform (skimage)');
ax2.set_xlabel('Angles (degrees)')
ax2.set_ylabel('Distance (pixels)')
ax2.imshow(np.log(1 + out_resized), origin = 'lower', cmap='gray')
ax2.set_yticks(np.linspace(0, out.shape[0], 7))
ax2.set_yticklabels((-1 * np.linspace(d[-1], d[0], 7)).astype(int))
ax2.set_xticks(np.linspace(0, out.shape[0], 5))
ax2.set_xticklabels(np.linspace(np.rad2deg(angles[0]), np.rad2deg(angles[-1]), 5).astype(int))

# own implementation
my_out = my_hough_line(img)

# my_hough_line samples angles starting at 0 degrees (skimage starts at -90), so relabel the angle axis
my_out_resized = resize(my_out, (out.shape[0], out.shape[0]), anti_aliasing=True, preserve_range=True)
ax3.set_title('Hough transform (own implementation)');
ax3.set_xlabel('Angles (degrees)')
ax3.set_ylabel('Distance (pixels)')
ax3.imshow(np.log(1 + my_out_resized), origin = 'lower', cmap='gray')
ax3.set_yticks(np.linspace(0, out.shape[0], 7))
ax3.set_yticklabels((-1 * np.linspace(d[-1], d[0], 7)).astype(int))
ax3.set_xticks(np.linspace(0, out.shape[0], 5))
ax3.set_xticklabels(np.linspace(0, 180, 5).astype(int))

plt.show()

**b)** Under which conditions is (linear) Hough transform invertible? Try to implement an inverse linear Hough transform. 

You could create a line for each point in accumulator space by doing the following: Compute angle and distance of the line represented by the current point in accumulator space. Add the value of this point to all image pixels which belong to the corresponding line. Finally you could threshold the image. 

However, other solutions are feasible, too.


A point in the Hough space (accumulator space) can be seen as an indicator for a line in the original space. To invert the Hough transform, one has to accumulate these indicators, i.e. lines in the pixel space. An exact inversion will usually not be possible, due to discretization effects.
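
To make the "one accumulator point, one line" correspondence concrete, here is a small sketch that rasterizes a line from its Hesse parameters in a vectorized way. It is an illustration under stated assumptions, not part of the expected solution: `line_mask` and its `tol` parameter are hypothetical names, and $x$ is taken as column, $y$ as row.

```python
import numpy as np

def line_mask(shape, distance, angle, tol=0.5):
    """Boolean mask of all pixels whose distance to the line is at most tol.

    Hypothetical helper: the line is x*cos(angle) + y*sin(angle) = distance,
    with x = column index and y = row index.
    """
    rows, cols = np.indices(shape)
    return np.abs(cols * np.cos(angle) + rows * np.sin(angle) - distance) <= tol

# The accumulator point (angle = 90 degrees, distance = 20) is the row y = 20.
mask = line_mask((100, 100), distance=20, angle=np.pi / 2)
print(mask.sum(), mask[20].all())
```

Note that the mask always spans the whole image: the accumulator does not store where along the line the voting pixels were, which is one reason why line segments cannot be recovered exactly.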

In [None]:
%matplotlib inline 
from skimage.transform import hough_line
import matplotlib.pyplot as plt
import numpy as np

# BEGIN SOLUTION
def my_line(img, distance, angle, value=1):
    """Draw a line provided in Hesse form, i.e. by normal angle and distance.
    Not very efficient, but simple.
    
    Args:
        img (numpy.ndarray): An image where the line should be added
        distance (float): The distance from the origin
        angle (float): The angle of the line
        value (int): The strength of the line
    """
    s = np.sin(angle)
    c = np.cos(angle)
    if abs(c) > abs(s):
        x0 = distance/c
        for y in range(img.shape[0]):
            x = int(np.round(x0 - y*s/c))
            if x >= 0 and x < img.shape[1]:
                img[y,x] += value
    else:
        y0 = distance/s
        for x in range(img.shape[1]):
            y = int(np.round(y0 - x*c/s))
            if y >= 0 and y < img.shape[0]:
                img[y,x] += value
# END SOLUTION

def my_inverse_hough_line(accumulator, shape):
    """Compute an inverse Hough transform, i.e. compute the image from the accumulator space.
    
    Args:
        accumulator (numpy.ndarray)
        shape (tuple): The shape (dimensions) of the image to be created

    Result:
        img (numpy.ndarray): The image created from the accumulator space.
    """
    img = np.zeros(shape, np.uint64)
    
    # BEGIN SOLUTION    
    for d,a in np.argwhere(accumulator):
        val = accumulator[d,a]
        distance = d-accumulator.shape[0]//2
        angle = (a/accumulator.shape[1] - .5) * np.pi
        my_line(img,distance,angle,val)
    # END SOLUTION
    return img

img = np.zeros((100,100))

#point(img, (10,10))
#line(img,(10,20),(70,20))
#line(img,(20,10),(20,70))
polygon(img,[(10,30),(50,50),(10,70)])

plt.figure(figsize=(12, 4))
plt.gray()
plt.subplot(1,3,1) ; plt.title('Image'); plt.imshow(img, origin = 'lower')

out, angles, d = hough_line(img)
out_resized = resize(out, (out.shape[0], out.shape[0]), anti_aliasing=True, preserve_range=True)
ax = plt.subplot(1,3,2)
ax.set_title('Hough transform (skimage)');
ax.set_xlabel('Angles (degrees)')
ax.set_ylabel('Distance (pixels)')
ax.imshow(np.log(1 + out_resized), origin = 'lower', cmap='gray')
ax.set_yticks(np.linspace(0, out.shape[0], 7))
ax.set_yticklabels((-1 * np.linspace(d[-1], d[0], 7)).astype(int))
ax.set_xticks(np.linspace(0, out.shape[0], 5))
ax.set_xticklabels(np.linspace(np.rad2deg(angles[0]), np.rad2deg(angles[-1]), 5).astype(int))

img2 = my_inverse_hough_line(out, img.shape)
plt.subplot(1,3,3) ; plt.title('Inverse Hough transform');
plt.imshow(img2, origin = 'lower')
plt.show()

## Assignment 2: Circular Hough (5 points)

**a)** Explain the idea of circular Hough transform. How many dimensions does the accumulator space have and what do they represent? What practical problems do you expect when computing circular Hough transform?

Every edge pixel votes for the parameters of all circles that pass through it; circles in the image then show up as maxima in the accumulator space. The accumulator space has three dimensions $x_c$, $y_c$, and $r$, with $x_c$ and $y_c$ spanning the image plane and $r$ being the radius. In practice, one often searches for circles of a fixed radius, which reduces the accumulator space to the image plane.

Not limiting the search to a fixed radius makes circular Hough transform computationally expensive: for each edge pixel and each candidate radius, the counters along the circle centered at this pixel need to be incremented.
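
The fixed-radius case can be sketched in a few lines. This is a hedged illustration on a synthetic ring, not the intended solution; `ring_offsets` and `circular_hough_fixed_radius` are hypothetical helper names introduced here.

```python
import numpy as np

def ring_offsets(radius):
    """Integer (drow, dcol) offsets approximating a circle of the given radius."""
    t = np.linspace(0, 2 * np.pi, 8 * radius, endpoint=False)
    offs = np.stack([np.round(radius * np.sin(t)),
                     np.round(radius * np.cos(t))], axis=1)
    return np.unique(offs.astype(int), axis=0)  # drop duplicate offsets

def circular_hough_fixed_radius(edges, radius):
    """Each edge pixel votes for all candidate centers at distance `radius`."""
    acc = np.zeros(edges.shape, dtype=np.int64)
    offs = ring_offsets(radius)
    for r, c in np.argwhere(edges):
        rr, cc = r + offs[:, 0], c + offs[:, 1]
        inside = (rr >= 0) & (rr < acc.shape[0]) & (cc >= 0) & (cc < acc.shape[1])
        acc[rr[inside], cc[inside]] += 1  # offsets are unique, so += is safe
    return acc

# Synthetic edge image: one ring of radius 10 centered at (row 30, col 25).
edges = np.zeros((60, 60), dtype=bool)
offs = ring_offsets(10)
edges[30 + offs[:, 0], 25 + offs[:, 1]] = True

acc = circular_hough_fixed_radius(edges, 10)
center = tuple(int(v) for v in np.unravel_index(np.argmax(acc), acc.shape))
print(center, acc.max())
```

For a range of radii, one such plane per radius would be stacked, which gives the three-dimensional accumulator and multiplies the cost by the number of radii.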

**b)** Now implement circular Hough transform. First think about how the accumulator space should be represented. You may provide an additional parameter to only look for circles of a given radius (or a range of radii). Then apply it to the image `xmas.png` to detect the balls. Mark them in the original image. (Bonus task: also detect the presents using linear Hough transform).

In [None]:
%matplotlib inline
from skimage.transform import hough_line
from imageio.v2 import imread
import matplotlib.pyplot as plt
import numpy as np

# load the image
img = imread('images/xmas.png')

# BEGIN SOLUTION

# The radii of circles to be detected by the hough transform
radii = range(5, 15)

# xx and yy are 200x200 tables containing the x and y coordinates as values
# mgrid is a mesh creation helper
xx, yy = np.mgrid[:200, :200]
# d2 contains the squared distance to the (100, 100) point
# we are just using the circle equation learnt at school
d2 = (xx - 100)**2 + (yy - 100)**2

# create circle shapes
circles = []
for r in radii:
    circle = (d2 < ((r + 1)**2)) & (d2 > ((r - 1)**2))
    circles.append(circle[100 - r - 1:100 + r + 1, 100 - r - 1:100 + r + 1])


# simple implementation:
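def add_mask1(space, circle, x, y):
    """
    Add a given mask (circle) to the accumulator space.
    
    Args: 
        space (ndarray): Accumulator space.
        circle (ndarray): Boolean mask of the circle.
        x (int): Column of the mask center in the accumulator space.
        y (int): Row of the mask center in the accumulator space.
    """
    # Shift from the mask center to the mask's top-left corner.
    x -= circle.shape[1] // 2
    y -= circle.shape[0] // 2
    for r, c in np.argwhere(circle):
        # Only increment cells that lie inside the accumulator space.
        if 0 <= r + y < space.shape[0] and 0 <= c + x < space.shape[1]:
            space[r + y, c + x] += 1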


# alternative implementation:
def add_mask2(space, mask, position, offset=(0, 0)):
    """
    Add a given mask to the accumulator space.
    
    Args: 
        space (ndarray): Accumulator space.
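        mask (ndarray): Mask to add, e.g. a boolean circle image.
        position (tuple): (row, column) in the accumulator space at which the mask is placed.
        offset (tuple): (row, column) of the mask pixel that is aligned with position.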
    
    """
    position = np.subtract(position, offset)
    space_min = np.maximum(position, 0)
    mask_min = np.maximum(-position, 0)
    space_max = np.minimum(space.shape, np.add(position, mask.shape))
    mask_max = np.minimum(np.subtract(space.shape, position), mask.shape)
    space[space_min[0]:space_max[0], space_min[1]:space_max[1]] += mask[
        mask_min[0]:mask_max[0], mask_min[1]:mask_max[1]]


def circular_hough(img, masks, radii):
    """ Computes circular hough transform
    
    Args:
        img (ndarray): Input image.
        masks (list): List of circles.
        radii (list): List of radii.
        
    Returns:
        accumulator (ndarray): Accumulator space.
    """
    # Create the accumulator space. This space will have three dimensions:
    # the x- and y-coordinate of the circle and its radius.
    accumulator = np.zeros(
        (len(radii), img.shape[0], img.shape[1]), dtype=np.int64)

    for r, c in np.argwhere(img):
        for i, radius in enumerate(radii):
            add_mask2(accumulator[i], masks[i], (r, c),
                      (radius + 1, radius + 1))
    return accumulator


# and perform circular hough transform
accumulator = circular_hough(img, circles, radii)

# Analyze the accumulator space
out = np.zeros(img.shape, np.int64)
for i, radius in enumerate(radii):
    max_fit = np.count_nonzero(circles[i])
    level = round(0.9 * max_fit)
    print("radius = {}, max = {}/{}, better than {}: {}".format(
        radius, accumulator[i].max(), max_fit, level,
        np.count_nonzero(accumulator[i] > level)))

    for r, c in np.transpose(np.where(accumulator[i] > level)):
        add_mask2(out, circles[i], (r, c), (radius + 1, radius + 1))

plt.figure(figsize=(12,6))        
plt.gray()
plt.subplot(1, 2, 1)
plt.imshow(img)
plt.subplot(1, 2, 2)
plt.imshow(out)
plt.show()

# END SOLUTION

## Assignment 3: Application: Eye detection (5 points)

**a)** It was suggested in the lecture that circular Hough transform can be used to implement a simple form of “eye tracking”. Try this using your webcam.

Hints:
* You may start with a single snapshot (or some other image of a face from wherever you like) before applying your function to a video stream.
* You may start using functions like `hough_circle` from `skimage.transform`, which implement circular Hough transform.
* You may try to apply some preprocessing to detect edges or regions prior to Hough transform (you may use anything you know from the lecture).
* `hough_circle` may be inappropriate, as it is designed to detect full circles, while eyes are often partially occluded. You may implement your own Hough transform that detects this kind of incomplete circle.
* You may apply additional knowledge to detect eyes, e.g. that they usually occur in pairs. You may also use the approximate size and position from last frame.
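
The last hint (eyes occur in pairs) could, for instance, be turned into a filter over the circle candidates returned by the Hough transform. The sketch below is purely illustrative: `pick_eye_pair` is a hypothetical helper, and all thresholds (allowed tilt, distance between the eyes measured in radii, the weighting in the score) are arbitrary assumptions that would need tuning on real images.

```python
import numpy as np
from itertools import combinations

def pick_eye_pair(centers, radii, max_tilt_deg=20.0, min_gap=3.0, max_gap=8.0):
    """Choose the two circle candidates that most plausibly form a pair of eyes.

    centers: list of (row, col) circle centers; radii: matching list of radii.
    Returns a tuple of two indices, or None if no pair passes the checks.
    All thresholds are illustrative assumptions, not calibrated values.
    """
    best, best_score = None, np.inf
    for i, j in combinations(range(len(centers)), 2):
        (r1, c1), (r2, c2) = centers[i], centers[j]
        mean_radius = 0.5 * (radii[i] + radii[j])
        # Distance between the centers, measured in units of the mean radius.
        gap = np.hypot(r2 - r1, c2 - c1) / mean_radius
        # Angle of the connecting line against the horizontal axis.
        tilt = np.degrees(np.arctan2(abs(r2 - r1), abs(c2 - c1)))
        if not (min_gap <= gap <= max_gap) or tilt > max_tilt_deg:
            continue
        # Prefer level pairs with similar radii.
        score = tilt + 10.0 * abs(radii[i] - radii[j]) / mean_radius
        if score < best_score:
            best, best_score = (i, j), score
    return best

# Made-up candidates: the first two are roughly level and similar in size.
centers = [(100, 80), (102, 140), (180, 110), (40, 200)]
radii = [10, 10, 11, 10]
print(pick_eye_pair(centers, radii))
```

The same scoring could also include the distance to the pair found in the previous frame when working on a video stream.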

In [None]:
# imageio
%matplotlib notebook
from IPython.display import display, HTML
import matplotlib.pyplot as plt
import imageio
import numpy as np
from skimage.filters import gaussian
from skimage.feature import canny
from skimage.color import rgb2gray
from skimage.transform import hough_circle, hough_circle_peaks
from skimage.draw import circle_perimeter

    
def detect_eyes(img):
    """Find and mark eyes in image.
    
    Args:
        img (ndarray): The RGB input image to detect the eyes in. shape=(h, w, 3)
        
    Returns:
        ndarray: The image with detected eyes highlighted.
    
    """
    ### BEGIN SOLUTION

    # List of radii we compute hough transform for. Ideally we adjust for the perfect radius.
    hough_radii = [10]

    
    # Convert to grayscale
    img_gray = rgb2gray(img)
    
    # Detect edges
    img_edges = canny(img_gray)
    
    # Detect radii
    hough_res = hough_circle(img_edges, hough_radii)

    # Select the two most prominent circles
    accums, cx, cy, radii = hough_circle_peaks(hough_res, hough_radii,
                                           total_num_peaks=2)

    for center_y, center_x, radius in zip(cy, cx, radii):
        circy, circx = circle_perimeter(center_y, center_x, radius,
                                    shape=img.shape)
        img[circy, circx] = (255, 0, 0)
    
    # Return the annotated image so that callers can display it
    return img
    ### END SOLUTION

# acquire some image as img
### BEGIN SOLUTION
# Placeholder: a black RGB image (detect_eyes expects shape (h, w, 3))
img = np.zeros((256, 256, 3), dtype=np.uint8)
### END SOLUTION

# display eyes detected in that image
plt.figure()
plt.imshow(detect_eyes(img))
plt.show()

In [None]:
# Accessing the webcam with imageio
import imageio.v2 as imageio

try:
    display(HTML("press <kbd>I</kbd>, <kbd>I</kbd> (Kernel Interrupt) to stop the demo!"))
    with imageio.get_reader('<video0>') as webcam:
        fig = plt.figure(figsize=(8,6))
        mpl_image = plt.imshow(webcam.get_next_data())

        while True:
            img = webcam.get_next_data()
            mpl_image.set_data(detect_eyes(img))
            fig.canvas.draw()
except KeyboardInterrupt:
    print("Interrupted")
finally:
    webcam.close()
    plt.close(fig)
    print("Camera was closed.")

In [None]:
# Accessing the webcam with OpenCV
import cv2

try:
    display(HTML("press <kbd>I</kbd>, <kbd>I</kbd> (Kernel Interrupt) to stop the demo!"))
    webcam = cv2.VideoCapture(0)
    success, frame_bgr = webcam.read()
    if not success:
        raise RuntimeError("Error accessing webcam")
    fig = plt.figure(figsize=(8,6))
    mpl_image = plt.imshow(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))

    while True:
        success, frame_bgr = webcam.read()
        if not success:
            raise RuntimeError("Error accessing webcam")
        img = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
        mpl_image.set_data(detect_eyes(img))
        fig.canvas.draw()
        
except KeyboardInterrupt:
    print("Interrupted")
finally:
    webcam.release()
    plt.close(fig)
    print("Camera was closed.")

## Assignment 4: Fourier Transform Basics (4 points)

**a)** There are several "versions" of Fourier transform mentioned on the lecture slides:
* transformation of periodic vs. aperiodic functions
* transformation of continuous vs. discrete functions
* real vs. complex basis functions
* different scaling factors
* 1D and 2D transformation

How do these different versions affect the formula for Fourier transform? What version of Fourier transform is relevant in the context of Computer Vision?

* periodic function vs. aperiodic function (finite vs. infinite integrals/sums, CV-09, slide 12)
* continuous vs. discrete functions (integral vs. sums, CV-09, slide 21)
* real vs. complex form (sine/cosine vs. exponential, CV-09, slide 6ff, slide 16)
* different prefactor conventions (CV-09, slide 19)
* 1D and 2D transformation (CV-09, slide 21)

Relevant for Computer Vision: discrete and periodic 2D Fourier transform, typically using complex representation.
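
As a sanity check of the discrete, complex 2D version, the defining double sum can be compared with `np.fft.fft2`. The sketch below writes the sum as two matrix products (one 1D DFT matrix per axis); `dft2_naive` is a hypothetical name, and the scaling follows NumPy's default convention (no prefactor in the forward transform, $1/(HW)$ in the inverse).

```python
import numpy as np

def dft2_naive(img):
    """2D DFT F[v, u] = sum_y sum_x f[y, x] * exp(-2*pi*i * (v*y/H + u*x/W)).

    Hypothetical helper using the separable form: a 1D DFT matrix per axis.
    """
    h, w = img.shape
    y = np.arange(h)
    x = np.arange(w)
    wy = np.exp(-2j * np.pi * np.outer(y, y) / h)  # entry [v, y]
    wx = np.exp(-2j * np.pi * np.outer(x, x) / w)  # entry [u, x]
    return wy @ img @ wx.T

rng = np.random.default_rng(0)
img = rng.random((6, 8))
print(np.allclose(dft2_naive(img), np.fft.fft2(img)))
```

Other prefactor conventions only rescale the result; in NumPy they can be selected with the `norm` argument of `np.fft.fft2`.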

**b)** The basis functions of the Fourier transform in Computer Vision can be considered as images. What do they look like? How many such basis images exist? Provide code to plot basis images for an image of shape `(100, 100)`.




The basis images have the same shape as the original image (the original image is a linear combination of the basis images). A basis image depicts a wave of a given horizontal and vertical frequency, with valid frequencies ranging from 0 up to half of the image width and height.

In the real form (sine and cosine) there are two versions for each frequency; in the complex form there is only one (instead, the complex coefficient contains the phase information).
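
That an image really is a linear combination of such basis images can be checked numerically on a small example. The following sketch uses the complex form with NumPy's index convention, where there is one basis image per index pair $(v, u)$, i.e. $H \cdot W$ in total, and indices above half the size correspond to negative frequencies; `complex_basis_image` is a hypothetical helper.

```python
import numpy as np

def complex_basis_image(shape, u, v):
    """Complex basis wave exp(2*pi*i * (u*x/W + v*y/H)) for an image of given shape."""
    h, w = shape
    y, x = np.indices(shape)
    return np.exp(2j * np.pi * (u * x / w + v * y / h))

rng = np.random.default_rng(1)
img = rng.random((4, 5))

# Linear coefficients: the DFT, scaled by the number of pixels.
coeffs = np.fft.fft2(img) / img.size

# Sum up all H*W weighted basis images.
recon = np.zeros(img.shape, dtype=complex)
for v in range(img.shape[0]):
    for u in range(img.shape[1]):
        recon += coeffs[v, u] * complex_basis_image(img.shape, u, v)

print(np.allclose(recon.imag, 0), np.allclose(recon.real, img))
```

For a real-valued image the imaginary parts cancel, because the coefficients of opposite frequencies are complex conjugates of each other.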

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

shape = (100,100)
### BEGIN SOLUTION
u, v = (2, 3)

def wave_image(shape, u, v, amplitude=1.0, phase=0.0):
    """Create a basis wave for Fourier transform of an image of a given shape,
    horizontal and vertical frequency, amplitude, and phase.
    """
    frequencies = np.zeros(shape, dtype=complex)
    frequencies[v,u] = frequencies.size * amplitude * np.exp(1j * phase)
    return np.real(np.fft.ifft2(frequencies))

fig = plt.figure()
plt_im = plt.imshow(wave_image(shape, u, v), cmap='gray', vmin=-1, vmax=1, interpolation='bilinear')
plt.show()
### END SOLUTION