Osnabrück University - Computer Vision (Winter Term 2020/21) - Prof. Dr.-Ing. G. Heidemann, Ulf Krumnack, Axel Schaffland

# Exercise Sheet 04: Segmentation and Color

## Introduction


This week's sheet should be solved and handed in before the end of **Saturday, November 28, 2020**. If you need help (and Google and other resources were not enough), feel free to contact your group's designated tutor or whomever of us you run into first. Please upload your results to your group's Stud.IP folder.

## Assignment 0: Math recap (the exponential function) [0 Points]

This exercise is supposed to be very easy, does not give any points, and is voluntary. There will be a similar exercise on every sheet. It is intended to revise some basic mathematical notions that are assumed throughout this class and to allow you to check if you are comfortable with them. Usually you should have no problem answering these questions offhand, but if you feel unsure, this is a good time to look them up again. You are always welcome to discuss questions with the tutors or in the practice session. Also, if you have a (math) topic you would like to recap, please let us know.

**a)** What is an *exponential function*? How can it be characterized? What is special about $e^x$?

YOUR ANSWER HERE

**b)** How is the exponential function defined for complex arguments? In what way(s) does this generalize the real case?

YOUR ANSWER HERE

**c)** The complex exponential function allows us to define a mapping $\mathbb{R}\to\mathbb{C}$ by $x\mapsto e^{ix}$. What does the graph of this mapping look like? Where are the points $e^{2\pi i\frac mn}$ for $m=0,...,n\in\mathbb{N}$ located on this graph?

YOUR ANSWER HERE
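
As a starting point for exploring question c) numerically, the following minimal sketch evaluates $e^{2\pi i\frac mn}$ for one example value of $n$. The choice `n = 8` and the variable names are assumptions made for illustration only.

```python
import numpy as np

n = 8                                   # example value, chosen for illustration
m = np.arange(n)
roots = np.exp(2j * np.pi * m / n)      # the points e^{2 pi i m / n}

# Every point has absolute value 1, so all of them lie on the unit circle,
# and raising any of them to the n-th power gives 1.
radii = np.abs(roots)
powers = roots ** n
angles_deg = np.degrees(np.angle(roots)) % 360   # evenly spaced angles
```

Plotting `roots.real` against `roots.imag` shows where the points lie in the complex plane.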

In [None]:
# YOUR CODE HERE

## Assignment 1: Histogram-based segmentation [5 Points]

### a) Histogram-based segmentation

What is histogram-based segmentation? What are its goals, benefits, and problems?

A histogram is computed from the pixels of the image, and the extrema in the histogram are used to locate the clusters in the image.  

**Goals**:
- separate foreground and background
- background as a single segment
- one segment for each other entity

**Benefits**:  
- histogram-based methods are very efficient compared to other image segmentation methods because they typically require only one pass through the pixels

**Problems**:  
- it may be difficult to identify significant extrema in the histogram
- it is hard to find a suitable threshold for the segmentation
- non-uniform brightness of the background blurs the histogram peaks; a homogeneous background would simplify the task

### b) Threshold computation

There exist different methods to automatically determine a threshold for an image. Find at least two that are provided by scikit-image and describe them in more detail. Then apply them to the images `schrift.png` and `pebbles.jpg`.

Thresholding is used to create a binary image from a grayscale image.  

Scikit-image includes the function `try_all_threshold` to evaluate thresholding algorithms provided by the library.  
The results of applying all the thresholding functions are shown below. Two of these functions are:  

- **Otsu’s method** (`threshold_otsu`)
    - calculates an optimal threshold by maximizing the variance between the two classes of pixels separated by the threshold, which is equivalent to minimizing the intra-class variance
- **Mean** (`threshold_mean`)
    - returns a threshold value based on the mean of grayscale values
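
To make the two criteria above concrete, here is a minimal NumPy sketch of both. It is not the scikit-image implementation: the function names `otsu_threshold_sketch` and `mean_threshold_sketch` are hypothetical, the input is assumed to be a `uint8` grayscale array, and the synthetic test image is made up for illustration.

```python
import numpy as np

def otsu_threshold_sketch(img):
    """Gray value t that maximizes the between-class variance of the
    classes "<= t" and "> t" (illustrative sketch, uint8 input assumed)."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                      # gray value probabilities
    levels = np.arange(256)
    w0 = np.cumsum(p)                          # weight of class "<= t"
    w1 = 1.0 - w0                              # weight of class "> t"
    cum_mean = np.cumsum(p * levels)
    total_mean = cum_mean[-1]
    denom = w0 * w1
    between = np.zeros(256)
    valid = denom > 1e-12                      # skip thresholds with an empty class
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

def mean_threshold_sketch(img):
    """Threshold at the mean gray value."""
    return float(img.mean())

# Synthetic bimodal "image": dark values around 50, bright values around 200.
demo = np.repeat(np.array([40, 50, 60, 190, 200, 210], dtype=np.uint8), 100)
t_otsu = otsu_threshold_sketch(demo)
t_mean = mean_threshold_sketch(demo)
binary = demo > t_otsu
```

For such a cleanly bimodal histogram both thresholds fall between the two modes; the methods differ mainly when the classes have very different sizes or overlap.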



In [None]:
# Run this cell to get an impression of how the histograms look

%matplotlib inline
import matplotlib.pyplot as plt
from imageio import imread

img1 = imread('images/schrift.png')
img2 = imread('images/pebbles.jpg') 

plt.figure(figsize=(15, 10)) 
plt.gray()
plt.subplot(2,2,1)
plt.axis('off')
plt.imshow(img1)
plt.subplot(2,2,2)
plt.hist(img1.flatten(), 256, (0, 255))
plt.subplot(2,2,3)
plt.axis('off')
plt.imshow(img2)
plt.subplot(2,2,4)
plt.hist(img2.flatten(), 256, (0, 255))
plt.show()

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
from imageio import imread
from skimage.filters import try_all_threshold

# YOUR CODE HERE
img1 = imread('images/pebbles.jpg')
img2 = imread('images/schrift.png')
# https://scikit-image.org/docs/dev/auto_examples/segmentation/plot_thresholding.html
fig, ax = try_all_threshold(img1, figsize=(10, 8), verbose=False)
plt.show()
fig, ax = try_all_threshold(img2, figsize=(10, 8), verbose=False)
plt.show()

### c) Shading

Shading may cause problems for histogram-based segmentation. In the lecture (CV-07 slide 13), it was proposed to compute a shading image to deal with that problem. Apply this approach to the images `schrift.png` and `pebbles.jpg`. You may use filter functions from scikit-image for this exercise.
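
The idea can be tried out on synthetic data first. The sketch below is an illustration under stated assumptions, not the lecture's reference solution: it assumes a bright background with dark foreground, estimates the shading image with a local maximum filter, and divides the image by it. The helper `local_max` and the synthetic "page" are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_max(img, size):
    """Maximum over a size x size window around each pixel (edge padding)."""
    pad = size // 2
    padded = np.pad(img, pad, mode='edge')
    return sliding_window_view(padded, (size, size)).max(axis=(-2, -1))

# Synthetic page: brightness rises from left to right, isolated dark pixels as "ink".
h, w = 32, 64
shading = np.tile(np.linspace(0.1, 1.0, w), (h, 1))
reflectance = np.ones((h, w))
reflectance[::4, ::4] = 0.4
img = shading * reflectance

# Each 5x5 window contains at least one background pixel, so the local
# maximum approximates the background brightness at that position.
shading_est = local_max(img, 5)
corrected = img / shading_est          # background close to 1, ink clearly below

is_background = reflectance > 0.5
global_result = img > img.mean()       # a single global threshold on the raw image
corrected_result = corrected > 0.5     # the same kind of threshold after correction
```

On this example the global threshold misclassifies the dark left part of the background, while the corrected image separates ink and background everywhere.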

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from imageio import imread
from skimage.filters import threshold_otsu, rank

img1 = imread('images/schrift.png').astype(float) / 255
img2 = imread('images/pebbles.jpg').astype(float) / 255

# YOUR CODE HERE

# threshold img without shading img
thresh_img1 = img1 > threshold_otsu(img1)
thresh_img2 = img2 > threshold_otsu(img2)

# threshold img with shading img

# if the image consists mainly of foreground and background, 
# the shading image can be obtained using a ranking filter (return local maximum of an image)
# --> window must be large enough to always contain at least 1 fg and 1 bg pixel
shading_img1 = rank.maximum(img1, np.ones((7, 7)))
# it's not working that well for the pebbles, because fg and bg are not that easily separable
shading_img2 = rank.maximum(img2, np.ones((25, 25)))

# divide img by background img
corr_img1 = img1 / shading_img1
# normalize
corr_img1 *= (255 / corr_img1.max())
new_thresh1 = corr_img1 > threshold_otsu(corr_img1)

# divide img by background img
corr_img2 = img2 / shading_img2
# normalize
corr_img2 *= (255 / corr_img2.max())
new_thresh2 = corr_img2 > threshold_otsu(corr_img2)

plt.figure(figsize=(24, 16))

plt.subplot(2, 4, 1)
plt.title('original text')
plt.imshow(img1, cmap='gray')

plt.subplot(2, 4, 2)
plt.title('thresholding with original text')
plt.imshow(thresh_img1, cmap='gray')

plt.subplot(2, 4, 3)
plt.title('shading image')
plt.imshow(shading_img1, cmap='gray')

plt.subplot(2, 4, 4)
plt.title('shading image + thresholding text')
plt.imshow(new_thresh1, cmap='gray')

plt.subplot(2, 4, 5)
plt.title('original pebbles')
plt.imshow(img2, cmap='gray')

plt.subplot(2, 4, 6)
plt.title('thresholding with original pebbles')
plt.imshow(thresh_img2, cmap='gray')

plt.subplot(2, 4, 7)
plt.title('shading image')
plt.imshow(shading_img2, cmap='gray')

plt.subplot(2, 4, 8)
plt.title('shading image + thresholding pebbles')
plt.imshow(new_thresh2, cmap='gray')

## Assignment 2: Pyramid representation [5 Points]

**a)** What is the *Gaussian pyramid*? How does the **reduce** operation work? Explain in your own words what low pass filtering is and why it should be used when building the pyramid? Implement the **reduce** operation and generate a figure similar to the one on (CV-07 slide 32).

A pyramid representation is a type of multi-scale representation in which an image is subject to repeated smoothing and subsampling.  
In a Gaussian pyramid, each level is obtained from the level below by smoothing with a Gaussian kernel and then scaling down. Each pixel of a level therefore holds a Gaussian-weighted local average of a neighborhood of pixels on the level below. This technique is used especially in texture synthesis.  
Pyramids allow multi-scale image features to be computed from real-world image data very efficiently.

**Reduce Operation**:
Each pixel of level $i+1$ replaces four pixels of level $i$ (not necessarily calculated from these four).  

**Low Pass Filtering**:
Low pass filtering removes high frequencies to avoid artifacts which arise due to a violation of the sampling theorem.
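
A one-dimensional toy example illustrates why the filter is needed. This is a sketch with made-up data: the signal alternates between 0 and 1, which is the highest frequency a sampled signal can carry, and `kernel_1d` is assumed to be the 1D counterpart of a 3x3 binomial kernel.

```python
import numpy as np

signal = np.tile([0.0, 1.0], 16)           # 0, 1, 0, 1, ...

# Subsampling without filtering keeps only the zeros: the pattern is lost
# and replaced by a misleading constant (aliasing).
naive = signal[::2]

# Low pass filtering first averages neighboring samples, so the reduced
# signal carries the mean gray value of the pattern instead.
kernel_1d = np.array([1, 2, 1]) / 4
smoothed = np.convolve(signal, kernel_1d, mode='same')
reduced = smoothed[::2]
```

Apart from the first sample, which is affected by the zero padding at the border, `reduced` is 0.5 everywhere, whereas `naive` is 0 everywhere.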


In [None]:
%matplotlib inline
import numpy as np
from scipy import ndimage
import matplotlib.pyplot as plt
from imageio import imread

img = imread('images/mermaid.png').astype(float) / 255
reduced_img = img.copy()

# YOUR CODE HERE
kernel = 1 / 16 * np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]])

def reduce(img):
    # only use every 2nd row and col (reduction)
    # gaussian filtering + sub-sampling
    filtered = ndimage.convolve(img, kernel)[::2, ::2]
    # return normalized version
    return filtered * 1.0 / filtered.max()

while img.size > 2:
    img = reduce(img)
    reduced_img[-img.shape[0]:, :img.shape[1]] = img

plt.figure(figsize=(15,10))
plt.gray()
plt.imshow(reduced_img)
plt.show()

**b)** What is the **expand** operation? Why can the **reduce** operation not be inverted? Implement the **expand** operation and generate an image similar to the one on (CV-07 slide 34).

The expand operation approximately reproduces level $i$ from level $i+1$. The pixels of level $i$ are generated by interpolating the pixels of level $i+1$.  
It yields a blurred image: the reduce operation cannot be inverted, since the values of the 'lost' pixels are unknown.
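
One detail of interpolating by zero insertion is the brightness of the result. The sketch below assumes that expand is implemented by inserting zeros and smoothing with the 3x3 binomial kernel; `k1d` and `smooth` are hypothetical helpers, and the separable filter is used only to keep the example NumPy-only.

```python
import numpy as np

k1d = np.array([1, 2, 1]) / 4              # np.outer(k1d, k1d) is the 3x3 binomial kernel

def smooth(img, gain=1.0):
    """Separable 3x3 binomial filter with zero padding, scaled by `gain`."""
    rows = np.apply_along_axis(np.convolve, 1, img, k1d, mode='same')
    return gain * np.apply_along_axis(np.convolve, 0, rows, k1d, mode='same')

small = np.ones((4, 4))                    # constant image of brightness 1
up = np.zeros((8, 8))
up[::2, ::2] = small                       # zero insertion: 3 of 4 pixels are 0

plain = smooth(up)                         # interior brightness drops to 1/4
scaled = smooth(up, gain=4)                # a gain of 4 restores brightness 1
```

The factor 4 comes from the two image dimensions: each dimension halves the density of non-zero samples.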

In [None]:
%matplotlib inline
import numpy as np
from scipy import ndimage
import matplotlib.pyplot as plt
from imageio import imread

img = imread('images/mermaid.png').astype(float) / 255
steps = 4

# YOUR CODE HERE
def expand(img):
    pyr = np.zeros((img.shape[0] * 2, img.shape[1] * 2))
    pyr[::2, ::2] = img
    # Zero insertion leaves only one in four pixels non-zero, so the smoothing
    # kernel is scaled by 4 to preserve the mean brightness.
    return ndimage.convolve(pyr, kernel * 4, mode='constant')

for _ in range(steps):
    img = reduce(img)
pyramid_image = np.zeros((img.shape[0] * (2 ** steps), img.shape[1] * (2 ** steps)))

res = []
for _ in range(steps):    
    img = expand(img)
    res.append(img)

for img in res[::-1]:
    pyramid_image[-img.shape[0]:, :img.shape[1]] = img

plt.figure(figsize=(15,10))
plt.gray()
plt.imshow(pyramid_image)
plt.show()

## Assignment 3: Texture Segmentation [5 Points]

**a)** What is texture? Try to define it in your own words. Can there be a standard definition? What problems do you expect for texture based segmentation? 

**Texture** can be seen as the structure or patterning of a surface, but has no precise definition.  
It's an important feature for segmentation, because it can be used as a homogeneity condition.  
In general, texture is a property of groups of pixels that exhibit common properties; it cannot be defined for a single pixel.  
As there is no hard definition, texture and texture measures are always a matter of definition.

Besides the lack of a hard definition, texture interpretation often highly depends on the context.

**b)** What is a co-occurrence matrix? How can it be used to characterize texture?

It's an important tool to recognize texture. It's basically a 2D histogram over pairs of pixels: for a fixed offset $d$, the entry $(g_1, g_2)$ counts how often a pixel with gray value $g_1$ has a pixel with gray value $g_2$ at offset $d$.  

To characterize texture, one has to evaluate the co-occurrence matrix by different texture features, e.g. the Haralick features:
- contrast
- entropy
- homogeneity
- energy
- ...
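
The features listed above can be computed from the normalized co-occurrence matrix. The following sketch uses one common set of definitions; conventions vary between sources (for example, some libraries define energy as the square root of the sum used here). The function names `glcm_sketch` and `texture_features` are hypothetical, and only non-negative offsets are handled.

```python
import numpy as np

def glcm_sketch(img, dx, dy, levels):
    """Co-occurrence counts for the offset (dx, dy), with x along the first axis."""
    a = img[:img.shape[0] - dx, :img.shape[1] - dy]   # reference pixels
    b = img[dx:, dy:]                                 # pixels at the offset
    counts = np.zeros((levels, levels))
    np.add.at(counts, (a.ravel(), b.ravel()), 1)      # count every gray value pair
    return counts

def texture_features(counts):
    p = counts / counts.sum()                         # joint probabilities
    i, j = np.indices(p.shape)
    nz = p[p > 0]                                     # avoid log(0)
    return {
        'contrast': float(np.sum(p * (i - j) ** 2)),
        'energy': float(np.sum(p ** 2)),
        'homogeneity': float(np.sum(p / (1 + (i - j) ** 2))),
        'entropy': float(-np.sum(nz * np.log2(nz))),
    }

flat = np.zeros((5, 5), dtype=int)                    # no texture at all
checker = np.indices((4, 4)).sum(axis=0) % 2 * 3      # gray values 0 and 3 alternate

flat_features = texture_features(glcm_sketch(flat, 0, 1, 4))
checker_features = texture_features(glcm_sketch(checker, 0, 1, 4))
```

The flat patch concentrates all mass in one matrix entry (contrast 0, energy 1), while the checkerboard puts all mass far from the diagonal (high contrast).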


**c)** Implement a function to compute the co-occurrence matrix of an image (patch). Apply it and compare your results to (CV-07 slide 54).

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import imageio 

img = imageio.imread('images/mermaid.png')#, mode='L')

def get_patch(img, x, y, size=40):
    """
    Extract a rectangular patch from an image and mark it in the original image.
    
    Args:
        img (ndarray): Input image.
        x (uint): X-coordinate.
        y (uint): Y-coordinate.
        size (uint): Size of the patch.
        
    Returns:
        result: The extracted patch.
    """
    result = img[x:x+size,y:y+size].copy()
    img[x:x+size, [y,y+1,y+size,y+size+1]] = 0
    img[[x,x+1,x+size,x+size+1], y:y+size] = 0
    return result

patches = []
patches.append(get_patch(img, 50,130))
patches.append(get_patch(img, 110,80))
patches.append(get_patch(img, 260,340))
patches.append(get_patch(img, 310,110))
patches.append(get_patch(img, 100,440))

def cooccurrence(img, dx=1, dy=1):
    """
    Compute a co-occurrence matrix for the given image.
    
    Args:
        img          the grayscale image (uint8)
        dx,dy        the offset between the two reference points

    Returns:
        matrix       the co-occurrence matrix
    """

    # YOUR CODE HERE
    matrix = np.zeros((256, 256))

    # As in get_patch, x runs along the first array axis and y along the second.
    # Only non-negative offsets are supported.
    # Looping over all gray value pairs as well (4 nested loops) would be far too
    # slow; instead, each pixel pair increments its own entry of the matrix.
    for x in range(img.shape[0] - dx):
        for y in range(img.shape[1] - dy):
            p = img[x, y]
            p_plus_d = img[x + dx, y + dy]
            # count co-occurrence
            matrix[p, p_plus_d] += 1

    return matrix

plt.figure(figsize=(8, 8))
plt.gray()
plt.imshow(img)
plt.show()

plt.figure(figsize=(8, 8))
i = 0
for p in patches:
    plt.subplot(len(patches), 3, i + 1); plt.axis('off'); plt.imshow(p)
    # For visualization one may apply some extra processing, e.g., logarithmization or binarization
    plt.subplot(len(patches), 3, i + 2); plt.imshow(np.log(1 + cooccurrence(p, 0, 1)), interpolation='none')
    plt.subplot(len(patches), 3, i + 3); plt.imshow(cooccurrence(p, 1, 0) > 0, interpolation='none')
    i += 3
plt.show()

## Assignment 4: Region merging [5 Points]

Implement the *region merging* algorithm (CV-07 slide 39) and apply it to the image `segments.png` (or some part of it). Use a simple *homogeneity condition*, e.g. that the maximal difference between gray values in a segment is not larger than a given threshold.
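
As an illustration of the idea, the sketch below merges neighboring regions with a union-find structure under the suggested max-min homogeneity condition. It is one possible way to organize the merging, not necessarily the procedure on the lecture slide; the function name `region_merging_sketch` and the synthetic test image are assumptions. Because merging can only widen a region's gray value range, a pair that fails the condition once can never pass it later, so a single pass over all neighboring pixel pairs suffices here (the result still depends on the visiting order).

```python
import numpy as np

def region_merging_sketch(img, thresh):
    """Label image obtained by merging 4-connected neighbors as long as
    max - min of the merged region stays <= thresh."""
    h, w = img.shape
    parent = np.arange(h * w)                 # every pixel starts as its own region
    lo = img.astype(int).ravel().copy()       # per-region minimum gray value
    hi = lo.copy()                            # per-region maximum gray value

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]     # path halving
            i = parent[i]
        return i

    def try_merge(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        new_lo, new_hi = min(lo[ra], lo[rb]), max(hi[ra], hi[rb])
        if new_hi - new_lo <= thresh:         # homogeneity condition
            parent[rb] = ra
            lo[ra], hi[ra] = new_lo, new_hi

    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w:
                try_merge(i, i + 1)           # right neighbor
            if y + 1 < h:
                try_merge(i, i + w)           # lower neighbor

    roots = np.array([find(i) for i in range(h * w)])
    _, label = np.unique(roots, return_inverse=True)
    return label.reshape(h, w)

# Two nearly homogeneous halves with a little noise.
demo = np.zeros((6, 8), dtype=np.uint8)
demo[:, :4] = 10
demo[:, 4:] = 200
demo[2, 1] = 13
demo[3, 6] = 197
demo_label = region_merging_sketch(demo, thresh=10)
```

The pure Python loops are slow for large images, which is why the exercise suggests working on a small image region.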

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import imageio
import warnings
warnings.filterwarnings('ignore')

img = imageio.imread('./images/segments.png', pilmode='L')

# Choosing a large image region lengthens computation time
img = img[64:128,64:128]

# compute the `label` array by implementing "region merging"
# YOUR CODE HERE

#############################################

#############################################


plt.figure(figsize=(12, 12))
plt.gray()
plt.subplot(1,2,1)
plt.imshow(img)
plt.subplot(1,2,2)
plt.imshow(label, cmap='prism')
plt.show()


warnings.filterwarnings('ignore')

## Bonus: Painting with a webcam using color detection [0 points]


### Testing your webcam: Images
From now on we will try to make the exercises a bit more interactive and use the live feed from your webcam. Unfortunately, using the webcam may not always work out of the box (depending on your hardware/OS configuration). So first make sure that you can grab an image from the webcam.

1. Use the `imageio` library as presented in the tutorial sessions. You will probably need to install `ffmpeg` packages as shown in the tutorial code.
1. Use the `cv2` library (opencv will use `gstreamer`). You will probably need to install the `opencv` package.

Hint: Sometimes it helps to restart the kernel.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

# Set this flag to either use "imageio" or "cv2"
use_imageio = True
if use_imageio:
    # use imageio for accessing the webcam (requires ffmpeg to be installed on your computer)
    import imageio
    try:
        reader = imageio.get_reader('<video0>')
        img = reader.get_next_data()
        ok = True
        reader.close()
    except Exception:
        ok = False
else:
    # use opencv for accessing the webcam
    import cv2
    camera = cv2.VideoCapture(0)
    ok, img = camera.read()
    camera.release()
    if ok:
        # OpenCV delivers BGR images; convert to RGB for matplotlib
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

if ok:
    plt.imshow(img)
    plt.show()
else:
    print("Accessing your webcam failed.")

### Testing your webcam: Video
You can now test your webcam with video. You can either select the methods presented in the tutorial session, namely `imageio` and `visvis`, or use `cvloop`. We recommend using the first method.

**imageio and visvis**


To test these modules run the following code

In [None]:
import imageio
import visvis as vv
import time 
import numpy as np

reader = imageio.get_reader('<video0>')

img = reader.get_next_data()
res = np.zeros_like(img)

fig = vv.figure() 
a1 = vv.subplot(121)
im_v = vv.imshow(img, clim=(0, 255))
a1 = vv.subplot(122)
res_v = vv.imshow(res, clim=(0, 255))

for im in reader:
    vv.processEvents()     
    im_v.SetData(im)
    res_v.SetData(255 - im)

**cvloop**

Alternatively, you can use `cvloop`. To install `cvloop`, first activate your cv environment and then run the following cell.

In [None]:
!pip install cvloop

Check that it works by executing the cell below:

In [None]:
from cvloop import cvloop
cvl = cvloop(function=lambda frame: 255 - frame, side_by_side=True)

### a)
In this task we will track a small colored object (like the cap of a pen) in front of a neutral background of a different color. We will use the location of the object to paint on a virtual canvas. For that you have to implement the following tasks in the `draw_func` function:

* Convert the image `img` given to the `draw_func` into HSV color space. 
* Measure the color of your object. You may return the converted image and interactively measure the color with your mouse. Define your measured hue value in a constant.
* Discard all channels except the hue channel. 
* Find the location with the most similar hue to the measured hue of your object.
* Paint a marker, for example a circle, at this position in `img_draw`.
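
The steps above can be tried on a synthetic frame before involving the webcam. The following sketch is NumPy-only and therefore computes hue and saturation itself instead of calling `rgb2hsv`; the helper names `hue_and_saturation`, `locate_hue`, and `paint_disk`, the saturation cut-off, and the test frame are all assumptions made for illustration.

```python
import numpy as np

def hue_and_saturation(rgb):
    """Hue in [0, 1) and saturation in [0, 1] of an RGB uint8 image."""
    rgb = rgb.astype(float) / 255
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx, mn = rgb.max(axis=-1), rgb.min(axis=-1)
    delta = mx - mn
    safe = np.where(delta == 0, 1, delta)            # avoid division by zero
    hue = np.where(mx == r, ((g - b) / safe) % 6,
                   np.where(mx == g, (b - r) / safe + 2, (r - g) / safe + 4)) / 6
    hue = np.where(delta == 0, 0.0, hue)             # gray pixels have no hue
    sat = np.where(mx == 0, 0.0, delta / np.where(mx == 0, 1, mx))
    return hue, sat

def locate_hue(rgb, target_hue, min_sat=0.2):
    """(row, col) of the sufficiently saturated pixel whose hue is closest
    to target_hue; returns (0, 0) if no pixel is saturated enough."""
    hue, sat = hue_and_saturation(rgb)
    dist = np.abs(hue - target_hue)
    dist = np.minimum(dist, 1 - dist)                # hue is circular
    dist[sat < min_sat] = np.inf                     # ignore grayish pixels
    return np.unravel_index(np.argmin(dist), dist.shape)

def paint_disk(canvas, center, radius, color):
    """Fill a disk around center = (row, col) on the canvas in place."""
    rows, cols = np.ogrid[:canvas.shape[0], :canvas.shape[1]]
    mask = (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2
    canvas[mask] = color
    return canvas

# Gray frame with a small green object.
frame = np.full((60, 80, 3), 128, dtype=np.uint8)
frame[20:26, 50:56] = (0, 200, 0)
demo_canvas = np.zeros_like(frame)

row, col = locate_hue(frame, target_hue=1 / 3)       # pure green has hue 1/3
paint_disk(demo_canvas, (row, col), radius=5, color=(255, 255, 255))
```

Masking out weakly saturated pixels matters because the hue of nearly gray pixels is dominated by noise and would otherwise produce spurious matches.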


In [None]:
%matplotlib inline

import imageio
import visvis as vv
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from skimage.color import rgb2hsv
from skimage.draw import disk

# Adapt this hue value to the hue of your object
hue = .4

# A global canvas to draw on
canvas = np.zeros((480,640,3), np.uint8) 

# radius and color of the brush
radius = 5
color = (255,255,255)

# saturation threshold for object
thresh = .2

def draw_func(img):
    """
    Draw a circle on img_draw at the detected object location.
    
    Args:
        img          the RGB input image (uint8)

    Returns:
        img_draw     img with circle drawn at position of object
    """
    global canvas, hue, radius, color
    
    # YOUR CODE HERE
    
    
    return canvas



# Make a figure and axes with dimensions as desired.
fig = plt.figure(figsize=(8, 1))
ax = fig.add_axes([0.05, 0.80, 0.9, 0.15])
cb = mpl.colorbar.ColorbarBase(ax, cmap=mpl.cm.hsv, orientation='horizontal',
                               norm=mpl.colors.Normalize(vmin=0, vmax=1))
cb.set_ticks([hue])
cb.set_label('the hue value')
plt.show()

In [None]:
# First test your function with a single image. You may either grab an image from your webcam (as described above),
# or choose an arbitrary image from wherever you like

%matplotlib inline
import matplotlib.pyplot as plt

draw_func(img)
plt.subplot(1,2,1); plt.imshow(img)
plt.subplot(1,2,2); plt.imshow(canvas)
plt.show()

In [None]:
# Now run your function with imageio and visvis or alternatively with cvloop

import imageio
import visvis as vv
import numpy as np


reader = imageio.get_reader('<video0>')

img = reader.get_next_data()
res = np.zeros_like(img)

fig = vv.figure() 
a1 = vv.subplot(121)
im_v = vv.imshow(img, clim=(0, 255))
a1 = vv.subplot(122)
res_v = vv.imshow(res, clim=(0, 255))

for im in reader:
    # mirror the image to make drawing easier
    im = im[:,::-1,:]
    vv.processEvents()     
    im_v.SetData(im)
    res_v.SetData(draw_func(im))

In [None]:

%matplotlib notebook
from cvloop import cvloop

# Now use cvloop to run the algorithm live on webcam data     
cvl = cvloop(function=draw_func, side_by_side=True)