# Practical task sheet 01 -- color space conversion, subsampling, chroma keying and noise

The first practical task sheet covers color conversion and chroma keying.
For this, please check out the following links:

* https://en.wikipedia.org/wiki/Color_space
* https://en.wikipedia.org/wiki/YCbCr
* https://en.wikipedia.org/wiki/Chroma_subsampling
* https://en.wikipedia.org/wiki/Chroma_key
* https://en.wikipedia.org/wiki/Image_noise

Transmission standards use different color spaces because end devices differ, and cameras also record video in different color spaces. It is therefore often necessary to convert from one color space to another.
In this task sheet we will tackle several parts of color space conversion and subsampling.

Afterward, we will have a look at traditional chroma-keying, a technique used in e.g. television studios.
The key idea of chroma keying is to replace the background with something else, e.g. a weather map or similar.

Finally, we will look at how to remove salt and pepper noise from an image and how to detect edges in it.

**General Hint**: in each code cell, the parts where code needs to be added are marked with TBD. Prefer simple code over complicated code.

In [4]:
# version 2.2

In [5]:
# install requirements (this cell should not produce any errors, otherwise check dependencies and guide)
!pip3 install --user numpy pandas matplotlib scipy jupyter scikit-image scikit-learn scikit-video

ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv.

In [6]:
import sys
!{sys.executable} -m pip install numpy pandas matplotlib scipy scikit-learn scikit-image sk-video




In [9]:
# helper functions and required imports

import skimage.io
import numpy as np

def show_image(img):
    """ shows an image (3d array) in a Jupyter cell"""
    skimage.io.imshow(img)
    skimage.io.show()
    

## Subtask 1: Color space conversion
The most commonly used color space for video processing is $YC_bC_r$. In the following cells we will manually implement the conversion from digital RGB values.

The conversion of $[0,1]$-scaled RGB to $YC_bC_r$ uses the following equations (ITU-R BT.601 conversion):

$$ Y = 16 + ( 65.481 \cdot R + 128.553 \cdot G + 24.966 \cdot B) $$
$$ C_b = 128 + (-37.797 \cdot R - 74.203 \cdot G + 112.0 \cdot B) $$
$$ C_r = 128 + (112.0  \cdot R - 93.786 \cdot G - 18.214 \cdot B) $$

After conversion the components are handled as 8-bit unsigned integer planes.
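
As a quick sanity check of the equations above, the following minimal sketch writes them in matrix form and evaluates three reference colors. The helper name `rgb01_to_ycbcr` and the choice to round before casting are assumptions made for this illustration; the cells in this sheet cast with `astype` directly, which truncates instead.

```python
import numpy as np

# BT.601 coefficients from the equations above, one row per output component
M = np.array([
    [ 65.481, 128.553,  24.966],   # Y
    [-37.797, -74.203, 112.000],   # Cb
    [112.000, -93.786, -18.214],   # Cr
])
OFFSET = np.array([16.0, 128.0, 128.0])

def rgb01_to_ycbcr(rgb):
    """Hypothetical helper: [0,1]-scaled RGB (channels last) to 8-bit YCbCr."""
    ycbcr = np.asarray(rgb, dtype=float) @ M.T + OFFSET
    # rounding is an assumption of this sketch, not part of the equations
    return np.clip(np.round(ycbcr), 0, 255).astype(np.uint8)

black = rgb01_to_ycbcr([0.0, 0.0, 0.0])
white = rgb01_to_ycbcr([1.0, 1.0, 1.0])
blue = rgb01_to_ycbcr([0.0, 0.0, 1.0])

print(black, white, blue)
```

Black and white differ only in Y (16 and 235), while both chroma components stay at the neutral value 128. This shows that the luma range is limited to [16, 235] and that gray tones carry no chroma information.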

In [10]:
# use method from skimage.io to read "color_example.jpg"
example_image = skimage.io.imread("color_example.jpg")  # TBD

show_image(example_image)

# access each color channel, and convert to [0,1] scaled values
R = example_image[:, :, 0] / 255.0
G = example_image[:, :, 1] / 255.0
B = example_image[:, :, 2] / 255.0


# show all channels
show_image(R)
show_image(G)
show_image(B)

FileNotFoundError: No such file: '/Users/basharatnaeem/Documents/color_example.jpg'

In [1]:
# just as a fun step, we create a new image (with same dimensions), and swap the channels to BGR
#   important if you use the R,G,B values from the previous cell, rescale to [0,255] values

channel_swap = np.zeros(example_image.shape, dtype=np.uint8)
channel_swap[:, :, 0] = (B * 255).astype(np.uint8)
channel_swap[:, :, 1] = (G * 255).astype(np.uint8)
channel_swap[:, :, 2] = (R * 255).astype(np.uint8)



# and we show the channel swapped image
show_image(channel_swap)

NameError: name 'np' is not defined

In [2]:
# let's now convert the R,G,B image to Y, C_b, C_r
Y  = 16   + (65.481 * R + 128.553 * G + 24.966 * B)
Cb = 128  + (-37.797 * R - 74.203 * G + 112.0 * B)
Cr = 128  + (112.0 * R - 93.786 * G - 18.214 * B)



# important: clip to [0,255] and convert the type to uint8
Y = np.clip(Y, 0, 255).astype(np.uint8)
Cb = np.clip(Cb, 0, 255).astype(np.uint8)
Cr = np.clip(Cr, 0, 255).astype(np.uint8)


show_image(Y)
show_image(Cb)
show_image(Cr)

NameError: name 'R' is not defined

In [3]:
# put the steps before for RGB to Y C_b C_r conversion into one method
def rgb_to_y_cb_cr(img_rgb):
    """ method to convert a given RGB image to YC_bC_r according to the steps before """
    
    R = img_rgb[:, :, 0] / 255.0
    G = img_rgb[:, :, 1] / 255.0
    B = img_rgb[:, :, 2] / 255.0

    Y  = 16   + (65.481 * R + 128.553 * G + 24.966 * B)
    Cb = 128  + (-37.797 * R - 74.203 * G + 112.0 * B)
    Cr = 128  + (112.0 * R - 93.786 * G - 18.214 * B)

    Y = np.clip(Y, 0, 255).astype(np.uint8)
    Cb = np.clip(Cb, 0, 255).astype(np.uint8)
    Cr = np.clip(Cr, 0, 255).astype(np.uint8)

    combined = np.zeros(img_rgb.shape, dtype=np.uint8)
    combined[:, :, 0] = Y
    combined[:, :, 1] = Cb
    combined[:, :, 2] = Cr
    return combined

# convert the example image and save the result
converted = rgb_to_y_cb_cr(example_image)
skimage.io.imsave("ycbcr.png", converted)

NameError: name 'example_image' is not defined

## Subtask 2: 4:2:0 chroma subsampling
Now that we are able to convert RGB images to $YC_bC_r$, we can implement chroma subsampling.
The general idea is that human perception is more sensitive to changes in luma than in color.
We will handle 4:2:0 subsampling in this task. It keeps the full resolution for Y and only half of the resolution in each direction (thus every second pixel per row and per column) for each C component in the $YC_bC_r$ color space.
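
To make the effect on the amount of data concrete, here is a minimal sketch on a hypothetical 4x4 chroma plane (the values are chosen only for illustration). It decimates by plain slicing and restores the full size by nearest-neighbor repetition. Both are simplifying assumptions: practical encoders often filter or average before decimating.

```python
import numpy as np

# a hypothetical 4x4 chroma plane, values chosen only for illustration
chroma = np.arange(16, dtype=np.uint8).reshape(4, 4)

# 4:2:0 keeps every second sample in both directions
chroma_sub = chroma[::2, ::2]

# nearest-neighbor upsampling back to full size, e.g. for display
chroma_up = np.repeat(np.repeat(chroma_sub, 2, axis=0), 2, axis=1)

# stored samples: full Y plus two quarter-size chroma planes,
# compared with three full-size planes
samples_full = 3 * chroma.size
samples_420 = chroma.size + 2 * chroma_sub.size

print(chroma_sub)
print(chroma_up)
print(samples_420 / samples_full)
```

Each chroma plane shrinks to a quarter of its samples, so the three planes together need half of the original storage.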



In [None]:
import skimage.io
import skimage.color
import numpy as np





# convert example_image to YCbCr using skimage.color.rgb2yuv and perform 4:2:0 chroma subsampling
# note: skimage.color.rgb2yuv is not the same as rgb_to_y_cb_cr; for demonstration we treat its output as approximately YCbCr

# use method from skimage.io to read "color_example.jpg"
example_image = skimage.io.imread("color_example.jpg")   # TBD

# convert example image to yuv space, we use this as ycbcr 
ycbcr = skimage.color.rgb2yuv(example_image)   

# select separate components
Y = ycbcr[:, :, 0]  #TBD
C_b = ycbcr[:, :, 1]
C_r = ycbcr[:, :, 2]

# sub sample Cb, Cr components, according to 4:2:0 method:
C_b_s = C_b[::2, ::2]   # TBD
C_r_s = C_r[::2, ::2]   # TBD

# show each component

show_image(Y)
show_image(C_b_s)
show_image(C_r_s)

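# Y is a [0,1]-scaled float64 array; the chroma components returned by rgb2yuv are signed and centered on 0
# to store them we need to convert each component to a [0,255] integer scaled numpy array,
# so the chroma components are shifted by 0.5 first (values outside [0,1] after the shift are clipped)

Y_uint8 = np.clip(Y * 255, 0, 255).astype(np.uint8)  # TBD
C_b_s_uint8 = np.clip((C_b_s + 0.5) * 255, 0, 255).astype(np.uint8)  # TBD
C_r_s_uint8 = np.clip((C_r_s + 0.5) * 255, 0, 255).astype(np.uint8)  # TBD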



# save each component
skimage.io.imsave("Y.png", Y_uint8)
skimage.io.imsave("C_b_2.png", C_b_s_uint8)
skimage.io.imsave("C_r_0.png", C_r_s_uint8)

assert((np.array(C_b_s.shape[0:2]) * 2 == example_image.shape[0:2]).all())
assert((np.array(C_r_s.shape[0:2]) * 2 == example_image.shape[0:2]).all())

In [None]:
# put everything together in one method
def chroma_subsampling_4_2_0(img_rgb):
    """ returns each component with applied 4:2:0 sampling"""
    ycbcr = skimage.color.rgb2yuv(img_rgb)# TBD start
    Y = ycbcr[:, :, 0]# ...
    C_b_s = ycbcr[:, :, 1][::2, ::2]# TBD end
    C_r_s = ycbcr[:, :, 2][::2, ::2]
    return np.array([Y, C_b_s, C_r_s], dtype=object)

res_sub_sampling = chroma_subsampling_4_2_0(example_image)

## Subtask 3: Chroma Keying
Assume you have a well-illuminated scene with a static colored background (usually blue or green).
The idea is to use a setup consisting of a camera (for which parameters like FOV/focal length and camera position are captured), a blue/green box, lights, and an animated or replacement background.
A scene recorded this way results in, e.g., the following example image, background, and combined version.
(Note that the image here is just an example and was processed beforehand to make this task feasible.)

<img src="chroma_foreground.jpg" alt="chroma_foreground" style="width: 25%;float: left;"/>
<img src="chroma_background.jpg" alt="chroma_background" style="width: 25%;float: left;"/> 
<img src="chroma_combined.jpg" alt="chroma_combined" style="width: 25%;float: left;"/>

<div style=" clear: both;"></div>

We will handle this task automatically using Python. Here we assume a static scene; in a real setup, more things need to be considered (camera position, changing background, ...).
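
The replacement step can be sketched on a tiny synthetic example. The 2x2 images and the threshold values below are hypothetical and chosen only for illustration. The sketch uses `np.where` to pick, per pixel, either the background or the foreground, which is one possible alternative to zeroing both images and adding them.

```python
import numpy as np

# hypothetical 2x2 images: the top row of the foreground is pure key blue
foreground = np.array([[[0, 0, 255], [0, 0, 255]],
                       [[200, 50, 50], [10, 200, 10]]], dtype=np.uint8)
background = np.full((2, 2, 3), 90, dtype=np.uint8)

lower = np.array([0, 0, 100])      # assumed [R, G, B] lower bound of the key color
upper = np.array([100, 100, 255])  # assumed [R, G, B] upper bound of the key color

# True where all three channels lie inside the key-color box
key_mask = np.all((foreground >= lower) & (foreground <= upper), axis=-1)

# take the background where the key color was found, the foreground elsewhere
combined = np.where(key_mask[..., None], background, foreground)

print(key_mask)
print(combined)
```

Because each output pixel is selected from exactly one of the two images, no addition of `uint8` values takes place and no overflow can occur.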


In [None]:
# Hint: read the foreground and background images
# with scikit-image skimage.io function 
# (check for the suitable way)

# read the foreground image
foreground = skimage.io.imread("chroma_foreground.jpg")   # TBD
# show foreground image
show_image(foreground)

# read background in a similar manner and show it
background = skimage.io.imread("chroma_background.jpg")    # TBD
# show background image
show_image(background) 

# check if both images have same shape
assert(foreground.shape == background.shape)

In [None]:
# define the lower blue threshold
lower_threshold = np.array([0, 0, 100])    # [R, G, B] values

# define the upper blue threshold
upper_threshold = np.array([100, 100, 255])    # TBD

 
def threshold_mask(image, lower_threshold, upper_threshold):
    # for each color channel create a mask based on the defined thresholds
    r_mask = ((image[:,:, 0] >= lower_threshold[0]) & (image[:,:, 0] <= upper_threshold[0]))
    g_mask = ((image[:,:, 1] >= lower_threshold[1]) & (image[:,:, 1] <= upper_threshold[1]))  # TBD
    b_mask = ((image[:,:, 2] >= lower_threshold[2]) & (image[:,:, 2] <= upper_threshold[2])) # TBD
    # combine the channel masks
    mask_value = r_mask & g_mask & b_mask
    return mask_value.copy().astype('uint8')

# create a mask based on a lower and upper threshold
foreground_mask = threshold_mask(foreground, lower_threshold, upper_threshold)

# show the final foreground mask
show_image(foreground_mask)

In [None]:
# combine foreground with mask and add background
bg = background.copy()  # copy background image

# for the background, set the values outside the mask to zero
bg[foreground_mask == 0] = [0, 0, 0]

# copy foreground
fg = foreground.copy()  # TBD
# for the foreground, set the masked (key-colored) values to zero
fg[foreground_mask == 1] = [0, 0, 0]  # TBD

# show both masked images
show_image(bg)
show_image(fg)

# combine fg and bg image
combined_image = bg + fg  # TBD
show_image(combined_image)

In [None]:
# put everything in one method

def threshold_chroma(foreground, background, lower_threshold, upper_threshold):  # TBD
    """ one method without showing any image, to perform the above implemented steps in one go, 
        * use threshold_mask that was defined before
    """
    foreground_mask = threshold_mask(foreground, lower_threshold, upper_threshold)

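    bg = background.copy()  # TBD start
    bg[foreground_mask == 0] = [0, 0, 0]  # keep the background only where the key color was detected
    fg = foreground.copy()
    fg[foreground_mask == 1] = [0, 0, 0]  # remove the key-colored pixels from the foreground
    combined_image = bg + fg  # TBD end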
    return combined_image

chroma_res_img = threshold_chroma(foreground, background, lower_threshold, upper_threshold)
show_image(chroma_res_img)

# save resulting combined image
skimage.io.imsave("chroma_result.jpg", chroma_res_img)

## Subtask 4: Salt and pepper noise
This type of noise is also known as impulse noise: some white or black pixels occur randomly in an image. Removing them improves the image quality, and the noise is usually also unwanted in later post-processing.
Such pixels usually originate from dead pixels inside the camera, so they can occur in all three channels (colored salt and pepper noise).
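
Why a median is a natural choice for removing such impulses can be seen on a single neighborhood. The sketch below uses a hypothetical 3x3 gray patch with one "salt" pixel in the center and compares the mean with the median of the patch.

```python
import numpy as np

# hypothetical 3x3 gray patch with one "salt" pixel in the center
patch = np.array([[10, 12, 11],
                  [13, 255, 12],
                  [11, 10, 12]], dtype=np.uint8)

mean_value = patch.mean()        # pulled strongly towards the outlier
median_value = np.median(patch)  # not affected by a single outlier

print(mean_value, median_value)
```

The mean is dragged far above the surrounding gray values, while the median stays at a value that actually occurs in the neighborhood. Replacing the center pixel by the median therefore removes the impulse without brightening the area.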


In [None]:
# read the noise_example.jpg
from skimage.io import imread
example = imread('noise_example.jpg') # TBD
show_image(example)
# because this image does not contain noise, we now add some noise

def add_salt_pepper_noise(img, SNR):
    img_with_noise = img.copy()
    mask = np.random.choice((0, 1, 2), size=img.shape, p=[SNR, (1 - SNR) / 2., (1 - SNR) / 2.])
    img_with_noise[mask == 1] = 255 # salt noise
    img_with_noise[mask == 2] = 0 # pepper

    return img_with_noise

noise_example = add_salt_pepper_noise(example, 0.8)

show_image(noise_example)

In [None]:
# before we do the median filtering using a 2D convolution, 
#   we develop a method to perform 2D convolutions

def convolve_2d(image, kernel_size=(3,3)):
    """ yields (view, i, j) of a 2D convolution of 
            the 2D input image 
    
        for simplicity the borders are ignored
    """
    keh, kew = kernel_size
    h, w = image.shape[0:2]  # it should work for gray input images, as well as for colored
    for i in range(h - keh + 1):  # TBD
        for j in range(w - kew + 1):  # TBD
            # view is the current "kernel" wide view of the image in the convolution
            view = image[i:i+keh, j:j+kew] # TBD
            
            # yield is similar to "return", but the local state of the iteration is kept,
            # so each request for the next value resumes the loops and produces the next step
            # see https://www.geeksforgeeks.org/use-yield-keyword-instead-return-keyword-python/
            yield view, i, j

res = list(convolve_2d(
        np.array([
            [11,12,13,14],
            [21,22,23,24],
            [31,32,33,34],
            [41,42,43,44]
        ]),
        kernel_size = (3,3)
    )
)

assert(len(res) == 4)
assert([x[0].sum() for x in res] == [198, 207, 288, 297])
assert((res[0][0] == [[11, 12, 13], [21, 22, 23], [31, 32, 33]]).all())
assert((res[1][0] == [[12, 13, 14], [22, 23, 24], [32, 33, 34]]).all())
assert((res[2][0] == [[21, 22, 23], [31, 32, 33], [41, 42, 43]]).all())
assert((res[3][0] == [[22, 23, 24], [32, 33, 34], [42, 43, 44]]).all())

In [None]:
# the next step is to remove the introduced salt and pepper noise,
# a convolution with a median filter for each color component is one possible approach
# here an own implementation is required

def remove_salt_pepper_noise(img_rgb, kernel=(3,3)):
    """ remove colored salt and pepper noise using a convolutional kernel with a default size of 3x3 and 
        median filtering per channel,
        
        use the convolve_2d method inside, 
        for simplicity the borders are ignored
    """
    cleaned_img = img_rgb.copy()
    keh, kew = kernel
    
    for c in range(3):  # TBD start
        channel = img_rgb[:, :, c]
        new_channel = channel.copy()  # ...
        for view, i, j in convolve_2d(channel, kernel):
            # write the median of the view to the pixel at the center of the view
            new_channel[i + keh // 2, j + kew // 2] = np.median(view)
        cleaned_img[:, :, c] = new_channel  # TBD end
    return cleaned_img

show_image(remove_salt_pepper_noise(noise_example))

In [None]:
# to detect edges we now perform a 2D convolution with a laplace filter kernel

def edge_detection_laplace(img_rgb):
    """ detect edges using a laplace filter,
        here a 2D convolution is performed with a laplace filter kernel
        
        only the luminance channel of the input image is used,
        use the convolve_2d method inside, 
        for simplicity borders are ignored
    """
    kernel = np.array([
        [0,  1, 0],  # TBD
        [1, -4, 1],  # TBD
        [0,  1, 0]   # TBD
    ]
    )
    # we only apply the laplace filter to the Y channel
    Y = skimage.color.rgb2yuv(img_rgb)[:,:,0]
    laplacian_edges = np.zeros(Y.shape)
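    for view, i, j in convolve_2d(Y, kernel.shape):  # TBD start
        # store the filter response at the center pixel of the current view
        laplacian_edges[i + kernel.shape[0] // 2, j + kernel.shape[1] // 2] = np.sum(view * kernel)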
    
    
    return laplacian_edges# TBD end

# we use the same example image as before, and 
#    detect edges using the laplace filter kernel
show_image(edge_detection_laplace(example))