# Introduction to Image Processing
----

This notebook introduces some of the image processing algorithms/functions that are used in the composable pipeline. 

## Aims
* Introduce image algorithms

## Table of Contents
* [Introduction](#intro)
* [OpenCV](#opencv)
* [Color Conversion](#colorconv)
* [2D Convolution](#2dconv)
* [Morphological Transformations](#morphological)
* [Corner Detector](#corner)
* [Fork Operation](#fork)
* [Join Operations](#join)
* [Color Thresholding](#thresh)
* [Look Up Table](#lut)
* [Takeaways](#takeaways)
* [Conclusion](#conclusion)

----

## Revision History

* v1.0 | 15 April 2021 | First notebook revision.

----

## Introduction <a class="anchor" id="intro"></a>

### What is a pixel?

[Merriam Webster](https://www.merriam-webster.com/dictionary/pixel) defines pixel as:
> any of the small discrete elements that together constitute an image

Pixels are usually built of channels. Color images have three channels: red, green and blue (RGB). Each channel of a pixel is a discrete representation of the light intensity for that channel. The light intensity is encoded in 8 bits, hence only 256 discrete values are possible per channel.

Run the next cell and then click on the blue box. Interact with the color picker and observe the light intensity for each channel.

In [None]:
import ipywidgets as widgets

widgets.ColorPicker(
    concise=False,
    description='Pick a color',
    value='blue',
    disabled=False
)
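After a color is picked, the widget typically reports it as a hex string such as `#0000ff`. As a minimal sketch of how the three 8-bit channel intensities relate to that string, the helper below (the name `hex_to_rgb` is hypothetical and not part of ipywidgets) splits it into its red, green and blue values:

```python
def hex_to_rgb(color):
    """Split a '#rrggbb' string into three 8-bit channel intensities."""
    digits = color.lstrip('#')
    # Each channel occupies two hexadecimal digits, i.e. 8 bits (0..255).
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

blue_rgb = hex_to_rgb('#0000ff')    # (0, 0, 255)
orange_rgb = hex_to_rgb('#ffa500')  # (255, 165, 0)
print(blue_rgb, orange_rgb)
```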

### How many colors can a pixel represent?

A color pixel is built of three channels, and each channel can represent 256 values. So, a color pixel can represent $256 \times 256 \times 256 = 16,777,216$ different colors.
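The count can be checked with a couple of lines of arithmetic. This is only an illustrative sketch; the variable names are chosen here for clarity:

```python
levels_per_channel = 2 ** 8   # 8 bits per channel give 256 levels
channels = 3                  # red, green and blue
total_colors = levels_per_channel ** channels
print(f"{total_colors:,}")    # 16,777,216
```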

### What is an image?

An image is a collection of pixels, where each pixel stores a value proportional to the light intensity at that particular location. The size of an image is its dimension (or resolution), which is specified as width and height, for instance $1920 \times 1080$.
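In Python, an image is commonly held in a NumPy array indexed as rows (height) first, then columns (width), then channels. The sketch below builds an empty frame to show that layout; the names `frame`, `width` and `height` are illustrative only:

```python
import numpy as np

width, height = 1920, 1080
# One 8-bit value per channel, three channels per pixel.
frame = np.zeros((height, width, 3), dtype=np.uint8)

# Pixels are addressed as [row, column]; set the top-left pixel's first channel to its maximum.
frame[0, 0] = (255, 0, 0)

print(frame.shape)  # (1080, 1920, 3)
print(frame.dtype)  # uint8
```

Note that the shape lists the height before the width, the opposite of the usual "width by height" way of quoting a resolution.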

### What does the term frames per second refer to?

In a video, the term frames per second (FPS), or images per second, represents the number of images that are played every second. The higher the image resolution and/or FPS, the more capable the hardware required to record, process and reproduce the video.
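To get a feel for why resolution and FPS drive hardware requirements, the following sketch estimates the raw (uncompressed) data rate of a video stream. The 60 FPS figure is an assumption chosen for illustration:

```python
width, height, channels = 1920, 1080, 3   # 8-bit RGB frame
fps = 60                                  # assumed frame rate

bytes_per_frame = width * height * channels
bytes_per_second = bytes_per_frame * fps

print(f"{bytes_per_frame:,} bytes per frame")
print(f"{bytes_per_second / 1e6:.1f} MB/s uncompressed")
```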

## OpenCV <a class="anchor" id="opencv"></a>

[OpenCV](https://opencv.org/) is an open-source, cross-platform computer vision library. You will be using OpenCV to get familiar with some of the computer vision functions.

Import numpy, OpenCV and matplotlib to visualize images. We will use a mosaic image to demonstrate the effect of some vision operations on that image.

In [None]:
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt

In [None]:
img = cv.imread('img/mosaic.jpg')
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(img);

## Color Conversion <a class="anchor" id="colorconv"></a>

A color image is usually represented in the [RGB color space](https://en.wikipedia.org/wiki/RGB_color_space); however, there are many different color spaces, each with a particular purpose. We will explore some of them.

### Grayscale

A [grayscale image](https://en.wikipedia.org/wiki/Grayscale) has only one channel, which represents the amount of light that each pixel contains. One of the main uses of grayscale images in vision applications is to detect edges in images.


In [None]:
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)  # cv.imread returns channels in BGR order
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(gray, cmap='gray');
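The conversion to grayscale is a weighted sum of the three channels. The sketch below reproduces that idea in plain NumPy using the commonly quoted luma weights (0.299 for red, 0.587 for green, 0.114 for blue); the helper name `to_gray` and the rounding are assumptions of this sketch, so results may differ slightly from OpenCV's fixed-point implementation:

```python
import numpy as np

# Weights listed in BGR order to match the channel order returned by cv.imread.
BGR_WEIGHTS = np.array([0.114, 0.587, 0.299])

def to_gray(bgr):
    """Weighted sum of the channels, rounded back to 8 bits."""
    return np.rint(bgr.astype(np.float64) @ BGR_WEIGHTS).astype(np.uint8)

# A 1x2 image: one pure red pixel and one white pixel (BGR order).
pixels = np.array([[[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)
gray_demo = to_gray(pixels)
print(gray_demo)  # [[ 76 255]]
```

Green contributes the most to the result because the human eye is most sensitive to it.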

### HSV Color space

The [HSV color space](https://en.wikipedia.org/wiki/HSL_and_HSV) represents an image using the channels hue, saturation and value. This color space aligns better with the way humans perceive color-making attributes, and in vision applications it is useful for detecting colors more reliably.

In [None]:
hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV)
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(hsv);
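To see what the three HSV channels mean for a single pixel, the sketch below converts pure blue with Python's standard `colorsys` module. For 8-bit images OpenCV stores the hue as degrees divided by two (so that it fits in a byte) and scales saturation and value to 0..255; the tuple `opencv_8bit` mimics that scaling and is only an approximation of what `cv.cvtColor` returns:

```python
import colorsys

r, g, b = 0, 0, 255  # pure blue
h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)

hue_degrees = h * 360  # position on the color wheel, about 240 for blue
# Approximate 8-bit OpenCV encoding: H in 0..179, S and V in 0..255.
opencv_8bit = (round(hue_degrees / 2), round(s * 255), round(v * 255))
print(round(hue_degrees), opencv_8bit)
```

Because the hue isolates the color itself from its brightness, a range on the hue channel is often a more robust way to select a color than ranges on the RGB channels.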

### XYZ Color space

From [Wikipedia](https://en.wikipedia.org/wiki/CIE_1931_color_space):
>  The CIE 1931 XYZ color space defines quantitative links between distributions of wavelengths in the electromagnetic visible spectrum, and physiologically perceived colors in human color vision

In [None]:
xyz = cv.cvtColor(img, cv.COLOR_BGR2XYZ)
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(xyz);

## 2D Convolution <a class="anchor" id="2dconv"></a>

It is also known as [filter2D](https://docs.opencv.org/3.4.3/d4/d86/group__imgproc__filter.html#ga27c049795ce870216ddfb366086b5a04) in OpenCV.

The 2D convolution is a mathematical operation that uses a kernel (a matrix of dimension $n \times n$). This kernel slides over the input image, and at each position the weighted sum of the pixels under the kernel produces one pixel of the output image.

In the next few cells we will consider a few $3 \times 3$ kernels. You can explore more kernels live [here](https://setosa.io/ev/image-kernels/).
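The sliding-window idea can be sketched in a few lines of NumPy for a single-channel image. This is a deliberately slow, illustrative version; the function name `filter2d_naive` and the reflected border are assumptions of this sketch. Like `cv.filter2D`, it does not flip the kernel, so strictly speaking it computes a correlation, which is identical to a convolution for symmetric kernels:

```python
import numpy as np

def filter2d_naive(image, kernel):
    """Slide `kernel` over a single-channel uint8 `image` and return the filtered image."""
    kh, kw = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    # Pad the borders so that the output has the same size as the input.
    padded = np.pad(image.astype(np.float32),
                    ((pad_h, pad_h), (pad_w, pad_w)), mode="reflect")
    out = np.zeros(image.shape, dtype=np.float32)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            window = padded[y:y + kh, x:x + kw]
            # Weighted sum of the pixels currently under the kernel.
            out[y, x] = np.sum(window * kernel)
    # Saturate to the 8-bit range.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

demo = np.arange(25, dtype=np.uint8).reshape(5, 5)
identity_k = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], np.float32)
print(np.array_equal(filter2d_naive(demo, identity_k), demo))  # True
```

A color image would be processed by applying the same kernel to each channel independently.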

### Identity kernel

The output of a 2D convolution using an identity kernel is the same image.

$
Identity = \begin{bmatrix}
0 & 0 & 0\\
0 & 1 & 0\\
0 & 0 & 0
\end{bmatrix}
$

In [None]:
identity = np.array([[0,0,0],[0,1,0],[0,0,0]],np.float32)
identity

Apply the identity kernel to the image.

In [None]:
dst = cv.filter2D(img,-1,identity)
plt.figure(figsize=(15, 15))
plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(dst),plt.title('Identity')
plt.xticks([]), plt.yticks([])
plt.show()

In [None]:
print("Are the images the same? {}".format(np.array_equal(img,dst)))

### Emboss kernel

A 2D convolution using an emboss kernel produces an image that stresses the difference between pixels in a given direction, giving an illusion of depth.

$
Emboss = \begin{bmatrix}
-2 & -1 & 0\\
-1 & 1 & 1\\
0 & 1 & 2
\end{bmatrix}
$

In [None]:
emboss = np.array([[-2,-1,0],[-1,1,1],[0,1,2]],np.float32)
emboss

In [None]:
dst = cv.filter2D(img,-1,emboss)
plt.figure(figsize=(15, 15))
plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(dst),plt.title('Emboss')
plt.xticks([]), plt.yticks([])
plt.show()

## Morphological Transformations <a class="anchor" id="morphological"></a>

Morphological transformations are operations based on the shapes in the image. The kernel (structuring element) used determines the nature of the operation.

### Dilate

From [Wikipedia](https://en.wikipedia.org/wiki/Dilation_(morphology)):
> The dilation operation usually uses a structuring element for probing and expanding the shapes contained in the input image.

In [None]:
kernel = np.ones((3,3),np.uint8)
dilate = cv.dilate(img,kernel,iterations = 1)
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(dilate);

### Erode

From [Wikipedia](https://en.wikipedia.org/wiki/Erosion_(morphology)):
> The erosion operation usually uses a structuring element for probing and reducing the shapes contained in the input image.

In [None]:
kernel = np.ones((3,3),np.uint8)
erode = cv.erode(img,kernel,iterations = 1)
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(erode);
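With a $3 \times 3$ kernel of ones, dilation amounts to taking the maximum of each pixel's neighborhood and erosion to taking the minimum. The sketch below shows this on a tiny single-channel image; the helper name `morph3x3` and the edge-replicating border are assumptions of this sketch rather than OpenCV's implementation:

```python
import numpy as np

def morph3x3(image, op):
    """Apply `op` (np.max for dilate, np.min for erode) over each 3x3 neighborhood."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    # Nine shifted views of the image, one per kernel position.
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return op(np.stack(windows), axis=0)

tiny = np.zeros((5, 5), dtype=np.uint8)
tiny[2, 2] = 255                    # a single bright pixel

dilated = morph3x3(tiny, np.max)    # the bright pixel grows into a 3x3 block
eroded = morph3x3(tiny, np.min)     # the bright pixel disappears
print(dilated)
print(eroded)
```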

## Corner Detector  <a class="anchor" id="corner"></a>

### Harris Corner Detector

In [None]:
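# The detector works on a single-channel image, converted here to float32
gray = np.float32(cv.cvtColor(img,cv.COLOR_BGR2GRAY))
# Arguments: blockSize=2 (neighborhood size), ksize=3 (Sobel aperture), k=0.1 (Harris free parameter)
harris = cv.cornerHarris(gray,2,3,0.1)
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(harris, cmap='gray');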

We can overlay the result on top of the original image as red dots.

In [None]:
dst = cv.dilate(harris, None)
corners = img.copy()
corners[dst>0.01*dst.max()]=[255,0,0]
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(corners);

### FAST algorithm

In [None]:
fast = cv.FastFeatureDetector_create()
kp = fast.detect(img, None)
img2 = cv.drawKeypoints(img, kp, None, color=(255,0,0))
plt.figure(figsize=(10, 10)), plt.axis("off"), plt.imshow(img2);

## Fork Operation <a class="anchor" id="fork"></a>

### Duplicate

This operation simply produces two copies of the same image. In the FPGA, these two images will be processed at the same time.

In [None]:
img2 = img.copy()
img3 = img.copy()

plt.figure(figsize=(18, 18))
plt.subplot(131),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(img2),plt.title('Copy 1')
plt.xticks([]), plt.yticks([])
plt.subplot(133),plt.imshow(img3),plt.title('Copy 2')
plt.xticks([]), plt.yticks([])
plt.show()

## Join Operations <a class="anchor" id="join"></a>

These vision operations take two input images and produce a single image as a result.

We will generate the input images by applying the erode and dilate operations to the mosaic image.

In [None]:
kernel = np.ones((3,3),dtype=np.uint8)
dilate = cv.dilate(img,kernel,iterations = 1)
erode = cv.erode(img,kernel,iterations = 1)

### Subtract

This function performs a pixel-wise subtraction, therefore the order of the images matters.

In [None]:
ed = erode - dilate
de = dilate - erode

plt.figure(figsize=(15, 15))
plt.subplot(121),plt.imshow(ed),plt.title('erode - dilate')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(de),plt.title('dilate - erode')
plt.xticks([]), plt.yticks([])
plt.show()
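Because the images are stored as unsigned 8-bit values, a negative difference cannot be represented. The NumPy `-` operator used above wraps around, whereas a saturating subtraction (the behavior of `cv.subtract`) clamps negative results to 0, and an absolute difference ignores the sign altogether. The sketch below contrasts the three on two sample pixel values; the array names are illustrative only:

```python
import numpy as np

a = np.array([10, 200], dtype=np.uint8)
b = np.array([20, 100], dtype=np.uint8)

wrapped = a - b                                      # [246, 100]: 10 - 20 wraps around
wide = a.astype(np.int16) - b.astype(np.int16)       # signed difference: [-10, 100]
saturated = np.clip(wide, 0, 255).astype(np.uint8)   # [0, 100]: negative values clamp to 0
abs_diff = np.abs(wide).astype(np.uint8)             # [10, 100]: same for either operand order
print(wrapped, saturated, abs_diff)
```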

### Absdiff

In [None]:
ed = cv.absdiff(erode, dilate)
de = cv.absdiff(dilate, erode)
plt.figure(figsize=(15, 15))
plt.subplot(121),plt.imshow(ed),plt.title('absdiff(erode, dilate)')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(de),plt.title('absdiff(dilate, erode)')
plt.xticks([]), plt.yticks([])
plt.show()

In [None]:
print("Are the images the same? {}".format(np.array_equal(ed,de)))

### Add

The addition of two images can lead to overflow (the result is larger than 255). Depending on the implementation, the result can either:
1. Saturate: you will notice large white areas in the image.
1. Wrap around: you will notice artifacts in the result.

In [None]:
addsat = cv.add(erode, dilate)
addwrap = erode + dilate
plt.figure(figsize=(15, 15))
plt.subplot(121),plt.imshow(addsat),plt.title('Add Saturate')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(addwrap),plt.title('Add wrap around')
plt.xticks([]), plt.yticks([])
plt.show()
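The two overflow behaviors can be reproduced on a pair of sample values. In this sketch the saturating sum is emulated by widening to 16 bits and clipping, which mirrors what `cv.add` does for 8-bit images; the array names are illustrative only:

```python
import numpy as np

x = np.array([200, 50], dtype=np.uint8)
y = np.array([100, 60], dtype=np.uint8)

wrap_sum = x + y  # [44, 110]: 300 does not fit in 8 bits and wraps to 300 - 256
# Widen first so that the true sum is representable, then clamp to 255.
sat_sum = np.clip(x.astype(np.uint16) + y, 0, 255).astype(np.uint8)  # [255, 110]
print(wrap_sum, sat_sum)
```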

## Color Thresholding <a class="anchor" id="thresh"></a>

In [None]:
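# Keep pixels whose three channels all fall inside the given range (a greenish color)
mask = cv.inRange(img, (0, 100, 0), (80,255,80))
# Expand the single-channel mask to three channels so it can be ANDed with the image
mask_rgb = cv.cvtColor(mask,cv.COLOR_GRAY2BGR)
cothr = img & mask_rgb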

plt.figure(figsize=(15, 15))
plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(cothr),plt.title('Color Detect')
plt.xticks([]), plt.yticks([])
plt.show()
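Color thresholding keeps only the pixels whose channels all lie between a lower and an upper bound. A minimal NumPy sketch of that test follows; the helper name `in_range` is hypothetical and is meant to mirror the behavior of `cv.inRange` with inclusive bounds:

```python
import numpy as np

def in_range(image, lower, upper):
    """Return 255 where every channel lies within [lower, upper], else 0."""
    lower, upper = np.array(lower), np.array(upper)
    inside = np.all((image >= lower) & (image <= upper), axis=-1)
    return inside.astype(np.uint8) * 255

# A 1x2 image: one greenish pixel and one light gray pixel.
pixels = np.array([[[10, 200, 10], [200, 200, 200]]], dtype=np.uint8)
demo_mask = in_range(pixels, (0, 100, 0), (80, 255, 80))   # [[255, 0]]

# ANDing with the mask keeps the selected pixels and blacks out the rest.
kept = pixels & demo_mask[..., np.newaxis]
print(demo_mask)
print(kept)
```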

## Look Up Table <a class="anchor" id="lut"></a>

This function performs a pixel-wise look-up table transformation. That is, the output image is a mapping of the input image through a 256-entry table. The light intensity of each channel is used as the address, and the content stored at that address is the output value.

<div class="alert alert-info">
  <strong>INFO:</strong> The computer vision function LUT does not refer to an FPGA LUT, although the concept is in essence quite similar.
</div>


Let's see this function in action.

### Negative

Produce the negative of an image.

In [None]:
lut = np.zeros((256,1),dtype=np.uint8)
for i in range(len(lut)):
    lut[i] = 255-i

dst = cv.LUT(img, lut)

plt.figure(figsize=(15, 15))
plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(dst),plt.title('Negative')
plt.xticks([]), plt.yticks([])
plt.show()
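The addressing described above maps directly onto NumPy indexing: using the image as an index into the table replaces every intensity with the table entry stored at that address. The sketch below uses a one-dimensional table for simplicity; the names are illustrative, and this is an equivalent formulation rather than how `cv.LUT` is implemented:

```python
import numpy as np

# Entry i of the table holds 255 - i, i.e. the negative of intensity i.
negative_lut = (255 - np.arange(256)).astype(np.uint8)

sample = np.array([[0, 100, 255]], dtype=np.uint8)
mapped = negative_lut[sample]   # each intensity is used as an address into the table
print(mapped)                   # [[255 155   0]]
```

Any per-pixel intensity mapping, such as the thresholding in the next section, can be expressed this way by changing only the table contents.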

### Thresholding

Produce an output image that lights up the pixels whose light intensity is within a given range.

In [None]:
lut = np.zeros((256,1),dtype=np.uint8)
for i in range(len(lut)):
    if 188 > i > 127:
        lut[i] = 255

dst = cv.LUT(img, lut)

plt.figure(figsize=(15, 15))
plt.subplot(121),plt.imshow(img),plt.title('Original')
plt.xticks([]), plt.yticks([])
plt.subplot(122),plt.imshow(dst),plt.title('Thresholded')
plt.xticks([]), plt.yticks([])
plt.show()

## Takeaways <a class="anchor" id="takeaways"></a>

After having read and used this notebook you should be able to answer the following questions:

- What is a pixel?
- What is an image?
- What is image resolution?
- What is frames per second?


## Conclusion <a class="anchor" id="conclusion"></a>

This notebook provides a brief introduction to core concepts of computer vision. The OpenCV library is introduced and used to visualize some common computer vision functions.

[⬅️ Getting Started with the Composable Pipeline](01_get_started.ipynb) | | [Difference of Gaussians Application ➡️](applications/01_difference_gaussians_app.ipynb)

Copyright &copy; 2021 Xilinx, Inc

SPDX-License-Identifier: BSD-3-Clause

----