# ELE510 Image Processing with robot vision: LAB, Exercise 2, Image Formation.

**Purpose:** *To learn about the image formation process, i.e. how images are projected from the scene to the image plane.*

The theory for this exercise can be found in chapters 2 and 3 of the text book [1]. Supplementary information can be found in chapters 1, 2 and 3 of the compendium [2]. See also the following documentation for help:
- [OpenCV](https://opencv.org/opencv-python-free-course/)
- [numpy](https://numpy.org/doc/stable/)
- [matplotlib](https://matplotlib.org/stable/contents.html)

**IMPORTANT:** Read the text carefully before starting the work. In
many cases it is necessary to do some preparations before you start the work
on the computer. Read the necessary theory and answer the theoretical part
first. The theoretical and experimental parts should be solved individually.
The notebook must be approved by the lecturer or his assistant.

**Approval:**
<div class="alert alert-block alert-success">
The current notebook should be submitted on CANVAS as a single pdf file. 
</div>

<div class="alert alert-block alert-info">
    To export the notebook in PDF format, go to File -> Download as -> PDF via LaTeX (.pdf).
</div>

**Note regarding the notebook**: The theoretical questions can be answered directly in the notebook using a *Markdown* cell and LaTeX commands (if relevant). Alternatively, you can attach a scan (or an image) of the answer directly in the cell.

Possible ways to insert an image in the markdown cell:

`![image name]("image_path")`

`<img src="image_path" alt="Alt text" title="Title text" />`


**Below you will find parts of the solution that are already programmed.**

<div class="alert alert-block alert-info">
    <p>You have to fill out code everywhere it is indicated with `...`</p>
    <p>The code section under `######## a)` is answering subproblem a) etc.</p>
</div>


## Problem 1

**a)** What is the meaning of the abbreviation PSF? What does the PSF specify?

### Point Spread Function (PSF)
It is a function that defines how a point object looks when it is processed or visualized by an imaging system. It is also called the impulse response. 
The PSF is a result of diffraction and interference. The degree of spreading (blurring) of the point is a measure of the quality of an imaging system.

## Alternative answer
The point spread function (PSF) describes the response of an imaging
system to a point source. This is more generally known as the system's impulse response. The point
spread function (PSF) specifies the shape that a point will take on the image plane.

The term “impulse response” is more common in signal processing, where the system is a filter (rather
than an imaging system).
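
The idea that a point source is imaged as the PSF can be illustrated with a minimal sketch. The Gaussian shape, the kernel size and the helper names `gaussian_psf` and `image_point_source` are assumptions made for illustration only; a real PSF depends on the optics.

```python
import numpy as np

def gaussian_psf(size=7, sigma=1.0):
    """Hypothetical PSF: a normalized 2D Gaussian (an assumption, not a measured PSF)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return psf / psf.sum()

def image_point_source(shape, position, psf):
    """Image of a single unit point source: the PSF centered at the point's position."""
    out = np.zeros(shape)
    half = psf.shape[0] // 2
    r, c = position
    # Assumes the point is far enough from the border for the whole PSF to fit
    out[r - half:r + half + 1, c - half:c + half + 1] += psf
    return out

psf = gaussian_psf(size=7, sigma=1.0)
blurred = image_point_source((21, 21), (10, 10), psf)

# The point is spread over a 7x7 neighborhood, but its total energy is preserved
print(blurred.sum())
```

A larger `sigma` spreads the point more, which corresponds to a blurrier imaging system.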

**b)** Use the imaging model shown in Figure 1. The camera has a lens with focal length $f = 40\text{mm}$ and in the image plane a CCD sensor of size $8\text{mm} \times 8\text{mm}$. The total number of pixels is $4000 \times 4000$. How many lines per mm will this camera resolve at a distance of $z_w = 1\text{m}$ from the camera center?

<img src="./images/perspectiveProjection.jpg" alt="Alt text" title="Title text" />

**Figure 1**: Perspective projection caused by a pinhole camera. Figure 2.23 in [2].



### CCD Sensor (Charge-coupled device)
The sensor is assumed to have an array of photodiodes spaced 
$$d=\frac{8\text{mm}}{4000\text{ pixels}}=2 \mu\text{m/pixel (line)}$$

$$\frac{D}{1\text{m}}=\frac{d}{f}=\frac{d}{40\text{mm}}$$

$$D=\frac{2 \cdot 10^{-6}}{40 \cdot 10^{-3}} \cdot 1\text{m} = \frac{1}{20}\text{mm/pixel (line)}$$

**20 lines per mm**
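
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that follows the same similar-triangles reasoning; the variable names are chosen here for illustration.

```python
# Parameters from the problem statement, all lengths in millimeters
f_mm = 40.0          # focal length
sensor_mm = 8.0      # sensor side length
n_pixels = 4000      # pixels along one side
z_w_mm = 1000.0      # distance from the camera center to the scene

# Pixel pitch on the sensor: d = 8 mm / 4000 pixels
pixel_pitch_mm = sensor_mm / n_pixels

# Similar triangles: D / z_w = d / f
scene_per_pixel_mm = pixel_pitch_mm * z_w_mm / f_mm

# One pixel per line, as assumed in the answer above
lines_per_mm = 1.0 / scene_per_pixel_mm
print(lines_per_mm)
```

The printed value is approximately 20, in agreement with the hand calculation.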

**c)** Explain how a Bayer filter works. What is the alternative to using this type of filter in image acquisition?

### Bayer filter
A Bayer filter is an array of color filters arranged over a square grid of photosensors. The filter pattern is half green, one quarter red and one quarter blue; depending on the ordering it is called BGGR, RGGB, GRBG or GBRG, always with a doubled green. It mimics the physiology of the human eye, whose combination of type M and L cone cells makes daylight vision most sensitive to green light.

The full color information for each pixel is calculated by debayering algorithms, which estimate the missing color values by interpolating from neighboring pixels. This involves image processing that is implemented on the same chip as the image
sensor. This process, though, can lead to a false representation of the image.

An alternative to this architecture is found in the CMY color combination (cyan, magenta, yellow), a set of complementary colors that have an improved light absorption characteristic.
Other color filter patterns are used by different camera manufacturers, such as:
* Fujifilm "EXR" color filter array
* Fujifilm "X-trans" filter
* Quad Bayer

## Proposed solution
A Bayer filter blocks all but the green light for alternating pixels throughout the sensor in a checkerboard pattern. Green is then sensed by half the pixels, with the remaining
pixels sensing blue or red in alternating rows. The alternative to using this type of filter is to use
three sensors to capture the image, one for each color channel.
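
The sampling described above can be sketched in a few lines of NumPy. The RGGB layout and the function name `bayer_mosaic_rggb` are assumptions for illustration; actual sensors may use a different ordering of the same pattern.

```python
import numpy as np

def bayer_mosaic_rggb(rgb):
    """Sample an HxWx3 RGB image through an RGGB Bayer pattern (assumed layout)."""
    h, w, _ = rgb.shape
    mosaic = np.zeros((h, w), dtype=rgb.dtype)
    mosaic[0::2, 0::2] = rgb[0::2, 0::2, 0]  # red:   even rows, even columns
    mosaic[0::2, 1::2] = rgb[0::2, 1::2, 1]  # green: even rows, odd columns
    mosaic[1::2, 0::2] = rgb[1::2, 0::2, 1]  # green: odd rows, even columns
    mosaic[1::2, 1::2] = rgb[1::2, 1::2, 2]  # blue:  odd rows, odd columns
    return mosaic

# Synthetic test image with constant channels R=10, G=20, B=30
rgb = np.zeros((4, 4, 3), dtype=np.uint8)
rgb[..., 0] = 10
rgb[..., 1] = 20
rgb[..., 2] = 30

mosaic = bayer_mosaic_rggb(rgb)
print(mosaic)
```

Each sensor pixel keeps a single color value; half of them are green and form a checkerboard. A debayering step would then interpolate the two missing colors at every pixel.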



## Problem 2

Assume we have captured an image with a digital camera. The image covers an area in the scene of size $1.024\text{m} \times 0.768\text{m}$ (the camera has been pointed towards a wall such that the distance is approximately constant over the whole image plane, *weak perspective*). The camera has 2048 pixels horizontally and 1536 pixels vertically. The active region on the CCD-chip is $10\text{mm} \times 7.5\text{mm}$. We define the spatial coordinates $(x_w,y_w)$ such that the origin is on the optical axis, with the x-axis horizontal and the y-axis vertical, pointing upwards. The image indexes $(x,y)$ start in the upper left corner. For simplicity let the optical axis meet the image plane at $(x_{0}=1024,y_{0}=768)$. The solutions to this problem can be found from simple geometric considerations. Make a sketch of the situation and answer the following questions:

**a)** What is the size of each sensor (one pixel) on the CCD-chip?

**b)** What is the scaling coefficient between the image plane (CCD-chip) and the scene? What is the scaling coefficient between the scene coordinates and the image indexes?


### Sketch
![Sketch Image](./images/sketch.png)

### a) What is the size of each sensor (one pixel) on the CCD-Chip?
$$Scene_{Size}=1024\text{mm} \times 768\text{mm}$$
$$Camera_{resolution}=2048 \times 1536 \text{ pixels}$$
$$CCD_{ActiveRegion}=10\text{mm} \times 7.5\text{mm}$$
$$Pixel_{sizeX}=\frac{10\text{mm}}{2048\text{ pixels}}\approx 0.00488\text{mm}=4.88 \mu\text{m}$$
$$Pixel_{sizeY}=\frac{7.5\text{mm}}{1536\text{ pixels}}\approx 0.00488\text{mm}=4.88 \mu\text{m}$$
$$Pixel_{Size}\approx 4.88 \mu\text{m} \times 4.88 \mu\text{m}$$

### b) What is the scaling coefficient between the image plane (CCD-Chip) and the scene?
$$C_x=\frac{10\text{mm}}{1024\text{mm}}\approx 0.00977$$  
$$C_y=\frac{7.5\text{mm}}{768\text{mm}}\approx 0.00977$$
**The coefficient is about 0.0098 to 1, i.e. the image on the CCD-Chip is 102.4 times smaller than the original scene**

### What is the scaling coefficient between the scene coordinates and the image pixels?
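$$C_x=\frac{1024\text{mm}}{2048\text{ pixels}}=0.5\text{ mm/pixel}$$  
$$C_y=\frac{768\text{mm}}{1536\text{ pixels}}=0.5\text{ mm/pixel}$$
**The coefficient is 0.5 mm of the scene per pixel, i.e. 2 pixels per mm (2000 pixels per meter)**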
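
The quantities asked for in Problem 2 can also be computed with NumPy. This is a minimal sketch with variable names chosen for illustration; it only restates the geometric relations from the problem text.

```python
import numpy as np

# Parameters from the problem statement (width, height)
scene_mm = np.array([1024.0, 768.0])   # scene area covered by the image
sensor_mm = np.array([10.0, 7.5])      # active region of the CCD chip
n_pixels = np.array([2048, 1536])      # number of pixels

# a) Size of one sensor element, in micrometers
pixel_size_um = sensor_mm / n_pixels * 1000

# b) Scaling from the scene to the image plane (dimensionless)
sensor_to_scene = sensor_mm / scene_mm

# b) Scaling between scene coordinates and image indexes
scene_mm_per_pixel = scene_mm / n_pixels
pixels_per_m = n_pixels / (scene_mm / 1000)

print(pixel_size_um, sensor_to_scene, scene_mm_per_pixel, pixels_per_m)
```

Both axes give the same values, so the pixels are square and the scaling is isotropic.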


## Problem 3

The mapping from the scene to a camera sensor can be described by a transformation matrix, $T$. 

\begin{equation}
	\begin{bmatrix} x\\y\\1\end{bmatrix} = 
	T
	\begin{bmatrix}
		x_w\\ y_w\\ 1
	\end{bmatrix}\\
\end{equation}
where
\begin{equation}
	T= \begin{bmatrix} \alpha_x & 0 & x_0\\
			0 & \alpha_y & y_0\\
		0   & 0 & 1
	\end{bmatrix}
\end{equation}
$\alpha_x$ and $\alpha_y$ are the scaling factors for their corresponding axes.

Write a function in Python that computes the image points using the transformation matrix, using the parameters from Problem 2. Let the input to the function be a set of $K$ scene points, given by a $2 \times K$ matrix, and the output the resulting image points also given by a $2 \times K$ matrix. The parameters defining the image sensor and field of view from the camera center to the wall can also be given as input parameters.

Test the function for the following input points given as a matrix:
\begin{equation}\label{cam-eq4}
    {\mathbf P}_{in} = \begin{bmatrix} 0.512 & -0.512 & -0.512 & 0.512 & 0 & 0.3 & 0.3 & 0.3 & 0.6 \\
    0.384 & 0.384 & -0.384 & -0.384 & 0 & 0.2 & -0.2 & -0.4 & 0 \end{bmatrix}.
\end{equation}

<div class="alert alert-block alert-info">
Comment on the results, especially notice the two last points!
</div>


In [2]:
# Import the packages that are useful inside the definition of the weakPerspective function
import math 
import numpy as np
import matplotlib.pyplot as plt

In [47]:
def weakPerspective(FOV, sensorsize, n_pixels, p_scene):
    """
    Function that takes in input:
    - FOV: field of view, i.e. the size of the scene area covered by the image (in meters),
    - sensorsize: size of the sensor (in millimeters),
    - n_pixels: camera pixels,
    - p_scene: K input points in meters (2xK matrix)

    and returns the resulting image points as a 2xK matrix.
    """
    # Scene -> sensor (mm per m), then sensor -> image indexes (pixels per mm)
    scale_coefficient_scene = sensorsize / FOV
    scale_coefficient_sensor = n_pixels / sensorsize
    scale = scale_coefficient_scene * scale_coefficient_sensor  # pixels per m

    # The optical axis meets the image plane at the image center (x0, y0)
    x0, y0 = n_pixels / 2

    # Homogeneous coordinates: append a row of ones to the scene points
    p_hom = np.vstack([p_scene, np.ones(p_scene.shape[1])])

    # y_w points upwards while the image index y grows downwards, hence -scale[1]
    T = np.array([[scale[0], 0, x0], [0, -scale[1], y0], [0, 0, 1]])

    # Keep the two image coordinates only; adding 0.0 turns a rounded -0.0 into 0.0
    result = np.round(T @ p_hom, 2)[:2] + 0.0
    # astype converts the values (assigning to .dtype would reinterpret the raw bytes)
    return result.astype('float32')


In [48]:
# The above function is then called using the following parameters:

# Parameters
FOV = np.array([1.024, 0.768])  # scene area covered by the image, in meters
sensorsize = np.array([10, 7.5]) # CCD sensor size, in millimeters
n_pixels = np.array([2048, 1536])
p_scene_x = [0.512, -0.512, -0.512, 0.512, 0, 0.3, 0.3, 0.3, 0.6]
p_scene_y = [0.384, 0.384, -0.384, -0.384, 0, 0.2, -0.2, -0.4, 0]

In [49]:
####
# This cell is locked; it can only be executed to see the results. 
####
# Input data:
p_scene = np.array([p_scene_x, p_scene_y])

# Call to the weakPerspective() function 
pimage = weakPerspective(FOV, sensorsize, n_pixels, p_scene)

# Result: 
print(pimage)

[[2048.    0.    0. 2048. 1024. 1624. 1624. 1624. 2224.]
 [   0.    0. 1536. 1536.  768.  368. 1168. 1568.  768.]]
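
To comment on the two last points of ${\mathbf P}_{in}$, it helps to check which projected points fall inside the image. The following is a minimal, self-contained sketch. It assumes $\alpha_x = 2000$ and $\alpha_y = -2000$ pixels per meter (the scaling from Problem 2, with a sign flip because the image index $y$ grows downwards) and $(x_0, y_0) = (1024, 768)$; the tolerance and the convention that points on the border count as inside are also assumptions.

```python
import numpy as np

# Assumed parameters, derived from Problem 2
alpha = 2000.0               # pixels per meter of the scene
x0, y0 = 1024.0, 768.0       # image center (optical axis)
width, height = 2048, 1536   # image size in pixels

# Transformation matrix; -alpha on the y-axis because image y grows downwards
T = np.array([[alpha, 0.0, x0],
              [0.0, -alpha, y0],
              [0.0, 0.0, 1.0]])

p_scene = np.array([[0.512, -0.512, -0.512, 0.512, 0, 0.3, 0.3, 0.3, 0.6],
                    [0.384, 0.384, -0.384, -0.384, 0, 0.2, -0.2, -0.4, 0]])

# Homogeneous coordinates, projection, and back to 2xK
p_hom = np.vstack([p_scene, np.ones(p_scene.shape[1])])
x, y = (T @ p_hom)[:2]

# Points on the image border count as inside; tol absorbs floating-point error
tol = 1e-6
inside = (x >= -tol) & (x <= width + tol) & (y >= -tol) & (y <= height + tol)
print(inside)
```

Under these assumptions the first four points land on the image corners and the fifth on the image center, while the two last points fall outside the image: $(0.3, -0.4)$ lies below the covered scene area and $(0.6, 0)$ lies to the right of it, so the camera does not capture them.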




### Delivery (deadline) on CANVAS: 12-09-2021 at 23:59


## Contact
### Course teacher
Professor Kjersti Engan, room E-431

E-mail: kjersti.engan@uis.no

### Teaching assistant
Tomasetti Luca, room E-401

E-mail: luca.tomasetti@uis.no


## References

[1] S. Birchfield, Image Processing and Analysis. Cengage Learning, 2016.

[2] I. Austvoll, "Machine/robot vision part I," University of Stavanger, 2018. Compendium, CANVAS.