# 3D Computer Vision (2023/24)

## Exercise 1

Submitted by Group xx: 
- Name1
- Name2
- Name3
- Name4

Upload: 08.11.2023 (11:30)

**Deadline**: 21.11.2023 (23:59)

Please hand in a single **.zip** file named according to the pattern "**groupXX_exerciseX**" (e.g. group00_exercise9). The contents of the .zip should be as follows:
- folder named according to the pattern groupXX_exerciseX
    - **.ipynb** file
    - **.html** export of .ipynb with all the outputs you got
    - **data** folder containing necessary files to run the code

I.e.
1. **unzip** the provided exerciseX.zip file
2. **rename** folder "exerciseX" according to the pattern "groupXX_exerciseX"
3. **solve** tasks inside .ipynb file
4. **export** notebook as .html (File > Download as > HTML)
5. **zip** folder groupXX_exerciseX
6. **submit** groupXX_exerciseX.zip

## Theory

### T1. Camera with lenses

#### (a)
Given a camera with a distance $L$ between the film and the lens, derive the mathematical relationship (formula) between the height $H_o$ of the object in front of the camera and the height $H_i$ of its image. Additionally, explain the intuition behind the relationship. Assume a thin lens.

#### Solution
- Similar triangles (the ray through the lens center is not deflected), with $D$ the distance between the object and the lens:
$\frac{y'}{D'} =\frac{y}{D}$ $\rightarrow$ $\frac{H_i}{L} =\frac{H_o}{D}$
- Therefore $\frac{H_i}{H_o} =\frac{L}{D}$
- Interpretation:
    - For a fixed distance $D$ to the lens, a bigger object ($H_o$) has a bigger image ($H_i$)
    - (OR) The farther the object of height $H_o$ is from the lens (larger $D$), the smaller its image $H_i$
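
As a quick numeric check of the relation $\frac{H_i}{H_o} = \frac{L}{D}$, here is a minimal sketch. The function name `image_height` and the example values are illustrative assumptions, not part of the exercise.

```python
def image_height(H_o, D, L):
    """Image height from similar triangles: H_i = H_o * L / D."""
    if D <= 0:
        raise ValueError("object distance D must be positive")
    return H_o * L / D

# Hypothetical values: a 2 m tall object, 0.05 m between lens and film
near = image_height(H_o=2.0, D=4.0, L=0.05)
far = image_height(H_o=2.0, D=8.0, L=0.05)
print(near, far)
```

Doubling the object distance $D$ halves the image height, which matches the second interpretation above.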

### T2. Rotation 
An object is rotated around the x-axis by $90°$, then around the y-axis by $270°$, and finally around the z-axis by $180°$.

#### (a)
Derive the 3D rotation matrix that executes the same transformation.<br/>
*Hint: the given values lead to 'nice' numbers.*

#### Your answer goes here

#### (b)
Assume the object is a sphere with a radius of 3.5. Explain how the radius will change after the transformation. 

#### Your answer goes here

#### (c)
What is the rotation matrix that transforms the object back to its original orientation?<br/>
*Hint: this should be very short*

#### Your answer goes here
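One way to check the answers to T2 numerically is sketched below. It assumes column vectors and rotations about the fixed coordinate axes, so the first rotation is the rightmost factor; the helper names `rot_x`, `rot_y`, `rot_z` are hypothetical. With a different convention the order of the factors changes.

```python
import numpy as np

def rot_x(deg):
    """Rotation about the x-axis by an angle in degrees."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(deg):
    """Rotation about the y-axis by an angle in degrees."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(deg):
    """Rotation about the z-axis by an angle in degrees."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# (a) x first, then y, then z: the first rotation is applied first, so it is rightmost
R = rot_z(180) @ rot_y(270) @ rot_x(90)
R = np.round(R).astype(int)  # removes floating-point noise; entries are 0 or +-1
print(R)

# (b) a rotation preserves lengths, so a point on the sphere keeps its distance to the center
p = np.array([3.5, 0.0, 0.0])
radius_after = np.linalg.norm(R @ p)

# (c) the inverse of a rotation matrix is its transpose
R_back = R.T
print(R_back @ R)
```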

### T3. Transformation Chain

#### (a)
Why are homogeneous coordinates used for transforming points between coordinate systems?

#### Your answer goes here
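To illustrate the point of the question: with homogeneous coordinates, a rotation plus a translation becomes a single matrix, and chaining coordinate systems becomes a matrix product. The sketch below uses hypothetical helper names (`to_homogeneous`, `rigid_transform`) and made-up transforms between three frames a, b, and c.

```python
import numpy as np

def to_homogeneous(points):
    """Append a 1 to each row: [N, 3] -> [N, 4]."""
    return np.hstack([points, np.ones((points.shape[0], 1))])

def rigid_transform(R, t):
    """Build the 4x4 matrix [[R, t], [0, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.ravel(t)
    return T

# Hypothetical transforms: a -> b is a pure translation, b -> c rotates by 90 deg about z and translates
t_ab = np.array([1.0, 0.0, 0.0])
R_bc = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t_bc = np.array([0.0, 2.0, 0.0])

T_ab = rigid_transform(np.eye(3), t_ab)
T_bc = rigid_transform(R_bc, t_bc)
T_ac = T_bc @ T_ab  # the whole chain collapses into one matrix

p_a = np.array([[1.0, 1.0, 1.0]])
p_c = (T_ac @ to_homogeneous(p_a).T).T[:, :3]

# The same mapping without homogeneous coordinates needs separate rotate-and-add steps
p_c_manual = (R_bc @ (p_a + t_ab).T).T + t_bc
print(p_c, p_c_manual)
```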

#### (b)
Describe the transformation chain for mapping a point from the world coordinate system to the pixel coordinate system of an intrinsically and extrinsically calibrated camera. Use formulas and explain the intermediate steps in words.

#### Your answer goes here
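One common way to write this chain is sketched below. The symbols ($\mathbf{X}_w$, $\mathbf{X}_c$, $x_n$, $y_n$, $u$, $v$, $f_x$, $f_y$, $s$, $c_x$, $c_y$) follow the usual pinhole-camera convention and are assumptions, since the exercise does not fix a notation.

```latex
\begin{aligned}
\mathbf{X}_c &= R\,\mathbf{X}_w + \mathbf{t}
  && \text{extrinsics: world} \rightarrow \text{camera coordinates} \\
\begin{pmatrix} x_n \\ y_n \end{pmatrix} &= \frac{1}{Z_c} \begin{pmatrix} X_c \\ Y_c \end{pmatrix}
  && \text{perspective division: normalized image coordinates} \\
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix} &= K \begin{pmatrix} x_n \\ y_n \\ 1 \end{pmatrix},
  \quad K = \begin{pmatrix} f_x & s & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}
  && \text{intrinsics: normalized} \rightarrow \text{pixel coordinates}
\end{aligned}
```

In homogeneous coordinates the three steps combine into $\lambda\,(u, v, 1)^T = K\,[R \mid \mathbf{t}]\,(X_w, Y_w, Z_w, 1)^T$ with $\lambda = Z_c$.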

#### (c)
Describe the steps for modelling distortion. 

#### Your answer goes here
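A possible formulation of the radial part is sketched below. It assumes the coefficient ordering implied by task I1 (b), where $k_1$, $k_2$, and $k_5$ are the radial terms, and it is applied to the normalized image coordinates $(x_n, y_n)$ before the intrinsics; tangential terms are left out.

```latex
\begin{aligned}
r^2 &= x_n^2 + y_n^2 \\
x_d &= x_n \left(1 + k_1 r^2 + k_2 r^4 + k_5 r^6\right) \\
y_d &= y_n \left(1 + k_1 r^2 + k_2 r^4 + k_5 r^6\right) \\
u &= f_x\, x_d + c_x, \qquad v = f_y\, y_d + c_y
\end{aligned}
```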

## Implementation

### I1. Distortion Modelling
The **./data/** directory contains images of a chessboard that were used for calibrating a camera with high radial distortion. The results of the calibration (intrinsics of the camera and extrinsics for each board) are stored in **./data/calib.mat**.

In [6]:
'''You can add all your imports here'''
import os

import numpy as np
import cv2 as cv
import scipy.io as io
from PIL import Image
import matplotlib.pyplot as plt

# scale_factor = 0.5

# I = cv.imread("data/00000.jpg", cv.IMREAD_GRAYSCALE)
# I_resized = cv.resize(I, (0, 0), fx=scale_factor, fy=scale_factor)

# plt.imshow(I_resized)

# num_columns = 9
# num_rows = 6

# board_size = [num_rows-1, num_columns-1]

# # Flags to findChessboardCorners that improve performance
# detect_flags = cv.CALIB_CB_ADAPTIVE_THRESH + cv.CALIB_CB_NORMALIZE_IMAGE + cv.CALIB_CB_FAST_CHECK

# ok, u = cv.findChessboardCorners(I, (board_size[0],board_size[1]), detect_flags)


#### (a)
Write a function that does the following:
- Draw any number of 2D pixel points onto an image

The inputs should be:
- A single image
- An Array of 2D pixel points
- Color of the points to be drawn

The output should be:
- An image with points drawn onto it

In [147]:
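def draw_points(image, points, color):
    '''Draws 2D pixel points as small circles onto an image and returns it.

    Note: cv.circle draws in place, so pass a copy to keep the original image.
    '''
    for point in points:
        # cv.circle expects integer pixel coordinates as (x, y)
        center = (int(round(point[0])), int(round(point[1])))
        image = cv.circle(image, center, 3, color)
    return image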


# I = cv.imread("data/00000.jpg", cv.IMREAD_COLOR)
# draw_points(I, [(5, 5), (10, 10)], (0, 0, 255))

# # show image with matplotlib and correct color space
# plt.imshow(cv.cvtColor(I, cv.COLOR_BGR2RGB))


#### (b)
Write a function **project_points** that does the following:
- Convert 3D world points to 2D image points
- As an option: model radial distortions (using k1, k2, k5)

The inputs should be:
- Array of 3D world points
- Camera intrinsics
- Camera extrinsics
- Distortion coefficients (if needed)

The output should be:
- Array of 2D pixel coordinates

In [309]:
def project_points(points, K, R, t, dist_coef=None):
    '''Projects 3D world points to 2D pixel coordinates.

    points: [N, 3] world points, K: [3, 3] intrinsics, R: [3, 3] rotation,
    t: [3, 1] translation, dist_coef: optional five distortion coefficients
    (only the radial terms k1, k2, k5 are used).
    Returns an [N, 2] array of pixel coordinates.
    '''
    # Extrinsics: world -> camera coordinates, X_c = R X_w + t, shape [3, N]
    X_c = R @ points.T + np.reshape(t, (3, 1))

    # Perspective division gives normalized image coordinates
    x = X_c[0, :] / X_c[2, :]
    y = X_c[1, :] / X_c[2, :]

    if dist_coef is not None:
        # Radial distortion as asked in the task: k1, k2, k5.
        # Assumption: k3 and k4 are the tangential terms and are ignored here.
        k1, k2, _, _, k5 = np.ravel(dist_coef)
        r2 = x**2 + y**2
        radial = 1 + k1 * r2 + k2 * r2**2 + k5 * r2**3
        x = x * radial
        y = y * radial

    # Intrinsics: (distorted) normalized coordinates -> pixel coordinates.
    # This uses the principal point from K instead of the image center.
    X = K @ np.vstack((x, y, np.ones_like(x)))
    return X[:2, :].T
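
A small sanity check for this kind of projection is sketched below. It is self-contained and uses a hypothetical helper `project_sketch` with made-up radial coefficients and an identity pose; the intrinsics are rounded from the calibration output of this notebook. A point on the optical axis should land on the principal point with or without distortion, and a negative $k_1$ should pull off-axis points toward the center.

```python
import numpy as np

def project_sketch(points, K, R, t, radial=None):
    """Pinhole projection of [N, 3] world points; radial = (k1, k2, k5) or None."""
    X_c = R @ points.T + np.reshape(t, (3, 1))
    x, y = X_c[0] / X_c[2], X_c[1] / X_c[2]
    if radial is not None:
        k1, k2, k5 = radial
        r2 = x**2 + y**2
        factor = 1 + k1 * r2 + k2 * r2**2 + k5 * r2**3
        x, y = x * factor, y * factor
    u = K[0, 0] * x + K[0, 1] * y + K[0, 2]
    v = K[1, 1] * y + K[1, 2]
    return np.stack([u, v], axis=1)

# Intrinsics rounded from the calibration output; pose and coefficients are made up
K = np.array([[376.64, 0.0, 320.58], [0.0, 374.72, 247.86], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
pts = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 2.0]])

ideal = project_sketch(pts, K, R, t)
distorted = project_sketch(pts, K, R, t, radial=(-0.3, 0.1, 0.0))
print(ideal)
print(distorted)
```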


#### (c)
Write a function **project_and_draw** that does the following:
- Execute **project_points**
- Execute **draw_points**
- Save the result as an image file

The inputs should be:
- The data that is necessary to run your functions
- Needs to run on **all images** with a single call

In [313]:
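def project_and_draw(imgs, points, color, K, R, t, dist_coef=None):
    '''Projects the 3D points of every view, draws them, and saves one image per view.'''
    # Output folder depends on whether distortion is modelled
    out_dir = "results/no_distortion" if dist_coef is None else "results/distortion"
    os.makedirs(out_dir, exist_ok=True)
    for i in range(len(imgs)):
        projected_points = project_points(points[i], K, R[i], t[i], dist_coef)
        # Draw on a copy so the input image stays untouched
        image = draw_points(imgs[i].copy(), projected_points, color)
        cv.imwrite("{0}/{1}.jpg".format(out_dir, i), image)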


#### (d)
Run your **project_and_draw** function once without and once with distortion modelling. Then display the following:
- Your output for 00000.jpg **without any distortion modelling** in **red**
- Your output for 00000.jpg **with added radial distortion** (k1, k2, and k5) in **green**

In [314]:
os.makedirs('results/no_distortion', exist_ok=True)
os.makedirs('results/distortion', exist_ok=True)

base_folder = './data/'

# Load the data
# There are 25 views (images) and 40 3D points per view
data = io.loadmat('./data/calib.mat')

# 3D points in the world coordinate system
x_3d_w = data['x_3d_w'] # shape=[25, 40, 3]

# Translation vectors: position of the world origin expressed in camera coordinates
t_vecs = data['translation_vecs'] # shape=[25, 3, 1]

# Rotation matrices: converts coordinates from world to camera
rot_mats = data['rot_mats'] # shape=[25, 3, 3]

# Five distortion coefficients
dist_coef = data['distortion_params'] # shape=[5, 1]

# Intrinsic matrix K of the camera
k_matrix = data['k_mat'] # shape=[3, 3]

# Images corresponding to the 3D points
imgs_list = [cv.imread(base_folder+str(i).zfill(5)+'.jpg') for i in range(t_vecs.shape[0])]
imgs = np.asarray(imgs_list) # shape=[25, 480, 640, 3]

# Call project_and_draw twice: once without and once with distortion modelling
project_and_draw(imgs, x_3d_w, (0, 0, 255), k_matrix, rot_mats, t_vecs)
project_and_draw(imgs, x_3d_w, (0, 255, 0), k_matrix, rot_mats, t_vecs, dist_coef)

# Display the results for 00000.jpg (saved as 0.jpg by project_and_draw)
without_distortion = Image.open('results/no_distortion/0.jpg')
display(without_distortion)

with_distortion = Image.open('results/distortion/0.jpg')
display(with_distortion)


K: [[376.64067184   0.         320.57546504]
 [  0.         374.71731472 247.85562599]
 [  0.           0.           1.        ]]
K: [[376.64067184   0.         320.57546504]
 [  0.         374.71731472 247.85562599]
 [ 