## Exercise 3.1 (4 Points)

Please find the 3 sets of images provided on the Moodle page of this week's exercise. The three image sets (A, B, and C) contain pictures of a camera calibration checkerboard pattern with a square size of 3 cm.

1) Write a function using built-in OpenCV methods that computes the intrinsic camera calibration matrix and distortion matrix from a given set of calibration images.

2) Apply that function to all three sets of images and observe the results. Based on the results for the intrinsic and distortion matrix, discuss what type of camera or lens was used to capture the different image sets.

In [1]:
import cv2
import numpy as np
import os
import glob
from matplotlib import pyplot as plt

np.set_printoptions(suppress=True)


def calibrate(img_path):
    matrix = None
    distortion = None

    nx, ny = 8, 6

    objp = np.zeros((nx * ny, 3), np.float32)
    objp[:,:2] = np.mgrid[0:nx, 0:ny].T.reshape(-1,2)
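    # objp *= 30  # optional: square size in mm (3 cm); scaling only changes the extrinsics (tvecs), not the intrinsics or distortion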
    
    # termination criteria
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

    objpoints = [] # 3d point in real world space
    imgpoints = [] # 2d points in image plane.

    images = glob.glob("{}/*.jpeg".format(img_path))

    for filename in images:
        image = cv2.imread(filename)
        grayColor = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        
        # Find the chess board corners
        ret, corners = cv2.findChessboardCorners(grayColor, (nx, ny), None)
    
        # If found, add object points, image points (after refining them)
        if ret is True:
            objpoints.append(objp)
    
            corners2 = cv2.cornerSubPix(grayColor,corners, (11,11), (-1,-1), criteria)
            imgpoints.append(corners2)

    ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, grayColor.shape[::-1], None, None)
    matrix, distortion = mtx, dist

    return matrix, distortion


print("Images A:")
matrix, distortion = calibrate("images_A")
print(matrix)
print(distortion)
print()

print("Images B:")
matrix, distortion = calibrate("images_B")
print(matrix)
print(distortion)
print()

print("Images C:")
matrix, distortion = calibrate("images_C")
print(matrix)
print(distortion)

Images A:
[[3440.39401167    0.         1606.30026185]
 [   0.         3438.45441335 1924.66106917]
 [   0.            0.            1.        ]]
[[ 0.20366693 -0.89153343 -0.00669965  0.0097696   2.01645186]]

Images B:
[[1707.80289324    0.         1477.7231122 ]
 [   0.         1704.50604179 1989.87378206]
 [   0.            0.            1.        ]]
[[-0.00571021  0.01908916 -0.00171764 -0.00362863 -0.00341247]]

Images C:
[[7567.61688346    0.         1286.82390667]
 [   0.         7594.36409244 2015.22836071]
 [   0.            0.            1.        ]]
[[ -0.17233294   9.05892372  -0.01414267  -0.00978813 -61.28909561]]


### Answer for question 2

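From the distortion coefficients (OpenCV order: k1, k2, p1, p2, k3), we can see that for image set B all values are nearly 0, so the images in set B were most likely taken with a normal camera whose lens introduces almost no distortion. Sets A and C have coefficients that are much larger in magnitude, especially k2 and k3. The focal lengths also differ clearly: about 1706 px for B, 3439 px for A, and 7581 px for C. Since the principal points of all three sets are similar, this suggests that B has the widest field of view and C the narrowest, which would fit a zoom or telephoto lens for set C.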
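
To make the comparison between the three sets more concrete, the printed values can be plugged into the radial part of OpenCV's distortion model and into the pinhole field-of-view formula. This is only a rough sketch: the helper names `radial_factor` and `horizontal_fov_deg` are made up for illustration, and the image width is not known here, so it is approximated as twice the principal point `cx` (an assumption).

```python
import numpy as np

def radial_factor(dist, r):
    """Radial scaling 1 + k1*r^2 + k2*r^4 + k3*r^6 at normalized radius r.

    dist follows OpenCV's 5-coefficient order (k1, k2, p1, p2, k3).
    """
    k1, k2, _p1, _p2, k3 = np.ravel(dist)[:5]
    return 1.0 + k1 * r**2 + k2 * r**4 + k3 * r**6

def horizontal_fov_deg(fx, width):
    """Horizontal field of view in degrees for a focal length fx in pixels."""
    return np.degrees(2.0 * np.arctan(width / (2.0 * fx)))

# Values copied from the calibration output: (fx, cx, distortion coefficients)
results = {
    "A": (3440.39401167, 1606.30026185,
          [0.20366693, -0.89153343, -0.00669965, 0.0097696, 2.01645186]),
    "B": (1707.80289324, 1477.7231122,
          [-0.00571021, 0.01908916, -0.00171764, -0.00362863, -0.00341247]),
    "C": (7567.61688346, 1286.82390667,
          [-0.17233294, 9.05892372, -0.01414267, -0.00978813, -61.28909561]),
}

for name, (fx, cx, dist) in results.items():
    width = 2.0 * cx          # assumption: principal point lies near the image center
    r_edge = cx / fx          # normalized radius at the left/right image border
    print(name,
          "FOV [deg]:", round(float(horizontal_fov_deg(fx, width)), 1),
          "radial factor at border:", round(float(radial_factor(dist, r_edge)), 4))
```

A radial factor close to 1 at the image border means that the fitted model barely moves the pixels there, even if individual coefficients look large.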

In [1]:
import cv2
import numpy as np
import os
import glob
from matplotlib import pyplot as plt

In [11]:
img_path = "images_A"

nx, ny = 8, 6

criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

objp = np.zeros((nx * ny, 3), np.float32)
objp[:,:2] = np.mgrid[0:nx, 0:ny].T.reshape(-1,2)
# objp *= 30


objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.

images = glob.glob("{}/*.jpeg".format(img_path))

for filename in images:
    image = cv2.imread(filename)
    grayColor = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Find the chess board corners
    ret, corners = cv2.findChessboardCorners(grayColor, (nx, ny), None)

    # If found, add object points, image points (after refining them)
    if ret is True:
        objpoints.append(objp)

        corners2 = cv2.cornerSubPix(grayColor,corners, (11,11), (-1,-1), criteria)
        imgpoints.append(corners2)

ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, grayColor.shape[::-1], None, None)

print(filename)

image = cv2.imread(filename)
h,  w = image.shape[:2]
newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w,h), 1, (w,h))
mapx, mapy = cv2.initUndistortRectifyMap(mtx, dist, None, newcameramtx, (w,h), 5)
dst = cv2.remap(image, mapx, mapy, cv2.INTER_LINEAR)
 
# crop the image
x, y, w, h = roi
dst = dst[y:y+h, x:x+w]

# plt.imshow(dst)
cv2.imwrite('calibresult.png', dst)

images_A/0FD77AFE-FDC1-4E5E-AD57-86B8B1B29738.jpeg


True

## Exercise 3.2 (4 Points)

1) Explain how the Hough Transform works, focusing on its application to line detection in images. Discuss why the Hough Transform is particularly useful for detecting lines in noisy images or images with missing data. Additionally, explain the limitations of the Hough Transform and potential methods to overcome these limitations.

2) We provide example images from the Berkeley Segmentation Dataset. Implement the Hough Transform for line detection yourself (you may use the OpenCV function only to check whether your solution is correct). Instead of using the default parameters, customize the parameters (e.g., resolution of the parameter space, threshold values for line detection, minimum line length, maximum line gap) to optimize edge detection for the given set of images. Report your observations: which parameters influenced which behaviour in the output? Also report which parameter configuration most closely resembles an object segmentation in the test images.

### Answer for question 1

The Hough Transform uses a voting mechanism in parameter space to detect lines from edge pixels. Each line is described in normal form, `rho = x * cos(theta) + y * sin(theta)`, and the (`theta`, `rho`) space is discretized into an accumulator. Every edge pixel votes for all accumulator cells whose line passes through it, and cells that collect many votes correspond to lines in the image.
If an edge pixel is indeed part of a line, its vote increases the count of that line's cell. Noise, on the other hand, does not consistently contribute to any particular line, so cells hit only by noise rarely collect enough votes to be reported. That is why the Hough Transform is robust to noise. In the occlusion case, even if part of a real line is hidden, the remaining parts still vote for the same cell, so the line can still be detected.

One limitation of the Hough Transform is its running time: in the standard version every edge pixel votes for every angle bin, so the cost grows with the number of edge pixels times the number of angle bins. The Probabilistic Hough Transform mitigates this problem by letting only a random subset of the edge pixels vote.
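
The voting idea can be illustrated with a small toy example. This is a minimal sketch, not the solution to part 2: the function `hough_votes`, the point coordinates, and the bin sizes are made up for illustration. A horizontal line `y = 5` with a gap (simulating occlusion) and a few noise points vote in a 1-degree, 1-pixel accumulator.

```python
import numpy as np

def hough_votes(points, n_theta=180, rho_max=150):
    """Accumulate votes for rho = x*cos(theta) + y*sin(theta), theta in 1-degree steps."""
    thetas = np.deg2rad(np.arange(n_theta))
    acc = np.zeros((n_theta, 2 * rho_max + 1), dtype=np.int64)
    for x, y in points:
        # One vote per angle; shift by rho_max so negative rho gets a valid index
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[np.arange(n_theta), rhos + rho_max] += 1
    return acc

# Horizontal line y = 5 with a missing stretch (x = 40..59), plus three noise points
line_points = [(x, 5) for x in range(100) if not 40 <= x < 60]
noise_points = [(3, 77), (52, 31), (88, 64)]

acc = hough_votes(line_points + noise_points)
i_theta, i_rho = np.unravel_index(np.argmax(acc), acc.shape)
best_theta_deg = int(i_theta)   # angle bins are one degree wide
best_rho = int(i_rho) - 150     # undo the rho_max shift

print(best_theta_deg, best_rho, acc.max())
```

All points on the line, on both sides of the gap, land in the same cell, while each noise point spreads its votes over cells that no other point supports.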

In [73]:
#2)
from pathlib import Path
import numpy as np
import matplotlib.pyplot as plt
import cv2

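def apply_hough_transform(image, rho=1, theta=np.pi/180, threshold=140):
    """Return (theta, rho) pairs of all accumulator cells with at least `threshold` votes."""
    # cv2.imread returns images in BGR channel order
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    edges = cv2.Canny(gray, 50, 150)
    ys, xs = np.nonzero(edges)

    h, w = gray.shape
    diag = np.sqrt(h**2 + w**2)

    # rho lies in [-diag, diag], theta in [0, pi)
    N_rho = int(np.ceil(2 * diag / rho)) + 1
    thetas = np.arange(0, np.pi, theta)

    H = np.zeros((len(thetas), N_rho), dtype=np.int64)

    for i_theta, angle in enumerate(thetas):
        # Use the actual angle (not the bin index) to compute the signed distance
        # of every edge pixel, then shift by diag so negative rho maps to a valid bin
        rhos = xs * np.cos(angle) + ys * np.sin(angle)
        i_rho = np.round((rhos + diag) / rho).astype(np.int64)
        H[i_theta] = np.bincount(i_rho, minlength=N_rho)

    # Convert accumulator indices back to (theta, rho) values
    i_thetas, i_rhos = np.nonzero(H >= threshold)
    lines = np.column_stack((thetas[i_thetas], i_rhos * rho - diag))

    return lines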


RHO = 1
THETA = np.pi / 150.
threshold = 120

images = Path("segmentation_images").glob("*.jpg")
dir_out = Path("line_detections")
dir_out.mkdir(exist_ok=True, parents=True)

for filename in images:
    print(filename.stem)

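    # The second argument of cv2.imread is a read flag, not a color-conversion code;
    # the default loads a 3-channel BGR image
    image = cv2.imread(str(filename))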

    lines = apply_hough_transform(image, RHO, THETA, threshold)

    # Draw lines
    for (theta, rho) in lines:
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a * rho
        y0 = b * rho
        pt1 = (int(x0 + 1000*(-b)), int(y0 + 1000*(a)))
        pt2 = (int(x0 - 1000*(-b)), int(y0 - 1000*(a)))

        cv2.line(image, pt1, pt2, (0,0,255), 3, cv2.LINE_AA)

    path_out = dir_out / f"{filename.stem}.png"
    cv2.imwrite(path_out.as_posix(), image)


21077
38082
69020
37073
33039
41069


### Answer for question 2

Among the three parameters:
- `threshold` is the minimum number of votes a cell needs in order to be reported as a line. A higher `threshold` therefore means fewer detected lines.
- `rho` and `theta` set the resolution of the accumulator. Larger values mean coarser bins, so the detected lines are less accurate.
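
The effect of `rho` and `theta` on the resolution can be made explicit by counting the accumulator bins. The following is a small sketch under assumptions: `accumulator_shape` is a hypothetical helper, the image size 321 x 481 is only an example value, and the binning (theta in [0, pi), rho in [-diag, diag]) is one common convention rather than the only possible one.

```python
import numpy as np

def accumulator_shape(height, width, rho=1.0, theta=np.pi / 180):
    """Number of (theta, rho) bins for theta in [0, pi) and rho in [-diag, diag]."""
    diag = np.hypot(height, width)
    n_theta = int(round(np.pi / theta))
    n_rho = int(np.ceil(2 * diag / rho)) + 1
    return n_theta, n_rho

# Example image size (assumed for illustration)
fine = accumulator_shape(321, 481, rho=1, theta=np.pi / 180)
coarse = accumulator_shape(321, 481, rho=4, theta=np.pi / 45)

print("fine bins:  ", fine)
print("coarse bins:", coarse)
```

Fewer bins make voting cheaper and let more pixels fall into the same cell, but every reported line is then only known up to the width of its bin.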