Computer Vision (Spring 2025) Homework 1-5 (30 points)
---

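## Problem: Camera Calibration Experiment
- A chessboard plane of size 12x12 with a square side length of 1 unit lies in 3D space. Assume the chessboard lies in the plane `z=0` of the world coordinate system.
- You are also given a virtual camera (an ideal camera whose only parameters are the focal length and the principal point) that captures a `600x600` 2D image of the 3D chessboard. You can move it to image the chessboard plane from different viewpoints (the image is simplified to the chessboard corner points).
- Move the camera yourself to capture the chessboard plane from several different viewpoints, and use the coordinates of these points to calibrate the camera.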

---

### For environment setup, see homework HW1-4

---

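## Camera model and 3D chessboard model

- The code cells below contain a ready-made camera model and 3D chessboard model; you can run them directly to carry out the calibration.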


In [38]:
import numpy as np
import cv2

"""Camera model"""
class CameraModel(object):
    dft_configs = {
        "fx": 300,
        "fy": 300,
        "cx": 300,
        "cy": 300,
        "yaw": 0, # degrees
        "pitch": 0, # degrees
        "roll": 0, # degrees
        "x": 0, # translation
        "y": 0,
        "z": 10,
    }
    def __init__(self, configs={}) -> None:
        self.configs = {**self.dft_configs, **configs}
        self.fx = self.configs["fx"]
        self.fy = self.configs["fy"]
        self.cx = self.configs["cx"]
        self.cy = self.configs["cy"]
        self.K = np.array([[self.fx, 0, self.cx], [0, self.fy, self.cy], [0, 0, 1]])

        self.yaw = self.configs["yaw"]
        self.pitch = self.configs["pitch"]
        self.roll = self.configs["roll"]
        self.R = self._get_rotation_matrix(self.yaw, self.pitch, self.roll)

        self.x = self.configs["x"]
        self.y = self.configs["y"]
        self.z = self.configs["z"]
        self.T = np.array([self.x, self.y, self.z])

    def show_K(self):
        print(f"K: {self.K}")
    
    def show_R(self):
        print(f"R: {self.R}")

    def show_T(self):
        print(f"T: {self.T}")

    def _get_rotation_matrix(self, yaw, pitch, roll):
        yaw = np.deg2rad(yaw)
        pitch = np.deg2rad(pitch)
        roll = np.deg2rad(roll)
        Rx = np.array([[1, 0, 0], [0, np.cos(pitch), -np.sin(pitch)], [0, np.sin(pitch), np.cos(pitch)]])
        Ry = np.array([[np.cos(yaw), 0, np.sin(yaw)], [0, 1, 0], [-np.sin(yaw), 0, np.cos(yaw)]])
        Rz = np.array([[np.cos(roll), -np.sin(roll), 0], [np.sin(roll), np.cos(roll), 0], [0, 0, 1]])
        return Rz @ Ry @ Rx

    def project(self, points3D):
        """ Project 3D points to 2D
        Args:
            points3D: 3D points, numpy array of shape (N, 3)
        """
        points3D = points3D.reshape(-1, 3)
        points3D = points3D.T
        points_ = self.R @ points3D + self.T.reshape(-1, 1)
        points_ = self.K @ points_
        points2D = points_ / points_[2, :]
        return points2D[:2, :].T # (N, 2)
    
    def draw(self, points, out_img_path=None):
        """ Draw 2D points on image
        Args:
            points: 2D points, numpy array of shape (N, 2)
        """
        img_W = 2*self.cx
        img_H = 2*self.cy
        img = np.zeros((img_H, img_W, 3), dtype=np.uint8)

        point_num = points.shape[0]
        outside_img_count = 0
        for point in points:
            x, y = point
            x, y = int(x), int(y)
            if 0 <= x < img_W and 0 <= y < img_H:
                img = cv2.circle(img, (x, y), 5, (0, 255, 0), -1)
            else:
                outside_img_count += 1
                print(f"Point ({x}, {y}) is outside the image")
        
        #print(f"Outside image count: {outside_img_count}, while total points: {point_num}")

        if out_img_path:
            cv2.imwrite(out_img_path, img)
        
        if outside_img_count > 0: return False
        return True

    def move_x_axis_plus(self, step):
        self.x += step
        self.T = np.array([self.x, self.y, self.z])
    
    def move_y_axis_plus(self, step):
        self.y += step
        self.T = np.array([self.x, self.y, self.z])

    def move_z_axis_plus(self, step):
        self.z += step
        self.T = np.array([self.x, self.y, self.z])

    def rotate_yaw_plus(self, degree):
        self.yaw += degree
        self.R = self._get_rotation_matrix(self.yaw, self.pitch, self.roll)
    
    def rotate_pitch_plus(self, degree):
        self.pitch += degree
        self.R = self._get_rotation_matrix(self.yaw, self.pitch, self.roll)
    
    def rotate_roll_plus(self, degree):
        self.roll += degree
        self.R = self._get_rotation_matrix(self.yaw, self.pitch, self.roll)

    def reset(self):
        self.x = self.configs["x"]
        self.y = self.configs["y"]
        self.z = self.configs["z"]
        self.T = np.array([self.x, self.y, self.z])

        self.yaw = self.configs["yaw"]
        self.pitch = self.configs["pitch"]
        self.roll = self.configs["roll"]
        self.R = self._get_rotation_matrix(self.yaw, self.pitch, self.roll)

### How to use the camera model
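
For reference, the projection that `CameraModel.project` computes can be written out as follows. The formulas are read off the class code above; the symbols $s$, $X_w$, $r_1$ and $r_2$ are notation introduced here for illustration only.

$$
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \left( R\,X_w + T \right), \qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}, \qquad
R = R_z(\text{roll})\,R_y(\text{yaw})\,R_x(\text{pitch})
$$

For chessboard corners $X_w = (X, Y, 0)^\top$ on the plane `z=0`, the third column of $R$ drops out and the mapping reduces to a plane-to-image homography, where $r_1$ and $r_2$ are the first two columns of $R$:

$$
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} r_1 & r_2 & T \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}
$$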

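**The main member functions are:**
- `CameraModel.project(points3D)`: projects points in 3D space onto the camera image plane;
    - Input
        - `points3D`: points in 3D space, of type `np.ndarray` and shape `(N,3)`, where N is the number of points.
    - Return value
        - `points2D`: coordinates on the 2D image plane, of type `np.ndarray` and shape `(N,2)`.

- `CameraModel.draw(points2D, out_img_path=None)`: draws the projected points on an image and checks whether they lie inside the image;
    - Input:
        - `points2D`: coordinates on the 2D plane, of type `np.ndarray` and shape `(N,2)`.
        - `out_img_path`: path for saving the image, of type `str`; the default is `None`, in which case no image is saved.
    - Return value
        - `True`: all projected points lie inside the image
        - `False`: at least one projected point lies outside the image

- `CameraModel.move_<axis_name>_plus(step)`: moves the camera along the positive direction of the world coordinate axis `axis_name`;
    - axis_name: `x_axis`, `y_axis`, `z_axis`
    - Input
        - `step`: step size, of type `float` (may be positive or negative).
    - Return value
        - None

- `CameraModel.rotate_<angle_name>_plus(degree)`: changes the camera's rotation angle, relative to the camera's own coordinate frame;
    - In the camera's own frame, the x axis points right, the y axis points down, and the z axis points along the viewing direction.
    - angle_name: name of the rotation angle (because of how the camera frame's axes are named, these differ from the usual Euler-angle definitions)
        - `yaw`: rotation about the y axis
        - `pitch`: rotation about the x axis
        - `roll`: rotation about the z axis
    - Input
        - `degree`: rotation angle, of type `float` (in degrees, may be positive or negative).
    - Return value
        - None

- `CameraModel.reset()`: resets the camera's position and rotation angles;

**Recommended workflow: move the camera first, then project; once you have the projected points, use `draw` to check that they are valid (inside the image), and collect the valid projected point sets for calibration.**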
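
The move, project, check, collect loop described above can be sketched without the notebook classes. This is a minimal NumPy-only stand-in, not the assignment's code: `rotation_matrix`, `project`, `all_inside` and the `poses` list are hypothetical names and values chosen for illustration, written to mirror the behaviour of `CameraModel.project` and the bounds check in `CameraModel.draw`.

```python
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """Rz(roll) @ Ry(yaw) @ Rx(pitch), angles in degrees (same order as CameraModel)."""
    y, p, r = np.deg2rad([yaw, pitch, roll])
    Rx = np.array([[1, 0, 0], [0, np.cos(p), -np.sin(p)], [0, np.sin(p), np.cos(p)]])
    Ry = np.array([[np.cos(y), 0, np.sin(y)], [0, 1, 0], [-np.sin(y), 0, np.cos(y)]])
    Rz = np.array([[np.cos(r), -np.sin(r), 0], [np.sin(r), np.cos(r), 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def project(points3d, K, R, T):
    """Pinhole projection of (N, 3) world points to (N, 2) pixel coordinates."""
    cam = R @ points3d.T + T.reshape(3, 1)
    pix = K @ cam
    return (pix[:2] / pix[2]).T

def all_inside(points2d, width=600, height=600):
    """True if every point, truncated to integer pixels, lies inside the image."""
    xy = points2d.astype(int)
    return bool(np.all((xy[:, 0] >= 0) & (xy[:, 0] < width) &
                       (xy[:, 1] >= 0) & (xy[:, 1] < height)))

K = np.array([[300.0, 0, 300], [0, 300.0, 300], [0, 0, 1]])
grid = np.array([[c, r, 0.0] for r in range(12) for c in range(12)])

# Hypothetical poses as (yaw, pitch, roll, T); the last one is deliberately too close.
poses = [
    (0, 0, 0, (-5.5, -5.5, 10.0)),
    (0, 10, 0, (-5.5, -5.5, 10.0)),
    (0, 0, 0, (-5.5, -5.5, 1.0)),
]

views = []
for yaw, pitch, roll, T in poses:
    pts = project(grid, K, rotation_matrix(yaw, pitch, roll), np.array(T))
    if all_inside(pts):          # keep only views where the whole board is visible
        views.append(pts)

print(f"kept {len(views)} of {len(poses)} views")
```

Rejecting a view as soon as any corner leaves the image keeps every collected view at the full 144 correspondences, which is what the later calls to `cv2.calibrateCamera` in this notebook assume.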


In [2]:
"""3D chessboard model"""
class chessboard3D(object):
    """Grid of rows x cols corner points on the plane z = z_plane."""
    dft_configs = {
        "square_size": 1,
        "rows": 12,
        "cols": 12,
        "z_plane": 0,
    }

    def __init__(self, configs={}) -> None:
        self.configs = {**self.dft_configs, **configs}
        self.square_size = self.configs["square_size"]
        self.rows = self.configs["rows"]
        self.cols = self.configs["cols"]
        self.z_plane = self.configs["z_plane"]
        self.points = self._generate_points()
    
    def _generate_points(self):
        points = []
        for row in range(self.rows):
            for col in range(self.cols):
                points.append([col*self.square_size, row*self.square_size, self.z_plane])
        return np.array(points) # (N, 3)

    def return_points(self):
        return self.points

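### How to use the 3D chessboard model

**The main member functions are:**
- `chessboard3D.return_points()`: returns the corner coordinates of the 3D chessboard model;
    - Return value
        - `points3D`: points in 3D space, of type `np.ndarray` and shape `(N,3)`, where N is the number of points.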
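
As an illustration of what `return_points()` yields, the same corner grid can be built in vectorized form. This is a sketch, not part of the provided class: `grid_points` is a hypothetical helper, and the `float32` cast is an assumption made because OpenCV calibration routines expect single-precision points.

```python
import numpy as np

def grid_points(rows=12, cols=12, square_size=1.0, z_plane=0.0):
    """Corner grid ordered row by row, with the column index varying fastest."""
    col_idx, row_idx = np.meshgrid(np.arange(cols), np.arange(rows))
    pts = np.stack([col_idx.ravel() * square_size,
                    row_idx.ravel() * square_size,
                    np.full(rows * cols, z_plane)], axis=1)
    return pts.astype(np.float32)

pts = grid_points()
print(pts.shape)          # one row per corner
print(pts.mean(axis=0))   # centroid of the board
```

The centroid of the default board is at `(5.5, 5.5, 0)`, so shifting the camera translation by `-5.5` in both x and y places the board centre on the optical axis.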

---
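## Implementing the camera calibration

After running the code cells above, implement the camera calibration in the code cells below and write the lab report in Markdown cells.

**Implementation and report requirements:**

- For the calibration you may use the OpenCV function `cv2.calibrateCamera`, or implement the DLT algorithm yourself (extra credit may be awarded).

- For OpenCV camera calibration, see: https://docs.opencv.org/3.4/dc/dbb/tutorial_py_calibration.html

- Print the final camera intrinsics, including the focal length and the principal point.

- Open questions:
    - Try different ways of moving the camera, observe how the calibration results differ, and try to summarize the best camera movement strategy.
    - Try different numbers of 2D projections, observe how the calibration results differ, and try to summarize the best strategy for the number of images used for calibration (i.e. the number of 2D projected point sets). (Is more always better?)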
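
For the "implement DLT yourself" option, one possible shape of a solution is sketched below. It is not the method used in the report that follows: it assumes a distortion-free camera, estimates one homography per view by DLT, and then recovers `K` with Zhang's closed-form constraints followed by a Cholesky factorization. All function names, the pose list, and the pixel normalization matrix `N` are illustrative choices, and the data is noise-free synthetic data.

```python
import numpy as np

def rot(yaw, pitch, roll):
    """Rz(roll) @ Ry(yaw) @ Rx(pitch) in degrees, the same order as CameraModel."""
    y, p, r = np.deg2rad([yaw, pitch, roll])
    Rx = np.array([[1, 0, 0], [0, np.cos(p), -np.sin(p)], [0, np.sin(p), np.cos(p)]])
    Ry = np.array([[np.cos(y), 0, np.sin(y)], [0, 1, 0], [-np.sin(y), 0, np.cos(y)]])
    Rz = np.array([[np.cos(r), -np.sin(r), 0], [np.sin(r), np.cos(r), 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def homography_dlt(plane_xy, pixels):
    """Estimate H with pixel ~ H @ [X, Y, 1] from N >= 4 correspondences."""
    rows = []
    for (X, Y), (u, v) in zip(plane_xy, pixels):
        rows.append([-X, -Y, -1, 0, 0, 0, u * X, u * Y, u])
        rows.append([0, 0, 0, -X, -Y, -1, v * X, v * Y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    return Vt[-1].reshape(3, 3)      # right singular vector of the smallest singular value

def v_ij(H, i, j):
    """Row vector such that v_ij @ b == h_i^T B h_j, b = (B11, B12, B22, B13, B23, B33)."""
    hi, hj = H[:, i], H[:, j]
    return np.array([hi[0] * hj[0],
                     hi[0] * hj[1] + hi[1] * hj[0],
                     hi[1] * hj[1],
                     hi[2] * hj[0] + hi[0] * hj[2],
                     hi[2] * hj[1] + hi[1] * hj[2],
                     hi[2] * hj[2]])

def intrinsics_from_homographies(Hs):
    """Solve for B ~ K^-T K^-1 from two constraints per view, then factor out K."""
    V = []
    for H in Hs:
        V.append(v_ij(H, 0, 1))                  # h1^T B h2 = 0
        V.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))  # h1^T B h1 = h2^T B h2
    _, _, Vt = np.linalg.svd(np.asarray(V))
    b11, b12, b22, b13, b23, b33 = Vt[-1]
    B = np.array([[b11, b12, b13], [b12, b22, b23], [b13, b23, b33]])
    if B[0, 0] < 0:                  # the null vector is only defined up to sign
        B = -B
    L = np.linalg.cholesky(B)        # B = L @ L.T with L lower triangular
    K = np.linalg.inv(L.T)           # L.T is proportional to K^-1
    return K / K[2, 2]

K_true = np.array([[300.0, 0, 300], [0, 300.0, 300], [0, 0, 1]])
board = np.array([[c, r, 0.0] for r in range(12) for c in range(12)])
N = np.array([[2 / 600, 0, -1], [0, 2 / 600, -1], [0, 0, 1]])  # pixels -> [-1, 1]
T = np.array([-5.5, -5.5, 12.0])
poses = [(20, 0, 0), (0, 20, 0), (-15, 10, 5), (10, -25, -10)]  # (yaw, pitch, roll)

Hs = []
for yaw, pitch, roll in poses:
    pix = K_true @ (rot(yaw, pitch, roll) @ board.T + T.reshape(3, 1))
    pix = (pix[:2] / pix[2]).T
    Hs.append(homography_dlt(board[:, :2], pix * (2 / 600) - 1))

K_est = np.linalg.inv(N) @ intrinsics_from_homographies(Hs)  # undo the normalization
print(np.round(K_est, 4))
```

Normalizing the pixel coordinates before the DLT step is a conditioning choice rather than a requirement. With noisy corner detections a nonlinear refinement of the reprojection error would normally follow, which this sketch leaves out.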

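**Grading:**
- Correctness of the calibration implementation (15 points)
- Accuracy of the camera parameters (5 points)
- Answers to the open questions (5+5 points)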

---

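The answer area follows; you may use multiple code cells and Markdown cells.

### Some function implementations

First, initialize the chessboard and camera classes and move the camera to the center of the chessboard. The function returns the world-coordinate point set and the camera object.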

In [13]:
def init():
    worldPoints = chessboard3D(configs={"square_size": 1, "rows": 12, "cols": 12}).return_points()
    Camera = CameraModel()
    Camera.reset()
    Camera.move_x_axis_plus(-5.5)
    Camera.move_y_axis_plus(-5.5)
    CamerPoints=Camera.project(worldPoints)
    result=Camera.draw(CamerPoints, out_img_path="chessboard.png")
    if result:
        print("Draw success")
    return worldPoints, Camera


Helper functions for collecting the camera's image points, moving the camera, and so on.

In [None]:
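def add_calibration_point(camera, world_point, world_points, camera_points):
    """
    Project the world points with the current camera pose and add the view
    to the calibration set if all projected points fall inside the image.
    
    Args:
        camera: camera object
        world_point: 3D world points, array of shape (N, 3)
        world_points: list of world-point arrays, one per view
        camera_points: list of image-point arrays, one per view
    
    Returns:
        bool: whether the view was added
    """
    camera_point = camera.project(world_point)
    if camera.draw(camera_point):
        world_points.append(np.array(world_point, dtype=np.float32).reshape(-1, 1, 3))
        camera_points.append(np.array(camera_point, dtype=np.float32).reshape(-1, 1, 2))
        print("Draw success")
        return True
    return False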

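def move_camera(camera, axis, value):
    """
    Move or rotate the camera by an increment.
    
    Args:
        camera: camera object
        axis: 'x', 'y' or 'z' for translation; 'pitch', 'yaw' or 'roll' for rotation
        value: translation step, or rotation angle in degrees
    """
    if axis == 'x':
        camera.move_x_axis_plus(value)
    elif axis == 'y':
        camera.move_y_axis_plus(value)
    elif axis == 'z':
        camera.move_z_axis_plus(value)
    elif axis == 'pitch':
        camera.rotate_pitch_plus(value)
    elif axis == 'yaw':
        camera.rotate_yaw_plus(value)
    elif axis == 'roll':
        camera.rotate_roll_plus(value)
    else:
        # Fail loudly instead of silently ignoring an unknown axis name
        raise ValueError(f"Unknown axis: {axis}")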

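def collect_calibration_points_at_positions(camera, world_point, positions):
    """
    Collect calibration points at several camera poses.
    
    Args:
        camera: camera object
        world_point: 3D world points, array of shape (N, 3)
        positions: list of camera moves in the form [(axis, value), ...]
                  axis is 'x', 'y', 'z', 'pitch', 'yaw' or 'roll'
                  value is the translation step or rotation angle in degrees;
                  moves are applied cumulatively
    
    Returns:
        tuple: (world_points, camera_points)
    """
    world_points = []
    camera_points = []
    
    # Add the points seen from the initial pose
    add_calibration_point(camera, world_point, world_points, camera_points)
    
    # Collect points after each move
    for axis, value in positions:
        # Move the camera to the new pose
        move_camera(camera, axis, value)
        add_calibration_point(camera, world_point, world_points, camera_points)

    
    return world_points, camera_points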



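### Effect of pure translation on the estimated camera matrix: move the camera along the six directions of the three axes to obtain 7 images in total.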

In [32]:
worldPoint, Camera = init()

positions = [
    ('z', -3),  
    ('x', -3),  
    ('y', -3),
    ('z', 4),  
    ('x', 4),  
    ('y', 2),

]

# Collect calibration points with the helper function
world_points, camera_points = collect_calibration_points_at_positions(Camera, worldPoint, positions)

# Print information about the collected points
for i, (objp, imgp) in enumerate(zip(world_points, camera_points)):
    print(f"View {i}: objp {objp.shape}, imgp {imgp.shape}")
ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    world_points, camera_points, (600,600), None, None
)
print("ret:", ret)
print("cameraMatrix:", camera_matrix)
print("distCoeffs:", dist_coeffs)

Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
View 0: objp (144, 1, 3), imgp (144, 1, 2)
View 1: objp (144, 1, 3), imgp (144, 1, 2)
View 2: objp (144, 1, 3), imgp (144, 1, 2)
View 3: objp (144, 1, 3), imgp (144, 1, 2)
View 4: objp (144, 1, 3), imgp (144, 1, 2)
View 5: objp (144, 1, 3), imgp (144, 1, 2)
View 6: objp (144, 1, 3), imgp (144, 1, 2)
ret: 630696.0435082357
cameraMatrix: [[1.12802615e+14 0.00000000e+00 2.99500000e+02]
 [0.00000000e+00 1.12802615e+14 2.99500000e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
distCoeffs: [[-2.33922228e-24

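The reprojection error is extremely large (`ret` is about 6.3e5), and the estimated focal length (about 1.13e14) is nowhere near the true value of 300. With translation only, the board stays parallel to the image plane in every view, so the focal length cannot be separated from the camera-to-board distance.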
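
One way to see why translation-only views leave the focal length undetermined: with `R = I`, scaling the focal length and the z translation by the same factor produces exactly the same image. The sketch below checks this numerically for the first few translation-only poses of this experiment. `project_frontal` is a hypothetical helper written for this illustration, and the scale factors are arbitrary.

```python
import numpy as np

def project_frontal(points_xy, f, c, T):
    """Pinhole projection of z = 0 points with R = I (board parallel to the image)."""
    tx, ty, tz = T
    u = f * (points_xy[:, 0] + tx) / tz + c
    v = f * (points_xy[:, 1] + ty) / tz + c
    return np.stack([u, v], axis=1)

board_xy = np.array([[col, row] for row in range(12) for col in range(12)], dtype=float)

# Cumulative camera translations after init(), then z -3, x -3, y -3
translations = [(-5.5, -5.5, 10.0), (-5.5, -5.5, 7.0), (-8.5, -5.5, 7.0), (-8.5, -8.5, 7.0)]

max_diff = 0.0
for scale in (10.0, 1e6):
    for tx, ty, tz in translations:
        ref = project_frontal(board_xy, 300.0, 300.0, (tx, ty, tz))
        # Larger focal length, board proportionally farther away
        alt = project_frontal(board_xy, 300.0 * scale, 300.0, (tx, ty, tz * scale))
        max_diff = max(max_diff, float(np.abs(ref - alt).max()))

print("largest pixel difference:", max_diff)
```

Because every such `(f, tz)` pair explains the observations equally well, an optimizer has nothing to pull the focal length toward its true value, which is consistent with the runaway estimate seen above.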

### Effect of pure rotation on the estimated camera matrix: rotate by the same angle about each of the x, y, and z axes to obtain 7 images in total.

In [None]:
worldPoint, Camera = init()

positions = [
    ('pitch', 10),  
    ('yaw', 10),  
    ('roll', 10),
    ('pitch', -10),  
    ('yaw', -10),  
    ('roll', -10),
]

# Collect calibration points with the helper function
world_points, camera_points = collect_calibration_points_at_positions(Camera, worldPoint, positions)

# Print information about the collected points
for i, (objp, imgp) in enumerate(zip(world_points, camera_points)):
    print(f"View {i}: objp {objp.shape}, imgp {imgp.shape}")
ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    world_points, camera_points, (600,600), None, None
)
print("ret:", ret)
print("cameraMatrix:", camera_matrix)
print("distCoeffs:", dist_coeffs)



Draw success
Draw success
Draw success
View 0: objp (144, 1, 3), imgp (144, 1, 2)
View 1: objp (144, 1, 3), imgp (144, 1, 2)
ret: 7.2793244239222835e-06
cameraMatrix: [[297.24906185   0.         299.99998485]
 [  0.         297.24906186 299.51720343]
 [  0.           0.           1.        ]]
distCoeffs: [[-7.01861483e-07  1.91599050e-06 -1.66120623e-08 -6.20503586e-09
  -1.61862464e-06]]


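The camera matrix computed from the rotated images has a far smaller error than the one computed from pure translation. Next, try adding more combinations of rotation angles and compute the camera parameters again.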
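
The "error" compared here is the `ret` value returned by `cv2.calibrateCamera`, which OpenCV documents as the overall RMS reprojection error. A minimal sketch of that quantity is given below, assuming the usual definition (root of the mean squared pixel distance over all points in all views). `rms_reprojection_error` and the toy data are hypothetical and only illustrate the formula.

```python
import numpy as np

def rms_reprojection_error(observed_views, reprojected_views):
    """RMS pixel distance between observed and reprojected points over all views."""
    sq_sum, count = 0.0, 0
    for obs, rep in zip(observed_views, reprojected_views):
        diff = (np.asarray(obs, dtype=float).reshape(-1, 2)
                - np.asarray(rep, dtype=float).reshape(-1, 2))
        sq_sum += float(np.sum(diff ** 2))   # squared distances, x and y together
        count += diff.shape[0]               # number of points in this view
    return float(np.sqrt(sq_sum / count))

# Toy data: one view, two points, one of them off by (3, 4) pixels
observed = [np.array([[100.0, 100.0], [200.0, 200.0]])]
reprojected = [np.array([[103.0, 104.0], [200.0, 200.0]])]
err = rms_reprojection_error(observed, reprojected)
print(err)
```

A small RMS value only says that the fitted model reproduces the observed points. It does not by itself guarantee that the intrinsics are close to the true ones, so it is worth comparing the estimated `cameraMatrix` against the known ground truth as well.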

In [34]:
worldPoint, Camera = init()

positions = [
    ('pitch', 10),
    ('pitch', 10),   
    ('yaw', 10),  
    ('roll', 10),
    ('pitch', -10),  
    ('yaw', -10),  
    ('roll', -10),
    ('pitch', -5),  
    ('yaw', -5),  
    ('roll', -5),
]

# Collect calibration points with the helper function
world_points, camera_points = collect_calibration_points_at_positions(Camera, worldPoint, positions)

# Print information about the collected points
for i, (objp, imgp) in enumerate(zip(world_points, camera_points)):
    print(f"View {i}: objp {objp.shape}, imgp {imgp.shape}")
ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    world_points, camera_points, (600,600), None, None
)
print("ret:", ret)
print("cameraMatrix:", camera_matrix)
print("distCoeffs:", dist_coeffs)



Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
Outside image count: 0, while total points: 144
Draw success
View 0: objp (144, 1, 3), imgp (144, 1, 2)
View 1: objp (144, 1, 3), imgp (144, 1, 2)
View 2: objp (144, 1, 3), imgp (144, 1, 2)
View 3: objp (144, 1, 3), imgp (144, 1, 2)
View 4: objp (144, 1, 3), imgp (144, 1, 2)
View 5: objp (144, 1, 3), imgp (144, 1, 2)
View 6: ob

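Adding images with a richer variety of rotation angles did not reduce the error; it increased it instead. The next step is therefore to study images that combine translation and rotation.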

### Combining translation and rotation

In [55]:
worldPoint, Camera = init()

positions = [
    ('z', -3),  
    ('x', -3),  
    ('y', -3),
    ('pitch', 10),  
    ('yaw', 10),  
    # ('roll', 10),
    # ('z', 3),  
    # ('x', 3),  
    # ('y', 3),
    # ('pitch', -10),  
    # ('yaw', -10),  
    # ('roll', -10),
]

# Collect calibration points with the helper function
world_points, camera_points = collect_calibration_points_at_positions(Camera, worldPoint, positions)

# Print information about the collected points
for i, (objp, imgp) in enumerate(zip(world_points, camera_points)):
    print(f"View {i}: objp {objp.shape}, imgp {imgp.shape}")
ret, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    world_points, camera_points, (600,600), None, None
)
print("ret:", ret)
print("cameraMatrix:", camera_matrix)
print("distCoeffs:", dist_coeffs)



Draw success
Draw success
Draw success
Draw success
Draw success
Draw success
Draw success
View 0: objp (144, 1, 3), imgp (144, 1, 2)
View 1: objp (144, 1, 3), imgp (144, 1, 2)
View 2: objp (144, 1, 3), imgp (144, 1, 2)
View 3: objp (144, 1, 3), imgp (144, 1, 2)
View 4: objp (144, 1, 3), imgp (144, 1, 2)
View 5: objp (144, 1, 3), imgp (144, 1, 2)
ret: 6.989282922312471e-06
cameraMatrix: [[300.00015093   0.         300.00000675]
 [  0.         300.00015179 300.00000296]
 [  0.           0.           1.        ]]
distCoeffs: [[-3.25951949e-08  1.23037703e-07  3.98560192e-09  3.63366941e-09
  -6.56634476e-08]]


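The results show that introducing translation improves the error somewhat, reducing it from 9e-6 to 7e-6.


Further experiments show that calibration does not require many images: a single image rotated about an axis other than z is enough to bring the error down to an acceptable level. A rotation about the z axis also improves considerably on the pure-translation result.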
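
A rough constraint count may help explain why few views are needed and why translation alone is not enough. It assumes the planar (Zhang-style) formulation and the zero-skew camera used here; the symbols $H$, $h_i$, $r_i$ and $\lambda$ are notation introduced for this sketch. Each view of the `z=0` board gives a homography

$$
H = \begin{bmatrix} h_1 & h_2 & h_3 \end{bmatrix} = \lambda K \begin{bmatrix} r_1 & r_2 & T \end{bmatrix},
$$

and the orthonormality of $r_1$ and $r_2$ yields two constraints on the intrinsics per view:

$$
h_1^\top K^{-\top} K^{-1} h_2 = 0, \qquad
h_1^\top K^{-\top} K^{-1} h_1 = h_2^\top K^{-\top} K^{-1} h_2 .
$$

With four unknowns ($f_x, f_y, c_x, c_y$), at least two views with different orientations are needed in general. Views that differ only by translation share $h_1$ and $h_2$ up to scale, so they add no new constraints. For a board parallel to the image plane ($R = I$), $h_1 \propto (f_x, 0, 0)^\top$ and $h_2 \propto (0, f_y, 0)^\top$, so the two constraints fix only the ratio $f_x / f_y$ and leave the focal length itself and the principal point undetermined.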

### Conclusion

The estimated camera matrix is:
$$
K = \begin{bmatrix}
300.000 & 0            & 300.000 \\[6pt]
0            & 300.000 & 300.000 \\[6pt]
0            & 0            & 1
\end{bmatrix}
$$

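**Best camera movement strategy:**

In this experiment, the best strategy is to introduce rotation: an initial image plus one rotated image already reduces the calibration error dramatically. Prefer rotations about the x and y axes, which change the perspective of the chessboard substantially and make the estimate much more sensitive to the focal length and distortion parameters. Suitable translations can then be added to improve accuracy further.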
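
When designing a movement sequence, note that each `(axis, value)` entry in `positions` is applied on top of the previous pose, and a view is captured after every single move. The sketch below replays a move list and records the absolute pose of each captured view. `trace_poses` is a hypothetical helper, and the start pose is the one produced by `init()`.

```python
def trace_poses(start, moves):
    """Return the absolute pose of every captured view for a list of (axis, delta) moves."""
    pose = dict(start)
    poses = [dict(pose)]             # the initial view is captured before any move
    for axis, delta in moves:
        pose[axis] += delta          # each move adds to the current value
        poses.append(dict(pose))     # one view is captured after every move
    return poses

start = {"x": -5.5, "y": -5.5, "z": 10, "yaw": 0, "pitch": 0, "roll": 0}
rotation_moves = [("pitch", 10), ("yaw", 10), ("roll", 10),
                  ("pitch", -10), ("yaw", -10), ("roll", -10)]

poses = trace_poses(start, rotation_moves)
for i, p in enumerate(poses):
    print(f"view {i}: yaw={p['yaw']}, pitch={p['pitch']}, roll={p['roll']}")
```

For the pure-rotation move list, the last captured view returns to the initial pose, so the sequence contains six distinct poses rather than seven. Tracing a plan this way makes such repeats easy to spot before calibrating.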

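**Best number of projections:**

Not many projections are needed; 3 to 6 is enough. Adding more images increases the computational cost without noticeably reducing the error, and the resulting matrix parameters differ little from those obtained with a moderate number of images.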