## Section 1.2 - Camera calibration process from 3-D pattern

#### The object coordinate system in the checkerboard pattern as a temporary world reference system



<!-- 
```python
import cv2
import matplotlib.pyplot as plt
import matplotlib.patches as patches


f, ax = plt.subplots()
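# OpenCV loads images as BGR; convert to RGB so matplotlib shows the right colors
ax.imshow(cv2.cvtColor(cv2.imread("imgs/pattern.png"), cv2.COLOR_BGR2RGB))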

secax = ax.secondary_xaxis('top')
secax.set_xlabel('squares')
secax.set_xticks([7,22,36,51,66,81,95,110,125,140])
secax.set_xticklabels([-1,0,1,2,3,4,5,6,7,8]) 

secay = ax.secondary_yaxis('right')
secay.set_ylabel('squares')
secay.set_yticks([9,24,39,54,68,84,98,113])
secay.set_yticklabels([6,5,4,3,2,1,0,-1]) 

ax.set_xlabel('pixel')
ax.set_xticks([7,22,36,51,66,81,95,110,125,140])
ax.set_xticklabels([7,22,36,52,66,82,96,112,126,142]) 

ax.set_ylabel('pixel')
ax.set_yticks([9,24,39,54,68,84,98,113])
ax.set_yticklabels([9,24,39,54,69,84,99,114]) 

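# Plot the object-frame axes: x along a row, y up a column, z pointing out of the board
ax.plot([22, 45], [98, 98], linewidth=3, label="objP x")
ax.plot([22, 22], [98, 75], linewidth=3, label="objP y")
# 'o' only sets the marker; the color is given once to avoid a conflicting format string
ax.plot(22, 98, 'o', color='green', label="objP z")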

circle = patches.Circle((22, 98), 3, fill=False, edgecolor='green')

# Add the circle to the plot
ax.add_patch(circle)

ax.grid()
plt.legend()
plt.title("Object pattern reference system")

```
-->

<img src="imgs/objp.jpg">

The image above is one of several pictures of a chessboard. In practice the camera is static and the chessboard is placed at different locations and orientations, so we would need to know the world coordinates $[X_w^{(i)} Y_w^{(i)} Z_w^{(i)}]$ of every corner in every picture. For simplicity, we can instead treat the chessboard as stationary on the XY plane (so $Z_w^{(i)}=0$ always) and the camera as the one that moved. With this convention we only need the $[X_w^{(i)} Y_w^{(i)}]$ values, which follow directly from the grid: (0,0), (1,0), (2,0), ... In this case, the results are expressed in units of one chessboard square. If we know the square size (say 30 mm), we can pass (0,0), (30,0), (60,0), ... instead and get the results in mm.

The 3D points are called object points and their 2D projections in the image are called image points.
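As a minimal sketch of how the object points could be generated, assuming a 9x6 corner grid and NumPy (the helper name `make_object_points` is made up for this illustration):

```python
import numpy as np

def make_object_points(cols=9, rows=6, square_size=1.0):
    """Return a (cols*rows, 3) array of corner coordinates on the Z_w = 0 plane."""
    # Integer corner indices; x runs fastest, then y.
    xs, ys = np.meshgrid(np.arange(cols), np.arange(rows))
    objp = np.zeros((cols * rows, 3), dtype=np.float64)
    objp[:, 0] = xs.ravel() * square_size   # X_w
    objp[:, 1] = ys.ravel() * square_size   # Y_w
    return objp                             # Z_w stays 0

objp_squares = make_object_points()              # units: chessboard squares
objp_mm = make_object_points(square_size=30.0)   # units: mm, for 30 mm squares
print(objp_mm[:3])
```

Passing `square_size=30.0` only rescales the grid, so the recovered translation would come out in mm instead of squares.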

In the image above, the corners are the points where two black squares touch, so the pattern has 9x6 inner corners, from which we can establish 54 correspondences, for example:

| u  | v  | $X_w$ | $Y_w$ | $Z_w$ | 
| -- | -- | ----- | ----- | ------|
| 22 | 99 |   0   | 0     | 0     |
| 36 | 99 |   1   | 0     | 0     |
| 52 | 99 |   2   | 0     | 0     |
| 66 | 99 |   3   | 0     | 0     |
| ... | ... |   ...   | ...     | ...     |
| 22 | 84 |   0   | 1     | 0     |
| 22 | 69 |   0   | 2     | 0     |
| 22 | 54 |   0   | 3     | 0     |
| ... | ... |   ...   | ...     | ...     |

The correspondences can be established manually or with an automatic corner-detection algorithm.

#### Usage of calibration data to solve P

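After gathering $n$ corresponding points $[u^{(i)} v^{(i)} X_w^{(i)} Y_w^{(i)} Z_w^{(i)}], i=1,2,...,n$, we start from equation (3) applied to each point. The scale factor on the left-hand side is the depth $Z_c$ of the point in the camera frame, which equals the third row of the product:

$$
Z_c\begin{bmatrix}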
    u \\
    v \\
    1
\end{bmatrix} = 
\begin{bmatrix}
    p_{11} & p_{12} & p_{13} & p_{14} \\
    p_{21} & p_{22} & p_{23} & p_{24} \\
    p_{31} & p_{32} & p_{33} & p_{34}
\end{bmatrix}
\begin{bmatrix}
    X_w \\
    Y_w \\
    Z_w \\
    1
\end{bmatrix} =
\begin{bmatrix}
    p_{11}X_w + p_{12}Y_w + p_{13}Z_w + p_{14} \\
    p_{21}X_w + p_{22}Y_w + p_{23}Z_w + p_{24} \\
    p_{31}X_w + p_{32}Y_w + p_{33}Z_w + p_{34}
\end{bmatrix} \\
$$
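
Dividing the first two rows by the third eliminates the scale factor:

$$
\begin{aligned}
u &= \frac{p_{11}X_w + p_{12}Y_w + p_{13}Z_w + p_{14}}{p_{31}X_w + p_{32}Y_w + p_{33}Z_w + p_{34}} \\
v &= \frac{p_{21}X_w + p_{22}Y_w + p_{23}Z_w + p_{24}}{p_{31}X_w + p_{32}Y_w + p_{33}Z_w + p_{34}}
\end{aligned}
$$

Multiplying by the denominator and moving everything to one side gives two equations that are linear in the entries of $P$:

$$
\begin{aligned}
p_{11}X_w + p_{12}Y_w + p_{13}Z_w + p_{14} - u p_{31}X_w - u p_{32}Y_w - u p_{33}Z_w - u p_{34} &= 0 \\
p_{21}X_w + p_{22}Y_w + p_{23}Z_w + p_{24} - v p_{31}X_w - v p_{32}Y_w - v p_{33}Z_w - v p_{34} &= 0
\end{aligned}
$$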
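The projection equations for $u$ and $v$ can be checked numerically. The sketch below assumes NumPy; the function name `project` and the matrix `P_demo` are hypothetical values chosen only for illustration:

```python
import numpy as np

def project(P, Xw):
    """Project (n, 3) world points with a 3x4 matrix P; returns (n, 2) pixel coordinates."""
    Xw = np.asarray(Xw, dtype=float)
    Xh = np.hstack([Xw, np.ones((len(Xw), 1))])   # homogeneous coordinates, shape (n, 4)
    x = Xh @ P.T                                  # each row is [s*u, s*v, s]
    return x[:, :2] / x[:, 2:3]                   # divide by the third row to remove the scale s

# Hypothetical projection matrix: f_x = f_y = 800, c_x = 320, c_y = 240, R = I, t = 0
P_demo = np.array([[800.0, 0.0, 320.0, 0.0],
                   [0.0, 800.0, 240.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])

uv = project(P_demo, [[0.1, -0.2, 2.0]])
print(uv)   # u = (80 + 640) / 2 = 360, v = (-160 + 480) / 2 = 160
```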

Writing both equations as a single matrix-vector product, we have:

$$
\begin{bmatrix}
X_w & Y_w & Z_w & 1 & 0   & 0   & 0   & 0 & -uX_w & -uY_w & -uZ_w & -u \\
0   &  0  & 0   & 0 & X_w & Y_w & Z_w & 1 & -vX_w & -vY_w & -vZ_w & -v 
\end{bmatrix}
\begin{bmatrix}
 p_{11} \\
 p_{12} \\
 p_{13} \\
 p_{14} \\
 p_{21} \\
 p_{22} \\
 p_{23} \\
 p_{24} \\
 p_{31} \\
 p_{32} \\
 p_{33} \\
 p_{34}
\end{bmatrix} = 0
$$

Stacking these two rows for each of the $n$ correspondences, we have:

$$
\begin{bmatrix}
X_w^{(1)} & Y_w^{(1)} & Z_w^{(1)} & 1 & 0   & 0   & 0   & 0 & -u^{(1)}X_w^{(1)} & -u^{(1)}Y_w^{(1)} & -u^{(1)}Z_w^{(1)} & -u^{(1)} \\
0   &  0  & 0   & 0 & X_w^{(1)} & Y_w^{(1)} & Z_w^{(1)} & 1 & -v^{(1)}X_w^{(1)} & -v^{(1)}Y_w^{(1)} & -v^{(1)}Z_w^{(1)} & -v^{(1)} \\
\vdots \\
X_w^{(n)} & Y_w^{(n)} & Z_w^{(n)} & 1 & 0   & 0   & 0   & 0 & -u^{(n)}X_w^{(n)} & -u^{(n)}Y_w^{(n)} & -u^{(n)}Z_w^{(n)} & -u^{(n)} \\
0   &  0  & 0   & 0 & X_w^{(n)} & Y_w^{(n)} & Z_w^{(n)} & 1 & -v^{(n)}X_w^{(n)} & -v^{(n)}Y_w^{(n)} & -v^{(n)}Z_w^{(n)} & -v^{(n)} \\
\end{bmatrix}
\begin{bmatrix}
 p_{11} \\
 p_{12} \\
 p_{13} \\
 p_{14} \\
 p_{21} \\
 p_{22} \\
 p_{23} \\
 p_{24} \\
 p_{31} \\
 p_{32} \\
 p_{33} \\
 p_{34}
\end{bmatrix} = 0 
$$

$$
\begin{bmatrix}A\end{bmatrix}_{2n \ \times \ 12}\ \begin{bmatrix}p\end{bmatrix}_{12 \ \times\ 1}=0
$$
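A possible way to assemble $A$ with NumPy is sketched below. The helper name `build_dlt_matrix` is an assumption for this example, and the sample points are the first three rows of the correspondence table:

```python
import numpy as np

def build_dlt_matrix(world_pts, image_pts):
    """Stack two rows per correspondence into the (2n, 12) matrix A."""
    world_pts = np.asarray(world_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    n = len(world_pts)
    Xh = np.hstack([world_pts, np.ones((n, 1))])   # rows [X_w, Y_w, Z_w, 1]
    u = image_pts[:, 0:1]
    v = image_pts[:, 1:2]
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = Xh          # even rows: coefficients of p11..p14
    A[0::2, 8:12] = -u * Xh    # even rows: coefficients of p31..p34
    A[1::2, 4:8] = Xh          # odd rows: coefficients of p21..p24
    A[1::2, 8:12] = -v * Xh    # odd rows: coefficients of p31..p34
    return A

world = [[0, 0, 0], [1, 0, 0], [2, 0, 0]]
image = [[22, 99], [36, 99], [52, 99]]
A_demo = build_dlt_matrix(world, image)
print(A_demo.shape)   # (6, 12)
```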

This homogeneous system can be solved with constrained least squares. Because $p$ is only defined up to scale (any multiple of $p$ also satisfies the equation, including the trivial $p=0$), we fix its scale and look for the $p$ that brings <u>[A][p] as close as possible to [0]</u> such that <u>||p||=1</u>:

$$
\min_p \ ||Ap||^2 \quad \text{subject to} \quad ||p||=1
$$

$$
\min_p \ p^TA^TAp \quad \text{subject to} \quad p^Tp=1
$$

We can define a convenient loss function (the Lagrangian of the constrained problem) that we want to minimize:

$$
L(p, \lambda) = p^TA^TAp - \lambda(p^Tp -1)
$$

Taking the derivative with respect to $p$ and setting it to zero, we have:

$$
0 = 2A^TAp - 2\lambda p
$$

That is equivalent to solving the eigenvalue problem:

$$
A^TAp = \lambda p
$$

$p$ is the eigenvector corresponding to the smallest eigenvalue of $A^TA$, because for a unit eigenvector $||Ap||^2 = p^TA^TAp = \lambda$.
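The whole estimation can be tried on synthetic data. The sketch below assumes NumPy and uses made-up ground-truth values (`K_true`, `R_true`, `t_true`). It takes the last right singular vector of $A$, which is the same vector as the eigenvector of $A^TA$ with the smallest eigenvalue but is usually better behaved numerically. The test points are deliberately non-coplanar: if every $Z_w$ were 0, the columns of $A$ that multiply $p_{13}$, $p_{23}$ and $p_{33}$ would vanish and those entries could not be recovered this way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth, used only to generate test data
K_true = np.array([[800.0, 0.0, 320.0], [0.0, 820.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R_true = np.array([[np.cos(a), 0.0, np.sin(a)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(a), 0.0, np.cos(a)]])
t_true = np.array([[0.1], [-0.2], [5.0]])
P_true = K_true @ np.hstack([R_true, t_true])

# Non-coplanar world points and their exact projections
Xw = rng.uniform(-1.0, 1.0, size=(20, 3))
Xh = np.hstack([Xw, np.ones((20, 1))])
proj = Xh @ P_true.T
uv = proj[:, :2] / proj[:, 2:3]

# Build A (two rows per point)
A = np.zeros((40, 12))
A[0::2, 0:4] = Xh
A[0::2, 8:12] = -uv[:, 0:1] * Xh
A[1::2, 4:8] = Xh
A[1::2, 8:12] = -uv[:, 1:2] * Xh

# Singular values come back in descending order, so the last row of Vt
# is the unit vector p that minimizes ||Ap||
_, _, Vt = np.linalg.svd(A)
P_est = Vt[-1].reshape(3, 4)

# p is only defined up to scale and sign; rescale so p34 matches the ground truth
P_est = P_est * (P_true[2, 3] / P_est[2, 3])

reproj = Xh @ P_est.T
max_err = np.abs(reproj[:, :2] / reproj[:, 2:3] - uv).max()
print(max_err)
```

With noisy real detections the residual would not be zero, and the usual practice of normalizing the coordinates before building $A$ becomes more important.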

You may have noticed that each image gives different correspondences and therefore a different $P$, and hence different estimates of $K$ (intrinsic parameters), $R$ and $t$ (extrinsic parameters). One reason the estimates disagree is that the linear camera model does not take into account the <u>distortions</u> caused by lenses. To handle them we have to add more parameters to the camera model, which leads to a non-linear camera model. In practice, we first do a separate calibration just to find these extra parameters and use them to obtain a linearized camera model. Finally, we can apply the calibration method above.

#### Factorizing P

$$
\begin{bmatrix}
    p_{11} & p_{12} & p_{13} & p_{14} \\
    p_{21} & p_{22} & p_{23} & p_{24} \\
    p_{31} & p_{32} & p_{33} & p_{34}
\end{bmatrix} = 
\begin{bmatrix}
    f_x & 0 & c_x & 0 \\
    0 & f_y & c_y & 0 \\
    0 & 0 & 1 & 0
\end{bmatrix}
\begin{bmatrix}
    R_{3 \times 3} & t_{3\times 1} \\
    0_{1 \times 3} & 1 
\end{bmatrix}
$$

The first $3 \times 3$ block of $P$ is:

$$
\begin{bmatrix}
    p_{11} & p_{12} & p_{13} \\
    p_{21} & p_{22} & p_{23} \\
    p_{31} & p_{32} & p_{33} 
\end{bmatrix} = 
\begin{bmatrix}
    f_x & 0 & c_x \\
    0 & f_y & c_y \\
    0 &  0  & 1 
\end{bmatrix}
\begin{bmatrix}
    R\\
\end{bmatrix} = KR
$$

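where $K$ (the camera matrix) is an upper triangular matrix and $R$ is a rotation matrix, and therefore orthonormal. Because the upper triangular factor comes first, we can use *RQ factorization* (a variant of *QR factorization*, which places the orthonormal factor first) to compute $K$ and $R$.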
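One way to obtain this factorization with plain NumPy is to apply `np.linalg.qr` to a flipped and transposed copy of the block. This is a sketch under stated assumptions, not the only approach: the helper name `rq_decompose` and the test matrices are made up here, and SciPy's `scipy.linalg.rq` could be used instead.

```python
import numpy as np

def rq_decompose(M):
    """Factor a 3x3 matrix M (up to scale) as K @ R with K upper triangular and R a rotation."""
    M = np.asarray(M, dtype=float)
    if np.linalg.det(M) < 0:
        M = -M                              # P is known only up to sign; choose det(R) = +1
    J = np.flipud(np.eye(3))                # exchange matrix, J @ J = I
    Q_t, R_t = np.linalg.qr((J @ M).T)      # (J M)^T = Q_t R_t
    K = J @ R_t.T @ J                       # flipping rows and columns makes it upper triangular
    R = J @ Q_t.T                           # still orthonormal
    D = np.diag(np.sign(np.diag(K)))        # force a positive diagonal on K
    K, R = K @ D, D @ R                     # D @ D = I, so the product K @ R is unchanged
    return K / K[2, 2], R                   # normalize so that K[2, 2] = 1

# Hypothetical check: rebuild a known K and R from their product, scaled by 2.5
K_true = np.array([[800.0, 0.0, 320.0], [0.0, 820.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R_true = np.array([[np.cos(a), 0.0, np.sin(a)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(a), 0.0, np.cos(a)]])
K_est, R_est = rq_decompose(2.5 * K_true @ R_true)
print(np.round(K_est, 3))
```

The sign fix matters because the factorization is only unique once the diagonal of $K$ is required to be positive.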

The last $3 \times 1$ block of $P$ is:

$$
\begin{bmatrix}
    p_{14} \\
    p_{24} \\
    p_{34} 
\end{bmatrix} = 
\begin{bmatrix}
    f_x & 0 & c_x \\
    0 & f_y & c_y \\
    0 &  0  & 1 
\end{bmatrix}
\begin{bmatrix}
    t_x \\
    t_y \\
    t_z
\end{bmatrix} = Kt
$$

Therefore we can compute $t$:

$$
t = K^{-1}\begin{bmatrix}
    p_{14} \\
    p_{24} \\
    p_{34} 
\end{bmatrix}
$$
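In code, it is usually preferable to solve the linear system rather than form $K^{-1}$ explicitly. The values below are hypothetical, and the sketch assumes $P$ has already been rescaled consistently with a $K$ whose last entry is 1:

```python
import numpy as np

# Hypothetical intrinsics and last column of P (here p4 = K @ [0.1, -0.2, 5.0])
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 820.0, 240.0],
              [0.0, 0.0, 1.0]])
p4 = np.array([1680.0, 1036.0, 5.0])

# Solve K t = p4 for t instead of computing the inverse of K
t = np.linalg.solve(K, p4)
print(t)
```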

Finally, we can transform points from the world to the camera reference frame with equation (1):

$$
\begin{bmatrix}
X_c \\
Y_c \\
Z_c \\
1
\end{bmatrix} = 
\underbrace{
    \begin{bmatrix}
        R_{3 \times 3} & t_{3\times 1} \\
        0_{1\times 3} & 1 
    \end{bmatrix}
}_{Extrinsic \ parameters \ T_{4\times 4}}
\begin{bmatrix}
X_w \\
Y_w \\
Z_w \\
1
\end{bmatrix} 
$$
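As a small illustration of this transform, the sketch below builds $T_{4\times 4}$ from a rotation and a translation and applies it to one point. The function name `world_to_camera` and the numbers are assumptions chosen so the result is easy to verify by hand:

```python
import numpy as np

def world_to_camera(R, t, Xw):
    """Apply the 4x4 extrinsic matrix T = [[R, t], [0, 1]] to (n, 3) world points."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.ravel(t)
    Xw = np.asarray(Xw, dtype=float)
    Xh = np.hstack([Xw, np.ones((len(Xw), 1))])   # homogeneous world points
    return (Xh @ T.T)[:, :3]                      # drop the trailing 1

# Hypothetical pose: 90 degree rotation about Z, camera 5 units away along Z
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = [0.0, 0.0, 5.0]

Xc = world_to_camera(R, t, [[1.0, 0.0, 0.0]])
print(Xc)   # R @ [1, 0, 0] = [0, 1, 0], plus t gives [0, 1, 5]
```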