# Lab2 - Epipolar Geometry  

## Objective

This lab covers the fundamentals of epipolar geometry. Using the functions provided in `utils.py`, you will complete the following utility functions in `utils_to_complete.py`, based on your coursework:

- `inverseHomogeneousMatrix()`
- `multiplyHomogeneousMatrix()`
- `skew()`
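
As a starting point, here is a minimal sketch of what these three utilities could look like. It assumes NumPy arrays, 4x4 homogeneous matrices of the form $[R \mid t]$, and single-argument or two-argument signatures; the actual signatures expected by the skeleton in `utils_to_complete.py` are not shown in this document and may differ.

```python
import numpy as np

def inverseHomogeneousMatrix(T):
    """Invert a 4x4 rigid transform [R | t] in closed form: [R^T | -R^T t]."""
    R = T[:3, :3]
    t = T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ t
    return T_inv

def multiplyHomogeneousMatrix(A, B):
    """Compose two 4x4 homogeneous transforms (A applied after B)."""
    return A @ B

def skew(v):
    """Return the 3x3 matrix [v]x such that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])
```

The closed-form inverse relies on $R$ being a rotation matrix, so it avoids a general 4x4 matrix inversion.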

![Figure 1](../assets/lab2/fig1.png)

Consider a stereoscopic system providing two images, $I_1$ and $I_2$. The camera calibration is known. We assume that, initially, both cameras share the calibration matrix:

$$\mathbf{K} =
\begin{pmatrix}
800 & 0 & 200 \\
0 & 800 & 150 \\
0 & 0 & 1 \\
\end{pmatrix}$$

**Question 1**: What do the values in matrix $K$ represent?

**Answer:**
This matrix holds the **intrinsic parameters** of the camera. Specifically:
- The entries $K[0,0] = 800$ and $K[1,1] = 800$ are the **focal lengths**, expressed in pixels, along the $x$ and $y$ image axes, respectively.
- The values $K[0,2] = 200$ and $K[1,2] = 150$ are the pixel coordinates of the **principal point**, where the optical axis meets the image plane (usually close to the image center).
- The zero at $K[0,1]$ means the pixel axes have no skew.
- The $1$ at $K[2,2]$ is the **scaling factor** of the homogeneous representation, conventionally set to 1.
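
To make the role of each entry concrete, the sketch below projects 3D points, expressed in the camera frame, to pixel coordinates with the pinhole model. The helper name `project` and the sample points are illustrative assumptions, not part of the lab code.

```python
import numpy as np

K = np.array([[800.0, 0.0, 200.0],
              [0.0, 800.0, 150.0],
              [0.0, 0.0, 1.0]])

def project(K, X_cam):
    """Project a 3D point given in the camera frame to pixel coordinates (u, v)."""
    x = K @ np.asarray(X_cam, dtype=float)  # homogeneous pixel coordinates
    return x[:2] / x[2]                     # divide by the depth term

# A point on the optical axis lands on the principal point.
on_axis = project(K, [0.0, 0.0, 2.0])       # (200, 150)

# Shifting the point 0.1 m along x at depth 2 m moves it by 800 * 0.1 / 2 = 40 px.
shifted = project(K, [0.1, 0.0, 2.0])       # (240, 150)
```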




We assume that camera $c_2$ is positioned at the location defined by ${}^{c2}T_w$ (see Figure 2):

$$
{}^{c2}T_w =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 2 \\
0 & 0 & 0 & 1 \\
\end{pmatrix}
$$

![Figure 2](../assets/lab2/fig2.png)

#### Case 1: Camera $c_1$ is positioned 10 cm to the left of $c_2$

**Question 2:** Provide the matrix ${}^{c1}T_w$. Using the provided code skeleton, complete the matrix ${}^{c1}T_w$.

**Answer:**
Since camera $c_1$ is positioned 10 cm (0.1 m) to the left of $c_2$, ${}^{c1}T_w$ differs from ${}^{c2}T_w$ only by a translation along the $x$-axis. Assuming no rotation, the matrix ${}^{c1}T_w$ is:

$$
{}^{c1}T_w =
\begin{pmatrix}
1 & 0 & 0 & -0.1 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 2 \\
0 & 0 & 0 & 1 
\end{pmatrix}
$$

In this matrix:

- The $-0.1$ in the $(1, 4)$ position encodes the 10 cm offset of $c_1$ relative to $c_2$ along the $x$-axis.
- The $2$ in the $(3, 4)$ position is the $z$-axis translation shared by both cameras (both are 2 m from the world origin along $z$).

**Question 3:** What is this type of system called, and can you elaborate?

*We aim to facilitate point matching between the two images. To do this, we will determine the geometric locus, in image $I_1$, of the point $x_1$ that corresponds to a given point $x_2$ in $I_2$.*

**Question 4:** Characterize this location. Calculate its equation. In your report, provide the coordinates of points $x_1$ for the points $x_2$ at (100, 100) and (50, 75).

**Question 5:** Display the points $x_2$ in $I_2$ and the previously calculated locations in $I_1$. Verify that the expected result is achieved and provide the obtained image.

#### Question 3: What is this type of system called, and can you elaborate?

This type of system is called a **stereo camera system** or a **stereoscopic vision system**.

In a stereo camera system, two cameras are positioned at a known baseline distance apart, capturing two slightly different perspectives of the same scene. By analyzing the disparity between the two images (the difference in positions of corresponding points in each image), we can extract depth information, effectively estimating the 3D structure of the scene.

Key features of a stereo camera system:

- **Epipolar Geometry**: Each point in one image corresponds to a line (called the epipolar line) in the other image. This allows us to constrain the search for matching points to a 1D search along the epipolar lines, making disparity calculation more efficient.
- **Depth Perception**: The system uses disparity (the horizontal offset between corresponding points in the left and right images) to compute depth. Closer objects have higher disparity, while farther objects have lower disparity.
- **Applications**: Stereo vision is widely used in applications like autonomous driving, 3D reconstruction, robot navigation, and augmented reality.
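
The epipolar constraint from the first bullet can be sketched for Case 1 as follows. The code reuses the matrices $\mathbf{K}$, ${}^{c2}T_w$ and ${}^{c1}T_w$ given earlier in this document and assumes both cameras share the same $\mathbf{K}$. The helper names `skew` and `epipolar_line_in_I1` are illustrative, and `np.linalg.inv` stands in for the lab's own inverse utility.

```python
import numpy as np

def skew(v):
    """Return the 3x3 matrix [v]x such that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

K = np.array([[800.0, 0.0, 200.0],
              [0.0, 800.0, 150.0],
              [0.0, 0.0, 1.0]])

# Case 1 poses, copied from the matrices given earlier in this document.
c2_T_w = np.eye(4)
c2_T_w[:3, 3] = [0.0, 0.0, 2.0]
c1_T_w = np.eye(4)
c1_T_w[:3, 3] = [-0.1, 0.0, 2.0]

# Relative pose mapping c2 coordinates to c1 coordinates: X1 = R @ X2 + t.
c1_T_c2 = c1_T_w @ np.linalg.inv(c2_T_w)
R = c1_T_c2[:3, :3]
t = c1_T_c2[:3, 3]

# Essential and fundamental matrices such that x1^T F x2 = 0 in pixel coordinates.
E = skew(t) @ R
K_inv = np.linalg.inv(K)
F = K_inv.T @ E @ K_inv

def epipolar_line_in_I1(F, x2):
    """Line (a, b, c) in I1, with a*u + b*v + c = 0, for a pixel x2 of I2."""
    return F @ np.array([x2[0], x2[1], 1.0])

for x2 in [(100, 100), (50, 75)]:
    line = epipolar_line_in_I1(F, x2)
    print(x2, "->", line / line[1])  # scaled so the v coefficient equals 1
```

For this pure translation along $x$, the lines reduce to $v = 100$ and $v = 75$: each epipolar line is horizontal and lies on the same image row as $x_2$. Negating the $x$ translation only multiplies $F$ by $-1$, so the lines themselves do not change.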

A stereo system configuration like this is foundational in computer vision, as it mimics the way human binocular vision perceives depth.
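
The relation between disparity and depth mentioned above can be written as $Z = f B / d$ for two parallel cameras with identical intrinsics. The sketch below assumes the Case 1 values (focal length 800 px, baseline 0.1 m); the function name and default arguments are illustrative.

```python
def depth_from_disparity(disparity_px, focal_px=800.0, baseline_m=0.1):
    """Depth Z = f * B / d for a parallel stereo pair with identical intrinsics."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px

# Larger disparity means a closer point.
z_near = depth_from_disparity(80.0)  # about 1 m
z_far = depth_from_disparity(40.0)   # about 2 m
```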


#### Case 2: Camera $c_1$ is positioned 20 cm in front of $c_2$

**Question 6:** Provide the new matrix ${}^{c1}T_w$.


**Question 7:** What is the position of the epipole?


**Question 8:** Redo questions 4 and 5 for this new position.

#### Case 3: Now position camera $c_1$ at ${}^{c1}T_w$ such that the translation is (0.1, 0, 1.9) and the rotation in degrees is (5, 5, 5), using the minimal angle-axis representation (also called theta-u).

**Question 9:** Provide the new matrix ${}^{c1}T_w$.
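
One possible way to build this matrix is Rodrigues' formula, which converts a theta-u vector into a rotation matrix. The sketch below assumes that (5, 5, 5) is the theta-u vector with each component given in degrees, and that (0.1, 0, 1.9) is the translation column of ${}^{c1}T_w$. The helper names are illustrative and not part of the lab skeleton.

```python
import numpy as np

def skew(v):
    """Return the 3x3 matrix [v]x such that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def rotation_from_theta_u(theta_u):
    """Rodrigues' formula: theta-u vector (radians) to a 3x3 rotation matrix."""
    theta_u = np.asarray(theta_u, dtype=float)
    theta = np.linalg.norm(theta_u)      # rotation angle
    if theta < 1e-12:
        return np.eye(3)                 # no rotation
    U = skew(theta_u / theta)            # cross-product matrix of the unit axis
    return np.eye(3) + np.sin(theta) * U + (1.0 - np.cos(theta)) * (U @ U)

def homogeneous_from_pose(translation, theta_u_deg):
    """Assemble a 4x4 matrix [R | t] from a translation and a theta-u vector in degrees."""
    T = np.eye(4)
    T[:3, :3] = rotation_from_theta_u(np.deg2rad(theta_u_deg))
    T[:3, 3] = translation
    return T

c1_T_w = homogeneous_from_pose([0.1, 0.0, 1.9], [5.0, 5.0, 5.0])
print(np.round(c1_T_w, 4))
```

With this reading, the rotation axis is the normalized direction $(1, 1, 1)/\sqrt{3}$ and the rotation angle is the norm of the vector, about 8.66 degrees.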


**Question 10:** Provide the new position of the epipoles.


**Question 11:** Redo questions 4 and 5 for this new position.