### Students
- Student 1: <span style="color:green">253885 - Luca Franceschi</span>

# Lab 5: Structure from motion

## Goals
The goal of the current assignment is to learn the following concepts:

- How to self-calibrate a camera using vanishing points.
- What are the main elements of an incremental structure from motion approach.

## Introduction

The Structure from Motion (SfM) problem is defined as the 3D reconstruction of a scene from a set of unordered and uncalibrated images. There are different SfM approaches: global, hierarchical or incremental. In this lab we will focus on incremental SfM, which is the most popular strategy for 3D reconstruction from unordered and uncalibrated photo collections.

Incremental SfM is a sequential processing pipeline with an iterative component. It starts from two views, from which the motion (camera parameters) is initialized first and then the structure (3D reconstruction). Next, both the motion and the structure are extended iteratively by progressively adding new cameras/views and 3D points. Finally, the camera matrices and the reconstructed point cloud are refined together. This last step is known as Bundle Adjustment and involves solving an optimization problem with iterative techniques.

The purpose of this lab is to become familiar with the structure from motion problem and to work with the main blocks of a basic (vanilla) incremental SfM algorithm, without the refinement step. A more complete and robust SfM pipeline would include -- apart from the bundle adjustment -- a careful selection of the two initial views and of each new view to incorporate, filtering of outliers, and some other tricks [3]. If you are interested in a more complete solution, here is a list of some of the best-known libraries and software packages:

- OpenMVG:  http://openmvg.readthedocs.io/en/latest/# <br>
Incremental and global SfM, open source.

- VisualSFM:  http://ccwu.me/vsfm/ <br>
Incremental SfM, very efficient, GUI, binaries.

- Bundler: http://www.cs.cornell.edu/~snavely/bundler/ <br>
Incremental SfM, open source.

- Colmap: http://colmap.github.io/ <br>
Incremental SfM, very efficient, nice GUI, open source.

- Theia: http://www.theia-sfm.org/ <br>
Incremental and Global SfM, very efficient, open source.

The solution of the structure from motion strongly relies on point correspondences (matchings) across the different views, commonly known as feature tracks. Then, one of the first things to do is feature extraction and matching, followed by geometric verification to remove outliers. 

As in previous labs, we will be using SIFT for estimating keypoints and matchings between pairs of images.  These matchings will contain outliers; these can be filtered by robustly estimating a fundamental matrix. Moreover, the fundamental matrix that relates the two initial views will be used to estimate the camera parameters (motion) of these two views.
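As a reminder of what sits inside the robust estimation, here is a minimal sketch of the normalised eight-point algorithm that a RANSAC loop would call on each random sample. The function names `normalise_points` and `eight_point` are illustrative and are not taken from the lab's `utils` module; the convention $x_2^T F x_1 = 0$ is an assumption.

```python
import numpy as np


def normalise_points(x):
    """Move the centroid of homogeneous 2D points (3xN) to the origin and
    scale them so that the mean distance to the origin is sqrt(2)."""
    x = x / x[2]
    centroid = x[:2].mean(axis=1)
    mean_dist = np.linalg.norm(x[:2] - centroid[:, None], axis=0).mean()
    s = np.sqrt(2) / mean_dist
    T = np.array([[s, 0, -s * centroid[0]],
                  [0, s, -s * centroid[1]],
                  [0, 0, 1]])
    return T @ x, T


def eight_point(x1, x2):
    """Estimate F with x2^T F x1 = 0 from N >= 8 correspondences (3xN each)."""
    x1n, T1 = normalise_points(x1)
    x2n, T2 = normalise_points(x2)
    # Each correspondence contributes one row of the linear system A f = 0,
    # where f holds the entries of F in row-major order.
    A = np.column_stack([
        x2n[0] * x1n[0], x2n[0] * x1n[1], x2n[0],
        x2n[1] * x1n[0], x2n[1] * x1n[1], x2n[1],
        x1n[0], x1n[1], np.ones(x1n.shape[1]),
    ])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalisation and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)
```

In a RANSAC setting this would be run on minimal samples of 8 matches, keeping the model with the most inliers and re-estimating it from all of them at the end.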

You will have to answer the questions and complete the provided code where required. **You must deliver the completed (and executed) ipynb file, including the answers to the questions (please make the answers visually clear, either by preceding them with ANSWER and/or by changing their color).**


In [None]:
# !pip install -r requirements.txt

In [None]:
import logging
import math
import random
import sys

import cv2
import matplotlib.pyplot as plt
import numpy as np
import plotly.graph_objects as go
import scipy.io as sio
import seaborn_image as isns
from tqdm.notebook import tqdm
from utils import *

We will work with 3 different views of a scene from the UPF campus.

In [None]:
# Read images
img1_color = cv2.imread("images/v1.jpg", cv2.IMREAD_COLOR)
img2_color = cv2.imread("images/v2.jpg", cv2.IMREAD_COLOR)
img3_color = cv2.imread("images/v3.jpg", cv2.IMREAD_COLOR)

img1_color = cv2.cvtColor(img1_color, cv2.COLOR_BGR2RGB)
img2_color = cv2.cvtColor(img2_color, cv2.COLOR_BGR2RGB)
img3_color = cv2.cvtColor(img3_color, cv2.COLOR_BGR2RGB)

# Reduce image size to speed up computations
original_shape = np.array(img1_color.shape[:2])
scale_percent = 50
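# cv2.resize expects dsize as (width, height); plain Python ints are the safest input
rescaled_shape = tuple(int(v) for v in np.flip(original_shape * scale_percent // 100))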
img1r = cv2.resize(img1_color, rescaled_shape)
img2r = cv2.resize(img2_color, rescaled_shape)
img3r = cv2.resize(img3_color, rescaled_shape)

In [None]:
g = isns.ImageGrid([img1_color, img2_color, img3_color], height=5, cmap="gray")
for i, axis in enumerate(g.axes[0], start=1):
    axis.set_title(f"View {i}")
plt.show()


**Q1.** Robustly estimate the fundamental matrix that relates views 1 and 2, and views 1 and 3. For each case, display the inlier matchings together with the pair of images.

We recommend running SIFT with up to 13000 features per image instead of 3000, the value used in previous labs.

**Hint**: For a faster convergence of the RANSAC algorithm you can filter matched keypoints based on their matching distance. You can filter out matches that have a distance bigger than a threshold (e.g. 200).



In [None]:
# TODO: Find keypoints
th = 3
keypoints = [None]*3

# Initiating SIFT detector
sift = cv2.SIFT_create(13000)

# Finding the keypoints and descriptors
keypoints[0], des1 = sift.detectAndCompute(img1r, None)
keypoints[1], des2 = sift.detectAndCompute(img2r, None)
keypoints[2], des3 = sift.detectAndCompute(img3r, None)

# Keypoint matching
bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches_12 = bf.match(des1, des2)
matches_13 = bf.match(des1, des3)

# Show matches
def plot_matches(img1, img2, kp1, kp2, matches, title="Matches"):
    """
    Plot matches between two images.
    """
    img_matches = cv2.drawMatches(
        img1,
        kp1,
        img2,
        kp2,
        matches,
        None,
        flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS,
    )
    plt.figure(figsize=(20, 10))
    plt.imshow(img_matches)
    plt.axis("off")
    plt.title(title)
    plt.show()
plot_matches(img1r, img2r, keypoints[0], keypoints[1], matches_12, "Matches 1-2")
plot_matches(img1r, img3r, keypoints[0], keypoints[2], matches_13, "Matches 1-3")

In [None]:
# TODO: Filter matches based on hint
old_matches_12 = matches_12
old_matches_13 = matches_13

matches_12 = [match for match in matches_12 if match.distance < 200]
matches_13 = [match for match in matches_13 if match.distance < 200]

print(f"Images 1-2 - {len(matches_12)/len(old_matches_12)*100:.2f}% of matches after filtering: {len(matches_12)} of {len(old_matches_12)}")
print(f"Images 1-3 - {len(matches_13)/len(old_matches_13)*100:.2f}% of matches after filtering: {len(matches_13)} of {len(old_matches_13)}")

# Show filtered matches
plot_matches(img1r, img2r, keypoints[0], keypoints[1], matches_12, "Filtered Matches 1-2")
plot_matches(img1r, img3r, keypoints[0], keypoints[2], matches_13, "Filtered Matches 1-3")

In [None]:
# TODO: Find the fundamental matrices and inlier matches for views 1-2 and 1-3

points1_H12, points2_H12 = correspondences_between_keypoints(keypoints[0], keypoints[1], matches_12)
points1_H13, points3_H13 = correspondences_between_keypoints(keypoints[0], keypoints[2], matches_13)

F_12, inliers_12 = ransac_fundamental_matrix(points1_H12, points2_H12, th, 1000)
F_13, inliers_13 = ransac_fundamental_matrix(points1_H13, points3_H13, th, 1000)

matches_12 = np.array(matches_12)[inliers_12].tolist()
matches_13 = np.array(matches_13)[inliers_13].tolist()

In [None]:
# Show inlier matches for views 1-2 and 1-3
# NOTE: make sure that matches is an array of cv2.DMatch objects
plot_matches(img1r, img2r, keypoints[0], keypoints[1], matches_12, "Inlier Matches for F 1-2")
plot_matches(img1r, img3r, keypoints[0], keypoints[2], matches_13, "Inlier Matches for F 1-3")

<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color:rgba(255, 255, 255, 0);">
  <strong>🎥 Video Question 1:</strong>
  <ul style="margin: 10px 0 0 20px; padding: 0;">
    <li>Explain the procedure followed above.</li>
    <li>How did you find keypoints?</li>
    <li>How did you match them?</li>
    <li>How did you filter them?</li>
    <li>How did you find the Fundamental matrix?</li>
    <li>Which distance was applied to determine which matches were inliers? Does it measure the distance between two points or the distance between a point and a line?</li>
  </ul>
</div>
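For reference when thinking about the last question, the sketch below shows one common way to score a match against a fundamental matrix: the distance in pixels between a point and the epipolar line induced by its counterpart. This is only an illustration. The metric actually used inside `ransac_fundamental_matrix` is defined in `utils` and may differ (for example, a Sampson distance), and both function names here are hypothetical.

```python
import numpy as np


def epipolar_line_distance(F, x1, x2):
    """Distance (in pixels) from each point in x2 to the epipolar line F @ x1.

    x1, x2 are 3xN homogeneous points; F satisfies x2^T F x1 = 0.
    """
    x1 = x1 / x1[2]
    x2 = x2 / x2[2]
    lines = F @ x1                               # lines l2 = (a, b, c) in image 2
    num = np.abs(np.sum(x2 * lines, axis=0))     # |a*u + b*v + c|
    den = np.hypot(lines[0], lines[1])           # sqrt(a^2 + b^2)
    return num / den


def symmetric_epipolar_distance(F, x1, x2):
    """Sum of the point-to-line distances measured in both images."""
    return epipolar_line_distance(F, x1, x2) + epipolar_line_distance(F.T, x2, x1)
```

A match would then be counted as an inlier when this value falls below the threshold `th`.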

## Initial two-view reconstruction and self-calibration
In lab 4 we saw how to recover the motion between a pair of calibrated cameras. In SfM, cameras are not calibrated, but different self-calibration techniques can be applied for estimating the intrinsic parameters.
 
In case we work with images of man-made environments, like urban or indoor scenes, it is possible to estimate vanishing points of orthogonal directions. Vanishing points are useful for self-calibration because they allow us to establish constraints on the internal parameters of the camera (intrinsics). There are different methods in the literature to estimate vanishing points. For example, the following ones are implemented in Matlab:

- Vanishing Point Detection in Urban Scenes Using Point Alignments [1, 2]: <br>
http://www.ipol.im/pub/art/2017/148/

- Orthogonal Vanishing Points in Uncalibrated Images of Man-Made Environments [4]: <br>
https://members.loria.fr/GSimon/software/fastvp/

And this one in Python:
- NeurVPS: Neural Vanishing Point Scanning via Conic Convolution: [6] <br>
https://github.com/zhou13/neurvps

We will assume square pixels (aspect ratio of 1), zero skew factor and principal point at the center of the image, which is common in most commercial cameras.  Then, the only remaining unknown in the matrix of internal parameters is the focal length $\alpha$, but it can be estimated using a pair of vanishing points as we will explain now (see also Section 6.3.2 of Szeliski's book [5]). 

Under the previous assumptions, the camera calibration matrix, $K$, can be written as: <br>

$$ K=\begin{pmatrix} \alpha _x & s & c_x \\ 0 & \alpha _y & c_y \\ 0 & 0 & 1\end{pmatrix}
\overset{\text{Assumptions}}{\longrightarrow}
K=\begin{pmatrix} \alpha & 0 & c_x \\ 0 & \alpha & c_y \\ 0 & 0 & 1\end{pmatrix}$$
where the only unknown is $\alpha$, since $c_x = \frac{w}{2}$ and $c_y = \frac{h}{2}$, with $w$ and $h$ being the image width and height in pixels, respectively.

Let us assume that we have detected two or more orthogonal vanishing points, all of which are finite, i.e., they are not obtained from lines that appear to be parallel in the image plane. The projection equation for the vanishing point $\mathbf{v}_1$ corresponding to the cardinal direction $(1, 0, 0)^T$ can be written as
$$ \mathbf{v}_1 \sim P \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = K  \begin{pmatrix}\mathbf{r}_1 & \mathbf{r}_2 & \mathbf{r}_3 & \mathbf{t} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} = K \mathbf{r}_1$$
If we denote the coordinates of the vanishing point $\mathbf{v}_1$ as $x_1$ and $y_1$ we have:
$$ \mathbf{r}_1 \sim K^{-1} \mathbf{v}_1 = \begin{pmatrix} 1/\alpha & 0 & -c_x/\alpha \\ 0 & 1/\alpha & -c_y/\alpha \\ 0 & 0 & 1 \end{pmatrix} 
 \begin{pmatrix} x_1 \\ y_1 \\ 1 \end{pmatrix} = 
 \begin{pmatrix} (x_1 - c_x) / \alpha \\ (y_1 -c_y) / \alpha \\ 1 \end{pmatrix}
 \sim \begin{pmatrix} x_1 - c_x \\ y_1 -c_y \\ \alpha \end{pmatrix}$$
And in general, for the vanishing point $\mathbf{v}_i$, $i=1,2,3$, corresponding to the cardinal directions $(1, 0, 0)^T$, $(0, 1, 0)^T$ and $(0, 0, 1)^T$ respectively, and with $\mathbf{r}_i$ being the $i$-th column of the rotation matrix $R$, we have:
$$ \mathbf{r}_i \sim  \begin{pmatrix} x_i - c_x \\ y_i -c_y \\ \alpha \end{pmatrix}$$
From the orthogonality between columns of the rotation matrix, we have:
$$ \mathbf{r}_i  \cdot \mathbf{r}_j = (x_i - c_x)(x_j - c_x)+(y_i - c_y)(y_j - c_y)+ \alpha^2 =0, $$
from which we can obtain an estimate for $\alpha$:
$$ \alpha = \sqrt{-(x_i - c_x)(x_j - c_x)-(y_i - c_y)(y_j - c_y)}.$$
Then, it is possible to estimate $\alpha$, and thus $K$, using two vanishing points corresponding to orthogonal directions. In our case, all the images have been taken with the same camera, so all of them will share the same $K$.
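The formula can be sanity-checked on synthetic data before applying it to the real vanishing points. The sketch below builds a camera with a known focal length, projects two orthogonal directions to obtain their vanishing points, and recovers $\alpha$ from them. All numeric values (focal length, principal point, rotation angles) are arbitrary assumptions chosen for illustration, and `focal_from_vanishing_points` is a hypothetical helper name.

```python
import numpy as np


def focal_from_vanishing_points(v_i, v_j, c):
    """Focal length from two finite vanishing points of orthogonal directions.

    v_i, v_j: vanishing points in pixels, shape (2,).
    c: principal point (c_x, c_y) in pixels, shape (2,).
    """
    s = -np.dot(v_i - c, v_j - c)
    if s <= 0:
        raise ValueError("The vanishing points are not consistent with orthogonal directions.")
    return np.sqrt(s)


# Synthetic camera (assumed values)
alpha_true = 1200.0
c = np.array([960.0, 540.0])
K_true = np.array([[alpha_true, 0, c[0]],
                   [0, alpha_true, c[1]],
                   [0, 0, 1]])

# A generic rotation, so that the vanishing points used below are finite
ax, ay = np.deg2rad(25), np.deg2rad(-35)
Rx = np.array([[1, 0, 0],
               [0, np.cos(ax), -np.sin(ax)],
               [0, np.sin(ax), np.cos(ax)]])
Ry = np.array([[np.cos(ay), 0, np.sin(ay)],
               [0, 1, 0],
               [-np.sin(ay), 0, np.cos(ay)]])
R_true = Rx @ Ry

# Column i of K R is the (homogeneous) vanishing point of the i-th cardinal direction
V = K_true @ R_true
v1 = V[:2, 0] / V[2, 0]
v2 = V[:2, 1] / V[2, 1]

alpha_est = focal_from_vanishing_points(v1, v2, c)
print(f"true alpha: {alpha_true}, estimated alpha: {alpha_est}")
```

With noise-free vanishing points the estimate should match the true focal length up to floating-point error. With real detections, the quantity under the square root can become negative, which is why the helper checks its sign.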

**Q2.** Provide the code to estimate the matrix of internal parameters following the previous directions. Which is the matrix you have obtained?

In [None]:
# Vanishing points are provided below
vp1 = np.array([1932.51919443, 3033.59044871])
vp2 = np.array([1516.57688111, -14998.62215829])
vp1, vp2 = vp1 * scale_percent / 100, vp2 * scale_percent / 100

In [None]:
# TODO: Complete
x1, y1 = vp1
x2, y2 = vp2
ny, nx, _ = img1r.shape # NOTE: remember (height, width)
cy = ny / 2 # REMEMBER HALF !! NOT FULL IMAGE SIZES
cx = nx / 2

alpha = np.sqrt(-(x1-cx)*(x2-cx)-(y1-cy)*(y2-cy))
K = np.array([[alpha, 0,     cx],
              [0,     alpha, cy],
              [0,     0,     1]])

print(f"alpha: {alpha}")
print(f"Intrinsic matrix K:\n{K}")

**Optional question 2**: 

From slides 29-32 of lecture 5 we can derive $\alpha$ in a different way. The image of the absolute conic, $\omega = (K K^T)^{-1}$, gives the angle between the 3D directions associated with two vanishing points: $cos(\theta) = \frac{v_1^T \omega v_2}{...}$. Since $v_1$ and $v_2$ correspond to orthogonal directions, the numerator must vanish, i.e. $v_1^T \omega v_2 = 0$.


Let's understand the shape of $\omega$. We know that in the general case
$$ K = \begin{pmatrix} \alpha_x & s & c_x \\ 0 & \alpha_y & c_y \\ 0 & 0 & 1\end{pmatrix} $$
Then, the value of $\omega$ can be computed and the result is the following one (extracted from Zhang paper provided in lab 2):

$$\omega = \begin{pmatrix}
\frac{1}{\alpha_x^2} & - \frac{s}{\alpha_x^2\alpha_y} & \frac{c_y s - c_x \alpha_y}{\alpha_x^2\alpha_y} \\
- \frac{s}{\alpha_x^2\alpha_y} & \frac{s^2}{\alpha_x^2\alpha_y^2} + \frac{1}{\alpha_y^2} & - \frac{s (c_y s - c_x \alpha_y)}{\alpha_x^2\alpha_y^2} - \frac{c_y}{\alpha_y^2} \\
\frac{c_y s - c_x \alpha_y}{\alpha_x^2\alpha_y} & - \frac{s (c_y s - c_x \alpha_y)}{\alpha_x^2\alpha_y^2} - \frac{c_y}{\alpha_y^2} & \frac{(c_y s - c_x \alpha_y)^2}{\alpha_x^2\alpha_y^2} + \frac{c_y^2}{\alpha_y^2} + 1\end{pmatrix}
$$

Apply the assumptions of square pixels and zero skew factor to derive the same equation for $\alpha$ as in the previous explanation.
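As a starting point, and only as one possible way of writing it up, setting $s = 0$ and $\alpha_x = \alpha_y = \alpha$ simplifies $\omega$ to

$$ \omega = \frac{1}{\alpha^2}\begin{pmatrix} 1 & 0 & -c_x \\ 0 & 1 & -c_y \\ -c_x & -c_y & c_x^2 + c_y^2 + \alpha^2 \end{pmatrix} $$

With $v_1 = (x_1, y_1, 1)^T$ and $v_2 = (x_2, y_2, 1)^T$, the constraint $v_1^T \omega v_2 = 0$ multiplied by $\alpha^2$ expands to

$$ x_1 x_2 - c_x (x_1 + x_2) + c_x^2 + y_1 y_2 - c_y (y_1 + y_2) + c_y^2 + \alpha^2 = 0, $$

which can be regrouped as

$$ (x_1 - c_x)(x_2 - c_x) + (y_1 - c_y)(y_2 - c_y) + \alpha^2 = 0. $$

This is the same orthogonality constraint obtained from the columns of the rotation matrix.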

<span style="color:green">

OPTIONAL ANSWER: write your theoretical answer in LaTeX or add a picture of your notes.

</span>

<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <ul style="margin: 0; padding-left: 20px;">
    <strong>🎥 OPTIONAL Video Question 2:</strong><br>
    <li>Explain the theoretical answer above.<br>
  </ul>
</div>

-------

In the first part of the lab we have estimated the fundamental matrix $F$ that relates the initial pair of views. With $F$ and $K$ we can estimate the essential matrix $E$ and from it we can get the complete camera matrices (intrinsics and extrinsics) for the initial pair of views. Once the cameras are fully calibrated an initial 3D reconstruction (structure) is found by triangulation. These steps were part of lab 3.
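As a refresher on the step from $E$ to the extrinsics, the sketch below lists the four $(R, t)$ candidates obtained from the SVD of an essential matrix. It is a minimal illustration under the convention $E = [t]_\times R$ with the first camera at $[I \mid 0]$; the helper `camera_projection_matrix` used later in this notebook comes from `utils` and may be organised differently. The name `decompose_essential` is hypothetical.

```python
import numpy as np


def decompose_essential(E):
    """Return the four (R, t) candidates encoded by an essential matrix E.

    t is recovered only up to scale (unit norm here) and up to sign.
    """
    U, _, Vt = np.linalg.svd(E)
    # E is defined up to sign, so we may flip U or Vt to obtain proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])
    R_a = U @ W @ Vt
    R_b = U @ W.T @ Vt
    t = U[:, 2]  # left null vector of E
    return [(R_a, t), (R_a, -t), (R_b, t), (R_b, -t)]
```

Only one of the four candidates places the triangulated points in front of both cameras, so the usual way to pick it is to triangulate a few inlier matches with each candidate and keep the one with positive depths in both views.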


**Q3.** Estimate the camera matrices for views 1 and 2. 


In [None]:
# TODO: Find P1
P1 = K @ np.eye(3, 4)
print(f"P1: {P1}")

# TODO: Find Essential matrix for view 1-2
E_12 = K.T @ F_12 @ K
print(f"Essential matrix E_12:\n{E_12}")

# TODO: Find P2
ny, nx, _ = img1r.shape
P2 = camera_projection_matrix(F_12, K, nx, ny, points1_H12, points2_H12, inliers_12)
print(f"P2: {P2}")

# Show the projection matrices
ny, nx, _ = img1r.shape
fig = go.Figure()
plot_camera(P1, nx, ny, fig, "Camera 1")
plot_camera(P2, nx, ny, fig, "Camera 2")
fig.update_layout(
    title="Camera Projection Matrices",
    scene=dict(
        xaxis_title="X",
        yaxis_title="Y",
        zaxis_title="Z",
        aspectmode="data"
    ),
)
fig.show()

<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <ul style="margin: 0; padding-left: 20px;">
    <strong>🎥 Video Question (Q3):</strong><br>
    <li>Explain the procedure followed to find P1, E_12, and P2. Explain it step by step referring to the code above.<br>
  </ul>
</div>


**Q4.** Triangulate the matches from views 1 and 2 and plot them together with the cameras.
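If you need a reference for the triangulation itself, the following is a minimal sketch of linear (DLT) triangulation together with a simple depth test. It assumes pixel coordinates given as 2xN arrays and projection matrices of the form $K[R \mid t]$ with $K_{33} = 1$; the function names are illustrative and are not the ones provided in `utils`.

```python
import numpy as np


def triangulate_dlt(P1, P2, x1, x2):
    """Linear triangulation of N correspondences.

    P1, P2: 3x4 projection matrices.
    x1, x2: 2xN pixel coordinates in views 1 and 2.
    Returns a 4xN array of homogeneous 3D points with last coordinate 1.
    """
    n = x1.shape[1]
    X = np.zeros((4, n))
    for i in range(n):
        u1, v1 = x1[0, i], x1[1, i]
        u2, v2 = x2[0, i], x2[1, i]
        # Each view contributes two linear equations in the unknown point
        A = np.stack([
            u1 * P1[2] - P1[0],
            v1 * P1[2] - P1[1],
            u2 * P2[2] - P2[0],
            v2 * P2[2] - P2[1],
        ])
        _, _, Vt = np.linalg.svd(A)
        X[:, i] = Vt[-1] / Vt[-1, 3]
    return X


def points_in_front(P1, P2, X):
    """Boolean mask of points with positive depth in both cameras.

    Valid when X has last coordinate 1 and P = K [R | t] with K[2, 2] = 1.
    """
    return ((P1 @ X)[2] > 0) & ((P2 @ X)[2] > 0)
```

The mask returned by `points_in_front` can be used to index the columns of `X`, and the same mask should be applied to the image points used for colouring so that both stay aligned.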

In [None]:
# complete ...
# TODO: Triangulate all matches
X = ...

print(f"Number of points in the point cloud: {X.shape[1]}")


# TODO: Remove bad-triangulated points (Z < 0) for a proper visualization

print(f"Number of points in the point cloud after removing errors: {X.shape[1]}")

# Render the 3D point cloud
fig = go.Figure()
plot_camera(P1, nx, ny, fig, "Camera 1")
plot_camera(P2, nx, ny, fig, "Camera 2")
x_img = inliers_1_12[:2].T.astype(int) # inliers_1_12 are the points in the first image for inliers of view 1-2
rgb_vals = img1r[x_img[:, 1], x_img[:, 0]]
rgb_vals = [f"rgb({int(r)},{int(g)},{int(b)})" for r, g, b in rgb_vals]
point_color = [(255, 0, 0), (0, 255, 0)]
fig.add_trace(go.Scatter3d(x=X[0, :], y=X[2, :], z=-X[1, :], mode="markers", marker=dict(size=2, color=rgb_vals)))
fig.update_layout(
    title="3D Point Cloud - View 1-2",
    scene=dict(
        xaxis_title="X",
        yaxis_title="Y",
        zaxis_title="Z",
        aspectmode="data"
    ),
)
fig.show()


<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <ul style="margin: 0; padding-left: 20px;">
    <strong>🎥 Video Question (Q4):</strong><br>
    <li>Explain the results obtained. Is the result correct? Can you find objects in the 3D point cloud?<br>
  </ul>
</div>

## Estimate new camera pose from structure

At this point we have reconstructed some 3D points from the point correspondences in the initial pair of views. If we are able to find a sufficient number of correspondences, in a new image, of the already reconstructed keypoints we will have a set of 3D-2D correspondences that can be used to calibrate the new view. For that, we will use the resectioning method (lecture 6) that needs  at least six 3D-2D correspondences. Alternatively, other methods, like $PnP$ approaches can be used for this purpose (since all cameras share the same matrix $K$).

**Q5.** Find the intersection matches between matches 1-2 and 1-3.
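The idea is that a keypoint of view 1 that is matched in both view 2 and view 3 links a triangulated 3D point to a 2D observation in view 3. A minimal sketch of that bookkeeping is shown below. To keep it runnable without OpenCV it uses a hypothetical `Match` tuple that mimics the `queryIdx` and `trainIdx` attributes of `cv2.DMatch`, and it assumes each `queryIdx` appears at most once per list (which holds for cross-checked brute-force matching).

```python
from collections import namedtuple

# Hypothetical stand-in for cv2.DMatch, used only to keep the example self-contained
Match = namedtuple("Match", ["queryIdx", "trainIdx"])


def intersect_matches(matches_a, matches_b):
    """Keep the matches whose query keypoint (view 1) appears in both lists.

    Returns two lists of equal length, aligned so that element k of each
    list refers to the same keypoint in view 1.
    """
    by_query_b = {m.queryIdx: m for m in matches_b}
    common_a, common_b = [], []
    for m in matches_a:
        other = by_query_b.get(m.queryIdx)
        if other is not None:
            common_a.append(m)
            common_b.append(other)
    return common_a, common_b


# Toy usage: keypoints 2 and 7 of view 1 are matched in both other views
m12 = [Match(0, 5), Match(2, 9), Match(7, 1)]
m13 = [Match(2, 4), Match(3, 8), Match(7, 6)]
common_12, common_13 = intersect_matches(m12, m13)
print([m.queryIdx for m in common_12])
```

Because the two output lists are aligned, the position of a match in the first list can also be used to look up the 3D point triangulated from it, provided the point cloud keeps the same ordering as the inlier matches.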


In [None]:
# TODO: Find intersection matches between views 1-2 and 1-3
intersect_12, intersect_13 = [], []
# complete ...
kp1 = ...
kp2 = ...
kp3 = ...

In [None]:

# Print number of intersection matches and show matches on images
print(f"Number of intersection matches: {kp1.shape[1]} out of {len(matches_12)}")
plot_matches(img1r, img2r, keypoints[0], keypoints[1], intersect_12, "Intersection Matches 1–2")
plot_matches(img1r, img3r, keypoints[0], keypoints[2], intersect_13, "Intersection Matches 1–3")


<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <ul style="margin: 0; padding-left: 20px;">
    <strong>🎥 Video Question (Q5):</strong><br>
    <li>Briefly explain the purpose of the code above.<br>
  </ul>
</div>


**Q6.** Create the function `resectioning` to calibrate the 3rd view and establish the proper entries to the function.
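For orientation, the core of resectioning is a DLT: every 3D-2D correspondence gives two linear equations in the 12 entries of $P$, so six correspondences are enough to determine its 11 degrees of freedom. The sketch below shows only that linear step on clean data. It is not the full `resectioning` function requested here, which also returns inliers and would therefore wrap this step in a robust loop such as RANSAC; the name `resection_dlt` and the array layouts are assumptions.

```python
import numpy as np


def resection_dlt(points_2d, points_3d):
    """Estimate a 3x4 projection matrix (up to scale) from N >= 6 correspondences.

    points_2d: 2xN pixel coordinates.
    points_3d: 3xN Euclidean 3D coordinates.
    """
    n = points_2d.shape[1]
    Xh = np.vstack([points_3d, np.ones(n)])
    A = np.zeros((2 * n, 12))
    for i in range(n):
        X = Xh[:, i]
        u, v = points_2d[:, i]
        # u * (p3 . X) = p1 . X   and   v * (p3 . X) = p2 . X
        A[2 * i, 0:4] = X
        A[2 * i, 8:12] = -u * X
        A[2 * i + 1, 4:8] = X
        A[2 * i + 1, 8:12] = -v * X
    # The solution is the right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)
```

With noisy data it is advisable to normalise the 2D and 3D points before building the system, and to score each candidate $P$ by its reprojection error.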

In [None]:
# TODO: Define points_2d and points_3d
points_2d = ...
points_3d = ...

In [None]:

# TODO: Define resectioning function
def resectioning(points_2d, points_3d): ...

# Find P3
P3_resectioned, inliers_3 = resectioning(points_2d, points_3d)
print(f"Number of inliers for P3: {len(inliers_3)}")

<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <ul style="margin: 0; padding-left: 20px;">
    <strong>🎥 Video Question (Q6):</strong><br>
    <li>Explain the resectioning technique step by step.</li>
    <li>How many correspondences are required? Why?</li><br>
  </ul>
</div>

You will obtain $P_3$ by resectioning. To plot it, however, we first have to normalize it, because its overall scale factor is arbitrary. To do that, decompose the projection matrix into $K_3$, $R_3$ and $t_3$, normalize $K_3$ by its last element, dehomogenise $t_3$, and compute the normalised projection matrix of camera 3 (stored as `P3` in the code below).
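To make the normalisation concrete, here is a NumPy-only sketch that decomposes a projection matrix with an RQ factorisation and rebuilds it as $K[R \mid t]$ with $K_{33} = 1$. It is an alternative illustration, not the OpenCV-based route asked for in the next cell, and `decompose_and_normalise` is a hypothetical name.

```python
import numpy as np


def decompose_and_normalise(P):
    """Split P ~ K [R | t] and return (K, R, t, P_norm) with K[2, 2] = 1."""
    P = np.asarray(P, dtype=float)
    # P is defined up to scale; pick the sign that yields det(R) = +1
    if np.linalg.det(P[:, :3]) < 0:
        P = -P
    M = P[:, :3]
    # RQ factorisation M = K R, obtained from a QR factorisation of the flipped matrix
    J = np.flipud(np.eye(3))
    Q, U = np.linalg.qr((J @ M).T)
    K = J @ U.T @ J          # upper triangular
    R = J @ Q.T              # orthogonal
    # Make the diagonal of K positive (D @ D = I, so K R is unchanged)
    D = np.diag(np.sign(np.diag(K)))
    K, R = K @ D, D @ R
    # The last column of P equals K t (K still carries the global scale here)
    t = np.linalg.solve(K, P[:, 3])
    K_norm = K / K[2, 2]
    P_norm = K_norm @ np.column_stack((R, t))
    return K_norm, R, t, P_norm
```

When using `cv2.decomposeProjectionMatrix` instead, it is worth checking what its third output represents: to our understanding it is the homogeneous camera centre $C$ rather than $t$, in which case $t = -R\,C$ after dehomogenising. A quick test is to verify that the rebuilt matrix reprojects the 3D points onto the same pixels as the original $P_3$.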

In [None]:
# TODO: Decompose P3
K_3, R_3, t_3 = cv2.decomposeProjectionMatrix(P3_resectioned)[:3]
K_3 /= ...
t_3 = t_3[:, 0] / ...
P3 = ??? @ np.column_stack((???, ???))

print(f"Previous K:{K}", f"K_3: {K_3}", f"R_3: {R_3}", f"t_3: {t_3}", f"P3: {P3}", sep="\n")

<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <strong>🎥 Video Question (Q7):</strong>
</div>

Try to understand the values of $K_3$, $R_3$ and $t_3$.
- Is $K_3$ similar to the previously estimated $K$? Should it be similar?
- What can you interpret from $R_3$? What is this rotation like? What is the reference coordinate system of this rotation?
- What can you interpret from $t_3$? What is this translation like? What is the reference coordinate system of this translation?
- Visualise the plot below. Are your interpretations shown in the plot?

In [None]:
# Show the projection matrices
ny, nx, _ = img1r.shape
fig = go.Figure()
plot_camera(P1, nx, ny, fig, "Camera 1")
plot_camera(P2, nx, ny, fig, "Camera 2")
plot_camera(P3, nx, ny, fig, "Camera 3")
fig.update_layout(
    title="Estimated Camera Projection Matrices",
    scene=dict(
        xaxis_title="X",
        yaxis_title="Y",
        zaxis_title="Z",
        aspectmode="data"
    ),
)
fig.show()

Now, let's show the 3D point clouds with all the cameras.

In [None]:
ny, nx, _ = img1r.shape
fig = go.Figure()
plot_camera(P1, nx, ny, fig, "Camera 1")
plot_camera(P2, nx, ny, fig, "Camera 2")
plot_camera(P3, nx, ny, fig, "Camera 3")
fig.add_trace(go.Scatter3d(x=X[0, :], y=X[2, :], z=-X[1, :], mode="markers", marker=dict(size=2, color=rgb_vals)))
fig.update_layout(
    title="Estimated 3D Points and Camera Projection Matrices",
    scene=dict(
        xaxis_title="X",
        yaxis_title="Y",
        zaxis_title="Z",
        aspectmode="data"
    ),
)
fig.show()

<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <strong>🎥 Video Question (Q8):</strong>
  <ul>
    <li>Evaluate qualitatively the 3D representation that you have obtained.</li>
    <li>Which elements can you recognise from the image? Is there anything poorly represented? Interpret why.</li>
  </ul>
</div>

**Optional 2**: Triangulate the matches from the 1st and 3rd views in order to add new 3D points to the point cloud. You can also use the function findFundamentalMat from OpenCV (in place of your own function created in previous labs).


## 3. References

Add here the material you used to complete this Lab. Cite and describe the usage of AI tools if any was used according to the Guidelines for AI tools.

TODO: Complete

<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <strong>🎥 Video Questions</strong>: Briefly mention the references.
</div>


<div style="border: 2px solid #007acc; border-radius: 10px; padding: 10px; background-color: rgba(255, 255, 255, 0);">
  <strong>🎥 Self-Assessment and Conclusions</strong>:
  <ul>
  <li><b>Which parts of the notebook did you succeed in? </b><br>
  <em>Describe the sections where you felt confident, and explain why you think they were successful.</em></li>
  <li><b>Which parts of the notebook did you fail to solve? </b><br>
  <em>Be honest about the areas where you faced difficulties. What challenges or issues did you encounter that you couldn’t resolve? How would you approach these issues in the future?</em></li>
  </ul>
  Is there anything else that you would like to comment?
</div>


### Lab References

[1] José Lezama, Rafael Grompone von Gioi, Gregory Randall, and Jean-Michel Morel. Finding vanishing points via point alignments in image primal and dual domains. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 509–515, 2014.

[2] José Lezama, Gregory Randall, and Rafael Grompone von Gioi. Vanishing Point Detection in Urban Scenes Using Point Alignments. Image Processing On Line, 7:131–164, 2017.

[3] Johannes Lutz Schönberger and Jan-Michael Frahm. Structure-from-motion revisited. In Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

[4] Gilles Simon, Antoine Fond, and Marie-Odile Berger. A simple and effective method to detect orthogonal vanishing points in uncalibrated images of man-made environments. In Eurographics, 2016.

[5] Richard Szeliski. Computer vision: algorithms and applications. Springer Science & Business Media, 2010.

[6] Yichao Zhou, Haozhi Qi, Jingwei Huang, and Yi Ma. Neurvps: Neural vanishing point scanning via conic convolution. NeurIPS 2019. 