# Activity: Triangulation

"Triangulation" is the standard way to add new points to an existing reconstruction in structure-from-motion. Given feature matches from a pair of images captured at poses that have already been estimated, we want to find the 3D position of the points to which the feature matches correspond (which we assume are *not* already part of the reconstruction). This is a slight generalization of the method you implemented in the "two-view reconstruction" activity.

# Theory

### The problem

Suppose you know $(R^B_A, p^B_A)$ and $(R^C_A, p^C_A)$. Suppose $n$ points
$$ \mathrm{p}_1, \dotsc, \mathrm{p}_n $$
are visible in image $B$ with coordinates
$$ b_1, \dotsc, b_n \in \mathbb{R}^2 $$
and in image $C$ with coordinates
$$ c_1, \dotsc, c_n \in \mathbb{R}^2. $$
The problem of **triangulation** is to find the coordinates
$$ p^A_1, \dotsc, p^A_n \in \mathbb{R}^3 $$
of the $n$ points in frame $A$.


### The solution

You already know how to solve this problem from the activity on two-view reconstruction. Begin by applying sequential transformation to compute the relative pose of frame $C$ in frame $B$:
$$ R^B_C = R^B_A (R^C_A)^\top \qquad\qquad p^B_C = p^B_A - R^B_A (R^C_A)^\top p^C_A. $$
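
As a quick numerical sanity check of this composition, here is a minimal sketch. The helpers `rot_z` and `rot_x` and all pose values are assumptions made up for illustration; they are not part of the activity's dataset.

```python
import numpy as np

def rot_z(theta):
    # Hypothetical helper: rotation by theta radians about the z axis
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def rot_x(theta):
    # Hypothetical helper: rotation by theta radians about the x axis
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1., 0., 0.], [0., c, -s], [0., s, c]])

# Assumed poses of frame A in frames B and C (illustrative values only)
R_inB_ofA, p_inB_ofA = rot_z(0.3), np.array([0.5, -0.1, 1.0])
R_inC_ofA, p_inC_ofA = rot_x(-0.2), np.array([-0.2, 0.4, 0.8])

# Relative pose of frame C in frame B
R_inB_ofC = R_inB_ofA @ R_inC_ofA.T
p_inB_ofC = p_inB_ofA - R_inB_ofC @ p_inC_ofA

# An arbitrary point, expressed in all three frames
p_inA = np.array([0.3, -0.7, 2.0])
p_inB = R_inB_ofA @ p_inA + p_inB_ofA
p_inC = R_inC_ofA @ p_inA + p_inC_ofA

# Mapping from C to B with the relative pose should reproduce p_inB
p_inB_via_C = R_inB_ofC @ p_inC + p_inB_ofC
```

If the formula is applied correctly, `p_inB_via_C` and `p_inB` agree to numerical precision.
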
Then, do the following five things for each $i\in\{1, \dotsc, n\}$.

First, compute normalized image coordinates:
$$ \beta_i = K^{-1} \begin{bmatrix} b_i \\ 1 \end{bmatrix} \qquad\qquad \gamma_i = K^{-1} \begin{bmatrix} c_i \\ 1 \end{bmatrix}. $$
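
The sketch below illustrates this step for a few pixels at once. The camera matrix and pixel coordinates are assumed values chosen so that the result is easy to check by hand.

```python
import numpy as np

# Assumed camera matrix (illustrative values)
K = np.array([[1500., 0., 1000.],
              [0., 1500., 500.],
              [0., 0., 1.]])

# Three assumed pixel coordinates, one per row; the first is the principal point
b = np.array([[1000., 500.],
              [1750., 500.],
              [1000., 1250.]])

# Append a 1 to each pixel coordinate, then solve K @ beta_i = [b_i; 1] for all i
b_h = np.column_stack([b, np.ones(len(b))])
beta = np.linalg.solve(K, b_h.T).T

print(beta)
# [[0.  0.  1. ]
#  [0.5 0.  1. ]
#  [0.  0.5 1. ]]
```

Using `np.linalg.solve` rather than forming $K^{-1}$ explicitly is a design choice, not a requirement; both give the same result here.
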

Second, compute the depth
$$ \lambda_{c_i} = \frac{u^\top v}{u^\top u} $$
of point $i$ in image $C$, where
$$ u = \widehat{\beta_i} R^B_C \gamma_i $$
and
$$ v = - \widehat{\beta_i} p^B_C.$$
Check that $\lambda_{c_i} > 0$ (i.e., that point $i$ is in front of the camera at frame $C$) and either throw an error or discard the point if this condition is violated.
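
For reference, here is a short derivation of the formula for $\lambda_{c_i}$ above, assuming the pinhole model in which $p^B_i = \lambda_{b_i} \beta_i$ and $p^C_i = \lambda_{c_i} \gamma_i$. Write the coordinate transformation from frame $C$ to frame $B$ as
$$ \lambda_{b_i} \beta_i = \lambda_{c_i} R^B_C \gamma_i + p^B_C. $$
Multiplying both sides on the left by $\widehat{\beta_i}$ eliminates $\lambda_{b_i}$, because $\widehat{\beta_i} \beta_i = 0$:
$$ 0 = \lambda_{c_i} \widehat{\beta_i} R^B_C \gamma_i + \widehat{\beta_i} p^B_C \qquad\Longrightarrow\qquad \lambda_{c_i} u = v. $$
This is three equations in one unknown. The value of $\lambda_{c_i}$ that minimizes $\| \lambda_{c_i} u - v \|^2$ is found by setting the derivative to zero:
$$ 2 u^\top \left( \lambda_{c_i} u - v \right) = 0 \qquad\Longrightarrow\qquad \lambda_{c_i} = \frac{u^\top v}{u^\top u}. $$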

Third, compute the depth
$$ \lambda_{b_i} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}^\top\left( \lambda_{c_i} R^B_C \gamma_i + p^B_C \right) $$
of point $i$ in image $B$. Check that $\lambda_{b_i} > 0$ (i.e., that point $i$ is in front of the camera at frame $B$) and either throw an error or discard the point if this condition is violated.

Fourth, compute the coordinates
$$ p^C_i = \lambda_{c_i} \gamma_i $$
of point $i$ in frame $C$.

Fifth, apply coordinate transformation to compute the coordinates
$$ p^A_i = (R^C_A)^\top \left( p^C_i - p^C_A \right) $$
of point $i$ in frame $A$.

# Practice

## Set up notebook

Do all imports.

In [None]:
import numpy as np
from scipy.spatial.transform import Rotation
from scipy.linalg import block_diag
import cv2

Create a random number generator with a fixed seed so that results are reproducible.

In [None]:
rng = np.random.default_rng(0)

Define a function that constructs the skew-symmetric matrix

$$ \widehat{v} = \begin{bmatrix} 0 & -v_3 & v_2 \\ v_3 & 0 & -v_1 \\ -v_2 & v_1 & 0 \end{bmatrix} \in \mathbb{R}^{3 \times 3} $$

that is associated with a vector

$$ v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \in \mathbb{R}^3. $$

In [None]:
def skew(v):
    assert isinstance(v, np.ndarray)
    assert(v.shape == (3,))
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])
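
A quick way to check a function like this is to compare it against the cross product, since $\widehat{v} w = v \times w$. The sketch below repeats the construction so that it runs on its own; the test vectors are arbitrary.

```python
import numpy as np

def skew(v):
    # Same construction as above, repeated so this check is self-contained
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

# Arbitrary test vectors
v = np.array([1., 2., 3.])
w = np.array([-4., 0.5, 2.])
S = skew(v)

cross_ok = np.allclose(S @ w, np.cross(v, w))   # skew(v) @ w equals v x w
antisym_ok = np.allclose(S.T, -S)               # skew-symmetry
null_ok = np.allclose(S @ v, 0.)                # v x v = 0
```

The last property is the one that the depth computation relies on when it multiplies by $\widehat{\beta_i}$.
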

Define a function to apply a coordinate transformation to each point (i.e., each row) in an array of points.

In [None]:
def apply_transform(R_inB_ofA, p_inB_ofA, p_inA):
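    # Transform each point (one per row) from frame A to frame B.
    # np.vstack is used because np.row_stack is deprecated in NumPy 2.0.
    p_inB = np.vstack([
        (R_inB_ofA @ p_inA_i + p_inB_ofA) for p_inA_i in p_inA
    ])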
    return p_inB
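
As a usage example, the sketch below applies a row-by-row transformation of this kind to two points and compares it with an equivalent vectorized expression. The pose and the points are assumed values chosen so the result can be checked by hand.

```python
import numpy as np

def apply_transform(R_inB_ofA, p_inB_ofA, p_inA):
    # Row-by-row version equivalent to the one above (repeated so this runs on its own)
    return np.vstack([R_inB_ofA @ p_inA_i + p_inB_ofA for p_inA_i in p_inA])

# Assumed pose: rotation by 90 degrees about z, plus a translation
R = np.array([[0., -1., 0.],
              [1., 0., 0.],
              [0., 0., 1.]])
p = np.array([1., 2., 3.])

# Two points in frame A, one per row
p_inA = np.array([[1., 0., 0.],
                  [0., 1., 0.]])

p_inB_loop = apply_transform(R, p, p_inA)
p_inB_vec = p_inA @ R.T + p   # same result, using broadcasting over rows

print(p_inB_loop)
# [[1. 3. 3.]
#  [0. 2. 3.]]
```

The vectorized form works because transforming a row vector $x^\top$ by $R$ is the same as computing $x^\top R^\top$.
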

Define a function to print things nicely.

In [None]:
def myprint(M):
    if M.shape:
        with np.printoptions(linewidth=150, formatter={'float': lambda x: f'{x:10.4f}'}):
            print(M)
    else:
        print(f'{M:10.4f}')

## Create dataset

Choose intrinsic parameters, i.e., the camera matrix

$$K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}.$$

In [None]:
K = np.array([
    [1500., 0., 1000.],
    [0., 1500., 500.],
    [0., 0., 1.],
])

Choose extrinsic parameters, i.e., the poses **of frame $A$ in frame $B$** and **of frame $A$ in frame $C$**.

In [None]:
# A in W
R_inW_ofA = Rotation.from_rotvec((0.05 * np.pi) * np.array([1., 0., 0.])).as_matrix()
p_inW_ofA = np.array([0.0, 0.0, -1.0])

# B in W
R_inW_ofB = Rotation.from_rotvec((0.05 * np.pi) * np.array([0., 1., 0.])).as_matrix()
p_inW_ofB = np.array([0.5, 0.0, -1.1])

# C in W
R_inW_ofC = Rotation.from_rotvec((0.05 * np.pi) * np.array([0., 0., 1.])).as_matrix()
p_inW_ofC = np.array([0.2, -0.1, -0.8])

# A in B
R_inB_ofA = R_inW_ofB.T @ R_inW_ofA
p_inB_ofA = R_inW_ofB.T @ (p_inW_ofA - p_inW_ofB)

# A in C
R_inC_ofA = R_inW_ofC.T @ R_inW_ofA
p_inC_ofA = R_inW_ofC.T @ (p_inW_ofA - p_inW_ofC)

Sample points $p^A_1, \dotsc, p^A_{n}$. We assume (1) that these points are *not* already part of the reconstruction and (2) that these points are visible in images $B$ and $C$.

In [None]:
n = 10
p_inW = rng.uniform(low=[-1., -1., -0.5], high=[1., 1., 2.5], size=(n, 3))
p_inA_true = apply_transform(R_inW_ofA.T, -R_inW_ofA.T @ p_inW_ofA, p_inW)

Project points into images $B$ and $C$.

In [None]:
def project(K, R_inC_ofA, p_inC_ofA, p_inA):
    p_inC = apply_transform(R_inC_ofA, p_inC_ofA, p_inA)
    assert(np.all(p_inC[:, 2] > 0))
    # Divide by depth and apply K (np.vstack replaces the deprecated np.row_stack)
    q = np.vstack([K @ p_inC_i / p_inC_i[2] for p_inC_i in p_inC])
    return q[:, 0:2]

b = project(K, R_inB_ofA, p_inB_ofA, p_inA_true)
c = project(K, R_inC_ofA, p_inC_ofA, p_inA_true)

Knowns:

* `b` and `c` are the image coordinates $b_1, \dotsc, b_n$ and $c_1, \dotsc, c_n$ of projected points
* `R_inB_ofA` and `p_inB_ofA` are the orientation and position (i.e., the pose) of frame $A$ in frame $B$
* `R_inC_ofA` and `p_inC_ofA` are the orientation and position (i.e., the pose) of frame $A$ in frame $C$

Unknowns:

* `p_inA_true` is the true value of $p^A_1, \dotsc, p^A_n$

## Get reference solution with OpenCV

Estimate $p^A_1, \dotsc, p^A_n$.

In [None]:
points = cv2.triangulatePoints(
    K @ np.column_stack([R_inB_ofA, p_inB_ofA]),
    K @ np.column_stack([R_inC_ofA, p_inC_ofA]),
    b.copy().T,
    c.copy().T,
)

# Normalize points
points /= points[-1, :]

# Extract non-homogeneous coordinates
p_inA_cv = points[0:3, :].T
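
`cv2.triangulatePoints` returns homogeneous coordinates as a $4 \times n$ array, which is why the last two steps are needed. The sketch below shows the same normalization on a small made-up array, without calling OpenCV.

```python
import numpy as np

# Made-up homogeneous points, one per column (4 x n), standing in for OpenCV output
points = np.array([[2., 0.],
                   [4., 3.],
                   [6., -3.],
                   [2., 3.]])

# Divide each column by its last entry so the last row becomes all ones
points = points / points[-1, :]

# Drop the last row and transpose to get one 3D point per row (n x 3)
p = points[0:3, :].T

print(p)
# [[ 1.  2.  3.]
#  [ 0.  1. -1.]]
```
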

Check that results are correct.

In [None]:
assert(np.allclose(p_inA_cv, p_inA_true))

## Get solution with your own code

Define a function to do triangulation (i.e., estimate $p^A_1, \dotsc, p^A_n$ given $b_1, \dotsc, b_n$, $c_1, \dotsc, c_n$, $R^B_A$, $p^B_A$, $R^C_A$, $p^C_A$, and $K$).

In [None]:
def triangulate(b, c, R_inB_ofA, p_inB_ofA, R_inC_ofA, p_inC_ofA, K):
    # Compute relative pose
    # ... FIXME ...

    # Compute normalized image coordinates
    # ... FIXME ...

    # Compute depths in both images and verify (e.g., assert) all are positive
    # ... FIXME ...

    # Compute points in frame C
    # ... FIXME ...

    # Compute points in frame A
    # ... FIXME ...

    p_inA = None

    return p_inA

Apply function to do triangulation.

In [None]:
p_inA = triangulate(b, c, R_inB_ofA, p_inB_ofA, R_inC_ofA, p_inC_ofA, K)

Check that results are correct.

In [None]:
assert(np.allclose(p_inA, p_inA_true))

# Reflection

Answer the following questions:

* What happens if the origin of frame $B$ and the origin of frame $C$ are at the same point? Would your method still work? Form and test a hypothesis.
* What would cause your method to estimate a negative depth? Would it be better to raise an exception or to discard the point (i.e., not add the point to your reconstruction) in this case?

You could also try applying your method to real data (e.g., to feature matches from a third image, given an existing two-view reconstruction from two images) rather than synthetic data.
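
With real data there is no `p_inA_true` to compare against. One common alternative check is the reprojection error, i.e., the pixel distance between each observed image point and the projection of its triangulated 3D point. The sketch below is one possible way to compute it; the helper `reprojection_error` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def reprojection_error(K, R_inC_ofA, p_inC_ofA, p_inA, q):
    """Hypothetical helper: per-point pixel distance between q and projected p_inA."""
    p_inC = p_inA @ R_inC_ofA.T + p_inC_ofA   # (n, 3) points in the camera frame
    q_h = p_inC @ K.T                         # homogeneous pixel coordinates
    q_pred = q_h[:, 0:2] / q_h[:, 2:3]        # divide by depth
    return np.linalg.norm(q_pred - q, axis=1)

# Assumed camera matrix and camera pose (illustrative values)
K = np.array([[1500., 0., 1000.],
              [0., 1500., 500.],
              [0., 0., 1.]])
R_inC_ofA = np.eye(3)
p_inC_ofA = np.array([0., 0., 1.])

# Two assumed 3D points in frame A and their exact projections
p_inA = np.array([[0.2, -0.1, 1.0],
                  [-0.3, 0.4, 2.0]])
q = np.array([[1150., 425.],
              [850., 700.]])

err = reprojection_error(K, R_inC_ofA, p_inC_ofA, p_inA, q)

# Shift the first observation by (3, 4) pixels to mimic a noisy feature match
q_noisy = q + np.array([[3., 4.], [0., 0.]])
err_noisy = reprojection_error(K, R_inC_ofA, p_inC_ofA, p_inA, q_noisy)
```

Here `err` is zero for both points and `err_noisy` is 5 pixels for the first point. What counts as an acceptable error on real data depends on the feature detector and image resolution, so any threshold would be an assumption to tune.
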