In [7]:
import numpy as np

`vv1` and `vv2` must be given so that `vv1[i]` and `vv2[i]` are the images of a pair of orthogonal vanishing points (VPs) in the $i$-th picture. The coefficients of the linear system are derived in the notebook "Compute linear system coefficients for omega".
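
For reference, the linear system can be read off the code that follows. This is a sketch under the parameterization that `K_from_vps` appears to use (zero skew, $\omega_{33}$ normalized to 1), with the unknowns indexed as in the array `w`; the symbols $\mathbf v_1 = (x_1, y_1, z_1)^\top$ and $\mathbf v_2 = (x_2, y_2, z_2)^\top$ are introduced here only for illustration and stand for one pair `vv1[i]`, `vv2[i]`:

$$
\omega = \begin{pmatrix} w_0 & 0 & w_1 \\ 0 & w_3 & w_2 \\ w_1 & w_2 & 1 \end{pmatrix}, \qquad
\mathbf v_1^\top \omega \, \mathbf v_2 = 0
$$

$$
w_0 \, x_1 x_2 + w_1 (x_1 z_2 + z_1 x_2) + w_2 (y_1 z_2 + z_1 y_2) + w_3 \, y_1 y_2 = -z_1 z_2
$$

Each pair of orthogonal vanishing points contributes one such equation, so four pairs determine the four unknowns.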

Recall that $\omega = (KK^T)^{-1} = K^{-T}K^{-1}$. Since the inverse of an upper triangular matrix is upper triangular, $K^{-T}$ is lower triangular, so the Cholesky decomposition of $\omega$ will return $K^{-T}$, which will have to be inverted and transposed.
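
The Cholesky argument above can be checked numerically. The following is a minimal sketch, not part of the original pipeline: `K_true` is a made-up intrinsic matrix chosen only for illustration, and the check relies on its diagonal being positive so that the Cholesky factor is exactly $K^{-T}$.

```python
import numpy as np

# Hypothetical intrinsics, for illustration only (zero skew, positive diagonal).
K_true = np.array([[1200.0,    0.0, 640.0],
                   [   0.0, 1100.0, 360.0],
                   [   0.0,    0.0,   1.0]])

# omega = (K K^T)^{-1}
omega = np.linalg.inv(K_true @ K_true.T)

# np.linalg.cholesky returns the lower-triangular L with omega = L L^T,
# so L equals K^{-T}; inverting and transposing recovers K.
L = np.linalg.cholesky(omega)
K_rec = np.linalg.inv(L).T
K_rec = K_rec / K_rec[2, 2]   # fix the overall scale so that K[2, 2] == 1

print(np.allclose(K_rec, K_true))
```

The final normalization mirrors the `K/K[2,2]` step in the function that follows: $\omega$ is only determined up to scale by the linear system, so the recovered $K$ is as well.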

In [8]:
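def K_from_vps(vv1, vv2):
    """Estimate K from four pairs of orthogonal vanishing points (zero skew)."""
    assert (len(vv1) == 4 and len(vv2) == 4), "Length of vv1 and vv2 must be 4!"
    A = []
    b = []
    # Each pair gives one linear equation vv1[i]^T W vv2[i] = 0 in the unknowns w
    for i in range(len(vv1)):
        A += [[ vv1[i][0]*vv2[i][0],
                vv1[i][0]*vv2[i][2] + vv1[i][2]*vv2[i][0],
                vv1[i][1]*vv2[i][2] + vv1[i][2]*vv2[i][1],
                vv1[i][1]*vv2[i][1] ]]
        b += [ -vv1[i][2]*vv2[i][2] ]
    w = np.linalg.solve(A, b)
    
    # omega, scaled so that W[2,2] = 1; W[0,1] = 0 encodes zero skew
    W = np.array([[w[0],  0,   w[1]],
                 [ 0,  w[3],  w[2]],
                 [w[1], w[2],  1 ]])
    # cholesky returns the lower-triangular factor K^{-T}; invert and transpose it
    K = np.linalg.inv(np.linalg.cholesky(W)).T
    return K/K[2,2]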

**TODO:** Test this with ground truth data

In [9]:
with open("pics2/vps.csv") as f:
    vps = np.array([l.strip("\n").split(",") for l in f.readlines()]).astype("float")
vv1, vv2 = vps[:,:3], vps[:,3:]
# vv1[:, 0:2] = vv1[:, 0:2]/6000
# vv2[:, 0:2] = vv2[:, 0:2]/6000
K_from_vps(vv1, vv2)

array([[ 6.00579621e+03,  0.00000000e+00,  3.20150216e+03],
       [-2.13268630e-13,  3.82292029e+03, -9.55041891e+02],
       [-0.00000000e+00,  0.00000000e+00,  1.00000000e+00]])

Consider a coordinate system with its origin at the center of the table and its $x$ and $y$ axes parallel to the table edges. Once we have $K$, $R$ can be computed from the fact that the rays through the projections of the two vanishing points must be perpendicular.

Let $\mathbf u$ and $\mathbf v$ be the projections of the vanishing points on the plane $z=k$ in Euclidean coordinates (these can be obtained by applying $K^{-1}$ to the pixel coordinates). We know $\mathbf u \cdot \mathbf v = 0$, that is $x_u x_v + y_u y_v + k^2 = 0$, so solving for $k$ we get $k^2 = -x_u x_v - y_u y_v$. The first two axes follow immediately as $\mathbf x' = \frac{\mathbf u}{||\mathbf u||}$ and $\mathbf y' = \frac {\mathbf v} {||\mathbf v||}$, and of course $\mathbf z' = \mathbf x' \times \mathbf y'$. Lastly,

$$
R = \begin{pmatrix}
\mathbf x' & \mathbf y' & \mathbf z'
\end{pmatrix}.
$$
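
The construction of $R$ described above can be sketched as follows. `rotation_from_vps` is a hypothetical helper written for this illustration (it is not defined elsewhere in the notebook), and the input values are the normalized vanishing points printed by the example run further down. The explicit check on the sign of $k^2$ is an addition: the formula only yields a real $k$ when $-x_u x_v - y_u y_v > 0$.

```python
import numpy as np

def rotation_from_vps(u_xy, v_xy):
    """Hypothetical helper: build R from the x, y coordinates of two
    vanishing points that have already been mapped through K^{-1}."""
    k2 = -(u_xy[0] * v_xy[0] + u_xy[1] * v_xy[1])
    if k2 <= 0:
        raise ValueError("k^2 is not positive; no real k makes u and v orthogonal")
    k = np.sqrt(k2)
    u = np.array([u_xy[0], u_xy[1], k])
    v = np.array([v_xy[0], v_xy[1], k])
    x_axis = u / np.linalg.norm(u)
    y_axis = v / np.linalg.norm(v)
    # x_axis and y_axis are orthogonal unit vectors, so their cross product is unit too
    z_axis = np.cross(x_axis, y_axis)
    return np.column_stack((x_axis, y_axis, z_axis))

R = rotation_from_vps([2.92382409, -1.31525803], [-1.22279824, -2.02628525])
print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
```

Checking $R^\top R = I$ and $\det R = 1$ is a cheap way to confirm that the result is a proper rotation before it is used to build $P$.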

To find the missing $\mathbf t$ that completes $P = K \begin{pmatrix}
R & \mathbf t
\end{pmatrix}$, the table size and the projections of its corners on the image plane can be used. There are five points whose world positions and image projections are both known: the four corners of the table and its center, which is the intersection of the diagonals. From here on, let $\mathbf x'$ denote the projection on the image plane (in normalized coordinates, i.e. after applying $K^{-1}$) of a known point $\mathbf x$, and let $H := \begin{pmatrix}
R & \mathbf t
\end{pmatrix}$. Since the codomain of $H$ is $\mathbb R^3$, the constraint that $H \mathbf x$ lie in the span of $\mathbf x'$ can be expressed as $H \mathbf x \times \mathbf x' = \mathbf 0$. Writing $\mathbf x =: \begin{pmatrix}\mathbf x_{1-3} & x_4\end{pmatrix}^\top$ and expanding $H$, the constraint becomes:

$$
(R \mathbf x_{1-3} + \mathbf t x_4) \times \mathbf x' = \mathbf 0 \\
\implies R \mathbf x_{1-3} \times \mathbf x' = -x_4 \mathbf t \times \mathbf x' \\
\implies \mathbf x' \times \mathbf t = -\frac{1}{x_4} \mathbf x' \times R \mathbf x_{1-3}
$$

In the last form of the equation $\mathbf t$ is the only unknown, as the right-hand side is a constant vector; furthermore, the cross product with $\mathbf x'$ can be expressed as a matrix multiplication, hence we have:

$$
\mathbf x'_{[\times]} \mathbf t = 
\begin{bmatrix}
0 & -x_3' & x_2' \\
x_3' & 0 & -x_1' \\
-x_2' & x_1' & 0 
\end{bmatrix} \mathbf t = -\frac{1}{x_4} \mathbf x' \times R \mathbf x_{1-3}
$$

Since $\mathbf x'_{[\times]}$ has rank 2 (it cannot be full rank because $\mathbf x'_{[\times]} \mathbf x' = \mathbf x' \times \mathbf x' = \mathbf 0$), each known point contributes only two independent equations, so at least two such constraints are needed to determine $\mathbf t$.
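
The rank argument and the recovery of $\mathbf t$ can be illustrated on synthetic data. This is a sketch under stated assumptions: the pose (`R`, `t_true`) and the two world points are made up, `skew` is a helper defined only for this example, and the stacked system is solved with `np.linalg.lstsq` rather than by picking three independent rows as the function later in this notebook does.

```python
import numpy as np

def skew(a):
    """Matrix [a]_x such that skew(a) @ b equals np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

# Synthetic pose, for illustration only.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t_true = np.array([0.2, -0.1, 3.0])

# Two known world points (x_4 = 1) and their normalized projections x' ~ R X + t.
X = np.array([[ 1.0, 0.5, 0.0],
              [-0.7, 0.9, 0.0]])
x_proj = [R @ Xi + t_true for Xi in X]
x_proj = [p / p[2] for p in x_proj]

# Stack [x']_x t = -x' x (R X) for both points and solve in the least-squares sense.
A = np.vstack([skew(p) for p in x_proj])
b = np.concatenate([-np.cross(p, R @ Xi) for p, Xi in zip(x_proj, X)])
t_est, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.linalg.matrix_rank(skew(x_proj[0])), np.linalg.matrix_rank(A))
print(np.allclose(t_est, t_true))
```

A least-squares solve uses all six rows at once and extends unchanged to more than two points, which may be preferable when the projections are noisy.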

If the projections of the vanishing points are taken as known points, treating them as the images of $\begin{pmatrix}1 & 0 & 0 & 1\end{pmatrix}^\top$ and $\begin{pmatrix}0 & 1 & 0 & 1\end{pmatrix}^\top$ respectively, the right-hand side vectors become $-\mathbf x' \times R^{1\top}$ and $-\mathbf y' \times R^{2\top}$, where $R^{n\top}$ is the $n$-th column of $R$.

In [48]:
def find_projection_matrix(K, vpx, vpy):
    def normalize(v):
        return v/np.linalg.norm(v)
    
    def crossmatrix(u):
        return np.array([[0,    -u[2],  u[1]],
                         [u[2],   0,   -u[0]],
                         [-u[1],  u[0],  0 ]])
    
    # vpx_impl and vpy_impl (computed below) are the vanishing points mapped through
    # K^-1 and scaled so that their third component is 1
    vpx = np.array(vpx)
    vpy = np.array(vpy)
    K = np.array(K)
    
    Kinv = np.linalg.inv(K)
    vpx_impl = np.dot(Kinv, vpx)
    vpy_impl = np.dot(Kinv, vpy)
    vpx_impl = vpx_impl/vpx_impl[2]
    vpy_impl = vpy_impl/vpy_impl[2]
    print(vpx_impl, vpy_impl)
    
    # (lowercase) k is the Z coordinate of the plane, u and v are the full vectors
    k = np.sqrt(-vpx_impl[0]*vpy_impl[0] - vpx_impl[1]*vpy_impl[1])
    u = [vpx_impl[0], vpx_impl[1], k]
    v = [vpy_impl[0], vpy_impl[1], k]
    R = np.column_stack((normalize(u), normalize(v), normalize(np.cross(u, v))))
    
    bx = -np.cross(vpx_impl, R[:,0].T)
    by = -np.cross(vpy_impl, R[:,1].T)
    b = np.concatenate((bx, by))
    A = np.vstack((crossmatrix(vpx_impl), crossmatrix(vpy_impl)))  # np.row_stack is a deprecated alias
    
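    def minors(n):
        # All index triples i < j < l, i.e. every choice of 3 rows out of n
        return ([i, j, l] for i in range(n)
                          for j in range(i+1, n)
                          for l in range(j+1, n))
    
    # We select the first triple of linearly independent rows. The determinant is
    # compared against a small tolerance (1e-9, an arbitrary choice) because an exact
    # `!= 0` test can accept a singular minor whose determinant is only rounding noise.
    ind_minor = next(m for m in minors(6) if abs(np.linalg.det(A[m])) > 1e-9)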
    A = A[ind_minor]
    b = b[ind_minor]
    t = np.linalg.solve(A, b)
    
    P = np.dot(K, np.column_stack((R, t)))
    print(np.linalg.det(A))
    return P

In [49]:
K = [[1159.10049262109, 8.81182868511778,  656.356931046326],
     [0,                1164.35813268503,  499.093066111278],
     [0,                0,                 1               ]]
vpx = [ 4.03377304e+03, -1.03233832e+03,  1.00000000e+00]
vpy = [-7.78844385e+02, -1.86022864e+03,  1.00000000e+00]

In [50]:
find_projection_matrix(K, vpx, vpy)

[ 2.92382409 -1.31525803  1.        ] [-1.22279824 -2.02628525  1.        ]
-0.7110272147031522


array([[ 1.19690101e+03, -3.17050865e+02, -4.91225753e+02,
         5.81421730e+01],
       [-3.15484454e+02, -7.38005226e+02, -9.80123255e+02,
        -5.71090349e+00],
       [ 2.85211270e-01,  3.73875694e-01, -8.82536966e-01,
         2.59225425e-02]])