<table>
  <tr>
    <td><img src="https://github.com/rvss-australia/RVSS/blob/main/Pics/RVSS-logo-col.med.jpg?raw=1" width="400"></td>
    <td><div align="left"><font size="30">Modeling a perspective camera</font></div></td>
  </tr>
</table>

(c) Peter Corke 2024

Robotics, Vision & Control: Python, see section 13.1.2

## Configuring the Jupyter environment
We need to import some packages to help us with linear algebra (`numpy`), graphics (`matplotlib`), and machine vision (`machinevisiontoolbox`).
If you're running locally, you need to have these packages installed. If you're running on Colab, we first have to install `machinevisiontoolbox`, which is not preinstalled; this will be a bit slow.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import mpl_toolkits.mplot3d.art3d as art3d
import ipywidgets as widgets
%matplotlib inline

***

The sort of camera that we have in our eye and in our mobile phone performs a mapping from the 3D world to a 2D image. The particular mapping is called a perspective projection, and the underlying mathematics is very simple.

Two models are generally used:

1.  The "pinhole camera" model, which produces an inverted image, using a pinhole approximation to a lens.
2.  The central projection model, widely used in the computer vision literature, which creates a non-inverted image.

For a camera at the origin looking out along the z-axis (its optical axis) and a point in the world at $P = (X, Y, Z)$, the 2D projection on the image plane will be $p=(x,y)$ where
\begin{equation}
x = \frac{fX}{Z}, \quad y = \frac{fY}{Z}
\end{equation}
and $f$ is the focal length of the lens.
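
As a quick numerical check of the equation above, here is a minimal sketch that projects a single world point. The function name `project_point` is hypothetical (it is not part of any toolbox), and the sketch assumes the camera sits at the origin looking along the z-axis with the point in front of it ($Z > 0$).

```python
import numpy as np

def project_point(P, f=1.0):
    """Central projection of a world point P = (X, Y, Z) onto the image plane at Z = f.

    Hypothetical helper for illustration only; assumes the camera is at the
    origin and looks along the z-axis.
    """
    X, Y, Z = P
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    # x = fX/Z, y = fY/Z
    return np.array([f * X / Z, f * Y / Z])

p = project_point((1.0, 1.0, 2.0), f=1.0)
print(p)  # [0.5 0.5]
```

Doubling $Z$ halves both image-plane coordinates, which is the familiar effect of distant objects appearing smaller.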

The simple animation below lets you adjust the coordinates $(X, Y, Z)$, which changes the projection ray from the origin through that point. The place where that ray pierces the image plane at $Z=f$ is the image-plane projection of that world point.

Have a play.  Convince yourself that an infinite number of points in the world have the same image-plane projection.
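
The same claim can be checked numerically. The sketch below, which assumes $f = 1$ and an arbitrarily chosen world point, scales that point by several positive factors so that every scaled point lies on the same ray through the origin, and then applies the projection equation to all of them at once. The variable names are illustrative only.

```python
import numpy as np

f = 1.0                                  # assumed focal length
P = np.array([1.0, 1.0, 2.0])            # an arbitrary world point (X, Y, Z)

# Each positive scale factor gives another point on the ray from the origin through P
scales = np.array([0.5, 1.0, 2.0, 10.0])
points = scales[:, np.newaxis] * P       # shape (4, 3), one point per row

# Apply x = fX/Z, y = fY/Z to every row
projections = f * points[:, :2] / points[:, 2:3]
print(projections)                       # every row is the same image-plane point
```

The scale factor cancels in the ratios $X/Z$ and $Y/Z$, so the projection cannot tell these points apart. This is why depth is lost in a single perspective image.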

In [None]:
verts = np.array([
    [-5, -5],
    [-5,  5],
    [ 5,  5],
    [ 5, -5] ])

@widgets.interact
def animate( X = widgets.FloatSlider(value=1, description='X:', min=-4, max=4),
             Y = widgets.FloatSlider(value=1, description='Y:', min=-4, max=4),
             Z = widgets.FloatSlider(value=2, description='Z:', min=1, max=5),
             f = widgets.FloatSlider(value=1, description='f:', min=0.1, max=1.5)):
    fig = plt.figure(figsize=(10, 10))
    ax = plt.axes(projection='3d')
    p = plt.Polygon(verts, alpha=0.7, color="lightyellow")

    ax.add_patch(p)
    art3d.pathpatch_2d_to_3d(p, z=f, zdir="z")

    ax.set_xlim3d(-5, 5)
    ax.set_ylim3d(-5, 5)
    ax.set_zlim3d(0, 5)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')
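    # World point P, drawn in blue (colour and size passed as keywords;
    # a positional 'bo' would be interpreted as the zdir argument)
    ax.scatter3D(X, Y, Z, color='b', s=40)
    # Projection ray from the camera origin through P
    ax.quiver(0, 0, 0, X, Y, Z, arrow_length_ratio=0.1)
    # Central projection onto the image plane at Z = f
    x = X * f / Z
    y = Y * f / Z
    # Image-plane point p, drawn in black
    ax.scatter3D(x, y, f, color='k', s=20)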

    ax.text(X, Y, Z, "   P")
    ax.text(x, y, f,  "  p = (%.3f, %.3f)" % (x, y))
    plt.show()
    
    # uv_coords.value = "Image-plane coordinate:   (%.3f, %.3f)" % (x, y)
