# Camera Intrinsics
This notebook is my attempt to better understand camera intrinsics. 
To test my assumptions I will do some calculations with the ZED2i as reference.

ZED2i datasheet: https://www.stereolabs.com/assets/datasheets/zed-2i-datasheet-feb2022.pdf

## Sensor size

In [1]:
sensor_pixels_x = 2688
sensor_pixels_y = 1520
sensor_pixel_size_in_mm = 0.002 # 2 micrometers

sensor_width_calculated = sensor_pixels_x * sensor_pixel_size_in_mm
sensor_height_calculated = sensor_pixels_y * sensor_pixel_size_in_mm

print("ZED2i sensor size in mm:")
sensor_width_calculated, sensor_height_calculated

ZED2i sensor size in mm:


(5.376, 3.04)

## Aspect ratios

In [2]:
resolutions = {
    "sensor" : (sensor_pixels_x, sensor_pixels_y),
    "2K" : (2208, 1242),
    "FHD": (1920, 1080),
    "HD" : (1280, 720),
    "VGA": (672, 376),
}


aspect_ratios = [width / height for width, height in resolutions.values()]

print("Aspect ratios \n---------------")
for name, aspect_ratio in zip(resolutions.keys(), aspect_ratios):
    print(f"{name:>6}: {aspect_ratio}")

Aspect ratios 
---------------
sensor: 1.768421052631579
    2K: 1.7777777777777777
   FHD: 1.7777777777777777
    HD: 1.7777777777777777
   VGA: 1.7872340425531914


In [3]:
16 / 9

1.7777777777777777

Note that all available image aspect ratios are **larger** than the aspect ratio of the sensor.
This means that the images are **relatively wider** than the sensor, and that some sensor pixels at the top or bottom are not used.

The part of the sensor that is actually used (for each image resolution) is called the **active sensor area**.
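
To get a feel for how many sensor rows might go unused, here is a minimal sketch. It assumes that the crop spans the full sensor width and is exactly 16:9; the datasheet does not confirm either, and the aspect ratios above show that not every resolution is exactly 16:9. The helper name `unused_rows` is only illustrative.

```python
def unused_rows(sensor_width_px, sensor_height_px, aspect_w=16, aspect_h=9):
    """Sensor rows left over when a full-width crop with the given aspect ratio is taken.

    Assumption: the active area uses every sensor column.
    """
    active_height_px = sensor_width_px * aspect_h / aspect_w
    return sensor_height_px - active_height_px


# ZED2i sensor resolution from the datasheet
print(unused_rows(2688, 1520))
```

Under these assumptions only a handful of rows fall outside the active sensor area, which fits the small difference between the sensor aspect ratio and 16:9.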

Two questions you might have about camera intrinsics:
* Why does each resolution have a different intrinsics matrix?
* Why are focal lengths and principal points saved in pixels?

Wouldn't it be cleaner to store the physical intrinsics of the camera?
E.g. the 

In [4]:
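# Calibrated intrinsics from the config file, in pixels.
# cx and cy lie close to the 2K image center (1104, 621), so these appear to be the 2K values.
fx = 1067.91
fy = 1068.05
cx = 1107.48
cy = 629.675
# Distortion coefficients: radial (k1, k2, k3) and tangential (p1, p2)
k1 = -0.0542749
k2 = 0.0268096
p1 = 0.000204483
p2 = -0.000310015
k3 = -0.0104089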

In [5]:
print("Focal length in (sensor) pixels:")
2.12 / 0.002


Focal length in (sensor) pixels:


1060.0

Because the focal length in sensor pixels is approximately equal to the focal length in 2K and FHD pixels, we know that one image pixel at those resolutions corresponds to one sensor pixel.


According to the datasheet, the camera's focal length is 2.12 mm and the size of a sensor pixel is 0.002 mm.
Using the sensor pixel size, we get a focal length in pixels of 2.12 mm / 0.002 mm = 1060 pixels.
This is close to the value we see in the config file for the highest image resolutions.
This means that for those resolutions, 1 image pixel corresponds to 1 sensor pixel.
For the lower resolutions, e.g. HD, the focal length is approximately half of that, which means that 4 sensor pixels (a 2x2 square) are binned together to form one image pixel.
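
The relation between the physical focal length and the focal length in pixels can be sketched as below. The `binning` parameter is my own illustrative addition: it assumes that an image pixel covers `binning` x `binning` sensor pixels, which is an interpretation of the numbers above rather than something the datasheet states.

```python
def focal_length_in_pixels(focal_length_mm, pixel_size_mm, binning=1):
    """Convert a focal length in mm to image pixels.

    With binning, one image pixel is `binning` sensor pixels wide,
    so the effective pixel size grows by that factor.
    """
    return focal_length_mm / (pixel_size_mm * binning)


# Datasheet values: 2.12 mm focal length, 0.002 mm sensor pixels
print(focal_length_in_pixels(2.12, 0.002))             # 1 image pixel = 1 sensor pixel
print(focal_length_in_pixels(2.12, 0.002, binning=2))  # 2x2 binning halves the value
```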


In [6]:
for name, (width, height) in resolutions.items():
    print(f"{name:>6}: {width / 2}, {height / 2}")

sensor: 1344.0, 760.0
    2K: 1104.0, 621.0
   FHD: 960.0, 540.0
    HD: 640.0, 360.0
   VGA: 336.0, 188.0


In [7]:
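import numpy as np

# Horizontal field of view (degrees) if the full sensor width were used,
# based on the datasheet focal length
focal_length_in_mm = 2.12
np.rad2deg(2 * np.arctan2(sensor_pixels_x * sensor_pixel_size_in_mm / 2, focal_length_in_mm))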

103.47498375651556

In [8]:
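# Horizontal field of view (degrees) for a 1920 sensor pixel wide area (FHD width at 1:1),
# based on the datasheet focal length
focal_length_in_mm = 2.12
np.rad2deg(2 * np.arctan2(1920 * sensor_pixel_size_in_mm / 2, focal_length_in_mm))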

84.33177796738698

In [9]:
# Horizontal field of view (degrees) for the 2K image width, using the calibrated fx
np.rad2deg(2 * np.arctan2(2208 / 2, fx))

91.90395955339223

In [10]:
# Horizontal field of view (degrees) for the FHD image width, using the calibrated fx
np.rad2deg(2 * np.arctan2(1920 / 2, fx))

83.90805133022553

In [11]:
# Same as the FHD calculation in mm, but with the datasheet focal length converted to pixels
np.rad2deg(2 * np.arctan2(1920 / 2, focal_length_in_mm / sensor_pixel_size_in_mm))

84.33177796738698
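
The cells above all apply the same pinhole relation, FOV = 2 * atan((extent / 2) / f), with the extent and the focal length in the same unit. A small helper makes that explicit. This is a sketch: `fov_deg` is a name I made up, and it uses the standard library `math` module instead of NumPy so that it runs on its own.

```python
import math


def fov_deg(extent, focal_length):
    """Field of view in degrees for a pinhole camera.

    `extent` and `focal_length` must share a unit (both mm or both pixels).
    Lens distortion is ignored.
    """
    return math.degrees(2 * math.atan2(extent / 2, focal_length))


# FHD width in pixels with the datasheet focal length in pixels (2.12 mm / 0.002 mm)
print(fov_deg(1920, 2.12 / 0.002))
```

Because only the ratio of extent to focal length matters, working in millimeters or in pixels gives the same angle, which is why the mm and pixel versions of the FHD calculation agree.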

In [12]:
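# sample six random values between -pi and pi
# np.printoptions is a context manager and has no effect when called on its own;
# np.set_printoptions changes the print settings globally
np.set_printoptions(precision=3, suppress=True)
np.random.uniform(-np.pi, np.pi, 6)

array([ 3.058,  1.473, -2.476, -1.296,  0.143, -2.857])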