# Box 3D Image Transform
This notebook demonstrates how the coordinate systems used for 3D boxes differ and how to convert between them.
In total, 4 coordinate systems are used, 3 of which are already described in https://github.com/mcordts/cityscapesScripts/blob/master/docs/csCalibration.pdf
1. The vehicle coordinate system *V* according to ISO 8855 with the origin on the ground below the rear axle center, *x* pointing in driving direction, *y* pointing left, and *z* pointing up.
2. The camera coordinate system *C* with the origin in the camera’s optical center and the same orientation as *V*.
3. The image coordinate system *I* with the origin in the top-left image pixel, *u* pointing right, *v* pointing down.
4. In addition, we add the coordinate system *S* with the same origin as *C* but the orientation of *I*, i.e. *x* pointing right, *y* pointing down, and *z* pointing in the driving direction.
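
The relation between *C* and *S* in the list above is a pure relabelling of axes. A minimal sketch follows; the helper names `c_to_s` and `s_to_c` are hypothetical and not part of `cityscapesscripts`, and the sample point is the `BLB` vertex in *C* as printed later in this notebook.

```python
def c_to_s(point_c):
    """Relabel a point from C (x forward, y left, z up) to S (x right, y down, z forward)."""
    x_c, y_c, z_c = point_c
    return (-y_c, -z_c, x_c)


def s_to_c(point_s):
    """Inverse relabelling from S back to C."""
    x_s, y_s, z_s = point_s
    return (z_s, -x_s, -y_s)


blb_c = (29.88, 5.33, -0.28)   # BLB vertex in C, rounded to 2 decimals
blb_s = c_to_s(blb_c)
print(blb_s)                   # the same vertex expressed in S
print(s_to_c(blb_s) == blb_c)  # the round trip recovers the input
```

Because only signs and axis order change, no intrinsic or extrinsic calibration is needed for this step.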

All ground-truth (GT) annotations are given in the ISO 8855 coordinate system *V*; hence, the evaluation requires the data to be available in this coordinate system.

In this notebook, the transformations between these coordinate frames are illustrated by loading a 3D box annotation and calculating its projection into the 2D image, i.e. coordinate system *I*.

### Sample annotation

In [1]:
sample_annotation = {
    "sensor": {
        "sensor_T_ISO_8855": [
            [
                0.9990881051503779,
                -0.01948468779721943,
                -0.03799085532693703,
                -1.6501524664770573
            ],
            [
                0.019498764210995674,
                0.9998098810245096,
                0.0,
                -0.1331288872611436
            ],
            [
                0.03798363254444427,
                -0.0007407747301939942,
                0.9992780868764849,
                -1.2836173638418473
            ]
        ],
        "fx": 2262.52,
        "fy": 2265.3017905988554,
        "u0": 1096.98,
        "v0": 513.137,
        "baseline": 0.209313,
        "image_height": 1024,
        "image_width": 2048
    },
    "annotation": [
        {
            "2d": {
                "modal": [
                    609,
                    420,
                    807,
                    531
                ],
                "amodal": [
                    602,
                    415,
                    816,
                    533
                ]
            },
            "3d": {
                "center": [
                    33.95,
                    5.05,
                    0.57
                ],
                "dimensions": [
                    4.3,
                    1.72,
                    1.53
                ],
                "rotation": [
                    0.9735839424380041,
                    -0.010751769161021867,
                    0.0027191710555974913,
                    0.22805988817753894
                ],
                "type": "Mid Size Car",
                "format": "CRS_ISO8855"
            },
            "occlusion": 0.0,
            "truncation": 0.0,
            "instance_id": 26010,
            "class_name": "car",
            "score": 1.0
        }
    ]
}

### Python imports

In [2]:
import numpy as np
from cityscapesscripts.helpers.box3d import (
    Camera, 
    Box3DImageTransform,
    CRS_V,
    CRS_C,
    CRS_S
)

### Create the camera
``sensor_T_ISO_8855`` is the 3x4 transformation matrix (rotation and translation) from coordinate system *V* to *C*.

In [3]:
camera = Camera(fx=sample_annotation["sensor"]["fx"],
                fy=sample_annotation["sensor"]["fy"],
                u0=sample_annotation["sensor"]["u0"],
                v0=sample_annotation["sensor"]["v0"],
                sensor_T_ISO_8855=sample_annotation["sensor"]["sensor_T_ISO_8855"])
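
To make the role of ``sensor_T_ISO_8855`` concrete, the sketch below applies the 3x4 matrix to a single point and then inverts the transform. It is an illustration only, not the library's implementation: the helpers `v_to_c` and `c_to_v` are hypothetical names, and the inverse assumes that the rotation part is orthonormal. The input is the `BLB` vertex in *V* as printed later in this notebook, rounded to 2 decimals, so the result can only agree with the printed *C* vertex (29.88, 5.33, -0.28) up to rounding.

```python
# Extrinsics copied from the sample annotation: the 3x4 matrix [R | t].
SENSOR_T_ISO_8855 = [
    [0.9990881051503779, -0.01948468779721943, -0.03799085532693703, -1.6501524664770573],
    [0.019498764210995674, 0.9998098810245096, 0.0, -0.1331288872611436],
    [0.03798363254444427, -0.0007407747301939942, 0.9992780868764849, -1.2836173638418473],
]


def v_to_c(point_v, sensor_t_v=SENSOR_T_ISO_8855):
    """Apply p_C = R * p_V + t, i.e. the 3x4 matrix times (x, y, z, 1)."""
    homogeneous = list(point_v) + [1.0]
    return [sum(m * p for m, p in zip(row, homogeneous)) for row in sensor_t_v]


def c_to_v(point_c, sensor_t_v=SENSOR_T_ISO_8855):
    """Invert the rigid transform: p_V = R^T * (p_C - t), assuming R is orthonormal."""
    shifted = [c - row[3] for c, row in zip(point_c, sensor_t_v)]
    return [sum(sensor_t_v[k][i] * shifted[k] for k in range(3)) for i in range(3)]


blb_v = [31.64, 4.85, -0.19]   # BLB vertex in V, rounded to 2 decimals
blb_c = v_to_c(blb_v)
print(blb_c)                   # close to the printed C vertex
print(c_to_v(blb_c))           # close to blb_v again
```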

### Load the annotation
As the annotation is given in coordinate system *V*, it must be transformed from *V* → *C* → *S* → *I*.

In [4]:
# Create the Box3DImageTransform object
box3d_annotation = Box3DImageTransform(camera=camera)

# Initialize the 3D box with an annotation in coordinate system V.
# You can alternatively pass CRS_S or CRS_C if you want to initialize the box in a different coordinate system.
box3d_annotation.initialize_box(size=sample_annotation["annotation"][0]["3d"]["dimensions"],
                              quaternion=sample_annotation["annotation"][0]["3d"]["rotation"],
                              center=sample_annotation["annotation"][0]["3d"]["center"],
                              coordinate_system=CRS_V)
size_V, center_V, rotation_V = box3d_annotation.get_parameters(coordinate_system=CRS_V)

# Values in C
size_C, center_C, rotation_C = box3d_annotation.get_parameters(coordinate_system=CRS_C)
box3d_annotation.initialize_box(size=size_C,
                              quaternion=rotation_C,
                              center=center_C,
                              coordinate_system=CRS_C)
size_VV, center_VV, rotation_VV = box3d_annotation.get_parameters(coordinate_system=CRS_V)

# Values in S
size_S, center_S, rotation_S = box3d_annotation.get_parameters(coordinate_system=CRS_S)
box3d_annotation.initialize_box(size=size_S,
                              quaternion=rotation_S,
                              center=center_S,
                              coordinate_system=CRS_S)
size_VVV, center_VVV, rotation_VVV = box3d_annotation.get_parameters(coordinate_system=CRS_V)

print(size_V, size_VV, size_VVV)
print(center_V, center_VV, center_VVV)
print(rotation_V, "\n", rotation_VV, "\n", rotation_VVV)
print([np.degrees(x) for x in rotation_V.yaw_pitch_roll])
print([np.degrees(x) for x in rotation_VV.yaw_pitch_roll])
print([np.degrees(x) for x in rotation_VVV.yaw_pitch_roll])

[4.3  1.72 1.53] [4.3  1.72 1.53] [4.3  1.72 1.53]
[33.95  5.05  0.57] [33.95  5.05  0.57] [33.95  5.05  0.57]
0.974 -0.011i +0.003j +0.228k 
 0.974 -0.011i +0.003j +0.228k 
 0.974 -0.011i +0.003j +0.228k
[26.36765120458783, 0.02237904494421823, -1.2706821318527146]
[26.36765120458783, 0.022379044944217878, -1.2706821318527146]
[26.36765120458783, 0.022379044944218676, -1.2706821318527128]
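
The yaw, pitch, and roll angles printed above can be reproduced from the annotation quaternion with the standard library alone. The sketch below is hedged: the function `yaw_pitch_roll_deg` is a hypothetical helper, and its sign convention was chosen because it reproduces the printed values; it is assumed, not confirmed, to match what the `yaw_pitch_roll` property computes internally.

```python
import math


def yaw_pitch_roll_deg(w, x, y, z):
    """Return (yaw, pitch, roll) in degrees for a quaternion w + xi + yj + zk."""
    # Normalize first so that slightly non-unit input does not skew the angles.
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = (c / norm for c in (w, x, y, z))
    yaw = math.atan2(2 * (w * z - x * y), 1 - 2 * (y * y + z * z))
    pitch = math.asin(2 * (w * y + z * x))
    roll = math.atan2(2 * (w * x - y * z), 1 - 2 * (x * x + y * y))
    return tuple(math.degrees(a) for a in (yaw, pitch, roll))


# "rotation" entry of the sample annotation, ordered (w, x, y, z).
rotation = (0.9735839424380041, -0.010751769161021867,
            0.0027191710555974913, 0.22805988817753894)
print(yaw_pitch_roll_deg(*rotation))
```

The dominant angle is the yaw of roughly 26 degrees, i.e. the car is turned to the left relative to the driving direction of the ego vehicle, while pitch and roll stay close to zero.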


### Print coordinates of cuboid vertices

In [5]:
# Get the vertices of the 3D box in the requested coordinate frame
box_vertices_V = box3d_annotation.get_vertices(coordinate_system=CRS_V)
box_vertices_C = box3d_annotation.get_vertices(coordinate_system=CRS_C)
box_vertices_S = box3d_annotation.get_vertices(coordinate_system=CRS_S)

# Print the vertices of the box.
# loc is encoded with a 3-char code
#   0: B/F: Back or Front
#   1: L/R: Left or Right
#   2: B/T: Bottom or Top
# BLT -> Back left top of the object

# Print in V coordinate system
print("Vertices in V:")
print("     %8s %8s %8s" % ("x[m]", "y[m]", "z[m]"))
for loc, coord in box_vertices_V.items():
    print("%s: %8.2f %8.2f %8.2f" % (loc, coord[0], coord[1], coord[2]))
    
# Print in C coordinate system
print("\nVertices in C:")
print("     %8s %8s %8s" % ("x[m]", "y[m]", "z[m]"))
for loc, coord in box_vertices_C.items():
    print("%s: %8.2f %8.2f %8.2f" % (loc, coord[0], coord[1], coord[2]))
    
# Print in S coordinate system
print("\nVertices in S:")
print("     %8s %8s %8s" % ("x[m]", "y[m]", "z[m]"))
for loc, coord in box_vertices_S.items():
    print("%s: %8.2f %8.2f %8.2f" % (loc, coord[0], coord[1], coord[2]))

Vertices in V:
         x[m]     y[m]     z[m]
BLB:    31.64     4.85    -0.19
BRB:    32.41     3.31    -0.16
FRB:    36.26     5.22    -0.20
FLB:    35.49     6.76    -0.23
BLT:    31.64     4.88     1.34
BRT:    32.41     3.34     1.37
FRT:    36.26     5.25     1.33
FLT:    35.49     6.79     1.30

Vertices in C:
         x[m]     y[m]     z[m]
BLB:    29.88     5.33    -0.28
BRB:    30.67     3.81    -0.21
FRB:    34.48     5.79    -0.11
FLB:    33.69     7.32    -0.17
BLT:    29.82     5.37     1.25
BRT:    30.61     3.84     1.32
FRT:    34.42     5.82     1.42
FLT:    33.63     7.35     1.35

Vertices in S:
         x[m]     y[m]     z[m]
BLB:    -5.33     0.28    29.88
BRB:    -3.81     0.21    30.67
FRB:    -5.79     0.11    34.48
FLB:    -7.32     0.17    33.69
BLT:    -5.37    -1.25    29.82
BRT:    -3.84    -1.32    30.61
FRT:    -5.82    -1.42    34.42
FLT:    -7.35    -1.35    33.63
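
The three-character vertex codes can be related to the box parameters directly. The sketch below rebuilds the vertices in *V* from the annotation's center, dimensions, and quaternion. It is an illustration under stated assumptions rather than the library's code: `quaternion_to_matrix` and `box_vertices` are hypothetical helpers, the dimensions are assumed to be ordered length x width x height, and the quaternion is assumed to be ordered (w, x, y, z).

```python
from itertools import product


def quaternion_to_matrix(w, x, y, z):
    """Rotation matrix of a unit quaternion w + xi + yj + zk."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def box_vertices(center, size, quaternion):
    """Return {code: (x, y, z)} for the 8 corners of an L x W x H box."""
    rot = quaternion_to_matrix(*quaternion)
    length, width, height = size
    vertices = {}
    # Signs along the object's x (Back/Front), y (Right/Left), z (Bottom/Top) axes.
    for (cx, sx), (cy, sy), (cz, sz) in product(
        (("B", -1), ("F", 1)), (("R", -1), ("L", 1)), (("B", -1), ("T", 1))
    ):
        offset = (sx * length / 2, sy * width / 2, sz * height / 2)
        # Rotate the corner offset into V and shift it by the box center.
        vertices[cx + cy + cz] = tuple(
            c + sum(r * o for r, o in zip(row, offset))
            for c, row in zip(center, rot)
        )
    return vertices


verts_v = box_vertices(
    center=(33.95, 5.05, 0.57),
    size=(4.3, 1.72, 1.53),
    quaternion=(0.9735839424380041, -0.010751769161021867,
                0.0027191710555974913, 0.22805988817753894),
)
for code in ("BLB", "FRT"):
    print(code, ["%.2f" % v for v in verts_v[code]])
```

Under these assumptions the computed corners agree with the "Vertices in V" table above to within the printed precision.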


### Print box parameters

In [6]:
# Similar to the box vertices, you can retrieve the box parameters (center, size, and rotation) in any coordinate system
size_V, center_V, rotation_V = box3d_annotation.get_parameters(coordinate_system=CRS_V)
size_C, center_C, rotation_C = box3d_annotation.get_parameters(coordinate_system=CRS_C)
size_S, center_S, rotation_S = box3d_annotation.get_parameters(coordinate_system=CRS_S)

print(size_V, center_V, rotation_V)
print(size_C, center_C, rotation_C)
print(size_S, center_S, rotation_S)

[4.3  1.72 1.53] [33.95  5.05  0.57] 0.974 -0.011i +0.003j +0.228k
[4.3  1.72 1.53] [32.29984753  4.91687111 -0.71361736] 0.971 -0.007i -0.016j +0.238k
[4.3  1.72 1.53] [-4.91687111  0.71361736 32.29984753] 0.971 -0.007i -0.016j +0.238k


### Get 2D image coordinates

In [7]:
# Get the vertices of the 3D box in the image coordinates
box_vertices_I = box3d_annotation.get_vertices_2d()

# Print the vertices of the box.
# loc is encoded with a 3-char code
#   0: B/F: Back or Front
#   1: L/R: Left or Right
#   2: B/T: Bottom or Top
# BLT -> Back left top of the object

print("\n     %8s %8s" % ("u[px]", "v[px]"))
for loc, coord in box_vertices_I.items():
    print("%s: %8.2f %8.2f" % (loc, coord[0], coord[1]))


        u[px]    v[px]
BLB:   693.20   533.99
BRB:   816.17   528.73
FRB:   717.05   520.36
FLB:   605.66   524.83
BLT:   689.84   417.91
BRT:   813.13   415.63
FRT:   714.17   419.78
FLT:   602.53   421.89
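
The last step, *S* → *I*, can be sketched as a pinhole projection with the intrinsics of the sample annotation. This is a minimal illustration that assumes an undistorted pinhole model; `project_s_to_image` is a hypothetical helper and not part of `cityscapesscripts`. The inputs are *S* vertices as printed earlier, rounded to 2 decimals, so the pixel coordinates agree with the table above only to within a fraction of a pixel.

```python
# Intrinsics copied from the sample annotation.
FX, FY = 2262.52, 2265.3017905988554
U0, V0 = 1096.98, 513.137


def project_s_to_image(point_s):
    """Pinhole projection of a point in S (x right, y down, z forward) to pixels."""
    x_s, y_s, z_s = point_s
    if z_s <= 0:
        raise ValueError("point is not in front of the camera")
    return (FX * x_s / z_s + U0, FY * y_s / z_s + V0)


# Two S vertices as printed earlier (rounded to 2 decimals).
vertices_s = {"BLB": (-5.33, 0.28, 29.88), "FLT": (-7.35, -1.35, 33.63)}
for code, point in vertices_s.items():
    u, v = project_s_to_image(point)
    print("%s: %8.2f %8.2f" % (code, u, v))
```

Dividing by the depth `z_s` is what makes distant vertices move toward the principal point (`U0`, `V0`); points with non-positive depth lie behind the camera and are rejected here.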
