# The AprilTag system: 6DoF vision with monocular cameras


The Apriltag library detects markers (tags) in images. From these detections we can compute the relative pose between the camera coordinate frame and each tag.

We will need Apriltag, OpenCV-python, and some NumPy operations to proceed. Install them with

```terminal
python -m pip install apriltag
python -m pip install opencv-python
```
Then import them into your Python project:

In [None]:
import apriltag
import cv2
import numpy as np

## Detecting apriltags in images

First we need to read an image into memory, and store it as greyscale

In [None]:
image = cv2.imread('./data/skew.jpeg', cv2.IMREAD_GRAYSCALE)

cv2.imshow("Image view", image)
cv2.waitKey(1000) # waits up to 1000 ms for a key press, which also lets the window draw


Then we need an apriltag detector

In [None]:
detector = apriltag.Detector()


Now we can process the image, and see what comes out of the Apriltag detector:

In [None]:
detections = detector.detect(image)

print('detections = \n', detections)

We have processed the image and found one tag, with ID 5.

We have some more information regarding this detection, which we can access easily:

In [None]:

for detection in detections:  

    print('tag #             ', detection.tag_id)
    print('detection hamming ', detection.hamming)
    print('detection goodness', detection.goodness)
    print('decision margin   ', detection.decision_margin)
    print('tag center        ', detection.center)
    print('homography\n',       detection.homography)
    print('tag corners\n',      detection.corners)


These fields are documented in Apriltag as follows

    # The decoded ID of the tag
    tag_id

    # How many error bits were corrected? Note: accepting large numbers of
    # corrected errors leads to greatly increased false positive rates.
    # NOTE: As of this implementation, the detector cannot detect tags with
    # a hamming distance greater than 2.
    hamming

    # A measure of the quality of tag localization: measures the
    # average contrast of the pixels around the border of the
    # tag. refine_pose must be enabled, or this field will be zero.
    goodness

    # A measure of the quality of the binary decoding process: the
    # average difference between the intensity of a data bit versus
    # the decision threshold. Higher numbers roughly indicate better
    # decodes. This is a reasonable measure of detection accuracy
    # only for very small tags-- not effective for larger tags (where
    # we could have sampled anywhere within a bit cell and still
    # gotten a good detection.)
    decision_margin

    # The 3x3 homography matrix describing the projection from an
    # "ideal" tag (with corners at (-1,-1), (1,-1), (1,1), and (-1,
    # 1)) to pixels in the image. This matrix will be freed by
    # apriltag_detection_destroy.
    homography

    # The center of the detection in image pixel coordinates.
    center

    # The corners of the tag in image pixel coordinates. These always
    # wrap counter-clock wise around the tag.
    corners
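
To make the `homography` field more concrete, here is a minimal sketch of how such a matrix maps the "ideal" tag coordinates to pixels. The matrix `H` below is made up for illustration and is not the output of the detector; with a real detection you would use `detection.homography` instead.

```python
import numpy as np

# Hypothetical homography, made up for illustration (not a detector output)
H = np.array([[50.0,  0.0, 320.0],
              [ 0.0, 50.0, 240.0],
              [ 0.0,  0.0,   1.0]])

# Corners of the "ideal" tag, plus its origin, in homogeneous coordinates
# (one point per column)
ideal = np.array([[-1.0,  1.0, 1.0, -1.0, 0.0],
                  [-1.0, -1.0, 1.0,  1.0, 0.0],
                  [ 1.0,  1.0, 1.0,  1.0, 1.0]])

pixels_h = H @ ideal                      # homogeneous pixel coordinates
pixels = (pixels_h[:2] / pixels_h[2]).T   # divide by the third coordinate

print('ideal corners mapped to pixels\n', pixels[:4])
print('ideal origin mapped to pixels\n', pixels[4])
```

With a real detection, the mapped origin should land on the tag center and the mapped corners on the tag corners, up to their ordering.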

## Computing camera-to-tag relative pose

Camera-to-tag transforms can be obtained from the detected corners if we know
  - the geometry of the tags
  - the geometry of the camera (intrinsic and distortion parameters)

So let us define these parameters:

In [None]:
# Tag geometry
tag_size    = 0.2
tag_corners = tag_size / 2 * np.array([[-1,1,0],[1,1,0],[1,-1,0],[-1,-1,0]])

# Camera calibration 
K = np.array([  [   320,    0.0,    320  ], 
                [   0.0,    320,    240  ], 
                [   0.0,    0.0,    1.0  ]])  
# warning: these params do not correspond to the ones of the camera used to take the image skew.jpeg

distortion_model = np.array([])  # we assume rectified images, therefore with no distortion
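
To see how these parameters work together, the sketch below projects the four tag corners into the image with the pinhole model and no distortion. The pose used here (`R`, `T`) is a hypothetical one chosen for illustration: the tag faces the camera, with aligned axes, at 1 m distance.

```python
import numpy as np

tag_size = 0.2
tag_corners = tag_size / 2 * np.array([[-1, 1, 0], [1, 1, 0], [1, -1, 0], [-1, -1, 0]])

K = np.array([[320.0,   0.0, 320.0],
              [  0.0, 320.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Hypothetical pose: tag axes aligned with the camera axes, tag 1 m in front
R = np.eye(3)
T = np.array([0.0, 0.0, 1.0])

points_cam = tag_corners @ R.T + T            # corners expressed in the camera frame
pixels_h = points_cam @ K.T                   # homogeneous pixel coordinates
pixels = pixels_h[:, :2] / pixels_h[:, 2:3]   # divide by depth

print('projected corners [pix]\n', pixels)
```

The tag appears 64 pixels wide, which is the focal length times the tag size divided by the depth (320 × 0.2 / 1).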


From here there are two ways of computing the relative pose between the camera and the tag.

- One method uses the **homography** provided by the detector to extract translation `T` and rotation `R`. This method is unstable and we do not recommend it.

- The other method uses the **PnP algorithm**. It takes the four corners of the tag expressed in the tag frame (known from the tag geometry), the same corners as detected in the image by the Apriltag detector, and the camera calibration parameters. From these, the PnP algorithm computes the transformation (`T`,`R`) between camera and tag.
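
As a sketch of what PnP solves, assuming an ideal pinhole camera without distortion, each tag corner must satisfy the projection equation below. The symbols $p_i$, $(u_i, v_i)$ and $s_i$ are notation introduced here: $p_i$ is the $i$-th corner expressed in the tag frame, $(u_i, v_i)$ are its detected pixel coordinates, and $s_i > 0$ is its depth in the camera frame.

$$
s_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = K \left( R \, p_i + T \right), \qquad i = 1, \dots, 4
$$

PnP looks for the pair (`T`,`R`) that satisfies these four equations as closely as possible.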

We use OpenCV for this, and recover a translation vector and a rotation vector:

In [None]:
(_, rotation_vector, translation_vector) = cv2.solvePnP(tag_corners, detection.corners, K, distortion_model, flags = cv2.SOLVEPNP_IPPE_SQUARE)

T = translation_vector
w = rotation_vector

print('T = \n', T)
print('w = \n', w)


Note that although `T` and `w` are vectors, they are returned as 2-dimensional arrays of shape (3, 1). We can flatten them to avoid trouble down the road:

In [None]:
T = (T.T)[0]
w = (w.T)[0]

print('T = \n', T)
print('w = \n', w)
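
The kind of trouble we avoid can be seen in the short sketch below. The vector `T_col` is a stand-in with made-up values, shaped like the output of `cv2.solvePnP`.

```python
import numpy as np

# Stand-in for a solvePnP output: a column vector of shape (3, 1), made-up values
T_col = np.array([[0.1], [-0.2], [1.5]])

T_flat = T_col.ravel()     # shape (3,)
T_same = (T_col.T)[0]      # the indexing used above gives the same result

# Mixing both shapes silently broadcasts to a 3x3 array instead of a vector
mixed = T_col + T_flat

print(T_col.shape, T_flat.shape, mixed.shape)
print(np.array_equal(T_flat, T_same))
```

Here `ravel()` is shown as an equivalent way of flattening; either form works.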


It is now important to know how to interpret these `T` and `w`. Are they _camera-to-tag_? Or _tag-to-camera_? Reading the documentation of `cv2.solvePnP()`, we see that they transform points in tag frame into points in camera frame. Let us rename the variables to account for this and be more verbose:

In [None]:
T_c_t = T
w_c_t = w
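
The convention can be illustrated with a small sketch. It is written with a rotation matrix rather than a rotation vector for readability, and the pose values are made up for illustration: the tag is rotated 90 degrees about the camera z axis and sits 2 m in front of the camera.

```python
import numpy as np

# Hypothetical pose, made up for illustration
R_c_t = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
T_c_t = np.array([0.0, 0.0, 2.0])

p_t = np.array([0.1, 0.0, 0.0])   # a point expressed in the tag frame
p_c = R_c_t @ p_t + T_c_t         # the same point expressed in the camera frame

# Inverse transform: takes points in camera frame into the tag frame
R_t_c = R_c_t.T
T_t_c = -R_c_t.T @ T_c_t

p_t_back = R_t_c @ p_c + T_t_c    # recovers the original point

print('p_c      =', p_c)
print('T_t_c    =', T_t_c)
print('p_t_back =', p_t_back)
```

The inverse pair (`R_t_c`, `T_t_c`) is the one to use when the camera position is wanted in the tag frame.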

In order to obtain a rotation matrix from a rotation vector, we require the exponential in SO(3), or the Rodrigues formula. We use pinocchio for this:

In [None]:
import pinocchio as pin

R_c_t = pin.exp(w_c_t)

print('R = \n',R_c_t)
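
For reference, the Rodrigues formula can also be written directly in NumPy. The function `rodrigues` below is a sketch of our own, not part of pinocchio or OpenCV (OpenCV offers `cv2.Rodrigues` for the same conversion).

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix for a rotation vector w (unit axis times angle in radians)."""
    w = np.asarray(w, dtype=float).ravel()
    theta = np.linalg.norm(w)              # rotation angle
    if theta < 1e-12:
        return np.eye(3)                   # negligible rotation: return identity
    k = w / theta                          # unit rotation axis
    k_skew = np.array([[ 0.0, -k[2],  k[1]],
                       [ k[2],  0.0, -k[0]],
                       [-k[1],  k[0],  0.0]])   # skew-symmetric matrix of k
    return np.eye(3) + np.sin(theta) * k_skew + (1.0 - np.cos(theta)) * (k_skew @ k_skew)

R = rodrigues([0.0, 0.0, np.pi / 2])       # 90 degrees about the z axis
print(np.round(R, 3))
```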

We can now reproject the tag corners into the image, and see if they differ much from the detections:

In [None]:
projected_corners = cv2.projectPoints(tag_corners, R_c_t, T_c_t, K, distortion_model)
projected_corners = np.reshape(projected_corners[0],[4,2])  # fix weird format from opencv

print('projected corners\n', projected_corners)
print()
print('detected corners\n', detection.corners)

The projected and detected corners match only roughly. This is because the image used was taken from the internet, and we do not have the correct camera calibration parameters.

Let us then redo the whole process with a proper image taken with a camera with known calibration parameters:

In [None]:
image = cv2.imread('./data/visual_odom_laas_corridor/short2/frame0000.jpg', cv2.IMREAD_GRAYSCALE)
cv2.imshow("Image view", image)
cv2.waitKey(10) # waits 10 ms so that the window has time to draw the image

# Camera calibration
K           = np.array([[   419.53, 0.0,    427.88  ], 
                        [   0.0,    419.53, 241.32  ], 
                        [   0.0,    0.0,    1.0     ]])  

detections = detector.detect(image)

for detection in detections[0:2]:   # we'll show results of only 2 detections

    print('Tag # ', detection.tag_id)

    (_, w_c_t, T_c_t) = cv2.solvePnP(tag_corners, detection.corners, K, distortion_model, flags = cv2.SOLVEPNP_IPPE_SQUARE)
    R_c_t = pin.exp(w_c_t)

    projected_corners = cv2.projectPoints(tag_corners, R_c_t, T_c_t, K, distortion_model)
    projected_corners = np.reshape(projected_corners[0],[4,2])  # fix weird format from opencv
    
    print('projected corners\n', projected_corners)
    print('detected corners\n', detection.corners)



## Assessing tag detection quality

The quality of a detection can be assessed with different metrics. Here are some options:

- Use the `detection.goodness` result
- Use the `detection.hamming` result
- Use the `detection.decision_margin` result
- Use the reprojection error of the corners. 

We do not provide further details here, but you may explore these possibilities should your SLAM algorithm show signs of fragility:
- For the Apriltag detector metrics, see the documentation above
- For the reprojection error, you can use the code below:

In [None]:
detection = detections[0]

(_, w_c_t, T_c_t) = cv2.solvePnP(tag_corners, detection.corners, K, distortion_model, flags = cv2.SOLVEPNP_IPPE_SQUARE)
R_c_t = pin.exp(w_c_t)

projected_corners   = cv2.projectPoints(tag_corners, R_c_t, T_c_t, K, distortion_model)
projected_corners   = np.reshape(projected_corners[0],[4,2])  # fix weird format from opencv

reprojection_errors = projected_corners - detection.corners

reprojection_error_rms = np.linalg.norm(reprojection_errors) / np.sqrt(8.0)

print('reprojection errors [pix]\n', reprojection_errors)
print('reprojection_error_rms [pix rms]\n', reprojection_error_rms)
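
These metrics can be combined into a simple acceptance test. The sketch below is one possible approach, not a recommended setting: the `Candidate` record is a stand-in for a detection together with its reprojection error, and all values and thresholds are made up for illustration.

```python
from collections import namedtuple

# Stand-in for a detection plus its reprojection error (hypothetical record)
Candidate = namedtuple('Candidate',
                       'tag_id hamming decision_margin reprojection_error_rms')

# Made-up values for illustration
candidates = [
    Candidate(tag_id=5,  hamming=0, decision_margin=65.0, reprojection_error_rms=0.4),
    Candidate(tag_id=12, hamming=2, decision_margin=70.0, reprojection_error_rms=0.3),
    Candidate(tag_id=7,  hamming=0, decision_margin=12.0, reprojection_error_rms=0.5),
    Candidate(tag_id=3,  hamming=0, decision_margin=60.0, reprojection_error_rms=3.2),
]

# Hypothetical thresholds: tune them for your own camera and tags
MAX_HAMMING = 0
MIN_DECISION_MARGIN = 30.0
MAX_REPROJECTION_RMS = 1.0   # pixels

def is_reliable(c):
    """Accept a candidate only if it passes all three checks."""
    return (c.hamming <= MAX_HAMMING
            and c.decision_margin >= MIN_DECISION_MARGIN
            and c.reprojection_error_rms <= MAX_REPROJECTION_RMS)

accepted = [c.tag_id for c in candidates if is_reliable(c)]
print('accepted tags:', accepted)
```

Each rejected candidate fails exactly one check, which makes it easy to see which metric is responsible when tuning the thresholds.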


And that's all we need to know about the `AprilTag` package to make 3D measurements of tags in the environment!