# The SLAM algorithm

## Introduction

This tutorial is intended to help you organize all the pieces of the SLAM system into a functioning algorithm.

The idea of the algorithm is the following:

1. Images will arrive

1. For each image, you will define a keyframe, storing position and orientation of the camera

1. For each keyframe, you need to link it to the previous one via a motion factor

1. You have to analyse the images and detect AprilTags on them

1. For each detected tag, you will handle the corresponding landmark

    1. Sometimes, the tag is already known: you need to identify it and create a factor

    1. Sometimes, the detected tag is new: you need to create a landmark and a factor

The following is one of the many ways you can organize your code to achieve this.

We advance in iterations:

1. First iteration:  basic algorithm

1. Second iteration: bootstrap the 1st image

1. Third iteration:  set keyframe and landmark warm-start values

1. Fourth iteration: add a keyframe prior

1. Fifth iteration:  add a landmark prior


You will need to import all of the following:

In [None]:
import numpy as np
import time
import pinocchio as pin
import casadi
from pinocchio import casadi as cpin

import apriltag
import cv2
from utils.meshcat_viewer_wrapper import MeshcatVisualizer

from gslam_types import *
from gslam_april_tools import *


## First iteration: basic algorithm

The basic algorithm can be put in pseudo-code as follows:

In [None]:
# INITIALIZE ---------------
# set detector
# set solver
# set visualizer
# initialize dictionaries
# set initial values for warm-start
# set time variables, counters, etc
# set control flags 

# LOOP ---------------------
while True:
    # INPUT IMAGE ----------
    # read image
    # if no image: break
    # show image

    # PROCESS MOTION -------
    # make keyframe
    # make 'motion' factor

    # PROCESS IMAGE --------
    # detect all tags
    # loop all tag detections
        # if tag is known
            # make 'landmark' factor
        # else  -- if tag is new
            # make new landmark
            # make 'landmark' factor

    # SOLVE AND DISPLAY ----
    # call the solver
    # display all landmarks and keyframes in a 3d viewer

# CONCLUDE -----------------
# print some results

A more elaborate version of this basic algorithm is shown below:

In [None]:
## INITIALIZE --------------------------------------------------------------

# AprilTag detector
detector = apriltag.Detector()

# solver
opti    = casadi.Opti()
options = {'ipopt.print_level': 0, 'print_time': 0, 'ipopt.sb': 'yes'}
opti.solver("ipopt", options)

# Display
viz = MeshcatVisualizer()

# Dictionaries for SLAM objects
keyframes = dict()
landmarks = dict()
factors   = dict()

# Time and IDs
t       = 0
kf_id   = 0
fac_id  = 0


## LOOP ------------------------------------------------------------------------
while(True):

    ## INPUT IMAGE --------------------------------------------------------------

    # Read image ('imagefile(t)' is a placeholder for the path of the image at time t)
    image = cv2.imread('imagefile(t)', cv2.IMREAD_GRAYSCALE)
    if image is None: 
        break

    ## PROCESS MOTION --------------------------------------------------------------

    # make a KF for this image
    kf_id += 1
    keyframe = makeKeyFrame(...)  # see tuto_slam_elements for info!
    keyframes[kf_id] = keyframe

    # make a motion factor from last KF
    fac_id += 1  # take a fresh factor ID, otherwise we overwrite the last stored factor
    motion_measurement = ...
    motion_sqrt_info = np.eye(6) / 1e3
    factor = makeFactor('motion', fac_id, kf_last_id, kf_id, motion_measurement, motion_sqrt_info)  # see tuto_slam_elements for info!
    factors[fac_id] = factor

    ## PROCESS IMAGE --------------------------------------------------------------

    # analyze image
    detections = detector.detect(image)

    # loop all detected tags !
    for detection in detections:

        # obtain 3d pose of tag wrt. camera
        lmk_id      = detection.tag_id
        measurement = computePose(detection.corners)  # see tuto_apriltag for info!
        sqrt_info   = np.eye(6) / 1e-2

        # see if lmk is known or new
        if lmk_id in landmarks: # lmk is known: we only need to add a factor

            # factor
            fac_id += 1
            factor = makeFactor('landmark', fac_id, kf_id, lmk_id, measurement, sqrt_info)  # see tuto_slam_elements for info!
            factors[fac_id] = factor

        else: # lmk is new: we need to add the new landmark, and a factor

            # landmark
            landmark = makeLandmark(lmk_id, ...)  # see tuto_slam_elements for info!
            landmarks[lmk_id] = landmark
            
            # factor
            fac_id += 1
            factor = makeFactor('landmark', fac_id, kf_id, lmk_id, measurement, sqrt_info)  # see tuto_slam_elements for info!
            factors[fac_id] = factor

    ## SOLVE AND DISPLAY --------------------------------------------------------------

    # solve optimization problem!
    opti.solve()

    # draw all objects in 3d!
    drawAll(opti, keyframes, landmarks, factors, viz)

    ## ADVANCE TIME --------------------------------------------------------------

    kf_last_id = kf_id
    t          += 1

## CONCLUDE --------------------------------------------------------------

# print final results!
printResults(...)
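
The known-or-new landmark logic in the loop above can be tried in isolation. The following is a minimal sketch of the bookkeeping only, with no images and no solver: the list `frames` and the placeholder tuples and strings are hypothetical stand-ins for real detections, landmarks and factors.

```python
# Toy bookkeeping only: tag ids stand in for AprilTag detections.
frames = [[3, 7], [7], [3, 9]]   # hypothetical tag ids seen in three images

landmarks = {}   # lmk_id -> landmark (here just a placeholder string)
factors = {}     # fac_id -> (type, kf_id, lmk_id)
fac_id = 0

for kf_id, tag_ids in enumerate(frames, start=1):
    for lmk_id in tag_ids:
        # A new tag needs a landmark first; a known tag does not.
        if lmk_id not in landmarks:
            landmarks[lmk_id] = f"landmark {lmk_id}"

        # In both cases the detection produces exactly one 'landmark' factor.
        fac_id += 1
        factors[fac_id] = ("landmark", kf_id, lmk_id)

print(sorted(landmarks))   # [3, 7, 9]
print(len(factors))        # 5
```

Since the factor is created in the same way in both branches, this sketch hoists it out of the `if`. The explicit two-branch form used in the tutorial is equivalent and leaves room for branch-specific code.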


## Improvements

The algorithm above shows a few important limitations:

- On the first image, there is no last keyframe to attach a motion factor to

- Keyframes and landmarks do not have warm-start values: the solver will take long to converge, and may diverge!

- There is no prior or absolute information to anchor the produced map to any particular position / orientation

    - The prior can be set to the first keyframe

    - The prior can be set to one of the landmarks

These concerns are treated in the following sections.


### Second iteration: bootstrapping

We need to tackle the slightly different situation of the first keyframe, since there is no last keyframe to refer any motion to.

We need a flag to indicate that we are processing the very first image. We use a boolean `first_time` for this.

In [None]:
## INITIALIZE ----------------------------------------------------

# Mark first time execution
first_time = True

## LOOP ----------------------------------------------------------
while(True):

    ## INPUT IMAGE --------------------------------------------------------------

    ## PROCESS MOTION --------------------------------------------------------------

    # make a KF for this image
    kf_id += 1
    keyframe = makeKeyFrame(...)
    keyframes[kf_id] = keyframe

    # make a motion factor from last KF
    if not first_time:
        fac_id += 1  # take a fresh factor ID
        motion_measurement = ...
        motion_sqrt_info = np.eye(6) / 1e3
        factor = makeFactor('motion', fac_id, kf_last_id, kf_id, motion_measurement, motion_sqrt_info)
        factors[fac_id] = factor

    ## PROCESS IMAGE --------------------------------------------------------------

    ## SOLVE AND DISPLAY --------------------------------------------------------------

    ## ADVANCE TIME --------------------------------------------------------------
    
    first_time = False

## CONCLUDE --------------------------------------------------------------
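
To check the effect of the `first_time` flag, here is a small runnable sketch that replaces images, keyframes and factors with placeholders (the strings and tuples are hypothetical). With four images we expect four keyframes but only three motion factors, each linking a keyframe to its predecessor.

```python
keyframes = {}
factors = {}
kf_id = 0
fac_id = 0
kf_last_id = None    # no previous keyframe yet
first_time = True

for image in ["img0", "img1", "img2", "img3"]:   # placeholders for real images
    # every image gets a keyframe
    kf_id += 1
    keyframes[kf_id] = f"keyframe {kf_id}"

    # only images after the first one get a motion factor
    if not first_time:
        fac_id += 1
        factors[fac_id] = ("motion", kf_last_id, kf_id)

    # advance time
    kf_last_id = kf_id
    first_time = False

print(len(keyframes))   # 4
print(factors)          # {1: ('motion', 1, 2), 2: ('motion', 2, 3), 3: ('motion', 3, 4)}
```

An alternative design is to drop the flag and test `kf_last_id is None` instead. The explicit flag is kept here because later iterations reuse it for other first-image tasks.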



### Third iteration: set keyframe and landmark warm-start values

In order for the solver to converge quickly, it is important in SLAM to use the sensor measurements in our favor.

In particular, we want to compute warm-start values for each new state that we add to the system.

- For keyframes, we do so by copying the values of the old keyframe into the new keyframe

- For landmarks, we use the Pose3d information in the measurement to compute the position of the new landmark

In [None]:
## INITIALIZE ----------------------------------------------------

## LOOP ----------------------------------------------------------
while(True):

    ## INPUT IMAGE --------------------------------------------------------------

    ## PROCESS MOTION --------------------------------------------------------------

    if first_time:
        # make a KF for this image
        pos_kf = np.array([0,0,0])
        ori_kf = np.array([0,0,0])
        keyframe = makeKeyFrame(kf_id, pos_kf, ori_kf)  # specify warm-start values!!
        keyframes[kf_id] = keyframe

    else:
        # make a KF for this image
        kf_id += 1
        pos_kf = opti.value(keyframes[kf_last_id].position)     # recover numerical values from solver
        ori_kf = opti.value(keyframes[kf_last_id].anglevector)
        keyframe = makeKeyFrame(kf_id, pos_kf, ori_kf)          # specify warm-start values!!
        keyframes[kf_id] = keyframe

        # make a motion factor from last KF
        fac_id += 1  # take a fresh factor ID
        motion_measurement = np.array([0,0,0,  0,0,0]) # we use a constant-position motion model
        motion_sqrt_info = np.eye(6) / 1e3  # very imprecise!! this allows the solver to move this KF away from the last one
        factor = makeFactor('motion', fac_id, kf_last_id, kf_id, motion_measurement, motion_sqrt_info)
        factors[fac_id] = factor


    ## PROCESS IMAGE --------------------------------------------------------------
    
    # analyze image
    detections = detector.detect(image)

    for detection in detections:

        # obtain 3d pose of tag wrt. camera
        measurement = computePose(detection.corners)
        sqrt_info = np.eye(6) / 1e-2

        # see if lmk is known or new
        lmk_id = detection.tag_id
        if lmk_id in landmarks: # lmk known: we only need to add a factor

            fac_id += 1
            factor = makeFactor('landmark', fac_id, kf_id, lmk_id, measurement, sqrt_info)
            factors[fac_id] = factor

        else: # lmk new: we need to add the new landmark, and a factor

            # landmark warm-start!!
            # compose kf pose with measurement of tag pose, to obtain tag pose in global frame
            pos_lmk, ori_lmk = composePoses(pos_kf, ori_kf, measurement)

            landmark = makeLandmark(lmk_id, pos_lmk, ori_lmk)
            landmarks[lmk_id] = landmark

            fac_id += 1
            factor = makeFactor('landmark', fac_id, kf_id, lmk_id, measurement, sqrt_info)
            factors[fac_id] = factor

    ## SOLVE AND DISPLAY --------------------------------------------------------------

    ## ADVANCE TIME --------------------------------------------------------------

## CONCLUDE --------------------------------------------------------------
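
The landmark warm-start relies on composing the keyframe pose with the measured tag pose. The sketch below shows one possible way to do this with NumPy only. It assumes that orientations are stored as rotation vectors (axis times angle), which is one reading of the `anglevector` field; the helpers `exp3`, `log3` and `compose_poses` are written here for illustration and are not the tutorial's `composePoses`.

```python
import numpy as np

def exp3(w):
    """Rotation vector -> rotation matrix (Rodrigues formula)."""
    w = np.asarray(w, dtype=float)
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def log3(R):
    """Rotation matrix -> rotation vector (not robust for angles close to pi)."""
    cos_theta = np.clip((np.trace(R) - 1) / 2, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return np.zeros(3)
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2 * np.sin(theta)) * v

def compose_poses(pos_a, ori_a, pos_b, ori_b):
    """Pose b is expressed in frame a; return pose b in the frame of a's parent."""
    R_a = exp3(ori_a)
    pos = np.asarray(pos_a, dtype=float) + R_a @ np.asarray(pos_b, dtype=float)
    ori = log3(R_a @ exp3(ori_b))
    return pos, ori

# keyframe at (1, 0, 0), rotated 90 degrees about z
pos_kf, ori_kf = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, np.pi / 2])
# tag measured 1 m along the camera x axis, same orientation as the camera
pos_meas, ori_meas = np.array([1.0, 0.0, 0.0]), np.zeros(3)

pos_lmk, ori_lmk = compose_poses(pos_kf, ori_kf, pos_meas, ori_meas)
print(pos_lmk)   # approximately [1, 1, 0]
print(ori_lmk)   # approximately [0, 0, pi/2]
```

If Pinocchio is available, `pin.exp3` and `pin.log3` are natural replacements for the two hand-written helpers.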

### Fourth iteration: adding a keyframe prior

The map produced by the algorithm above only contains relative information:
- between consecutive keyframes
- from keyframes to landmarks

It is pertinent to anchor the problem to some absolute value. For this, we can create a factor that will attract the first keyframe towards a user-defined value.

This factor will be added to the first keyframe, using the `first_time` marker we defined above:

In [None]:
## INITIALIZE ----------------------------------------------------

## LOOP ----------------------------------------------------------
while(True):

    ## INPUT IMAGE --------------------------------------------------------------
    
    ## PROCESS MOTION --------------------------------------------------------------
    
    # make a KF for this image
    if first_time:
        # make a KF for this image
        # ...
        
        # add a prior keyframe factor
        fac_id += 1  # take a fresh factor ID
        prior_measurement = np.array([0,0,0,  0,0,0])   # origin of coordinates -- same as warm-start values!!
        prior_sqrt_info = np.eye(6) / 1e-6               # very precise!! (large sqrt information = small uncertainty)
        factor = makeFactor('prior_keyframe', fac_id, kf_id, prior_measurement, prior_sqrt_info)
        factors[fac_id] = factor

    else:
        # make a KF for this image
        # ...

        # make a motion factor from last KF
        # ...

    ## PROCESS IMAGE --------------------------------------------------------------

    ## SOLVE AND DISPLAY --------------------------------------------------------------
    
    ## ADVANCE TIME --------------------------------------------------------------

## CONCLUDE --------------------------------------------------------------
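
How strongly a prior anchors the solution depends on the size of its square-root information relative to that of the other factors. The toy below illustrates this on a scalar state, assuming each factor contributes a residual of the form `sqrt_info * (x - measurement)` to a least-squares cost; this is the usual convention, but check how `makeFactor` builds its residuals. The function `fuse` is a hypothetical helper written for this illustration.

```python
import numpy as np

def fuse(prior_value, prior_sqrt_info, meas_value, meas_sqrt_info):
    """Minimize (w_p * (x - prior))^2 + (w_m * (x - meas))^2 over a scalar x."""
    A = np.array([[prior_sqrt_info], [meas_sqrt_info]])
    b = np.array([prior_sqrt_info * prior_value, meas_sqrt_info * meas_value])
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(x[0])

# The prior says x = 0, a conflicting measurement says x = 1.
strong = fuse(0.0, 1 / 1e-6, 1.0, 1 / 1e-2)   # prior sqrt info 1e6: very precise
weak   = fuse(0.0, 1 / 1e6,  1.0, 1 / 1e-2)   # prior sqrt info 1e-6: very imprecise

print(strong)   # very close to 0: the prior wins
print(weak)     # very close to 1: the measurement wins
```

Dividing by a small number gives a large square-root information, hence a confident factor. Dividing by a large number, as for the motion factor, gives a weak one.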



### Fifth iteration: adding a landmark prior

Adding a landmark prior is more complicated because it affects several other things, in particular the way we warm-start the first keyframe: this warm-start must agree with the absolute pose we assign to the landmark as its initial specification.

In other words, if a landmark is known a priori to be at a particular location, the location of the camera cannot be known until that landmark has been seen.

Therefore the first keyframe needs to be created and warm-started **after** analysing the first image and deciding which landmark receives the prior.

Once the landmark is positioned, we compose its pose with the inverse of the tag measurement to obtain the pose of the camera.

Only then can the first keyframe be created.


In [None]:
## INITIALIZE ----------------------------------------------------
pos_lmk_prior = np.array([1,2,3])
ori_lmk_prior = np.array([4,5,6])

prior_is_set = False  # we replace the 'first_time' flag by this new flag for better clarity

## LOOP ----------------------------------------------------------
while(True):

    ## INPUT IMAGE --------------------------------------------------------------

    ## PROCESS MOTION --------------------------------------------------------------
    if prior_is_set:

        # make a KF for this image
        kf_id += 1
        pos_kf = opti.value(keyframes[kf_last_id].position)
        ori_kf = opti.value(keyframes[kf_last_id].anglevector)
        keyframe = makeKeyFrame(kf_id, pos_kf, ori_kf)  # specify warm-start values!!
        keyframes[kf_id] = keyframe

        # make a motion factor from last KF
        fac_id += 1  # take a fresh factor ID
        motion_measurement = np.array([0,0,0,  0,0,0]) # we use a constant-position motion model
        motion_sqrt_info = np.eye(6) / 1e3  # very imprecise!! this allows the solver to move this KF away from the last one
        factor = makeFactor('motion', fac_id, kf_last_id, kf_id, motion_measurement, motion_sqrt_info)
        factors[fac_id] = factor

    ## PROCESS IMAGE --------------------------------------------------------------

    # analyze image
    detections = detector.detect(image)

    for detection in detections:

        # obtain 3d pose of tag wrt. camera
        lmk_id = detection.tag_id
        pos_kf_lmk, ori_kf_lmk = computePose(detection.corners)  # landmark pose wrt. keyframe (aka. camera)
        measurement = casadi.vertcat(pos_kf_lmk, ori_kf_lmk)  # stack position and orientation into one 6-vector
        sqrt_info = np.eye(6) / 1e-2

        if not prior_is_set: # need to create the landmark prior -- we apply it to the first detected landmark for simplicity

            # create landmark at prior position
            pos_lmk = pos_lmk_prior 
            ori_lmk = ori_lmk_prior
            landmark = makeLandmark(lmk_id, pos_lmk, ori_lmk)
            landmarks[lmk_id] = landmark

            # create and append 'prior_landmark' factor (code not shown)

            # compute keyframe position for warm-start            
            pos_lmk_kf, ori_lmk_kf = invertPose(pos_kf_lmk, ori_kf_lmk)  # obtain 3d pose of camera wrt tag
            pos_kf, ori_kf = composePoses(pos_lmk, ori_lmk, pos_lmk_kf, ori_lmk_kf)  # compute kf pose as composition of the above
            
            # make and append first keyframe
            keyframe = makeKeyFrame(kf_id, pos_kf, ori_kf)
            keyframes[kf_id] = keyframe

            # flip flag
            prior_is_set = True

        if lmk_id in landmarks:
            # create and append 'landmark' factor (code not shown)
            fac_id += 1
            factor = ...

        else:
            # create and append new lmk (code not shown)
            landmark = ...
            # create and append 'landmark' factor (code not shown)
            fac_id += 1
            factor = ...


    ## SOLVE AND DISPLAY --------------------------------------------------------------

    ## ADVANCE TIME --------------------------------------------------------------
    
## CONCLUDE --------------------------------------------------------------
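
The warm-start of the first keyframe from a landmark prior can be checked numerically. The sketch below uses 4x4 homogeneous transforms instead of position and orientation pairs to keep it short; `make_transform` and `invert_transform` are hypothetical helpers, not the tutorial's `invertPose` and `composePoses`, and the rotation is restricted to yaw for readability.

```python
import numpy as np

def make_transform(yaw, position):
    """Homogeneous transform with a rotation of `yaw` about z and a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    T[:3, 3] = position
    return T

def invert_transform(T):
    """Inverse of a rigid transform: transpose the rotation, rotate back the translation."""
    R, p = T[:3, :3], T[:3, 3]
    T_inv = np.eye(4)
    T_inv[:3, :3] = R.T
    T_inv[:3, 3] = -R.T @ p
    return T_inv

T_world_tag = make_transform(np.pi / 2, [1.0, 2.0, 3.0])   # landmark prior (tag in world)
T_cam_tag   = make_transform(0.0, [0.0, 0.0, 2.0])         # measurement (tag in camera)

# camera in world = (tag in world) composed with (camera in tag)
T_world_cam = T_world_tag @ invert_transform(T_cam_tag)

print(T_world_cam[:3, 3])   # approximately [1, 2, 1]
```

Composing the result with the measurement again, `T_world_cam @ T_cam_tag`, should give back `T_world_tag`. This is a convenient sanity check when implementing the same steps with position and orientation vectors.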


## Conclusion

As you can see, the code grows as new functionalities are added to the system.

The set of functionalities explored here should be considered as the minimum necessary:

- **Bootstrap :** to account for exceptions to the general algorithm at the start

- **Priors :** to anchor the resulting map to some global reference

- **Warm-starts :** to help the solver converge, and do so quickly.

From here on, you are free to enrich the code with other assets. Some suggestions (not necessarily easy to implement) are:

- Assess the quality of the detected tags. If a tag is deemed unsure, discard the measurement.

- Add an option to select between keyframe prior or landmark prior

- Change landmark priors to something different: just say that landmarks are on the ground (z = 0) and horizontal (pitch = roll = 0)
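
As a possible starting point for the first suggestion, the sketch below filters detections on two quality indicators. The field names `hamming` and `decision_margin` are assumed to mirror what the AprilTag detector reports for each detection; check the detection objects returned by your binding. The class `FakeDetection` and the threshold values are purely hypothetical.

```python
from dataclasses import dataclass

@dataclass
class FakeDetection:
    """Stand-in for a detector output, with only the fields used here."""
    tag_id: int
    hamming: int            # number of corrected bit errors in the tag code
    decision_margin: float  # contrast-based confidence, higher is better

def keep_reliable(detections, max_hamming=0, min_margin=30.0):
    """Keep detections with few bit errors and a sufficient decision margin."""
    return [d for d in detections
            if d.hamming <= max_hamming and d.decision_margin >= min_margin]

detections = [
    FakeDetection(tag_id=3, hamming=0, decision_margin=75.0),   # clean detection
    FakeDetection(tag_id=7, hamming=2, decision_margin=60.0),   # too many bit errors
    FakeDetection(tag_id=9, hamming=0, decision_margin=12.0),   # low margin
]

reliable = keep_reliable(detections)
print([d.tag_id for d in reliable])   # [3]
```

In the SLAM loop, such a filter would be applied right after `detector.detect(image)`, so that discarded tags create neither landmarks nor factors.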

Good luck!