# Dataset Visual Odometry / SLAM Evaluation

1. [Download odometry data set (grayscale, 22 GB)](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_odometry_gray.zip)
2. [Download odometry data set (color, 65 GB)](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_odometry_color.zip)
3. [Download odometry data set (velodyne laser data, 80 GB)](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_odometry_velodyne.zip)
4. [Download odometry data set (calibration files, 1 MB)](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_odometry_calib.zip)
5. [Download odometry ground truth poses (4 MB)](https://s3.eu-central-1.amazonaws.com/avg-kitti/data_odometry_poses.zip)



## Sensor setup 
<img src="images/setup_top_view.png" />

<img src="images/passat_sensors_920.png" />




## Calibration Files and Projection Matrices

To get the calibration data, run:
```
python kitti_calibration.py
```





- $P0$: Reference camera (left grayscale camera, stereo pair 1); extrinsics are identity.
- $P1$: Right grayscale camera (stereo pair 1); extrinsics include the baseline offset.
- $P2$: Left color camera (stereo pair 2); extrinsics give its offset relative to $P0$.
- $P3$: Right color camera (stereo pair 2); extrinsics give its offset relative to $P0$.


---

Camera: $P0$:

```
Projection Matrix:
[[707.0912   0.     601.8873   0.    ]
 [  0.     707.0912 183.1104   0.    ]
 [  0.       0.       1.       0.    ]]
Intrinsic Matrix:
[[707.0912   0.     601.8873]
 [  0.     707.0912 183.1104]
 [  0.       0.       1.    ]]
Rotation Matrix:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
Translation Vector:
[[0.]
 [0.]
 [0.]]
```
---

Camera: $P1$:
```
Projection Matrix:
[[ 707.0912    0.      601.8873 -379.8145]
 [   0.      707.0912  183.1104    0.    ]
 [   0.        0.        1.        0.    ]]
Intrinsic Matrix:
[[707.0912   0.     601.8873]
 [  0.     707.0912 183.1104]
 [  0.       0.       1.    ]]
Rotation Matrix:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
Translation Vector:
[[ 5.37150653e-01]
 [-1.34802944e-17]
 [ 0.00000000e+00]]
```

In the sensor setup image above, the distance between the two cameras is `0.54` m along the $x$ axis, and the decomposition gives `5.37150653e-01`, which agrees with it.
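
The baseline can also be read directly off the projection matrix without a full decomposition. For a rectified camera with $P = K[I \mid t]$, the entry $P_{14}$ equals $f_x t_x$, so the camera center sits at $x = -P_{14}/f_x$ relative to the reference camera. A minimal sketch using the $P1$ values printed above (the helper name `baseline_from_projection` is illustrative and is not part of `kitti_calibration.py`):

```python
# P1 as printed above (rows of the 3x4 projection matrix).
P1 = [
    [707.0912, 0.0, 601.8873, -379.8145],
    [0.0, 707.0912, 183.1104, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]


def baseline_from_projection(P):
    """Return the camera offset along x (in meters) for a rectified camera.

    Assumes P = K [I | t], so P[0][3] = fx * tx and the camera center
    lies at x = -tx relative to the reference camera.
    """
    fx = P[0][0]
    return -P[0][3] / fx


baseline = baseline_from_projection(P1)
print(f"baseline: {baseline:.6f} m")
```

The result should match the first entry of the translation vector from the decomposition.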

Refs: [1](https://www.cvlibs.net/datasets/kitti/setup.php),
[2](https://stackoverflow.com/questions/29407474/how-to-understand-the-kitti-camera-calibration-files), [3](https://github.com/yanii/kitti-pcl/blob/master/KITTI_README.TXT), [4](https://www.cvlibs.net/datasets/kitti/eval_odometry.php), [5](https://github.com/avisingh599/mono-vo/), [6](https://github.com/alishobeiri/Monocular-Video-Odometery), [7](https://avisingh599.github.io/vision/monocular-vo/)


## Ground Truth Poses
Each row of the pose file has 12 columns, obtained by flattening (row by row) the `3x4` transformation matrix of the left camera:

```
r11 r12 r13 tx r21 r22 r23 ty r31 r32 r33 tz
```
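
As an illustration, a row can be turned back into a homogeneous `4x4` matrix by reshaping it and appending `[0, 0, 0, 1]`. This is a minimal standard-library sketch; the helper name `parse_pose_row` and the sample row are made up for illustration and do not come from the repository scripts.

```python
def parse_pose_row(line):
    """Convert one line of a KITTI pose file into a 4x4 matrix (list of rows)."""
    values = [float(v) for v in line.split()]
    if len(values) != 12:
        raise ValueError(f"expected 12 values, got {len(values)}")
    # Row-major layout: r11 r12 r13 tx r21 r22 r23 ty r31 r32 r33 tz
    rows = [values[0:4], values[4:8], values[8:12]]
    rows.append([0.0, 0.0, 0.0, 1.0])  # homogeneous bottom row
    return rows


# Hypothetical sample row: identity rotation, translation (1, 2, 3).
sample = "1 0 0 1.0 0 1 0 2.0 0 0 1 3.0"
T = parse_pose_row(sample)
translation = [T[0][3], T[1][3], T[2][3]]
print(translation)
```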





## Display Ground Truth Poses in rerun 
Run:

```
python kitti_gt_to_rerun.py
```


<img src="images/display_ground_truth_poses_rerun.png" />


## Visual Odometry

To use `SIFT` features, run:

```
python kitti_vo_sift.py
```

<img src="images/kitti_vo_sift.png" />

Alternatively, you can use `cv2.goodFeaturesToTrack`, but it gives poorer results:


```
python kitti_vo.py
```

<img src="images/kitti_vo.png" />


## Stereo Vision
Run:
```
python kitti_stereo.py
```


## Reconstruct Sparse/Dense Model From Known Camera Poses with Colmap

Your data should have the following structure: 

```
├── database.db
├── dense
│   ├── refined
│   │   └── model
│   │       └── 0
│   └── sparse
│       └── model
│           └── 0
├── images
│   ├── 000000.png
│   ├── 000001.png
│   ├── 000002.png
│   └── 000003.png
└── sparse
    └── model
        └── 0
            ├── cameras.txt
            ├── images.txt
            └── points3D.txt
```

1. `cameras.txt`: the format is:

```
CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[]
```
For the KITTI dataset, the camera model is `PINHOLE`, which takes four parameters: the focal lengths (`fx`, `fy`) and the principal point coordinates (`cx`, `cy`).

- `CAMERA_ID`: 1
- `MODEL`: PINHOLE
- `WIDTH`: 1226
- `HEIGHT`: 370
- `fx`: 707.0912
- `fy`: 707.0912
- `cx`: 601.8873
- `cy`: 183.1104

The resulting line should look like this:

```
1 PINHOLE 1226 370 707.0912 707.0912 601.8873 183.1104
```

2. `images.txt`: the format is:
```
IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME
```

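Your data should look like this. Note the empty line after each image line: COLMAP expects two lines per image, and the second one (the 2D points of that image) is left empty here: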

```
1 1.0 0.0 0.0 0.0 0.031831570484910754 -0.2020180259287443 -0.05988511865826446 1 000000.png

2 0.9999990698095921 -0.000486454947446343 0.0008155417501438222 -0.0009790981505847082 -0.026717887515950233 -0.09385561937368328 -0.38812196090339146 1 000001.png

3 0.9999976159395401 -0.0011567120445530273 0.0013793515824379724 -0.0012359294859380324 -0.23100950491953082 -0.05900910756124116 -0.9698261247623092 1 000002.png

4 0.9999950283825452 -0.0017604272641239351 0.0022926784138869423 -0.0012600522730534293 0.17578254454768152 -0.014474209460539546 -1.9112790713853196 1 000003.png
```
and finally:

3. `points3D.txt`: This file should be empty.
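
The `cameras.txt` line from step 1 can be derived from the $P0$ projection matrix and the image size. A minimal sketch, assuming a rectified pinhole camera with identity rotation; `cameras_txt_line` is an illustrative helper and not part of the repository scripts:

```python
def cameras_txt_line(P, width, height, camera_id=1):
    """Format a COLMAP cameras.txt entry for a PINHOLE camera.

    P is the 3x4 projection matrix; fx, fy, cx, cy are read from its
    left 3x3 block, which equals K when the rotation is identity.
    """
    fx, fy = P[0][0], P[1][1]
    cx, cy = P[0][2], P[1][2]
    return f"{camera_id} PINHOLE {width} {height} {fx} {fy} {cx} {cy}"


P0 = [
    [707.0912, 0.0, 601.8873, 0.0],
    [0.0, 707.0912, 183.1104, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
line = cameras_txt_line(P0, width=1226, height=370)
print(line)
```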

To compare against your own files, you can convert an existing COLMAP model to TXT with the following command:

```
colmap model_converter --input_path $DATASET_PATH/sparse/0 --output_path $DATASET_PATH/ --output_type TXT
```

KITTI format for ground truth poses (for instance, for the file `data/kitti/odometry/05/poses/05.txt`) is:

```
r11 r12 r13 tx r21 r22 r23 ty r31 r32 r33 tz
```
The colmap format for `images.txt` is: 

```
IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME
```
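
The two formats differ in more than layout. KITTI poses map points from the camera frame into the frame of the first camera (camera-to-world), whereas COLMAP's `images.txt` stores the world-to-camera rotation (as a quaternion) and translation. The pose therefore has to be inverted before the rotation is converted to a quaternion. The following is a minimal standard-library sketch of that conversion under these assumptions; it is not the code of `kitti_to_colmap.py`, and all function names and the sample pose are illustrative.

```python
import math


def invert_pose(R, t):
    """Invert a rigid transform: returns (R^T, -R^T t)."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return Rt, t_inv


def rotation_to_quaternion(R):
    """Convert a 3x3 rotation matrix to a unit quaternion [qw, qx, qy, qz]."""
    trace = R[0][0] + R[1][1] + R[2][2]
    if trace > 0.0:
        s = 2.0 * math.sqrt(trace + 1.0)  # s = 4 * qw
        return [0.25 * s, (R[2][1] - R[1][2]) / s,
                (R[0][2] - R[2][0]) / s, (R[1][0] - R[0][1]) / s]
    if R[0][0] > R[1][1] and R[0][0] > R[2][2]:
        s = 2.0 * math.sqrt(1.0 + R[0][0] - R[1][1] - R[2][2])  # s = 4 * qx
        return [(R[2][1] - R[1][2]) / s, 0.25 * s,
                (R[0][1] + R[1][0]) / s, (R[0][2] + R[2][0]) / s]
    if R[1][1] > R[2][2]:
        s = 2.0 * math.sqrt(1.0 + R[1][1] - R[0][0] - R[2][2])  # s = 4 * qy
        return [(R[0][2] - R[2][0]) / s, (R[0][1] + R[1][0]) / s,
                0.25 * s, (R[1][2] + R[2][1]) / s]
    s = 2.0 * math.sqrt(1.0 + R[2][2] - R[0][0] - R[1][1])  # s = 4 * qz
    return [(R[1][0] - R[0][1]) / s, (R[0][2] + R[2][0]) / s,
            (R[1][2] + R[2][1]) / s, 0.25 * s]


def kitti_row_to_colmap_line(image_id, row, name, camera_id=1):
    """Turn one KITTI pose row into one images.txt line."""
    v = [float(x) for x in row.split()]
    R = [v[0:3], v[4:7], v[8:11]]        # camera-to-world rotation
    t = [v[3], v[7], v[11]]              # camera-to-world translation
    R_wc, t_wc = invert_pose(R, t)       # world-to-camera, as COLMAP expects
    q = rotation_to_quaternion(R_wc)
    fields = [image_id, *q, *t_wc, camera_id, name]
    return " ".join(str(f) for f in fields)


# Hypothetical pose: 90 degree rotation about z, translation (1, 2, 3).
row = "0 -1 0 1 1 0 0 2 0 0 1 3"
line = kitti_row_to_colmap_line(1, row, "000000.png")
print(line)
```

When writing `images.txt`, each such line is followed by an empty line, as described above.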

Run the script [kitti_to_colmap.py](../scripts/kitti/kitti_to_colmap.py). It writes its output to the `images.txt` file.


You can run the following script to add noise: [kitti_to_colmap_noise.py](../scripts/kitti/kitti_to_colmap_noise.py).


Inside `~/colmap_projects/kitti_noisy`, create a soft link pointing to the KITTI images: `ln -s <path-to-kitti-odometry-image> images`

In my case:

```
 ln -s /home/$USER/workspace/OpenCVProjects/data/kitti/odometry/05/image_0/ images
```

### Setting up parameters

Set the camera parameters (`fx,fy,cx,cy`):

```
CAM=707.0912,707.0912,601.8873,183.1104
```

Set the project name and path:
```
project_name=kitti_noisy
DATASET_PATH=/home/$USER/colmap_projects/$project_name
```

### Feature extraction

Extract the features:
```
colmap feature_extractor  \
--database_path $DATASET_PATH/database.db  \
--image_path $DATASET_PATH/images  \
--ImageReader.single_camera=true --ImageReader.camera_model=PINHOLE --ImageReader.camera_params=$CAM \
--SiftExtraction.use_gpu 1 \
--SiftExtraction.estimate_affine_shape=true \
--SiftExtraction.domain_size_pooling=true
```

or, with the default SIFT extraction options:

```
colmap feature_extractor  \
--database_path $DATASET_PATH/database.db  \
--image_path $DATASET_PATH/images  \
--ImageReader.single_camera=true --ImageReader.camera_model=PINHOLE --ImageReader.camera_params=$CAM
```

### Matcher
Run the matcher:

```
colmap sequential_matcher \
   --database_path $DATASET_PATH/database.db \
   --SequentialMatching.overlap=3 \
   --SequentialMatching.loop_detection=true \
   --SequentialMatching.loop_detection_period=2 \
   --SequentialMatching.loop_detection_num_images=50 \
   --SequentialMatching.vocab_tree_path="$DATASET_PATH/../vocab_tree/vocab_tree_flickr100K_words256K.bin" \
   --SiftMatching.use_gpu 1 --SiftMatching.gpu_index=-1  --SiftMatching.guided_matching=true 
```

If you now open the COLMAP GUI, create a new project, set the image path, select `database.db`, and then choose **File > Import Model** and point it to `kitti_noisy/sparse/model/0/`, you will get the following:

<img src="images/kitti_sparse_noisy_colmap.png" />





### Triangulation
Then run the point triangulator:

```
colmap point_triangulator \
    --database_path $DATASET_PATH/database.db \
    --image_path $DATASET_PATH/images \
    --input_path $DATASET_PATH/sparse/model/0 \
    --output_path $DATASET_PATH/dense/sparse/model/0
```

Now run the bundle adjuster to optimize only the extrinsics (camera positions and orientations) and **not** the intrinsics (camera parameters):


```
colmap bundle_adjuster  \
  --input_path $DATASET_PATH/dense/sparse/model/0 \
  --output_path $DATASET_PATH/dense/refined/model/0 \
  --BundleAdjustment.refine_focal_length  0 \
  --BundleAdjustment.refine_principal_point   0 \
  --BundleAdjustment.refine_extra_params  0 \
  --BundleAdjustment.refine_extrinsics  1
```


If you now open the COLMAP GUI, create a new project, set the image path, select `database.db`, and then choose **File > Import Model** and point it to `kitti_noisy/dense/refined/model/0/`, you will get the following:

<img src="images/kitti_sparse_refined_colmap.png" />


Refs: [1](https://colmap.github.io/faq.html#reconstruct-sparse-dense-model-from-known-camera-poses)