CNN-SVO

(Top) camera frames (Bottom) depth maps of the keyframes from Monodepth

Demo: https://www.youtube.com/watch?v=4oTzwoby3jw

Paper: CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction

This is an extension of the SVO with major improvements for the forward facing camera with the help of monocular depth estimation from CNN.

We tested this code with ROS Kinetic on Ubuntu 16.04.

There are two ways to make it work

Online mode
Offline mode

Dataset

We use KITTI Odometry sequences and Oxford Robotcar dataset to evaluate the robustness of our method.

We provide camera parameters for both datasets. Running a sequence requires setting the directory that keeps the images (also directory to their corresponding depth maps if you are using Offline mode).

Unfortunately, getting Oxford Robotcar dataset to work is not straightforward. Here is the rough idea of what do you need to do to preprocess Oxford Robotcar dataset images:

Pick a sequence to download
Use the SDK to get the RGB images
Crop the images to match the aspect ratio (1248x376) of images in KITTI dataset
(For running Offline mode) Save the depth maps of those images

Online mode

Clone the monodepth-cpp repo and follow the instructions
Make sure the library is successfully built by running the inference_monodepth executable
Set the image directory in rpg_svo/svo_ros/param/vo_kitti.yaml
Make sure the images are colour (not greyscale) images

NOTE: If you are having difficulty compiling the library, you can still visualize the results using the Offline mode

Offline mode

Clone the Monodepth repo and follow the Testing instructions
Modify the monodepth_main.py so that it saves the disparity maps as numpy (.npy) files in one folder (per sequence)
Set the image and depth directories in rpg_svo/svo_ros/param/vo_kitti.yaml

NOTE: For saving disparity map, the naming convention is depth_x.npy, where x is 0-based with no leading zero.

Instructions

Clone this repo

cd ~/catkin_ws/src
git clone https://github.com/yan99033/CNN-VO

(OPTIONAL) Clone monodepth-cpp repo

cd ~/catkin_ws/src/CNN-VO
git clone https://github.com/yan99033/monodepth-cpp

Enable/disable Online mode by toggling TRUE/FALSE in svo_ros/CMakeLists.txt and svo/CMakeLists.txt. Also, make sure that Monodepth library and header file are linked properly if Online mode is used. Note that Online mode is disabled by default
Compile the project

cd ~/catkin_ws
catkin_make

Launch the project

roscore
rosrun rviz rviz -d ~/catkin_ws/src/CNN-VO/rpg_svo/svo_ros/rviz_kitti.rviz
roslaunch svo_ros kittiOffline00-02.launch

Try another KITTI sequence

Set the folder image and depth directories in rpg_svo/svo_ros/param/vo_kitti.yaml
Use the corresponding roslaunch file

Evaluation

We used ATE evaluation code provided by ORBSLAM here with a little bit of modification.

cd eval

python2 evaluate_ate_scale.py PATH_TO_GROUNDTRUTH PATH_TO_YOUR_OUTPUT PATH_TO_TIMESTAMPS_FILE --plot PATH_TO_PLOT_FILE_YOU_WANT_TO_SAVE

As an example:

python2 evaluate_ate_scale.py ground_truth/kitti_ground_truth/00.txt sample/seq00_CNN_SVO_KITTI_SAMPLE.txt ground_truth/kitti_ground_truth/00_times.txt --plot 00.png

Possible extension

We tried our best to improve the existing SVO, but this code is by no means perfect. That being said, we would like to point out some of the noticeable problems in our code:

The original SVO is not designed to handle too many keyframes (KFs). Therefore, the system begins to slow down after accumulating too much KFs. For example, the system is running smoothly on Oxford Robotcar dataset because of the high frame-per-second (16 FPS); in contrast, it doesn't handle a large amount of KFs well on KITTI dataset (10 FPS). (It is always better to use a high FPS camera)
As mentioned in the paper, we use a constant velocity model to handle extreme brightness condition (i.e., complete blank image), and then we back-project more points when new features can be observed. In practice, that makes sense for outdoor driving conditions. However, if your application requires proper relocalization, you would need to implement that by yourself.
The network is trained to process depth maps with the image size of 256x512x3. There are two image resizing steps (downsizing the image and upsizing the depth map) in the VO pipeline. Although we don't see any noticeable problem doing it, it is indeed faster to process the images with the size of 256x512x3.
It is a rather inefficient way to save and load the depth maps that are stored separately (i.e., depth_0.npy, depth_1, npy, ...). A better way to store and retrieve the depth maps in the Offline mode is to save just ONE npy file (depth.npy) that has a shape of [total_num_images, width, height, 1].
(Suggestion) It would be better to create another thread for the visualization. The current approach to visualization is that it publishes local map points, trajectory and camera frustum, and those markers stay permanent until you close the visualization (unless the delete marker operation is carried out). You can have a dedicated thread (like ORB-SLAM) that gets all the keyframes and publishes the poses, map points and trajectory.

We hope that you can further extend the functionality of this work, and make the existing SVO even better (which is an impressive piece of work).

Disclaimer

The authors take no credit from SVO and Monodepth. Therefore the licenses should remain intact. Please cite their work if you find them helpful.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
eval		eval
gif		gif
rpg_svo		rpg_svo
rpg_vikit		rpg_vikit
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

eval

eval

gif

gif

rpg_svo

rpg_svo

rpg_vikit

rpg_vikit

.gitignore

.gitignore

README.md

README.md

Repository files navigation

CNN-SVO

Dataset

Online mode

Offline mode

Instructions

Evaluation

Possible extension

Disclaimer

About

Releases

Packages

Languages

ITking666/CNN-SVO

Folders and files

Latest commit

History

Repository files navigation

CNN-SVO

Dataset

Online mode

Offline mode

Instructions

Evaluation

Possible extension

Disclaimer

About

Resources

Stars

Watchers

Forks

Languages