Basically, VINS-Fusion on steroids with a learned front-end upgrade.
A tightly coupled visual-inertial odometry (VIO) system with deep learning front-ends for feature extraction and matching. It replaces classical hand-crafted features with learned alternatives (ALIKED, RaCo, SuperPoint, XFeat), combined with either Lucas-Kanade optical flow tracking or frame-to-frame LightGlue matching and accelerated via TensorRT for deployment on NVIDIA Jetson. Optional loop closure is provided through AnyLoc using DINOv2 + VLAD place recognition. The system is evaluated on public benchmarks (EuRoC, NTU-VIRAL, SubT-MRS) and compared against baselines (LET-NET, VINS-Fusion, OKVIS2).
ntu_combined_web.mp4
euroc_combined_web.mp4
- ALIKED_LK SubT_MRS smoke room
aliked_lk_smoke_room_web.mp4
- SuperPoint_LightGlue SubT_MRS low_light1
sp_lg_lowlight1_web.mp4
git clone https://github.com/limshoonkit/DL-VINS-Factory-ROS2.git --recursive
- CUDA 12.6 (JetPack 6.x)
- TensorRT 10.3
- Ceres 2.2
- OpenCV 4.13
- Download weights with
chmod +x scripts/download_weights.sh
./scripts/download_weights.sh
- Follow the instructions to set up uv. Make sure to install polygraphy and onnxsim.
- Run the notebooks. Remember to adjust the batch size and dimensions.
- Move the exported engines to the weights folder and set up the folder format accordingly.
- For loop fusion, the VLAD master codebook .bin is provided. You only need the DINOv2 ViT-S engine. Export it with the provided script. Set up the uv env for AnyLoc before that, or just run the TensorRT export from the provided .onnx.
cd /home/nvidia/dl-vins-factory/AnyLoc
uv sync
source .venv/bin/activate
export PATH="/usr/src/tensorrt/bin:$PATH" # for jetson
uv pip install onnx # if missing
uv run python -m vpr.dino_vpr_export \
--input-h 280 --input-w 448 --clusters 32 --fp16
cd DL-VINS && colcon build --base-paths src/ --symlink-install \
--cmake-args=-DCMAKE_BUILD_TYPE=Release --allow-overriding cv_bridge image_geometry
cd ..
The scripts (scripts/eval_run.sh + scripts/eval_runner.py) run the evaluation on different public datasets.
Refer to each dataset's source to download it.
Ground truth is pre-baked as <dataset>/groundtruth/<seq>_gt.tum and the ROS1 bags are converted to ROS2 .mcap format. Let me know if you need them; I can share those that their licenses allow me to redistribute.
Trajectories are written to tmp/<dataset>_eval_<mode>/<method>/<seq>/vio_trajectory_run01.tum.
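A minimal sketch of how the output layout above could be consumed from Python. The helper names `trajectory_path` and `parse_tum` are hypothetical and not part of the repository; the `run01` suffix is assumed to generalise to a zero-padded run index, and the parser assumes the standard TUM layout of `timestamp tx ty tz qx qy qz qw` per line.

```python
from pathlib import PurePosixPath


def trajectory_path(dataset, mode, method, seq, run=1, root="tmp"):
    """Build the expected trajectory path for one evaluation run.

    Assumption: runs other than run01 use the same zero-padded pattern.
    """
    return (
        PurePosixPath(root)
        / f"{dataset}_eval_{mode}"
        / method
        / seq
        / f"vio_trajectory_run{run:02d}.tum"
    )


def parse_tum(lines):
    """Parse TUM trajectory lines into (t, tx, ty, tz, qx, qy, qz, qw) tuples.

    Blank lines and '#' comment lines are skipped.
    """
    poses = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 8:
            raise ValueError(f"expected 8 columns, got {len(fields)}: {line!r}")
        poses.append(tuple(float(v) for v in fields))
    return poses
```

To read a real file, pass an open file handle, for example `with open(str(trajectory_path("euroc", "mono", "<method>", "MH_01_easy"))) as f: poses = parse_tum(f)`.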
EVAL_METHOD_SET=dl \
EVAL_MODE=mono \
EVAL_USE_LOOP_FUSION=true \
EVAL_METHODS="dlvins_gftt_cpu_mono_loop" \
EVAL_DATASET=euroc \
EVAL_SEQUENCES="MH_01_easy" \
bash scripts/eval_run.sh
Run the full DL sweep for all sequences on a dataset (very long running):
EVAL_METHOD_SET=dl \
EVAL_MODE=stereo \
EVAL_DATASET=euroc \
bash scripts/eval_run.sh
In one terminal, run
cd DL-VINS
source install/setup.bash
ros2 launch dl_vins ntu_viral_stereo.launch.py \
extractor:=aliked_lightglue \
use_loop_fusion:=true \
rviz:=true
# or
ros2 launch dl_vins subt_mrs_mono.launch.py \
extractor:=superpoint_lightglue \
use_loop_fusion:=true \
rviz:=true
and in another terminal, run
source install/setup.bash
ros2 bag play ../dataset/7_NTU-VIRAL/eee_01 --clock
# or
ros2 bag play ../dataset/6_SubT-MRS/low_light1 --clock
- GPLv3 license
- This work is based on VINS-Fusion and LightGlue-ONNX.
- Baseline methods for comparison (LET-NET, VINS-Fusion, OKVIS2) are available at baseline_reference.
- 📄 Paper: DL-VINS-Factory
@misc{lim2026dlvinsfactorymodularframeworklearned,
title={DL-VINS-Factory: A Modular Framework for Learned Visual Front-Ends in Visual-Inertial SLAM},
author={Shoon Kit Lim and Melissa Jia Ying Chong and Ting Yang Ling},
year={2026},
eprint={2607.01757},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2607.01757},
}
Feel free to open a GitHub issue.
The following are some things to beware of:
- ROS2 QoS profile and DDS discovery
I am using Cyclone DDS with the profile available at ./scripts/cyclonedds.xml, which increases the buffer size. Also, the sensor QoS defaults to reliable. The evaluation script also sets a non-default ROS_DOMAIN_ID.
I have found that an intermittent network connection, ROS multicast, and bad actors spoofing on the network can mess up a long-running evaluation. My advice: run the evaluation offline. The bad-actor part in particular took me a week to realise.
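As a rough illustration of the isolation described above, the sketch below builds an environment for launching a run with Cyclone DDS, the repository's XML profile, and a private domain ID. The function name `offline_ros_env` and the domain ID value 42 are assumptions for illustration; the value the evaluation script actually sets is not stated here. `RMW_IMPLEMENTATION`, `CYCLONEDDS_URI`, `ROS_DOMAIN_ID`, and `ROS_LOCALHOST_ONLY` are standard ROS 2 / Cyclone DDS environment variables, although newer ROS 2 releases replace `ROS_LOCALHOST_ONLY` with `ROS_AUTOMATIC_DISCOVERY_RANGE`.

```python
import os
from pathlib import Path


def offline_ros_env(repo_root, domain_id=42, base_env=None):
    """Return a copy of the environment configured for an isolated run.

    domain_id=42 is a placeholder, not the value used by eval_run.sh.
    """
    if not 0 <= domain_id <= 101:
        # 0-101 is the range that is safe on all platforms.
        raise ValueError("ROS_DOMAIN_ID should be in 0..101")
    env = dict(os.environ if base_env is None else base_env)
    profile = Path(repo_root, "scripts", "cyclonedds.xml").resolve()
    env["RMW_IMPLEMENTATION"] = "rmw_cyclonedds_cpp"
    env["CYCLONEDDS_URI"] = profile.as_uri()  # file:///.../cyclonedds.xml
    env["ROS_DOMAIN_ID"] = str(domain_id)
    env["ROS_LOCALHOST_ONLY"] = "1"  # keep discovery on the loopback interface
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` when starting the launch file or the evaluation script, so the settings apply only to that process.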
- Jetson thermal issue
GPU uptime is very long given the length of the evaluation data, so make sure to keep the Jetson in a cool place. I am also using an external SSD for the dataset, and those get really hot while running as well.
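One way to keep an eye on temperatures during a long run is to poll the Linux thermal sysfs interface, which reports each zone in millidegrees Celsius. This is only a sketch: the function names are hypothetical, the 85 °C limit is an arbitrary example rather than a Jetson specification, and zone names vary by board.

```python
from pathlib import Path


def millideg_to_celsius(raw):
    """Convert a sysfs 'temp' reading (millidegrees Celsius) to degrees."""
    return int(raw.strip()) / 1000.0


def read_thermal_zones(sysfs_root="/sys/class/thermal"):
    """Return {zone type: temperature in Celsius} for readable thermal zones."""
    temps = {}
    for zone in sorted(Path(sysfs_root).glob("thermal_zone*")):
        try:
            name = (zone / "type").read_text().strip()
            temps[name] = millideg_to_celsius((zone / "temp").read_text())
        except (OSError, ValueError):
            continue  # skip zones that cannot be read right now
    return temps


def zones_over(temps, limit_c=85.0):
    """Return the sorted names of zones at or above limit_c (example limit)."""
    return sorted(name for name, t in temps.items() if t >= limit_c)
```

Calling `zones_over(read_thermal_zones())` periodically from a side terminal is enough to notice when the board is running hot before a sweep degrades.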
- Non-determinism
Sources include the GPU, multithreading, different platform architectures, TensorRT version, floating-point arithmetic, and different RANSAC seeds, just to name a few. The variance should be small; file an issue if your results are completely different from what I reported.
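To judge whether a result is "completely different" rather than ordinary run-to-run noise, it can help to repeat a sequence a few times and compare the spread with the reported number. The sketch below is an assumption about how one might do that; the function names and the 50% relative tolerance are illustrative, and the error values are whatever trajectory error metric you computed.

```python
import statistics


def summarize_runs(errors):
    """Return (mean, sample standard deviation) of per-run trajectory errors."""
    mean = statistics.fmean(errors)
    spread = statistics.stdev(errors) if len(errors) > 1 else 0.0
    return mean, spread


def deviates(reported, errors, rel_tol=0.5):
    """True if the mean error differs from the reported value by more than
    rel_tol * reported. rel_tol=0.5 is an arbitrary example threshold."""
    mean, _ = summarize_runs(errors)
    return abs(mean - reported) > rel_tol * reported
```

For example, three runs with errors `[0.10, 0.12, 0.11]` against a reported 0.10 stay within this tolerance, while runs averaging around 0.3 would be flagged.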