Official implementation of "CodedVO: Coded Visual Odometry" accepted in IEEE Robotics and Automation Letters, 2024.
Project page | IEEE Xplore | arXiv
If you use this code in your research, please cite:
@ARTICLE{codedvo2024,
author={Shah, Sachin and Rajyaguru, Naitri and Singh, Chahat Deep and Metzler, Christopher and Aloimonos, Yiannis},
journal={IEEE Robotics and Automation Letters},
title={CodedVO: Coded Visual Odometry},
year={2024},
doi={10.1109/LRA.2024.3416788}}

- A novel method for estimating monocular visual odometry that leverages RGB and metric depth estimates obtained through a phase mask on a standard 1-inch camera sensor.
- A depth-weighted loss function designed to prioritize learning depth maps at closer distances.
- Zero-shot evaluation on indoor scenes that does not require a scale factor.
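
To make the idea of a depth-weighted loss concrete, here is a minimal NumPy sketch. The inverse-depth weighting, the function name `depth_weighted_l1`, and the `eps` threshold are illustrative assumptions; this README does not state the exact formula, so treat this as one possible way to emphasize nearby pixels rather than the loss used for the released weights.

```python
import numpy as np


def depth_weighted_l1(pred, target, eps=1e-6):
    """L1 depth error with larger weights on closer pixels.

    ASSUMPTION: weights are proportional to 1 / ground-truth depth.
    This is an illustrative choice, not the formula from the paper.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # Ignore pixels without valid ground-truth depth.
    valid = target > eps
    if not valid.any():
        return 0.0

    weights = 1.0 / target[valid]
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return float(np.sum(weights * np.abs(pred[valid] - target[valid])))


# The same 0.5 m error costs more at 1 m than at 4 m.
near_error = depth_weighted_l1([1.5, 4.0], [1.0, 4.0])
far_error = depth_weighted_l1([1.0, 4.5], [1.0, 4.0])
```

With this weighting, an error on a close surface contributes more to the loss than an equal error on a distant one, which is the behavior the bullet above describes.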
git clone https://github.com/naitri/CodedVO
cd CodedVO
conda env create -f environment.yml

We provide a pre-trained model trained with our metric depth-weighted loss, which has been benchmarked on various indoor datasets. Download Pre-trained Models
We provide the training dataset, which includes the UMD-CodedVO dataset LivingRoom and NYU data, each containing 1000 images. The dataset also includes coded blur RGB images.
Additionally, we provide the UMD-CodedVO dataset, which includes ground truth depth, RGB images, coded blur RGB images, and trajectory information.
├── README.md
├── datasets
│   └── nyu_data
│       ├── rgb
│       ├── depth
│       └── Codedphasecam-27Linear
│   └── ...
├── scripts
│   └── ...
├── weights
│   └── ...
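
Before running any script, it can help to confirm that a dataset folder follows the layout above. The helper below is a small sketch, not part of the repository; the function name `missing_subdirs` is hypothetical, and the required folder names are taken from the tree shown here.

```python
from pathlib import Path

# Sub-folders expected inside a dataset root such as ./datasets/nyu_data.
REQUIRED_SUBDIRS = ("rgb", "depth", "Codedphasecam-27Linear")


def missing_subdirs(root):
    """Return the required sub-folders that are absent from `root`."""
    root = Path(root)
    return [name for name in REQUIRED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subdirs("./datasets/nyu_data")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```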
To generate coded blur RGB images from your own data, you can use the script coded_generator.py.
cd scripts
python coded_generator.py --root /path/to/your/data --scale_factor YOUR_SCALE_FACTOR

- The scale factor is 1000 for the NYUv2 dataset, 1 for the UMD-CodedVO dataset, and 5000 for the ICL-NUIM dataset.
- The root path should be the folder containing rgb, depth, and Codedphasecam-27Linear, e.g. ./datasets/nyu_data
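
The scale factors above can be read as the divisor that turns stored depth values into meters. The sketch below assumes that convention (metric depth = raw value / scale factor), which is common for NYUv2 and ICL-NUIM depth files but is not spelled out in this README; the dictionary keys and the function name `depth_to_meters` are hypothetical.

```python
# Scale factors quoted in this README. The keys are hypothetical labels.
SCALE_FACTORS = {
    "nyuv2": 1000,
    "umd_codedvo": 1,
    "icl_nuim": 5000,
}


def depth_to_meters(raw_depth, dataset):
    """Convert a raw depth value (or NumPy array) to meters.

    ASSUMPTION: metric depth = raw stored value / scale factor.
    """
    return raw_depth / SCALE_FACTORS[dataset]


# A raw NYUv2 value of 2500 would correspond to 2.5 m under this assumption.
nyu_example = depth_to_meters(2500, "nyuv2")
```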
Note: Our Point Spread Functions (PSFs) correspond to discretized depth layers using a 23×23 Zernike-parameterized phase mask. The depth range [0.5, 6] meters is discretized into 27 bins, and the focal distance is 85 cm.
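
As an illustration of the depth discretization in the note above, the following sketch maps a metric depth to one of 27 bins over [0.5, 6] meters. Equal-width (linear) spacing is an assumption suggested only by the folder name Codedphasecam-27Linear; the actual layer spacing used to generate the PSFs may differ, and the function names are hypothetical.

```python
NUM_BINS = 27
D_MIN, D_MAX = 0.5, 6.0  # depth range in meters, from the note above


def bin_edges(num_bins=NUM_BINS, d_min=D_MIN, d_max=D_MAX):
    """Return num_bins + 1 edges. ASSUMPTION: bins have equal width."""
    step = (d_max - d_min) / num_bins
    return [d_min + i * step for i in range(num_bins + 1)]


def depth_to_bin(depth_m, num_bins=NUM_BINS, d_min=D_MIN, d_max=D_MAX):
    """Map a depth in meters to a bin index in [0, num_bins - 1]."""
    if not d_min <= depth_m <= d_max:
        raise ValueError(f"depth {depth_m} m is outside [{d_min}, {d_max}] m")
    step = (d_max - d_min) / num_bins
    # Clamp so that depth_m == d_max falls into the last bin.
    return min(int((depth_m - d_min) / step), num_bins - 1)
```

Under this assumption each bin spans roughly 0.2 m, so the 85 cm focal distance falls into the second bin (index 1).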
To train on your own data or on our provided dataset:
python trainer.py --config MetricWeightedLossBlenderNYU --datasets /path/to/dataset/folder

- You can add different configurations for loss and depth space in config.py and use those configurations for training. In this example, we use MetricWeightedLossBlenderNYU, the configuration for our pre-trained weight file.
- You can also change the training or test dataset in config.py by modifying lines 19-31.
The evaluation script can be executed as follows:
python evaluate.py --CONFIG MetricWeightedLossBlenderNYU --DATASET /path/to/dataset/folder --OUTPUT /path/to/output/folder --CHECKPOINT /path/to/checkpoint/file

We use ORB-SLAM after disabling the loop closure. Predicted depth maps from the above models are used to compute the odometry. Follow the ORB-SLAM2 RGBD execution instructions. Note that we do not use coded blur RGB images directly. As mentioned in the paper, we apply unsharp masking on them for computing odometry.
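
For readers unfamiliar with unsharp masking, here is a minimal NumPy sketch of the standard technique: blur the image, then add back a scaled copy of the difference between the image and its blur. The `sigma` and `amount` values and all function names are illustrative assumptions; the parameters used in the paper are not given in this README.

```python
import numpy as np


def gaussian_kernel(sigma, radius):
    """1-D Gaussian kernel normalized to sum to 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def blur(channel, sigma):
    """Separable Gaussian blur of a 2-D array with reflect padding."""
    radius = int(3 * sigma)
    kernel = gaussian_kernel(sigma, radius)
    padded = np.pad(channel, radius, mode="reflect")
    # Convolve along rows, then along columns.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def unsharp_mask(image, sigma=2.0, amount=1.0):
    """Sharpen a float image in [0, 1], shaped (H, W) or (H, W, C).

    ASSUMPTION: sigma and amount are placeholder values.
    """
    image = np.asarray(image, dtype=np.float64)
    if image.ndim == 2:
        blurred = blur(image, sigma)
    else:
        blurred = np.stack(
            [blur(image[..., c], sigma) for c in range(image.shape[-1])],
            axis=-1)
    return np.clip(image + amount * (image - blurred), 0.0, 1.0)
```

The sharpened frames, together with the predicted depth maps, would then serve as the RGB-D input to the odometry pipeline described above.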
We would like to thank the authors of Phasecam3D and ORB-SLAM2 for open-sourcing their codebases.
If you have any questions, comments, or bug reports, feel free to open a GitHub issue or pull request, or e-mail the authors Naitri Rajyaguru or Sachin Shah.

