Preprocessing code for converting source datasets into the TLC-Calib dataset format used by Targetless LiDAR-Camera Calibration with Neural Gaussian Splatting (TLC-Calib) [RA-L 2026].
This repository currently supports the Waymo Open Dataset path. It is derived from DriveStudio, but keeps only the raw Waymo processing and TLC-Calib export flow needed for reproduction.
git clone https://github.com/zang09/TLC-Calib_preprocessing.git
cd TLC-Calib_preprocessing
conda create -n tlc-calib-preprocess python=3.10 -y
conda activate tlc-calib-preprocess
python -m pip install -r requirements.txt
python -m pip install --no-deps waymo-open-dataset-tf-2-11-0==1.6.0
Install and authenticate the Google Cloud SDK before downloading Waymo files. See the upstream DriveStudio Waymo guide for the original dataset access flow: https://github.com/ziyc/drivestudio/blob/main/docs/Waymo.md
Run all commands from the repository root. Raw files, DriveStudio-style intermediate outputs, and final TLC-Calib outputs stay inside this repository by default:
TLC-Calib_preprocessing/
├── raw/ # downloaded Waymo tfrecord files
├── processed/
│ ├── drivestudio/
│ │ ├── waymo/training/<scene_id>/ # intermediate DriveStudio-style scenes
│ │ └── ...
│ ├── tlc-calib/
│ │ ├── waymo/scene<scene_id>/ # final TLC-Calib exports
│ │ └── ...
│ └── ...
└── ...
Scene IDs follow the line index in data/waymo_train_list.txt.
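To check which scene a given ID refers to, the lookup can be sketched as below. This is a minimal sketch and not part of the repository: `scene_name_for_id` is a hypothetical helper, and it assumes the list holds one scene name per line and that scene IDs are zero-based line indices. Verify the indexing against the file before relying on it.

```python
from pathlib import Path


def scene_name_for_id(list_path, scene_id):
    """Return the entry found on line `scene_id` of the scene list.

    Assumptions (not confirmed by the repository): one scene name per
    line, and zero-based line indexing.
    """
    lines = Path(list_path).read_text().splitlines()
    if not 0 <= scene_id < len(lines):
        raise IndexError(
            f"scene_id {scene_id} is outside the valid range 0..{len(lines) - 1}"
        )
    return lines[scene_id].strip()
```

For example, `scene_name_for_id("data/waymo_train_list.txt", 81)` would return the list entry that scene ID 81 points to under these assumptions.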
python datasets/waymo/waymo_download.py \
--target_dir raw \
--scene_ids 81 226 362
TLC-Calib export needs images, LiDAR, camera calibration, and ego poses:
export PYTHONPATH=$(pwd)
python datasets/preprocess.py \
--data_root raw \
--target_dir processed/drivestudio/waymo \
--split training \
--scene_ids 81 226 362 \
--workers 8 \
--process_keys images lidar calib pose
This creates:
processed/drivestudio/waymo/training/{scene_id}/
Export one scene:
python data_process.py \
--output_root processed/tlc-calib \
dataset=waymo/5cams \
data.scene_idx=81 \
data.start_timestep=0 \
data.end_timestep=79 \
data.pixel_source.load_sky_mask=False \
data.pixel_source.load_dynamic_mask=False \
data.pixel_source.load_objects=False \
data.pixel_source.load_smpl=False \
data.lidar_source.only_use_top_lidar=True \
data.delete_out_of_view_points=False
data.end_timestep is inclusive. The example exports frames 0..79 to:
processed/tlc-calib/waymo/scene81/
Each exported scene follows the TLC-Calib dataset layout:
scene_root/
├── images/
│ ├── image_00/000000.png
│ ├── image_01/000000.png
│ └── ...
├── lidar/
│ ├── map.ply
│ └── rgb_map.ply
├── params/
│ ├── cam0.txt
│ ├── cam0_to_lidar.txt
│ ├── cams_to_lidar_gt.txt
│ ├── intrinsics.txt
│ ├── lidars.txt
│ └── timestamps.txt
├── pcds/000000.pcd
├── README.md
└── valid_frame.txt
pcds/*.pcd are binary PCD files with x y z intensity fields.
params/cam*.txt and params/lidars.txt contain per-frame row-major 4 x 4 poses.
params/cam*_to_lidar.txt and params/cams_to_lidar_gt.txt contain [camera_id, flattened 4 x 4 matrix] rows.
valid_frame.txt maps local exported indices to original Waymo frame indices.
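The pose and extrinsic files described above can be read with a few lines of plain Python. This is a rough sketch under stated assumptions, not the loader used by TLC-Calib: it assumes each matrix occupies one text line with whitespace- or comma-separated values, and `parse_pose_rows` and `parse_cam_to_lidar_rows` are hypothetical names.

```python
def _to_matrix(values):
    """Reshape 16 row-major values into a 4 x 4 nested list."""
    return [values[r * 4:(r + 1) * 4] for r in range(4)]


def _numeric_rows(text):
    """Yield (line_number, floats) for each non-empty line."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Assumption: values are separated by whitespace or commas.
        values = [float(v) for v in line.replace(",", " ").split()]
        if values:
            yield line_no, values


def parse_pose_rows(text):
    """Parse per-frame poses: 16 row-major values per line."""
    poses = []
    for line_no, values in _numeric_rows(text):
        if len(values) != 16:
            raise ValueError(f"line {line_no}: expected 16 values, got {len(values)}")
        poses.append(_to_matrix(values))
    return poses


def parse_cam_to_lidar_rows(text):
    """Parse rows of [camera_id, 16 row-major values] into {camera_id: 4 x 4}."""
    extrinsics = {}
    for line_no, values in _numeric_rows(text):
        if len(values) != 17:
            raise ValueError(f"line {line_no}: expected 17 values, got {len(values)}")
        extrinsics[int(values[0])] = _to_matrix(values[1:])
    return extrinsics


if __name__ == "__main__":
    identity = "1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1"
    print(parse_pose_rows(identity + "\n"))
    print(parse_cam_to_lidar_rows("0 " + identity + "\n"))
```

With real data, the text would come from something like `open("scene_root/params/lidars.txt").read()`. If the files use a different delimiter or layout, the splitting step needs to be adjusted accordingly.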
This code is released under the MIT License; see LICENSE.
Users must follow the Waymo Open Dataset terms separately.
This preprocessing repository is built upon the DriveStudio codebase.
If you use this dataset or the preprocessing format, please cite TLC-Calib and OmniRe:
@article{jung2026targetless,
title = {{Targetless LiDAR-Camera Calibration with Neural Gaussian Splatting}},
author = {Jung, Haebeom and Kim, Namtae and Kim, Jungwoo and Park, Jaesik},
journal = {IEEE Robotics and Automation Letters (RA-L)},
year = {2026}
}
@inproceedings{chen2025omnire,
title = {OmniRe: Omni Urban Scene Reconstruction},
author = {Ziyu Chen and Jiawei Yang and Jiahui Huang and Riccardo de Lutio and Janick Martinez Esturo and Boris Ivanovic and Or Litany and Zan Gojcic and Sanja Fidler and Marco Pavone and Li Song and Yue Wang},
booktitle = {The Thirteenth International Conference on Learning Representations},
year = {2025}
}