Hermann Blum¹,³ · Alessandro Mercurio¹ · Joshua O'Reilly¹ · Tim Engelbracht¹ · Mihai Dusmanu² · Marc Pollefeys¹,² · Zuria Bauer¹
¹ETH Zürich  ²Microsoft  ³Lamar Institute / Uni Bonn
CroCoDL: the first dataset to contain sensor recordings from real-world robots, phones, and mixed-reality headsets, covering a total of 10 challenging locations to benchmark cross-device and human-robot visual registration.
This repository hosts the source code for CroCoDL, the first dataset to contain sensor recordings from real-world robots, phones, and mixed-reality headsets, covering a total of 10 challenging locations to benchmark cross-device and human-robot visual registration. The contributions of this work are:
- The (to the best of our knowledge) largest real-world cross-device visual localization dataset, focusing on diverse capture setups and environments.
- A novel benchmark on cross-device visual registration that shows considerable limitations of current state-of-the-art methods.
- Integration of the sensor streams of Boston Dynamics' Spot robot into LaMAR's pseudo-GT pipeline. We will release the code for the data pre-processing and the required changes to the pipeline.
Here is a quick breakdown of the repository:
```
crocodile-benchmark/
├── assets/                  # README.md images
├── lamar/                   # Benchmarking pipeline code
├── pipelines/               # End-to-end pipelines for processing data
├── run_scripts/             # Convenience bash scripts for running pipelines
├── scantools/               # Processing pipeline code
├── scripts/                 # Convenience external module installation bash scripts
├── RAW-DATA.md              # Information about raw data format
├── CAPTURE.md               # Information about capture format
├── DATA.md                  # Information about data release structure
├── Dockerfile               # Docker container definition
└── location_release.xlsx    # Sheet containing information about data release locations
```
Setting up our pipeline is similar to setting up LaMAR, with added dependencies. You can set it up either locally or using Docker. The local installation has been tested with:
- Ubuntu 20.04 and CUDA 12.1
- Ubuntu 22.04 and CUDA XX.X (lamar machine)
```bash
git clone git@github.com:cvg/crocodl-benchmark.git
cd crocodl-benchmark
conda create -n croco python=3.10 pip
conda activate croco
```
We used conda, but you could also use venv, as sketched below.
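For reference, a minimal venv-based setup could look like this (assuming Python 3.10 is available on your system):

```bash
# Alternative to conda: create and activate a virtual environment with venv
python3.10 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
```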
Depending on whether you want to use only the benchmarking pipeline or the processing pipeline as well, you can run:
```bash
chmod +x ./scripts/*
./scripts/install_all_dependencies.sh
```
for both processing and benchmarking pipelines, or:
```bash
chmod +x ./scripts/*
./scripts/install_benchmarking_dependencies.sh
```
for the benchmarking dependencies only. The full set of dependencies installed by `install_all_dependencies.sh` is (in order):
- Ceres Solver 2.1 (processing and benchmarking)
- COLMAP 3.8 (processing and benchmarking)
- hloc 1.4 (processing and benchmarking)
- raybender (processing)
- pcdmeshing (processing)
You can also install these individually using the provided scripts `./scripts/install_{name_of_the_package}`. The last two are only required by the processing pipeline and are not installed by `install_benchmarking_dependencies.sh`. Next, the additional Python dependencies need to be installed. You can do this by running:
```bash
python -m pip install -e .
```
for the benchmarking pipeline only. If you wish to use the processing pipeline too, also run:
```bash
python -m pip install -e .[scantools]
```
Lastly, if you wish to contribute, run:
```bash
python -m pip install -e .[dev]
```
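To sanity-check the editable install, you can try importing the top-level packages; the module names `lamar` and `scantools` are assumptions based on the repository layout:

```bash
# Quick smoke test; scantools only imports if the [scantools] extra was installed
python -c "import lamar; print('lamar OK')"
python -c "import scantools; print('scantools OK')"
```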
The Dockerfile provided in this project has multiple stages, two of which are `scantools` and `lamar`, used for processing and benchmarking, respectively. You can build these images using:
```bash
docker build --target scantools -t croco:scantools -f Dockerfile ./
docker build --target lamar -t croco:lamar -f Dockerfile ./
```
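To work with data on the host, you can mount it into the container when running one of these images; the host path below is a placeholder, and GPU passthrough assumes the NVIDIA Container Toolkit is installed:

```bash
# Run the benchmarking image with GPU access and a mounted data directory
docker run --gpus all -it \
  -v /path/to/your/data:/data \
  croco:lamar
```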
In this section we list the available scripts and describe how to run our pipeline both locally and in Docker. For simplicity, we only list the scripts that you run directly through the bash wrappers. To better understand the folder structure, have a look at our data section.
Processing transforms raw data sessions into the capture format, aligns capture sessions to the ground-truth scan, aligns sessions across devices, creates the map/query split, and finally prunes the query sessions. In processing order, these are the `run_{script_name}.py` scripts used to process the data (a sketch of the resulting capture-format layout follows the list):
- `scantools/run_merge_bagfiles.py` - Combines Nuc and Orin bagfiles into a single, merged bagfile. Output: `{session_id}-{scene_name}.bag` for each pair of Nuc and Orin bagfiles given by the input txt file. Scene names are needed for further processing that is custom to each location.
- `scantools/run_spot_to_capture.py` - Processes all merged bagfiles from a folder into capture-format Spot sessions. Output: a `sessions/spot_{session_id}/` capture-format folder for each session in the input folder.
- `scantools/run_phone_to_capture.py` - Processes all raw iOS sessions into the capture format. Output: a `sessions/ios_{session_id}/` capture-format folder for each phone session inside the input folder.
- `scantools/run_navvis_to_capture.py` - Processes a given raw NavVis session into the capture format. Output: the `sessions/{navvis_session_id}/` capture-format folder of the NavVis scan.
- `scantools/run_combine_navvis_sessions.py` - Combines and aligns multiple NavVis sessions in capture format into a single NavVis session. Output: the `sessions/{navvis_session_id_1}+...+{navvis_session_id_m}/` capture-format folder of the combined NavVis scan.
- `scantools/run_meshing.py` - Creates meshes from the point clouds of the NavVis scan, and also simplifies the meshes for visualization. Output: `sessions/{navvis_session_id_1}+...+{navvis_session_id_m}/proc/meshes/*` for the given NavVis scan in capture format.
- `scantools/run_rendering.py` - Renders the meshes and calculates depth maps. Output: a `sessions/{navvis_session_id_1}+...+{navvis_session_id_m}/raw_data/{session_id_i}/renderer/` depth map for the images of the given NavVis scan mesh.
- `pipelines/pipeline_scans.py` - Combines scripts 4-7 into a single pipeline for end-to-end processing of NavVis scans into the capture format. Output: the `sessions/{navvis_session_id_1}+...+{navvis_session_id_m}/` capture-format folder of the combined NavVis scan.
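As a rough orientation, a capture-format session produced by the scripts above is laid out as sketched below; this assumes CroCoDL follows LaMAR's capture convention, so see CAPTURE.md for the authoritative definition:

```
sessions/{session_id}/
├── sensors.txt       # sensor definitions (cameras, IMUs, ...)
├── rigs.txt          # fixed transforms between the sensors of a rig
├── trajectories.txt  # poses per timestamp and sensor/rig
├── images.txt        # image timestamps and relative file paths
└── raw_data/         # the recorded data itself (images, depth, ...)
```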
- `scantools/run_sequence_aligner.py` - Aligns a given session to the ground-truth NavVis scan. Output: a `sessions/{session_id}/proc/` folder with alignment trajectories, and `registration/{session_id}/` folders with image features, matches, and correspondences.
- `scantools/run_joint_refinement.py` - Refines the alignment trajectories of the given sessions by co-alignment. Output: `registration/{session_id}/{covisible_session_id}/` folders with matches/correspondences, and updated aligned trajectories in `registration/{session_id}/trajectory_refined.txt` (see the excerpt after this list).
- `pipelines/pipeline_sequences.py` - Combines the two scripts above into a pipeline for aligning the sessions listed in `.txt` files. Output: `sessions/{session_id}/` and `registration/{session_id}/` with alignment information.
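For reference, in LaMAR's capture convention the trajectory files written by these scripts are comma-separated text with one pose per line; the column layout is assumed from that convention and the values below are purely illustrative:

```
# timestamp, device_id, qw, qx, qy, qz, tx, ty, tz
1623161209513, cam_phone, 0.701, 0.004, 0.713, -0.012, 1.52, 0.33, -4.18
```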
- `scantools/run_combine_sequences.py` - Combines multiple capture sessions into a single capture session. Output: a `{combined_session_id}/` folder with the combined sessions in capture format.
- `scantools/run_map_query_split_manual.py` - Creates the map and query splits using the script above and the `.txt` inputs in `location/*.txt`. Also transforms the map sessions such that they are randomized. Output: a `{combined_session_id}/` folder with the map/query split in capture format for all selected devices, the applied transformation in `transformations.txt`, and visualizations of all intermediate steps in `visualizations/`.
- `scantools/run_query_pruning.py` - Prunes the query trajectories of all devices such that all parts of the location are covered by each query trajectory, and subsamples them to achieve an equal density overall. Output: `{map/query_session_id}/proc/keyframes_*.txt` files containing all keyframes selected by the algorithm in each of its steps (original, after pruning, and after subsampling), visualizations of all intermediate steps in `visualizations/`, and the configuration file `query_pruning_config.txt` used for pruning. A quick way to compare the stages is sketched below.
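Since each pruning stage writes its own keyframe list, you can see how aggressively a trajectory was reduced by comparing line counts; the session path here is a placeholder:

```bash
# Compare keyframe counts across the pruning stages of one query session
wc -l sessions/<query_session_id>/proc/keyframes_*.txt
```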
- `scantools/run_visualize_trajectories.py` - Visualizes all available trajectories for the selected devices. Output: `visualizations/trajectories/trajectory_{device}.png`.
- `scantools/run_visualize_map_query.py` - Visualizes the map and query overlap for the selected devices. Output: `visualizations/map_query/trajectory_{device}.png`.
- `scantools/run_visualize_map_query_matrix.py` - Visualizes the matrix of map and query devices for all selected devices. Output: `visualizations/map_query/matrix_{device_list}.png`.
- `scantools/run_visualize_map_query_renders.py` - Visualizes a comparison of renders and raw images for all available map/query sessions, and also saves a video of these images stitched together. Output: `visualizations/renders/{device}_{map/query}/*.png` and `visualizations/render_videos/{device}_{map/query}.mp4`.
After fully running the processing pipeline and confirming the results with the visualizations, you can run the benchmarking pipeline. You can choose between the original keyframes, the ones generated after pruning, or the ones generated after subsampling. These can be found in the corresponding `{map/query_session_id}/proc/keyframes_*.txt`, where `*` can be `original`, `_pruned`, or `_pruned_subsampled`.
- `lamar/run.py` - Runs the benchmarking for the given pair of map and query capture sessions (an illustrative invocation follows). Output: a `benchmarking/` folder containing all intermediate data for benchmarking: features, matches, etc.
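An invocation could look like the following; the flag names mirror LaMAR's `run.py` interface and are assumptions here, so check `python -m lamar.run --help` for the actual arguments:

```bash
# Hypothetical benchmarking call: localize phone queries against a Spot map
python -m lamar.run \
    --scene <location> --ref_id map_spot --query_id query_phone \
    --retrieval netvlad --feature superpoint --matcher superglue
```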
If you are running our pipeline locally, you can use the following example bash scripts with arguments:
- `run_scripts/run_merge_spot.sh` - Runs `scantools/run_merge_bagfiles.py` locally.
- `run_scripts/run_spot_to_capture.sh` - Runs `scantools/run_spot_to_capture.py` locally.
- `run_scripts/run_phone_to_capture.sh` - Runs `scantools/run_phone_to_capture.py` locally.
- `run_scripts/run_process_navvis.sh` - Runs `pipelines/pipeline_scans.py` locally.
- `run_scripts/run_align_sessions.sh` - Runs `pipelines/pipeline_sequences.py` locally.
- `run_scripts/run_map_query_split.sh` - Runs `scantools/run_map_query_split_manual.py` locally.
- `run_scripts/run_query_pruning.sh` - Runs `scantools/run_query_pruning.py` locally.
- `run_scripts/run_vis_trajectories.sh` - Runs `scantools/run_visualize_trajectories.py` locally.
- `run_scripts/run_vis_map_query.sh` - Runs `scantools/run_visualize_map_query.py` locally.
- `run_scripts/run_vis_map_query_matrix.sh` - Runs `scantools/run_visualize_map_query_matrix.py` for all device combinations locally.
- `run_scripts/run_vis_map_query_renders.sh` - Runs `scantools/run_visualize_map_query_renders.py` for all available map/query sessions locally.
- `run_scripts/run_benchmarking.sh` - Runs `lamar/run.py` locally.
If you are running our pipeline in Docker, you can use the following example bash scripts with arguments:
- `run_scripts/docker_run_merge_spot.sh` - Runs `scantools/run_merge_bagfiles.py` in a Docker container.
- `run_scripts/docker_run_spot_to_capture.sh` - Runs `scantools/run_spot_to_capture.py` in a Docker container.
- `run_scripts/docker_run_phone_to_capture.sh` - Runs `scantools/run_phone_to_capture.py` in a Docker container.
- `run_scripts/docker_run_process_navvis.sh` - Runs `pipelines/pipeline_scans.py` in a Docker container.
- `run_scripts/docker_run_align_sessions.sh` - Runs `pipelines/pipeline_sequences.py` in a Docker container.
- `run_scripts/docker_run_map_query_split.sh` - Runs `scantools/run_map_query_split_manual.py` in a Docker container.
- `run_scripts/docker_run_query_pruning.sh` - Runs `scantools/run_query_pruning.py` in a Docker container.
- `run_scripts/docker_run_vis_trajectories.sh` - Runs `scantools/run_visualize_trajectories.py` in a Docker container.
- `run_scripts/docker_run_vis_map_query.sh` - Runs `scantools/run_visualize_map_query.py` in a Docker container.
- `run_scripts/docker_run_vis_map_query_matrix.sh` - Runs `scantools/run_visualize_map_query_matrix.py` for all device combinations in a Docker container.
- `run_scripts/docker_run_vis_map_query_renders.sh` - Runs `scantools/run_visualize_map_query_renders.py` for all available map/query sessions in a Docker container.
- `run_scripts/docker_run_benchmarking.sh` - Runs `lamar/run.py` in a Docker container.
If you want to read more about the data we provide and how to download it, have a look here.
Please consider citing our work if you use any code from this repo or ideas presented in the paper:
```bibtex
@InProceedings{Blum_2025_CVPR,
    author    = {Blum, Hermann and Mercurio, Alessandro and O'Reilly, Joshua and Engelbracht, Tim and Dusmanu, Mihai and Pollefeys, Marc and Bauer, Zuria},
    title     = {CroCoDL: Cross-device Collaborative Dataset for Localization},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {27424-27434}
}
```