Datasets/cumulti#41

Merged
contagon merged 8 commits into master from datasets/cumulti
Sep 10, 2025
Conversation

@DanMcGann
Collaborator

Cu-Multi

This PR adds a dataset description for the CU-Multi Dataset: https://arxiv.org/abs/2505.17576

A few notes on CU-Multi

  • The CU-Multi dataset is currently under revision, but this dataset description references what will be the final release of the dataset.
  • The ground truth files have not yet been uploaded to the file hosting service. @contagon, see your email for a temporary fix.
  • The accel/gyro bias noise parameters for the MicroStrain IMU do not appear to be publicly available. There may be an opportunity to compute these values internally, but for now the dataset uses default values of 1e-6.

ROS2 Bag Support

This PR also updates the usage of rosbags.AnyReader. The AnyReader documentation implies that it can handle either multiple ros1 .bag files OR a single ros2 bag/ directory, but the actual implementation permits reading multiple ros2 bag/ directories, much like multiple ros1 .bag files. The loader changes are minor and should be backwards compatible: the mcap flag is renamed to is_ros2, and the loaders now glob over all sub-directories to permit multiple ros2 bag/ directories.

@DanMcGann DanMcGann requested a review from contagon August 6, 2025 16:15
@DanMcGann
Collaborator Author

Okay, the test failure is because I broke part of the API that I did not realize we used.

I think that the desired API is that:
RosbagIter can take as input 1) a file or 2) a directory.
If provided a file, it must be a .bag file and we read that bag.
If the input is a directory, however, there are 3 things that we may want to do:

  • Read this directory as a single ros2 bag
  • Glob this directory for *.bag files
  • Glob this directory for subdirectories */ (i.e. multiple ros2 bags)

Currently the API doesn't let us disambiguate between the options when a directory is passed.
A simple solution is to have 2 flags: glob indicating we should glob for sub-bags and is_ros2 indicating the type of bag file we want / should glob for.
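
A minimal sketch of how the two proposed flags could disambiguate the three directory cases. The helper name `resolve_bag_paths` and its signature are hypothetical, used only for illustration; this is not the code in the PR.

```python
from pathlib import Path


def resolve_bag_paths(path: Path, glob: bool = False, is_ros2: bool = False) -> list[Path]:
    """Hypothetical helper: turn a user-supplied path into a list of bags to read."""
    if path.is_file():
        # A single file must be a ROS1 .bag file.
        if path.suffix != ".bag":
            raise ValueError(f"expected a .bag file, got {path}")
        return [path]
    if not path.is_dir():
        raise FileNotFoundError(path)
    if not glob:
        # Case 1: treat the directory itself as one ROS2 bag.
        return [path]
    if is_ros2:
        # Case 3: every sub-directory is assumed to be a ROS2 bag.
        return sorted(p for p in path.iterdir() if p.is_dir())
    # Case 2: every *.bag file in the directory is a ROS1 bag.
    return sorted(path.glob("*.bag"))
```

With this shape, `glob=False` keeps the single-bag behavior, while `glob=True` combined with `is_ros2` selects which kind of sub-bag to look for.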

Happy to make these changes, but wanted to check the proposed API changes first.
-Dan

@contagon
Owner

contagon commented Aug 6, 2025

Hmm, this is a great catch. There are a few trajectories in oxford_spires with multiple ros2 bags that I never took the time to figure out; this would fix those as well.

Ideally, we'd like this to just work and magically infer the ros version and whether we should glob or not. is_mcap was a bit of a hack I added while under some deadline pressure. I propose the following when a directory is received:

  • I think we should be able to infer the version by first globbing for any *.bag files in the directory. If there are any, it's Ros1; otherwise it's Ros2. Then proceed down the Ros1 or Ros2 path as needed.
  • If Ros1, glob all the bags.
  • If Ros2, glob all the directories containing a yaml file. (All ros2 bags have to have a yaml file, right?) I do a similar thing in stats when searching for bottom-level directories.
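
The inference steps above can be sketched roughly as follows. `infer_bags` is a hypothetical name, and the sketch assumes that any directory containing a `*.yaml` file is a ros2 bag, which may be too permissive for directories that hold other yaml files.

```python
from pathlib import Path


def infer_bags(path: Path) -> tuple[bool, list[Path]]:
    """Hypothetical helper: return (is_ros2, bag_paths) for a file or directory."""
    if path.is_file():
        # Assumption: a lone file is a ROS1 .bag.
        return False, [path]
    # Any *.bag file directly in the directory means ROS1.
    ros1_bags = sorted(path.glob("*.bag"))
    if ros1_bags:
        return False, ros1_bags
    # Otherwise assume ROS2: a bag is any directory (the input itself or a
    # nested one) that contains a yaml file such as metadata.yaml.
    ros2_bags = sorted({p.parent for p in path.rglob("*.yaml")})
    return True, ros2_bags
```

Because the yaml search includes the input directory itself, a path to a single ros2 bag and a path to a directory of ros2 bags go through the same code path.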

Do you think it's still worth having parameters to override the inferred behavior? I think we should be able to make it smart enough that it just handles all these cases; I was just too busy to code it up previously.

Should now automatically accept:
- A path to a ros1 or ros2 bag
- A path to a dir containing multiple ros1 or ros2 bags
@contagon
Owner

This looks great Dan, I'll merge this once the ground truth is publicly available.

I am hitting a snag though: when running main_campus_robot1 I get this traceback:

Traceback (most recent call last):
  File "/home/contagon/.local/share/uv/python/cpython-3.12.5-linux-x86_64-gnu/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/contagon/.local/share/uv/python/cpython-3.12.5-linux-x86_64-gnu/lib/python3.12/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/contagon/Research/evalio/python/evalio/cli/run.py", line 232, in run_single
    for data in dbuilder.build():
                ^^^^^^^^^^^^^^^^
  File "/home/contagon/Research/evalio/python/evalio/datasets/loaders.py", line 170, in __iter__
    msg = self.reader.deserialize(rawdata, connection.msgtype)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/contagon/Research/evalio/.venv/lib/python3.12/site-packages/rosbags/highlevel/anyreader.py", line 107, in deserialize
    return self._deser_ros2(rawdata, typ) if self.is2 else self._deser_ros1(rawdata, typ)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/contagon/Research/evalio/.venv/lib/python3.12/site-packages/rosbags/highlevel/anyreader.py", line 103, in _deser_ros2
    return self.typestore.deserialize_cdr(rawdata, typ)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/contagon/Research/evalio/.venv/lib/python3.12/site-packages/rosbags/typesys/store.py", line 134, in deserialize_cdr
    msgdef = self.get_msgdef(typename)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/contagon/Research/evalio/.venv/lib/python3.12/site-packages/rosbags/typesys/store.py", line 313, in get_msgdef
    entries = self.fielddefs[typename][1]
              ~~~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'sensor_msgs/msg/PointCloud2'

I did notice in the metadata.yaml the topic is missing the type_description_hash which the kittredge loop trajectories have. I'm curious if that's related at all.

Are you able to recreate this at all? It's an error I definitely haven't seen before.

@contagon
Owner

Got this figured out - it appears that the lidar bags of the main robot sequences don't have embedded message definitions!

rosbags will try to load some default message definitions only if none are found in any of the bags. Since there are message definitions in the imu bags, none of those defaults were being loaded!

The fix was to forcibly load some message definitions using some rosbags internal API:

if type_store is not None:
    self.reader.typestore.register(get_typestore(type_store).fielddefs)
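
To make the failure mode concrete, here is a toy model of the fallback behavior as described in this comment. It uses plain dictionaries rather than the rosbags API; `DEFAULTS`, `build_store`, and `force_defaults` are illustrative names only.

```python
# Toy stand-in for a default set of message definitions.
DEFAULTS = {
    "sensor_msgs/msg/PointCloud2": "<definition>",
    "sensor_msgs/msg/Imu": "<definition>",
}


def build_store(bags: list[dict[str, str]], force_defaults: bool = False) -> dict[str, str]:
    """Merge the definitions embedded in each bag into one store."""
    store: dict[str, str] = {}
    for embedded in bags:
        store.update(embedded)
    # Fallback as described above: defaults load only when no bag has
    # definitions, unless the caller forces them (the fix).
    if not store or force_defaults:
        store = {**DEFAULTS, **store}
    return store


lidar_bag: dict[str, str] = {}  # no embedded definitions
imu_bag = {"sensor_msgs/msg/Imu": "<definition>"}

store = build_store([lidar_bag, imu_bag])
print("sensor_msgs/msg/PointCloud2" in store)  # False, so lookup raises KeyError

fixed = build_store([lidar_bag, imu_bag], force_defaults=True)
print("sensor_msgs/msg/PointCloud2" in fixed)  # True
```

The imu bag's embedded definitions suppress the fallback, so the lidar type is never registered until the defaults are forced in.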

I'll push the fix here soon. It may be worth reporting this to the original authors, in case they want to tweak those rosbags.

@DanMcGann
Collaborator Author

The root issue is definitely related to the rosbags themselves.
My local copy contains the correct metadata, but the current version on the file-hosting platform (which looks like it was updated a few days ago) is clearly missing it.
I will pass this along to the authors.

DanMcGann and others added 2 commits September 6, 2025 11:26
- Updates the Lidar - IMU extrinsics
- Fixes LiDAR rate
- Fixes LiDAR stamp convention
@contagon
Owner

Just pushed the tweak to manually load typestores, removed that strange legacy "orig" check, and made some formatting tweaks. Should be good to merge now! :)

@contagon contagon merged commit 23b213f into master Sep 10, 2025
12 checks passed
@contagon contagon deleted the datasets/cumulti branch September 10, 2025 21:32