
feat(perception): add ORB-SLAM3 native module#1598

Open
leshy wants to merge 9 commits into dev from feat/orbslam3-native-module


@leshy
Contributor

@leshy leshy commented Mar 17, 2026

part of monocular nav experiment #1596 but ORB-SLAM3 is a good thing to have around in general

otherwise extremely light native-module

TBH I don't understand rerun transforms (I keep forgetting and re-reading the docs), and I'm not sure our bridge API is the most convenient in this context, so I will review. This doesn't give you a nice moving camera frame, just odom and image messages. I will implement that later; it's a rerun config issue.

Adds ORB-SLAM3 visual SLAM as a NativeModule under perception/slam/,
following the same pattern as FastLIO2. Phase 1: nix build from source,
binary initializes System and blocks — no LCM I/O yet.

- Python module with OrbSlam3Config and perception.Odometry protocol
- C++ wrapper using dimos::NativeModule for CLI arg parsing
- Nix flake that builds ORB-SLAM3 from github:thuvasooriya/orb-slam3
- Default RealSense D435i camera settings
- Vocab auto-resolved from nix store path
@jeff-hykin
Member

So we've got a problem (I'm having this with stuff in the nav stack too) ... ORB-SLAM3 is copyleft, GPLv3. We can't really include it in our codebase.

I think for a lot of our native modules we're going to need a way to keep it in a separate codebase and then pull it in at runtime if the user opts to use the module.

@leshy
Contributor Author

leshy commented Mar 18, 2026

> So we've got a problem (I'm having this with stuff in the nav stack too) ... ORB-SLAM3 is copyleft, GPLv3. We can't really include it in our codebase.
>
> I think for a lot of our native modules we're going to need a way to keep it in a separate codebase and then pull it in at runtime if the user opts to use the module.

True. Although this doesn't include ORB-SLAM3 code, main.cpp links against ORB-SLAM3, so we might want to put this in a separate GPL-3.0 repo. Edit: done.

leshy added 4 commits March 18, 2026 16:25
ORB-SLAM3 is GPL-3.0, incompatible with our Apache 2.0 license.
Moved C++ native module (main.cpp, CMakeLists, flake.nix) and
ORB-SLAM3 configs to dimensionalOS/dimos-orb-slam3.

The Python wrapper now builds from the external flake via
`nix build github:dimensionalOS/dimos-orb-slam3` into a local
cache directory. IPC boundary via LCM is unchanged.
Add SensorMode enum, remove model_post_init, use module directory
for cwd, pin build to dimos-orb-slam3/v0.1.0. Gitignore **/result.
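The build step described in these commit messages could look roughly like the sketch below. It is a minimal illustration, not the wrapper's actual code: the helper names `nix_build_command` and `ensure_built` are hypothetical, and the flake reference assumes the `v0.1.0` pin is a git tag on `dimensionalOS/dimos-orb-slam3`. The only external call is `nix build <flake-ref> --out-link <path>`.

```python
import subprocess
from pathlib import Path

# Assumption: the pin mentioned in the commit message is a tag on the external repo.
FLAKE_REF = "github:dimensionalOS/dimos-orb-slam3/v0.1.0"


def nix_build_command(out_link: Path, flake_ref: str = FLAKE_REF) -> list[str]:
    """Return the argv for building the external flake into `out_link`."""
    return ["nix", "build", flake_ref, "--out-link", str(out_link)]


def ensure_built(module_dir: Path) -> Path:
    """Build the native module once and reuse the `result` symlink afterwards."""
    result = module_dir / "result"
    if not result.exists():
        # check=True raises CalledProcessError if the nix build fails.
        subprocess.run(nix_build_command(result), cwd=module_dir, check=True)
    return result
```

Keeping the GPL-3.0 sources behind a flake reference means the Apache-2.0 repository only ever stores the command that fetches and builds them, which is also why `**/result` is gitignored.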
@leshy leshy marked this pull request as ready for review March 18, 2026 09:38
@dimensionalOS dimensionalOS deleted a comment from greptile-apps bot Mar 18, 2026
@jeff-hykin
Member

if you add a comment about that weirdness (basically brokenness) you were showing me last night, then it looks good to me

Documents the known transform mismatch where reconstructed trajectory
diverges from ground-truth poses. Needs investigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@leshy
Contributor Author

leshy commented Mar 26, 2026

> if you add a comment about that weirdness (basically brokenness) you were showing me last night, then it looks good to me

done

…odule

# Conflicts:
#	dimos/robot/all_blueprints.py
Copilot AI review requested due to automatic review settings March 26, 2026 14:16
@leshy leshy enabled auto-merge (squash) March 26, 2026 14:16

Copilot AI left a comment

Pull request overview

Adds an ORB-SLAM3-based SLAM capability to the perception stack via a thin NativeModule wrapper, plus a runnable webcam blueprint, as part of the monocular navigation experiment work.

Changes:

  • Register new ORB-SLAM3 module and webcam blueprint in the global blueprint/module registries.
  • Add OrbSlam3 native-module wrapper + webcam autoconnect blueprint + README.
  • Improve Webcam camera_info defaults and update repo .gitignore to ignore Nix build result outputs.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| dimos/robot/all_blueprints.py | Registers orbslam3-webcam blueprint and orbslam3-module module entrypoints. |
| dimos/perception/slam/orbslam3/module.py | Introduces OrbSlam3 NativeModule wrapper and configuration for ORB-SLAM3 subprocess integration. |
| dimos/perception/slam/orbslam3/blueprints/webcam.py | Adds a simple autoconnect blueprint wiring webcam → ORB-SLAM3 → visualization. |
| dimos/perception/slam/orbslam3/README.md | Documents the new native module and current known issues. |
| dimos/hardware/sensors/camera/webcam.py | Switches camera_info default to pydantic.Field(default_factory=...) and fills width/height when unset. |
| .gitignore | Ignores Nix build result directories repository-wide. |

Comment on lines +18 to +21
from dimos.visualization.rerun.bridge import _resolve_viewer_mode, rerun_bridge

orbslam3_webcam = autoconnect(
    rerun_bridge(viewer_mode=_resolve_viewer_mode()),

Copilot AI Mar 26, 2026

dimos.visualization.rerun.bridge doesn’t define rerun_bridge (only RerunBridgeModule / run_bridge exist), so this import will raise at runtime when loading the orbslam3-webcam blueprint. Switch to RerunBridgeModule.blueprint(...) (as done in other blueprints) or add/rename the intended helper in the bridge module.

Suggested change
- from dimos.visualization.rerun.bridge import _resolve_viewer_mode, rerun_bridge
- orbslam3_webcam = autoconnect(
-     rerun_bridge(viewer_mode=_resolve_viewer_mode()),
+ from dimos.visualization.rerun.bridge import _resolve_viewer_mode, RerunBridgeModule
+ orbslam3_webcam = autoconnect(
+     RerunBridgeModule.blueprint(viewer_mode=_resolve_viewer_mode()),

Comment on lines +48 to +55
class SensorMode(enum.StrEnum):
    MONOCULAR = "MONOCULAR"
    STEREO = "STEREO"
    RGBD = "RGBD"
    IMU_MONOCULAR = "IMU_MONOCULAR"
    IMU_STEREO = "IMU_STEREO"
    IMU_RGBD = "IMU_RGBD"


Copilot AI Mar 26, 2026

enum.StrEnum is only available on Python 3.11+, but this repo declares/supports Python 3.10 (and CI runs mypy under py3.10). Importing this module on Python 3.10 will crash. Use a py3.10-compatible enum definition (e.g., class SensorMode(str, enum.Enum): ...) or gate StrEnum behind a version check / typing_extensions fallback.
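
One way the version-gated fallback suggested above might be written is sketched here. The base-class name `_StrEnumBase` is hypothetical and not part of the PR; on Python 3.11+ it is simply `enum.StrEnum`, and on 3.10 it is a `str`/`Enum` mixin whose `__str__` returns the value so that formatting behaves the same on both versions.

```python
import enum
import sys

if sys.version_info >= (3, 11):
    _StrEnumBase = enum.StrEnum
else:

    class _StrEnumBase(str, enum.Enum):
        """Python 3.10 stand-in for enum.StrEnum (hypothetical helper)."""

        def __str__(self) -> str:
            # StrEnum formats as its value; plain Enum would give "Class.NAME".
            return str(self.value)


class SensorMode(_StrEnumBase):
    MONOCULAR = "MONOCULAR"
    STEREO = "STEREO"
    RGBD = "RGBD"
    IMU_MONOCULAR = "IMU_MONOCULAR"
    IMU_STEREO = "IMU_STEREO"
    IMU_RGBD = "IMU_RGBD"
```

Members still compare equal to plain strings, so call sites that pass the mode as a CLI argument should not need to change.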

Comment on lines +77 to +79
settings_path: str = str(
    _MODULE_DIR / "result" / "share" / "orbslam3" / "config" / "RealSense_D435i.yaml"
)

Copilot AI Mar 26, 2026

The default settings_path points at a RealSense D435i calibration YAML, but the provided orbslam3-webcam blueprint wires a generic Webcam via CameraModule and doesn’t override settings_path. That likely makes the runnable blueprint produce incorrect poses by default. Consider providing a webcam-appropriate default (or a minimal pinhole config), or explicitly override settings_path in the webcam blueprint / make it a required config parameter for that blueprint.

Suggested change
- settings_path: str = str(
-     _MODULE_DIR / "result" / "share" / "orbslam3" / "config" / "RealSense_D435i.yaml"
- )
+ settings_path: str
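
To illustrate what a required `settings_path` implies for callers, here is a small standalone sketch. It uses a stdlib dataclass rather than the project's real config base class, and `OrbSlam3ConfigSketch` and the `webcam.yaml` file name are hypothetical stand-ins for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrbSlam3ConfigSketch:
    """Hypothetical stand-in for the real config; only the required field matters."""

    settings_path: str  # no default: each blueprint must choose a calibration file
    sensor_mode: str = "MONOCULAR"

    def __post_init__(self) -> None:
        if not self.settings_path:
            raise ValueError("settings_path must name a camera calibration YAML")


# A webcam blueprint would then have to pass its own calibration explicitly.
webcam_config = OrbSlam3ConfigSketch(settings_path="webcam.yaml")
```

With no default, forgetting the calibration fails at construction time instead of silently running a webcam against RealSense D435i intrinsics.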

Comment on lines +170 to +171
if info.width == 0 or info.height == 0:
    info.width = self.config.width

Copilot AI Mar 26, 2026

This mutates the shared CameraInfo instance in config and also overwrites both dimensions when only one is unset. If a caller provides (say) height but leaves width=0, this will clobber the provided height. Prefer filling width and height independently only when each is 0 (and consider avoiding in-place mutation by constructing a new CameraInfo if feasible).

Suggested change
- if info.width == 0 or info.height == 0:
-     info.width = self.config.width
+ if info.width == 0:
+     info.width = self.config.width
+ if info.height == 0:
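
The non-mutating variant mentioned in the comment could be sketched as follows. `CameraInfoSketch` and `fill_missing_dims` are hypothetical names; the real `CameraInfo` type may not be a dataclass, in which case the copy would have to be constructed differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CameraInfoSketch:
    """Hypothetical stand-in for CameraInfo with only the two relevant fields."""

    width: int = 0
    height: int = 0


def fill_missing_dims(
    info: CameraInfoSketch, default_width: int, default_height: int
) -> CameraInfoSketch:
    """Return a copy where each zero dimension is filled independently."""
    return replace(
        info,
        width=info.width or default_width,
        height=info.height or default_height,
    )
```

A caller-provided height survives even when width is unset, and the shared config instance is never modified in place.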


## Known Issues

- **Transform / trajectory reconstruction mismatch**: The reconstructed trajectory does not match ground-truth poses. There is a suspected coordinate-frame or transform-composition issue causing output to diverge from base truth. Needs investigation.

Copilot AI Mar 26, 2026

Spelling: “base truth” should be “ground truth” (or just “ground-truth”).

Suggested change
- **Transform / trajectory reconstruction mismatch**: The reconstructed trajectory does not match ground-truth poses. There is a suspected coordinate-frame or transform-composition issue causing output to diverge from base truth. Needs investigation.
- **Transform / trajectory reconstruction mismatch**: The reconstructed trajectory does not match ground-truth poses. There is a suspected coordinate-frame or transform-composition issue causing output to diverge from ground truth. Needs investigation.

@jeff-hykin
Member

jeff-hykin commented Mar 27, 2026

Right now I've got a voxel slam branch, a lt-mapper, a khronos-slam, better-fastlio2, etc. I'm kinda thinking that after rosnav we add a handful of loop-closure benchmarks (lidar and visual) and dynamic map change benchmarks to DimOS, make sure stuff like ORB-SLAM3 is benchmarking how it's supposed to, then let openclaw go crazy porting every major SLAM system to DimOS with validation.
