bae is a PyTorch-based library supporting exact 2nd-order optimization techniques. The library provides efficient implementations for sparse optimization problems in robotics, particularly Bundle Adjustment (BA) and Pose Graph Optimization (PGO).
| Garden | Counter | Kitchen |
bae powering BA and global positioning in downstream system, InstantSfM.
|
|
|
| Sphere Big Noise | Grid3D | Sphere (g2o) |
- 2026-03-22: Added skills for coding agents to write custom compute graphs.
- 2025-12-12: Added a VGGT integration example.
- Sparse Block Matrix Operations: Optimized implementations of sparse matrix operations for large-scale optimization
- CUDA Acceleration: Custom CUDA kernels for high-performance sparse linear algebra
- Bundle Adjustment: Efficient implementation for camera pose and 3D structure optimization
- Pose Graph Optimization: Tools for optimizing robot trajectories using pose graph representations
- PyTorch Integration: Seamlessly integrates with PyTorch's automatic differentiation framework
- Levenberg-Marquardt Optimizer: Custom implementation of the LM algorithm for non-linear least squares problems
- Add Apple Silicon GPU support, PyTorch PR WIP
- Reduce runtime overhead using CUDA graph
- Distributed Tensor (DTensor) support
- An new backend for distributed solver
- CUDA toolkit (tested with CUDA 12.x)
- PyTorch (2.0+)
- (Optional) CUDSS (CUDA Sparse Solver library)
-
(Optional) Install CUDSS (recommended through package manager)
- For CUDA 12 (0.6.0)
sudo apt install cudss=0.6.0-1 cudss0=0.6.0-1 cudss-cuda-12=0.6.0.5-1 \ libcudss0-cuda-12=0.6.0.5-1 libcudss0-dev-cuda-12=0.6.0.5-1 libcudss0-static-cuda-12=0.6.0.5-1
- For CUDA 13
sudo apt install cudss-cuda-13
-
Install PyPose:
pip install git+https://github.com/pypose/pypose.git
-
Clone this repository:
git clone https://github.com/zitongzhan/bae.git cd bae -
Install dependencies:
pip install -r requirements.txt
-
Install the package in development mode:
python -m pip install --no-build-isolation -v -e . # following https://github.com/pytorch/pytorch
If you are unable to install cudss with the system package manager, you can control the build process with these environment variables:
USE_CUDSS: Set to "1" (default) to enable CUDSS support, "0" to disableCUDSS_DIR: Optional path to CUDSS installation directory if not in standard locations
This repo includes skills in .agent/skills:
bae-compute-graph: guidance for defining BAL/PGO and more complex compute graphs
Bundle Adjustment optimizes camera poses and 3D point positions to minimize reprojection error. The following example shows how to perform BA using bae:
import torch
import pypose as pp
from datapipes.bal_loader import get_problem
from ba_helpers import ReprojNonBatched, least_square_error
from bae.sparse.py_ops import *
from bae.sparse.solve import *
from bae.optim import LM
from bae.utils.pysolvers import PCG
# Load a problem from the BAL dataset
dataset = get_problem("problem-49-7776-pre", "ladybug", use_quat=True)
dataset = {k: v.to('cuda') for k, v in dataset.items() if isinstance(v, torch.Tensor)}
# Prepare input for the optimization
input = {
"points_2d": dataset['points_2d'],
"camera_indices": dataset['camera_index_of_observations'],
"point_indices": dataset['point_index_of_observations']
}
# Initialize model with camera parameters and 3D points
model = Reproj(
dataset['camera_params'].clone(),
dataset['points_3d'].clone()
).to('cuda')
# Configure optimizer
strategy = pp.optim.strategy.TrustRegion(up=2.0, down=0.5**4)
solver = PCG(tol=1e-4, maxiter=250)
optimizer = LM(model, strategy=strategy, solver=solver, reject=30)
# Run optimization for multiple iterations
for idx in range(20):
loss = optimizer.step(input)
print(f'Iteration {idx}, loss: {loss.item()}')bae is used as an optional Bundle Adjustment backend in our VGGT fork (Visual Geometry Grounded Transformer) to refine the camera poses, intrinsics, and 3D points predicted by VGGT before exporting a COLMAP reconstruction.
After installing bae, you can run VGGT's COLMAP export with BA enabled and bae selected as the solver:
python demo_colmap.py --scene_dir /path/to/scene --use_ba --implementation bae # optional: --shared_cameraThis command invokes prepare_bae(...) inside vggt/demo_colmap.py, which wraps VGGT tracks and predictions into bae.optim.LM and updates extrinsic, intrinsic, and points_3d in place before writing scene_dir/sparse/ in COLMAP format.
bae is designed for high performance using:
- Efficient sparse block matrix operations
- CUDA acceleration for core operations
- Optimized linear solvers (PCG, CUDA Sparse Solver)
- Memory-efficient data structures
If you use bae in your research, please cite:
@article{zhan2025bundle,
title = {Bundle Adjustment in the Eager Mode},
author = {Zhan, Zitong and Xu, Huan and Fang, Zihang and Wei, Xinpeng and Hu, Yaoyu and Wang, Chen},
journal = {arXiv preprint arXiv:2409.12190},
year = {2024},
url = {https://arxiv.org/abs/2409.12190}
}The implementation draws inspiration from:
- PyPose for SE(3) pose representations
- GTSAM for reprojection jacobian concepts
- Ceres for manifold parameter update





