[Caspar] GPU device selection for solver #461

Open
tordnat wants to merge 9 commits into
symforce-org:mainfrom
tordnat:tordnat/caspar-gpu-idx-option-#460
Conversation

@tordnat
Contributor

@tordnat tordnat commented May 3, 2026

Adds device_id throughout the Caspar stack so callers can pin all GPU operations to a specific device. The default is device 0.

Changes in Solver (solver.h/cc.jinja, solver_pybinding.h.jinja, lib.pyi.jinja):

  • device_id added to the constructor and stored as int device_id_. Every method calls cudaSetDevice(device_id_) at entry.
  • Exposed as a keyword argument in the pybind11 binding and .pyi stub: Solver(*, device_id=0).
  • Added GetDeviceId(const py::object&), which reads the tensor pointer from __cuda_array_interface__ and calls cudaPointerGetAttributes to find the owning device. This adds some overhead, but it was the least intrusive approach I found.
  • stacked_to_caspar and caspar_to_stacked call cudaSetDevice(GetDeviceId(...)) before each kernel launch, inferring the device from the input tensor.
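A minimal Python sketch of the device-inference idea behind GetDeviceId, for illustration only: the actual implementation is C++ and calls cudaPointerGetAttributes on the raw pointer. Here the CUDA runtime query is stubbed out (FakeDeviceTensor, cuda_pointer_get_attributes, and the pointer map are all hypothetical names) so the flow of reading the pointer from the __cuda_array_interface__ dict can be shown without a GPU.

```python
class FakeDeviceTensor:
    """Stand-in for a CuPy/torch CUDA tensor exposing __cuda_array_interface__."""

    def __init__(self, ptr: int):
        self.__cuda_array_interface__ = {
            "data": (ptr, False),  # (device pointer, read-only flag)
            "shape": (3,),
            "typestr": "<f4",
            "version": 3,
        }


# Stub for the CUDA runtime: maps a device pointer to its owning device,
# playing the role of cudaPointerGetAttributes.
_POINTER_TO_DEVICE = {}


def cuda_pointer_get_attributes(ptr: int) -> int:
    return _POINTER_TO_DEVICE[ptr]


def get_device_id(tensor) -> int:
    """Mirror of the PR's GetDeviceId: read the raw pointer out of the
    __cuda_array_interface__ dict, then ask the runtime which device owns
    that allocation."""
    ptr, _read_only = tensor.__cuda_array_interface__["data"]
    return cuda_pointer_get_attributes(ptr)


# Usage: a tensor whose allocation the runtime attributes to device 1
# resolves to device 1.
_POINTER_TO_DEVICE[0xDEAD0000] = 1
t = FakeDeviceTensor(0xDEAD0000)
print(get_device_id(t))  # prints 1
```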

Was advised to ping @matias-christensen-skydio for best CUDA practices. Would appreciate any feedback, especially on the device inference via __cuda_array_interface__ in the pybindings.

Blocked by #458, fixes #460.

@tordnat
Contributor Author

tordnat commented May 3, 2026

Feature requested in colmap/colmap#4018

Member

@aaron-skydio aaron-skydio left a comment


Cool, the general approach here makes sense to me: the solver object gets an associated device because its allocated buffers live on a particular device, so it doesn't make sense to move it across devices between calls; and the free functions infer the right device.

Curious if Matias has any other feedback, but otherwise this looks good to me.

Comment thread symforce/caspar/source/templates/lib.pyi.jinja Outdated
Comment thread symforce/caspar/source/templates/solver.cc.jinja Outdated
@tordnat
Contributor Author

tordnat commented May 4, 2026

Should we wait to merge this until someone with a multi-GPU rig can test it? I currently don't have a rig with more than one GPU.

@aaron-skydio
Member

Hmm, maybe worth asking someone on the COLMAP thread if they can do that? It seemed like someone over there might already have a multi-GPU rig sitting around. I don't think I have time to test this on a multi-GPU rig right now, unfortunately.

@tordnat
Contributor Author

tordnat commented May 5, 2026

I'll make a new PR after colmap/colmap#4018 is merged so this can be tested.

@tordnat
Contributor Author

tordnat commented May 6, 2026

Testing this in colmap/colmap#4379

@tordnat
Contributor Author

tordnat commented May 6, 2026

Multi-GPU support worked well, according to COLMAP contributors. However, that does not exercise the Python bindings. Do we want to exclude those from this PR, or just gamble on them working as expected?

@aaron-skydio
Member

I'm just going to go ahead and merge


Development

Successfully merging this pull request may close these issues.

[Caspar] Add GPU index argument to Caspar solver
