Conversation

@oscarkey (Contributor)
We're seeing prediction errors when multiple GPUs are used. Switch to only using multiple GPUs if explicitly enabled, while we debug.

Tested manually on a machine with 2 GPUs and a dataset exhibiting the issue:

  • current main: inference does not work
  • this PR with device=auto: inference works
  • this PR with device=["cuda:0","cuda:1"]: inference does not work

Also, update the `device` docstring: don't mention multi-GPU inference for now, until we've fixed it.
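The workaround can be illustrated with a minimal sketch. The function and parameter names below are assumptions for illustration, not TabPFN's actual API; the point is only the selection rule: with `device="auto"`, pick a single GPU rather than all of them.

```python
def infer_device(device: str = "auto", available_gpus: int = 0) -> str:
    """Hypothetical sketch of the temporary fix described above.

    With device="auto", select only the first CUDA device even when
    several are present. Multi-GPU inference must be requested
    explicitly (and currently triggers the bug under investigation).
    """
    if device == "auto":
        # Temporary workaround: never auto-select more than one GPU.
        return "cuda:0" if available_gpus >= 1 else "cpu"
    # An explicit device specification is passed through unchanged,
    # so multi-GPU use remains possible when opted into.
    return device
```

Under this sketch, `infer_device("auto", available_gpus=2)` returns `"cuda:0"`, matching the manual test results above.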

Commit message: We're seeing poor prediction quality when multiple GPUs are used. Switch to only using multiple GPUs if explicitly enabled, while we debug.
@oscarkey oscarkey requested review from bejaeger and noahho September 17, 2025 08:57
@gemini-code-assist (bot) left a comment
Code Review

This pull request addresses a bug with multi-GPU inference when device="auto" by correctly defaulting to a single GPU (cuda:0) as a temporary fix. The implementation change in infer_devices is simple and effective. The accompanying docstring updates in TabPFNClassifier and TabPFNRegressor clearly communicate this new behavior to users.

My main concern, which is critical, is that this change breaks an existing unit test. I've left a specific comment with a suggested fix to ensure the test suite passes. Please address this to maintain code quality and test coverage.

@noahho (Collaborator) left a comment

LGTM barring Gemini's comment

@oscarkey oscarkey merged commit 68093c6 into main Sep 17, 2025
10 checks passed
@oscarkey oscarkey deleted the ok-disable-multigpu branch September 17, 2025 09:28
oscarkey added a commit that referenced this pull request Nov 12, 2025
…ly select the first. (#157)

* Record copied public PR 517

* If device="auto" and multiple GPUs are present, only select the first. (#517)

(cherry picked from commit 68093c6)

---------

Co-authored-by: mirror-bot <mirror-bot@users.noreply.github.com>
Co-authored-by: Oscar Key <oscar@priorlabs.ai>