@oscarkey commented Sep 9, 2025

If the specified device is "auto" and multiple CUDA GPUs are available, then select all of them. This PR doesn't actually implement multi-GPU inference; instead, every inference engine just uses the first device.

Public API changes:

  • TabPFNClassifier/Regressor.device_ has been renamed to .devices_, and is now a tuple rather than a single device.
  • The device argument of TabPFNClassifier/Regressor.__init__() is unchanged, as is the .device property. Multiple GPUs will only be used when this is set to "auto". An alternative would be to rename this to devices. I kept it as is because most users probably only want a single device, it avoids changing the API, and we don't actually support multi-device inference in this PR. But maybe it is confusing to have both device and devices_. What do you think?

Notes:

  • TabPFNRegressor.znorm_space_bardist_ is always placed on the first device, as we're not aiming to parallelise this portion of inference.
  • infer_devices() no longer accepts device=None: I couldn't find any code using it.
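The selection rule described above can be sketched in pure Python (the signature is hypothetical: the real `infer_devices()` works with torch devices and queries `torch.cuda` itself, whereas `cuda_device_count` here is a stand-in for that query):

```python
def infer_devices(device, cuda_device_count):
    """Sketch of the "auto" selection rule described in this PR.
    Hypothetical signature: cuda_device_count stands in for a call to
    torch.cuda.device_count()."""
    if device == "auto":
        if cuda_device_count > 0:
            # Select every available CUDA GPU, not just the first.
            return tuple(f"cuda:{i}" for i in range(cuda_device_count))
        return ("cpu",)
    if device is None:
        # device=None is no longer accepted.
        raise ValueError("device=None is no longer supported")
    # An explicit device becomes a one-element tuple.
    return (device,)
```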


@gemini-code-assist bot left a comment


Code Review

This pull request makes the necessary preparatory changes for multi-GPU inference by updating the device handling logic. The device_ attribute is correctly renamed to devices_ and now holds a tuple of devices. The new infer_devices utility function is well-implemented and tested, correctly selecting all available CUDA GPUs when device="auto". The changes are consistent across the codebase. I've left a few minor suggestions regarding a typo in the changelog, some repeated code in the inference engines, and a confusing test case name. Overall, this is a solid step towards multi-GPU support.

@oscarkey requested a review from @noahho September 9, 2025 10:15
@oscarkey commented:

@noahho are you happy with these public API changes?


@noahho left a comment


LGTM!

@oscarkey enabled auto-merge (squash) September 11, 2025 10:10
@oscarkey merged commit 01a8205 into main Sep 11, 2025
10 checks passed
@oscarkey deleted the ok-multiple-devices branch September 11, 2025 11:40
@noahho mentioned this pull request Sep 14, 2025
oscarkey added a commit to PriorLabs/tabpfn-extensions that referenced this pull request Sep 16, 2025
PriorLabs/TabPFN#496 renamed
infer_device_and_type() to infer_devices(), and updated it to return a
tuple of devices. Update tabpfn-extensions to use this new method, and
select the first device if multiple devices are returned.
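The adaptation described in this commit can be sketched as follows (the inner `infer_devices` stub is hypothetical and only mimics the tuple-returning behaviour of the renamed helper; the real one lives in tabpfn and queries `torch.cuda`):

```python
def resolve_single_device(device="auto"):
    """Sketch of the tabpfn-extensions change: call the renamed
    infer_devices() and keep only the first device, since the extensions
    still run single-device inference."""
    def infer_devices(d):  # stub standing in for tabpfn's real helper
        return ("cuda:0", "cuda:1") if d == "auto" else (d,)

    devices = infer_devices(device)
    return devices[0]  # the extensions use only the first device for now
```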
oscarkey added a commit to PriorLabs/tabpfn-extensions that referenced this pull request Sep 16, 2025
oscarkey added a commit to PriorLabs/tabpfn-extensions that referenced this pull request Sep 17, 2025
oscarkey added a commit to PriorLabs/tabpfn-extensions that referenced this pull request Sep 17, 2025
… 2.1.4. (#165)

PriorLabs/TabPFN#496 renamed
infer_device_and_type() to infer_devices(), and updated it to return a
tuple of devices. Update tabpfn-extensions to use this new method, and
select the first device if multiple devices are returned.

Fix seed in random forest test to fix flakiness.
SilenceX12138 pushed a commit to SilenceX12138/tabpfn-extensions that referenced this pull request Sep 18, 2025
… 2.1.4. (PriorLabs#165)

klemens-floege added a commit to PriorLabs/tabpfn-extensions that referenced this pull request Sep 19, 2025
* Add TabEBM implementation for synthetic data generation using SGLD

- Introduced the TabEBM class for generating synthetic tabular data based on an energy-based model.
- Implemented methods for data preprocessing, SGLD sampling, and energy computation.
- Added support for various input formats including numpy arrays, PyTorch tensors, and pandas DataFrames.
- Created a caching mechanism for fitted models to optimize repeated sampling.
- Developed comprehensive unit tests to validate functionality and performance of the TabEBM class.
- Included error handling for invalid input types and mismatched shapes.
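For context, the SGLD sampling this commit refers to typically takes steps of the following form (a generic scalar sketch of stochastic gradient Langevin dynamics, not the TabEBM implementation):

```python
import math
import random

def sgld_step(x, grad_energy, step_size, rng):
    """One SGLD update: descend the energy gradient and inject Gaussian
    noise whose scale is tied to the step size. Generic sketch, not the
    TabEBM code."""
    noise = rng.gauss(0.0, 1.0)
    return x - 0.5 * step_size * grad_energy(x) + math.sqrt(step_size) * noise

def sample(grad_energy, x0, n_steps, step_size, seed=0):
    """Run a short SGLD chain from x0 and return the final state."""
    rng = random.Random(seed)
    x = x0
    for _ in range(n_steps):
        x = sgld_step(x, grad_energy, step_size, rng)
    return x
```

With energy E(x) = x²/2 (so grad_energy is the identity), a chain started far from the origin drifts toward samples from a standard normal.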

* Update src/tabpfn_extensions/tabebm/TabEBM.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/tabpfn_extensions/tabebm/TabEBM.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/tabpfn_extensions/tabebm/TabEBM.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/tabpfn_extensions/tabebm/TabEBM.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update src/tabpfn_extensions/tabebm/TabEBM.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Fix formatting in author section of README.md

* Fix import paths in TabEBM example and test files

* Refactor code for improved readability and consistency in TabEBM class and tests

* Remove redundant test for invalid input in to_numpy method in TestTabEBMStaticMethods

* Reorder import statements for consistency in test_tabebm.py

* Update src/tabpfn_extensions/tabebm/tabebm.py

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

* Update src/tabpfn_extensions/tabebm/README.md

Co-authored-by: Klemens Flöge <117587964+klemens-floege@users.noreply.github.com>

* Rename 'debug' parameter to 'verbose' in TabEBM class methods for clarity and consistency

* Add script for augmenting real-world data using TabEBM

This script replicates the functionality of the notebook, utilizing sklearn for data loading and preprocessing. It includes functions for loading the adult dataset, identifying feature types, subsampling data, preprocessing, and training models on both original and augmented datasets. The script demonstrates the effectiveness of data augmentation with TabEBM by comparing the performance of models trained on real data versus those trained on augmented data.

* Remove verbose output during SGLD updates in TabEBM class to streamline performance

* Fix bug in AutoTabPFN due to renamed infer_devices() method in tabpfn 2.1.4. (#165)


* Run the tests on push to main, and also when requested. (#166)

At the moment we only run the tests in PRs, but this can mean that the
tests are broken on main due to interactions with concurrent PRs.

Also add a "workflow_dispatch" trigger so we can run the tests manually.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Klemens Flöge <117587964+klemens-floege@users.noreply.github.com>
Co-authored-by: Oscar Key <oscar@priorlabs.ai>
oscarkey added a commit that referenced this pull request Nov 12, 2025
…Don't use them yet. (#143)

* Record copied public PR 496

* Select multiple gpus for inference, if available. Don't use them yet. (#496)


---------

Co-authored-by: mirror-bot <mirror-bot@users.noreply.github.com>
Co-authored-by: Oscar Key <oscar@priorlabs.ai>