Conversation

@oscarkey (Contributor) commented Sep 4, 2025

Only support parallel evaluation for the "low_memory" and "cache_preprocessing" fit modes for now.

Use multithreading to evaluate the model in parallel for each estimator. I selected multithreading over multiprocessing because our benchmarking shows that for longer datasets we spend almost all of our time in the flash attention kernel, during which the GIL is released. This allows multithreading to work efficiently; it is also less complex than multiprocessing and avoids starting additional processes (which can take a substantial fraction of the inference time).
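To make the approach concrete, here is a minimal sketch of multithreaded per-estimator evaluation across devices. The names `parallel_evaluate` and `evaluate_one` are illustrative only, not the actual tabpfn API:

```python
# Minimal sketch, not the actual tabpfn implementation: one worker thread per
# device; each estimator borrows a device, runs on it, and returns it.
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


def parallel_evaluate(estimators, devices, evaluate_one):
    """Run evaluate_one(estimator, device) for every estimator using threads."""
    device_queue: Queue = Queue()
    for device in devices:
        device_queue.put(device)

    def worker(estimator):
        device = device_queue.get()
        try:
            # The expensive part (the attention kernel) releases the GIL,
            # so threads on different devices genuinely run concurrently.
            return evaluate_one(estimator, device)
        finally:
            device_queue.put(device)

    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        return list(pool.map(worker, estimators))
```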

Ideally inference.py would be refactored, but I only did minimal refactoring in this PR.

In addition to the tests added in this PR, these changes are also covered by the new consistency tests for each inference mode: #498. Unfortunately, GitHub does not support multi-GPU testing on the CI yet, so the testing of the parallelisation is a bit limited.

I simplified the logic for converting the inputs to tensors and setting the dtype in _prepare_model_inputs():

  • Always call torch.as_tensor(): this only copies if necessary, so there is no need for an if
  • Don't suppress exceptions raised by X_full.float(): the comment says this is to avoid overflow errors, but I can't find any case where .float() would throw an exception, except if X_full is complex (see the sketch below)
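A rough sketch of the simplified conversion described above; the function name and arguments here are illustrative, and the real _prepare_model_inputs() does more than this:

```python
import torch


def _to_float_tensor(X_full, device):
    """Illustrative sketch of the simplified dtype/tensor conversion."""
    # torch.as_tensor only copies when necessary, so there is no need to
    # branch on whether X_full is already a tensor.
    X_tensor = torch.as_tensor(X_full)
    # .float() does not raise on overflow (out-of-range values become inf),
    # so wrapping it in try/except was unnecessary; only complex inputs fail.
    return X_tensor.float().to(device)
```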

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces parallel evaluation of estimators to improve performance. This is achieved by refactoring the inference engines to use a new parallel_evaluate utility, which can leverage multiple devices using multithreading. The core evaluation logic for a single estimator has been extracted into a helper function _evaluate_estimator.

The changes look good overall and are a nice improvement. I've found a few issues:

  • A high-severity bug in parallel_evaluate.py that would cause a crash on non-CUDA devices.
  • A high-severity bug in inference.py where a model type cast is not assigned back to the model.
  • Several medium-severity issues related to code cleanup, leftover TODOs, and API consistency.

Please see the detailed comments for suggestions.

@oscarkey (Contributor, Author) commented Sep 4, 2025

Hey Brendan, I'd love to get your overall thoughts on this draft before I finish it up.

@brendan-priorlabs (Contributor) commented

@oscarkey, would love to. I can give this a proper pass tomorrow if that works!

@brendan-priorlabs (Contributor) left a comment

Hey @oscarkey, thanks for putting this together! Very clean and readable. I left a few minor comments, but this looks solid overall. The only one that might be major is the comment on Manager.Queue. Keep me posted!

@oscarkey force-pushed the ok-multiple-devices branch from 9eb0099 to eb1d626 on September 9, 2025 08:51
@oscarkey force-pushed the ok-multiple-devices-2 branch from 0a04c54 to 187551c on September 10, 2025 07:16
@oscarkey changed the title from "[WIP] Evaluate the estimators in parallel." to "Evaluate the estimators in parallel." on Sep 10, 2025
@oscarkey requested a review from LeoGrin on September 10, 2025 09:51
@oscarkey (Contributor, Author) commented

This is now ready for a full review!

@oscarkey (Contributor, Author) commented

/gemini review

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces parallel evaluation of estimators in the tabpfn library, focusing on the "low_memory" and "cache_preprocessing" fit modes. It leverages multithreading to improve performance, particularly for longer datasets where the flash attention kernel releases the GIL. The changes include refactoring input preparation logic and adding a new parallel_execute module to manage parallel execution across multiple PyTorch devices. Tests have been added to ensure consistency between serial and parallel execution.

@brendan-priorlabs (Contributor) left a comment

LGTM! Thanks, Oscar!

@LeoGrin (Collaborator) left a comment

Looks great to me!

Base automatically changed from ok-multiple-devices to main September 11, 2025 11:40
@oscarkey force-pushed the ok-multiple-devices-2 branch from 90e8ca2 to 74c2ae8 on September 11, 2025 11:43
@oscarkey enabled auto-merge (squash) on September 11, 2025 12:16
@oscarkey merged commit 2745764 into main on Sep 11, 2025
10 checks passed
@oscarkey deleted the ok-multiple-devices-2 branch on September 11, 2025 12:47
oscarkey added a commit that referenced this pull request Nov 12, 2025
* Record copied public PR 484

* Evaluate the estimators in parallel. (#484)

Only support parallel evaluation for the "low_memory" and "cache_preprocessing" fit modes for now.

Use multithreading to evaluate the model in parallel for each estimator. I selected multithreading over multiprocessing because our benchmarking shows that for longer datasets we spend almost all of our time in the flash attention kernel, during which the GIL is released. This allows multithreading to work efficiently; it is also less complex than multiprocessing and avoids starting additional processes (which can take a substantial fraction of the inference time).

Ideally `inference.py` would be refactored, but I only did minimal refactoring in this PR.

In addition to the tests added in this PR, these changes are also covered by the new consistency tests for each inference mode: #498. Unfortunately, GitHub does not support multi-GPU testing on the CI yet, so the testing of the parallelisation is a bit limited.

I simplified the logic for converting the inputs to tensors and setting the dtype in `_prepare_model_inputs()`:
- Always call `torch.as_tensor()`: this only copies if necessary, so there is no need for an `if`
- Don't suppress exceptions raised by `X_full.float()`: the comment says this is to avoid overflow errors, but I can't find any case where `.float()` would throw an exception, except if `X_full` is complex

---------

Co-authored-by: mirror-bot <mirror-bot@users.noreply.github.com>
Co-authored-by: Oscar Key <oscar@priorlabs.ai>