modernizing support for parallelism #866

simonpcouch · 2024-03-08T15:52:26Z

Currently, tune uses foreach to facilitate parallel computing. As a group, we've discussed gradually deprecating our support for foreach in favor of the futureverse. The big ideas:

future does a better job identifying globals (i.e. no need for namespacing where we usually wouldn't, reassigning in strange situations), generating random numbers soundly, and handling errors, and comes with better progress support via progressr.
Newer R users are more likely to be familiar with the future user API (plan(multisession), etc) than the foreach one.
The futureverse is more actively developed/maintained than foreach.

Spent some time thinking through how this might come together this morning. The process could look something like this:

Timeline	User API	Developer API	Backend
Now	foreach	foreach	foreach
Proposed	future (foreach supported)	foreach (via doFuture)	future
2/3 Years Out	future	future (maybe via furrr)	future

By using doFuture as a backend, our codebase looks similar to how it does with foreach, which helps with maintainability during the transition period, when we support users of either the foreach user API (e.g. registerDoParallel()) or the future user API (e.g. plan(multisession)). Depending on the user's setup, we'd use foreach::`%do%`, foreach::`%dopar%`, or doFuture::`%dofuture%`, where we currently just decide between the first two.
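A minimal sketch of that operator choice, for illustration only. The helper name `choose_operator()` is hypothetical (it is not tune's actual internals), and the precedence shown, where a registered foreach backend wins over a parallel future plan, is an assumption rather than a decision made in this issue:

```r
library(foreach)

# Hypothetical helper, not tune's actual code: pick a foreach operator
# based on how the user has set up parallelism.
choose_operator <- function() {
  # A foreach backend registered via e.g. registerDoParallel()
  if (foreach::getDoParRegistered() && foreach::getDoParWorkers() > 1) {
    return(foreach::`%dopar%`)
  }
  # A parallel future plan set via e.g. plan(multisession);
  # doFuture imports future, so checking for doFuture is enough here.
  if (requireNamespace("doFuture", quietly = TRUE) &&
      future::nbrOfWorkers() > 1) {
    return(doFuture::`%dofuture%`)
  }
  # Nothing registered and no parallel plan: evaluate sequentially
  foreach::`%do%`
}

# The loop body stays the same whichever operator is chosen
`%op%` <- choose_operator()
res <- foreach(i = 1:3, .combine = c) %op% sqrt(i)
```

Because all three operators share the `foreach(...) %op% expr` shape, only the line that picks the operator needs to know about the user's setup.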

Currently, tune logs errors with ad-hoc machinery when tuning in parallel and with cli when tuning sequentially. I'm not sure that we want to maintain a third logging system for parallelism with progressr during the transition period. Once we fully deprecate support for foreach, though, we should consider rewriting all of our error logging with progressr.



simonpcouch · 2024-03-14T18:36:48Z

Next step for post-v1.2.0: revert e79aac1 to introduce a deprecation warning when using foreach.

jrosell · 2024-03-22T07:55:42Z

Regarding logging in parallelism, one can currently build custom real-time logging using the extract function, but it's very limited because only the workflow object is available, not the split object. What would be the new recommended way with future? I don't like not having real-time logging when using parallelism. I think it's good to be able to debug when some hyperparameter fails for some model, and at which split.

simonpcouch · 2024-03-22T13:53:41Z

The extract approach you currently use will continue to work fine!
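For readers unfamiliar with that approach, here is a rough sketch of what an extract-based logger could look like. The function name, log file name, and message format are all hypothetical, and this is not necessarily how the logger discussed above is written. As noted above, only the fitted workflow is passed in, so the split cannot be identified from inside the function:

```r
# Hypothetical extract function: append one timestamped line to a log file
# each time it is called with a fitted workflow.
log_extract <- function(x, log_file = "tune-extract.log") {
  cat(
    format(Sys.time(), "%Y-%m-%d %H:%M:%S"), " fitted: ", class(x)[1], "\n",
    file = log_file, sep = "", append = TRUE
  )
  # Return NULL so nothing large is stored alongside the tuning results
  NULL
}

# Stand-alone demonstration with a dummy object in place of a real workflow
demo_log <- tempfile(fileext = ".log")
log_extract(structure(list(), class = "workflow"), log_file = demo_log)
demo_lines <- readLines(demo_log)

# With tune, the function would be supplied through the control object, e.g.
# ctrl <- tune::control_grid(extract = log_extract)
```

When several workers append to the same file at once, lines may interleave, so a per-worker file may be safer in practice.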

The change that will happen once we convert to future is that an interactive logger (that summarizes warnings/errors rather than printing all warnings/errors and their locations out) will be available for tuning either sequentially or in parallel. You will be able to turn that interactive logging off using verbose = TRUE in either case.

Right now, the interactive logger is only available for sequential tuning. At the moment, when tuning in parallel, users see the verbose = TRUE logging regardless of how they set verbose.
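As a small illustration, the two settings are chosen through the control object. This sketch uses `control_grid()`; the variable names are only for illustration:

```r
library(tune)

# verbose = FALSE (the default): the summarizing, interactive logger
ctrl_summary <- control_grid(verbose = FALSE)

# verbose = TRUE: print each warning/error and its location as it occurs
ctrl_full <- control_grid(verbose = TRUE)

# Either object would then be passed along, e.g. tune_grid(..., control = ctrl_full)
```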

simonpcouch added the "breaking change ☠️" and "feature" labels Mar 8, 2024

This was referenced Mar 11, 2024

add support for parallelism with future #867

Merged

add support for parallelism with future tidymodels/stacks#208

Merged

This was referenced Mar 18, 2024

nestedness of grid code paths #740

Open

reducing complexity in tune #742

Open

simonpcouch mentioned this issue Mar 22, 2024

warn when parallel processing with foreach #878

Merged
