
'configureMlr' cannot work with parallel computing #504

Closed
Seager1989 opened this issue Jan 3, 2021 · 4 comments

Comments

Seager1989 commented Jan 3, 2021

I ran into this error before when using 'classif.gausspr', as mentioned in #501. Thanks to the help of @jakob-r, it was fixed by setting configureMlr(on.learner.error = "warn") so that the code keeps running.

Now I have run into a new problem. When I enable parallel computing with

parallelStartSocket(4)

the error can no longer be ignored: the code stops after evaluating the initial DOE, again with errors similar to the following:

00002: Error in if (err < tol) break : missing value where TRUE/FALSE needed

Is there a way to make configureMlr work with parallel computing? Thank you.
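For reference, here is a minimal sketch of the kind of setup I mean. The task, parameter set, and budget below are simplified placeholders, not my actual script:

```r
library(mlr)
library(mlrMBO)
library(parallelMap)

# Downgrade learner errors to warnings, as suggested in #501
configureMlr(on.learner.error = "warn")

# Placeholder task and tuning setup, for illustration only
lrn = makeLearner("classif.gausspr")
ps = makeParamSet(
  makeNumericParam("sigma", lower = -5, upper = 5, trafo = function(x) 2^x)
)
ctrl = makeTuneControlMBO(budget = 20L)
rdesc = makeResampleDesc("CV", iters = 5)

parallelStartSocket(4)  # no level specified, as in the report
res = tuneParams(lrn, task = sonar.task, resampling = rdesc,
                 par.set = ps, control = ctrl)
parallelStop()
```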

Seager1989 commented Jan 3, 2021

I found one possible solution: set the parallelization level to resampling with

parallelStartSocket(cpus = 5, level = "mlr.resample")

The default level appears to be mlrMBO.propose.points, which is what causes the problem above.

I also tried the other levels; they do occupy multiple cores, but they cannot assign the most computationally expensive part, the CV training, to those cores. So resample-level parallelization may be the best choice for CV-based hyperparameter tuning.

This is only a quick analysis. Any suggestions for making mlrMBO.propose.points work together with configureMlr(on.learner.error = "warn") would be appreciated. Thanks.
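For clarity, a sketch of this workaround, reusing the placeholder setup from my first comment ("mlr.resample" is one of the parallelization levels mlr registers with parallelMap):

```r
# Workaround sketch: restrict parallelization to mlr's resampling level
# so that only the CV folds run in parallel and mlrMBO's point proposal
# stays sequential. Reuses lrn/ps/ctrl/rdesc from the sketch above.
library(parallelMap)

parallelStartSocket(cpus = 5, level = "mlr.resample")
res = tuneParams(lrn, task = sonar.task, resampling = rdesc,
                 par.set = ps, control = ctrl)
parallelStop()
```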

jakob-r commented Jan 4, 2021

The following is not a solution to your problem, but a general hint:

If you want to speed up the optimization through parallelization, it is advisable to parallelize the evaluation of the black box (i.e. the train/test resampling) rather than proposing multiple points, because two sequential proposals are generally worth more than two parallel proposals. Why? Because the second sequential point is proposed with knowledge of the first point's result, whereas the second parallel point is generated from the same knowledge as the first.
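To make this concrete, a small sketch of the two control settings (propose.points is the relevant argument of makeMBOControl; the values are illustrative):

```r
library(mlrMBO)

# One proposal per iteration (the default): each new point is chosen
# with knowledge of all previously evaluated points.
ctrl.seq = makeMBOControl(propose.points = 1L)

# Several proposals per iteration: all points in a batch are derived
# from the same surrogate state, so each adds less new information.
ctrl.par = makeMBOControl(propose.points = 2L)
```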

Seager1989 commented

Yes, parallelizing at the resample level seems good. However, does this mean that with 5-fold CV I can only use 5 cores simultaneously? Is there a way to make use of more?

jakob-r closed this as completed Jan 29, 2021

jakob-r commented Jan 29, 2021

Closing, because this is not a bug in mlrMBO, but rather a problem in parallelMap and mlr.
