Currently, we run preprocessing separately for every trial, regardless of whether any preprocessing params are being tuned. This is inefficient, and it also appears to cause a deadlock when using RayDatasets with dynamic resource allocation during tuning.
Instead, we can perform preprocessing up front and reuse the RayDataset across all trials. We may also need to persist the dataset to Parquet; otherwise, multiple trials attempting to run the pipeline simultaneously could cause issues.