Currently, we run preprocessing separately for every trial, regardless of whether any preprocessing params are being tuned. This is inefficient, and it also appears to cause a deadlock when using RayDatasets with dynamic resource allocation during tuning.
Instead, we can perform preprocessing up front and reuse the RayDataset across all trials. We may also need to persist the dataset to Parquet; otherwise, multiple trials attempting to run the pipeline simultaneously could cause issues.