[tune] Automatically detect/activate proper SyncConfig given autoscaler #11867

richardliaw · 2020-11-06T23:37:41Z

Describe your feature request

This is a common failure mode.

from ray import tune

def train_func(config, checkpoint_dir):
    ...
    # Write a checkpoint inside the directory Tune provides.
    with tune.checkpoint_dir(...) as ckpt:
        ...
    tune.report()

One common problem is that the user is using K8s or Docker with the autoscaler, and they do not remember to set DockerSyncer or KubernetesSyncer in the SyncConfig.

Instead, we should automatically detect the presence of an autoscaler configuration and activate the matching syncer, so the user does not have to configure it.

@krfricke

cc @mkoh-asapp @richardrl

richardliaw · 2020-11-08T05:23:31Z

This requires a utility function that identifies whether the run is on an autoscaling cluster.
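A minimal sketch of what such a utility might look like, not the implementation that was eventually merged. It assumes the autoscaler cluster config has already been parsed into a dict with the usual `provider.type` and `docker` sections; the function name `pick_syncer_name` and the choice to return the syncer's name as a string are hypothetical, used here only to keep the example free of Ray imports.

```python
from typing import Any, Mapping, Optional


def pick_syncer_name(cluster_config: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Guess which syncer fits a parsed autoscaler cluster config.

    Hypothetical helper: returns "KubernetesSyncer", "DockerSyncer",
    or None when no autoscaler config (or no special setup) is found.
    """
    if not cluster_config:
        # No autoscaler config available: keep Tune's default behavior.
        return None

    provider = cluster_config.get("provider") or {}
    if provider.get("type") == "kubernetes":
        return "KubernetesSyncer"

    docker = cluster_config.get("docker") or {}
    # Treat a non-empty image or container name as "nodes run in Docker".
    if docker.get("image") or docker.get("container_name"):
        return "DockerSyncer"

    return None
```

A caller could map the returned name to the corresponding syncer class and fall back to the default when the result is `None`. Where the config file lives on the head node and how it is loaded are left out on purpose, since the thread does not specify them.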

mkoh-asapp · 2020-11-09T14:04:50Z

I missed the part in the docs explaining this so I had to have Richard explain it to me. But even after I set the syncer on tune.run, it wasn't working for me. I realized that this is because we are passing Experiment directly to tune.run, instead of passing a function or class. Once I set the syncer on the Experiments, then it worked, but that was not very clear from the docs.

Maybe it isn't a standard way to use tune (creating Experiments manually), but it might be nice to have that explained somewhere.

Just wanted to bring up a point to consider. Thanks 🎉

bllchmbrs · 2020-11-17T00:14:22Z

does setting sync_to_driver=False in tune.run(...) silence the error?

The answer to 👆 is no

krfricke · 2021-02-26T14:30:41Z

Closed via #12108

richardliaw assigned krfricke Dec 14, 2020

ericl added this to the Serverless Autoscaling milestone Jan 20, 2021

ericl removed the autoscaler label Jan 20, 2021

krfricke closed this as completed Feb 26, 2021
