What is the problem?
Detected at https://discuss.ray.io/t/tune-performance-bottlenecks/520/3
False warning:
2021-02-04 18:13:22,924 WARNING function_runner.py:541 -- Function checkpointing is disabled. This may result in unexpected behavior when using checkpointing features or certain schedulers. To enable, set the train function arguments to be func(config, checkpoint_dir=None).
However, I suspect this warning is faulty: I manually verified that checkpoints had been saved, and my call to Tune passed additional parameters after checkpoint_dir=None.
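For reference, this is roughly the shape the warning is asking for. This is a minimal sketch, not my actual trainable; the function name and body are placeholders, and I am assuming (based on the warning text) that Tune decides whether checkpointing is enabled by looking for a checkpoint_dir parameter in the function signature:

```python
import inspect

# Minimal trainable with the signature the warning asks for:
# func(config, checkpoint_dir=None). Name and body are placeholders.
def train_fn(config, checkpoint_dir=None):
    if checkpoint_dir:
        pass  # restore from a saved checkpoint here
    # ... training loop that reports metrics to Tune ...

# A signature check like the one the warning implies would find it:
print("checkpoint_dir" in inspect.signature(train_fn).parameters)  # True
```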
Ray version and other system information (Python version, TensorFlow version, OS):
ray latest
RaedShabbir added the bug and triage labels on Feb 8, 2021
richardliaw changed the title from "False Checkpoint Warning with tune.with_parameters() [tune]" to "[tune] False Checkpoint Warning with tune.with_parameters()" on Feb 9, 2021
richardliaw added the P1 and tune labels and removed the triage label on Feb 9, 2021
Hi @RaedShabbir, looking at the discussion I am wondering what models.trainers.train_ptl_checkpoint looks like. Can you share some context here?
Also, you are binding the checkpoint_dir parameter in tune.with_parameters:
```python
tune.with_parameters(
    models.trainers.train_ptl_checkpoint,
    checkpoint_dir=model_config["checkpoint_dir"],  # None
    model_config=model_config,  # model-specific parameters
    num_epochs=num_epochs,
    num_gpus=gpus_per_trial,
    report_on=report_on,  # reporting frequency
    checkpoint_on=report_on,  # checkpointing frequency if different from reporting freq
),
```
To the function signature detector, it will look like there is no checkpoint_dir parameter anymore, since you have already assigned it. If you remove that line, it might already work.
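A rough sketch of the effect, assuming the detector inspects the signature of the function it is handed. The train_ptl_checkpoint stub and its parameters here are stand-ins for your code, and the wrapper is only an analogy for what binding checkpoint_dir up front does:

```python
import inspect

# Stand-in for models.trainers.train_ptl_checkpoint
def train_ptl_checkpoint(config, checkpoint_dir=None, model_config=None):
    pass

# Binding checkpoint_dir up front yields a wrapper whose signature no
# longer exposes it, so a signature check comes up empty:
def wrapped(config):
    return train_ptl_checkpoint(config, checkpoint_dir=None, model_config={})

print("checkpoint_dir" in inspect.signature(train_ptl_checkpoint).parameters)  # True
print("checkpoint_dir" in inspect.signature(wrapped).parameters)  # False -> warning
```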