[tune] get checkpoints paths for a trial after tuning #6643
Can one of the admins verify this patch? |
Test PASSed. |
cc @richardliaw Is this close to what you mentioned? Thanks. |
""" | ||
from ray.tune.checkpoint_manager import Checkpoint

checkpoints = trial.checkpoint_manager.best_checkpoints()
This is odd because the user may not have access to the Trial object. Maybe we can load from the path instead?
Oh, I thought this could be the next step after get_best_trial, or after just using self.trials. We can load checkpoints from disk given the trial logdir.
Shall we support both? That is, trial can be either a Trial or a path, and we take a different approach according to its type.
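A minimal sketch of the "support both" idea, for illustration only. `StubTrial` and `resolve_trial_logdir` are hypothetical names and are not part of this PR; the only assumption about a trial object is that it exposes a `logdir` attribute.

```python
import os


class StubTrial:
    """Hypothetical stand-in for a Tune trial; only `logdir` is modeled."""

    def __init__(self, logdir):
        self.logdir = logdir


def resolve_trial_logdir(trial):
    """Return the log directory for either a logdir path or a trial object."""
    if isinstance(trial, str):
        # Offline analysis: the caller only has the path on disk.
        return os.path.expanduser(trial)
    logdir = getattr(trial, "logdir", None)
    if logdir is None:
        raise ValueError(
            "trial must be a logdir path or an object with a logdir attribute")
    # Live experiment: the trial object already knows its logdir.
    return logdir
```

With this shape, the rest of the lookup only ever deals with a path, so the same code serves both live experiments and offline analysis.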
Also I assume this should be a static method that can operate without trials in memory
Ah, sorry for the slow reply. Supporting both would be ideal. get_best_trial is only available for live experiments (not offline analysis).
done
Thanks for the comments. Updated to find checkpoints from logdir. |
Test FAILed. |
cf6741b to 9c8c850
Sorry for the late reply. Updated to use the checkpoint file. The part that gets checkpoints by path has many hard-coded strings. Shall we move that code to TrainableUtil, for reuse and easier change management?
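To make the hard-coded-strings concern concrete, here is a rough sketch of keeping the naming convention in one place. It assumes checkpoints live in `checkpoint_<iteration>` directories directly under the trial logdir; that layout, `CHECKPOINT_DIR_PATTERN`, and `find_checkpoint_dirs` are assumptions for illustration, not the code in this PR or in TrainableUtil.

```python
import os
import re

# Assumed naming convention, defined once so a layout change touches one line.
CHECKPOINT_DIR_PATTERN = re.compile(r"^checkpoint_(\d+)$")


def find_checkpoint_dirs(logdir):
    """Return (path, iteration) pairs for checkpoint dirs, sorted by iteration."""
    found = []
    for name in os.listdir(logdir):
        match = CHECKPOINT_DIR_PATTERN.match(name)
        path = os.path.join(logdir, name)
        # Skip files and directories that do not follow the convention.
        if match and os.path.isdir(path):
            found.append((path, int(match.group(1))))
    return sorted(found, key=lambda item: item[1])
```

Sorting on the parsed integer avoids the lexical ordering problem where `checkpoint_10` would come before `checkpoint_2`.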
Test FAILed. |
Yes, this sounds like a good idea to me! |
Arguments:
    trial (Trial): The log directory of a trial, or a trial instance.
    metric (str): key for trial info to return, e.g. "mean_accuracy".
        "training_iteration" is used by default.
"training_iteration" is used by default. | |
"training_iteration" is used by default. |
Test FAILed. |
Test FAILed. |
Add a new API in Analysis to fetch the checkpoint paths for a trial.
Checks
scripts/format.sh
to lint the changes in this PR.