[train][2.7][1/n] cherry-picks for documentations, tests, examples #39105
Merged
…ject#38938) Signed-off-by: Matthew Deng <matt@anyscale.com>
…amples (Python 3.7)` (ray-project#38923) Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
…ding Ray AIR examples)` (ray-project#38940) Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: Matthew Deng <matt@anyscale.com> Co-authored-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
…#38918) Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
…y-project#38905) Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
…rain Integration GPU Tests and Examples (ray-project#38910) Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com> Signed-off-by: Justin Yu <justinvyu@anyscale.com> Co-authored-by: Justin Yu <justinvyu@anyscale.com>
…python: Lightning 2.0 Train GPU tests (ray-project#38903) Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com> Signed-off-by: Yunxuan Xiao <xiaoyunxuan1998@gmail.com>
…earn` trainers, checkpoints + tests (ray-project#38959) Signed-off-by: Justin Yu <justinvyu@anyscale.com>
…ay-project#38915) Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
…ts (ray-project#38965) Signed-off-by: Matthew Deng <matt@anyscale.com>
…t#38932) Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
…ay-project#38895) Signed-off-by: Kai Fricke <kai@anyscale.com> Co-authored-by: matthewdeng <matt@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
…roject#39020) Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com> Signed-off-by: matthewdeng <matt@anyscale.com>
Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
….6. (ray-project#38794) Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Fixes multinode tests by using the new train.report() API. Signed-off-by: Kai Fricke <kai@anyscale.com>
The new storage path no longer creates "empty" checkpoints by default. Previously, when no checkpoint had been saved, PAUSEing a trial would create a dummy checkpoint containing only trial metadata (such as the iteration number). This is no longer the case: examples now have to implement checkpointing to properly restore previous state. This was also true previously, but some of our simple examples (e.g. the one in this PR) didn't implement it and still "worked". I think it's fine to keep the functionality as is and require our examples to show checkpointing implementations. This will ensure that users don't shoot themselves in the foot when trying to use e.g. BOHB.

Separately, BOHB was malfunctioning: trials were repeatedly PAUSED and restarted because they were never removed from `bracket.trials_to_unpause`. @justinvyu mentioned this in the review where it was introduced, and I believed at the time that removing them wasn't necessary. It turns out it is, as we can end up in a situation where a bracket never finishes because trials are constantly running.

This was not caught by any tests. We should add one in a follow-up; for now we can proceed with this PR so it can be picked onto Ray 2.7.
Signed-off-by: Kai Fricke <kai@anyscale.com>
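To illustrate why an example that never saves a checkpoint loses its state on PAUSE, here is a minimal sketch in plain Python. It does not use Ray: `CountingTrainable`, `pause_and_resume`, and `run` are hypothetical names made up for this illustration, and the pause logic is an assumption that only mimics the behavior described above (no dummy checkpoint is written on pause).

```python
import json
import os
import tempfile


class CountingTrainable:
    """Hypothetical stand-in for a trial; not a Ray Tune class."""

    def __init__(self):
        self.iteration = 0

    def step(self):
        # One training iteration; the only state is the iteration counter.
        self.iteration += 1
        return {"iteration": self.iteration}

    def save_checkpoint(self, checkpoint_dir):
        # Persist the state needed to continue after a pause.
        path = os.path.join(checkpoint_dir, "state.json")
        with open(path, "w") as f:
            json.dump({"iteration": self.iteration}, f)
        return path

    def load_checkpoint(self, checkpoint_dir):
        with open(os.path.join(checkpoint_dir, "state.json")) as f:
            self.iteration = json.load(f)["iteration"]


def pause_and_resume(trial, checkpoint_dir, save):
    """Simulate PAUSE followed by a restart (assumed behavior).

    No dummy checkpoint is created: if the trial did not save one,
    the restarted trial begins from scratch.
    """
    if save:
        trial.save_checkpoint(checkpoint_dir)
    resumed = CountingTrainable()  # the old in-memory state is gone
    if os.path.exists(os.path.join(checkpoint_dir, "state.json")):
        resumed.load_checkpoint(checkpoint_dir)
    return resumed


def run(save):
    """Train 3 steps, pause, resume, train 1 more step; return the iteration."""
    with tempfile.TemporaryDirectory() as checkpoint_dir:
        trial = CountingTrainable()
        for _ in range(3):
            trial.step()
        resumed = pause_and_resume(trial, checkpoint_dir, save)
        return resumed.step()["iteration"]
```

In this sketch, `run(save=True)` continues counting from the saved iteration, while `run(save=False)` restarts at the first iteration, which is the failure mode a scheduler that pauses trials (such as BOHB) would expose in an example without checkpointing.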
Signed-off-by: Yunxuan Xiao <yunxuanx@anyscale.com> Signed-off-by: Yunxuan Xiao <xiaoyunxuan1998@gmail.com>
…t#39023) Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Matthew Deng <matt@anyscale.com>
This PR fixes rllib-related tests that failed after changes related to the new storage context. Signed-off-by: Kai Fricke <kai@anyscale.com> Signed-off-by: matthewdeng <matt@anyscale.com> Co-authored-by: matthewdeng <matt@anyscale.com>
…ium)` (ray-project#39081) Signed-off-by: Justin Yu <justinvyu@anyscale.com>
@matthewdeng DCO checks are failing :)
zhe-thoughts
approved these changes
Aug 30, 2023
This qualifies for picking
Leaving to @GeneDer to make sure tests pass before picking. Thanks!
Why are these changes needed?
Related issue number
Checks
Sign off every commit in this PR (`git commit -s`).
Run `scripts/format.sh` to lint the changes in this PR.
For a method added in Tune, add it in `doc/source/tune/api/` under the corresponding `.rst` file.