Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[train] New persistence mode: Tune tests + examples (small) (batch 2) #38818

Merged
merged 20 commits into from
Aug 25, 2023

Conversation

justinvyu
Copy link
Contributor

@justinvyu justinvyu commented Aug 24, 2023

Why are these changes needed?

This PR fixes a bug in uploading files to storage with an exclude pattern, and updates many unit tests to work in the new codepath.

Related issue number

#38570

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
… suites

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Comment on lines -287 to -290
def test_syncer_sync_up_down_custom(temp_data_dirs):
"""Check that syncing up and down works"""
tmp_source, tmp_target = temp_data_dirs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We no longer use syncer.sync_down, so I removed all of these tests.

Comment on lines +53 to +55
root_path: Root path to strip when matching with the exclude pattern.
Ex: root_path="/tmp/a/b/c", exclude=["*a*"], will exclude
/tmp/a/b/c/_a_.txt but not ALL of /tmp/a/*.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was bugged from before:

if name="checkpoint_anything", our fnmatch exclude on the full path would exclude the entire experiment directory.

exclude=["*/checkpoint_*"] would match /storage/path/checkpoint_anything/, rather than /storage/path/checkpoint_anything/*/checkpoint_*.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm thinking about it, this exclude is not really needed since we write the checkpoint directly to storage, and we have no control over what the user names their files in the trial dir. Still may be useful in the future for whitelisting, so let's keep it for now.

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
…persistence/ci/tune2

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Copy link
Contributor

@krfricke krfricke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@ericl
Copy link
Contributor

ericl commented Aug 24, 2023

Merge conflicts.

@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Aug 24, 2023
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Justin Yu <justinvyu@anyscale.com>
…persistence/ci/tune2

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
@justinvyu justinvyu added tests-ok The tagger certifies test failures are unrelated and assumes personal liability. and removed @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. labels Aug 25, 2023
@justinvyu
Copy link
Contributor Author

@zhe-thoughts this one's good to merge
@matthewdeng / @ericl to stamp as well

Copy link
Collaborator

@zhe-thoughts zhe-thoughts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Necessary Train change for 2.7

@zhe-thoughts zhe-thoughts merged commit eac65b1 into ray-project:master Aug 25, 2023
60 of 64 checks passed
@justinvyu justinvyu deleted the air/persistence/ci/tune2 branch August 25, 2023 18:46
arvind-chandra pushed a commit to lmco/ray that referenced this pull request Aug 31, 2023
…2) (ray-project#38818)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
LeonLuttenberger pushed a commit to jaidisido/ray that referenced this pull request Sep 5, 2023
…2) (ray-project#38818)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
jimthompson5802 pushed a commit to jimthompson5802/ray that referenced this pull request Sep 12, 2023
…2) (ray-project#38818)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Jim Thompson <jimthompson5802@gmail.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
…2) (ray-project#38818)

Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Signed-off-by: Victor <vctr.y.m@example.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ray 2.7 tests-ok The tagger certifies test failures are unrelated and assumes personal liability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants