Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[air] Add test for remote_storage with real hdfs backend. #31940

Merged
merged 18 commits into from Jan 31, 2023

Conversation

xwjiang2010
Copy link
Contributor

@xwjiang2010 xwjiang2010 commented Jan 25, 2023

Signed-off-by: xwjiang2010 xwjiang2010@gmail.com

Why are these changes needed?

Fix get_file_and_path logic in remote_storage to take into account hdfs uri with namenode case.
set up a hdfs environment in CI to test integration like this.
dataset tests can also potentially benefit from this.

Related issue number

Closes #31673

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
@xwjiang2010
Copy link
Contributor Author

Let me see if the ci runs fine...

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
@xwjiang2010
Copy link
Contributor Author

Maybe related is https://github.com/ray-project/ray/pull/30611/files. Still clarifying..

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
update ssh-kengen command.

fix a few typos.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Copy link
Contributor

@krfricke krfricke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few questions regarding the install-hdfs.sh script. I'm also happy to take a look!

Let's put the test setup into a fixture so we can re-use it across tests

ci/env/install-hdfs.sh Show resolved Hide resolved
python/ray/air/tests/test_remote_storage_hdfs.py Outdated Show resolved Hide resolved
python/ray/air/tests/test_remote_storage_hdfs.py Outdated Show resolved Hide resolved
@krfricke krfricke added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jan 28, 2023
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
@xwjiang2010 xwjiang2010 removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jan 28, 2023
Copy link
Contributor

@krfricke krfricke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Just minor naming nits

ci/env/install-hdfs.sh Show resolved Hide resolved
python/ray/air/tests/test_remote_storage_hdfs.py Outdated Show resolved Hide resolved
python/ray/air/tests/test_remote_storage_hdfs.py Outdated Show resolved Hide resolved
python/ray/air/tests/test_remote_storage_hdfs.py Outdated Show resolved Hide resolved
.buildkite/pipeline.build.yml Outdated Show resolved Hide resolved
xwjiang2010 and others added 3 commits January 30, 2023 11:25
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
@xwjiang2010
Copy link
Contributor Author

https://buildkite.com/ray-project/oss-ci-build-pr/builds/10913#018608ac-a154-477f-ab78-ad9acce3501c
hdfs test is passing. Failing tests are not related.

@xwjiang2010 xwjiang2010 merged commit 5cf61f0 into ray-project:master Jan 31, 2023
@xwjiang2010 xwjiang2010 deleted the hdfs branch January 31, 2023 17:45
clarng pushed a commit to clarng/ray that referenced this pull request Jan 31, 2023
…t#31940)

* [air] Add test for remote_storage with real hdfs backend.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* try a different syntax.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* change `install-hdfs.sh` permission.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* -hdfs in air tests.

update ssh-kengen command.

fix a few typos.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* test_env=

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* cat hdfs_env

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* move `PATH` as well to a separate file.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* setting env vars in test only.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix import

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* nit

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix fixture

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

---------

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
cadedaniel pushed a commit to cadedaniel/ray that referenced this pull request Mar 22, 2023
…t#31940)

* [air] Add test for remote_storage with real hdfs backend.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* try a different syntax.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* change `install-hdfs.sh` permission.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* -hdfs in air tests.

update ssh-kengen command.

fix a few typos.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* test_env=

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* cat hdfs_env

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* move `PATH` as well to a separate file.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* setting env vars in test only.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix import

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* nit

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix fixture

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

---------

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
edoakes pushed a commit to edoakes/ray that referenced this pull request Mar 22, 2023
…t#31940)

* [air] Add test for remote_storage with real hdfs backend.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* try a different syntax.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* change `install-hdfs.sh` permission.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* -hdfs in air tests.

update ssh-kengen command.

fix a few typos.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* test_env=

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* cat hdfs_env

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* move `PATH` as well to a separate file.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* setting env vars in test only.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix import

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* nit

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix fixture

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

---------

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Signed-off-by: Edward Oakes <ed.nmi.oakes@gmail.com>
scottsun94 pushed a commit to scottsun94/ray that referenced this pull request Mar 28, 2023
…t#31940)

* [air] Add test for remote_storage with real hdfs backend.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* try a different syntax.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* change `install-hdfs.sh` permission.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* -hdfs in air tests.

update ssh-kengen command.

fix a few typos.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* test_env=

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* cat hdfs_env

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* move `PATH` as well to a separate file.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* setting env vars in test only.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix import

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* nit

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix fixture

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

---------

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
cassidylaidlaw pushed a commit to cassidylaidlaw/ray that referenced this pull request Mar 28, 2023
…t#31940)

* [air] Add test for remote_storage with real hdfs backend.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* typo

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* try a different syntax.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* change `install-hdfs.sh` permission.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* -hdfs in air tests.

update ssh-kengen command.

fix a few typos.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* test_env=

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* cat hdfs_env

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* move `PATH` as well to a separate file.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* setting env vars in test only.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix import

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments.

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* nit

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* fix fixture

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

* address comments

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>

---------

Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[air][tune] Inconsistency when parsing hdfs path for syncing
2 participants