Fix bugs in DBFSLocalStore #3510

Merged
merged 6 commits into from Apr 20, 2022

WeichenXu123
Contributor

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>

Checklist before submitting

  • Did you read the contributor guide?
  • Did you update the docs?
  • Did you write any tests to validate this change?
  • Did you update the CHANGELOG, if this change affects users?

Description

Fix bugs in DBFSLocalStore.

  1. Make DBFSLocalStore support "file:/dbfs/..." style paths.
  2. Make DBFSLocalStore implement get_localized_path correctly.
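
As a rough illustration of item 1 (this is not the code changed in this PR), the different spellings of a DBFS path can be mapped onto the local "/dbfs/..." form before touching the filesystem. The function name normalize_dbfs_path and the exact set of accepted prefixes are assumptions made for this sketch.

```python
def normalize_dbfs_path(path: str) -> str:
    """Map common DBFS path spellings onto the local form '/dbfs/...'.

    Hypothetical helper for illustration only; the real store may
    accept a different set of prefixes.
    """
    if path.startswith("file:///dbfs/"):
        # 'file:///dbfs/x' -> '/dbfs/x'
        return path[len("file://"):]
    if path.startswith("file:/dbfs/"):
        # 'file:/dbfs/x' -> '/dbfs/x'
        return path[len("file:"):]
    if path.startswith("dbfs:/"):
        # 'dbfs:/x' or 'dbfs:///x' -> '/dbfs/x'
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    if path.startswith("/dbfs/"):
        # Already in the local form.
        return path
    raise ValueError(f"not a recognized DBFS path: {path!r}")
```

With this sketch, normalize_dbfs_path("file:/dbfs/ml/run1") and normalize_dbfs_path("dbfs:/ml/run1") both give "/dbfs/ml/run1", so later existence checks and reads see a single path form.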

Review process to land

  1. All tests and other checks must succeed.
  2. At least one member of the technical steering committee must review and approve.
  3. If any member of the technical steering committee requests changes, they must be addressed.

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
@github-actions

github-actions bot commented Apr 18, 2022

Unit Test Results

   735 files   ±0       735 suites  ±0    8h 45m 4s ⏱️ +23m 5s
   768 tests   ±0       678 ✔️ ±0         90 💤 ±0       0 ❌ ±0
16 960 runs    ±0    12 195 ✔️ ±0      4 765 💤 ±0       0 ❌ ±0

Results for commit 309cae9. ± Comparison against base commit 98db066.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Apr 18, 2022

Unit Test Results (with flaky tests)

   841 files   +20       841 suites  +20    9h 14m 5s ⏱️ +20m 14s
   768 tests   ±0        677 ✔️ -1          90 💤 ±0       1 ❌ +1
19 494 runs    +248   13 815 ✔️ +208     5 677 💤 +38      2 ❌ +2

For more details on these failures, see this check.

Results for commit 309cae9. ± Comparison against base commit 98db066.

♻️ This comment has been updated with latest results.

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
@WeichenXu123
Contributor Author

@tgaddair Could you help review this PR? Thanks!

And is it possible to make a patch release after this PR is merged? Thanks!

horovod/spark/common/store.py
@@ -181,7 +181,8 @@ def __init__(self, prefix_path, train_path=None, val_path=None, test_path=None,
         super().__init__()
 
     def exists(self, path):
-        return self.fs.exists(self.get_localized_path(path)) or self.fs.isdir(path)
+        localized_path = self.get_localized_path(path)
+        return self.fs.exists(localized_path) or self.fs.isdir(localized_path)
Collaborator

The semantics of fs.isdir look like they include fs.exists, so the "or self.fs.isdir(localized_path)" part looks redundant. Can you please explain in a code comment why this bit is needed? Otherwise, someone might remove it in the future.

Contributor Author

I am not sure why "or self.fs.isdir(localized_path)" is included here. It comes from AbstractFilesystemStore; maybe other filesystems (such as HDFS) require it?
For DBFSLocalStore and the local filesystem, fs.isdir is redundant.

Contributor Author

I would like to keep the fs.isdir call in AbstractFilesystemStore.exists and let the author of AbstractFilesystemStore update it.

Collaborator

@EnricoMi EnricoMi Apr 20, 2022

You are right, I did not see that this existed before. It has been introduced here: 7a69711#diff-68621af6dea1527be13ddb5823a2d54a48a3f7b8fa98dd17cfa6ec1e3d677fccR180

However, you are changing the argument passed to self.fs.isdir from path to localized_path.

@tgaddair do you remember why you introduced the seemingly redundant isdir(path) here?

@WeichenXu123 why do you think AbstractFilesystemStore is the right place for exists to use the localized path rather than the given path? This affects all FilesystemStore implementations. Maybe this should go into DBFSLocalStore.

Contributor Author

OK, I made DBFSLocalStore override the "exists" method, and I restored the exists method in AbstractFilesystemStore.
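
A minimal sketch of that arrangement, using stand-in classes rather than the actual Horovod code: the base class keeps checking the path as given, and only the subclass that understands a prefixed path form localizes the path before the existence check. The class names BaseStore and FileSchemeStore, and the use of os.path in place of the store's fs object, are assumptions made for illustration.

```python
import os
import tempfile


class BaseStore:
    """Stand-in for a generic filesystem store (hypothetical)."""

    def get_localized_path(self, path):
        return path

    def exists(self, path):
        # Base behaviour is left untouched so other stores are not affected.
        return os.path.exists(self.get_localized_path(path)) or os.path.isdir(path)


class FileSchemeStore(BaseStore):
    """Stand-in for a store that also accepts 'file:' prefixed paths."""

    def get_localized_path(self, path):
        return path[len("file:"):] if path.startswith("file:") else path

    def exists(self, path):
        # Override only in the subclass: every check uses the localized path.
        return os.path.exists(self.get_localized_path(path))


def demo():
    """Return (subclass result, base result) for a 'file:' prefixed directory."""
    with tempfile.TemporaryDirectory() as tmp:
        prefixed = "file:" + tmp
        return FileSchemeStore().exists(prefixed), BaseStore().exists(prefixed)
```

Keeping the override in the subclass means the prefixed form is only interpreted where it is known to be valid, which is the concern raised in the review above.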

@EnricoMi
Collaborator

> and is it possible to make a patch release after this PR is merged?

I could cut a 0.24.3 release.

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
@WeichenXu123
Contributor Author

@EnricoMi I updated the PR and addressed your comments. Could you take another look? Thanks!

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Collaborator

@EnricoMi EnricoMi left a comment

LGTM

@EnricoMi EnricoMi merged commit 399e9ec into horovod:master Apr 20, 2022
@WeichenXu123
Contributor Author

@EnricoMi Thank you very much! When will the 0.24.3 release be cut? :)

EnricoMi pushed a commit that referenced this pull request Apr 21, 2022
…_path (#3510)

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
EnricoMi pushed a commit that referenced this pull request Apr 21, 2022
…_path (#3510)

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>