New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bugs in DBFSLocalStore #3510
Conversation
Unit Test Results (with flaky tests) 841 files + 20 841 suites +20 9h 14m 5s ⏱️ + 20m 14s For more details on these failures, see this check. Results for commit 309cae9. ± Comparison against base commit 98db066. ♻️ This comment has been updated with latest results. |
@tgaddair Could you help review this PR ? thanks! and is it possible to make a patch release after this PR merged ? Thanks! |
horovod/spark/common/store.py
Outdated
@@ -181,7 +181,8 @@ def __init__(self, prefix_path, train_path=None, val_path=None, test_path=None, | |||
super().__init__() | |||
|
|||
def exists(self, path): | |||
return self.fs.exists(self.get_localized_path(path)) or self.fs.isdir(path) | |||
localized_path = self.get_localized_path(path) | |||
return self.fs.exists(localized_path) or self.fs.isdir(localized_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantics of fs.isdir
looks like it includes fs.exists
, so the or self.fs.isdir(localized_path)
looks redundant. Can you please specify in a comment why this bit is needed? Otherwise, someone might remove it in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why it includes or self.fs.isdir(localized_path)
here, it is in AbstractFilesystemStore
, maybe other filesystem (such as HDFS) require this ?
For DBFSLocal and local fs, fs.isdir
is redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to keep calling fs.isdir
in AbstractFilesystemStore.exists
and let author of AbstractFilesystemStore
to update it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I did not see that this existed before. It has been introduced here: 7a69711#diff-68621af6dea1527be13ddb5823a2d54a48a3f7b8fa98dd17cfa6ec1e3d677fccR180
However, you are changing the argument of calling into self.isdir
from path
to localized_path
.
@tgaddair do you remember why you introduced the seemingly redundand isdir(path)
here?
@WeichenXu123 why do you think AbstractFilesystemStore
is the right place for exists
to use the localized path rather than the given path? This affects all FilesystemStore implementations. Maybe this should go into DBFSLocalStore
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I made DBFSLocalStore
override "exists" method.
and I restored the exists
method in AbstractFilesystemStore
I could cut a 0.24.3 release. |
@EnricoMi PR updated and addressed your comments. Could you take another look ? Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@EnricoMi Thank you very much ! When will the 0.24.3 release be cut ? :) |
…_path (#3510) Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
…_path (#3510) Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Weichen Xu weichen.xu@databricks.com
Checklist before submitting
Description
Fix bugs in DBFSLocalStore.
get_localized_path
correctly.Review process to land