Refactored FilesystemStore to use fsspec to support additional remote filesystems #2927
Conversation
Signed-off-by: Travis Addair <tgaddair@gmail.com>
Unit Test Results: 792 files ±0, 792 suites ±0, 6h 8m 13s ⏱️ ±0s. For more details on these errors, see this check. Results for commit 7a69711, compared against base commit 7a69711. ♻️ This comment has been updated with latest results.
horovod/spark/common/store.py (Outdated)

```python
@staticmethod
def _get_fs_and_protocol(url):
    protocol, path = split_protocol(url)
    fs = fsspec.filesystem(protocol)
```
Can storage_options be provided as a second argument (`fsspec.filesystem(protocol, **storage_options)`)? This would help provide authentication options to fsspec.
horovod/spark/common/store.py (Outdated)

```python
def __init__(self, prefix_path, *args, **kwargs):
    self._fs = pa.LocalFileSystem()
    super(LocalStore, self).__init__(prefix_path, *args, **kwargs)
    self._fs, self.protocol = FilesystemStore._get_fs_and_protocol(prefix_path)
```
Can this be added?

```python
self.storage_options = kwargs['storage_options'] if 'storage_options' in kwargs else dict()
self._fs, self.protocol = FilesystemStore._get_fs_and_protocol(prefix_path, self.storage_options)
```

Also, adding a storage_options dict param to HorovodEstimator (EstimatorParams?) should help RemoteTrainers pass it through to Petastorm's make_reader and make_batch_reader calls.
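A minimal, runnable sketch of the constructor change being suggested: pop `storage_options` out of `**kwargs` before delegating to the parent. The class names here are stand-ins, not the PR's actual classes; `dict.get`/`dict.pop` with a default is equivalent to the reviewer's conditional expression.

```python
class StoreBase:
    """Stand-in for the PR's base Store class."""
    def __init__(self, prefix_path, *args, **kwargs):
        self.prefix_path = prefix_path


class FilesystemStoreSketch(StoreBase):
    def __init__(self, prefix_path, *args, **kwargs):
        # Equivalent to:
        #   kwargs['storage_options'] if 'storage_options' in kwargs else dict()
        # pop() also keeps the key from leaking into the parent constructor.
        self.storage_options = kwargs.pop('storage_options', {})
        super().__init__(prefix_path, *args, **kwargs)
        # The PR would then do:
        #   self._fs, self.protocol = FilesystemStore._get_fs_and_protocol(
        #       prefix_path, self.storage_options)


store = FilesystemStoreSketch('s3://bucket/prefix',
                              storage_options={'anon': True})
```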
I am testing this change with ADLS; I will update with results.
Thanks @umashankark, I hope to be able to follow-up on your suggestions soon (maybe this weekend). Just got a few other things I need to take care of first. Thanks for your patience!
Getting this error only when Estimator(save_runs=True). It happens in the second epoch; it looks like fs.put() in Store.sync_fn could not overwrite the existing folder contents. Checking whether fs.put() is expected to overwrite.
In FilesystemStore, Store.sync_fn():

```python
# snip
self.fs.put(local_run_path, run_path, recursive=True, overwrite=True)
```

Passing overwrite=True takes care of the error. Will share the minor changes done.
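Why overwrite=True matters here: the first sync copies the run directory, but the second epoch syncs into a path that already exists. The same semantics can be reproduced locally with `shutil.copytree` (its `dirs_exist_ok` flag, Python 3.8+, plays the role of `overwrite=True`); this sketch is an analogy, not the PR's code.

```python
import shutil
import tempfile
from pathlib import Path

src, dst = tempfile.mkdtemp(), tempfile.mkdtemp()
Path(src, 'events.out').write_text('epoch 1')

# Without overwrite semantics, syncing into an existing directory fails,
# analogous to fs.put() erroring on the second epoch:
try:
    shutil.copytree(src, dst)
except FileExistsError:
    pass

# With overwrite allowed, repeated syncs succeed and replace the files:
shutil.copytree(src, dst, dirs_exist_ok=True)
Path(src, 'events.out').write_text('epoch 2')
shutil.copytree(src, dst, dirs_exist_ok=True)

content = Path(dst, 'events.out').read_text()  # 'epoch 2'
```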
Signed-off-by: Travis Addair <tgaddair@gmail.com>
@chongxiaoc @irasit please take a look. Should hopefully not break anything on the Uber side.
LGTM.
…l remote filesystems (horovod#2927)" This reverts commit 7a69711.
…l remote filesystems (horovod#2927)" This reverts commit 7a69711. Signed-off-by: Chongxiao Cao <chongxiaoc@uber.com>
This adds support for cloud storage systems such as S3, GCS, and ADLS in the Spark Estimators.
Fixes #2905.
Fixes #1732.