-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Datasets] Add clearer actionable error message for AWS S3 credential error #26619
Conversation
cc @jianoaix and @clarkzinzow for review, and also cc @pcmoritz and @matthewdeng FYI. |
# readability. List of file paths will be shown up as ['foo', 'boo'], | ||
# so only quote single file path here. | ||
paths = f'"{paths}"' | ||
raise PermissionError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't add from None
intentionally to still keep the original error message, that still might be useful for debugging in the future, if other unknown errors happen other than credential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
@@ -2903,6 +2903,23 @@ def get_node_id(): | |||
assert sorted(ds.take()) == ["goodbye", "hello", "world"] | |||
|
|||
|
|||
def test_read_s3_file_error(ray_start_regular_shared, s3_path): | |||
dummy_path = s3_path + "_dummy" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is not an error like "not found" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file not found and missing credential have same error. File not found is easier to unit test so doing test for it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Thanks for putting this together! We should probably keep the error be an OSError, since it can be either FileNotFound or PermissionError. Once we can do more fine grained differentiation between those in the future we can switch. Let's also make sure the original error is preserved / the error code is not lost. I realize that this situation is a little tough to deal with, but we should do our best. I also submitted an upstream issue https://issues.apache.org/jira/browse/ARROW-17079, so hopefully we can improve this going forward :) In my experience, the |
except OSError as e: | ||
_handle_read_s3_files_error(e, path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: At first glance this is a bit confusing - would it make sense to rename this function to something more generic, and just add a comment within the function that only s3 handling is currently supported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthewdeng - sorry not fully get the point, do you have a suggested function name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe _handle_read_os_error
? The scenario I was thinking about was if we run into the same error when reading from GCS, what would make extending this logic the simplest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthewdeng - got it, that makes sense, updated in #26669 .
Let's refine this in a separate PR; merging to unblock. |
@pcmoritz - make sense, updated to
@pcmoritz - yes the origianl error and error code (also original stack trace) is not lost, and still printed out as part of error, per #26619 (comment) . |
…error (#26669) As a followup of #26619 (comment) and #26619 (comment), here we change from PermissionError to OSError, to be consistent as original error, and also change function name from _handle_read_s3_files_error to _handle_read_os_error, which is more general that we can handle other file systems such as GCS in the future. Also change to hanlde any error message with pattern AWS Error [code xxx]: No response body as new issue with error code 100 is raised in #26672 .
… error (ray-project#26619) In ray-project#19799, and ray-project#24184, we found when using Datasets to read S3 file, if file's credential is not set up right, the `read_xxx` API would throw confusing error message with `AWS Error [code 15]: No response body` like below: ``` Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Users/chengsu/ray/python/ray/data/read_api.py", line 758, in read_binary_files return read_datasource( File "/Users/chengsu/ray/python/ray/data/read_api.py", line 267, in read_datasource requested_parallelism, min_safe_parallelism, read_tasks = ray.get( File "/Users/chengsu/ray/python/ray/_private/client_mode_hook.py", line 105, in wrapper return func(*args, **kwargs) File "/Users/chengsu/ray/python/ray/_private/worker.py", line 2196, in get raise value.as_instanceof_cause() ray.exceptions.RayTaskError(PermissionError): ray::_get_read_tasks() (pid=80200, ip=127.0.0.1) File "pyarrow/_fs.pyx", line 439, in pyarrow._fs.FileSystem.get_file_info File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status File "pyarrow/error.pxi", line 114, in pyarrow.lib.check_status OSError: When getting information for key 'trainaasdasd' in bucket 'balajis-tiny-imagenet': AWS Error [code 15]: No response body. ``` The error message mentions nothing related to file credential, so it's quite confusing. This PR is to catch the error and give a better error message: ``` ray::_get_read_tasks() (pid=80200, ip=127.0.0.1) File "/Users/chengsu/ray/python/ray/data/read_api.py", line 1127, in _get_read_tasks reader = ds.create_reader(**kwargs) File "/Users/chengsu/ray/python/ray/data/datasource/file_based_datasource.py", line 212, in create_reader return _FileBasedDatasourceReader(self, **kwargs) File "/Users/chengsu/ray/python/ray/data/datasource/file_based_datasource.py", line 350, in __init__ self._paths, self._file_sizes = meta_provider.expand_paths( File "/Users/chengsu/ray/python/ray/data/datasource/file_meta_provider.py", line 173, in expand_paths _handle_read_s3_files_error(e, path) File "/Users/chengsu/ray/python/ray/data/datasource/file_meta_provider.py", line 342, in _handle_read_s3_files_error raise PermissionError( PermissionError: Failing to read AWS S3 file(s): "balajis-tiny-imagenet/trainaasdasd". Please check file exists and has proper AWS credential. See https://docs.ray.io/en/latest/data/creating-datasets.html#reading-from-remote-storage for more information. ``` Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
…error (ray-project#26669) As a followup of ray-project#26619 (comment) and ray-project#26619 (comment), here we change from PermissionError to OSError, to be consistent as original error, and also change function name from _handle_read_s3_files_error to _handle_read_os_error, which is more general that we can handle other file systems such as GCS in the future. Also change to hanlde any error message with pattern AWS Error [code xxx]: No response body as new issue with error code 100 is raised in ray-project#26672 . Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com>
… error (ray-project#26619) In ray-project#19799, and ray-project#24184, we found when using Datasets to read S3 file, if file's credential is not set up right, the `read_xxx` API would throw confusing error message with `AWS Error [code 15]: No response body` like below: ``` Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/Users/chengsu/ray/python/ray/data/read_api.py", line 758, in read_binary_files return read_datasource( File "/Users/chengsu/ray/python/ray/data/read_api.py", line 267, in read_datasource requested_parallelism, min_safe_parallelism, read_tasks = ray.get( File "/Users/chengsu/ray/python/ray/_private/client_mode_hook.py", line 105, in wrapper return func(*args, **kwargs) File "/Users/chengsu/ray/python/ray/_private/worker.py", line 2196, in get raise value.as_instanceof_cause() ray.exceptions.RayTaskError(PermissionError): ray::_get_read_tasks() (pid=80200, ip=127.0.0.1) File "pyarrow/_fs.pyx", line 439, in pyarrow._fs.FileSystem.get_file_info File "pyarrow/error.pxi", line 143, in pyarrow.lib.pyarrow_internal_check_status File "pyarrow/error.pxi", line 114, in pyarrow.lib.check_status OSError: When getting information for key 'trainaasdasd' in bucket 'balajis-tiny-imagenet': AWS Error [code 15]: No response body. ``` The error message mentions nothing related to file credential, so it's quite confusing. This PR is to catch the error and give a better error message: ``` ray::_get_read_tasks() (pid=80200, ip=127.0.0.1) File "/Users/chengsu/ray/python/ray/data/read_api.py", line 1127, in _get_read_tasks reader = ds.create_reader(**kwargs) File "/Users/chengsu/ray/python/ray/data/datasource/file_based_datasource.py", line 212, in create_reader return _FileBasedDatasourceReader(self, **kwargs) File "/Users/chengsu/ray/python/ray/data/datasource/file_based_datasource.py", line 350, in __init__ self._paths, self._file_sizes = meta_provider.expand_paths( File "/Users/chengsu/ray/python/ray/data/datasource/file_meta_provider.py", line 173, in expand_paths _handle_read_s3_files_error(e, path) File "/Users/chengsu/ray/python/ray/data/datasource/file_meta_provider.py", line 342, in _handle_read_s3_files_error raise PermissionError( PermissionError: Failing to read AWS S3 file(s): "balajis-tiny-imagenet/trainaasdasd". Please check file exists and has proper AWS credential. See https://docs.ray.io/en/latest/data/creating-datasets.html#reading-from-remote-storage for more information. ``` Signed-off-by: Stefan van der Kleij <s.vanderkleij@viroteq.com>
…error (ray-project#26669) As a followup of ray-project#26619 (comment) and ray-project#26619 (comment), here we change from PermissionError to OSError, to be consistent as original error, and also change function name from _handle_read_s3_files_error to _handle_read_os_error, which is more general that we can handle other file systems such as GCS in the future. Also change to hanlde any error message with pattern AWS Error [code xxx]: No response body as new issue with error code 100 is raised in ray-project#26672 . Signed-off-by: Stefan van der Kleij <s.vanderkleij@viroteq.com>
Why are these changes needed?
In #19799, and #24184, we found when using Datasets to read S3 file, if file's credential is not set up right, the
read_xxx
API would throw confusing error message withAWS Error [code 15]: No response body
like below:The error message mentions nothing related to file credential, so it's quite confusing. This PR is to catch the error and give a better error message:
Related issue number
Closes #19799 and #24184
Checks
scripts/format.sh
to lint the changes in this PR.