For exists(), more precise error handling. #757

Open - wants to merge 12 commits into main

Conversation

@kasuteru commented Jul 20, 2023

This PR improves the error-handling logic of s3fs.S3FileSystem.exists() and fixes #750.

The first change removes the generic except Exception block, so that e.g. a ConnectionError now correctly raises instead of returning False when querying a bucket.

The second change logs a warning when querying the existence of a bucket that the user does not have access to; this is important since the bucket might actually exist. This is in line with Amazon's proposed behavior, see here.

For more information on this, see related issue #750.
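
To make the intended behavior concrete, here is a minimal sketch of the error handling described above, written against a plain botocore client. It is not the code in this PR; the function name, the use of head_bucket, and the mapping of error codes are illustrative assumptions.

import logging

from botocore.exceptions import ClientError

logger = logging.getLogger("s3fs")


def bucket_exists(s3_client, bucket):
    # Sketch only: map the HEAD response onto the three cases discussed in
    # this PR instead of swallowing everything with a bare ``except Exception``.
    # Anything not handled below (e.g. an EndpointConnectionError caused by a
    # bad endpoint_url) propagates to the caller instead of becoming False.
    try:
        s3_client.head_bucket(Bucket=bucket)
        return True
    except ClientError as e:
        code = e.response["Error"]["Code"]
        if code == "404":
            # We were allowed to check, and the bucket does not exist.
            return False
        if code == "403":
            # The bucket may exist, but we are not allowed to see it.
            logger.warning(
                "Bucket %s doesn't exist or you don't have access to it.", bucket
            )
            return False
        raise

In the PR itself the logic lives in S3FileSystem.exists() and, as the later comments show, works with the translated FileNotFoundError / PermissionError exceptions rather than with raw ClientError codes.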

@kasuteru (Author)

@martindurant this implements solution 2) mentioned in #750. I am happy to change it to another solution, as discussed.

@martindurant (Member)

I think 2) is the right choice.

@martindurant (Member)

Can we make a test which surfaces the ConnectionError? Perhaps just set the endpoint_url to something silly.
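
For illustration, such a test could look roughly like the sketch below; the endpoint URL, test name, and broad exception type are assumptions rather than the PR's actual test.

import pytest
import s3fs


def test_exists_surfaces_connection_error():
    # Point the filesystem at an endpoint that cannot be reached; with the
    # generic except gone, exists() should surface the connection problem
    # instead of returning False.
    fs = s3fs.S3FileSystem(
        client_kwargs={"endpoint_url": "http://localhost:1"},
        skip_instance_cache=True,  # avoid reusing a cached, working instance
    )
    with pytest.raises(Exception):  # placeholder for the concrete connection error
        fs.exists("s3://some-bucket/")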

@kasuteru (Author)

I will try to think of something, and also attempt to add tests for the other changes.

@kasuteru (Author)

I created test coverage for two of the three new cases:

  • A bucket that does not exist but that we have access to check (raises FileNotFoundError, which we catch).
  • A ConnectionError, as requested, standing in for all other errors where we cannot even check whether a bucket exists. This is what this PR is trying to fix (currently, False is returned instead).

However, I cannot figure out how to create good test coverage for the third path, the PermissionError that is raised when the endpoint prohibits listing buckets. I am unsure how to create an S3 file system locally that disallows listing the available buckets (which this test would require). The best I found is to trigger the PermissionError by providing wrong credentials:

import s3fs


def test_exists_bucket_nonexistent_or_no_access(caplog):
    # Ensure that a warning is logged and False is returned if querying a
    # bucket that might not exist or might be private.
    fs = s3fs.S3FileSystem(key="asdfasfsdf", secret="sadfdsfds")
    assert not fs.exists("s3://this-bucket-might-not-exist/")
    assert caplog.records[0].levelname == "WARNING"
    assert "doesn't exist or you don't have access" in caplog.text

If this is good enough, it is already in the PR. But I am happy to improve this one.

@martindurant (Member)

Yes, that test is fine - it is, of course, failing on permissions against real AWS in this case (moto does not enforce credentials, which is why you couldn't test this with moto).

Review thread on s3fs/tests/test_s3fs.py (outdated; resolved)

def test_exists_bucket_nonexistent(s3, caplog):
    # Ensure that NO warning is raised and False is returned if checking bucket
    # existence when we have full access.
@martindurant (Member)

Well, the bucket might still exist, but it's not one we can see. We happen to know for the sake of the test that it doesn't exist.

@kasuteru (Author)

Not sure whether this is true for all S3 systems, but at least for our internal systems I get a PermissionError when trying to access a bucket I cannot see (because I don't have permission to see it), but a FileNotFoundError, as in this test, when I would have permission to see it but it simply doesn't exist. This is why, with this PR's change, the warning is only raised for PermissionError and not for FileNotFoundError. Should I change it to log the warning in both cases?

Review thread on s3fs/tests/test_s3fs.py (outdated; resolved)
@kasuteru commented Jul 24, 2023

I finally managed to set up pre-commit; I had run into SSL errors before. So hopefully the PR should no longer fail because of that. I also added the requested changes. My tests pass locally, but I don't currently have the setup to test all Python versions.

kasuteru and others added 2 commits July 25, 2023 14:54
Co-authored-by: Martin Durant <martindurant@users.noreply.github.com>
@kasuteru (Author)

Merged the suggested change but removed the duplicate endpoint_url (it was raising a different error). In addition, the other test now clears the warnings cache as well - that was the reason for the test failure.

@kasuteru (Author)

Forgot to run pre-commit; this is now fixed... But I am honestly not sure where the warnings are coming from - I even installed Python 3.8 locally and ran the tests there, and they pass...

The problem is the "Unclosed client session" warning, but I am unable to reproduce how s3fs.exists() triggers it...


@martindurant (Member)

The following may remove the warnings:

--- a/s3fs/core.py
+++ b/s3fs/core.py
@@ -541,6 +541,12 @@ class S3FileSystem(AsyncFileSystem):
     @staticmethod
     def close_session(loop, s3):
         if loop is not None and loop.is_running():
+            try:
+                loop = asyncio.get_event_loop()
+                loop.create_task(s3.__aexit__(None, None, None))
+                return
+            except RuntimeError:
+                pass
             try:
                 sync(loop, s3.__aexit__, None, None, None, timeout=0.1)

@martindurant (Member)

Your failures look like genuine problems.

For the client-close warnings (which are not errors), #760 will fix this.

@kasuteru (Author)

If we don't manage to fix the warnings issue, I could also remove the warning from the PR entirely. While I think it would be beneficial to warn users, it would already be an improvement if we got rid of the generic except Exception clause. Let's see whether #760 changes things.

Successfully merging this pull request may close these issues: S3FileSystem.exists returns False instead of error if credentials are wrong.