Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: List Bucket/IAM Edge Cases #2399

Merged
merged 1 commit into from
Feb 3, 2024

Conversation

jmcarpenter2
Copy link
Contributor

This pull request resolves an edge case when the XML fails to add objects to the ListBucket response, and correctly identifies that there are still objects under the path provided. This can be caused by slightly restricted (but still correct) IAM permissions, where you add a condition on the s3:prefix to the s3:ListBucket permissions. Specifically, this issue only occurs in special cases where there are special characters in the prefix/object path.

Relevant Issue (if applicable)

#2129 (comment)

Details

There are two changes in this pull request.

  1. We apply a prefix to the reiter string before checking if reiter is an empty directory. This resolves an Access Denied response from S3, because for some reason the full path did not include the / separator between the base prefix and the rest of the path.
  2. We handle the case where directory_empty function returns -ENOTEMPTY, which implies that while adding objects from the XML response of the ListBucket CURL command, it failed but there are still objects under the path provided. This failure mode appears to happen with the special IAM policy condition mentioned above in combination with special characters in the object or prefix path. By handling this special case we getStatCacheData from the updated dirpath instead and proceed with the s3 prefix appearing in the mount.

Demonstration of fix

Basic setup
Version of s3fs being used (s3fs --version)
1.93

Version of fuse being used (pkg-config --modversion fuse, rpm -qi fuse or dpkg -s fuse)
2.9.9

Kernel information (uname -r)
4.14.334-252.552.amzn2.x86_64

GNU/Linux Distribution, if applicable (cat /etc/os-release)

NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Toy example, based on real issue

How to run s3fs, if applicable

s3fs "david-bucket:/home/David" "/home/jovyan/work/personal" -o sigv4,iam_role=auto -o endpoint=us-east-2 -o url=https://s3.us-east-2.amazonaws.com \
    -o allow_other -o uid=1000,gid=1000 -o umask="0000" \
    -o compat_dir,complement_stat,nonempty -o dbglevel=warn -f &

s3fs syslog messages (grep s3fs /var/log/syslog, journalctl | grep s3fs, or s3fs outputs)

2024-01-12T18:14:02.304Z [ERR] s3fs.cpp:list_bucket(3402): ListBucketRequest returns with error.
2024-01-12T18:14:02.304Z [ERR] s3fs.cpp:directory_empty(1259): list_bucket returns error.
2024-01-12T18:14:02.304Z [WAN] s3fs.cpp:readdir_multi_head(3299): david_stuff/ object does not have any object under it(errno=-1),
2024-01-12T18:14:02.309Z [ERR] curl.cpp:RequestPerform(2566): HTTP response code 403, returning EPERM. Body Text: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>XXX</RequestId><HostId>XXX</HostId></Error>

IAM Policy

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "bucket0nav",
			"Action": [
				"s3:ListBucket"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::david-bucket"
			],
			"Condition": {
				"StringEquals": {
					"s3:prefix": [
						"",
						"home",
						"home/David"
					],
					"s3:delimiter": [
						"/"
					]
				}
			}
		},
		{
			"Sid": "bucket0recurse",
			"Action": [
				"s3:ListBucket"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::david-bucket"
			],
			"Condition": {
				"StringLike": {
					"s3:prefix": [
						"home/David/*"
					]
				}
			}
		},
		{
			"Sid": "writeaccess",
			"Action": [
				"s3:AbortMultipartUpload",
				"s3:DeleteObject",
				"s3:GetObject",
				"s3:GetObjectVersion",
				"s3:ListMultipartUploadParts",
				"s3:PutObject"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::david-bucket/home/David/*",
			]
		}
	]
}

We are trying to list bucket which has contents like the following

home
|------David
           |------david_stuff
                      |-------Life and Health History Survey-responses.json
                      |-------file_2.txt
                      |-------etc..

Details
It appears that the spaces in the object key Life and Health History Survey-responses.json under the david_stuff prefix is causing issues somehow in relationship with the listbucket permissions on home/David/*

To demonstrate how this fixes the issue, elaborating here on the failure mode workflow, given the example setup:

  • Mount mountpoint david-bucket:/home/David to /home/jovyan/work/personal, with prefix under it david_stuff/
  • function readdir_multi_head invoked on mount prefix /home/David
  • function directory_empty invoked on path david_stuff
  • Inside of function list_bucket, append_objects_from_xml is invoked
  • append_objects_from_xml_xe is invoked, trying to parse the Contents/Key or the CommonPrefixes/Prefix
  • Back in directory_empty, we get one of two responses.
    1. Before fix (1) mentioned above, we get result -1 on Access Denied from AWS S3. This is because the david_stuff path is missing the / prefix, which was noticed by viewing the CURL commands to AWS S3. Without the / prefix, the constrained IAM permissions cause the access denied to get returned by AWS S3
    2. After fix (1) mentioned above and before fix (2), we get result -39 which responds to -ENOTEMPTY
      • -ENOTEMPTY in this case means that, although we got the expression evaluation trying to append objects from xml. head, which is of type S3ObjList, still has objects under it. Which means that there are in fact objects under the path /david_stuff.
  • However, even though we know there are objects under the path, we return this warning and do not display the s3 prefix david_stuff
    • After fix (2) mentioned above, we instead handle -ENOTEMPTY case and now display the s3 prefix david_stuff as a directory

Copy link
Member

@ggtakec ggtakec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmcarpenter2 I'm so sorry for the late confirmation.

Thanks for found this bug, I beleive this bug is probably related to #2129.

I wrote a review, so please take a look at the comments.(I think the fix should need more little change.)

I did the testing on AWS S3 for confirmation about this bug.
I tested it by creating a file object(/dir1/dir2/file) that has no directory object(middle path objects).
I then tried mounting to /, /dir1, /dir1/dir2 and confirmed that it works.

It would be helpful if you could refer to the review and fix the code in this PR.(and please rebase to master when modifying.)

Thanks in advance for your kindness.

src/s3fs.cpp Outdated Show resolved Hide resolved
@ggtakec ggtakec merged commit 5e6f21a into s3fs-fuse:master Feb 3, 2024
18 of 19 checks passed
@ggtakec
Copy link
Member

ggtakec commented Feb 3, 2024

I merged this PR.(macos ci failure is not problem.)

@jmcarpenter2 Thank you for your contribution.

@jmcarpenter2
Copy link
Contributor Author

Incredible. Thanks for your guidance @ggtakec !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants