Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat prefix ending with "/" as directory #219

Merged
merged 2 commits into from
Oct 1, 2021

Conversation

HiroakiMikami
Copy link
Member

Currently, S3 does not imitate directory. Thus, we have to branch code based on the filesystem when handling directories with pfio.

This PR handles common prefixes that end with "/" as directories in S3 to increase compatibility between FileSystems.

@kuenishi
Copy link
Member

kuenishi commented Oct 1, 2021

/test

@pfn-ci-bot
Copy link

Successfully created a job for commit 402486c:

@kuenishi kuenishi added this to the 2.0.0 milestone Oct 1, 2021
Copy link
Member

@kuenishi kuenishi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion for S3Stat is to try s3.list('somedir') for imitating 'stat'. It might be heavy, but will behave better.

pfio/v2/s3.py Outdated
@@ -414,8 +426,28 @@ def isdir(self, file_path: str):
returning boolean or ``None`` would be nicer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update this note as it contradicts with the new implementaiton.

@@ -398,12 +408,14 @@ def stat(self, path):
return S3ObjectStat(key, res)
except ClientError as e:
if e.response['Error']['Code'] == '404':
if self.isdir(path):
return S3PrefixStat(key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a good design - in cases like kind of a "non-existent" directory (somewhat hard to define), objects in a bucket like

  • foo/bar.txt and
  • somefile.txt

In this case, s3.isdir('foo') would be true and also false for s3.isdir('somefile.txt'), which is nice, but this code will also return true for s3.isdir('whatever-nothing'), which is just nothing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this code will also return true for s3.isdir('whatever-nothing'), which is just nothing.

I think (and I confirmed locally) that s3.isdir('whatever-nothing') returns false because this PR handles "common-prefix" as directories.

on the other hand, I think it's not the best choice in terms of performance. It always calls list-object API when the server returns 404 status code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my bad! I misread the code.

@kuenishi
Copy link
Member

kuenishi commented Oct 1, 2021

/test

@pfn-ci-bot
Copy link

Successfully created a job for commit 39f1b04:

@kuenishi kuenishi merged commit 5075030 into pfnet:master Oct 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants