
CloudPath resolve/absolute/samefile/parts don't handle trailing slash consistently or like pathlib #357

Closed
indigoviolet opened this issue Jul 28, 2023 · 4 comments

@indigoviolet

In [45]: bucket = CloudPath("gs://foo")

In [46]: bucket_slash = CloudPath("gs://foo/")

In [47]: bucket.samefile(bucket_slash)
Out[47]: False

In [48]: bucket.absolute(), bucket_slash.absolute()
Out[48]: (GSPath('gs://foo'), GSPath('gs://foo/'))

In [49]: bucket.resolve(), bucket_slash.resolve()
Out[49]: (GSPath('gs://foo'), GSPath('gs://foo/'))

In [50]: bucket.parts == bucket_slash.parts
Out[50]: True

For comparison, pathlib's Path drops the trailing slash:

In [57]: from pathlib import Path

In [58]: Path("/foo/").absolute()
Out[58]: PosixPath('/foo')

@pjbull
Member

pjbull commented Jul 28, 2023

We don't do any character stripping in cloudpathlib (if you find it, it is likely a bug). This is because in most of the backends you can have both a file called s3://bucket/a and a fake virtual directory at s3://bucket/a/.

Because of this we can't safely strip the trailing slash.
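To make the collision concrete, here is a minimal sketch using a plain dict as a stand-in for a flat object store. The `store` contents and the `strip_trailing_slash` helper are hypothetical and are not part of cloudpathlib; the point is only that `a` and `a/` are separate keys, and that stripping the slash would fold them together.

```python
# Toy model of a flat object store: keys are opaque strings and "/" has no
# special meaning. This is an illustration, not how any backend is implemented.
store = {
    "a": b"contents of the object named 'a'",
    "a/": b"",  # zero-byte placeholder that consoles tend to display as a folder
    "a/b.txt": b"an object that merely looks nested",
}


def strip_trailing_slash(key: str) -> str:
    """Hypothetical normalization of the kind the maintainers avoid."""
    return key.rstrip("/")


# "a" and "a/" are both present and hold different contents.
distinct = "a" in store and "a/" in store and store["a"] != store["a/"]

# Group the original keys by their normalized form to find collisions.
collisions = {}
for key in store:
    collisions.setdefault(strip_trailing_slash(key), []).append(key)
ambiguous = {k: v for k, v in collisions.items() if len(v) > 1}

print(distinct)   # True
print(ambiguous)  # {'a': ['a', 'a/']}
```

After normalization there is no way to tell which of the two original objects a path was meant to address.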

@pjbull pjbull closed this as completed Jul 28, 2023
@indigoviolet
Author

In that case, isn't the parts method incorrect?

@jayqi
Member

jayqi commented Jul 28, 2023

What would you expect to be the correct output of parts?

Because the cloud storage services don't actually treat / as a separator, splitting on / into parts isn't something that strictly makes sense. So, unfortunately, I don't think there is an obvious right way for parts to behave. If you have any feedback here on what feels more intuitive, we'd appreciate hearing it.
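As a rough illustration of why `parts` loses the distinction, the sketch below compares pathlib's own behavior with a hypothetical `naive_parts` helper that splits a key on `/`. The helper is an assumption for illustration only and is not how cloudpathlib computes `parts`.

```python
from pathlib import PurePosixPath


def naive_parts(key: str) -> tuple:
    """Hypothetical helper: split a key on '/' and drop empty segments."""
    return tuple(segment for segment in key.split("/") if segment)


# Any split-based notion of parts cannot see the trailing slash.
with_slash = naive_parts("foo/bar/")
without_slash = naive_parts("foo/bar")
print(with_slash == without_slash)  # True

# pathlib goes further and treats the two spellings as the same path.
posix_parts_equal = PurePosixPath("/foo/bar/").parts == PurePosixPath("/foo/bar").parts
posix_paths_equal = PurePosixPath("/foo/bar/") == PurePosixPath("/foo/bar")
print(posix_parts_equal, posix_paths_equal)  # True True
```

On a filesystem this is harmless because `/foo/bar` and `/foo/bar/` can only name one entry; on an object store the two strings may name different objects, which is where the mismatch reported in this issue comes from.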

@indigoviolet
Author

IMO, the most useful thing would be to ignore the distinction between a/ and a, i.e., treat prefixes ending in the delimiter as folders, as the S3 console does.

It seems to me that a user would be more likely to prefer this behavior than one that maintains the distinction at the cost of absolute(), samefile(), etc. not working as they do in Path. Then parts can retain its current behavior (returning "folders").

I would probably introduce a method like sameprefix to capture the current behavior if someone really needed it.
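A minimal sketch of what that split could look like, written against plain URI strings rather than cloudpathlib objects. Both function names are hypothetical: `sameprefix` is the name suggested above, and `samefile_ignoring_slash` stands in for the proposed folder-style comparison. Neither exists in cloudpathlib.

```python
def sameprefix(a: str, b: str) -> bool:
    """Hypothetical strict comparison: the two URIs must match exactly."""
    return a == b


def samefile_ignoring_slash(a: str, b: str) -> bool:
    """Hypothetical lenient comparison: trailing delimiters are ignored."""
    return a.rstrip("/") == b.rstrip("/")


bucket = "gs://foo"
bucket_slash = "gs://foo/"

print(sameprefix(bucket, bucket_slash))               # False
print(samefile_ignoring_slash(bucket, bucket_slash))  # True
```

The lenient comparison matches what pathlib users expect, while the strict one remains available for the case where `a` and `a/` really are different objects.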


3 participants