Description
As I understand it, simple code like this is supposed to work, but it doesn't:
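from pprint import pp
from cloudpathlib import AnyPath

root = AnyPath('s3://')  # plain string; the f-prefix had no placeholders
gen = root.glob('*')
buckets = list(gen)
files = list(buckets[0].glob('*'))
pp(files)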
The resulting bucket S3Path objects carry a malformed URL such as "s3:////bucket1".
The error happens here: https://github.com/drivendataorg/cloudpathlib/blob/master/cloudpathlib/cloudpath.py#L398
It happens when s3:// gets joined with /bucket1 via a slash in https://github.com/drivendataorg/cloudpathlib/blob/master/cloudpathlib/client.py#L64
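To illustrate how the extra slashes can appear, here is a minimal sketch using plain string operations. Both `naive_join` and `safe_join` are hypothetical helpers written for this issue, not cloudpathlib code; the sketch only assumes that the prefix ends with `://` and the key starts with `/`, as described above.

```python
def naive_join(prefix: str, key: str) -> str:
    """Hypothetical: join with a slash without normalizing either side."""
    return prefix + "/" + key


def safe_join(prefix: str, key: str) -> str:
    """Hypothetical: keep the scheme intact and collapse stray slashes.

    Assumes `prefix` contains "://".
    """
    scheme, sep, rest = prefix.partition("://")
    # Drop empty segments so a bare "s3://" prefix does not add slashes.
    parts = [p for p in (rest.strip("/"), key.strip("/")) if p]
    return scheme + sep + "/".join(parts)


print(naive_join("s3://", "/bucket1"))  # s3:////bucket1
print(safe_join("s3://", "/bucket1"))   # s3://bucket1
```

Whether the real fix belongs in the join itself or in how the root prefix is represented is for the maintainers to decide; the sketch only shows the string arithmetic behind the four slashes.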
The next problem is that these "bucket" entries don't actually have a bucket attribute set. This confuses the internals, so the next bucket.glob('*')
fails: it somehow pulls the 2nd bucket into the 1st one:
    raise ValueError("{!r} is not in the subpath of {!r}"
ValueError: '/bucket2' is not in the subpath of '/bucket1' OR one path is relative and the other is absolute.
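The wording of this message matches what the standard library's `pathlib` raises from `relative_to`, so the failure can presumably be reproduced without S3 at all. The sketch below assumes the library ends up computing the path of one bucket relative to another; `relative_or_none` is a hypothetical helper for illustration only.

```python
from pathlib import PurePosixPath


def relative_or_none(child: str, parent: str):
    """Return `child` relative to `parent`, or None if it is not beneath it."""
    try:
        return PurePosixPath(child).relative_to(PurePosixPath(parent))
    except ValueError:
        # Raised when `child` is not located under `parent`.
        return None


# A sibling bucket is not a subpath of the first bucket.
print(relative_or_none("/bucket2", "/bucket1"))
# A key inside the bucket resolves as expected.
print(relative_or_none("/bucket1/root_folder", "/bucket1"))
```

If this is indeed the cause, the glob over the first bucket is receiving entries that belong to the account root rather than to that bucket.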
Moreover, using the library like this:
files = list(AnyPath('s3://bucket1').glob('*'))
produces the following: S3Path('s3://bucket1/bucket1/root_folder')
which is obviously incorrect, since the bucket name appears twice in the path (and a subsequent .glob('*') on that result fails as well).
Am I doing something horribly wrong, or is S3 support broken right now?