Globbing top-level bucket returns malformed CloudPaths #311

Closed
@ssoj13

Description

As I understand it, simple code like this should work, but it doesn't:

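from pprint import pp

from cloudpathlib import AnyPath

root = AnyPath('s3://')  # top level, expected to list buckets
gen = root.glob('*')
buckets = list(gen)
files = list(buckets[0].glob('*'))  # glob inside the first bucket
pp(files)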

The S3Path objects returned for the buckets have a malformed URL such as "s3:////bucket1".
The error happens here: https://github.com/drivendataorg/cloudpathlib/blob/master/cloudpathlib/cloudpath.py#L398
It occurs when s3:// is joined with /bucket1 using a slash in
https://github.com/drivendataorg/cloudpathlib/blob/master/cloudpathlib/client.py#L64
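
To illustrate the effect described above, here is a minimal sketch using plain strings. It does not reproduce cloudpathlib's code; `naive_join` and `safe_join` are hypothetical helpers, and the assumption is only that a prefix ending in `/` gets joined with a key starting with `/` by one more `/`:

```python
def naive_join(prefix: str, key: str) -> str:
    # Joins with a slash without checking for separators already present.
    return f"{prefix}/{key}"


def safe_join(prefix: str, key: str) -> str:
    # Drop leading slashes from the key, and add a separator only if the
    # prefix does not already end with one.
    key = key.lstrip("/")
    if prefix.endswith("/"):
        return prefix + key
    return f"{prefix}/{key}"


print(naive_join("s3://", "/bucket1"))  # s3:////bucket1
print(safe_join("s3://", "/bucket1"))   # s3://bucket1
```

Whether the real fix belongs in the join or in how the bucket name is returned by the listing is a separate question; the sketch only shows where the extra slashes can come from.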

The next problem is that these "bucket" entries do not actually have a bucket attribute set. This causes confusion internally, so the next bucket.glob('*') fails: it somehow pulls the second bucket into the first one:
raise ValueError("{!r} is not in the subpath of {!r}"
ValueError: '/bucket2' is not in the subpath of '/bucket1' OR one path is relative and the other is absolute.
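
The message above looks like the one raised by pathlib's `PurePath.relative_to` when one path is not located under the other. A minimal standard-library sketch of that behaviour follows; `is_under` is a hypothetical helper, and the assumption that cloudpathlib ends up comparing '/bucket2' against '/bucket1' this way is inferred from the traceback, not confirmed:

```python
from pathlib import PurePosixPath


def is_under(child: str, parent: str) -> bool:
    # relative_to raises ValueError when child is not located under parent.
    try:
        PurePosixPath(child).relative_to(PurePosixPath(parent))
        return True
    except ValueError:
        return False


print(is_under("/bucket1/root_folder", "/bucket1"))  # True
print(is_under("/bucket2", "/bucket1"))              # False
```

If results from a second bucket are being made relative to the first bucket's path, this is the failure one would expect to see.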

Moreover, using the library like this:
files = list(AnyPath('s3://bucket1').glob('*'))
produces:
S3Path('s3://bucket1/bucket1/root_folder')
which is obviously incorrect, with the bucket name appearing twice in the path (and a subsequent .glob('*') fails as well).
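
For anyone trying to reproduce this, a small diagnostic along these lines may help spot both symptoms in globbed results. It is only a sketch built on the standard library's `urlsplit`; `describe_uri` is a hypothetical helper, and the "bucket repeated" check is a heuristic, since a key can legitimately begin with the bucket's name:

```python
from urllib.parse import urlsplit


def describe_uri(uri: str) -> dict:
    # Split the URI into bucket (netloc) and non-empty key components.
    parsed = urlsplit(uri)
    bucket = parsed.netloc
    key_parts = [part for part in parsed.path.split("/") if part]
    return {
        "bucket": bucket,
        "key_parts": key_parts,
        # True for URIs like s3:////bucket1, where no bucket is parsed.
        "empty_bucket": bucket == "",
        # Heuristic: first key component equals the bucket name.
        "bucket_repeated": bool(key_parts) and key_parts[0] == bucket,
    }


print(describe_uri("s3:////bucket1"))
print(describe_uri("s3://bucket1/bucket1/root_folder"))
```

Running it over `str(p)` for each path returned by `glob('*')` would flag the malformed entries without touching S3 itself.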

Am I doing something horribly wrong, or is S3 support broken right now?
