Obtain metadata field at once in file system's IO connectors #21551

@damccorm

This task involves refactoring and improving the file-metadata methods of the IO connectors (GcsIO, S3IO, BlobIO, hadoop).

Currently, we have individual methods such as size, last_updated, checksum, and others. Each one makes an HTTP request to fetch its specific metadata field. If a caller needs several metadata fields, each of these methods has to be called, which results in multiple requests under the hood. Yet the HTTP response already contains multiple file metadata fields; each call collects only one of them and discards the rest.

We should have a public method that returns a named tuple containing multiple file metadata fields. Its implementation should make only one request, just as each existing single-field method does.
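A minimal sketch of the idea, not the actual connector code: `FileStatus`, `FakeStorageClient`, `SketchIO`, and `status` are hypothetical names invented for illustration, and the fake client only counts simulated requests instead of talking to GCS, S3, or Azure. It assumes the backend returns all fields in one response, as described above.

```python
from typing import NamedTuple


class FileStatus(NamedTuple):
    """Hypothetical named tuple holding the fields mentioned in this issue."""
    size: int
    last_updated: float
    checksum: str


class FakeStorageClient:
    """Stand-in for a remote object store; counts simulated HTTP requests."""

    def __init__(self, objects):
        self._objects = objects
        self.request_count = 0

    def get_object_metadata(self, path):
        # One simulated round trip returns every metadata field at once.
        self.request_count += 1
        return dict(self._objects[path])


class SketchIO:
    """Hypothetical connector: single-field methods delegate to status()."""

    def __init__(self, client):
        self._client = client

    def status(self, path):
        # Exactly one request; keep all fields instead of discarding them.
        raw = self._client.get_object_metadata(path)
        return FileStatus(
            size=raw["size"],
            last_updated=raw["last_updated"],
            checksum=raw["checksum"],
        )

    def size(self, path):
        return self.status(path).size

    def last_updated(self, path):
        return self.status(path).last_updated

    def checksum(self, path):
        return self.status(path).checksum


path = "gs://bucket/file.txt"
client = FakeStorageClient(
    {path: {"size": 1024, "last_updated": 1650000000.0, "checksum": "abc123"}}
)
io = SketchIO(client)

# Current pattern: three single-field calls, three requests.
io.size(path)
io.last_updated(path)
io.checksum(path)
requests_separate = client.request_count

# Proposed pattern: one call, one request, all fields available.
client.request_count = 0
st = io.status(path)
requests_combined = client.request_count
```

Keeping the single-field methods as thin wrappers around the combined call would preserve the existing public interface while letting callers that need several fields pay for only one request.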

 

Imported from Jira BEAM-14393. Original Jira may contain additional context.
Reported by: yihu.
