[Feature Request]: Add a Beam utility for efficiently copying files across file-systems #38312

@chamikaramj

Description

What would you like to happen?

There are scenarios where we need to copy a large number of files between file systems, for example, from S3 to GCS.

We can currently do this with a custom composite transform (fileio.Match -> fileio.ReadFiles -> a custom sink implementation that copies bytes from source to destination without parsing the file format), but it would be good to have a standard, efficient utility in Beam for this.
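
For reference, a minimal sketch of that workaround in the Python SDK, where the corresponding transforms are `fileio.MatchFiles` and `fileio.ReadMatches`. The helper names (`copy_stream`, `copy_one`, `run`), the 8 MiB chunk size, and the choice to flatten everything into one destination directory are assumptions made for illustration, not an existing Beam utility. Compression is forced to `UNCOMPRESSED` on both sides because the default `AUTO` setting would otherwise decompress or compress files based on their extension, which is not a byte-for-byte copy.

```python
CHUNK_SIZE = 8 * 1024 * 1024  # assumed buffer size; tune per file system


def copy_stream(src, dst, chunk_size=CHUNK_SIZE):
    """Copy raw bytes between two binary file-like objects.

    Returns the number of bytes copied. The content is never parsed.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def copy_one(readable_file, dest_dir):
    """Copy one matched file into dest_dir (hypothetical helper).

    Only the base name is kept, so the source directory layout is flattened.
    """
    # Imported lazily so copy_stream stays usable without Beam installed.
    from apache_beam.io.filesystem import CompressionTypes
    from apache_beam.io.filesystems import FileSystems

    src_path = readable_file.metadata.path
    _, name = FileSystems.split(src_path)
    dest_path = FileSystems.join(dest_dir, name)
    # UNCOMPRESSED on both ends keeps the copy byte-for-byte.
    with readable_file.open(
        mime_type="application/octet-stream",
        compression_type=CompressionTypes.UNCOMPRESSED,
    ) as src, FileSystems.create(
        dest_path, compression_type=CompressionTypes.UNCOMPRESSED
    ) as dst:
        copy_stream(src, dst)
    return dest_path


def run(source_pattern, dest_dir):
    """Build and run the Match -> Read -> copy pipeline (sketch only)."""
    import apache_beam as beam
    from apache_beam.io import fileio

    with beam.Pipeline() as p:
        _ = (
            p
            | "Match" >> fileio.MatchFiles(source_pattern)
            | "Read" >> fileio.ReadMatches()
            # Redistribute files across workers before the copy step.
            | "Reshuffle" >> beam.Reshuffle()
            | "Copy" >> beam.Map(copy_one, dest_dir)
        )
```

Something like `run("s3://source-bucket/data/*", "gs://dest-bucket/data")` would then copy every matched file, with both bucket names being placeholders. Each file is still streamed through a single worker, so a built-in utility would likely need to add things this sketch leaves out, such as preserving relative paths, retries, and splitting very large files.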

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
