DataFrame API: to_parquet(partition_cols=) doesn't work as intended #20915

@damccorm

Description

Currently to_parquet accepts the partition_cols keyword argument, but it doesn't work as intended. It should partition the data by the specified columns and use dynamic destinations to write each partition to different files.
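
To make the intended behavior concrete, here is a minimal pure-Python sketch of partitioning by columns. It assumes the Hive-style `column=value` directory layout that pandas produces when `partition_cols` is passed with the pyarrow engine, and it assumes the partition columns are encoded in the path rather than kept in the written rows. The helper `partition_rows` and the sample rows are hypothetical and are not part of Beam or pandas. This is not Beam's implementation, only an illustration of how each row could map to a destination.

```python
from collections import defaultdict


def partition_rows(rows, partition_cols):
    """Group rows by their values in partition_cols (hypothetical helper).

    Returns a dict mapping a Hive-style relative directory such as
    "year=2020/country=US" to the rows destined for it. The partition
    columns are dropped from each row because the path already encodes them.
    """
    destinations = defaultdict(list)
    for row in rows:
        # Build the destination directory from the partition column values.
        directory = "/".join(f"{col}={row[col]}" for col in partition_cols)
        # Keep only the non-partition columns in the stored row.
        remaining = {k: v for k, v in row.items() if k not in partition_cols}
        destinations[directory].append(remaining)
    return dict(destinations)


# Hypothetical sample data.
rows = [
    {"year": 2020, "country": "US", "value": 1},
    {"year": 2020, "country": "DE", "value": 2},
    {"year": 2021, "country": "US", "value": 3},
    {"year": 2020, "country": "US", "value": 4},
]

partitions = partition_rows(rows, ["year", "country"])
for directory, part in sorted(partitions.items()):
    print(directory, part)
```

In a Beam pipeline, the directory computed per row would presumably act as the dynamic destination key, so that rows sharing the same partition values end up in the same set of files.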

Context: https://lists.apache.org/thread.html/ra1e647440ffb43e922d9289cbe6f59e581c00055cf7f6a71b3fab205%40%3Cuser.beam.apache.org%3E

Imported from Jira BEAM-12201. Original Jira may contain additional context.
Reported by: bhulette.
