How to partition by quantity with write_to_dataset  #14725

@phpsxg

Description

Describe the usage question you have. Please include as many useful details as possible.

When I call write_to_dataset and partition by a field, the following error is reported: pyarrow.lib.ArrowInvalid: Fragment would be written into 32768 partitions. This exceeds the maximum of 1024. Is there a better way to partition? Or can the data be partitioned by quantity instead? Thanks.

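import pyarrow.parquet as pq

# table, file_path, and cls.partition_cols are defined elsewhere in the calling code
pq.write_to_dataset(
    table,
    root_path=file_path,
    partition_cols=cls.partition_cols,
    existing_data_behavior='delete_matching',
    use_legacy_dataset=False,
)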
  • python=3.10
  • pyarrow=10.0.0

Component

Python


Labels: Component: Python, Status: stale-warning, Type: usage
