Implement partition path generation (PartitionPathUtils) #127

@luoyuxia

Description

Parent Issue

Part of #124 (support partitioned table)
Depends on #126 (BinaryRow deserialization)

Background

Paimon stores data files under partition directories like {table_path}/dt=2024-01-01/hr=12/bucket-0/. Currently there is no code to convert partition values (from BinaryRow) into this filesystem path format.
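
To make the layout above concrete, here is a minimal sketch of how a data directory might be assembled once a partition path string exists. The function name `data_dir` and its parameters are hypothetical and not part of the existing crate; it only mirrors the `{table_path}/{partition_path}/bucket-{n}/` shape shown in the example.

```rust
/// Hypothetical helper (not an existing API in this crate): joins the table
/// path, an already generated partition path, and the bucket directory name.
fn data_dir(table_path: &str, partition_path: &str, bucket: i32) -> String {
    // Avoid a doubled separator if the table path already ends with '/'.
    let base = table_path.trim_end_matches('/');
    if partition_path.is_empty() {
        // Assumption: an unpartitioned table has no partition segment at all.
        format!("{base}/bucket-{bucket}/")
    } else {
        format!("{base}/{partition_path}/bucket-{bucket}/")
    }
}
```

The partition path argument is exactly the string this issue asks to generate; everything else here is string joining.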

What needs to be done

  1. Create a partition path utility (e.g., spec/partition_utils.rs or similar)

    • Given partition keys (names from TableSchema.partition_keys()) + partition field types (from TableSchema.fields()) + partition values (decoded BinaryRow), generate the partition path string
    • Format: {key1}={value1}/{key2}={value2}/...
  2. Handle value formatting per data type

    • Int/BigInt/SmallInt/TinyInt → integer string
    • VarChar/Char/String → string value
    • Date → yyyy-MM-dd format (the stored value is days since epoch)
    • Timestamp → appropriate timestamp string
    • Boolean → true/false
    • Float/Double → decimal string
    • Decimal → decimal string
  3. Handle special cases

    • Null partition values → use __DEFAULT_PARTITION__
    • Special characters in string values → URL encoding if needed (follow Java Paimon's behavior)
  4. Unit tests

    • Single partition key
    • Multiple partition keys
    • Different data types (string, int, date)
    • Null partition value → __DEFAULT_PARTITION__
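
The steps above could be sketched roughly as follows. This is a simplified illustration, not the proposed implementation: `PartitionValue`, `format_value`, `partition_path`, and `civil_from_days` are hypothetical names, the enum stands in for values decoded from BinaryRow, and only a few types are covered. Escaping of special characters is deliberately left out because it should follow Java Paimon's behavior, and the absence of a trailing separator is an assumption.

```rust
/// Hypothetical, simplified stand-in for a decoded partition value.
#[derive(Debug, Clone, PartialEq)]
enum PartitionValue {
    Null,
    Int(i64),
    Str(String),
    Bool(bool),
    /// Days since 1970-01-01.
    Date(i32),
}

const DEFAULT_PARTITION_NAME: &str = "__DEFAULT_PARTITION__";

/// Converts days since the Unix epoch to (year, month, day) in the
/// proleptic Gregorian calendar (the well-known "civil from days" method).
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift so the era starts on 0000-03-01
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of era, [0, 146096]
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // [0, 399]
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, [0, 365]
    let mp = (5 * doy + 2) / 153; // month index starting from March, [0, 11]
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month as u32, day as u32)
}

/// Formats one value; nulls map to the default partition name.
fn format_value(value: &PartitionValue) -> String {
    match value {
        PartitionValue::Null => DEFAULT_PARTITION_NAME.to_string(),
        PartitionValue::Int(i) => i.to_string(),
        // No escaping here: that part should mirror Java Paimon.
        PartitionValue::Str(s) => s.clone(),
        PartitionValue::Bool(b) => b.to_string(),
        PartitionValue::Date(days) => {
            let (y, m, d) = civil_from_days(i64::from(*days));
            format!("{y:04}-{m:02}-{d:02}")
        }
    }
}

/// Builds `{key1}={value1}/{key2}={value2}`; errors if the lengths differ.
fn partition_path(keys: &[&str], values: &[PartitionValue]) -> Result<String, String> {
    if keys.len() != values.len() {
        return Err(format!(
            "expected {} partition values, got {}",
            keys.len(),
            values.len()
        ));
    }
    let segments: Vec<String> = keys
        .iter()
        .zip(values)
        .map(|(key, value)| format!("{key}={}", format_value(value)))
        .collect();
    Ok(segments.join("/"))
}
```

With these definitions, keys `["dt", "hr"]` and values `Date(19723)`, `Int(12)` produce `dt=2024-01-01/hr=12`, which matches the directory example in the Background section. A real version would take the field types from the table schema rather than a self-describing enum.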

Reference

Affected files

  • New file: crates/paimon/src/spec/partition_utils.rs (or integrated into existing module)
