rs-unity commented Jan 21, 2026

Problem

When a Paimon table with partition fields is created and synced to an Iceberg REST catalog, the resulting metadata includes an extra empty partition spec (spec-id: 0) alongside the actual partition spec (spec-id: 1). The REST catalog ends up with:

"partition-specs": [
  {
    "spec-id": 0,
    "fields": []
  },
  {
    "spec-id": 1,
    "fields": [
      {"name": "partition_date", "transform": "identity", ...},
      {"name": "partition_hour", "transform": "identity", ...}
    ]
  }
]

Instead of the expected:

"partition-specs": [
  {
    "spec-id": 0,
    "fields": [
      {"name": "partition_date", "transform": "identity", ...},
      {"name": "partition_hour", "transform": "identity", ...}
    ]
  }
]

Root Cause

The REST catalog integration follows this flow:

  1. createTable() creates an Iceberg table with an empty schema (to handle field ID offsets between Paimon and Iceberg)
  2. Iceberg automatically creates an empty partition spec (spec-id: 0) when creating a table with an empty schema
  3. updatesForCorrectBase() then adds the actual partition spec, which gets assigned spec-id: 1
  4. The empty spec-id: 0 remains because Iceberg doesn't allow removing partition specs once created
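
The flow above can be reproduced with a small toy model. This is only a sketch: SpecIdDemo, Spec, and Metadata are hypothetical stand-ins written for illustration and do not call Iceberg's API. The model also assumes that the default spec id stays at 0 unless something sets it explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical and do not use Iceberg's API.
final class SpecIdDemo {

    // A partition spec reduced to its id and the names of its fields.
    static final class Spec {
        final int specId;
        final List<String> fields;

        Spec(int specId, List<String> fields) {
            this.specId = specId;
            this.fields = List.copyOf(fields);
        }
    }

    // Minimal stand-in for table metadata: a list of specs plus a default id.
    static final class Metadata {
        final List<Spec> specs = new ArrayList<>();
        int defaultSpecId;

        // Steps 1-2: a table created without partition fields starts with an empty spec 0.
        static Metadata createUnpartitioned() {
            Metadata m = new Metadata();
            m.specs.add(new Spec(0, List.of()));
            m.defaultSpecId = 0;
            return m;
        }

        // Step 3: a spec added later takes the next unused id.
        // Step 4: existing specs are never removed.
        int addSpec(List<String> fields) {
            int nextId = 0;
            for (Spec s : specs) {
                nextId = Math.max(nextId, s.specId + 1);
            }
            specs.add(new Spec(nextId, fields));
            return nextId;
        }
    }

    // Replays the creation flow and returns the resulting metadata.
    static Metadata reproduce() {
        Metadata m = Metadata.createUnpartitioned();
        m.addSpec(List.of("partition_date", "partition_hour"));
        return m;
    }

    public static void main(String[] args) {
        Metadata m = reproduce();
        for (Spec s : m.specs) {
            System.out.println("spec-id " + s.specId + " fields " + s.fields);
        }
        System.out.println("default-spec-id " + m.defaultSpecId);
    }
}
```

In this model the real partition fields land in spec-id 1 while the default still points at the empty spec-id 0, which is the mismatch described under Problem.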

Solution

Since we cannot remove the empty spec-id: 0, this fix ensures that when a new table is created with an empty base spec but a non-empty partition spec in the new metadata:

  1. The new partition spec is added (it will get spec-id: 1)
  2. The new spec is explicitly set as the default partition spec using setDefaultPartitionSpec()

This ensures Iceberg readers use the correct partition spec, even though the empty spec-id: 0 remains in the metadata as a leftover of table creation.
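
A minimal sketch of that decision, assuming the updates can be described as plain strings. DefaultSpecFix and planSpecUpdates are hypothetical names, the "last-added" marker is an assumption of this sketch, and the real updatesForCorrectBase() may emit other updates that are not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: updates are plain strings, not Iceberg's update classes.
final class DefaultSpecFix {

    // Returns the partition-spec updates to send for a commit.
    static List<String> planSpecUpdates(
            boolean isNewTable,
            List<String> baseSpecFields,
            List<String> newSpecFields,
            int newDefaultSpecId) {
        List<String> updates = new ArrayList<>();
        boolean baseEmpty = baseSpecFields.isEmpty();
        boolean newNonEmpty = !newSpecFields.isEmpty();

        if (isNewTable && baseEmpty && newNonEmpty) {
            // The empty spec 0 cannot be dropped, so add the real spec
            // and make the spec that was just added the default.
            updates.add("add-spec " + newSpecFields);
            updates.add("set-default-spec last-added");
        } else {
            // Otherwise keep the default spec id from the new metadata.
            updates.add("set-default-spec " + newDefaultSpecId);
        }
        return updates;
    }

    public static void main(String[] args) {
        List<String> updates = planSpecUpdates(
                true, List.of(), List.of("partition_date", "partition_hour"), 0);
        updates.forEach(System.out::println);
    }
}
```

Pointing the default at the spec that was just added, rather than at a hard-coded id, keeps the sketch independent of which id the catalog assigns.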

Changes

  • Added detection in updatesForCorrectBase() for the case where the base has an empty partition spec but the new metadata has a non-empty spec
  • Explicitly set the newly added partition spec as the default to ensure correct behavior

Limitations

This fix doesn't remove the empty spec-id: 0 from the metadata (Iceberg doesn't support removing partition specs). However, it ensures the correct spec is used as the default, which resolves the functional issue for Iceberg readers.
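
To illustrate why a leftover empty spec is harmless once the default is correct, here is a hedged sketch of a client that resolves the table's current spec through the default spec id instead of taking the first entry. DefaultSpecLookup and its methods are hypothetical; the map stands in for the partition-specs list shown earlier.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a map from spec id to field names stands in for table metadata.
final class DefaultSpecLookup {

    // Mirrors the metadata after the fix: empty spec 0 plus the real spec 1.
    static Map<Integer, List<String>> exampleSpecs() {
        Map<Integer, List<String>> specs = new LinkedHashMap<>();
        specs.put(0, List.of());
        specs.put(1, List.of("partition_date", "partition_hour"));
        return specs;
    }

    // Resolves the current spec by id rather than by position in the list.
    static List<String> defaultSpecFields(Map<Integer, List<String>> specsById, int defaultSpecId) {
        List<String> fields = specsById.get(defaultSpecId);
        if (fields == null) {
            throw new IllegalArgumentException("Unknown spec id: " + defaultSpecId);
        }
        return fields;
    }

    public static void main(String[] args) {
        // With the default set to 1, the empty spec 0 is never selected.
        System.out.println(defaultSpecFields(exampleSpecs(), 1));
    }
}
```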

Note

Improves partition spec handling during initial table creation against the REST catalog.

  • For new tables, if the base has an empty partition spec (spec-id: 0) and the new metadata has a non-empty spec, add the new spec and set it as the default; otherwise use newMetadata.defaultSpecId()
  • Adds documentation noting that creating a table with an empty schema auto-creates an empty partition spec (spec-id: 0) that cannot be removed

Written by Cursor Bugbot for commit 9036adc.

rs-unity changed the title from "[rest] Improve partition spec handling in IcebergRestMetadataCommitter for better compatibility" to "Fix: Ensure correct partition spec is set as default in REST catalog" on Jan 21, 2026