branch-4.1: [feature](iceberg) Support static partition overwrite for Iceberg tables #58396 #59951 #61372
Merged
morningman merged 4 commits into apache:branch-4.1 on Mar 16, 2026
…les (apache#58396)

### What problem does this PR solve?

### Proposed changes

This PR implements static partition overwrite functionality for Iceberg external tables, allowing users to precisely overwrite specific partitions using the `INSERT OVERWRITE ... PARTITION (col='value', ...)` syntax.

### Background

Before this PR, Doris supported:

- ✅ `INSERT INTO` with dynamic partition for Iceberg tables
- ✅ `INSERT OVERWRITE` for full table replacement
- ❌ `INSERT OVERWRITE ... PARTITION (...)` for static partition overwrite

### New Features

1. **Full Static Partition Mode**: Overwrite a specific partition when all partition columns are specified

   ```sql
   INSERT OVERWRITE TABLE iceberg_db.tbl
   PARTITION (dt='2025-01-25', region='bj')
   SELECT id, name FROM source_table;
   ```

2. **Hybrid Partition Mode**: Partial static + partial dynamic partition

   ```sql
   -- dt is static, region comes from SELECT dynamically
   INSERT OVERWRITE TABLE iceberg_db.tbl
   PARTITION (dt='2025-01-25')
   SELECT id, name, region FROM source_table;
   ```

### Implementation Details

#### FE Changes

- **Parser** (`DorisParser.g4`, `LogicalPlanBuilder.java`): Extended partition spec parsing to support `PARTITION (col='value', ...)` syntax
- **InsertPartitionSpec**: New unified data structure to represent partition modes (auto-detect, dynamic, static)
- **UnboundIcebergTableSink**: Added `staticPartitionKeyValues` field to carry static partition info
- **BindSink**: Added validation for static partition columns and generation of constant expressions for static partition values
- **IcebergTransaction**: Implemented `commitStaticPartitionOverwrite()` using Iceberg's `OverwriteFiles.overwriteByRowFilter()` API
- **IcebergUtils**: Added `parsePartitionValueFromString()` utility for partition value type conversion

#### BE Changes

- **VIcebergTableWriter**:
  - Support full static partition mode (all data goes to single partition)
  - Support hybrid partition mode (static columns from config, dynamic columns from data)
  - Added `_is_full_static_partition` and `_dynamic_partition_column_indices` for mode detection

#### Thrift Changes

- Added `static_partition_values` field to `TIcebergTableSink` for passing static partition info from FE to BE

(cherry picked from commit 8a974d4)
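To make the two partition modes above concrete, here is a minimal, self-contained sketch of how a sink could be classified from the `PARTITION (...)` clause. It is an illustration only, not the Doris code: the class `PartitionModeSketch`, the enum `Mode`, and the methods `classify` and `dynamicColumns` are hypothetical names, and partition values are kept as plain strings rather than expressions.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; names and types are assumptions, not the Doris implementation.
class PartitionModeSketch {
    enum Mode { DYNAMIC, HYBRID, FULL_STATIC }

    // Classify an insert by how many partition columns the PARTITION clause fixes.
    static Mode classify(List<String> partitionColumns, Map<String, String> staticValues) {
        // Every key in the PARTITION clause must name a real partition column.
        for (String name : staticValues.keySet()) {
            if (!partitionColumns.contains(name)) {
                throw new IllegalArgumentException("Not a partition column: " + name);
            }
        }
        if (staticValues.isEmpty()) {
            return Mode.DYNAMIC;
        }
        // All partition columns fixed: every row goes to a single partition.
        return staticValues.size() == partitionColumns.size() ? Mode.FULL_STATIC : Mode.HYBRID;
    }

    // Partition columns whose values must still come from the query output.
    static List<String> dynamicColumns(List<String> partitionColumns, Map<String, String> staticValues) {
        return partitionColumns.stream()
                .filter(c -> !staticValues.containsKey(c))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> cols = List.of("dt", "region");
        System.out.println(classify(cols, Map.of("dt", "2025-01-25", "region", "bj")));
        System.out.println(classify(cols, Map.of("dt", "2025-01-25")));
        System.out.println(dynamicColumns(cols, Map.of("dt", "2025-01-25")));
    }
}
```

In the hybrid case the remaining dynamic columns (`region` in the example) are the ones the `SELECT` list still has to supply.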
…se column count validation (apache#59951)

- Related PR: apache#58396

### What problem does this PR solve?

Related: apache#58396

## Problem

When using `INSERT OVERWRITE` with static partition syntax and `VALUES` clause, the operation incorrectly failed with:

```
Column count doesn't match value count. Expected: N, but got: M
```

### Example that was failing:

```sql
-- Table has 2 columns: id (int), par (string) - partitioned by par
INSERT OVERWRITE TABLE test_partition_branch PARTITION (par='a') VALUES (11), (12);
```

**Error**: Column count doesn't match value count. Expected: 2, but got: 1

### Root Cause

The column count validation in `InsertUtils.normalizePlan()` did not account for static partition columns. When using `PARTITION (col='value')` syntax:

- The partition column value is **already fixed** in the PARTITION clause
- The VALUES should **only provide non-partition column values**
- This is standard SQL behavior (Hive, Iceberg, etc.)

The validation was comparing VALUES count against **all** table columns instead of **non-partition** columns only.

## Solution

Modified `InsertUtils.java:363-372` to:

1. Detect when the sink is `UnboundIcebergTableSink`
2. Extract static partition columns from `staticPartitionKeyValues`
3. Filter out static partition columns from the column list before validation
4. Only compare VALUES count against non-partition columns

```java
if (unboundLogicalSink instanceof UnboundIcebergTableSink
        && CollectionUtils.isEmpty(unboundLogicalSink.getColNames())) {
    UnboundIcebergTableSink<?> icebergSink = (UnboundIcebergTableSink<?>) unboundLogicalSink;
    Map<String, Expression> staticPartitions = icebergSink.getStaticPartitionKeyValues();
    if (staticPartitions != null && !staticPartitions.isEmpty()) {
        Set<String> staticPartitionColNames = staticPartitions.keySet();
        columns = columns.stream()
                .filter(column -> !staticPartitionColNames.contains(column.getName()))
                .collect(ImmutableList.toImmutableList());
    }
}
```

(cherry picked from commit e572788)
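The effect of the fix can be reproduced outside Doris with a small standalone sketch. It assumes columns are plain strings and uses hypothetical names (`ValuesColumnCountSketch`, `expectedValueColumns`, `checkRow`); it mirrors the filtering idea above and is not the actual `InsertUtils` code.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch of the column count check; not the Doris implementation.
class ValuesColumnCountSketch {
    // Columns a VALUES row must supply: all table columns minus the statically fixed ones.
    static List<String> expectedValueColumns(List<String> tableColumns, Set<String> staticPartitionCols) {
        return tableColumns.stream()
                .filter(c -> !staticPartitionCols.contains(c))
                .collect(Collectors.toList());
    }

    // Throws if the row width does not match the number of non-static columns.
    static void checkRow(List<String> tableColumns, Set<String> staticPartitionCols, List<?> row) {
        int expected = expectedValueColumns(tableColumns, staticPartitionCols).size();
        if (row.size() != expected) {
            throw new IllegalArgumentException("Column count doesn't match value count. Expected: "
                    + expected + ", but got: " + row.size());
        }
    }

    public static void main(String[] args) {
        List<String> table = List.of("id", "par");
        // PARTITION (par='a') VALUES (11): accepted once 'par' is filtered out.
        checkRow(table, Set.of("par"), List.of(11));
        // Without the filter (empty static set), the same row is rejected.
        try {
            checkRow(table, Set.of(), List.of(11));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Passing an empty static set reproduces the pre-fix behavior, where the single-value row is compared against both table columns.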
Cherry-pick #58396 #59951 to branch-4.1
### What problem does this PR solve?

Support static partition overwrite for Iceberg tables on `branch-4.1`, including the follow-up fix for `INSERT OVERWRITE ... PARTITION (...) VALUES ...` column count validation.

### Cherry-pick commit

- `dc1708e44d9` - feature: Support static partition overwrite for Iceberg tables (#58396)
- `99b3274505c` - Fix: Fix static partition INSERT OVERWRITE with VALUES clause column count validation (#59951)
- `486baa6bb11` - test: Clean static partition overwrite regression files