branch-4.1: [Feature](Iceberg) Implement publish_changes procedure for Iceberg tables #58755 #61358
Open
xylaaaaa wants to merge 1 commit into apache:branch-4.1
Contributor
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Pull request overview
Implements the Iceberg publish_changes EXECUTE action in FE to support the Write-Audit-Publish (WAP) workflow, and adds regression coverage plus Docker preinstalled data to exercise the behavior in external Iceberg tests.
Changes:
- Add FE action `publish_changes` that locates a snapshot by `wap.id` and cherry-picks it into the current table state.
- Register `publish_changes` in the Iceberg execute-action factory.
- Add external regression tests (and expected output) plus a Spark preinstalled SQL script to set up WAP/non-WAP Iceberg tables.
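The lookup step described above can be sketched in isolation. This is a minimal in-memory model, not the code in `IcebergPublishChangesAction.java`: the `Snapshot` class, `findByWapId`, and `sampleSnapshots` are hypothetical names introduced for illustration, and the error text is borrowed from the regression test expectations in this PR.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; these are not the Doris or Iceberg classes.
class WapLookupSketch {
    // Summary key under which a staged snapshot records its WAP ID.
    static final String WAP_ID_KEY = "wap.id";

    static final class Snapshot {
        final long id;
        final Map<String, String> summary;

        Snapshot(long id, Map<String, String> summary) {
            this.id = id;
            this.summary = summary;
        }
    }

    // Returns the first snapshot whose summary carries the requested wap.id,
    // or fails with the message the regression tests expect.
    static Snapshot findByWapId(List<Snapshot> snapshots, String wapId) {
        for (Snapshot s : snapshots) {
            if (wapId.equals(s.summary.get(WAP_ID_KEY))) {
                return s;
            }
        }
        throw new IllegalArgumentException("Cannot find snapshot with wap.id = " + wapId);
    }

    // Illustrative data: one ordinary snapshot and one staged WAP snapshot.
    static List<Snapshot> sampleSnapshots() {
        return Arrays.asList(
                new Snapshot(100L, Collections.<String, String>emptyMap()),
                new Snapshot(101L, Collections.singletonMap(WAP_ID_KEY, "test_wap_001")));
    }

    public static void main(String[] args) {
        Snapshot staged = findByWapId(sampleSnapshots(), "test_wap_001");
        System.out.println("staged snapshot id = " + staged.id);
    }
}
```

A table that never staged anything under the requested ID falls through to the same "Cannot find snapshot" error, which is consistent with what the negative test for the non-WAP table expects.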
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| regression-test/suites/external_table_p0/iceberg/action/test_iceberg_execute_actions.groovy | Adds WAP publish_changes happy-path + negative regression cases |
| regression-test/data/external_table_p0/iceberg/action/test_iceberg_execute_actions.out | Updates golden output for new WAP queries |
| fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/action/IcebergPublishChangesAction.java | New FE action implementation for publish_changes |
| fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/action/IcebergExecuteActionFactory.java | Registers publish_changes as a supported Iceberg action |
| docker/thirdparties/docker-compose/iceberg/scripts/create_preinstalled_scripts/iceberg/run23.sql | Preinstalls WAP/non-WAP Iceberg tables & staged snapshot for regression testing |
import org.apache.doris.common.UserException;
import org.apache.doris.datasource.ExternalTable;
import org.apache.doris.datasource.iceberg.IcebergExternalTable;
import org.apache.doris.info.PartitionNamesInfo;
Comment on lines +551 to +640
// Test Case 6: publish_changes action with WAP (Write-Audit-Publish) pattern
// Simplified workflow:
//
// - Main branch is initially empty (0 rows)
// - A WAP snapshot exists with wap.id = "test_wap_001" and 2 rows
// - publish_changes should cherry-pick the WAP snapshot into the main branch
// =====================================================================================

logger.info("Starting simplified WAP (Write-Audit-Publish) workflow verification test")

// WAP test database and table
String wap_db = "wap_test"
String wap_table = "orders_wap"

// Step 1: Verify no data is visible before publish_changes
logger.info("Step 1: Verifying table is empty before publish_changes")
qt_wap_before_publish """
    SELECT order_id, customer_id, amount, order_date
    FROM ${catalog_name}.${wap_db}.${wap_table}
    ORDER BY order_id
"""

// Step 2: Publish the WAP changes with wap_id = "test_wap_001"
logger.info("Step 2: Publishing WAP changes with wap_id=test_wap_001")
sql """
    ALTER TABLE ${catalog_name}.${wap_db}.${wap_table}
    EXECUTE publish_changes("wap_id" = "test_wap_001")
"""
logger.info("Publish changes executed successfully")

// Step 3: Verify WAP data is visible after publish_changes
logger.info("Step 3: Verifying WAP data is visible after publish_changes")
qt_wap_after_publish """
    SELECT order_id, customer_id, amount, order_date
    FROM ${catalog_name}.${wap_db}.${wap_table}
    ORDER BY order_id
"""

logger.info("Simplified WAP (Write-Audit-Publish) workflow verification completed successfully")

// Negative tests for publish_changes

// publish_changes on table without write.wap.enabled = true (should fail)
test {
    String nonWapDb = "wap_test"
    String nonWapTable = "orders_non_wap"

    sql """
        ALTER TABLE ${catalog_name}.${nonWapDb}.${nonWapTable}
        EXECUTE publish_changes("wap_id" = "test_wap_001")
    """
    exception "Cannot find snapshot with wap.id = test_wap_001"
}

// publish_changes with missing wap_id (should fail)
test {
    sql """
        ALTER TABLE ${catalog_name}.${db_name}.${table_name}
        EXECUTE publish_changes ()
    """
    exception "Missing required argument: wap_id"
}

// publish_changes with invalid wap_id (should fail)
test {
    sql """
        ALTER TABLE ${catalog_name}.${wap_db}.${wap_table}
        EXECUTE publish_changes("wap_id" = "non_existing_wap_id")
    """
    exception "Cannot find snapshot with wap.id = non_existing_wap_id"
}

// publish_changes with partition specification (should fail)
test {
    sql """
        ALTER TABLE ${catalog_name}.${db_name}.${table_name}
        EXECUTE publish_changes ("wap_id" = "test_wap_001") PARTITIONS (part1)
    """
    exception "Action 'publish_changes' does not support partition specification"
}

// publish_changes with WHERE condition (should fail)
test {
    sql """
        ALTER TABLE ${catalog_name}.${db_name}.${table_name}
        EXECUTE publish_changes ("wap_id" = "test_wap_001") WHERE id > 0
    """
    exception "Action 'publish_changes' does not support WHERE condition"
}
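The negative cases above imply a small set of argument checks. The following is a sketch of how such checks could be ordered, assuming a hypothetical `validate` helper; it does not reproduce the FE action's actual validation code, and only the error strings are taken from the test expectations.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical validation helper; names and signature are illustrative only.
class PublishChangesArgsSketch {
    // Returns the wap_id if the request is acceptable, otherwise throws with
    // one of the messages asserted by the regression tests.
    static String validate(Map<String, String> args, boolean hasPartitions, boolean hasWhere) {
        String wapId = args.get("wap_id");
        if (wapId == null || wapId.isEmpty()) {
            throw new IllegalArgumentException("Missing required argument: wap_id");
        }
        if (hasPartitions) {
            throw new IllegalArgumentException(
                    "Action 'publish_changes' does not support partition specification");
        }
        if (hasWhere) {
            throw new IllegalArgumentException(
                    "Action 'publish_changes' does not support WHERE condition");
        }
        return wapId;
    }

    public static void main(String[] args) {
        Map<String, String> ok = Collections.singletonMap("wap_id", "test_wap_001");
        System.out.println(validate(ok, false, false));
    }
}
```

The order of the checks is an assumption. Each negative test violates exactly one rule, so the tests do not pin down which error wins when several rules are broken at once.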
INSERT INTO orders_wap VALUES
(1, 103, 150.00, '2025-12-03'),
(2, 104, 320.25, '2025-12-04');
Contributor
Author
run buildall
…bles (apache#58755)

- **Issue Number**: part of apache#58199
- **Related PR**: N/A

Problem Summary:

This PR implements the `publish_changes` action for Iceberg tables. This action serves as the "Publish" step in the Write-Audit-Publish (WAP) pattern. The procedure locates a snapshot tagged with a specific `wap.id` property and cherry-picks it into the current table state. This allows users to atomically make "staged" data visible after validation.

Syntax:

```sql
ALTER TABLE catalog.db.table_name EXECUTE publish_changes("wap_id" = "batch_123");
```

Output:

Returns `previous_snapshot_id` (STRING) and `current_snapshot_id` (STRING) indicating the state transition.

Use cases:

1. Implement Write-Audit-Publish (WAP) workflows.
2. Atomically publish validated data to the main branch.
3. Manage staged snapshots based on custom WAP IDs.

Co-authored-by: Chenjunwei <chenjunwei@ChenjunweideMacBook-Pro.local>
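The `previous_snapshot_id` / `current_snapshot_id` output described above can be illustrated with a toy state transition. This sketch assumes a hypothetical `Table` holder and a `publish` helper that simply makes the staged snapshot current; a real Iceberg cherry-pick may instead commit a new snapshot with its own ID, so treat the IDs here as illustrative only.

```java
// Hypothetical model of the state transition reported by publish_changes.
class PublishResultSketch {
    static final class Table {
        long currentSnapshotId;

        Table(long currentSnapshotId) {
            this.currentSnapshotId = currentSnapshotId;
        }
    }

    // Moves the table to the staged snapshot and returns
    // {previous_snapshot_id, current_snapshot_id} as strings.
    // Simplification: the staged snapshot becomes current as-is.
    static String[] publish(Table table, long stagedSnapshotId) {
        String previous = String.valueOf(table.currentSnapshotId);
        table.currentSnapshotId = stagedSnapshotId;
        String current = String.valueOf(table.currentSnapshotId);
        return new String[] {previous, current};
    }

    public static void main(String[] args) {
        Table table = new Table(100L);
        String[] row = publish(table, 101L);
        System.out.println("previous_snapshot_id=" + row[0] + ", current_snapshot_id=" + row[1]);
    }
}
```

Both IDs are returned as strings to match the STRING column types stated in the commit message.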
cea928c to af87e5f
Contributor
Author
run buildall
Cherry-pick #58755 to branch-4.1
What problem does this PR solve?
Implements the `publish_changes` action for Iceberg tables (WAP pattern).
Cherry-pick commit `cea928c7a62` - [Feature](Iceberg) Implement publish_changes procedure for Iceberg tables (#58755)
🤖 Generated with Claude Code