HIVE-28127: Exception when rebuilding materialized view with calculated columns on iceberg sources #5143
What changes were proposed in this pull request?
The incremental materialized view rebuild plan includes a step in which the AST built from the optimized Calcite plan is transformed into an AST that represents a multi-insert statement. This happens when the MV definition has an aggregate and no records were deleted from the source tables since the last rebuild. To update existing values in the MV, one branch of the multi insert writes delete delta files and another writes insert delta files. For Iceberg source tables, the branch that writes the delete delta needs the old values from the MV. If the MV definition has aggregate functions whose return value can be calculated from other projected expressions, those columns are not projected from the MV in the incremental rebuild plan.
Example MV from mv_iceberg_orc5.q
In the AST built from the Calcite plan, the branch scanning the MV does not project avg, but it is referenced later in the delete branch of the multi-insert plan (see the select expression $hdt$_0._c4).
Why are the changes needed?
The Iceberg delete writers need the old values coming from the MV, but the current plan does not project the calculated column, so an exception is thrown at compile time when the Hive operator plan is generated.
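To make the requirement concrete, here is a minimal toy model of the delete/insert pair written for one updated group. It is only a sketch and not Hive or Iceberg code: the `MvRow` type, its column names, and `merge_incremental` are hypothetical names introduced for illustration. It assumes an MV with sum, count, and avg columns, where avg is derivable from the other two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MvRow:
    """Hypothetical MV row: avg is derivable from total / cnt."""
    key: str
    total: int   # stands in for sum(x)
    cnt: int     # stands in for count(x)
    avg: float   # stands in for avg(x)


def merge_incremental(old: MvRow, delta_total: int, delta_cnt: int):
    """Return (delete_record, insert_record) for one updated group.

    Assumption for this sketch: the delete branch must emit the complete
    old row, so every old column has to be available, including avg.
    """
    new_total = old.total + delta_total
    new_cnt = old.cnt + delta_cnt
    # Delete branch: carries all old values, including the derivable avg.
    delete_record = old
    # Insert branch: avg is recomputed from the merged sum and count.
    insert_record = MvRow(old.key, new_total, new_cnt, new_total / new_cnt)
    return delete_record, insert_record


old_row = MvRow(key="a", total=10, cnt=2, avg=5.0)
deleted, inserted = merge_incremental(old_row, delta_total=8, delta_cnt=2)
```

In this model the insert branch never needs the old avg, which is why leaving it out of the MV scan looks safe. The delete branch still reads it, which mirrors the missing projection described above.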
Does this PR introduce any user-facing change?
No.
Is the change a dependency upgrade?
No.
How was this patch tested?