Skip to content

HIVE-29506: HiveRelDecorrelator fails to decorrelate plan with Values operator#6360

Open
kasakrisz wants to merge 3 commits intoapache:masterfrom
kasakrisz:HIVE-29506-master-decorrelate-values
Open

HIVE-29506: HiveRelDecorrelator fails to decorrelate plan with Values operator#6360
kasakrisz wants to merge 3 commits intoapache:masterfrom
kasakrisz:HIVE-29506-master-decorrelate-values

Conversation

@kasakrisz
Copy link
Contributor

@kasakrisz kasakrisz commented Mar 13, 2026

What changes were proposed in this pull request?

Remove the special handling of the Values operator from HiveRelDecorrelator

Why are the changes needed?

The specialization returned null, signaling the decorrelator algorithm to stop and return the original plan. Consequently, Correlate operators may remain in the plan, which Hive cannot handle.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

mvn test -Dtest.output.overwrite -Dtest=TestMiniLlapLocalCliDriver -Dqfile=empty_result_correlate2.q -pl itests/qtest -Pitests

@kasakrisz kasakrisz marked this pull request as draft March 13, 2026 13:48
@kasakrisz kasakrisz force-pushed the HIVE-29506-master-decorrelate-values branch from c1e7433 to 3331420 Compare March 17, 2026 12:04
@kasakrisz kasakrisz marked this pull request as ready for review March 17, 2026 12:10
-- To create a HiveValues operator before decorrelation automatic query rewrite with
-- materialized views is used.
-- All tables must be transactional to trigger automatic query rewrite.
create table mv_source (any_col int) stored as orc TBLPROPERTIES ('transactional'='true');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure we should be adding new tests with ACID table format, would it work with iceberg as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug should occur when decorrelating any plan with a Values node. How about adding a unit test in a new class HiveRelDecorrelatorTest instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Thomas.
I removed this q test and added a unit test focusing on testing HiveRelDecorrelator only because at this point the table and storage format doesn't matter.

Copy link
Member

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, I just have some small questions.

@sonarqubecloud
Copy link

Copy link
Contributor

@thomasrebele thomasrebele left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some minor comments.



@Test
public void testDecorrelateCorrelateIsRemovedWhenPlanHasEmptyValues() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RelOptUtil.toString(decorrelatedPlan) and RelOptUtil.toString(base) appear repeatedly. Assigning them to a variable saves a few computations.


Assert.assertFalse("Plan after decorrelation still has correlate: \n" + RelOptUtil.toString(decorrelatedPlan),
RelOptUtil.toString(decorrelatedPlan).contains("Correlate"));
Assert.assertTrue("Plan after decorrelation does not contain Values: \n" + RelOptUtil.toString(decorrelatedPlan),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend to dump the before and after plan into a comment. The plans are reasonably small and they help to understand what's going on.

.union(false)
.build();

Assert.assertTrue("Plan before decorrelation does not contain correlate: \n" + RelOptUtil.toString(base),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit confused by the messages, until I realized that they apply only in the case of an assertion error.

How about reformulating them to

  • "Plan before decorrelation must contain a Correlate node"
  • "Plan before decorrelation must contain a Values node"
  • "Plan after decorrelation may not contain a Correlate node"
  • "Plan after decorrelation must still contain a Values node"

Copy link
Member

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I left a bunch of nits that we don't necessarily need to address. However, it would be nice to treat some of the Sonar issues related to unused and wildcard imports, line length. Feel free to merge it as you see fit.



@Test
public void testDecorrelateCorrelateIsRemovedWhenPlanHasEmptyValues() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the whole test is about the decorrelator the prefix testDecorrelate is redundant and combined with the rest of the line it makes the test name hard to read.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to add Union in the test name since empty values could appear in many different places and apparently the decorrelator is not handling everything.

Comment on lines +63 to +73
Assert.assertTrue("Plan before decorrelation does not contain correlate: \n" + RelOptUtil.toString(base),
RelOptUtil.toString(base).contains("Correlate"));
Assert.assertTrue("Plan before decorrelation does not contain Values: \n" + RelOptUtil.toString(base),
RelOptUtil.toString(base).contains("Values"));

RelNode decorrelatedPlan = HiveRelDecorrelator.decorrelateQuery(base);

Assert.assertFalse("Plan after decorrelation still has correlate: \n" + RelOptUtil.toString(decorrelatedPlan),
RelOptUtil.toString(decorrelatedPlan).contains("Correlate"));
Assert.assertTrue("Plan after decorrelation does not contain Values: \n" + RelOptUtil.toString(decorrelatedPlan),
RelOptUtil.toString(decorrelatedPlan).contains("Values"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the usual (full) plan comparisons which are used when testing rules and transformations easier to follow. They may require more frequent updates but seeing the plan before and after makes the test easier to follow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants