Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6109] Avoid extra loops when optimizing statements with ternary expressions #3518

Closed
wants to merge 1 commit into from

Conversation

asdfgh19
Copy link

jira: CALCITE-6109

When there are statements with ternary expressions in BlockBuilder#statements, the optimize method always returns true, causing the loop in the toBlock method to always be executed 10 times.

The reason is that when OptimizeShuttle traverses the statement, a new instance of TernaryExpression will always be created, regardless of whether the optimization is actually performed.

@@ -152,6 +154,25 @@ void checkAssignInConditionOptimizedOut(int modifiers, String s) {
assertThat(Expressions.toString(builder.toBlock()), is(s));
}

@Test void testInlineTernaryNonOptimized() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's customary to add a Javadoc comment indicating the issue which is being solved. Please follow the format of the many examples existing, the tools are very particular about the format, including the final period.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's customary to add a Javadoc comment indicating the issue which is being solved. Please follow the format of the many examples existing, the tools are very particular about the format, including the final period.

Thanks for your suggestion, I have added Javadoc comment on the test case.

@mihaibudiu
Copy link
Contributor

This seems to affect the output of at least one Druid test.

@JiajunBernoulli JiajunBernoulli added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Nov 13, 2023
@JiajunBernoulli
Copy link
Contributor

Failed tests are not related to changes.

Please rerun the CI.

@JiajunBernoulli JiajunBernoulli removed the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Nov 13, 2023
@asdfgh19
Copy link
Author

Failed tests are not related to changes.

Please rerun the CI.

Thanks for helping to review the code.

@JiajunBernoulli
Copy link
Contributor

Oops! Druid Tests Failed again.

  1. https://github.com/apache/calcite/actions/runs/6840930774/job/18600708940
  2. https://github.com/apache/calcite/actions/runs/6845913981/job/18611830618?pr=3518
org.apache.calcite.test.DruidAdapter2IT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() failure marker
FAILURE   0.2sec, org.apache.calcite.test.DruidAdapter2IT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics()
    java.lang.AssertionError: 
    Expected: "store_state=CA; brand_name=King; A=21.4632\nstore_state=OR; brand_name=Symphony; A=32.176\nstore_state=CA; brand_name=Toretti; A=32.2465\nstore_state=WA; brand_name=King; A=34.6104\nstore_state=OR; brand_name=Toretti; A=36.3"
         but: was "store_state=OR; brand_name=ADJ; A=83.8764\nstore_state=WA; brand_name=Akron; A=85.8402\nstore_state=OR; brand_name=American; A=86.7898\nstore_state=WA; brand_name=ADJ; A=97.6488\nstore_state=CA; brand_name=ADJ; A=98.0076"
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
        at org.apache.calcite.test.CalciteAssert.lambda$checkResult$6(CalciteAssert.java:453)
        at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:582)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1495)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1434)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1493)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1483)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returnsOrdered(CalciteAssert.java:1509)
        at org.apache.calcite.test.DruidAdapter2IT.testInterleaveBetweenAggregateAndGroupOrderByOnMetrics(DruidAdapter2IT.java:2025)
        Suppressed: org.apache.calcite.util.TestUtil$ExtraInformation: With materializationsEnabled=false, limit=-1, sql=select "store_state", "brand_name", "A" from (
          select sum("store_sales")-sum("store_cost") as a, "store_state", "brand_name"
          from "foodmart"
          group by "store_state", "brand_name" ) subq
        order by "A" limit 5
            at app//org.apache.calcite.util.TestUtil.rethrow(TestUtil.java:389)
            at app//org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:598)
            ... 6 more

FAILURE  28.0sec,  210 completed,   1 failed,   6 skipped, org.apache.calcite.test.DruidAdapter2IT
org.apache.calcite.test.DruidAdapterIT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() failure marker
FAILURE   0.2sec, org.apache.calcite.test.DruidAdapterIT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics()
    java.lang.AssertionError: 
    Expected: "store_state=CA; brand_name=King; A=21.4632\nstore_state=OR; brand_name=Symphony; A=32.176\nstore_state=CA; brand_name=Toretti; A=32.2465\nstore_state=WA; brand_name=King; A=34.6104\nstore_state=OR; brand_name=Toretti; A=36.3"
         but: was "store_state=OR; brand_name=ADJ; A=83.8764\nstore_state=WA; brand_name=Akron; A=85.8402\nstore_state=OR; brand_name=American; A=86.7898\nstore_state=WA; brand_name=ADJ; A=97.6488\nstore_state=CA; brand_name=ADJ; A=98.0076"
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
        at org.apache.calcite.test.CalciteAssert.lambda$checkResult$6(CalciteAssert.java:453)
        at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:582)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1495)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1434)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1493)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1483)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returnsOrdered(CalciteAssert.java:1509)
        at org.apache.calcite.test.DruidAdapterIT.testInterleaveBetweenAggregateAndGroupOrderByOnMetrics(DruidAdapterIT.java:2[336](https://github.com/apache/calcite/actions/runs/6845913981/job/18611830618?pr=3518#step:8:343))
        Suppressed: org.apache.calcite.util.TestUtil$ExtraInformation: With materializationsEnabled=false, limit=-1, sql=select "store_state", "brand_name", "A" from (
          select sum("store_sales")-sum("store_cost") as a, "store_state", "brand_name"
          from "foodmart"
          group by "store_state", "brand_name" ) subq
        order by "A" limit 5
            at app//org.apache.calcite.util.TestUtil.rethrow(TestUtil.java:389)
            at app//org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:598)
            ... 6 more

          3.4sec, org.apache.calcite.test.DruidAdapterIT > testFilterClauseWithNoConjunction()
          2.6sec, org.apache.calcite.test.DruidAdapterIT > testSelectTimestampColumnNoTables4()
FAILURE  37.3sec,  235 completed,   1 failed,   6 skipped, org.apache.calcite.test.DruidAdapterIT
FAILURE  38.2sec,  453 completed,   2 failed,  12 skipped, Gradle Test Run :druid:test

@asdfgh19 asdfgh19 force-pushed the CALCITE-6109 branch 2 times, most recently from 6fa88a3 to 70c3a62 Compare November 13, 2023 06:54
@asdfgh19
Copy link
Author

Oops! Druid Tests Failed again.

  1. https://github.com/apache/calcite/actions/runs/6840930774/job/18600708940
  2. https://github.com/apache/calcite/actions/runs/6845913981/job/18611830618?pr=3518
org.apache.calcite.test.DruidAdapter2IT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() failure marker
FAILURE   0.2sec, org.apache.calcite.test.DruidAdapter2IT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics()
    java.lang.AssertionError: 
    Expected: "store_state=CA; brand_name=King; A=21.4632\nstore_state=OR; brand_name=Symphony; A=32.176\nstore_state=CA; brand_name=Toretti; A=32.2465\nstore_state=WA; brand_name=King; A=34.6104\nstore_state=OR; brand_name=Toretti; A=36.3"
         but: was "store_state=OR; brand_name=ADJ; A=83.8764\nstore_state=WA; brand_name=Akron; A=85.8402\nstore_state=OR; brand_name=American; A=86.7898\nstore_state=WA; brand_name=ADJ; A=97.6488\nstore_state=CA; brand_name=ADJ; A=98.0076"
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
        at org.apache.calcite.test.CalciteAssert.lambda$checkResult$6(CalciteAssert.java:453)
        at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:582)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1495)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1434)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1493)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1483)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returnsOrdered(CalciteAssert.java:1509)
        at org.apache.calcite.test.DruidAdapter2IT.testInterleaveBetweenAggregateAndGroupOrderByOnMetrics(DruidAdapter2IT.java:2025)
        Suppressed: org.apache.calcite.util.TestUtil$ExtraInformation: With materializationsEnabled=false, limit=-1, sql=select "store_state", "brand_name", "A" from (
          select sum("store_sales")-sum("store_cost") as a, "store_state", "brand_name"
          from "foodmart"
          group by "store_state", "brand_name" ) subq
        order by "A" limit 5
            at app//org.apache.calcite.util.TestUtil.rethrow(TestUtil.java:389)
            at app//org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:598)
            ... 6 more

FAILURE  28.0sec,  210 completed,   1 failed,   6 skipped, org.apache.calcite.test.DruidAdapter2IT
org.apache.calcite.test.DruidAdapterIT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics() failure marker
FAILURE   0.2sec, org.apache.calcite.test.DruidAdapterIT > testInterleaveBetweenAggregateAndGroupOrderByOnMetrics()
    java.lang.AssertionError: 
    Expected: "store_state=CA; brand_name=King; A=21.4632\nstore_state=OR; brand_name=Symphony; A=32.176\nstore_state=CA; brand_name=Toretti; A=32.2465\nstore_state=WA; brand_name=King; A=34.6104\nstore_state=OR; brand_name=Toretti; A=36.3"
         but: was "store_state=OR; brand_name=ADJ; A=83.8764\nstore_state=WA; brand_name=Akron; A=85.8402\nstore_state=OR; brand_name=American; A=86.7898\nstore_state=WA; brand_name=ADJ; A=97.6488\nstore_state=CA; brand_name=ADJ; A=98.0076"
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
        at org.apache.calcite.test.CalciteAssert.lambda$checkResult$6(CalciteAssert.java:453)
        at org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:582)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.lambda$returns$1(CalciteAssert.java:1495)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.withConnection(CalciteAssert.java:1434)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1493)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returns(CalciteAssert.java:1483)
        at org.apache.calcite.test.CalciteAssert$AssertQuery.returnsOrdered(CalciteAssert.java:1509)
        at org.apache.calcite.test.DruidAdapterIT.testInterleaveBetweenAggregateAndGroupOrderByOnMetrics(DruidAdapterIT.java:2[336](https://github.com/apache/calcite/actions/runs/6845913981/job/18611830618?pr=3518#step:8:343))
        Suppressed: org.apache.calcite.util.TestUtil$ExtraInformation: With materializationsEnabled=false, limit=-1, sql=select "store_state", "brand_name", "A" from (
          select sum("store_sales")-sum("store_cost") as a, "store_state", "brand_name"
          from "foodmart"
          group by "store_state", "brand_name" ) subq
        order by "A" limit 5
            at app//org.apache.calcite.util.TestUtil.rethrow(TestUtil.java:389)
            at app//org.apache.calcite.test.CalciteAssert.assertQuery(CalciteAssert.java:598)
            ... 6 more

          3.4sec, org.apache.calcite.test.DruidAdapterIT > testFilterClauseWithNoConjunction()
          2.6sec, org.apache.calcite.test.DruidAdapterIT > testSelectTimestampColumnNoTables4()
FAILURE  37.3sec,  235 completed,   1 failed,   6 skipped, org.apache.calcite.test.DruidAdapterIT
FAILURE  38.2sec,  453 completed,   2 failed,  12 skipped, Gradle Test Run :druid:test

I saw the same error in some other PRs, and after I rerun the CI, the tests passed.

@JiajunBernoulli
Copy link
Contributor

@asdfgh19
I created JIRA ticket for Druid tests:https://issues.apache.org/jira/browse/CALCITE-6093

@asdfgh19
Copy link
Author

@asdfgh19 I created JIRA ticket for Druid tests:https://issues.apache.org/jira/browse/CALCITE-6093

OK, I will try to working on it latter.

Expressions.parameter(int.class, "c")));
b.add(originStatement);
BlockStatement block = b.toBlock();
assertEquals(block.statements.size(), 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use assertThat rather than assertEquals and assertSame. people get the args in the wrong order. as you did.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that assertSame and assertEquals easily make people confused which is expected and which is actual? So I need to change to assertThat, right?

b.add(originStatement);
BlockStatement block = b.toBlock();
assertEquals(block.statements.size(), 1);
assertSame(originStatement, block.statements.get(0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why you require assertSame is rather subtle. and it's the whole point of the test. so, say why.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the Javadoc comment of the BlockBuilder#optimize method indicates that this method returns whether any optimizations were made, and the basis for its judgment is whether the original statement and the statement optimized by InlineVariableVisitor and OptimizeShuttle are the same instance.

Moreover, in the Shuttle class, all visit methods will check the corresponding statement. If the fields of the statement have not been modified, the original statement will be returned.

So I wrote this test case as simple as possible, which only includes one line of code that ensures the conventions of the BlockBuilder and Shuttle classes.

Copy link

sonarcloud bot commented Nov 15, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

100.0% 100.0% Coverage
0.0% 0.0% Duplication

julianhyde added a commit to julianhyde/calcite that referenced this pull request Nov 17, 2023
… of TernaryExpression if it does not do any optimization

Before this change, OptimizeShuttle might return a different
Expression object that is equivalent to the original; this
would cause the optimizer to loop, mistakenly believing that
optimizations were occurring, which was inefficient.

After this change, the OptimizeShuttle avoids extra loops
when optimizing statements with ternary expressions.

Close apache#3518
github-actions bot pushed a commit to julianhyde/calcite that referenced this pull request Nov 17, 2023
… of TernaryExpression if it does not do any optimization

Before this change, OptimizeShuttle might return a different
Expression object that is equivalent to the original; this
would cause the optimizer to loop, mistakenly believing that
optimizations were occurring, which was inefficient.

After this change, the OptimizeShuttle avoids extra loops
when optimizing statements with ternary expressions.

Close apache#3518
julianhyde added a commit to julianhyde/calcite that referenced this pull request Nov 20, 2023
… of TernaryExpression if it does not do any optimization

Before this change, OptimizeShuttle might return a different
Expression object that is equivalent to the original; this
would cause the optimizer to loop, mistakenly believing that
optimizations were occurring, which was inefficient.

After this change, the OptimizeShuttle avoids extra loops
when optimizing statements with ternary expressions.

Close apache#3518
@asfgit asfgit closed this in b74c934 Nov 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants