[DO NOT MERGE][TEST ONLY] Add once-policy rule check #22060

maryannxue · 2018-08-09T17:29:22Z

What changes were proposed in this pull request?

Rules like HandleNullInputsForUDF (https://issues.apache.org/jira/browse/SPARK-24891) do not stabilize: they can keep applying new changes to a plan indefinitely, which can cause problems such as SQL cache mismatches.
Ideally, all rules, whether in a once-policy batch or a fixed-point-policy batch, should stabilize after the number of runs specified. Once-policy should be considered a performance improvement: an assumption that the rule can stabilize after just one run, rather than an assumption that the rule won't be applied more than once. Once-policy rules should therefore run fine under the fixed-point policy as well.
We already have a check for fixed-point that throws an exception if the maximum number of runs is reached and the plan is still changing. This PR adds a similar check for once-policy that throws an exception if the plan changes between the first run and the second run of a once-policy rule.

From this test result, we can find out which of the analysis rules break this check so we can fix them later.
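
The check described above can be sketched with a toy executor. This is a minimal illustration in Python, not the Spark code in this PR: `Batch`, `execute`, `PlanNotStableError`, and the `whitelist` parameter are hypothetical names, and a plan is modeled as a plain tuple of strings rather than a logical plan.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Tuple

Plan = Tuple[str, ...]            # toy stand-in for a logical plan
Rule = Callable[[Plan], Plan]     # a rule maps a plan to a (possibly new) plan


class PlanNotStableError(Exception):
    """Raised when a batch is still changing the plan after its allowed runs."""


@dataclass(frozen=True)
class Batch:
    name: str
    rules: Tuple[Rule, ...]
    once: bool = True
    max_iterations: int = 100     # only used by fixed-point batches


def run_rules(plan: Plan, rules: Tuple[Rule, ...]) -> Plan:
    """Apply every rule of the batch once, in order."""
    for rule in rules:
        plan = rule(plan)
    return plan


def execute(plan: Plan, batch: Batch, whitelist: FrozenSet[str] = frozenset()) -> Plan:
    if batch.once:
        result = run_rules(plan, batch.rules)
        # Once-policy check: a second run must leave the plan unchanged.
        if batch.name not in whitelist and run_rules(result, batch.rules) != result:
            raise PlanNotStableError(f"once-policy batch {batch.name!r} is not idempotent")
        return result
    # Fixed-point policy: rerun until the plan stops changing or the limit is hit.
    for _ in range(batch.max_iterations):
        next_plan = run_rules(plan, batch.rules)
        if next_plan == plan:
            return plan
        plan = next_plan
    raise PlanNotStableError(f"fixed-point batch {batch.name!r} did not converge")


def dedupe(plan: Plan) -> Plan:
    """Idempotent toy rule: sort and remove duplicates."""
    return tuple(sorted(set(plan)))


def add_alias(plan: Plan) -> Plan:
    """Non-idempotent toy rule: appends a new node on every run."""
    return plan + ("alias",)
```

In this sketch, `execute(("b", "a", "a"), Batch("Dedup", (dedupe,)))` passes the check, while a batch containing `add_alias` raises unless its name is whitelisted. The whitelist only mirrors the idea behind the "Whitelist batch" commits in this PR; how the real check skips batches is not shown here.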

How was this patch tested?

N/A

SparkQA · 2018-08-09T20:04:42Z

Test build #94513 has finished for PR 22060 at commit 3236568.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2018-08-09T20:55:07Z

AliasViewChild is the only rule? Can we whitelist it first? It sounds like many tests are skipped.

maryannxue · 2018-08-10T02:57:46Z

retest this please

SparkQA · 2018-08-10T05:05:21Z

Test build #94540 has finished for PR 22060 at commit 3236568.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun · 2018-09-11T06:07:58Z

It seems that the testing is finished. :)
If you don't mind, shall we close this test PR, @maryannxue ?

dongjoon-hyun · 2018-09-16T04:52:10Z

Gentle ping, @maryannxue .

dongjoon-hyun · 2018-09-27T17:41:41Z

Ping, @maryannxue .

gatorsmile · 2018-09-28T07:12:45Z

@maropu Are you willing to take this over?

maropu · 2018-09-28T07:18:10Z

Yea, I can next week (I'm now in Canada and I'm going back to Japan now...)

maryannxue · 2018-09-28T15:38:05Z

Sorry for the late reply. The purpose of this PR is to find the rules that violate the once-policy assumption, along with tests that can reproduce the issues. I think we should eventually turn this check on after we've fixed all those rules, and extend it to the optimizer too.

maropu · 2018-10-04T02:15:43Z

@maryannxue ah, I see. Do you still keep working on this?

maryannxue · 2018-10-04T14:37:20Z

retest this please

maryannxue · 2018-10-04T14:38:49Z

@maropu I'll follow up on this. I started the test again and I'll keep track of "which rules violate the assumption" and "which tests can reproduce the violation" in this PR.

SparkQA · 2018-10-04T15:22:50Z

Test build #96940 has finished for PR 22060 at commit 3236568.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-10-05T19:05:47Z

Test build #97002 has finished for PR 22060 at commit 7986a43.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-10-05T22:54:12Z

Test build #97009 has finished for PR 22060 at commit d595a0c.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-10-06T05:33:42Z

Test build #97023 has finished for PR 22060 at commit 7fc1d11.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2018-10-22T14:18:06Z

Test build #97840 has started for PR 22060 at commit 7fc1d11.

SparkQA · 2018-10-22T16:20:54Z

Test build #97875 has started for PR 22060 at commit 7fc1d11.

AmplabJenkins · 2018-10-22T16:36:50Z

Merged build finished. Test FAILed.

…o compare attributes

## What changes were proposed in this pull request?

When we compare attributes, in general, we should always refer to semantic equality, as the default `equal` method can return false when there are "cosmetic" differences between them, but still they are the same thing; at least we have to consider them so when analyzing/optimizing queries.

The PR focuses on the usage and comparison of the `output` of a `LogicalPlan`, which is a `Seq[Attribute]` in `AliasViewChild`. In this case, using equality implicitly fails to check the semantic equality. This results in the operator failing to stabilize.

## How was this patch tested?

running the tests with the patch provided by maryannxue in #22060

Closes #22713 from mgaido91/SPARK-25691.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
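
The difference between default and semantic equality described in this commit message can be illustrated with a toy model. This is a hedged sketch in Python, not Spark's `Attribute` class: the fields `expr_id` and `qualifier` and both helper functions are assumptions chosen for illustration, with `qualifier` standing in for a "cosmetic" difference.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Attribute:
    name: str
    expr_id: int                      # hypothetical unique identity of the attribute
    qualifier: Optional[str] = None   # hypothetical "cosmetic" detail

    def semantic_equals(self, other: "Attribute") -> bool:
        # Assumption: two attributes are the same thing when their ids match.
        return self.expr_id == other.expr_id


def same_output_default(a: Sequence[Attribute], b: Sequence[Attribute]) -> bool:
    """Default equality: every field must match, including cosmetic ones."""
    return list(a) == list(b)


def same_output_semantic(a: Sequence[Attribute], b: Sequence[Attribute]) -> bool:
    """Semantic equality: compare the outputs position by position by identity."""
    return len(a) == len(b) and all(x.semantic_equals(y) for x, y in zip(a, b))


view_output = [Attribute("id", expr_id=1)]
child_output = [Attribute("id", expr_id=1, qualifier="t")]
```

Here `same_output_default(view_output, child_output)` is false while `same_output_semantic(view_output, child_output)` is true. A rule that rewrites the plan whenever the default comparison fails would see a mismatch on every run, which is the kind of non-stabilizing behavior the once-policy check is meant to catch.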

HyukjinKwon · 2018-11-11T03:30:10Z

hey @maryannxue, where are we here? Let's close this if it's going to be inactive a couple of weeks.

maryannxue · 2018-11-11T03:41:35Z

Thank you for reminding me, @HyukjinKwon! And thanks to @mgaido91's contribution, this has been fixed already.

Add once-policy batch check

3236568

Whitelist batch UDF

7986a43

Whitelist batch View

d595a0c

Whitelist fix

7fc1d11

mgaido91 mentioned this pull request Oct 13, 2018

[SPARK-25691][SQL] Use semantic equality in AliasViewChild in order to compare attributes #22713

Closed

maryannxue closed this Nov 11, 2018
