[SPARK-22494][SQL] Fix 64KB limit exception with Coalesce and AtleastNNonNulls #19720

mgaido91 · 2017-11-10T22:00:45Z

What changes were proposed in this pull request?

Both Coalesce and AtLeastNNonNulls can cause the 64KB limit exception when used with a lot of arguments and/or complex expressions.
This PR splits their expressions in order to avoid the issue.

How was this patch tested?

Added UTs

…NNonNulls

SparkQA · 2017-11-10T22:38:12Z

Test build #83707 has finished for PR 19720 at commit f0edd7e.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-11-11T11:33:46Z

Test build #83719 has finished for PR 19720 at commit 1722d12.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

viirya · 2017-11-12T09:52:57Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

+    ev.copy(code = s"""
+      ${ev.isNull} = true;
+      ${ev.value} = ${ctx.defaultValue(dataType)};
+      ${ctx.splitExpressions(ctx.INPUT_ROW, evals)}""")


We can only split the expressions when they are created from a row object.

sorry, I don't see which is the problem here. I see that here the row object is null, but the goal is to set ev.isNull and ev.value, which is done. May you explain me if there is something I am missing? Thanks.

splitExpressions puts expression codes in individual functions. Only if those expressions's input is from a row object, we can do this. If their input is from currentVars, the splitting doesn't work.

It won't cause problem in fact. In that case, it works as before, i.e., expressions.mkString("\n").

This is the reason why I put ev.isNull and ev.value as attributes of the generated class, in this way they can be used as before in the individual functions. If you want, I can try and use the other overloaded method splitExpressions without passing the input row to it.

Ok. nvm. looks good.

viirya · 2017-11-12T09:53:51Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

-    }.mkString("\n")
+    }
+
+    val code = ctx.splitExpressions(ctx.INPUT_ROW, evals)


viirya · 2017-11-12T13:40:33Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

-      ${ctx.javaType(dataType)} ${ev.value} = ${firstEval.value};""" +
-      rest.map { e =>
+    ctx.addMutableState("boolean", ev.isNull, s"")
+    ctx.addMutableState(ctx.javaType(dataType), ev.value, s"")


viirya · 2017-11-12T13:41:13Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

@@ -357,7 +358,8 @@ case class AtLeastNNonNulls(n: Int, children: Seq[Expression]) extends Predicate

  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    val nonnull = ctx.freshName("nonnull")
-    val code = children.map { e =>
+    ctx.addMutableState("int", nonnull, s"")


thanks! Anyway, I am refactoring this, since I figured out a way to avoid the declaration of a global attribute. I can't do the same for coalesce unfortunately, because there I'd need to return two values from the methods.

viirya · 2017-11-12T13:41:40Z

LGTM

SparkQA · 2017-11-12T14:45:40Z

Test build #83742 has finished for PR 19720 at commit 911e172.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

This reverts commit 911e172.

SparkQA · 2017-11-12T18:14:13Z

Test build #83743 has finished for PR 19720 at commit e8320d6.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

kiszk · 2017-11-12T19:27:37Z

LGTM

SparkQA · 2017-11-13T00:10:25Z

Test build #83754 has finished for PR 19720 at commit 924aab9.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-11-13T12:19:14Z

Test build #83781 has finished for PR 19720 at commit 423245e.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

mgaido91 · 2017-11-13T14:47:32Z

I reviewed the PR according to what came out on the conversations on the other PRs by @kiszk .

@viirya @kiszk May I kindly ask you to review it again now? Thank you very much.

SparkQA · 2017-11-13T20:54:58Z

Test build #83797 has finished for PR 19720 at commit a548ddb.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

viirya · 2017-11-14T08:32:05Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

+          """
+        },
+        foldFunctions = { funcCalls =>
+          funcCalls.map { funcCall => s"$nonnull = $funcCall;" }.mkString("\n")


funcCalls.map(funcCall => s"$nonnull = $funcCall;").mkString("\n")

viirya · 2017-11-14T08:36:39Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

+    val code = if (ctx.INPUT_ROW == null || ctx.currentVars != null) {
+      evals.mkString("\n")
+    } else {
+      ctx.splitExpressions(evals, "atLeastNNonNull",


nit: atLeastNNonNulls

viirya · 2017-11-14T08:40:09Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

@@ -357,7 +358,7 @@ case class AtLeastNNonNulls(n: Int, children: Seq[Expression]) extends Predicate

  override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
    val nonnull = ctx.freshName("nonnull")


Does it really matter not to have nonnull as a global variable? The code is simpler if it is a global one.

I think it does, because having it as a global variable may put more pressure on the constant pool (see SPARK-18016). Thus, whenever feasible, I do think that we should keep it local.

SparkQA · 2017-11-14T13:19:50Z

Test build #83843 has finished for PR 19720 at commit 3a5c683.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

viirya · 2017-11-14T15:30:17Z

LGTM

kiszk · 2017-11-14T15:56:31Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/nullExpressions.scala

-      boolean ${ev.isNull} = ${firstEval.isNull};
-      ${ctx.javaType(dataType)} ${ev.value} = ${firstEval.value};""" +
-      rest.map { e =>
+    ctx.addMutableState("boolean", ev.isNull, "")


Can we ensure ev.isNull always has a variable name? In other words, ev.isNull never has true or false.

in the previous code there is the same assumption, thus yes.

This is guaranteed by Expression.genCode.

…NNonNulls ## What changes were proposed in this pull request? Both `Coalesce` and `AtLeastNNonNulls` can cause the 64KB limit exception when used with a lot of arguments and/or complex expressions. This PR splits their expressions in order to avoid the issue. ## How was this patch tested? Added UTs Author: Marco Gaido <marcogaido91@gmail.com> Author: Marco Gaido <mgaido@hortonworks.com> Closes #19720 from mgaido91/SPARK-22494. (cherry picked from commit 4e7f07e) Signed-off-by: Wenchen Fan <wenchen@databricks.com>

cloud-fan · 2017-11-16T17:20:01Z

thanks, merging to master/2.2!

mgaido91 · 2017-11-16T17:24:43Z

@cloud-fan please do not backport this to 2.2. In 2.2 we don't have SPARK-18016 and this is adding new variables in the case of coalesce. Thus it can generate an higher pressure on the constant pool and this may even cause a regression IMHO.

cloud-fan · 2017-11-16T17:27:17Z

hmm, isn't running slower better than can't run?

mgaido91 · 2017-11-16T17:59:30Z

It's not about running slower. This PR solves the problem which makes the user facing an exception if there are a lot of arguments in coalesce (or AtLestNNonNulls), but what I am doing in the coalesce function here is that I am adding to variables for each coalesce function. If there is a query with a lot of coalesce function (instead of a coalesce with a lot of parameters), this might result in having much more variables than before. This can cause the problem and the exception described in SPARK-18016. Thus a query that was previously running can fail.

The same thing is true for all the other PRs similar to this one submitted by @kiszk. Then, we should keep all these changes only on master, where part of SPARK-18016 is landing and hopefully soon it will be completely solved.

cloud-fan · 2017-11-16T20:18:49Z

If there is a query with a lot of coalesce function, wouldn't it hit the 64kb issue?

mgaido91 · 2017-11-16T22:37:35Z

No, a query with a coalesce with many/complex parameters will hit this problem. A query with a lot of small coalesce will not have the problem.
For AtLeastNNonNulls the fix would be safe to be backported, because no class variables are defined, but for coalesce it is safer to fix it only with SPARK-18016. In particular, the ongoing PR will solve the issue.
The same is true also for all the other similar PRs.
Maybe what we can do to backport this to branch-2.2 is to do the splitting and define class level variables only after a threshold of parameter is met, otherwise we go on with the previous code generation (without splitting). In this way we don't have any regression.
Or maybe we can backport to 2.2 only those fix which are not introducing class level variables, like for AtLeastNNonNulls.
Actually I think that the most important of all of these fixes is AtLeastNNonNulls indeed, because it is used to drop rows containing all nulls and this fails with dataset with a lot of columns before this PR. All the other functions are less likely to have a huge amount of parameters, despite this may happen and we should support it.

cloud-fan · 2017-11-17T16:28:32Z

I don't have a strong preference, but there were many 64kb compile error fixes for 2.2 or prior(e.g. CreateStruct, CreateArray, Invoke, CreateExternalRow, erc.). They all add more global variables, and it's werid to me that we stop doing this because the master branch has SPARK-18016.

@kiszk what do you think?

…NNonNulls ## What changes were proposed in this pull request? Both `Coalesce` and `AtLeastNNonNulls` can cause the 64KB limit exception when used with a lot of arguments and/or complex expressions. This PR splits their expressions in order to avoid the issue. ## How was this patch tested? Added UTs Author: Marco Gaido <marcogaido91@gmail.com> Author: Marco Gaido <mgaido@hortonworks.com> Closes apache#19720 from mgaido91/SPARK-22494. (cherry picked from commit 4e7f07e) Signed-off-by: Wenchen Fan <wenchen@databricks.com>

[SPARK-22494][SQL] Fix 64KB limit exception with Coalesce and Atleast…

f0edd7e

…NNonNulls

fix bug caused by mutable states init ordering

1722d12

viirya reviewed Nov 12, 2017

View reviewed changes

remove useless INPUT_ROW parameter

911e172

Revert "remove useless INPUT_ROW parameter"

e8320d6

This reverts commit 911e172.

mgaido91 mentioned this pull request Nov 12, 2017

[SPARK-22499][SQL] Fix 64KB JVM bytecode limit problem with least and greatest #19729

Closed

mgaido91 added 3 commits November 12, 2017 22:06

avoid usage of class variables for atleastNNonNull

3a38211

review comment

0afee4a

easier mkstring

924aab9

add null check

423245e

fix currentVars check

a548ddb

viirya reviewed Nov 14, 2017

View reviewed changes

address review comments

3a5c683

kiszk reviewed Nov 14, 2017

View reviewed changes

asfgit closed this in 4e7f07e Nov 16, 2017

		@@ -357,7 +358,7 @@ case class AtLeastNNonNulls(n: Int, children: Seq[Expression]) extends Predicate

		override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
		val nonnull = ctx.freshName("nonnull")

[SPARK-22494][SQL] Fix 64KB limit exception with Coalesce and AtleastNNonNulls #19720

[SPARK-22494][SQL] Fix 64KB limit exception with Coalesce and AtleastNNonNulls #19720

Conversation

mgaido91 commented Nov 10, 2017

What changes were proposed in this pull request?

How was this patch tested?

SparkQA commented Nov 10, 2017

SparkQA commented Nov 11, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

viirya Nov 12, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

viirya commented Nov 12, 2017

SparkQA commented Nov 12, 2017

SparkQA commented Nov 12, 2017

kiszk commented Nov 12, 2017

SparkQA commented Nov 13, 2017

SparkQA commented Nov 13, 2017

mgaido91 commented Nov 13, 2017

SparkQA commented Nov 13, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

SparkQA commented Nov 14, 2017

viirya commented Nov 14, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cloud-fan commented Nov 16, 2017

mgaido91 commented Nov 16, 2017

cloud-fan commented Nov 16, 2017

mgaido91 commented Nov 16, 2017

cloud-fan commented Nov 16, 2017

mgaido91 commented Nov 16, 2017

cloud-fan commented Nov 17, 2017

viirya Nov 12, 2017 •

edited

Loading