Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-22682][SQL] HashExpression does not need to create global variables #19878

Closed
wants to merge 1 commit into from

Conversation

cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

It turns out that HashExpression can pass around some values via parameter when splitting codes into methods, to save some global variable slots.

This can also prevent a weird case that global variable appears in parameter list, which is discovered by #19865

How was this patch tested?

existing tests

@cloud-fan
Copy link
Contributor Author

cc @kiszk @mgaido91 @viirya @gatorsmile

}

val hashResultType = ctx.javaType(dataType)
val codes = if (ctx.INPUT_ROW == null || ctx.currentVars != null) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern appears many times in the code base, we may need to create a ctx.splitExpressionsWithCurrentInput for it later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @kiszk is doing this

Copy link
Member

@gatorsmile gatorsmile Dec 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That one has been merged, but this one is still different.

childResult, ctx)}
$localResult = (31 * $localResult) + $childResult;
"""
}.mkString(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We forgot to split the code for computing hive hash of struct, it's fixed now.

nullSafeElementHash(input, index.toString, field.nullable, field.dataType, result, ctx)
}
val hashResultType = ctx.javaType(dataType)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is done also in line 281. Can we do this only once? maybe with a lazy val?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx is only available inside doGenCode

@mgaido91
Copy link
Contributor

mgaido91 commented Dec 4, 2017

left only one very minor comment, it LGTM

@SparkQA
Copy link

SparkQA commented Dec 4, 2017

Test build #84431 has finished for PR 19878 at commit 0e9998e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

ev.copy(code =
s"""
|${ctx.JAVA_INT} ${ev.value} = $seed;
|${ctx.JAVA_INT} $childHash = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: childHash is only needed to declare here when we don't split functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm, splitExpressions could possibly not split expressions if only one block.

""".stripMargin
}

s"${ctx.JAVA_INT} $childResult = 0;\n" + ctx.splitExpressions(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to check ctx.INPUT_ROW == null || ctx.currentVars != null here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, the input here is a row that may be produced by row.getStruct instead of ctx.INPUT_ROW, so we don't need this check as the input won't be ctx.currentVars.

@viirya
Copy link
Member

viirya commented Dec 5, 2017

LGTM

@cloud-fan
Copy link
Contributor Author

thanks, merging to master!

@asfgit asfgit closed this in a8af4da Dec 5, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants