[SPARK-28127][SQL] Micro optimization on TreeNode's mapChildren method
## What changes were proposed in this pull request?

The `mapChildren` method in the `TreeNode` class is used throughout the Spark SQL codebase. It contains an `if` statement that checks whether `children` is non-empty. However, `TreeNode` already has a cached lazy val, `containsChild`. Checking that instead avoids unnecessary computation, since `containsChild` is used in other methods and is therefore constructed anyway.
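The idea can be illustrated with a minimal, self-contained sketch. This is not Spark's actual `TreeNode`: the `Node`, `Leaf`, and `Add` types and the `withNewChildren` helper are hypothetical stand-ins, and the sketch assumes `containsChild` is simply the set of a node's children, computed once on first access.

```scala
// Simplified stand-in for a tree node. Only the names `children`,
// `containsChild`, and `mapChildren` come from the commit message;
// everything else here is hypothetical and for illustration only.
sealed trait Node {
  def children: Seq[Node]

  // Assumption: the cached value is the set of children. Being a lazy val,
  // it is computed on first access and reused on every later access.
  lazy val containsChild: Set[Node] = children.toSet

  // Hypothetical helper that rebuilds this node with new children.
  def withNewChildren(newChildren: Seq[Node]): Node

  // Consult the cached set rather than calling `children.nonEmpty`.
  // A node without children is returned unchanged (same instance).
  def mapChildren(f: Node => Node): Node =
    if (containsChild.nonEmpty) withNewChildren(children.map(f)) else this
}

final case class Leaf(value: Int) extends Node {
  def children: Seq[Node] = Nil
  def withNewChildren(newChildren: Seq[Node]): Node = this
}

final case class Add(left: Node, right: Node) extends Node {
  // A `def` that allocates a fresh Seq on every call; this is the kind of
  // repeated work that a cached value can sidestep.
  def children: Seq[Node] = Seq(left, right)
  def withNewChildren(newChildren: Seq[Node]): Node =
    Add(newChildren(0), newChildren(1))
}
```

In this sketch, calling `mapChildren` on a `Leaf` returns the same instance, while calling it on an `Add` rebuilds the node from the transformed children. Whether the cached check pays off in practice depends on how `children` is defined and how often `containsChild` has already been forced, which is what the benchmark in this pull request measures.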

A benchmark showed that this optimization can reduce overall TPC-DS planning time by 6.8%, with no regression on any individual TPC-DS query.

## How was this patch tested?

Existing unit tests.

Closes #24925 from yeshengm/treenode-children.

Authored-by: Yesheng Ma <kimi.ysma@gmail.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
yeshengm authored and gatorsmile committed Jun 21, 2019
1 parent 47f54b1 commit 54da3bb
Showing 1 changed file with 1 addition and 1 deletion.
@@ -319,7 +319,7 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
    * Returns a copy of this node where `f` has been applied to all the nodes in `children`.
    */
   def mapChildren(f: BaseType => BaseType): BaseType = {
-    if (children.nonEmpty) {
+    if (containsChild.nonEmpty) {
       mapChildren(f, forceCopy = false)
     } else {
       this