[SPARK-37369][SQL] Avoid redundant ColumnarToRow transition on InMemoryTableScan #34642
viirya wants to merge 14 commits into apache:master from
Conversation
sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala
Test build #145370 has finished for PR 34642 at commit
Kubernetes integration test unable to build dist. exiting with code: 1
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #145373 has finished for PR 34642 at commit
attilapiros
left a comment
I think we need an explicit unit test to validate that the to-row transition is missing from the plan in this case.
  conf.cacheVectorizedReaderEnabled &&
  !WholeStageCodegenExec.isTooManyFields(conf, relation.schema) &&
- relation.cacheBuilder.serializer.supportsColumnarOutput(relation.schema)
+ relation.cacheBuilder.serializer.supportsColumnarOutput(relation.schema) && outputColumnar
As evaluating the outputColumnar flag is one of the fastest checks in this expression (where only && operators are used), I would move it before the isTooManyFields call (which uses a recursive function).
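A sketch of the suggested ordering, reusing the identifiers from the diff above (this is only a fragment; the surrounding declaration is omitted and the exact final code may differ):

```scala
// Cheapest check first: when outputColumnar is false, the recursive
// isTooManyFields walk and the serializer capability check are skipped.
outputColumnar &&
  conf.cacheVectorizedReaderEnabled &&
  !WholeStageCodegenExec.isTooManyFields(conf, relation.schema) &&
  relation.cacheBuilder.serializer.supportsColumnarOutput(relation.schema)
```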
Thanks @attilapiros. Yea, I will add some tests later.
Kubernetes integration test starting
Kubernetes integration test status failure
Kubernetes integration test starting
Test build #145440 has finished for PR 34642 at commit
Kubernetes integration test starting
Test build #145444 has finished for PR 34642 at commit
Kubernetes integration test status failure
Kubernetes integration test status failure
Test build #145447 has finished for PR 34642 at commit
@viirya
So we might abuse supportsColumnar == false and take it as supportsRowBased. What about introducing a new flag right beside the old one, something like supportsRowBased? This would be a more generic solution, as any node which supports both can avoid the unneeded to-row transition.
Ideally, yes, it is more general to have another flag. In practice, I wonder whether there will be more such nodes that could choose to output row-based or columnar output under some conditions. For the in-memory relation scan here, adding a new flag
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #146017 has finished for PR 34642 at commit
Test build #146021 has finished for PR 34642 at commit
sunchao
left a comment
So the motivation of this PR is to improve from the existing:
CachedBatch -> ColumnarBatch -> InternalRow
transition to:
CachedBatch -> InternalRow
Is that right? This makes sense to me.
  val id: Int = SparkPlan.newPlanId()

  /**
   * Return true if this stage of the plan supports row-based execution.
Maybe add some explanation of why we need both this and supportsColumnar? It's a bit confusing when reading this code.
Also I'm wondering if something like prefersColumnar is better, so that we have:
- supportsColumnar: this plan can support columnar output, alongside the default row-based output which every plan supports.
- prefersColumnar: this plan prefers to output columnar batches even if it is not explicitly requested (e.g., outputsColumnar is false).
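A hypothetical sketch of how the two flags in this suggestion could be documented (prefersColumnar is only being proposed in this thread; it is not an existing Spark API):

```scala
// Illustrative stand-in only, not the real org.apache.spark.sql.execution.SparkPlan.
abstract class SparkPlanLike {
  /** This plan can produce ColumnarBatch output, alongside the default row-based output. */
  def supportsColumnar: Boolean = false

  /** This plan prefers to emit columnar batches even when columnar output is not requested. */
  def prefersColumnar: Boolean = false
}
```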
supportsColumnar: this plan can support columnar output, alongside the default row-based output which every plan supports.
Hmm, seems not exactly? Not every plan supports row-based output.
prefersColumnar seems redundant? As I see, we usually prefer columnar output already.
Not every plan supports row-based output.
Hmm, any example? Every physical plan node has to implement doExecute, which outputs rows, while doExecuteColumnar throws an exception by default.
As I see, we usually prefer columnar output already.
I'm not sure about this part. To my understanding, at the moment it appears we prefer columnar output because 1) vectorized readers for ORC/Parquet yield much better performance so we always want to use that over the default row-based impls, and 2) supportsColumnar defaults to false as most operators don't support columnar execution yet, so we'll do the columnar-row conversion and switch back to whole-stage codegen.
However, this may not hold true if we add columnar support for more operators like filter/project etc. in the future. Do we want to prefer columnar execution over the whole-stage codegen approach? I'm not sure yet and maybe some evaluation is required. prefersColumnar could give us a knob to control this.
Hmm, any example? Every physical plan node has to implement doExecute, which outputs rows, while doExecuteColumnar throws an exception by default.
I think there is no guarantee that a physical node must implement a working doExecute. For a columnar node, it can just throw an exception saying it is not implemented (like the default doExecuteColumnar) if it is not designed to be executed under row-based execution.
I also don't see a need to implement both (working) row-based and columnar execution for a node in general. But because we don't actually have official columnar execution nodes in Spark, maybe I cannot give an example from Spark itself. Hopefully I convey the idea clearly.
prefersColumnar: this plan prefers to output columnar batches even if it is not explicitly requested (e.g., outputsColumnar is false).
BTW, outputsColumnar is not a preference option, I think (at least for its current usage in the rule). It actually indicates whether the output should be columnar or not. Once outputsColumnar is false, the plan should produce row-based output, which is why we add ColumnarToRowExec for that case.
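A minimal, self-contained mini-model of this decision (all names here are assumed for illustration; this is not the real ApplyColumnarRulesAndInsertTransitions rule):

```scala
// Toy model: when rows are required (outputsColumnar == false) and the child can
// already produce rows, no transition is needed; otherwise a to-row conversion is added.
object TransitionSketch {
  sealed trait PlanSketch {
    def supportsColumnar: Boolean
    def supportsRowBased: Boolean
  }
  case class ScanSketch(supportsColumnar: Boolean, supportsRowBased: Boolean) extends PlanSketch
  case class ColumnarToRowSketch(child: PlanSketch) extends PlanSketch {
    def supportsColumnar: Boolean = false
    def supportsRowBased: Boolean = true
  }

  def ensureRows(child: PlanSketch, outputsColumnar: Boolean): PlanSketch =
    if (outputsColumnar || child.supportsRowBased) child
    else ColumnarToRowSketch(child)
}
```

Under this toy model, a scan that supports both formats (like the in-memory scan discussed here) is returned unchanged when rows are requested, while a columnar-only scan still gets wrapped in the to-row conversion.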
Yea, the preference I mentioned is pretty limited so far. I agree that we may need to have a preference rule (or something) in the future. As we don't have real built-in columnar operators in Spark, currently the situation seems to be that some columnar extensions/libraries replace row-based operators with columnar operators during planning. I'm not sure if we can estimate which one is preferred during planning.
BTW, IMHO, if we add columnar support for more operators in the future, I guess it already implicitly indicates we "prefer" it over the current execution (whole-stage codegen or the interpreted one)? Just like whole-stage codegen, it seems we simply prefer it once we verify it generally has better performance. This is similar to the 3rd party extensions/libraries situation, I think.
I see, makes sense. I was referring to nodes in Spark itself, but yeah, an extension could implement only doExecuteColumnar.
I'm still slightly in favor of prefersColumnar, but it's only a minor personal preference. Overall it looks OK.
For now I guess the columnar route is considered superior; otherwise there should be a cost calculation for the plan between row vs columnar.
This is just to cover the case where the downstream doesn't support columnar but the upstream supports both row and columnar, and its performance of producing output is: columnar output > row output > columnar output + columnar-to-row conversion. In that case the upstream wants to directly produce whatever the downstream wants, without a conversion.
If the upstream can produce columnar output fast enough to cover the overhead of a columnar-to-row conversion (columnar output > columnar output + columnar-to-row conversion > row output), then it could just tactically say it only supports columnar output and Spark will add the conversion.
@@ -0,0 +1,12 @@
================================================================================================
nit: ideally we should generate the result using the GitHub workflow
Updated with the result of GitHub Action.
  private def buildBuffers(): RDD[CachedBatch] = {
-   val cb = if (cachedPlan.supportsColumnar) {
+   val cb = if (cachedPlan.supportsColumnar &&
+       serializer.supportsColumnarInput(cachedPlan.output)) {
Hmm, why is this necessary? Shouldn't cachedPlan.supportsColumnar already cover this? For instance in InMemoryTableScanExec.
This is actually a bug. cachedPlan.supportsColumnar only indicates that the cached plan can output the columnar format, but whether this cached RDD builder can take such input depends on its serializer.
There is one test which failed due to the proposed change. I remember it happens for an InMemoryRelation under another InMemoryRelation.
Previously we always added an additional ColumnarToRow transition between two InMemoryRelations, so we didn't hit this.
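A commented sketch of the guarded condition from the diff above (the two branch helpers are hypothetical placeholders, not the real bodies of buildBuffers):

```scala
// Use the columnar caching path only when BOTH sides can handle columnar data:
//  - cachedPlan.supportsColumnar: the child plan can emit ColumnarBatch output
//  - serializer.supportsColumnarInput: this cache serializer can consume that output
// Without the second check, a columnar-capable child over a serializer that only
// accepts rows (e.g. an InMemoryRelation cached on top of another InMemoryRelation)
// would break once the always-inserted ColumnarToRow transition is gone.
val cb = if (cachedPlan.supportsColumnar &&
    serializer.supportsColumnarInput(cachedPlan.output)) {
  buildColumnarBuffers()   // hypothetical stand-in for the columnar branch
} else {
  buildRowBasedBuffers()   // hypothetical stand-in for the row-based branch
}
```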
cc. @revans2 @tgravescs @andygrove
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #146065 has finished for PR 34642 at commit
revans2
left a comment
The change looks good to me. More comments on supportsRowBased and supportsColumnar might be good to make it clear how to use them, but it is fairly clear to me.
Kubernetes integration test starting
Kubernetes integration test status failure
Test build #146079 has finished for PR 34642 at commit
I think this change should be pretty clear. If there are no more comments or objections, I will merge this in the next few days. Thanks.
HeartSaVioR
left a comment
The code change looks good, as long as it is approved by @revans2, who made the recent change here and also depends on it (I guess).
Would you mind explaining the addition of supportsRowBased in the PR description? It would help to track the change afterwards.
@HeartSaVioR Updated the description. Thanks.
HeartSaVioR
left a comment
+1 thanks for the patience!
dongjoon-hyun
left a comment
+1, LGTM. Thank you, @viirya and all.
Thank you all! Merging to master.
Sorry for the late review. How do we "ask" for producing row output? I don't see any related change to the in-memory table scan in this PR.
Sorry for the confusion. I should say we can let
### What changes were proposed in this pull request?
In PR #34642, we added a `supportsRowBased` in `SparkPlan` in order to avoid a redundant `ColumnarToRow` transition in `InMemoryTableScan`. But this optimization also applies to Union if its children all support row-based output. So, this PR adds the `supportsRowBased` implementation for `UnionExec`.

### Why are the changes needed?
Follow-up PR.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing tests passed.

Closes #35061 from linhongliu-db/SPARK-37369-followup.

Authored-by: Linhong Liu <linhong.liu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
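The follow-up described above is presumably along these lines (a sketch, not the exact diff from #35061):

```scala
// Sketch: UnionExec can report row-based support only when every child
// can itself produce row-based output.
override def supportsRowBased: Boolean = children.forall(_.supportsRowBased)
```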
What changes were proposed in this pull request?
This patch proposes to let InMemoryTableScanExec produce row output directly if its parent query plan only accepts rows instead of columnar output. In particular, this change adds a new method in SparkPlan called supportsRowBased, alongside the existing supportsColumnar.
Why are the changes needed?
We currently have supportsColumnar indicating whether a physical node can produce columnar output. The current columnar transition rule seems to assume that a node can only produce columnar output, not row-based output, if supportsColumnar returns true. But a node can actually produce both formats, i.e. columnar and row-based. For such a node, if we require row-based output, the columnar transition rule adds an additional columnar-to-row conversion after it due to this wrong assumption.
So this change introduces supportsRowBased, which indicates whether a node can produce row-based output. The rule can check this method when deciding whether a columnar-to-row transition is necessary.
For example, InMemoryTableScanExec can produce columnar output, so if its parent plan isn't columnar, the rule adds a ColumnarToRow between them.
But InMemoryTableScanExec is capable of row-based output too. After this change, for such a case, we can simply ask InMemoryTableScanExec to produce row output instead of doing a redundant conversion.
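As an illustration (a hypothetical snippet, not taken from the PR or its tests), one way to observe the transition is to cache a DataFrame and check whether the executed plan of a row-based parent still contains a ColumnarToRowExec:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.ColumnarToRowExec

val spark = SparkSession.builder().master("local[*]").appName("SPARK-37369-demo").getOrCreate()
import spark.implicits._

val df = (1 to 100).toDF("v")
df.cache()
df.count() // materialize the cached InMemoryRelation

// The filter above the in-memory scan is row-based. Before this change the executed
// plan contained a ColumnarToRowExec above InMemoryTableScanExec; after it, the scan
// can emit rows directly and the extra transition disappears.
val plan = df.filter($"v" > 50).queryExecution.executedPlan
println(plan)
println(plan.collect { case c: ColumnarToRowExec => c }.nonEmpty)
```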
Does this PR introduce any user-facing change?
No
How was this patch tested?
Existing tests.