[SPARK-37287][SQL] Pull out dynamic partition and bucket sort from FileFormatWriter #34568
Conversation
```
@@ -206,3 +179,43 @@ case class SortExec(
   override protected def withNewChildInternal(newChild: SparkPlan): SortExec =
     copy(child = newChild)
 }

object SortExec {
  def createSorter(
```
This change is because `maxConcurrentOutputFileWriters` needs to create a sorter in `FileFormatWriter`.
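As a toy illustration of this refactor (deliberately simplified stand-in types, not Spark's actual `SortExec` or `UnsafeExternalRowSorter`): moving the sorter factory into the companion object lets a caller such as the `FileFormatWriter` concurrent-writers path obtain just a sorter, without building and executing the whole operator.

```scala
// Stand-in for a sort key: column name plus direction.
case class ColumnOrder(name: String, ascending: Boolean = true)

// Stand-in for the row sorter that the operator creates internally.
class RowSorter(val ordering: Seq[ColumnOrder]) {
  def sort(rows: Seq[Map[String, Int]]): Seq[Map[String, Int]] =
    // Stable sortBy applied from least- to most-significant key.
    ordering.reverse.foldLeft(rows) { (rs, col) =>
      rs.sortBy(r => if (col.ascending) r(col.name) else -r(col.name))
    }
}

object SortOperator {
  // Factory usable without instantiating the operator itself.
  def createSorter(ordering: Seq[ColumnOrder]): RowSorter = new RowSorter(ordering)
}

case class SortOperator(ordering: Seq[ColumnOrder]) {
  // The operator reuses the same factory internally, so behavior is unchanged.
  private val sorter = SortOperator.createSorter(ordering)
  def execute(rows: Seq[Map[String, Int]]): Seq[Map[String, Int]] = sorter.sort(rows)
}
```

The point is that `SortOperator.createSorter(...)` and `SortOperator(...).execute(...)` share one code path, so pulling the factory out is a pure extraction.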
cc @MaxGekk @cloud-fan if you have time to take a look
Thanks @ulysses-you for improving on this! I have some questions. Thanks.
```
 * V1 write includes both datasoruce and hive, that requires a specific ordering of data.
 * It should be resolved by [[V1Writes]].
 *
 * TODO: we can also support specific distribution here if necessary
```
Could you help create a JIRA here? Thanks, cc @wangyum.
created SPARK-37333
```
      .map(SortOrder(_, Ascending))
  }

  def prepareQuery(
```
I don't think it's safe to check the output ordering inside the logical plan. The output ordering may change quite a lot during physical planning (e.g. a shuffle added for a join/aggregate can destroy the output ordering). Ideally we should rely on the physical plan rule `EnsureRequirements` to add the proper sort. I am wondering how hard it would be to make the DSv1 write code path rely on `SparkPlan.requiredChildOrdering`?
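A toy model of the alternative being suggested here (simplified illustrative types, not Spark's `SparkPlan` or `EnsureRequirements`): each physical node declares the ordering it requires from its children, and a planning rule inserts a sort only where the child's output ordering does not already satisfy that requirement.

```scala
sealed trait Plan {
  def children: Seq[Plan]
  def outputOrdering: Seq[String]          // column names the output is sorted by
  // By default, nodes require nothing from their children.
  def requiredChildOrdering: Seq[Seq[String]] = children.map(_ => Nil)
}
case class Scan(outputOrdering: Seq[String]) extends Plan { val children: Seq[Plan] = Nil }
case class Sort(cols: Seq[String], child: Plan) extends Plan {
  val children: Seq[Plan] = Seq(child)
  def outputOrdering: Seq[String] = cols
}
case class WriteFiles(requiredOrdering: Seq[String], child: Plan) extends Plan {
  val children: Seq[Plan] = Seq(child)
  def outputOrdering: Seq[String] = child.outputOrdering
  override def requiredChildOrdering: Seq[Seq[String]] = Seq(requiredOrdering)
}

// Simplified EnsureRequirements: add a Sort under the write node only when its
// requirement is not already a prefix of the child's output ordering.
def ensureRequirements(plan: Plan): Plan = plan match {
  case w: WriteFiles =>
    val newChild = ensureRequirements(w.child)
    if (newChild.outputOrdering.startsWith(w.requiredOrdering)) w.copy(child = newChild)
    else w.copy(child = Sort(w.requiredOrdering, newChild))
  case s: Sort => s.copy(child = ensureRequirements(s.child))
  case other => other
}
```

Because the rule runs after physical planning, it sees the ordering that actually survives shuffles, which is the safety property the comment above is asking for.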
Actually, after taking a deeper look: `LogicalPlan.outputOrdering` was introduced to eliminate unnecessary sorts in logical planning (#20560), and only a few operators (like `Filter`, `Project`) preserve ordering, so it won't cause a correctness issue here.
But the problem with `LogicalPlan.outputOrdering` is that it is too conservative. We may add an unnecessary sort here for complex queries (e.g. a query with a sort merge join on partition columns before writing to a table with dynamic partitions).
Thank you @c21 for pointing this out; I see your concern. The reasons I used `LogicalPlan.outputOrdering` are:
- Adding the sort on the logical side has benefits if the plan already contains a sort, e.g. `InsertIntoTable (partition)` over `Sort (not dynamic columns)`: we can eliminate the user-specified sort using `EliminateSorts` in the `Optimizer`. But if we add the sort in the physical plan, we will do the sort twice even though the first sort has no effect.
- For now, I prefer to keep the same approach as `V2Writes`, which also adds the required ordering (and even distribution) on the logical side. We can optimize them together if we find a better approach in the future.
- I think it's safe, with no perf regression, to add a sort on the logical side. Since we have `RemoveRedundantSorts` on the physical side, that rule can remove the sort we added if it's unnecessary (e.g. sort + SMJ with dynamic partitions).
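The first point above can be sketched with a toy logical plan (minimal hypothetical types, not Spark's actual `EliminateSorts` rule): when the write's required sort is added on the logical side, a user-specified sort directly beneath it becomes redundant and can be dropped, so the data is sorted only once.

```scala
sealed trait LPlan
case class Relation(name: String) extends LPlan
case class LSort(cols: Seq[String], child: LPlan) extends LPlan

// A sort directly under another sort is overwritten by the outer one,
// so the inner sort can be removed without changing the result.
def eliminateRedundantSorts(plan: LPlan): LPlan = plan match {
  case LSort(cols, LSort(_, grandChild)) =>
    eliminateRedundantSorts(LSort(cols, grandChild))
  case LSort(cols, child) => LSort(cols, eliminateRedundantSorts(child))
  case other => other
}
```

Had the write-required sort been added only at the physical level, the logical optimizer would never see the sort-over-sort pattern and both sorts would execute.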
@ulysses-you - ah yes, we also have `RemoveRedundantSorts` in physical planning, so I think we are good here. Thanks for the explanation!
```
    SchemaPruning :: V2ScanRelationPushDown :: V1Writes :: V2Writes ::
      PruneFileSourcePartitions :: Nil
```
It is slightly confusing that the `V*Writes` rules are here. `earlyScanPushDownRules` is about "... projection and filter pushdown to scans", but `V1Writes` "... makes sure the v1 write requirement, e.g. requiredOrdering", which is something like the opposite thing: pulling out instead of pushing down.
Here is some history of why `V2Writes` is in `earlyScanPushDownRules`: #30806 (comment).
I agree the name does not match; do you have a better place for it?
```
val prefixComputer = new UnsafeExternalRowSorter.PrefixComputer {
  private val result = new UnsafeExternalRowSorter.PrefixComputer.Prefix
  override def computePrefix(row: InternalRow):
  UnsafeExternalRowSorter.PrefixComputer.Prefix = {
```
Please fix the indentation to match the original code.
fixed
Thanks @ulysses-you for the work! I have some comments. @cloud-fan and @wangyum, could you guys help take a look when you have time? Thanks.
```
val enableRadixSort = sparkSession.sessionState.conf.enableRadixSort
val outputSchema = empty2NullPlan.schema
Some(ConcurrentOutputWriterSpec(maxWriters,
  () => SortExec.createSorter(
```
I feel this refactoring (`SortExec.createSorter`) is not very necessary. Why can't we create a `SortExec` operator and call `createSorter()` as before? What's the advantage of the current code compared to before?
Looking at the previous code, we created and evaluated a SortExec mainly for the ordering of dynamic partitions. For the concurrent writers, we only need the sorter. After we pull out the sort, creating a new SortExec seems like overkill.
```
import org.apache.spark.sql.internal.SQLConf

/**
 * V1 write includes both datasoruce and hive, that requires a specific ordering of data.
```
nit: `datasoruce` -> `datasource v1`
```
trait V1WritesHelper {

  def getBucketSpec(
```
nit: how about naming it `getWriterBucketSpec`? `BucketSpec` is another class in Spark, which is different from `WriterBucketSpec`. Also `bucketSpec` is a parameter, so `getWriterBucketSpec` looks less confusing.
makes sense!
```
  }
}

trait V1WritesHelper {
```
After looking through the subclasses of this one, I found this class is meant to be a utility class, not an interface to implement. Shall we change this to `object V1WritesUtils`?
The idea of the `V1WritesHelper` is from the `AdaptiveSparkPlanHelper`, which also contains some util methods. And there are many places in sql that use a helper even when they are not stateful. Personally, I don't have a strong opinion about utility vs. helper.
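The trade-off being discussed can be shown with a small sketch (illustrative names, not Spark classes): a helper trait is mixed in, so its methods are called unqualified from the extending class, while a utility object is called explicitly; both work fine for stateless methods.

```scala
// Helper-trait style: methods become available via inheritance.
trait WritesHelper {
  def normalize(cols: Seq[String]): Seq[String] = cols.map(_.trim.toLowerCase)
}

// Utility-object style: methods are invoked with an explicit owner.
object WritesUtils {
  def normalize(cols: Seq[String]): Seq[String] = cols.map(_.trim.toLowerCase)
}

// Mixin: the command's type now includes WritesHelper.
case class WriteCommand(cols: Seq[String]) extends WritesHelper {
  def normalizedCols: Seq[String] = normalize(cols)           // unqualified call
}

// Utility: no inheritance relationship in the command's type.
case class WriteCommand2(cols: Seq[String]) {
  def normalizedCols: Seq[String] = WritesUtils.normalize(cols)
}
```

The reviewer's point is that when the trait carries no state and is never used as a type, the object form makes the intent clearer; the reply's point is that the mixin form is an established pattern in the codebase.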
```
import org.apache.spark.sql.execution.datasources.BucketingUtils
import org.apache.spark.sql.hive.client.HiveClientImpl

trait V1HiveWritesHelper {
```
This seems to be a utility class as well; how about `object V1WritesForHiveUtils`?
```
import org.apache.spark.sql.hive.client.HiveClientImpl

trait V1HiveWritesHelper {
  def options(bucketSpec: Option[BucketSpec]): Map[String, String] = {
```
nit: we can make the function name more verbose, e.g. getOptionsWithHiveBucketWrite
```
 *
 * TODO(SPARK-37333): Specify the required distribution at V1Write
 */
trait V1Write extends DataWritingCommand with V1WritesHelper {
```
`V1Write` extending an interface called `V1WritesHelper` looks a bit weird. I think `V1WritesHelper` is just a utility class, so we don't need to extend it here (per https://github.com/apache/spark/pull/34568/files#r807523085).
```
val requiredOrdering =
  partitionColumns ++ writerBucketSpec.map(_.bucketIdExpression) ++ sortColumns
// the sort order doesn't matter
val actualOrdering = empty2NullPlan.outputOrdering.map(_.child)
```
There is an issue here since we have AQE: the plan is an `AdaptiveSparkPlanExec`, which has no `outputOrdering`. For dynamic partition writes, the code will always add an extra sort.
This PR can resolve that issue as well. @cloud-fan @c21
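The check in the quoted snippet can be sketched as follows (a simplified model with illustrative names, not the actual `FileFormatWriter` code). Since any sort direction groups equal keys together, only the sort expressions are compared, not Ascending/Descending. Note also the AQE problem raised above: with an empty actual ordering, as when the child exposes no output ordering, a sort is always requested.

```scala
// Stand-in for a SortOrder: an expression plus its direction.
case class SortKey(expr: String, ascending: Boolean = true)

def needsSort(
    partitionCols: Seq[String],
    bucketIdExpr: Option[String],
    sortCols: Seq[String],
    actualOrdering: Seq[SortKey]): Boolean = {
  // required ordering = partition columns ++ bucket-id expression ++ sort columns
  val required = partitionCols ++ bucketIdExpr ++ sortCols
  // The sort order doesn't matter, so compare the child expressions only.
  val actual = actualOrdering.map(_.expr)
  !(required.length <= actual.length && actual.take(required.length) == required)
}
```

A descending sort on the partition column still satisfies the requirement, because the writer only needs rows for the same partition to be adjacent, not in a particular direction.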
…leFormatWriter

### What changes were proposed in this pull request?
`FileFormatWriter.write` is used by all V1 write commands including data source and hive tables. Depending on dynamic partitions, bucketed, and sort columns in the V1 write command, `FileFormatWriter` can add a physical sort on top of the query plan which is not visible from plan directly. This PR (based on #34568) intends to pull out the physical sort added by `FileFormatWriter` into logical planning. It adds a new logical rule `V1Writes` to add logical Sort operators based on the required ordering of a V1 write command. This behavior can be controlled by the new config **spark.sql.optimizer.plannedWrite.enabled** (default: true).

### Why are the changes needed?
Improve observability of V1 write, and unify the logic of V1 and V2 write commands.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
New unit tests.

Closes #37099 from allisonwang-db/spark-37287-v1-writes.

Authored-by: allisonwang-db <allison.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
- Add `V1Write` to hold some sort infos of a v1 write, e.g. partition columns, bucket spec. `V1Write` includes both datasource and hive.
- Add `V1Writes` to decide if we should add a `Sort` operator based on its `V1Write.requiredOrdering`. This rule should be similar with `V2Writes`.
- Remove the `SortExec` in `FileFormatWriter.write`.
.Why are the changes needed?
`FileFormatWriter.write` is now used by all V1 writes, which include datasource and hive tables. However, it contains a sort based on the dynamic partition and bucket columns that cannot be seen in the plan directly.
V2 write has a better approach: it satisfies the required ordering (or even distribution) using the rule `V2Writes`.
V1 write should do a similar thing to V2 write.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
This is a code refactor, so it should pass the existing CI.