
[SPARK-39709][SQL] The result of executeCollect and doExecute of TakeOrderedAndProjectExec should be the same #37118

Closed
wangyum wants to merge 1 commit into apache:master from wangyum:SPARK-39709

Conversation


@wangyum wangyum commented Jul 7, 2022

What changes were proposed in this pull request?

This PR makes TakeOrderedAndProjectExec's executeCollect use doExecute()'s result.

Why are the changes needed?

To make the result of executeCollect and doExecute of TakeOrderedAndProjectExec the same. For example:

import testImplicits._

Seq((1, 1), (1, 2), (2, 3), (2, 4), (3, 5), (3, 6), (3, 7)).toDF("a", "b")
  .orderBy("a")
  .selectExpr("b")
  .limit(6)
  .show() // or: .collect().foreach(println)

.show() will use doExecute and the result is:

+---+
|  b|
+---+
|  1|
|  2|
|  3|
|  4|
|  5|
|  6|
+---+

.collect().foreach(println) will use executeCollect and the result is:

[2]
[1]
[3]
[4]
[7]
[6]
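The divergence can be modeled with a toy sketch. `RDD.takeOrdered` keeps the smallest `limit` rows per partition in a bounded priority queue and then merges the per-partition results, and among rows with equal sort keys the merge order is not fixed. Below is a minimal Python sketch (not Spark's actual implementation; the two-way partition split and the reversed arrival order are illustrative assumptions):

```python
import heapq

# Rows of (a, b); we sort by `a`, which contains duplicate values.
rows = [(1, 1), (1, 2), (2, 3), (2, 4), (3, 5), (3, 6), (3, 7)]
limit = 6
key = lambda r: r[0]

# doExecute-style path (sketch): sort all rows, then take the first `limit`.
full_sort = [b for _, b in sorted(rows, key=key)[:limit]]

# executeCollect-style path (sketch): each partition keeps its own smallest
# `limit` rows; the per-partition results then arrive in an arbitrary order
# before the final merge.
partitions = [rows[:3], rows[3:]]                  # assumed partitioning
arrived = [r for part in reversed(partitions)      # arrival order not guaranteed
           for r in heapq.nsmallest(limit, part, key=key)]
take_ordered = [b for _, b in sorted(arrived, key=key)[:limit]]

# full_sort == [1, 2, 3, 4, 5, 6] but take_ordered == [1, 2, 4, 3, 5, 6]:
# rows with equal `a` come out in arrival order, not input order.
```

Because the final sort is stable, ties keep their arrival order, so any tie that straddles a partition boundary can come out differently on the two paths.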

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit test.

@github-actions github-actions bot added the SQL label Jul 7, 2022

wangyum commented Jul 7, 2022

cc @cloud-fan @HyukjinKwon

val limited = if (orderingSatisfies) {
  child.execute().mapPartitionsInternal(_.map(_.copy()).take(limit)).takeOrdered(limit)(ord)
} else {
  child.execute().mapPartitionsInternal(_.map(_.copy())).takeOrdered(limit)(ord)
}
Contributor

hmm, so RDD.takeOrdered does not return ordered records?

Member Author

No. The result may not match when the sort column contains duplicate values. For example, ordering by column a:

+---+---+
|  a|  b|
+---+---+
|  1|  1|
|  1|  2|
|  2|  3|
|  2|  4|
|  3|  5|
|  3|  6|
+---+---+

@JoshRosen JoshRosen left a comment

In this example, the ordering does not produce a total order: there are multiple rows for each sort key:

scala> Seq((1, 1), (1, 2), (2, 3), (2, 4), (3, 5), (3, 6), (3, 7)).sortBy(_._1).map(_._2)
res5: Seq[Int] = List(1, 2, 3, 4, 5, 6, 7)

scala> Seq((1, 1), (1, 2), (2, 3), (2, 4), (3, 5), (3, 6), (3, 7)).reverse.sortBy(_._1).map(_._2)
res6: Seq[Int] = List(2, 1, 4, 3, 7, 6, 5)

In these cases, the ordering of the inputs to the sort will affect the final outcome.

Spark does not guarantee the order in which shuffle blocks are fetched: doExecute() shuffles the per-partition results into a single reduce partition and then performs a final sort there, but the order in which that final partition fetches data from the mappers is not guaranteed. As a result, the final sort may receive its input in different orders across executions, which can lead to non-deterministic results.
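This fetch-order effect can be sketched in miniature (plain Python, not Spark; the block contents and the enumeration of arrival orders are illustrative assumptions):

```python
import itertools

# Hypothetical sketch: the single reduce task concatenates map outputs in
# whatever order the shuffle blocks arrive. With duplicate sort keys, a
# stable final sort preserves that arrival order among ties, so different
# fetch orders can yield different top-6 results.
map_outputs = [
    [(1, 1), (2, 3), (3, 5)],   # duplicate keys are spread across blocks
    [(1, 2), (2, 4), (3, 6)],
    [(3, 7)],
]

results = set()
for order in itertools.permutations(map_outputs):
    fetched = [row for block in order for row in block]
    top = tuple(b for _, b in sorted(fetched, key=lambda r: r[0])[:6])
    results.add(top)

# More than one distinct tuple in `results` means the output depends on
# the order in which the blocks were fetched.
```

Any per-process determinism (such as the patch's extra pass through doExecute) cannot remove this source of variation, because it originates in the network-level fetch order.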

That effect doesn't show up in the toy examples that use LocalRelation / parallelize, but I think we could demonstrate it with a more realistic example involving multiple input partitions and multiple executors.

As a result, I don't think this patch's changes are sufficient to guarantee deterministic results when the sort order does not totally order the records. Therefore I don't think we should make this change: it adds performance overhead but does not actually solve the problem of differing results under non-total sort orders.


wangyum commented Jul 8, 2022

Thank you @JoshRosen. Makes sense. I will close this PR.


Labels: SQL

3 participants