[SPARK-21595] Separate thresholds for buffering and spilling in ExternalAppendOnlyUnsafeRowArray #18843
Conversation
@hvanhovell: let me know what you think about this.
Test build #80236 has finished for PR 18843 at commit
retest this please
Test build #80247 has finished for PR 18843 at commit
Force-pushed from 9f66038 to 398ccaf.
Test build #80255 has finished for PR 18843 at commit
Test build #80256 has finished for PR 18843 at commit
retest this please
LGTM - pending jenkins
* - If the spill threshold is too low, we spill frequently and incur unnecessary disk writes.
*   This may lead to a performance regression compared to the normal case of using an
*   [[ArrayBuffer]] or [[Array]].
* - If [[numRowsSpillThreshold]] is too high, the in-memory array may occupy more memory than
typo? this should be numRowsInMemoryBufferThreshold. We may spill before reaching numRowsSpillThreshold if memory is not enough.
Yes, it was a typo. Corrected it.
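For readers following the thread, here is a minimal, self-contained sketch of how the two knobs interact. This is not the actual Spark class; `TwoThresholdBuffer` and its `println` spill stub are illustrative stand-ins for `ExternalAppendOnlyUnsafeRowArray` and `UnsafeExternalSorter`:

```scala
import scala.collection.mutable.ArrayBuffer

class TwoThresholdBuffer[T](
    numRowsInMemoryBufferThreshold: Int,
    numRowsSpillThreshold: Int) {

  private val inMemoryBuffer = new ArrayBuffer[T]
  private var switchedToSorter = false
  private var rowsSinceLastSpill = 0

  def add(row: T): Unit = {
    if (!switchedToSorter && inMemoryBuffer.length < numRowsInMemoryBufferThreshold) {
      // Cheap path: a plain in-memory array, like the old ArrayBuffer behaviour.
      inMemoryBuffer += row
    } else {
      if (!switchedToSorter) {
        // First threshold crossed: hand the buffered rows over to the sorter.
        switchedToSorter = true
        rowsSinceLastSpill = inMemoryBuffer.length
        inMemoryBuffer.clear()
      }
      rowsSinceLastSpill += 1
      // Second threshold: force a spill to disk after numRowsSpillThreshold
      // rows (the real sorter can also spill earlier under memory pressure).
      if (rowsSinceLastSpill >= numRowsSpillThreshold) {
        println(s"spilling $rowsSinceLastSpill buffered rows to disk")
        rowsSinceLastSpill = 0
      }
    }
  }
}
```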
.createWithDefault(4096)
.createWithDefault(UnsafeExternalSorter.DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD.toInt)

val SORT_MERGE_JOIN_EXEC_BUFFER_IN_MEMORY_THRESHOLD =
can we just have one config for both window and SMJ? ideally we can say this config is for ExternalAppendOnlyUnsafeRowArray
I am fine with that. We can even go a step further and just have two configs: an in-memory threshold and a spill threshold at the ExternalAppendOnlyUnsafeRowArray level, for all its clients (currently SMJ, cartesian product, Window). That way we have consistency across all clients and both knobs. One downside is backward compatibility: the spill threshold was already defined per operator and people might be using it in prod.
Let me know what you think about that.
ok let's keep them separated for each operator.
.doc("Threshold for number of rows guaranteed to be held in memory by the sort merge " + | ||
"join operator") | ||
.intConf | ||
.createWithDefault(Int.MaxValue) |
is this a reasonable default value? won't it lead to OOM according to the document?
It is the current value. I suppose you want to be able to tune it if you have to. Not all of us are running Spark at FB scale :)...
Before introducing ExternalAppendOnlyUnsafeRowArray, SMJ used to hold in-memory data in Scala's ArrayBuffer. It's backed by an array which would at max be Int.MaxValue in size... so this default is keeping things as they were before.
got it
Test build #80451 has finished for PR 18843 at commit
Force-pushed from 398ccaf to a69969c.
Updated the PR as per review comments by @cloud-fan. I haven't made changes for all of his comments, and have replied in those places to continue the discussion.
Test build #80504 has finished for PR 18843 at commit
def sortMergeJoinExecBufferSpillThreshold: Int =
  getConf(SORT_MERGE_JOIN_EXEC_BUFFER_SPILL_THRESHOLD)

def sortMergeJoinExecBufferInMemoryThreshold: Int =
shall we introduce a similar config for cartesian product?
Sure. Since there was no in-memory buffer for cartesian product before, I am using a conservative value of 4096 for the in-memory buffer threshold. However, the spill threshold is set to UnsafeExternalSorter.DEFAULT_NUM_ELEMENTS_FOR_SPILL_THRESHOLD like it was before.
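For reference, the resulting per-operator knobs can be set like any other SQL conf. A hedged example; the key strings below follow the option names visible in this diff, but treat the exact spellings as assumptions:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("buffer-tuning").getOrCreate()

// Keep up to 10k rows per SMJ buffer in a plain array before switching to the
// spillable sorter; once in the sorter, force a disk spill after 1M rows.
spark.conf.set("spark.sql.sortMergeJoinExec.buffer.in.memory.threshold", 10000)
spark.conf.set("spark.sql.sortMergeJoinExec.buffer.spill.threshold", 1000000)

// The same pair of knobs exists for the other ExternalAppendOnlyUnsafeRowArray
// clients (window, cartesian product), each with its own defaults.
spark.conf.set("spark.sql.cartesianProductExec.buffer.in.memory.threshold", 4096)
spark.conf.set("spark.sql.windowExec.buffer.spill.threshold", 1000000)
```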
LGTM except one question, thanks for the fix!
Force-pushed from a69969c to ab5cd2e.
LGTM, pending jenkins
retest this please
jenkins test this please
Test build #80536 has finished for PR 18843 at commit
Merging to master/2.2. Thanks!
What changes were proposed in this pull request?
SPARK-21595 reported that there is excessive spilling to disk because the default spill threshold for `ExternalAppendOnlyUnsafeRowArray` is quite small for the WINDOW operator. The old behaviour of the WINDOW operator (pre #16909) was to hold data in an array for the first 4096 records, after which it would switch to `UnsafeExternalSorter` and start spilling to disk after reaching `spark.shuffle.spill.numElementsForceSpillThreshold` (or earlier if memory was scarce due to excessive consumers).
Currently, both the switch from in-memory storage to `UnsafeExternalSorter` and the `UnsafeExternalSorter` spilling to disk are controlled by a single threshold for `ExternalAppendOnlyUnsafeRowArray`. This PR separates the two to allow more granular control.
How was this patch tested?
Added unit tests
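As a closing illustration, a sketch of the kind of query whose buffering these thresholds tune: a window function over large partitions, where each partition's rows pass through an `ExternalAppendOnlyUnsafeRowArray`. Assumes a `SparkSession` named `spark`; the sizes are illustrative:

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

// 10M rows spread over 10 large window partitions.
val df = spark.range(0, 10000000L).withColumn("grp", col("id") % 10)

// Each of the 10 partitions buffers ~1M rows while row_number() is evaluated,
// so the in-memory and spill thresholds directly govern its memory/disk usage.
val w = Window.partitionBy("grp").orderBy("id")
df.withColumn("rn", row_number().over(w)).count()
```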