[SPARK-12251] Document and improve off-heap memory configurations #10237
Conversation
Any preferences RE:
Test build #47469 has finished for PR 10237 at commit
Jenkins, retest this please.
Test build #47482 has finished for PR 10237 at commit
Test build #47486 has finished for PR 10237 at commit
LGTM, will merge once it passes tests.
Test build #2200 has finished for PR 10237 at commit
Haha, because of the Spark PRs update issues / downtime, your NewSparkPullRequestBuilder jobs ran against an older commit rather than the latest commit, which contains the fix for that test failure.
Test build #2199 has finished for PR 10237 at commit
Test build #2201 has finished for PR 10237 at commit
Test build #47541 has finished for PR 10237 at commit
Merged into master and 1.6.
This patch adds documentation for Spark configurations that affect off-heap memory and makes some naming and validation improvements for those configs.

- Change `spark.memory.offHeapSize` to `spark.memory.offHeap.size`. This is fine because this configuration has not shipped in any Spark release yet (it's new in Spark 1.6).
- Deprecate `spark.unsafe.offHeap` in favor of a new `spark.memory.offHeap.enabled` configuration. The motivation behind this change is to gather all memory-related configurations under the same prefix.
- Add a check which prevents users from setting `spark.memory.offHeap.enabled=true` when `spark.memory.offHeap.size == 0`. After SPARK-11389 (#9344), which was committed in Spark 1.6, Spark enforces a hard limit on the amount of off-heap memory that it will allocate to tasks. As a result, enabling off-heap execution memory without setting `spark.memory.offHeap.size` will lead to immediate OOMs. The new configuration validation makes this scenario easier to diagnose, helping to avoid user confusion.
- Document these configurations on the configuration page.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10237 from JoshRosen/SPARK-12251.

(cherry picked from commit 23a9e62)
Signed-off-by: Andrew Or <andrew@databricks.com>
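For context, here is a minimal sketch (not part of the patch itself) of how the renamed configurations are set on a `SparkConf`. With the new validation, `spark.memory.offHeap.size` must be a positive number of bytes whenever `spark.memory.offHeap.enabled` is true; the app name and size value below are illustrative:

```scala
import org.apache.spark.SparkConf

// Enable off-heap execution memory and give it an explicit budget.
// With the check added in this patch, enabling off-heap memory while
// leaving spark.memory.offHeap.size at its default of 0 fails fast at
// configuration time instead of causing immediate OOMs in tasks.
val conf = new SparkConf()
  .setAppName("OffHeapExample")
  .set("spark.memory.offHeap.enabled", "true")    // replaces the deprecated spark.unsafe.offHeap
  .set("spark.memory.offHeap.size", "2147483648") // 2 GiB, in bytes; must be > 0 when enabled
```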