
[SPARK-12251] Document and improve off-heap memory configurations #10237

Closed
JoshRosen wants to merge 5 commits into master from JoshRosen:SPARK-12251

Conversation

JoshRosen
Contributor

This patch adds documentation for Spark configurations that affect off-heap memory and makes some naming and validation improvements for those configs.

  • Change `spark.memory.offHeapSize` to `spark.memory.offHeap.size`. This is fine because this configuration has not shipped in any Spark release yet (it's new in Spark 1.6).
  • Deprecate `spark.unsafe.offHeap` in favor of a new `spark.memory.offHeap.enabled` configuration. The motivation behind this change is to gather all memory-related configurations under the same prefix.
  • Add a check which prevents users from setting `spark.memory.offHeap.enabled=true` when `spark.memory.offHeap.size == 0`. After SPARK-11389 ([SPARK-11389][CORE] Add support for off-heap memory to MemoryManager #9344), which was committed in Spark 1.6, Spark enforces a hard limit on the amount of off-heap memory that it will allocate to tasks. As a result, enabling off-heap execution memory without setting `spark.memory.offHeap.size` will lead to immediate OOMs. The new configuration validation makes this scenario easier to diagnose, helping to avoid user confusion (see the example sketch after this list).
  • Document these configurations on the configuration page.
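
For reference, a minimal sketch of how these settings fit together on a `SparkConf`. The app name and the 512 MB budget are illustrative values, not part of this patch; only the two `spark.memory.offHeap.*` keys and the validation behavior come from the changes above:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Enable off-heap execution memory and give it an explicit budget.
// With this patch, setting spark.memory.offHeap.enabled=true while leaving
// spark.memory.offHeap.size at its default of 0 fails configuration
// validation up front instead of causing immediate OOMs at runtime.
val conf = new SparkConf()
  .setAppName("offheap-example")                 // illustrative app name
  .set("spark.memory.offHeap.enabled", "true")   // replaces the deprecated spark.unsafe.offHeap
  .set("spark.memory.offHeap.size", "536870912") // 512 MB of off-heap memory, in bytes

val sc = new SparkContext(conf)
```

The same keys can equally be supplied via `--conf` on `spark-submit`.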

@JoshRosen
Contributor Author

Any preferences re: `offHeap` vs. `offheap`?

@SparkQA

SparkQA commented Dec 10, 2015

Test build #47469 has finished for PR 10237 at commit 2dd2d9e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor Author

Jenkins, retest this please.

@SparkQA

SparkQA commented Dec 10, 2015

Test build #47482 has finished for PR 10237 at commit 216fc46.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 10, 2015

Test build #47486 has finished for PR 10237 at commit 216fc46.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor

LGTM, will merge once it passes tests.

@SparkQA

SparkQA commented Dec 10, 2015

Test build #2200 has finished for PR 10237 at commit 216fc46.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):
    * public class JavaBinarizerExample
    * public class JavaBucketizerExample
    * public class JavaDCTExample
    * public class JavaElementwiseProductExample
    * public class JavaMinMaxScalerExample
    * public class JavaNGramExample
    * public class JavaNormalizerExample
    * public class JavaOneHotEncoderExample
    * public class JavaPCAExample
    * public class JavaPolynomialExpansionExample
    * public class JavaRFormulaExample
    * public class JavaStandardScalerExample
    * public class JavaStopWordsRemoverExample
    * public class JavaStringIndexerExample
    * public class JavaTokenizerExample
    * public class JavaVectorAssemblerExample
    * public class JavaVectorIndexerExample
    * public class JavaVectorSlicerExample

@JoshRosen
Contributor Author

Haha, because of the Spark PRs dashboard update issues / downtime, your NewSparkPullRequestBuilder runs were triggered against an older commit rather than the latest commit, which contains the fix for that test failure.

@SparkQA

SparkQA commented Dec 10, 2015

Test build #2199 has finished for PR 10237 at commit 216fc46.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds the following public classes (experimental):
    * public class JavaBinarizerExample
    * public class JavaBucketizerExample
    * public class JavaDCTExample
    * public class JavaElementwiseProductExample
    * public class JavaMinMaxScalerExample
    * public class JavaNGramExample
    * public class JavaNormalizerExample
    * public class JavaOneHotEncoderExample
    * public class JavaPCAExample
    * public class JavaPolynomialExpansionExample
    * public class JavaRFormulaExample
    * public class JavaStandardScalerExample
    * public class JavaStopWordsRemoverExample
    * public class JavaStringIndexerExample
    * public class JavaTokenizerExample
    * public class JavaVectorAssemblerExample
    * public class JavaVectorIndexerExample
    * public class JavaVectorSlicerExample
    * logInfo(s"HBase class not found $e")
    * logDebug("HBase class not found", e)

@SparkQA

SparkQA commented Dec 10, 2015

Test build #2201 has finished for PR 10237 at commit 216fc46.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 10, 2015

Test build #47541 has finished for PR 10237 at commit 12ffce3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@andrewor14
Contributor

Merged into master and 1.6.

asfgit pushed a commit that referenced this pull request Dec 10, 2015

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10237 from JoshRosen/SPARK-12251.

(cherry picked from commit 23a9e62)
Signed-off-by: Andrew Or <andrew@databricks.com>
@asfgit asfgit closed this in 23a9e62 Dec 10, 2015
@JoshRosen JoshRosen deleted the SPARK-12251 branch August 29, 2016 19:28