Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-15273] YarnSparkHadoopUtil#getOutOfMemoryErrorArgument should respect OnOutOfMemoryError parameter given by user #13057

Closed
wants to merge 16 commits into from

Conversation

tedyu
Copy link
Contributor

@tedyu tedyu commented May 11, 2016

What changes were proposed in this pull request?

As Nirav reported in this thread:
http://search-hadoop.com/m/q3RTtdF3yNLMd7u

YarnSparkHadoopUtil#getOutOfMemoryErrorArgument previously specified 'kill %p' unconditionally.
We should respect the parameter given by user.

How was this patch tested?

Existing tests

@tedyu tedyu changed the title YarnSparkHadoopUtil#getOutOfMemoryErrorArgument should respect OnOutOfMemoryError parameter given by user [SPARK-15273] YarnSparkHadoopUtil#getOutOfMemoryErrorArgument should respect OnOutOfMemoryError parameter given by user May 11, 2016
@SparkQA
Copy link

SparkQA commented May 11, 2016

Test build #58397 has finished for PR 13057 at commit 1fa3634.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 11, 2016

Test build #58398 has finished for PR 13057 at commit b0a84ff.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 11, 2016

Test build #58400 has finished for PR 13057 at commit 9bf7967.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

if (Utils.isWindows) {
escapeForShell("-XX:OnOutOfMemoryError=taskkill /F /PID %%%%p")
} else {
"-XX:OnOutOfMemoryError='kill %p'"
val onOOME = javaOpts.find(x => x.contains("-XX:OnOutOfMemoryError"))
if (onOOME == None) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are several things wrong with this Scala code, but more generally, this isn't a good way to design this method. If you really mean to optionally add an argument, then see how things like the PermGen argument are handled and make it work more that way.

@SparkQA
Copy link

SparkQA commented May 12, 2016

Test build #58483 has finished for PR 13057 at commit 044ca7e.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 12, 2016

Test build #58485 has finished for PR 13057 at commit 2e05881.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 12, 2016

Test build #58486 has finished for PR 13057 at commit c7b17ed.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 12, 2016

Test build #58487 has started for PR 13057 at commit 67e1d3a.

@shaneknapp
Copy link
Contributor

i will retrigger this build once maintenance is over.

@tedyu
Copy link
Contributor Author

tedyu commented May 12, 2016

@srowen
Can you take another look ?

Thanks

@@ -418,11 +420,11 @@ object YarnSparkHadoopUtil {
*
* @return The correct OOM Error handler JVM option, platform dependent.
*/
def getOutOfMemoryErrorArgument: String = {
def getOutOfMemoryErrorArgument(sparkConf: SparkConf, javaOpts: ListBuffer[String]): String = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need sparkConf as an arg right? and javaOpts can just be a Seq?

@shaneknapp
Copy link
Contributor

jenkins, test this please

1 similar comment
@shaneknapp
Copy link
Contributor

jenkins, test this please

@SparkQA
Copy link

SparkQA commented May 12, 2016

Test build #58498 has finished for PR 13057 at commit bf06049.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tedyu
Copy link
Contributor Author

tedyu commented May 13, 2016

@srowen
Mind taking another look ?

Thanks

@SparkQA
Copy link

SparkQA commented May 13, 2016

Test build #58562 has finished for PR 13057 at commit c5a0971.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 13, 2016

Test build #58568 has finished for PR 13057 at commit 7f05b71.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tedyu
Copy link
Contributor Author

tedyu commented May 13, 2016

@srowen
I think I have addressed your comments.

Cheers

@tedyu
Copy link
Contributor Author

tedyu commented May 15, 2016

@srowen
Gentle ping.

@tedyu
Copy link
Contributor Author

tedyu commented May 16, 2016

@srowen
Pardon for the ping.

@@ -334,6 +334,18 @@ static void addPermGenSizeOpt(List<String> cmd) {
}

/**
* Gets the OutOfMemoryError option for Spark if the user hasn't set it.
*/
public static void addOutOfMemoryErrorArgument(List<String> cmd) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry last question -- why does this method need to be in this class? it's not used, it seems, except from YarnSparkHadoopUtil

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please suggest a suitable class which is better host for this Java method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just YarnSparkHadoopUtil ? it can be inlined the one place it's called or am I overlooking something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YarnSparkHadoopUtil is written in Scala while this method is in Java.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't it be written in Scala?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you take a look at my initial attempt ?

+      val = onOOME = javaOpts.find(x => x.contains("-XX:OnOutOfMemoryError"))
+      if (onOOME == None) {
+        "-XX:OnOutOfMemoryError='kill %p'"
+      } else {
+        ""
+      }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, maybe ...

if (!javaOpts.exists(_.contains("...")) {
  javaOpts.add("...")
}

@SparkQA
Copy link

SparkQA commented May 17, 2016

Test build #58707 has finished for PR 13057 at commit 1ff83ff.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class CommandBuilderUtils

@SparkQA
Copy link

SparkQA commented May 17, 2016

Test build #58703 has finished for PR 13057 at commit 0847e6e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 17, 2016

Test build #58709 has finished for PR 13057 at commit 67d5bfc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -128,6 +128,11 @@ private[yarn] class ExecutorRunnable(
}
}

// Kill if OOM is raised - leverage yarn's failure handling to cause rescheduling.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a comment on addOutOfMemoryErrorArgument right? that's what I meant.

@SparkQA
Copy link

SparkQA commented May 18, 2016

Test build #58778 has finished for PR 13057 at commit 09fcb1e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tedyu
Copy link
Contributor Author

tedyu commented May 19, 2016

@srowen
See if I have addressed all your comments.

@tedyu
Copy link
Contributor Author

tedyu commented May 20, 2016

@srowen
Gentle ping.

if (Utils.isWindows) {
escapeForShell("-XX:OnOutOfMemoryError=taskkill /F /PID %%%%p")
if (!javaOpts.exists(_.contains("-XX:OnOutOfMemoryError"))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but this duplicates the condition. You probably want to flip the nesting of the if conditions here to avoid it

@SparkQA
Copy link

SparkQA commented May 20, 2016

Test build #58995 has finished for PR 13057 at commit e7c4472.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tedyu
Copy link
Contributor Author

tedyu commented May 20, 2016

@srowen :
Is this ready to go in ?

Thanks

@srowen
Copy link
Member

srowen commented May 20, 2016

@tedyu yes, but that's a lot of pinging. I think many iterations on this simple change could have been saved, so I'd focus not on hurrying to merge, but thinking through the feedback and changes you're making more holistically.

asfgit pushed a commit that referenced this pull request May 20, 2016
…respect OnOutOfMemoryError parameter given by user

## What changes were proposed in this pull request?

As Nirav reported in this thread:
http://search-hadoop.com/m/q3RTtdF3yNLMd7u

YarnSparkHadoopUtil#getOutOfMemoryErrorArgument previously specified 'kill %p' unconditionally.
We should respect the parameter given by user.

## How was this patch tested?

Existing tests

Author: tedyu <yuzhihong@gmail.com>

Closes #13057 from tedyu/master.

(cherry picked from commit 06c9f52)
Signed-off-by: Sean Owen <sowen@cloudera.com>
@srowen
Copy link
Member

srowen commented May 20, 2016

Merged to master/2.0

@asfgit asfgit closed this in 06c9f52 May 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants