Peak JVM used memory heuristic - (Depends on Custom SHS - Requires peakJvmUsedMemory metric) #283
Conversation
Force-pushed from c8e15e5 to b310982
 * A heuristic based on peak JVM used memory for the spark executors and drivers
 *
 */
Extra newline.
if(evaluator.severityExecutor.getValue > Severity.LOW.getValue) {
  new HeuristicResultDetails("Note", "The allocated memory for the executor (in " + SPARK_EXECUTOR_MEMORY +") is much more than the peak JVM used memory by executors.")
  new HeuristicResultDetails("Reasonable size for executor memory", ((1+BUFFER_PERCENT/100)*evaluator.maxExecutorPeakJvmUsedMemory).toString)
Will this evaluate properly, with integer division? How about 100.0?
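The reviewer's concern is real: in Scala, `BUFFER_PERCENT / 100` is integer division when `BUFFER_PERCENT` is an `Int`, so any buffer below 100% rounds to zero. A minimal sketch illustrating the pitfall and the fix (`BufferDemo` and the 25% value are illustrative assumptions, not the PR's exact constants):

```scala
object BufferDemo {
  val BUFFER_PERCENT: Int = 25  // assumed buffer percentage for illustration

  // Integer division: 25 / 100 == 0, so the buffer silently disappears.
  def withIntBuffer(peak: Long): Long = (1 + BUFFER_PERCENT / 100) * peak

  // Promote to Double first so the 25% buffer is actually applied.
  def withDoubleBuffer(peak: Long): Long =
    ((1 + BUFFER_PERCENT.toDouble / 100.0) * peak).toLong

  def main(args: Array[String]): Unit = {
    val peak = 4L * 1024 * 1024 * 1024  // e.g. 4 GB peak JVM used memory
    println(withIntBuffer(peak))    // 4294967296 -- no buffer added
    println(withDoubleBuffer(peak)) // 5368709120 -- peak plus 25%
  }
}
```

This is why the later revision of the diff uses `BUFFER_PERCENT.toDouble/100.0`.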
)
result
}
Extra newline.
*@
<p>This is a heuristic for peak JVM used memory.</p>
<h4>Executor Max Peak JVM Used Memory</h4>
<p>This is to analyse whether the executor memory is set to a good value. To avoid wasted memory, its checked that the peak JVM used memory is reasonably close to the allocated executor memory, (spark.executor.memory) -- if it is much smaller, then executor memory should be reduced.</p>
For "its checked that the peak JVM used memory", how about "it checks if the peak JVM used memory"
new HeuristicResultDetails("Note", "The allocated memory for the driver (in " + SPARK_DRIVER_MEMORY + ") is much more than the peak JVM used memory by the driver.")
}

if(evaluator.severitySkew.getValue > Severity.LOW.getValue) {
Looking at some Spark applications, it looks like tasks can often be assigned to a fraction of the executors (based on node locality). It may not make sense to check for skew at the executor level. We can look for skew (data, peak execution memory and execution time) in tasks instead. We can also check how many executors have tasks assigned, and how imbalanced the assignment is -- we can recommend using fewer executors (and perhaps more cores), and/or locality. These would be separate heuristics, and we can discuss at the next meeting, or set up a separate meeting to discuss.
Thanks for making the changes. For now, can you disable the skew check? Everything else looks good.
Hi Edwina,
Thank you for taking the time to review. :)
I will disable the skew test and discuss adding it at the task level with you.
Regards.
In app/com/linkedin/drelephant/spark/heuristics/JvmUsedMemoryHeuristic.scala:
+
+ var resultDetails = Seq(
+ new HeuristicResultDetails("Max peak JVM used memory", evaluator.maxExecutorPeakJvmUsedMemory.toString),
+ new HeuristicResultDetails("Median peak JVM used memory", evaluator.medianPeakJvmUsedMemory.toString)
+ )
+
+ if(evaluator.severityExecutor.getValue > Severity.LOW.getValue) {
+ new HeuristicResultDetails("Note", "The allocated memory for the executor (in " + SPARK_EXECUTOR_MEMORY +") is much more than the peak JVM used memory by executors.")
+ new HeuristicResultDetails("Reasonable size for executor memory", ((1+BUFFER_PERCENT.toDouble/100.0)*evaluator.maxExecutorPeakJvmUsedMemory).toString)
+ }
+
+ if(evaluator.severityDriver.getValue > Severity.LOW.getValue) {
+ new HeuristicResultDetails("Note", "The allocated memory for the driver (in " + SPARK_DRIVER_MEMORY + ") is much more than the peak JVM used memory by the driver.")
+ }
+
+ if(evaluator.severitySkew.getValue > Severity.LOW.getValue) {
Thanks for making the change. Task-level skew would be done with the stage metrics, so you can remove the lines, but commented out is fine too.
Sure! Will remove the lines. 👍
app-conf/HeuristicConf.xml
Outdated
@@ -193,6 +193,12 @@
    <classname>com.linkedin.drelephant.spark.heuristics.StagesHeuristic</classname>
    <viewname>views.html.help.spark.helpStagesHeuristic</viewname>
  </heuristic>
  <heuristic>
    <applicationtype>spark</applicationtype>
    <heuristicname>Spark JVM Used Memory</heuristicname>
JVM Used Memory
  _.peakJvmUsedMemory.getOrElse(JVM_USED_MEMORY, 0).asInstanceOf[Number].longValue
}.max

val DEFAULT_MAX_EXECUTOR_PEAK_JVM_USED_MEMORY_THRESHOLDS =
Configure thresholds in HeuristicConf
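Following the reviewer's suggestion, the thresholds could live in HeuristicConf.xml rather than in code. A hedged sketch, extending the diff shown earlier in this thread (the `<viewname>` and the `<params>` key name are illustrative assumptions; the actual keys depend on how the heuristic reads its params):

```xml
<heuristic>
  <applicationtype>spark</applicationtype>
  <heuristicname>Spark JVM Used Memory</heuristicname>
  <classname>com.linkedin.drelephant.spark.heuristics.JvmUsedMemoryHeuristic</classname>
  <viewname>views.html.help.spark.helpJvmUsedMemoryHeuristic</viewname>
  <params>
    <!-- Illustrative severity thresholds; names are assumptions, not from the PR. -->
    <executor_peak_jvm_memory_threshold>1.25,1.5,2,3</executor_peak_jvm_memory_threshold>
  </params>
</heuristic>
```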
val executorList : Seq[ExecutorSummary] = executorSummaries.filterNot(_.id.equals("driver"))
val sparkExecutorMemory : Long = (appConfigurationProperties.get(SPARK_EXECUTOR_MEMORY).map(MemoryFormatUtils.stringToBytes)).getOrElse(0L)
val sparkDriverMemory : Long = appConfigurationProperties.get(SPARK_DRIVER_MEMORY).map(MemoryFormatUtils.stringToBytes).getOrElse(0L)
val medianPeakJvmUsedMemory: Long = if (executorList.isEmpty) 0L else executorList.map {
I don't see this being used anywhere? Mention it along with the Heuristic Details
val sparkDriverMemory : Long = appConfigurationProperties.get(SPARK_DRIVER_MEMORY).map(MemoryFormatUtils.stringToBytes).getOrElse(0L)
val medianPeakJvmUsedMemory: Long = if (executorList.isEmpty) 0L else executorList.map {
  _.peakJvmUsedMemory.getOrElse(JVM_USED_MEMORY, 0L).asInstanceOf[Number].longValue
}.sortWith(_ < _).drop(executorList.size/2).head
Median computation is not accurate. Take a look at this code. http://rosettacode.org/wiki/Averages/Median#Scala
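The code under review takes `.drop(size/2).head` of the sorted list, which is the upper of the two middle elements for even-sized lists, not the median. A minimal sketch of an accurate median in the style of the Rosetta Code page the reviewer links to (`MedianDemo` is an illustrative name, not from the PR):

```scala
object MedianDemo {
  /** Median of a non-empty sequence: middle element for odd sizes,
    * mean of the two middle elements for even sizes. */
  def median(values: Seq[Long]): Long = {
    require(values.nonEmpty, "median of empty sequence")
    val sorted = values.sorted
    val mid = sorted.size / 2
    if (sorted.size % 2 == 1) sorted(mid)
    else (sorted(mid - 1) + sorted(mid)) / 2  // integer mean of the two middle values
  }

  def main(args: Array[String]): Unit = {
    println(median(Seq(5L, 1L, 3L)))     // 3
    println(median(Seq(4L, 1L, 3L, 2L))) // 2 (mean of 2 and 3, integer division)
  }
}
```

For memory values in bytes, the integer rounding in the even case is negligible, but the two-middle-element average is what distinguishes this from the `.drop(size/2).head` version.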
val JVM_USED_MEMORY = "jvmUsedMemory"
val SPARK_EXECUTOR_MEMORY = "spark.executor.memory"
val SPARK_DRIVER_MEMORY = "spark.driver.memory"
val reservedMemory : Long = 314572800
Add a comment on the value. Use 300 * FileUtils.ONE_MB or 300 * 1024 * 1024 for clarity.
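A sketch of what the reviewer is asking for: derive the magic number from named constants so a reader can see it is 300 MB (the object and constant names here are illustrative, not the PR's code):

```scala
object ReservedMemory {
  val ONE_MB: Long = 1024L * 1024

  // Spark reserves 300 MB of heap for internal/system use;
  // spelling it out as 300 * ONE_MB documents the intent.
  val reservedMemory: Long = 300 * ONE_MB  // == 314572800
}
```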
…akJvmUsedMemory metric) (linkedin#283)
…uires peakJvmUsedMemory metric) (linkedin#283)" (linkedin#317) This reverts commit 6b2f7e8.
To determine whether the executor memory is set to a good value, the average (or median) and maximum peak JVM used memory across executors are examined. Information about each executor's peak JVM used memory will be available from the executors REST API, in maxPeakJvmUsedMemory.peakJvmUsedMemory. To avoid wasted memory, the heuristic checks that the peak JVM used memory is reasonably close to the allocated executor memory (spark.executor.memory); if it is much smaller, then executor memory can be reduced.
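The check described above can be sketched as a small function: allocated memory counts as wasted when it exceeds the peak JVM used memory plus a buffer. This is a minimal sketch, assuming a 25% buffer and illustrative names (`PeakJvmMemorySketch`, `wastedMemory`), not the PR's exact implementation:

```scala
object PeakJvmMemorySketch {
  val BUFFER_PERCENT = 25  // assumed headroom over peak usage

  /** Recommended executor memory: peak JVM used memory plus the buffer. */
  def recommendedExecutorMemory(maxPeakJvmUsedMemory: Long): Long =
    ((1 + BUFFER_PERCENT.toDouble / 100.0) * maxPeakJvmUsedMemory).toLong

  /** True when the allocated executor memory exceeds peak usage by more
    * than the buffer, i.e. spark.executor.memory could be reduced. */
  def wastedMemory(sparkExecutorMemory: Long, maxPeakJvmUsedMemory: Long): Boolean =
    sparkExecutorMemory > recommendedExecutorMemory(maxPeakJvmUsedMemory)
}
```

For example, an 8 GB `spark.executor.memory` with a 4 GB peak would be flagged, while a 5 GB allocation (within the 25% buffer) would not.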