Peak Unified Memory Heuristic - (Depends on Custom SHS - Requires peakUnifiedMemory metric) #281
Conversation
 *
 * This heuristic reports the fraction of memory used/ memory allocated for execution and if the fraction can be reduced. Also, it checks for the skew in peak unified memory and reports if the skew is too much.
 */
Extra newline.
import com.linkedin.drelephant.spark.data.SparkApplicationData
import com.linkedin.drelephant.spark.fetchers.statusapiv1.ExecutorSummary
import scala.collection.JavaConverters
Extra newline.
import com.linkedin.drelephant.analysis._
import com.linkedin.drelephant.configurations.heuristic.HeuristicConfigurationData
import com.linkedin.drelephant.spark.data.SparkApplicationData
import com.linkedin.drelephant.spark.fetchers.statusapiv1.ExecutorSummary
Newline between linkedin and scala imports?
/**
 * A heuristic based on peak unified memory for the spark executors
 *
 * This heuristic reports the fraction of memory used/ memory allocated for execution and if the fraction can be reduced. Also, it checks for the skew in peak unified memory and reports if the skew is too much.
Unified memory is both execution and storage.
class UnifiedMemoryHeuristic(private val heuristicConfigurationData: HeuristicConfigurationData)
  extends Heuristic[SparkApplicationData] {

  import UnifiedMemoryHeuristic._
Are these imports needed?
Yes, we need the UnifiedMemoryHeuristic import as we are using the "evaluator" inside the class, and the JavaConverters import because resultDetails is converted (.asJava) before being passed as an argument to HeuristicResult.
<p>If the ratio of unified memory to executor memory is much smaller than "spark.memory.fraction", then more memory has been reserved for execution than is being used. This memory can instead be allocated for user memory, and/or total executor memory could be reduced.</p>
<p>spark.memory.fraction: this is the fraction of (executor memory - reserved memory) used for execution and storage. This partitions user memory from execution and storage memory.</p>
<h4>Unified Memory Skew</h4>
<p>Skew in the amount of unified memory for different executors might indicates a similar imbalance in the amount of work (and data) for tasks. It should be more balanced.</p>
Change "might indicates" to "might indicate"
object UnifiedMemoryHeuristic {

  val JVM_USED_MEMORY = "jvmUsedMemory"
This should be replaced by vals for storage memory and execution memory.
I didn't get this, could you please be more specific?
Unified memory is the sum of storage and execution memory. This is different from JVM used memory.
Storage memory is the amount of memory used by Spark for storing RDDS, broadcast variables, etc.
Execution memory is the amount of memory used for execution (shuffle, sort, etc.).
For more recent versions of Spark, execution and storage memory share a unified memory region, and are able to borrow from each other, if one still has extra capacity. Unified memory is thus the sum of storage and execution memory.
JVM memory is the JVM memory used by the Spark application.
Yup, I understand that. Now peakUnifiedMemory contains the following key value pairs:
"peakUnifiedMemory" : {
  "jvmUsedMemory" : ,
  "executionMemory" : ,
  "storageMemory" : ,
  "time" : ,
  "activeStages" : [, ...]
}
so for the purpose of this heuristic, to calculate the peak unified memory, do you mean that I should take the sum of executionMemory and storageMemory instead of jvmUsedMemory?
Yes, please use the sum of executionMemory and storageMemory for unifiedMemory.
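That calculation can be sketched as follows. The map keys come from the peakUnifiedMemory structure quoted above; the object and method names are illustrative, not the PR's actual code:

```scala
// Sketch: derive peak unified memory from the peakUnifiedMemory metric map,
// summing executionMemory and storageMemory rather than using jvmUsedMemory.
object PeakUnifiedMemorySketch {
  val EXECUTION_MEMORY = "executionMemory"
  val STORAGE_MEMORY = "storageMemory"

  def peakUnifiedMemory(metrics: Map[String, Long]): Long =
    metrics.getOrElse(EXECUTION_MEMORY, 0L) + metrics.getOrElse(STORAGE_MEMORY, 0L)
}
```

Defaulting each missing key to 0L keeps the heuristic from throwing when an executor is missing one of the metrics.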
Sure.. Thank you! :)
SeverityThresholds(low = 1.5 * meanUnifiedMemory, moderate = 2 * meanUnifiedMemory, severe = 4 * meanUnifiedMemory, critical = 8 * meanUnifiedMemory, ascending = true)

def getPeakUnifiedMemoryExecutorSeverity(executorSummary: ExecutorSummary): Severity = {
  var jvmPeakUnifiedMemory: Long = executorSummary.peakUnifiedMemory.getOrElse(JVM_USED_MEMORY, 0)
This is memory controlled by Spark, so peakUnifiedMemory might be better than jvmPeakUnifiedMemory.
SeverityThresholds(low = 1.5 * meanUnifiedMemory, moderate = 2 * meanUnifiedMemory, severe = 4 * meanUnifiedMemory, critical = 8 * meanUnifiedMemory, ascending = true)

def getPeakUnifiedMemoryExecutorSeverity(executorSummary: ExecutorSummary): Severity = {
  var jvmPeakUnifiedMemory: Long = executorSummary.peakUnifiedMemory.getOrElse(JVM_USED_MEMORY, 0)
Unified memory is the sum of execution and storage memory. Right now it is using jvmUsedMemory.
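For reference, the ascending thresholds in the snippet above classify an executor's peak against multiples of the mean. A sketch of that classification, assuming string labels and a >= comparison rather than the project's actual Severity type:

```scala
object SkewSeveritySketch {
  // Severity labels in ascending order of badness; the thresholds mirror
  // the 1.5x / 2x / 4x / 8x multiples of the mean from the diff above.
  private val labels = Seq("NONE", "LOW", "MODERATE", "SEVERE", "CRITICAL")

  def severityOf(peakUnifiedMemory: Long, meanUnifiedMemory: Long): String = {
    val thresholds = Seq(1.5, 2.0, 4.0, 8.0).map(_ * meanUnifiedMemory)
    // Count how many thresholds the peak meets or exceeds; that count
    // indexes directly into the ascending label sequence.
    labels(thresholds.count(peakUnifiedMemory >= _))
  }
}
```

So an executor peaking at three times the mean lands on MODERATE, while one at nine times the mean is CRITICAL.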
val memoryFractionHeuristic = new UnifiedMemoryHeuristic(heuristicConfigurationData)

val executorData = Seq(
  newDummyExecutorData("1", 400000, Map("jvmUsedMemory" -> 394567)),
These should set execution and storage memory.
@@ -87,7 +86,9 @@ trait ExecutorSummary{
   def totalShuffleRead: Long
   def totalShuffleWrite: Long
   def maxMemory: Long
-  def executorLogs: Map[String, String]}
+  def executorLogs: Map[String, String]
+  def peakUnifiedMemory: Map[String, Long]
How do you get the peakUnifiedMemory value? I don't think the Spark history server reports it.
@@ -292,7 +293,8 @@ class ExecutorSummaryImpl(
   var totalShuffleRead: Long,
   var totalShuffleWrite: Long,
   var maxMemory: Long,
-  var executorLogs: Map[String, String]) extends ExecutorSummary
+  var executorLogs: Map[String, String],
+  var peakUnifiedMemory: Map[String, Long]) extends ExecutorSummary
How do you get the peakUnifiedMemory value? I don't think the Spark history server reports it.
 * License for the specific language governing permissions and limitations under
 * the License.
 *@
<p> This is a heuristic for peak Unified Memory </p>
This trivial sentence doesn't convey anything. You could say something like,
Peak Unified Memory Heuristic identifies and flags jobs which have over-allocated the Unified Memory region.
import org.apache.spark.scheduler.SparkListenerEnvironmentUpdate
import org.scalatest.{FunSpec, Matchers}
import com.linkedin.drelephant.spark.fetchers.statusapiv1.StageStatus
Organize the imports
app-conf/HeuristicConf.xml
@@ -193,6 +193,12 @@
   <classname>com.linkedin.drelephant.spark.heuristics.StagesHeuristic</classname>
   <viewname>views.html.help.spark.helpStagesHeuristic</viewname>
 </heuristic>
 <heuristic>
   <applicationtype>spark</applicationtype>
   <heuristicname>Spark Peak Unified Memory</heuristicname>
Because these heuristics are applicable only to Spark applications, I don't think we should mention Spark in the Heuristic Name. It could just be Peak Unified Memory
Maybe a personal preference. What do you think?
<p> This is a heuristic for peak Unified Memory </p>
<h4>Peak Unified Memory</h4>
<p>If the peak unified memory is much smaller than allocated executor memory then we recommend to decrease spark.memory.fraction, or total executor memory.</p>
<p>spark.memory.fraction: this is the fraction of (executor memory - reserved memory) used for execution and storage. This partitions user memory from execution and storage memory.</p>
Requesting changes:
- Add a <p>Note:</p> and then provide the details.
- Make spark.memory.fraction italics.
- Rephrase: This is the fraction of JVM Used Memory (Executor memory - Reserved memory) dedicated to the unified memory region (execution + storage).
*@
<p> This is a heuristic for peak Unified Memory </p>
<h4>Peak Unified Memory</h4>
<p>If the peak unified memory is much smaller than allocated executor memory then we recommend to decrease spark.memory.fraction, or total executor memory.</p>
Rephrase:
If the job's Peak Unified Memory Consumption is much smaller than the allocated Unified Memory space, then we recommend decreasing the allocated Unified Memory Region for your job.
Action Item:
The Allocated Unified Memory Region can be reduced in the following ways.
- If your job's Executor Memory is already low, then reduce spark.memory.fraction which will reduce the amount of space allocated to the Unified Memory Region.
- If your job's Executor Memory is high, then we recommend reducing the spark.executor.memory itself which will lower the Allocated Unified Memory space.
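Both action items correspond to standard Spark settings. A spark-defaults.conf fragment with purely illustrative values (the right numbers depend on the job's actual peak) might look like:

```
# Option 1: shrink the unified region's share of the heap
# (Spark's default for spark.memory.fraction is 0.6)
spark.memory.fraction    0.4

# Option 2: shrink the executor heap itself
spark.executor.memory    2g
```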
var resultDetails = Seq(
  new HeuristicResultDetails("Allocated memory for the unified region", MemoryFormatUtils.bytesToString(evaluator.maxMemory)),
  new HeuristicResultDetails("Mean peak unified memory", MemoryFormatUtils.bytesToString(evaluator.meanUnifiedMemory))
)
Show the Spark Executor Memory as well.
  new HeuristicResultDetails("Mean peak unified memory", MemoryFormatUtils.bytesToString(evaluator.meanUnifiedMemory))
)

if (evaluator.severity.getValue > Severity.LOW.getValue) {
Remove this. Suggestions are given on the help page.
val executorList: Seq[ExecutorSummary] = executorSummaries.filterNot(_.id.equals("driver"))

//allocated memory for the unified region
val maxMemory: Long = executorList.head.maxMemory
What happens if executorList is empty?
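On an empty list, .head throws NoSuchElementException. One defensive option, sketched with headOption over bare maxMemory values (names illustrative, not the PR's code):

```scala
object MaxMemorySketch {
  // Fold over the empty case instead of calling .head, which throws
  // NoSuchElementException when no non-driver executors are present.
  def allocatedUnifiedMemory(maxMemories: Seq[Long]): Long =
    maxMemories.headOption.getOrElse(0L)
}
```

Whether 0 is the right fallback, or the heuristic should instead skip the application entirely, is a design decision for the PR author.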
lazy val executorSummaries: Seq[ExecutorSummary] = data.executorSummaries
val executorList: Seq[ExecutorSummary] = executorSummaries.filterNot(_.id.equals("driver"))

//allocated memory for the unified region
Are you sure maxMemory is the allocated memory? I couldn't find any documentation for it in Spark.
var resultDetails = Seq(
  new HeuristicResultDetails("Allocated memory for the unified region", MemoryFormatUtils.bytesToString(evaluator.maxMemory)),
  new HeuristicResultDetails("Mean peak unified memory", MemoryFormatUtils.bytesToString(evaluator.meanUnifiedMemory))
I think it would also make sense to show the max Peak unified memory among all executors.
Nice work @skakker. Thanks for addressing all the comments.
The amount of unified memory can be examined to see if spark.memory.fraction can be adjusted. If the ratio of unified memory to executor memory is much smaller than spark.memory.fraction, then more memory has been reserved for execution than is being used. This memory can instead be allocated for user memory, and/or total executor memory could be reduced.
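That comparison can be sketched as a ratio check. The 0.5 slack factor (how much smaller than spark.memory.fraction the observed ratio must be before flagging) and the function name are illustrative assumptions, not values from the PR:

```scala
object MemoryFractionSketch {
  // Flag when peak unified memory occupies much less of the executor heap
  // than spark.memory.fraction would allow (Spark's default fraction is 0.6).
  def unifiedRegionOversized(peakUnified: Long,
                             executorMemory: Long,
                             memoryFraction: Double = 0.6,
                             slack: Double = 0.5): Boolean =
    executorMemory > 0 &&
      peakUnified.toDouble / executorMemory < memoryFraction * slack
}
```

With the defaults, an executor whose peak unified memory is under 30% of its heap would be flagged as a candidate for lowering spark.memory.fraction or the executor memory.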