
Peak JVM used memory heuristic - (Depends on Custom SHS - Requires peakJvmUsedMemory metric) #283

Merged: 18 commits into linkedin:master, Jan 10, 2018

Conversation

skakker
Contributor

@skakker skakker commented Sep 6, 2017

To determine whether the executor memory is set to a good value, the average (or median) and the maximum peak JVM used memory across executors are examined. Information about each executor's peak JVM used memory will be available from the executors REST API, in maxPeakJvmUsedMemory.peakJvmUsedMemory. To avoid wasted memory, it is checked that the peak JVM used memory is reasonably close to the allocated executor memory, spark.executor.memory -- if it is much smaller, then executor memory can be reduced.
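The check described above can be sketched as follows. This is a minimal illustration, not the PR's actual code; the names `PeakJvmMemoryCheckSketch`, `BUFFER_PERCENT`, and the 2x over-provisioning threshold are assumptions made for the example.

```scala
// Hypothetical sketch of the executor-memory check described in the PR
// description. BUFFER_PERCENT and the 2x threshold are illustrative
// assumptions, not values taken from the actual heuristic.
object PeakJvmMemoryCheckSketch {
  val BUFFER_PERCENT: Double = 10.0

  // Suggested executor memory (bytes): the max peak JVM used memory
  // across executors, plus a safety buffer on top.
  def suggestedExecutorMemory(maxPeakJvmUsedMemory: Long): Long =
    ((1 + BUFFER_PERCENT / 100.0) * maxPeakJvmUsedMemory).toLong

  // The allocation is considered wasteful when spark.executor.memory is
  // much larger (here: more than double) than the buffered peak usage.
  def isOverProvisioned(allocatedBytes: Long, maxPeakJvmUsedMemory: Long): Boolean =
    allocatedBytes > 2 * suggestedExecutorMemory(maxPeakJvmUsedMemory)
}
```

For example, with a 4 GB peak and 16 GB allocated, the check would flag the executor memory as reducible.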

* A heuristic based on peak JVM used memory for the spark executors and drivers
*
*/


Extra newline.


if(evaluator.severityExecutor.getValue > Severity.LOW.getValue) {
new HeuristicResultDetails("Note", "The allocated memory for the executor (in " + SPARK_EXECUTOR_MEMORY +") is much more than the peak JVM used memory by executors.")
new HeuristicResultDetails("Reasonable size for executor memory", ((1+BUFFER_PERCENT/100)*evaluator.maxExecutorPeakJvmUsedMemory).toString)


Will this evaluate properly, with integer division? How about 100.0?
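The pitfall the reviewer is pointing at can be shown in a minimal example (the value 20 for `BUFFER_PERCENT` is illustrative): if `BUFFER_PERCENT` is an integer, `BUFFER_PERCENT / 100` truncates to zero and the buffer silently disappears.

```scala
// Illustration of the integer-division concern raised above.
val BUFFER_PERCENT: Int = 20

// Integer division: 20 / 100 == 0, so the multiplier collapses to 1.
val wrongFactor = 1 + BUFFER_PERCENT / 100

// Dividing by 100.0 forces floating-point division, keeping the buffer.
val rightFactor = 1 + BUFFER_PERCENT / 100.0
```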

)
result
}


Extra newline.

*@
<p>This is a heuristic for peak JVM used memory.</p>
<h4>Executor Max Peak JVM Used Memory</h4>
<p>This is to analyse whether the executor memory is set to a good value. To avoid wasted memory, its checked that the peak JVM used memory is reasonably close to the allocated executor memory, (spark.executor.memory) -- if it is much smaller, then executor memory should be reduced.</p>


For "its checked that the peak JVM used memory", how about "it checks if the peak JVM used memory"

new HeuristicResultDetails("Note", "The allocated memory for the driver (in " + SPARK_DRIVER_MEMORY + ") is much more than the peak JVM used memory by the driver.")
}

if(evaluator.severitySkew.getValue > Severity.LOW.getValue) {


Looking at some Spark applications, it looks like tasks can often be assigned to a fraction of the executors (based on node locality). It may not make sense to check for skew at the executor level. We can look for skew (data, peak execution memory and execution time) in tasks instead. We can also check how many executors have tasks assigned, and how imbalanced the assignment is -- we can recommend using fewer executors (and perhaps more cores), and/or locality. These would be separate heuristics, and we can discuss at the next meeting, or set up a separate meeting to discuss.

Thanks for making the changes. For now, can you disable the skew check? Everything else looks good.

@skakker
Contributor Author

skakker commented Sep 21, 2017 via email

@edwinalu

Thanks for making the change. Task-level skew would be done with the stage metrics, so you can remove the lines, but commented out is fine too.

@skakker
Contributor Author

skakker commented Sep 22, 2017

Sure! Will remove the lines. 👍

@@ -193,6 +193,12 @@
<classname>com.linkedin.drelephant.spark.heuristics.StagesHeuristic</classname>
<viewname>views.html.help.spark.helpStagesHeuristic</viewname>
</heuristic>
<heuristic>
<applicationtype>spark</applicationtype>
<heuristicname>Spark JVM Used Memory</heuristicname>

JVM Used Memory

_.peakJvmUsedMemory.getOrElse(JVM_USED_MEMORY, 0).asInstanceOf[Number].longValue
}.max

val DEFAULT_MAX_EXECUTOR_PEAK_JVM_USED_MEMORY_THRESHOLDS =

Configure thresholds in HeuristicConf

val executorList : Seq[ExecutorSummary] = executorSummaries.filterNot(_.id.equals("driver"))
val sparkExecutorMemory : Long = (appConfigurationProperties.get(SPARK_EXECUTOR_MEMORY).map(MemoryFormatUtils.stringToBytes)).getOrElse(0L)
val sparkDriverMemory : Long = appConfigurationProperties.get(SPARK_DRIVER_MEMORY).map(MemoryFormatUtils.stringToBytes).getOrElse(0L)
val medianPeakJvmUsedMemory: Long = if (executorList.isEmpty) 0L else executorList.map {

I don't see this being used anywhere? Mention it along with the Heuristic Details

val sparkDriverMemory : Long = appConfigurationProperties.get(SPARK_DRIVER_MEMORY).map(MemoryFormatUtils.stringToBytes).getOrElse(0L)
val medianPeakJvmUsedMemory: Long = if (executorList.isEmpty) 0L else executorList.map {
_.peakJvmUsedMemory.getOrElse(JVM_USED_MEMORY, 0L).asInstanceOf[Number].longValue
}.sortWith(_< _).drop(executorList.size/2).head

Median computation is not accurate. Take a look at this code. http://rosettacode.org/wiki/Averages/Median#Scala
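The inaccuracy flagged above: `sortWith(_ < _).drop(size / 2).head` always returns the upper-middle element, which is wrong for even-sized lists. A sort-based median that handles both cases could look like this (a sketch, not the project's code):

```scala
// Sketch of an accurate median over executor memory values.
// Odd count: the middle element. Even count: the mean of the two
// middle elements (integer division, since the values are bytes).
def median(values: Seq[Long]): Long = {
  require(values.nonEmpty, "median of empty sequence")
  val sorted = values.sorted
  val n = sorted.size
  if (n % 2 == 1) sorted(n / 2)
  else (sorted(n / 2 - 1) + sorted(n / 2)) / 2
}
```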

val JVM_USED_MEMORY = "jvmUsedMemory"
val SPARK_EXECUTOR_MEMORY = "spark.executor.memory"
val SPARK_DRIVER_MEMORY = "spark.driver.memory"
val reservedMemory : Long = 314572800

Add comment on the value.

Use 300 * FileUtils.ONE_MB or 300 * 1024 * 1024 for clarity
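Following the reviewer's suggestion, the magic number spells out as 300 MB; writing the arithmetic makes the intent self-documenting:

```scala
// Clearer form of the reserved-memory constant flagged above:
// 314572800 bytes == 300 * 1024 * 1024 == 300 MB reserved for the JVM.
val reservedMemory: Long = 300L * 1024 * 1024
```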

@akshayrai akshayrai changed the title Peak JVM used memory heuristic **DEPENDENCY : new REST API** Peak JVM used memory heuristic - (Depends on Custom SHS - Requires peakJvmUsedMemory metric) Jan 10, 2018
@akshayrai akshayrai merged commit 6b2f7e8 into linkedin:master Jan 10, 2018
akshayrai added a commit that referenced this pull request Jan 10, 2018
…uires peakJvmUsedMemory metric) (#283)"

This reverts commit 6b2f7e8.
akshayrai added a commit that referenced this pull request Jan 10, 2018
…uires peakJvmUsedMemory metric) (#283)" (#317)

This reverts commit 6b2f7e8.
pralabhkumar pushed a commit to pralabhkumar/dr-elephant that referenced this pull request Aug 31, 2018
pralabhkumar pushed a commit to pralabhkumar/dr-elephant that referenced this pull request Aug 31, 2018