[SPARK-22471][SQL] SQLListener consumes much memory causing OutOfMemoryError #19711
ok to test
+1, LGTM.
@@ -207,6 +210,14 @@ class SQLListener(conf: SparkConf) extends SparkListener with Logging {
     }
   }
 
+  override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit = {
synchronized
Test build #83656 has finished for PR 19711 at commit
@@ -113,7 +116,7 @@ class SQLListener(conf: SparkConf) extends SparkListener with Logging {
    */
   private val _jobIdToExecutionId = mutable.HashMap[Long, Long]()
 
-  private val _stageIdToStageMetrics = mutable.HashMap[Long, SQLStageMetrics]()
+  private val _stageIdToStageMetrics = mutable.LinkedHashMap[Long, SQLStageMetrics]()
Maybe we can use Java's `LinkedHashMap` and override `removeEldestEntry` to do what we want.
`removeEldestEntry` is a protected method.
Java's `LinkedHashMap` can be subclassed with a custom implementation of `removeEldestEntry`, which would save the code written below. It is not the user who calls `removeEldestEntry` ...
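The suggestion above can be sketched as follows. This is a minimal illustration, not code from the PR: `BoundedMap`, `maxEntries`, and `BoundedMapDemo` are hypothetical names. It relies on the documented behavior of `java.util.LinkedHashMap`, which calls `removeEldestEntry` itself after each insertion, so the method being protected does not prevent a subclass from overriding it.

```scala
// Hypothetical helper, not part of Spark: a map that keeps at most
// `maxEntries` entries and evicts the oldest inserted entry beyond that.
class BoundedMap[K, V](maxEntries: Int) extends java.util.LinkedHashMap[K, V] {
  // Invoked by LinkedHashMap after each put; returning true evicts `eldest`.
  override protected def removeEldestEntry(eldest: java.util.Map.Entry[K, V]): Boolean =
    size() > maxEntries
}

object BoundedMapDemo {
  def main(args: Array[String]): Unit = {
    val stages = new BoundedMap[Long, String](maxEntries = 3)
    // Insert five entries; only the three most recently inserted survive.
    (1L to 5L).foreach(id => stages.put(id, s"metrics-$id"))
    println(stages.keySet()) // keys 3, 4 and 5 remain
  }
}
```

With this approach the eviction policy lives in one place, at the cost of exposing a Java collection rather than a Scala `mutable.Map` to the rest of the listener.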
Test build #83680 has finished for PR 19711 at commit
Retest this please.
Test build #83682 has finished for PR 19711 at commit
Corrupted build node?
Retest this please.
Retest this please.
Test build #83805 has finished for PR 19711 at commit
LGTM, merging to 2.2.
…ryError

## What changes were proposed in this pull request?

This PR addresses the issue [SPARK-22471](https://issues.apache.org/jira/browse/SPARK-22471). The modified version of `SQLListener` respects the setting `spark.ui.retainedStages` and keeps the number of the tracked stages within the specified limit. The hash map `_stageIdToStageMetrics` does not outgrow the limit, hence overall memory consumption does not grow with time anymore.

A 2.2-compatible fix. Maybe incompatible with 2.3 due to #19681.

## How was this patch tested?

A new unit test covers this fix - see `SQLListenerMemorySuite.scala`.

Author: Arseniy Tashoyan <tashoyan@gmail.com>

Closes #19711 from tashoyan/SPARK-22471-branch-2.2.
@tashoyan please close this since GitHub doesn't do it automatically for branches.
## What changes were proposed in this pull request?

This PR addresses the issue SPARK-22471. The modified version of `SQLListener` respects the setting `spark.ui.retainedStages` and keeps the number of tracked stages within the specified limit. The hash map `_stageIdToStageMetrics` does not outgrow the limit, hence overall memory consumption no longer grows with time.

A 2.2-compatible fix. Maybe incompatible with 2.3 due to #19681.

## How was this patch tested?

A new unit test covers this fix - see `SQLListenerMemorySuite.scala`.
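As a rough illustration of the bounded-map idea in the description, the sketch below trims an insertion-ordered `mutable.LinkedHashMap` once it exceeds a configured limit. It is an assumption-laden sketch, not the code merged in this PR: `StageTracker`, `StageMetrics`, `record`, `trackedStageIds`, and `trimIfNecessary` are hypothetical names, and the constructor argument stands in for the value read from `spark.ui.retainedStages`.

```scala
import scala.collection.mutable

// Hypothetical stand-in for per-stage state; not Spark's SQLStageMetrics.
final case class StageMetrics(stageAttemptId: Int)

// Hypothetical tracker illustrating the idea only.
class StageTracker(retainedStages: Int) {
  // LinkedHashMap iterates in insertion order, so the first keys are the oldest stages.
  private val stageIdToMetrics = mutable.LinkedHashMap[Long, StageMetrics]()

  // Guarded by the tracker's monitor, since listener callbacks and readers may race.
  def record(stageId: Long, attemptId: Int): Unit = synchronized {
    stageIdToMetrics(stageId) = StageMetrics(attemptId)
    trimIfNecessary()
  }

  def trackedStageIds: Seq[Long] = synchronized { stageIdToMetrics.keys.toList }

  private def trimIfNecessary(): Unit = {
    val excess = stageIdToMetrics.size - retainedStages
    if (excess > 0) {
      // Copy the oldest keys first so the map is not mutated while being iterated.
      val toRemove = stageIdToMetrics.keys.take(excess).toList
      toRemove.foreach(id => stageIdToMetrics.remove(id))
    }
  }
}

object StageTrackerDemo {
  def main(args: Array[String]): Unit = {
    val tracker = new StageTracker(retainedStages = 2)
    (1L to 4L).foreach(id => tracker.record(id, attemptId = 0))
    println(tracker.trackedStageIds) // List(3, 4)
  }
}
```

Because the map never holds more than `retainedStages` entries after a call to `record`, its memory footprint stays bounded regardless of how many stages the application runs.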