[SPARK-7100][MLLib] Fix persisted RDD leak in GradientBoostTrees #5669

jimfcarroll · 2015-04-23T19:21:09Z

This fixes a leak of a persisted RDD: GradientBoostTrees can call persist but never calls unpersist.

Jira: https://issues.apache.org/jira/browse/SPARK-7100

Discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/GradientBoostTrees-leaks-a-persisted-RDD-td11750.html

srowen · 2015-04-23T19:33:04Z

mllib/src/main/scala/org/apache/spark/mllib/tree/GradientBoostedTrees.scala


-    timer.stop("init")
+    try {


@jkbradley it's an interesting question -- it makes sense to ensure the RDD is unpersisted with try-finally but I don't know if any other code does it. I think the assumption has been that the app will soon die anyway if something unexpected is going wrong like an exception, so RDD cleanup isn't that important. Should we stick to that imperfect reasoning here and keep the code simpler?

jimfcarroll · 2015-04-23

I override and intercept most RDD calls in my own code to optimize them (which is how I found this in the first place), and we use try-with-resources to persist and unpersist. It's too bad you can't do that simply in Scala.

If you want the simpler solution let me know. I can change it.
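For readers following the try-with-resources comparison: Scala has no try-with-resources statement, but a similar effect is commonly obtained with a small "loan pattern" helper built on try-finally. The sketch below is only an illustration of that idea against Spark's public `persist`, `unpersist` and `getStorageLevel` calls. The names `PersistScope` and `withPersisted` are hypothetical; they are not part of Spark and not part of this PR.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Hypothetical helper, not Spark API: lends a persisted RDD to `body`
// and releases it afterwards, even if `body` throws.
object PersistScope {
  def withPersisted[T, R](
      rdd: RDD[T],
      level: StorageLevel = StorageLevel.MEMORY_AND_DISK)(body: RDD[T] => R): R = {
    // Only take responsibility for cleanup if the caller has not
    // already persisted the RDD with a storage level of their own.
    val persistedHere = rdd.getStorageLevel == StorageLevel.NONE
    if (persistedHere) rdd.persist(level)
    try {
      body(rdd)
    } finally {
      // Runs on both normal return and exception.
      if (persistedHere) rdd.unpersist()
    }
  }
}
```

A call site would then read something like `PersistScope.withPersisted(input) { data => data.count() }`. The helper leaves an RDD that was already persisted by the caller untouched, so it does not undo a storage level it did not set.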

jkbradley · 2015-04-23

I too think it makes some sense but have not seen it done elsewhere. I think I asked about it when I first started working on Spark and was told something like what @srowen said. Even if it's "leaked," other RDDs can still push it out of memory or disk if needed, right? Have you encountered cases where that does not happen?

srowen · 2015-04-25T22:04:07Z

ok to test

SparkQA · 2015-04-25T23:40:16Z

Test build #704 has finished for PR 5669 at commit e5be57c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.
This patch does not change any dependencies.

jkbradley · 2015-04-27T20:50:59Z

After discussing with @mengxr I think we should not bother with the try-finally wrapper. As mentioned above, the method should generally not fail, so the data will be unpersisted as needed. When an exception is thrown, then the data will be unpersisted whenever another RDD pushes "input" out of memory/disk, without undue harm to other jobs.

@jimfcarroll Could you please update the PR to remove the try-finally wrapper, but keep the unpersist at the end?

Thanks for going through this discussion!
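The shape being requested here (persist if needed, unpersist at the end, no try-finally) might look roughly like the sketch below. It is an illustration under assumptions, not the diff in this PR: `UnpersistAtEnd`, `countPasses` and the placeholder loop are hypothetical stand-ins for the real training code, and only `persist`, `unpersist` and `getStorageLevel` are actual Spark calls.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Hypothetical stand-in for an iterative training method; not the MLlib code.
object UnpersistAtEnd {
  def countPasses[T](input: RDD[T], numIterations: Int): Long = {
    // Persist only if the caller has not already chosen a storage level,
    // and remember whether this method is responsible for cleanup.
    val persistedInput = if (input.getStorageLevel == StorageLevel.NONE) {
      input.persist(StorageLevel.MEMORY_AND_DISK)
      true
    } else {
      false
    }

    // Placeholder for the boosting loop: every pass re-reads `input`,
    // which is why persisting it is worthwhile in the first place.
    var total = 0L
    for (_ <- 0 until numIterations) {
      total += input.count()
    }

    // No try-finally: if the loop above throws, `input` stays persisted.
    // That is the trade-off accepted in this thread.
    if (persistedInput) input.unpersist()
    total
  }
}
```

Compared with a try-finally wrapper, this keeps the method body flat at the cost of skipping cleanup on the exception path.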

jimfcarroll · 2015-04-27T20:53:29Z

Your project. I'll downgrade it if you want. :-)

jkbradley · 2015-04-27T22:47:57Z

OK thank you. Sometimes, a slight improvement can be outweighed by the extra complexity.

jimfcarroll · 2015-04-27T22:53:02Z

Okay. I force pushed a different commit. I removed the try-finally.

Hope this works. Thanks.

SparkQA · 2015-04-27T23:33:22Z

Test build #723 has started for PR 5669 at commit 45f4b03.

srowen · 2015-04-28T11:50:44Z

This test actually passed, but Jenkins failed to post. LGTM.

Test build #723 has finished for PR 5669 at commit 45f4b03.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.
This patch does not change any dependencies.

This fixes a leak of a persisted RDD where GradientBoostTrees can call persist but never unpersists.

Jira: https://issues.apache.org/jira/browse/SPARK-7100

Discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/GradientBoostTrees-leaks-a-persisted-RDD-td11750.html

Author: Jim Carroll <jim@dontcallme.com>

Closes apache#5669 from jimfcarroll/gb-unpersist-fix and squashes the following commits:

45f4b03 [Jim Carroll] [SPARK-7100][MLLib] Fix persisted RDD leak in GradientBoostTrees


asfgit closed this in 75905c5 Apr 28, 2015
