Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-7100][MLLib] Fix persisted RDD leak in GradientBoostTrees #5669

Closed
wants to merge 1 commit into from
Closed

[SPARK-7100][MLLib] Fix persisted RDD leak in GradientBoostTrees #5669

wants to merge 1 commit into from

Conversation

jimfcarroll
Copy link

This fixes a leak of a persisted RDD where GradientBoostTrees can call persist but never unpersists.

Jira: https://issues.apache.org/jira/browse/SPARK-7100

Discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/GradientBoostTrees-leaks-a-persisted-RDD-td11750.html


timer.stop("init")
try {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkbradley it's an interesting question -- it makes sense to ensure the RDD is unpersisted with try-finally but I don't know if any other code does it. I think the assumption has been that the app will soon die anyway if something unexpected is going wrong like an exception, so RDD cleanup isn't that important. Should we stick to that imperfect reasoning here and keep the code simpler?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I override and intercept most RDD calls in my own code to optimize them (which is how I found this in the first place) and we use try-with-resource to persist and unpersist. It's too bad you cant do that simply in scala.

If you want the simpler solution let me know. I can change it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too think it makes some sense but have not seen it done elsewhere. I think I asked about it when I first started working on Spark and was told something like @srowen said. Even if it's "leaked," other RDDs can still push it out of memory or disk if needed, right? Have you encountered cases where that does not happen?

@srowen
Copy link
Member

srowen commented Apr 25, 2015

ok to test

@SparkQA
Copy link

SparkQA commented Apr 25, 2015

Test build #704 has finished for PR 5669 at commit e5be57c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@jkbradley
Copy link
Member

After discussing with @mengxr I think we should not bother with the try-finally wrapper. As mentioned above, the method should generally not fail, so the data will be unpersisted as needed. When an exception is thrown, then the data will be unpersisted whenever another RDD pushes "input" out of memory/disk, without undue harm to other jobs.

@jimfcarroll Could you please update the PR to remove the try-finally wrapper, but keep the unpersist at the end?

Thanks for going through this discussion!

@jimfcarroll
Copy link
Author

Your project. I'll downgrade it if you want. :-)

@jkbradley
Copy link
Member

OK thank you. Sometimes, a slight improvement can be outweighed by the extra complexity.

@jimfcarroll
Copy link
Author

Okay. I force pushed a different commit. I removed the try-finally.

Hope this works. Thanks.

@SparkQA
Copy link

SparkQA commented Apr 27, 2015

Test build #723 has started for PR 5669 at commit 45f4b03.

@srowen
Copy link
Member

srowen commented Apr 28, 2015

This test actually passed, but Jenkins failed to post. LGTM.

data: {"body": " Test build #723 has finished for PR 5669 at commit 45f4b03.\n * This patch passes all tests.\n * This patch merges cleanly.\n * This patch adds no public classes.\nYour branch is ahead of 'origin/master' by 295 commits.

  • This patch does not change any dependencies."}

@asfgit asfgit closed this in 75905c5 Apr 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 14, 2015
This fixes a leak of a persisted RDD where GradientBoostTrees can call persist but never unpersists.

Jira: https://issues.apache.org/jira/browse/SPARK-7100

Discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/GradientBoostTrees-leaks-a-persisted-RDD-td11750.html

Author: Jim Carroll <jim@dontcallme.com>

Closes apache#5669 from jimfcarroll/gb-unpersist-fix and squashes the following commits:

45f4b03 [Jim Carroll] [SPARK-7100][MLLib] Fix persisted RDD leak in GradientBoostTrees
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
This fixes a leak of a persisted RDD where GradientBoostTrees can call persist but never unpersists.

Jira: https://issues.apache.org/jira/browse/SPARK-7100

Discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/GradientBoostTrees-leaks-a-persisted-RDD-td11750.html

Author: Jim Carroll <jim@dontcallme.com>

Closes apache#5669 from jimfcarroll/gb-unpersist-fix and squashes the following commits:

45f4b03 [Jim Carroll] [SPARK-7100][MLLib] Fix persisted RDD leak in GradientBoostTrees
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants