-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-54753][SQL][4.1] Fix memory leak of ArtifactManager #53654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
JIRA Issue Information=== Bug SPARK-54753 === This comment was automatically generated by GitHub Actions |
|
@scottme PR title should be @HyukjinKwon, this looks like a critical issue, will you consider this as a blocker for 4.1.1? |
|
@hvanhovell @vicennial should this be a release blocker? |
updated as suggested , thank you @pan3793 |
|
Alright let me -1 for 4.1.1 rc1 and start RC2 with this fix tmr. It's my phone so I can't merge now 🫠 |
|
Merged to branch-4.1. |
### What changes were proposed in this pull request? As stated in https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/ref/Cleaner.html **The cleaning action could be a lambda but all too easily will capture the object reference, by referring to fields of the object being cleaned, preventing the object from becoming phantom reachable. Using a static nested class, as above, will avoid accidentally retaining the object reference.** For more details, and the test and analysis are in https://issues.apache.org/jira/browse/SPARK-54753 <img width="1462" height="559" alt="image" src="https://github.com/user-attachments/assets/83de9e8e-8f63-41fe-8318-b1cea6a1de9c" /> After running with Spark 4.0.1, the ArtififactManager is leaked, its referenced SessionState/SparkSession is as well leaked. ### Why are the changes needed? use a separate class to ref the cleanup state ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? with test program in https://issues.apache.org/jira/browse/SPARK-54753, and use Visual VM to monitor the memory usage ### Was this patch authored or co-authored using generative AI tooling? No **This PR backports the fix in #53591 to branch 4.1** cc dongjoon-hyun pranavdev022 hvanhovell vicennial HyukjinKwon Closes #53654 from scottme/SPARK-54753-4.1. Authored-by: xihuan_mstr <xihuan@strategy.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
What changes were proposed in this pull request?
As stated in https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/ref/Cleaner.html
The cleaning action could be a lambda but all too easily will capture the object reference, by referring to fields of the object being cleaned, preventing the object from becoming phantom reachable. Using a static nested class, as above, will avoid accidentally retaining the object reference.
For more details, and the test and analysis are in https://issues.apache.org/jira/browse/SPARK-54753
After running with Spark 4.0.1, the ArtififactManager is leaked, its referenced SessionState/SparkSession is as well leaked.
Why are the changes needed?
use a separate class to ref the cleanup state
Does this PR introduce any user-facing change?
No
How was this patch tested?
with test program in https://issues.apache.org/jira/browse/SPARK-54753, and use Visual VM to monitor the memory usage
Was this patch authored or co-authored using generative AI tooling?
No
This PR backports the fix in #53591 to branch 4.1
cc @dongjoon-hyun @pranavdev022 @hvanhovell @vicennial @HyukjinKwon