-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-55093][CORE] Handle TaskRunner construction failures in launchTask #53865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…on failure - Move createTaskRunner into try-catch block to handle construction failures - Add cleanup to remove TaskRunner from runningTasks if threadPool.execute throws - Prevent potential memory leak by cleaning up TaskRunner when threadPool.execute fails - Update test to use current TaskDescription API
- Use reflection to mock runningTasks.put to throw exception - Tests cleanup logic when TaskRunner is created but fails to be added to runningTasks - Verify exception is properly caught and reported to driver
JIRA Issue Information=== Bug SPARK-55093 === This comment was automatically generated by GitHub Actions |
|
@cloud-fan @Ngone51 What do you think about this PR ? |
Ngone51
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ng mocked Executor fields The test was failing because mocked Executor's val fields (runningTasks, threadPool, conf, env, killMarks) were not initialized when using mock[Executor](CALLS_REAL_METHODS). This caused NullPointerExceptions when the real launchTask method tried to access them. Solution: Use reflection to manually set these val fields on the mocked Executor object after creation, allowing the real launchTask method to execute properly and add tasks to the runningTasks map. Test now passes consistently in ~1.4 seconds.
|
@Ngone51 Could you please take another look here ? I fix the test case "track allocated resources by taskId" in a new commit. The test case didn't mock the killMarks and ran into a null Exception. It passed before this PR because we didn't handle the exception and run the cleanup job. |
What changes were proposed in this pull request?
Prevent potential memory leak by cleaning up TaskRunner when threadPool.execute fails
Why are the changes needed?
The createTaskRunner may throw an Exception.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added an unit test.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: claude-4.5