[SPARK-17742][core] Handle child process exit in SparkLauncher. #18877
Currently the launcher handle does not monitor the child spark-submit process it launches. If the child exits with an error, the handle's state never changes, and the launching application never learns that the Spark application has failed. This change adds code to monitor the child process and updates the handle state appropriately when the child process exits. Tested with added unit tests.
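For readers who want a concrete picture of the idea, here is a minimal sketch of monitoring a child process and moving a handle to a final state when it exits. It is not the code in this PR: the `MonitoredChild` class, its reduced `State` enum, and the `launch`/`awaitFinalState` helpers are all hypothetical names made up for illustration, and the rule "non-zero exit code means FAILED unless the handle was already killed" is an assumption about the intended behavior.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified stand-in for a launcher handle; not Spark's actual class.
final class MonitoredChild {
  // Reduced set of states, for illustration only.
  enum State { RUNNING, FINISHED, FAILED, KILLED }

  private final Process child;
  private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
  private final Thread monitor;

  MonitoredChild(Process child) {
    this.child = child;
    // A daemon thread watches the child so the handle learns about its exit.
    this.monitor = new Thread(this::monitorChild, "child-monitor");
    this.monitor.setDaemon(true);
    this.monitor.start();
  }

  // Convenience factory that starts the command and wraps the checked exception.
  static MonitoredChild launch(String... command) {
    try {
      return new MonitoredChild(new ProcessBuilder(command).start());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  State getState() {
    return state.get();
  }

  // Record KILLED before destroying the child, so the monitor does not
  // later overwrite it with FAILED based on the exit code.
  void kill() {
    if (state.compareAndSet(State.RUNNING, State.KILLED)) {
      child.destroy();
    }
  }

  // Blocks until the monitor thread has finished, then returns the state.
  State awaitFinalState() {
    try {
      monitor.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return state.get();
  }

  private void monitorChild() {
    while (child.isAlive()) {
      try {
        child.waitFor();
      } catch (InterruptedException e) {
        // Log and try again.
        System.err.println("Interrupted while waiting for child: " + e);
      }
    }

    int ec;
    try {
      ec = child.exitValue();
    } catch (IllegalThreadStateException e) {
      System.err.println("Could not read child exit code: " + e);
      ec = 1; // Assumption: treat an unreadable exit code as a failure.
    }

    // Only change the state if no final state was set in the meantime.
    state.compareAndSet(State.RUNNING, ec == 0 ? State.FINISHED : State.FAILED);
  }
}
```

A caller would create the handle with something like `MonitoredChild.launch("spark-submit", ...)` and poll `getState()`. The compare-and-set in both `kill()` and the monitor keeps whichever final state was recorded first.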
Test build #80368 has finished for PR 18877 at commit
int ec;
try {
  ec = childProc.exitValue();
} catch (Exception e) {
might want to log the exception here
try {
  childProc.waitFor();
} catch (Exception e) {
  // Try again.
log exception here?
Log exceptions, and add the missing transition to the KILLED state.
Test build #80511 has finished for PR 18877 at commit
If there's no more feedback, I plan to push this soon to unblock other work on this module.
Alright, merging this to master.
yes @danelkotev |