Die with dignity on the network layer #21720
When a fatal error is thrown on the network layer, such an error never makes its way to the uncaught exception handler. This prevents the node from being torn down if an out of memory error or other fatal error is thrown while handling HTTP or transport traffic. This commit adds logic to ensure that such errors bubble their way up to the uncaught exception handler, even though Netty tries really hard to swallow everything.
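To illustrate the idea, here is a minimal sketch, not the code in this PR. It assumes a framework that catches every `Throwable` (modelled by the hypothetical `runSwallowingErrors`) and a hypothetical helper `maybeDie` that rethrows an `Error` on a fresh thread, where nothing can catch it before the uncaught exception handler sees it. The class name and `demo` method are illustrative only.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class FatalErrorRethrowSketch {

    // Hypothetical helper: if cause is an Error, rethrow it on a fresh thread so
    // that no surrounding catch block can swallow it. Returns true if it did so.
    static boolean maybeDie(final Throwable cause) {
        if (!(cause instanceof Error)) {
            return false;
        }
        // Name the thread after the caller's frame so we know where the error was seen.
        final StackTraceElement previous = Thread.currentThread().getStackTrace()[2];
        new Thread(() -> { throw (Error) cause; },
                previous.getClassName() + "#" + previous.getMethodName()).start();
        return true;
    }

    // Stand-in (an assumption) for a framework that catches everything a handler throws.
    static void runSwallowingErrors(final Runnable handler) {
        try {
            handler.run();
        } catch (final Throwable t) {
            maybeDie(t); // without this call the error would vanish here
        }
    }

    // Throws the given error inside the swallowing framework and returns whatever
    // reached the default uncaught exception handler (null if nothing did in time).
    static Throwable demo(final Error fatal) {
        final AtomicReference<Throwable> seen = new AtomicReference<>();
        final CountDownLatch latch = new CountDownLatch(1);
        final Thread.UncaughtExceptionHandler old = Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler((thread, e) -> {
            seen.set(e);
            latch.countDown();
        });
        try {
            runSwallowingErrors(() -> { throw fatal; });
            latch.await(5, TimeUnit.SECONDS);
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(old);
        }
        return seen.get();
    }

    public static void main(final String[] args) {
        final Throwable seen = demo(new OutOfMemoryError("simulated"));
        System.out.println("uncaught exception handler saw: " + seen);
    }
}
```

In a real node the uncaught exception handler would be the one that tears the process down; the demo installs a recording handler only so the effect can be observed.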
LGTM
To test this, start an instance of Elasticsearch that does not have this patch applied with a 256m heap and
Elasticsearch will not die. Now apply this patch and test again. Elasticsearch will die with:
left one comment. Other than that LGTM
 * frame so that at least we know where it came from.
 */
final StackTraceElement previous = Thread.currentThread().getStackTrace()[2];
new Thread(() -> { throw (Error)cause; }, previous.getClassName() + "#" + previous.getMethodName()).start();
maybe we can try to log the stacktrace in a try-finally block in hopes that we can get the full stacktrace? In the finally we can throw the Error
I pushed bf47916.
retest this please
When preparing to rethrow a fatal error, this commit adds an attempt to log the current stack trace so that we know which handler saw the fatal error.
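A sketch of what logging before rethrowing might look like follows. It is not the contents of the commit: the `Consumer<String>` log sink, the class name, and the `demo` method are assumptions made so the example is self-contained. The rethrow sits in a `finally` block, so the error still escapes even if the logging attempt itself fails.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

final class LogThenDieSketch {

    // Hypothetical variant: try to log the current stack trace, then rethrow the
    // Error on a fresh thread from a finally block so a logging failure cannot
    // prevent the rethrow.
    static void maybeDie(final Throwable cause, final Consumer<String> log) {
        if (!(cause instanceof Error)) {
            return;
        }
        try {
            final StringWriter trace = new StringWriter();
            final PrintWriter writer = new PrintWriter(trace);
            // Capture the stack of the handler that saw the error.
            new Throwable("fatal error seen here").printStackTrace(writer);
            writer.flush();
            log.accept(trace.toString());
        } finally {
            new Thread(() -> { throw (Error) cause; }).start();
        }
    }

    // Calls maybeDie and returns whatever reached the default uncaught exception
    // handler (null if nothing did in time).
    static Throwable demo(final Error fatal, final Consumer<String> log) {
        final AtomicReference<Throwable> seen = new AtomicReference<>();
        final CountDownLatch latch = new CountDownLatch(1);
        final Thread.UncaughtExceptionHandler old = Thread.getDefaultUncaughtExceptionHandler();
        Thread.setDefaultUncaughtExceptionHandler((thread, e) -> {
            seen.set(e);
            latch.countDown();
        });
        try {
            try {
                maybeDie(fatal, log);
            } catch (final RuntimeException e) {
                // Logging failed; the rethrowing thread was started regardless.
            }
            latch.await(5, TimeUnit.SECONDS);
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            Thread.setDefaultUncaughtExceptionHandler(old);
        }
        return seen.get();
    }

    public static void main(final String[] args) {
        final Throwable seen = demo(new OutOfMemoryError("simulated"), System.err::print);
        System.out.println("uncaught exception handler saw: " + seen);
    }
}
```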
cb64c39 to bf47916
When a fatal error is thrown on the network layer, such an error never makes its way to the uncaught exception handler. This prevents the node from being torn down if an out of memory error or other fatal error is thrown while handling HTTP or transport traffic. This commit adds logic to ensure that such errors bubble their way up to the uncaught exception handler, even though Netty tries really hard to swallow everything. Relates #21720
When a fatal error is thrown on the network layer, such an error never
makes its way to the uncaught exception handler. This prevents the node
from being torn down if an out of memory error or other fatal error is
thrown while handling HTTP or transport traffic. This commit adds logic
to ensure that such errors bubble their way up to the uncaught exception
handler, even though Netty tries really hard to swallow everything.
Relates #19272