Fix Lambda Extension logs shutdown #8063
@DarcyRaynerDD I have refactored this and incorporated your suggestions. I updated the initial PR comment to reflect the new approach. Let me know your thoughts.
Just one comment on the usage of the lock in the shutdown, which I think needs one more commit, but apart from that, solid!
💯
What does this PR do?
There is a race condition that causes panics and prevents us from receiving logs when shutting down the Lambda Extension. When the extension receives a shutdown event from the API, it stops the logs agent. However, the extension may continue to receive log messages from the logs API after receiving the shutdown event. Currently, when these messages are received, we try to send them to the logs agent. Since the logs agent has been stopped, this results in a panic.
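The race described above can be illustrated with a minimal sketch. It is not the PR's code: the `Daemon` type here is a hypothetical stand-in, the logs agent is modeled as a plain channel, and the assumption is that a stopped agent behaves like a closed channel (sending on a closed channel panics in Go). The sketch shows one common way to guard against this, a mutex-protected `stopped` flag, so that late messages are dropped instead of being forwarded to a stopped agent.

```go
package main

import (
	"fmt"
	"sync"
)

// Daemon is a hypothetical stand-in for the extension daemon. The field
// names are illustrative and are not taken from the PR.
type Daemon struct {
	mu      sync.Mutex
	stopped bool
	logs    chan string // stands in for the logs agent's input
}

// NewDaemon creates a daemon whose log channel holds up to buffer messages.
func NewDaemon(buffer int) *Daemon {
	return &Daemon{logs: make(chan string, buffer)}
}

// HandleLog forwards a message unless the daemon has been stopped.
// It reports whether the message was accepted.
func (d *Daemon) HandleLog(msg string) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.stopped {
		// Without this check, the send below would panic once the
		// channel has been closed by Stop.
		return false
	}
	select {
	case d.logs <- msg:
		return true
	default:
		return false // buffer full; drop rather than block while holding the lock
	}
}

// Stop marks the daemon as stopped and closes the channel. It is safe to
// call more than once, for example from both a shutdown event and a signal
// handler.
func (d *Daemon) Stop() {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.stopped {
		return
	}
	d.stopped = true
	close(d.logs)
}

func main() {
	d := NewDaemon(4)
	fmt.Println(d.HandleLog("before shutdown")) // accepted
	d.Stop()
	d.Stop() // second call is a no-op
	fmt.Println(d.HandleLog("after shutdown")) // dropped, no panic
}
```

Holding the same lock in both `HandleLog` and `Stop` is what closes the window between checking the flag and sending; checking the flag without the lock would leave the race in place.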
To fix this issue:
[...] Daemon.Stop() function.
[...] Daemon.Stop() after receiving a SIGINT signal. This shouldn't really happen in practice because during a shutdown, we receive a shutdown message over the Lambda API, and then SIGKILL two seconds later.
Motivation
The current behavior causes panics and lost log messages.
Describe how to test your changes
I have been manually testing these changes with functions that run periodically in the sandbox account and frequently encounter the race condition. I used debug log messages to confirm that the behavior is as expected.