Fix: Caught second signal incorrectly logging SIGTERM (15) for SIGPIPE #13067
… (13)

### Introduction

This PR fixes the agent's "Caught second signal" log line, which reports SIGTERM even when the second signal is actually SIGPIPE.

### Problem Statement

Here is the background of the issue. If graceful shutdown is enabled, the agent starts shutting down gracefully when it receives SIGTERM:

```
{"@level":"info","@message":"Caught","@module":"agent","@timestamp":"2022-05-12T00:04:16.712683Z","signal":15}
```

While shutting down, it can receive a SIGPIPE caused by a broken pipe:

```
{"@level":"error","@message":"failed to flush response","@module":"agent.server.raft","@timestamp":"2022-05-12T00:04:21.942772Z","error":"write tcp 192.168.55.0:8300-\u003e192.168.55.2:40833: write: broken pipe"}
```

It then prints the following:

```
{"@level":"info","@message":"Caught second signal, Exiting","@module":"agent","@timestamp":"2022-05-12T00:04:21.943055Z","signal":15}
```

The signal is shown as 15 (SIGTERM), which is incorrect and leads to confusion: the log does not print the signal that was actually received.

### Proposed Solution

This is the current code that handles the second signal:

```
case <-signalCh:
	c.logger.Info("Caught second signal, Exiting", "signal", sig)
	return 1
```

Here the `case` receives the second signal, which is actually the broken pipe, but discards the received value. The logger prints the `sig` variable instead, and `sig` was set at line number 287 when the first signal arrived. Its value is still 15 from the SIGTERM, so the log prints 15 regardless of which signal came second.

To fix this, we can capture the value received from `signalCh`, store it in a new variable (`sig2` in the snippet below), and log that variable instead:

```
// Capture the newly received signal in its own variable and print it in the logs.
case s := <-signalCh:
	var sig2 os.Signal // Declare a new signal variable.
	sig2 = s           // Assign the signal that was just caught.
	c.logger.Info("Caught second signal, Exiting", "signal", sig2) // Print the caught signal.
	return 1
```

Thanks.
Hey @shweshi, thanks for getting to the root of that issue and for the associated PR. Would you be open to adding a test case for this in
@Amier3 Hey, can you share a reference on how to go about writing a test case for this use case? I checked the existing test cases and couldn't find one for agent graceful shutdown or for the code related to OS signal processing. As a general idea, what I have thought of is to:
A rough code example:
This pull request has been automatically flagged for inactivity because it has not been acted upon in the last 60 days. It will be closed if no new activity occurs in the next 30 days. Please feel free to re-open to resurrect the change if you feel this has happened by mistake. Thank you for your contributions.
Closing due to inactivity. If you feel this was a mistake or you wish to re-open at any time in the future, please leave a comment and it will be re-surfaced for the maintainers to review.
Hi, can we resurrect this? It seems we do have an issue here.
@thefallentree It looks like the PR is closed now. The issue is still there.