Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't catch SIGQUIT; safer close of Quit channel #6477

Merged
merged 1 commit into from
Apr 27, 2016
Merged

Conversation

e-dard
Copy link
Contributor

@e-dard e-dard commented Apr 27, 2016

Required for all non-trivial PRs
  • Rebased/mergable
  • Tests pass
  • CHANGELOG.md updated

A SIGQUIT signal should indicate to a running process that it should exit immediately, and also dump its core. This is the default behaviour for a Go program, unless the program catches the signal using signal.Notify.

We were catching this signal in influx and then attempting to exit gracefully, which is the incorrect behaviour, especially if influx is stuck somewhere, or in a long running query.

Also, we were catching other signals that didn't make sense, e.g., SIGHUP—usually used to indicate a live configuration change, os.Kill—we're not even allowed to catch this.

This PR reduces the signals we catch down to syscall.SIGINT and syscall.SIGTERM. It also ensure that the Quit channel is closed properly.

Helps to fix #6474

@mention-bot
Copy link

By analyzing the blame information on this pull request, we identified @pires, @jsternberg and @sczk to be potential reviewers

@adamsvoboda
Copy link
Contributor

Nice catch! For what it's worth, this looks good to me, although you have more authority than I do! :)

@@ -182,7 +182,13 @@ func (c *CommandLine) Run() error {
for {
select {
case <-c.osSignals:
close(c.Quit)
// Attempt to gracefully exit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't needed since this section is single threaded. The close should always succeed. If the channel is already closed, then the Quit channel will always be the one chosen by select.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsternberg I'm not sure that's technically the case. If two signals are received in quick succession the scheduler could decide to chose the case <-c.osSignals case twice in succession.

I don't think there is anything that guarantees a closed-channel-receive will always be chosen if there are competing cases available to select over is there? I could be wrong though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought there was since I had done this previously, but this example seems to indicate there isn't.

package main

import "fmt"

func main() {
    for i := 0; i < 1000; i++ {
        ch := make(chan bool, 1)
        ch <- true
        closing := make(chan struct{})
        close(closing)
        select {
        case <-closing:
        case <-ch:
            fmt.Println("bad value", i)
            return
        }
    }
}

It prints bad value fairly quickly. So never mind.

It may just be more clear to duplicate the code from the c.Quit case rather than do some semi-convoluted for loop with a select statement. Whichever change is fine though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh. I can completely remove the need for these selects

@pires
Copy link
Contributor

pires commented Apr 27, 2016

The rationale behind the PR LGTM.

@jwilder jwilder added this to the 0.13.0 milestone Apr 27, 2016
@jwilder
Copy link
Contributor

jwilder commented Apr 27, 2016

👍

@e-dard e-dard merged commit 1d9919a into master Apr 27, 2016
@e-dard e-dard deleted the er-influx-signals branch April 27, 2016 15:48
timhallinflux added a commit that referenced this pull request Dec 20, 2016
v0.13.0 -- removed duplicate entry for  - [#6477](#6477): Don't catch SIGQUIT or SIGHUP signals.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't get rid of a stalled connection or long run query in CLI without kill -9
6 participants