Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update README to warn about keep-alive and graceful shutdown #13

Open
coopernurse opened this issue Apr 21, 2014 · 13 comments
Open

Update README to warn about keep-alive and graceful shutdown #13

coopernurse opened this issue Apr 21, 2014 · 13 comments

Comments

@coopernurse
Copy link

After #10 was applied (which I agree with) I'm seeing behavior that confused me for a bit, and may confuse others.

manners relies on the socket open/close events to increment/decrement the WaitGroup counter. That makes sense.

If you use a reverse proxy, and that proxy uses keep-alive to reduce latency to the upstream, then this will cause manners to block until those keep-alive sockets are closed.

I noticed this when I was attempting a graceful shutdown of my Go process, and my signal handler logged the shutdown request normally, but the process never exited.

In my case I don't need keep-alive enabled on the reverse proxy, and disabling keep-alive fixed my shutdown issues. Others may run into this, so it may be worth a warning in the README.

Thanks, -- James

@alexzorin
Copy link

I think this is broader than just reverse proxies. Firefox currently ships with a network.http.keep-alive.timeout of 115 seconds, so it takes quite a while for the server to stop (possibly unbounded if a client never drops the connection?)

I think that the correct way to deal with this would be to set a low ReadTimeout (either always, or after the Shutdown signal, if that's possible).

However, at a quick glance neither http.Server nor the net.Connection is exposed when using this package.

Rather than updating the README, is there something we can expose in the API to deal with this issue?

@alexzorin
Copy link

Oops, just read through https://code.google.com/p/go/issues/detail?id=4674. Looks like this will be fixable in Go 1.3

@lionelbarrow
Copy link
Contributor

Fixed in 3.0.0.

@echampet
Copy link

hi @lionelbarrow
not fixed for me, nothing in your code is killing idle connection (http.StateIdle), nor calling SetKeepAlivesEnabled(false).

For now i'm manually adding header Connection:close, so i just have to wait for the ongoing keepalive, but it's not perfect

@bobrik
Copy link

bobrik commented May 26, 2015

It is definitely not fixed.

I see some Connection reset by peer in nginx error logs with this package, Wireshark shows RST packets sent in response for requests. I've read about this error. After looking and syscalls before my service gracefully exits it seems that my service doesn't close any connections, it just exits cleanly resulting ing RST.

I switched to github.com/tylerb/graceful and it works as expected, only raising Connection refused errors in logs. This way nginx is completely sure that request wasn't served and retries it on different upstream.

Hope this helps.

@lionelbarrow
Copy link
Contributor

I'll take another look, sorry for closing prematurely.

@lionelbarrow lionelbarrow reopened this May 28, 2015
@AlekSi
Copy link

AlekSi commented Aug 7, 2015

Any updates?

@lionelbarrow
Copy link
Contributor

Digging into this today.

I see the problem -- if the shutdown command is received and then a request on a keepalive connection finishes, we close the connection correctly. But if the connection is idle already, we won't do anything to it; we just shut down.

It seems the correct solution is for the library to hold onto the connection when it first is created and then close them all once it enters the final phase of shutdown. I'll play around with that, but it might be a bit involved, so I'll update the README with a warning in the mean time. Does that sound reasonable?

I don't think the behavior @coopernurse was describing can happen after we adopted the Go 1.3 TCP state function model.

@AlekSi
Copy link

AlekSi commented Aug 10, 2015

I don't think the behavior @coopernurse was describing can happen after we adopted the Go 1.3 TCP state function model.

Well, what I see now with Go 1.4.2 is that server.ListenAndServe() (where server := manners.NewServer()) does not return after server.Shutdown <- true in configuration described in the first comment.

@allgeek
Copy link

allgeek commented Oct 9, 2015

@lionelbarrow - I'm not sure if you've had a chance to play around with the refactoring to accommodate idle connections during shutdown yet, but I suspect your changes there may eliminate, or influence the solution to, a race condition I've encountered. In a similar fashion, if an idle keepalive connection becomes active after shutdown, a data race will be reported by the race detector, since that connection's state change will attempt a wg.Add(1) after the wg.Wait() has occurred.

As a temporary workaround (trying to eliminate occasional race detector noise from our CI builds), we've added an atomic read of the closing flag to StartRoutine() - but the need to do that is further evidence that our manners aren't quite proper for idle connections ;)

@allgeek
Copy link

allgeek commented Oct 9, 2015

Actually that workaround isn't safe, as it could cause wg to go negative; obviously we don't know if a call to FinishRoutine() is from a 'post-close reactivated idle conn' that shouldn't decrement the WaitGroup. I think the more fundamental changes to properly track and wait on / clean up idle connections during shutdown are the only solution here; let me know if there's something I can do to help there. As it is, we're seeing at least one WaitGroup race from manners reported on about 5% of our CI test runs.

@nullne
Copy link

nullne commented Aug 9, 2017

have this been solved already? I met a lot of negative WaitGroup counter in my production environment. I am taking investigation about this now.
can you @allgeek show me any example which will result in wg to go negative?

@nemo-by-replace
Copy link

today, In Go 1.9.4, Manners does not correctly shut down long-lived keepalive connections when issued a shutdown command yet! so how can i do?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants