Thread leak? #2623
Comments
I'm testing other conditions (ESI + HTTP/1 + varnish restart).
All timestamps are UTC+0900 (JST).
h2_refcnt_delta and nidle_active increased simultaneously at 10:40-11:00.
I'm looking for the commit that caused this... Leaked / Not leaked. From 00:50 (JST) I removed the instance and set thread_pool_timeout=10 and thread_pool_min=10.
I think the thread leak began with 0d3044d.
I'm testing Varnish 6.0.0 plus the bin/varnishd/http2/* changes.
Leaked...
Thanks for the updates - looks like we still have a thread leak situation happening somewhere. I'll keep looking.
I've plugged yet another thread leak, this one also relating to request bodies (f690631). If we failed to schedule a thread for a stream, the h/2 session thread would get stuck indefinitely if a DATA frame was received on that stream prior to us cleaning it up. Since the h/2 session thread is also responsible for waking up any streams currently waiting for window updates (since 0d3044d), those threads would also be leaked, exacerbating the situation. Please test 👍
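To make the failure mode concrete, here is a minimal, compilable sketch of the idea behind that fix. All names (`struct stream`, `dispatch_worker()`, `hand_off_to_worker()`, etc.) are hypothetical stand-ins, not the actual varnishd http2 code; the point is only that a stream whose worker could not be scheduled must be marked dead immediately, so that a later DATA frame on it is refused instead of parking the session thread.

```c
/* Hypothetical sketch only -- names and structures are not the real code. */
#include <stddef.h>

enum stream_state { ST_OPEN, ST_DEAD };

struct stream {
	enum stream_state	state;
	int			scheduled;	/* worker thread dispatched? */
};

/* Placeholder stubs standing in for the real scheduler / frame plumbing. */
static int dispatch_worker(struct stream *st) { (void)st; return (-1); }
static int stream_error(struct stream *st) { st->state = ST_DEAD; return (-1); }
static int hand_off_to_worker(struct stream *st, const void *p, size_t l)
    { (void)st; (void)p; (void)l; return (0); }

/* Session (rx) thread: HEADERS completed a request, try to start a worker. */
static void
start_stream(struct stream *st)
{
	if (dispatch_worker(st) != 0) {
		/*
		 * The fix: if no worker could be scheduled, mark the
		 * stream dead right away, so a later DATA frame is
		 * refused instead of being queued for a worker that
		 * will never exist (which left the session thread, and
		 * every stream waiting on it for window updates, stuck).
		 */
		st->state = ST_DEAD;
		return;
	}
	st->scheduled = 1;
}

/* Session (rx) thread: an incoming DATA frame for this stream. */
static int
rx_data(struct stream *st, const void *buf, size_t len)
{
	if (st->state == ST_DEAD || !st->scheduled)
		return (stream_error(st));		/* reset, never block */
	return (hand_off_to_worker(st, buf, len));
}

int
main(void)
{
	struct stream st = { ST_OPEN, 0 };

	start_stream(&st);		/* scheduling "fails" in this stub */
	return (rx_data(&st, "x", 1) == -1 ? 0 : 1);
}
```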
c7c7f52 + added metrics: Leaked...
Thanks a lot @xcir, that was hugely helpful. I'm pretty confident I've gotten to the bottom of this now. Will push a fix asap.
I'm pretty sure this is fixed now, but please reopen if you still see problems. |
Leaked again... backtrace (h2=0x7f63194affc0)
Previously we've been incorrectly transitioning to CLOS_REM on END_HEADERS, which prevents us from seeing if a client incorrectly transmitted a DATA frame on a closed stream. This slightly complicates things in that we can now be in state OPEN with an inactive hpack decoding state, and we need to check on cleanup whether that has already been finalized. This would be simpler if the h/2 spec had split the OPEN state in two parts, with an extra state transition on END_HEADERS. Again big thanks to @xcir for his help in diagnosing this. Fixes: varnishcache#2623
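As a companion to that description, here is a small, compilable sketch of the intended transition, using hypothetical names (`enum h2_stream_state`, `on_headers()`) rather than the code in bin/varnishd/http2/: only END_STREAM half-closes the remote side, while END_HEADERS by itself leaves the stream OPEN.

```c
/* Hypothetical sketch of the corrected state handling. */
#include <stdio.h>

enum h2_stream_state { H2_S_IDLE, H2_S_OPEN, H2_S_CLOS_REM, H2_S_CLOSED };

#define F_END_STREAM	0x1	/* HTTP/2 HEADERS frame flags */
#define F_END_HEADERS	0x4

/* HEADERS received on an idle stream: pick the resulting state. */
static enum h2_stream_state
on_headers(unsigned flags)
{
	/*
	 * Only END_STREAM half-closes the remote side.  END_HEADERS
	 * merely ends the header block; the stream stays OPEN so that
	 * a DATA frame is still legal -- and so that a DATA frame
	 * arriving after END_STREAM can be flagged as a protocol
	 * error instead of silently hanging the rx thread.
	 */
	if (flags & F_END_STREAM)
		return (H2_S_CLOS_REM);
	return (H2_S_OPEN);
}

int
main(void)
{
	printf("HEADERS + END_HEADERS              -> %s\n",
	    on_headers(F_END_HEADERS) == H2_S_OPEN ? "OPEN" : "CLOS_REM");
	printf("HEADERS + END_HEADERS + END_STREAM -> %s\n",
	    on_headers(F_END_HEADERS | F_END_STREAM) == H2_S_CLOS_REM ?
	    "CLOS_REM" : "OPEN");
	return (0);
}
```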
Again... backtrace + VSL from a working Varnish (tmp/snap/0618-trunk)
The current flow control code's use of h2->cond is racy. h2->cond is already used for handing over a DATA frame to a stream thread. In the event that we have both streams waiting on this condvar for window updates and at the same time the rxthread gets signaled for a DATA frame, we could end up waking up the wrong thread and the rxthread gets stuck forever. This commit addresses this by using a separate condvar for window updates. An alternative would be to always issue a broadcast on h2->cond instead of signal, but I found this approach much cleaner. Probably fixes: varnishcache#2623
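For readers who want the shape of that change in isolation, here is a minimal pthread sketch (field names such as `winupd_cond` are hypothetical, and this is not the actual h2_sess layout): window-update waiters block on their own condition variable, so a `pthread_cond_signal()` intended for a DATA-frame handoff on the shared condvar can no longer be consumed by the wrong thread.

```c
/* Hypothetical sketch of the condvar split; build with -pthread. */
#include <pthread.h>
#include <stdio.h>

struct h2_sess {
	pthread_mutex_t	mtx;
	pthread_cond_t	cond;		/* DATA frame handed to a stream */
	pthread_cond_t	winupd_cond;	/* WINDOW_UPDATE received */
	int		data_ready;
	int		window;
};

/* Stream thread blocked until it is granted send window. */
static void *
stream_thread(void *priv)
{
	struct h2_sess *h2 = priv;

	pthread_mutex_lock(&h2->mtx);
	while (h2->window <= 0)
		pthread_cond_wait(&h2->winupd_cond, &h2->mtx);
	printf("stream: got window %d\n", h2->window);
	pthread_mutex_unlock(&h2->mtx);
	return (NULL);
}

int
main(void)
{
	struct h2_sess h2 = { PTHREAD_MUTEX_INITIALIZER,
	    PTHREAD_COND_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 };
	pthread_t st;

	pthread_create(&st, NULL, stream_thread, &h2);

	/*
	 * rx thread: a WINDOW_UPDATE arrives.  Signal only the window
	 * waiters; a thread waiting for a DATA frame on h2.cond can
	 * no longer steal this wakeup.
	 */
	pthread_mutex_lock(&h2.mtx);
	h2.window = 65535;
	pthread_cond_signal(&h2.winupd_cond);
	pthread_mutex_unlock(&h2.mtx);

	pthread_join(st, NULL);
	return (0);
}
```

The alternative the commit mentions, always broadcasting on the shared condvar, would also avoid the lost wakeup, at the cost of spurious wakeups for every waiter.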
I've been running commit 812b136 for 4 days. No leak!
If a client mistakenly sent a DATA frame on a stream where it already transmitted an END_STREAM, it would lead to the rxthread sitting around indefinitely. Big thanks to @xcir for his help in diagnosing this. Fixes: varnishcache#2623
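A hypothetical fragment of the corresponding rx-side check (illustrative names only, reusing the state enum from the sketch above): DATA on a stream whose remote side already sent END_STREAM is answered with a stream error instead of being queued for a worker, so the rxthread never ends up waiting on it.

```c
/* Hypothetical sketch only. */
#include <stdio.h>

enum h2_stream_state { H2_S_IDLE, H2_S_OPEN, H2_S_CLOS_REM, H2_S_CLOSED };
enum h2_error { H2_E_NONE, H2_E_STREAM_CLOSED };

/* rx thread: decide what to do with an incoming DATA frame. */
static enum h2_error
rx_data_frame(enum h2_stream_state remote_state)
{
	/*
	 * The client already half-closed its side with END_STREAM, so
	 * DATA here is a protocol violation: reset the stream instead
	 * of waiting (forever) for a worker to consume the payload.
	 */
	if (remote_state != H2_S_OPEN)
		return (H2_E_STREAM_CLOSED);
	return (H2_E_NONE);
}

int
main(void)
{
	printf("DATA on OPEN stream:     error=%d\n",
	    (int)rx_data_frame(H2_S_OPEN));
	printf("DATA on CLOS_REM stream: error=%d\n",
	    (int)rx_data_frame(H2_S_CLOS_REM));
	return (0);
}
```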
ESI: enabled, HTTP/2: enabled
I disabled HTTP/2 after noticing the thread increase (only a Hitch restart).
But it still seems to increase gradually.
This is another instance (ESI: enabled + HTTP/2: enabled).
ESI: disabled, HTTP/2: enabled
This instance is not using ESI.
Threads did not increase.
Your Environment