reserve a fraction of idle threads for each priority class #2796
Conversation
Force-pushed from 71e0cf8 to 87b56f9
Maybe I could have chosen a cleaner wording: The reserve is proportional to the priority, so
bugwash conclusion: phk to ponder a bit longer
Force-pushed from 87b56f9 to a119f98
Rebased, force pushed
Force-pushed from a119f98 to e87de12
Force-pushed from 3683f1b to 438a138
Force-pushed from 4006203 to d238ffe
Force-pushed from fedb594 to 72b46da
Force-pushed from 44d7734 to 139f777
Bugwash generally in favour. @dridi to give it a whirl and give feedback
Force-pushed from 71212b7 to a24e317
I managed to salvage r2418.vtc from the grave, following my intuition that the new reserve logic would not remove the deadlock.

Test case:

varnishtest "h2 queuing deadlock"
barrier b1 cond 2
# dridi> I didn't work out exactly how I tickled the watchdog, because unlike
# the current reserve system on the master branch I run into a deadlock
# before even involving a 3rd client. Analysis for later, I was sidetracked by
# a bug (049ce0e73) in varnishtest.
varnish v1 -cliok "param.set thread_pools 1"
varnish v1 -cliok "param.set thread_pool_min 5"
varnish v1 -cliok "param.set thread_pool_max 5"
varnish v1 -cliok "param.set thread_pool_reserve 0"
varnish v1 -cliok "param.set thread_pool_watchdog 2"
varnish v1 -cliok "param.set feature +http2"
varnish v1 -cliok "param.set feature +no_coredump"
varnish v1 -vcl {
backend be none;
sub vcl_recv {
return (synth(200));
}
} -start
logexpect l1 -v v1 -g raw -q "vxid == 0" {
expect * 0 Error "Pool Herder: Queue does not move"
} -start
client c1 {
txpri
stream 0 rxsettings -run
barrier b1 sync
stream 1 {
txreq
# dridi> not sure how relevant this still is
} -run
} -start
client c2 {
barrier b1 sync
txpri
delay 3
} -start
client c1 -wait
client c2 -wait
logexpect l1 -wait
varnish v1 -cliok panic.show
varnish v1 -cliok panic.clear
varnish v1 -expectexit 0x20
Force-pushed from a24e317 to 0751ab2
Force-pushed from 1dcc1bf to 54bed57
I'm no longer able to deadlock Varnish with this patch set in place. Good job :-)

@dridi the test case does not deadlock, the reason we trigger the watchdog is due to the

I wonder if we might still have a theoretical deadlock possibility for h/2 via
Thank you @daghf for your analysis. When I produced the test case I wasn't sure what was happening yet, and when I tried to analyze what was going on I couldn't focus properly.

Besides your suggestion, do you have an idea of how to fix the spurious watchdog trigger? It is caused by a client, so this is something we should either fix as part of this pull request, or soon after if we decide to merge it in its current form. Either way, it is not something we should allow to fall on the back burner.
Force-pushed from 54bed57 to a59618a
I think I managed to reproduce the h2 panic in #3142, so there is nothing holding my approval back, except maybe that you might as well squash the squashme commits when you rebase against master.
LGTM
Force-pushed from c34388f to 1f3c5e0
Force-pushed from 1f3c5e0 to dcf5915
Previously, we used a minimum number of idle threads (the reserve) to ensure that we do not assign all threads to client requests, leaving none for backend requests. This was actually only a special case of the more general issue exposed by h2: lower priority tasks depend on higher priority tasks (for h2, sessions need streams, which need requests, which may need backend requests).

To solve this problem, we divide the reserve by the number of priority classes and schedule lower priority tasks only if there are enough idle threads left to eventually run higher priority tasks.

This change does not guarantee any upper limit on the amount of time it can take for a task to be scheduled (e.g. backend requests could be blocking on arbitrarily long timeouts), so the thread pool watchdog is still warranted. But this change should guarantee that we do make progress eventually.

With the reserves, thread_pool_min needs to be no smaller than the number of priority classes (TASK_QUEUE__END). Ideally, we should have an even higher minimum (@dridi rightly suggested to make it 2 * TASK_QUEUE__END), but that would preclude the very useful test t02011.vtc. For now, the value of TASK_QUEUE__END (5) is hardcoded as such for the parameter configuration and documentation, because auto-generating it would require include/macro dances which I consider over the top for now. Instead, the respective places are marked and an assert is in place to ensure we do not start a worker with too small a number of workers. I decided against checks in the manager to avoid include pollution from the worker (cache.h) into the manager.

Fixes varnishcache#2418 for real
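To make the scheduling rule concrete, here is a minimal, self-contained C sketch of the idea (an illustration with hypothetical names, not the actual Varnish worker code): the reserve is split evenly across the priority classes, and a task may only take an idle thread if enough idle threads remain for all higher priority classes.

#include <assert.h>

/* Hypothetical priority classes, highest priority first; cf. TASK_QUEUE__END. */
enum task_class {
	CLASS_BO,	/* backend requests */
	CLASS_RUSH,
	CLASS_REQ,	/* client requests */
	CLASS_STR,	/* h2 streams */
	CLASS_SESS,	/* new sessions */
	CLASS__END
};

/*
 * The reserve is divided into CLASS__END equal shares. A task of class c
 * must leave c shares of the reserve untouched, so the highest priority
 * class (0) can always take the next idle thread, while the lowest
 * priority class has to leave CLASS__END - 1 shares idle for the classes
 * above it.
 */
static int
may_schedule(enum task_class c, unsigned nidle, unsigned reserve)
{
	unsigned share;

	assert(c < CLASS__END);
	share = reserve / CLASS__END;		/* per-class slice of the reserve */
	return (nidle > share * (unsigned)c);	/* keep c slices for higher classes */
}

/*
 * Example: with reserve = 10 and CLASS__END = 5, each share is 2 idle
 * threads; a new session (CLASS_SESS) is only scheduled while more than
 * 8 threads are idle, whereas a backend request takes the next idle
 * thread as soon as one exists.
 */

This only illustrates the proportionality mentioned earlier in the conversation ("the reserve is proportional to the priority"); the real code also has to deal with queuing, wakeups and the watchdog.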
This test is to detect a deadlock which does not exist any more. IMHO, the only sensible way to test for the lack of it now is to do a load test, which is not what we want in vtc.
Force-pushed from dcf5915 to 5fe2a46
FTR, from the commit message: "the respective places are marked and an assert is in place to ensure we do not start a worker with too small a number of workers. I decided against checks in the manager to avoid include pollution from the worker (cache.h) into the manager."
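For illustration, a hedged sketch of what such a startup assert could look like (names and placement are hypothetical, not lifted from the Varnish sources): with the reserve split across priority classes, a pool only makes sense if its minimum thread count covers at least one thread per class.

#include <assert.h>

#define N_PRIORITY_CLASSES 5	/* cf. TASK_QUEUE__END in the discussion */

/*
 * Hypothetical check at worker/pool startup: refuse to run with fewer
 * minimum threads than there are priority classes, since each class must
 * be able to make progress eventually.
 */
static void
pool_sanity_check(unsigned thread_pool_min)
{
	assert(thread_pool_min >= N_PRIORITY_CLASSES);
}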