H/2 thread starvation deadlock #2418

Closed
daghf opened this issue Sep 7, 2017 · 4 comments
daghf commented Sep 7, 2017

Exhausting the available worker threads can make Varnish go completely unresponsive to client traffic, indefinitely. This happens when h/2 session threads occupy all the available threads while the request tasks they are waiting for remain queued.

The issue is trivially reproducible by running an h2 load testing tool with more connections than there are available threads.

For a simple test, I started Varnish with a low-ish thread count (-p thread_pool_min=40 -p thread_pool_max=40)

and then ran an h/2 load test with a number of connections exceeding the available threads:

$ ./h2load https://localhost:9443/ -c 100 -n 2000

(h2load is part of nghttp2, https://github.com/nghttp2/nghttp2)

The above h2load command will never finish. After killing the load testing tool, Varnish stays unresponsive indefinitely, for h/1 traffic as well.

Attaching gdb at this point, we find that the threads are stuck as follows:

#0  pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1  0x000000000044dba3 in Lck_CondWait (cond=0x7fcfeb7b7e40, lck=0x7fcfd9a85240, when=1504773044.7735426) at cache/cache_lck.c:207
#2  0x00000000004a0acb in h2_new_session (wrk=0x7fcfeb7b7dd0, arg=0x7fcfd9a8e020) at http2/cache_http2_session.c:344
#3  0x000000000048487c in Pool_Work_Thread (pp=0x7fcfe280e140, wrk=0x7fcfeb7b7dd0) at cache/cache_wrk.c:379
#4  0x00000000004840f1 in WRK_Thread (qp=0x7fcfe280e140, stacksize=49152, thread_workspace=2048) at cache/cache_wrk.c:132
#5  0x0000000000483d7a in pool_thread (priv=0x7fcfe180e100) at cache/cache_wrk.c:412
#6  0x00007fcfea63a494 in start_thread (arg=0x7fcfeb7b8700) at pthread_create.c:333
#7  0x00007fcfea37caff in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97

This is in the request cleanup part of h2_new_session. The deadlock occurs here because the session threads are signaling, and waiting for, request tasks that have not yet been dispatched to a worker.

In summary:

  • h/2 session threads can't finish because they are waiting for request threads that haven't been dispatched yet
  • Request threads can't be dispatched because all the available threads are occupied by h/2 session threads
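To make the mechanism concrete, here is a minimal, self-contained C model of the failure mode. It is a sketch, not Varnish code: the pool, the queue, and every name in it (enqueue, session_task, stream_task, NWORKERS) are invented for illustration. A fixed pool of workers pulls tasks off a FIFO queue; each "session" task enqueues a "stream" task and then blocks until it has run. Once every worker is parked inside a session, the queued streams can never be dequeued, mirroring the two bullet points above.

/* deadlock_sketch.c -- a minimal model of the failure mode, not Varnish
 * code; every name here is invented.  Build: cc -pthread deadlock_sketch.c */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define NWORKERS  4		/* the whole "thread pool" */
#define NSESSIONS 8		/* more connections than threads */

struct task {
	void (*func)(void *);
	void *priv;
	struct task *next;
};

static pthread_mutex_t qmtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t qcond = PTHREAD_COND_INITIALIZER;
static struct task *qhead, *qtail;

static void
enqueue(void (*func)(void *), void *priv)
{
	struct task *t = malloc(sizeof *t);
	t->func = func; t->priv = priv; t->next = NULL;
	pthread_mutex_lock(&qmtx);
	if (qtail != NULL)
		qtail->next = t;
	else
		qhead = t;
	qtail = t;
	pthread_cond_signal(&qcond);
	pthread_mutex_unlock(&qmtx);
}

struct stream {			/* a "stream" signals its session when run */
	pthread_mutex_t mtx;
	pthread_cond_t cond;
	int done;
};

static void
stream_task(void *priv)
{
	struct stream *st = priv;
	pthread_mutex_lock(&st->mtx);
	st->done = 1;
	pthread_cond_signal(&st->cond);
	pthread_mutex_unlock(&st->mtx);
}

/* A "session" queues a stream, then holds on to its worker thread
 * until the stream has run -- the crux of the deadlock. */
static void
session_task(void *priv)
{
	struct stream st = { PTHREAD_MUTEX_INITIALIZER,
	    PTHREAD_COND_INITIALIZER, 0 };
	(void)priv;
	enqueue(stream_task, &st);
	pthread_mutex_lock(&st.mtx);
	while (!st.done)
		pthread_cond_wait(&st.cond, &st.mtx);
	pthread_mutex_unlock(&st.mtx);
}

static void *
worker(void *priv)
{
	struct task *t;
	(void)priv;
	for (;;) {
		pthread_mutex_lock(&qmtx);
		while (qhead == NULL)
			pthread_cond_wait(&qcond, &qmtx);
		t = qhead;
		if ((qhead = t->next) == NULL)
			qtail = NULL;
		pthread_mutex_unlock(&qmtx);
		t->func(t->priv);
		free(t);
	}
}

int
main(void)
{
	pthread_t thr[NWORKERS];
	int i;
	for (i = 0; i < NWORKERS; i++)
		pthread_create(&thr[i], NULL, worker, NULL);
	for (i = 0; i < NSESSIONS; i++)
		enqueue(session_task, NULL);
	sleep(2);
	/* All workers now block in session_task(); the queued streams stay. */
	printf("pool wedged: all workers stuck behind queued streams\n");
	return (0);
}

Because the sessions were all enqueued before any of the streams they spawn, FIFO order guarantees the workers fill up with sessions first, just as h2 connections exceeding the thread count do in Varnish.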
Dridi commented Sep 7, 2017

@daghf, this scenario?

varnishtest "h2 queuing deadlock"

server s1 {
        rxreq
        txresp
} -start

varnish v1 -cliok "param.set thread_pools 1"
varnish v1 -cliok "param.set thread_pool_min 2"
varnish v1 -cliok "param.set thread_pool_max 2"
varnish v1 -cliok "param.set feature +http2"

varnish v1 -vcl+backend { } -start

client c1 {
        txpri
        stream 0 rxsettings -run
} -start

client c2 {
        txpri
        stream 0 rxsettings -run
} -start

client c1 {
        stream 1 {
                txreq
                rxresp
                expect resp.status == 200
        } -start
} -start

client c2 {
        stream 1 {
                txreq
                rxresp
                expect resp.status == 200
        } -start
} -start

client c1 -wait
client c2 -wait

edit: I think this test only shows half of the story.

Dridi commented Sep 7, 2017

Adding an h1 client to match the whole description:

varnishtest "h2 queuing deadlock"

barrier b1 cond 2

server s1 { } -start

# A reserve of 1 thread in a pool of 3 leaves a maximum
# of 2 running sessions; the streams will be queued (except
# stream 0, which is part of the h2 session).

varnish v1 -cliok "param.set thread_pools 1"
varnish v1 -cliok "param.set thread_pool_min 3"
varnish v1 -cliok "param.set thread_pool_max 3"
varnish v1 -cliok "param.set thread_pool_reserve 1"
varnish v1 -cliok "param.set feature +http2"

varnish v1 -vcl+backend {
	sub vcl_recv {
		return (synth(200));
	}
} -start

# Starve the pool with h2 sessions

client c1 {
	txpri
	stream 0 rxsettings -run

	barrier b1 sync

	stream 1 {
		txreq
		# can't be scheduled, don't rx
	} -run
} -start

client c2 {
	txpri
	stream 0 rxsettings -run

	barrier b1 sync

	stream 1 {
		txreq
		# can't be scheduled, don't rx
	} -run
} -start

client c1 -wait
client c2 -wait

# At this point c1 and c2 have closed their connections

client c3 {
	txreq
	rxresp
	expect resp.status == 200
} -run
Dridi commented Sep 7, 2017

The test also fails on 5.1.3 with the same outcome, but does not fail on 5.0.0 after some tweaking:

diff r02418.vtc r02418.vtc
--- r02418.vtc
+++ r02418.vtc
@@ -9,9 +9,8 @@ server s1 { } -start
 # stream 0 that is part of the h2 session).
 
 varnish v1 -cliok "param.set thread_pools 1"
-varnish v1 -cliok "param.set thread_pool_min 3"
-varnish v1 -cliok "param.set thread_pool_max 3"
-varnish v1 -cliok "param.set thread_pool_reserve 1"
+varnish v1 -cliok "param.set thread_pool_min 2"
+varnish v1 -cliok "param.set thread_pool_max 2"
 varnish v1 -cliok "param.set feature +http2"
 
 varnish v1 -vcl+backend {

Please note that I compensated for the lack of thread_pool_reserve (the parameter does not exist in 5.0.0) by decreasing the number of threads: a pool of 2 with no reserve leaves the same 2 threads available to sessions as a pool of 3 with a reserve of 1.

Dridi added the r=trunk, r=5.2 and c=varnishd labels and removed the r=trunk label on Sep 18, 2017

bsdphk added a commit that referenced this issue Oct 3, 2018

Add a watchdog to worker pools.
If the worker pool is configured too small, it can deadlock.

Recovering from this would require a lot of complicated code, to
discover queued but unscheduled tasks which can be cancelled
(because the client went away or otherwise) and code to do the
cancelling etc. etc.

But fundamentally, either people configured their pools wrong, in
which case we want them to notice, or they are under DoS, in which
case recovering gracefully is unlikely to be a major improvement
over a restart.

Instead we implement a per-pool watchdog and kill the child process
if nothing has been dequeued for too long.

Default value 10 seconds, open to discussion.

Band-aid for:	#2418

Test-case by:	@Dridi
bsdphk commented Oct 3, 2018

Trying to recover gracefully is not worth the amount of code and complexity.

I have added a watchdog which kills the child process if no queued work gets scheduled for too long (10s by default, debatable).
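Roughly, such a watchdog could look like the sketch below. This is not the actual cache_pool.c code; struct pool, nqueued, last_dequeue and the other names are invented for illustration. The idea is simply to record when a worker last dequeued a task and abort the child process if work stays queued past the limit, so the manager process restarts it.

/* watchdog_sketch.c -- the rough shape of a per-pool watchdog; all
 * names here are invented, not Varnish's actual identifiers.
 * Build: cc -pthread watchdog_sketch.c */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define WATCHDOG_LIMIT 10.0	/* seconds; the debatable default */

struct pool {
	pthread_mutex_t mtx;
	unsigned nqueued;	/* tasks waiting for a worker */
	double last_dequeue;	/* when a worker last picked one up */
};

static double
mono_now(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (ts.tv_sec + 1e-9 * ts.tv_nsec);
}

/* Workers would call this each time they take a task off the queue. */
static void
pool_note_dequeue(struct pool *pp)
{
	pthread_mutex_lock(&pp->mtx);
	pp->nqueued--;
	pp->last_dequeue = mono_now();
	pthread_mutex_unlock(&pp->mtx);
}

static void *
pool_watchdog(void *priv)
{
	struct pool *pp = priv;
	double idle;
	for (;;) {
		sleep(1);
		pthread_mutex_lock(&pp->mtx);
		idle = mono_now() - pp->last_dequeue;
		if (pp->nqueued > 0 && idle > WATCHDOG_LIMIT) {
			/* Work is queued but nothing has been scheduled
			 * for too long: the pool is wedged.  Killing the
			 * child process lets the manager restart it. */
			fprintf(stderr, "pool wedged %.0fs, aborting\n", idle);
			abort();
		}
		pthread_mutex_unlock(&pp->mtx);
	}
}

int
main(void)
{
	pthread_t wd;
	struct pool pp = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };

	(void)pool_note_dequeue;	/* unused in this demo */
	pp.last_dequeue = mono_now();
	pthread_create(&wd, NULL, pool_watchdog, &pp);

	/* Simulate a wedged pool: one task queued, nobody dequeues it. */
	pthread_mutex_lock(&pp.mtx);
	pp.nqueued = 1;
	pthread_mutex_unlock(&pp.mtx);
	sleep(15);			/* the watchdog aborts before this */
	return (0);
}

Note that the check only fires while work is actually queued; an idle pool never trips it, no matter how long nothing is dequeued.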

bsdphk closed this Oct 3, 2018

nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Oct 9, 2018

generalize the worker pool reserve to avoid deadlocks
Previously, we used a minimum number of idle threads (the reserve) to
ensure that we do not assign all threads to client requests, leaving
no threads for backend requests.

This was actually only a special case of the more general issue exposed
by h2: Lower priority tasks depend on higher priority tasks (for h2,
sessions need streams, which need requests, which may need backend
requests).

To solve this problem, we divide the reserve by the number of priority
classes and schedule lower priority tasks only if there are enough idle
threads to run higher priority tasks eventually.

This change does not guarantee any upper limit on the amount of time it
can take for a task to be scheduled (e.g. backend requests could be
blocking on arbitrarily long timeouts), so the thread pool watchdog is
still warranted. But this change should guarantee that we do make
progress eventually.

Fixes varnishcache#2418 for real
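The dispatch rule this commit message describes can be sketched as follows. The enum and field names are invented for illustration and do not match the actual identifiers in cache_wrk.c; the point is only the arithmetic. With the reserve split evenly across priority classes, sessions are refused an idle thread first, requests/streams next, and backend fetches may always take the last one.

/* reserve_sketch.c -- the dispatch rule, as a hedged sketch with
 * invented names, not the actual cache_wrk.c implementation. */
#include <stdio.h>

/* Lower value = higher priority. */
enum task_prio {
	PRIO_BO,	/* backend requests */
	PRIO_REQ,	/* client requests / h2 streams */
	PRIO_SESS,	/* session tasks (h1 and h2) */
	PRIO__MAX
};

struct pool {
	unsigned nidle;		/* threads currently idle */
	unsigned reserve;	/* thread_pool_reserve */
};

/*
 * A task of class `prio` may only be dispatched if enough idle
 * threads remain for the `prio` classes above it, each of which
 * keeps reserve / PRIO__MAX threads to itself.  PRIO_BO may always
 * take the last idle thread; sessions are refused first.
 */
static int
pool_may_dispatch(const struct pool *pp, enum task_prio prio)
{
	unsigned per_class = pp->reserve / PRIO__MAX;

	return (pp->nidle > per_class * (unsigned)prio);
}

int
main(void)
{
	struct pool pp = { 3, 6 };	/* 3 idle threads, reserve of 6 */

	printf("backend: %d\n", pool_may_dispatch(&pp, PRIO_BO));	/* 1 */
	printf("request: %d\n", pool_may_dispatch(&pp, PRIO_REQ));	/* 1 */
	printf("session: %d\n", pool_may_dispatch(&pp, PRIO_SESS));	/* 0 */
	return (0);
}

With 3 idle threads and a reserve of 6, backend and request tasks may still be dispatched but a new session may not, so the pool can no longer fill up with sessions the way the original report describes.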

The same two commits were subsequently rebased and cherry-picked a number of times, each time referencing this issue with a commit message identical to the one quoted above:

  • nigoroll re-added the "generalize the worker pool reserve to avoid deadlocks" commit to nigoroll/varnish-cache on Oct 9, Oct 18, Oct 27, Nov 1, Nov 6 (three times), Nov 13, Nov 15, Nov 21, Nov 22 (three times), Nov 26, Nov 27 (twice), Nov 28, Dec 3 (twice) and Dec 7, 2018
  • hermunn added the "Add a watchdog to worker pools." commit on Oct 11, 2018 (to hermunn/varnish-cache) and on Oct 24, 2018 (to this repository), both times with conflicts in bin/varnishd/cache/cache_pool.h
  • Dridi added the "Add a watchdog to worker pools." commit on Oct 16, 2018