contexthelpers: Fix deadlock when nobody recvs on success channel
[ upstream commit 4a66961 ]

It has been observed that the "sessionSuccess <- true" statement can block forever. E.g. Cilium v1.7.9:

    goroutine 742 [chan send, 547 minutes]:
    github.com/cilium/cilium/pkg/kvstore.(*etcdClient).renewLockSession(0xc000f66000, 0x2790b40, 0xc000e247c0, 0x0, 0x0)
            /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:657 +0x3e2
    github.com/cilium/cilium/pkg/kvstore.connectEtcdClient.func6(0x2790b40, 0xc000e247c0, 0x3aae820, 0x2)
            /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:819 +0x3e
    github.com/cilium/cilium/pkg/controller.(*Controller).runController(0xc000f42500)
            /go/src/github.com/cilium/cilium/pkg/controller/controller.go:205 +0xa2a
    created by github.com/cilium/cilium/pkg/controller.(*Manager).updateController
            /go/src/github.com/cilium/cilium/pkg/controller/manager.go:120 +0xb09

This can happen when the context cancellation timer fires after a function which consumes the context has returned, but before "sessionSuccess <- true" is executed. After the timeout, the receiving goroutine is closed, making the sending block forever.

Fix this by making the sessionSuccess channel buffered. Note that after sending we don't check whether the context has been cancelled, as we expect that any subsequent functions which consume the context will check for the cancellation.

Fixes: 0262854 ("etcd: Fix incorrect context usage in session renewal")
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
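
The race described above can be reproduced in isolation. The following is a minimal sketch, not the actual Cilium code: the function name sendSucceeds and its structure are hypothetical and only mimic the pattern of a receiver that stops listening once the context is cancelled. It compares an unbuffered success channel with a buffered one (capacity 1) when the cancellation fires before the send.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sendSucceeds is a hypothetical helper (not part of Cilium). It reports
// whether a send on the success channel completes after the receiving
// goroutine has already exited because the context was cancelled.
func sendSucceeds(buffered bool) bool {
	size := 0
	if buffered {
		size = 1
	}
	success := make(chan bool, size)

	ctx, cancel := context.WithCancel(context.Background())
	receiverDone := make(chan struct{})

	// Receiver: waits for either a success signal or context cancellation,
	// then exits. After it exits, nobody reads from the success channel.
	go func() {
		defer close(receiverDone)
		select {
		case <-success:
		case <-ctx.Done():
		}
	}()

	// Simulate the timeout firing before the sender reaches its send.
	cancel()
	<-receiverDone

	// Sender: with an unbuffered channel this send blocks forever
	// (the goroutine leaks); with a buffer of 1 it returns at once.
	sent := make(chan struct{})
	go func() {
		success <- true
		close(sent)
	}()

	select {
	case <-sent:
		return true
	case <-time.After(100 * time.Millisecond):
		return false
	}
}

func main() {
	fmt.Println("unbuffered send completed:", sendSucceeds(false)) // false
	fmt.Println("buffered send completed:", sendSucceeds(true))    // true
}
```

In this sketch the buffered channel lets the sender proceed even though the value is never read, which matches the reasoning in the commit message: later consumers of the context are expected to notice the cancellation themselves.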