Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node: Fix race condition on labels' getter/setter #17217

Merged
merged 1 commit into from Sep 2, 2021

Conversation

pchaigno
Copy link
Member

Our race detection pipeline detected a race condition between the setter and getter for the local node's labels:

WARNING: DATA RACE
Write at 0x000005882150 by goroutine 154:
  github.com/cilium/cilium/pkg/node.SetLabels()
      /go/src/github.com/cilium/cilium/pkg/node/host_endpoint.go:33 +0x1dc
  github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).updateK8sNodeV1()
      /go/src/github.com/cilium/cilium/pkg/k8s/watchers/node.go:89 +0x1cc
  github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).nodesInit.func1()
      /go/src/github.com/cilium/cilium/pkg/k8s/watchers/node.go:44 +0xd1
  k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/controller.go:231 +0x8f
  k8s.io/client-go/tools/cache.(*ResourceEventHandlerFuncs).OnAdd()
      <autogenerated>:1 +0x2a
  github.com/cilium/cilium/pkg/k8s/informer.NewInformerWithStore.func1()
      /go/src/github.com/cilium/cilium/pkg/k8s/informer/informer.go:119 +0x297
  k8s.io/client-go/tools/cache.(*DeltaFIFO).Pop()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/delta_fifo.go:544 +0x50d
  k8s.io/client-go/tools/cache.(*controller).processLoop()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/controller.go:183 +0x83
  k8s.io/client-go/tools/cache.(*controller).processLoop-fm()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/controller.go:181 +0x4a
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x75
  k8s.io/apimachinery/pkg/util/wait.BackoffUntil()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xba
  k8s.io/apimachinery/pkg/util/wait.JitterUntil()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x114
  k8s.io/apimachinery/pkg/util/wait.Until()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x507
  k8s.io/client-go/tools/cache.(*controller).Run()
      /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/controller.go:154 +0x4a9

Previous read at 0x000005882150 by main goroutine:
  github.com/cilium/cilium/pkg/nodediscovery.(*NodeDiscovery).StartDiscovery()
      /go/src/github.com/cilium/cilium/pkg/nodediscovery/nodediscovery.go:182 +0x3ab
  github.com/cilium/cilium/daemon/cmd.NewDaemon()
      /go/src/github.com/cilium/cilium/daemon/cmd/daemon.go:762 +0x34d6
  github.com/cilium/cilium/daemon/cmd.runDaemon()
      /go/src/github.com/cilium/cilium/daemon/cmd/daemon_main.go:1558 +0xb8b
  github.com/cilium/cilium/daemon/cmd.glob..func1()
      /go/src/github.com/cilium/cilium/daemon/cmd/daemon_main.go:137 +0x3a4
  github.com/spf13/cobra.(*Command).execute()
      /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:854 +0x8cb
  github.com/spf13/cobra.(*Command).ExecuteC()
      /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:958 +0x4b2
  github.com/spf13/cobra.(*Command).Execute()
      /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:895 +0x88
  github.com/cilium/cilium/daemon/cmd.Execute()
      /go/src/github.com/cilium/cilium/daemon/cmd/daemon_main.go:162 +0x69
  main.main()
      /go/src/github.com/cilium/cilium/daemon/main.go:22 +0x2f
  runtime.main()
      /usr/local/go/src/runtime/proc.go:225 +0x255
  github.com/cilium/ebpf.init()
      /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/syscalls.go:437 +0x58b
  github.com/cilium/ebpf.init()
      /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/prog.go:453 +0x518
  github.com/cilium/ebpf.init()
      /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/syscalls.go:419 +0x4ab
  github.com/cilium/ebpf.init()
      /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/syscalls.go:235 +0x439
  github.com/cilium/ebpf.init()
      /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/syscalls.go:217 +0x3cc
  runtime.doInit()
      /usr/local/go/src/runtime/proc.go:6309 +0xeb
  github.com/cilium/ebpf/internal/btf.init()
      /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/internal/btf/btf.go:728 +0x1cc

This commit fixes it.

Fixes: #15217.
Fixes: #16180.

@pchaigno pchaigno added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. area/host-firewall Impacts the host firewall or the host endpoint. kind/bug/race-detector needs-backport/1.10 labels Aug 23, 2021
@pchaigno pchaigno requested a review from a team August 23, 2021 11:27
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.10.4 Aug 23, 2021
@pchaigno pchaigno requested a review from borkmann August 23, 2021 11:28
pkg/node/host_endpoint.go Outdated Show resolved Hide resolved
pkg/node/host_endpoint.go Outdated Show resolved Hide resolved
@pchaigno
Copy link
Member Author

pchaigno commented Aug 24, 2021

/mlh new-flake Cilium-PR-K8s-1.21-kernel-4.9

👍 created #17232

Our race detection pipeline detected a race condition between the setter
and getter for the local node's labels:

    WARNING: DATA RACE
    Write at 0x000005882150 by goroutine 154:
      github.com/cilium/cilium/pkg/node.SetLabels()
          /go/src/github.com/cilium/cilium/pkg/node/host_endpoint.go:33 +0x1dc
      github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).updateK8sNodeV1()
          /go/src/github.com/cilium/cilium/pkg/k8s/watchers/node.go:89 +0x1cc
      github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).nodesInit.func1()
          /go/src/github.com/cilium/cilium/pkg/k8s/watchers/node.go:44 +0xd1
      k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/controller.go:231 +0x8f
      k8s.io/client-go/tools/cache.(*ResourceEventHandlerFuncs).OnAdd()
          <autogenerated>:1 +0x2a
      github.com/cilium/cilium/pkg/k8s/informer.NewInformerWithStore.func1()
          /go/src/github.com/cilium/cilium/pkg/k8s/informer/informer.go:119 +0x297
      k8s.io/client-go/tools/cache.(*DeltaFIFO).Pop()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/delta_fifo.go:544 +0x50d
      k8s.io/client-go/tools/cache.(*controller).processLoop()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/controller.go:183 +0x83
      k8s.io/client-go/tools/cache.(*controller).processLoop-fm()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/controller.go:181 +0x4a
      k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x75
      k8s.io/apimachinery/pkg/util/wait.BackoffUntil()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xba
      k8s.io/apimachinery/pkg/util/wait.JitterUntil()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0x114
      k8s.io/apimachinery/pkg/util/wait.Until()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x507
      k8s.io/client-go/tools/cache.(*controller).Run()
          /go/src/github.com/cilium/cilium/vendor/k8s.io/client-go/tools/cache/controller.go:154 +0x4a9

    Previous read at 0x000005882150 by main goroutine:
      github.com/cilium/cilium/pkg/nodediscovery.(*NodeDiscovery).StartDiscovery()
          /go/src/github.com/cilium/cilium/pkg/nodediscovery/nodediscovery.go:182 +0x3ab
      github.com/cilium/cilium/daemon/cmd.NewDaemon()
          /go/src/github.com/cilium/cilium/daemon/cmd/daemon.go:762 +0x34d6
      github.com/cilium/cilium/daemon/cmd.runDaemon()
          /go/src/github.com/cilium/cilium/daemon/cmd/daemon_main.go:1558 +0xb8b
      github.com/cilium/cilium/daemon/cmd.glob..func1()
          /go/src/github.com/cilium/cilium/daemon/cmd/daemon_main.go:137 +0x3a4
      github.com/spf13/cobra.(*Command).execute()
          /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:854 +0x8cb
      github.com/spf13/cobra.(*Command).ExecuteC()
          /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:958 +0x4b2
      github.com/spf13/cobra.(*Command).Execute()
          /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:895 +0x88
      github.com/cilium/cilium/daemon/cmd.Execute()
          /go/src/github.com/cilium/cilium/daemon/cmd/daemon_main.go:162 +0x69
      main.main()
          /go/src/github.com/cilium/cilium/daemon/main.go:22 +0x2f
      runtime.main()
          /usr/local/go/src/runtime/proc.go:225 +0x255
      github.com/cilium/ebpf.init()
          /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/syscalls.go:437 +0x58b
      github.com/cilium/ebpf.init()
          /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/prog.go:453 +0x518
      github.com/cilium/ebpf.init()
          /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/syscalls.go:419 +0x4ab
      github.com/cilium/ebpf.init()
          /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/syscalls.go:235 +0x439
      github.com/cilium/ebpf.init()
          /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/syscalls.go:217 +0x3cc
      runtime.doInit()
          /usr/local/go/src/runtime/proc.go:6309 +0xeb
      github.com/cilium/ebpf/internal/btf.init()
          /go/src/github.com/cilium/cilium/vendor/github.com/cilium/ebpf/internal/btf/btf.go:728 +0x1cc

This commit fixes it.

Fixes: 81f0626 ("datapath: Include the host endpoint ID in all endpoint headers")
Signed-off-by: Paul Chaignon <paul@cilium.io>
@pchaigno
Copy link
Member Author

pchaigno commented Sep 1, 2021

test-me-please

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Sep 1, 2021
@joestringer joestringer added this to Needs backport from master in 1.10.5 Sep 1, 2021
@joestringer joestringer removed this from Needs backport from master in 1.10.4 Sep 1, 2021
@aditighag aditighag merged commit ac0d184 into cilium:master Sep 2, 2021
@pchaigno pchaigno deleted the fix-race-getset-node-labels branch September 2, 2021 16:26
@joestringer joestringer moved this from Needs backport from master to Backport done to v1.10 in 1.10.5 Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/host-firewall Impacts the host firewall or the host endpoint. kind/bug/race-detector kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.10.5
Backport done to v1.10
Development

Successfully merging this pull request may close these issues.

[Race Detector] Race between pkg/nodediscovery.(*NodeDiscovery).StartDiscovery and pkg/node.SetLabels()
6 participants