Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix data race in FQDN ruleSyncTracker #5583

Merged
merged 1 commit into from
Oct 17, 2023
Merged

Conversation

tnqn
Copy link
Member

@tnqn tnqn commented Oct 16, 2023

ruleSyncTracker.Run() can update ruleSyncTracker.dirtyRules in-place, while ruleSyncTracker.getDirtyRules() returns the pointer of the set which could be read by other goroutines and leads to a data race like below:

WARNING: DATA RACE
Write at 0x00c000dd8180 by goroutine 276:
  runtime.mapdelete_faststr()
      /usr/local/go/src/runtime/map_faststr.go:301 +0x0
  k8s.io/apimachinery/pkg/util/sets.Set[go.shape.string].Delete()
      /root/go/pkg/mod/k8s.io/apimachinery@v0.26.4/pkg/util/sets/set.go:62 +0x2ae
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*ruleSyncTracker).Run()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:570 +0x1d2
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*fqdnController).runRuleSyncTracker()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:584 +0x4a
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1.4()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:495 +0x44

Previous read at 0x00c000dd8180 by goroutine 271:
  reflect.maplen()
      /usr/local/go/src/runtime/map.go:1411 +0x0
  reflect.Value.lenNonSlice()
      /usr/local/go/src/reflect/value.go:1720 +0x324
  reflect.Value.Len()
      /usr/local/go/src/reflect/value.go:1709 +0x158f
  reflect.deepValueEqual()
      /usr/local/go/src/reflect/deepequal.go:139 +0x1571
  reflect.DeepEqual()
      /usr/local/go/src/reflect/deepequal.go:237 +0x38b
  github.com/stretchr/testify/assert.ObjectsAreEqual()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:65 +0x172
  github.com/stretchr/testify/assert.Equal()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:414 +0x1f7
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:517 +0xb6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x238
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

The data race was introduced by #5566 so is not present in releases yet. The patch needs to be backported together with #5566.

ruleSyncTracker.Run() can update ruleSyncTracker.dirtyRules in-place,
while ruleSyncTracker.getDirtyRules() returns the pointer of the set
which could be read by other goroutines and leads to a data race like
below:

WARNING: DATA RACE
Write at 0x00c000dd8180 by goroutine 276:
  runtime.mapdelete_faststr()
      /usr/local/go/src/runtime/map_faststr.go:301 +0x0
  k8s.io/apimachinery/pkg/util/sets.Set[go.shape.string].Delete()
      /root/go/pkg/mod/k8s.io/apimachinery@v0.26.4/pkg/util/sets/set.go:62 +0x2ae
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*ruleSyncTracker).Run()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:570 +0x1d2
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*fqdnController).runRuleSyncTracker()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:584 +0x4a
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1.4()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:495 +0x44

Previous read at 0x00c000dd8180 by goroutine 271:
  reflect.maplen()
      /usr/local/go/src/runtime/map.go:1411 +0x0
  reflect.Value.lenNonSlice()
      /usr/local/go/src/reflect/value.go:1720 +0x324
  reflect.Value.Len()
      /usr/local/go/src/reflect/value.go:1709 +0x158f
  reflect.deepValueEqual()
      /usr/local/go/src/reflect/deepequal.go:139 +0x1571
  reflect.DeepEqual()
      /usr/local/go/src/reflect/deepequal.go:237 +0x38b
  github.com/stretchr/testify/assert.ObjectsAreEqual()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:65 +0x172
  github.com/stretchr/testify/assert.Equal()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:414 +0x1f7
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:517 +0xb6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x238
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Signed-off-by: Quan Tian <qtian@vmware.com>
@tnqn tnqn added priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. action/backport Indicates a PR that requires backports. labels Oct 16, 2023
@tnqn
Copy link
Member Author

tnqn commented Oct 16, 2023

/test-all

Copy link
Contributor

@Dyanngg Dyanngg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the fix.

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
We could also replace getDirtyRules (which has a single caller) with a new dirtyRulesIntersection function, to avoid the extra set copy, but I don't have a strong preference.

@tnqn
Copy link
Member Author

tnqn commented Oct 17, 2023

We could also replace getDirtyRules (which has a single caller) with a new dirtyRulesIntersection function, to avoid the extra set copy, but I don't have a strong preference.

Thanks for the advice. In production the function will only be called in abnormal case (previous packetin not handled succussfullly) so its impact should be trival for now.

@tnqn tnqn merged commit 97ceb52 into antrea-io:main Oct 17, 2023
49 of 58 checks passed
@tnqn tnqn deleted the fix-data-race branch October 17, 2023 02:04
tnqn added a commit to luolanzone/antrea that referenced this pull request Oct 17, 2023
ruleSyncTracker.Run() can update ruleSyncTracker.dirtyRules in-place,
while ruleSyncTracker.getDirtyRules() returns the pointer of the set
which could be read by other goroutines and leads to a data race like
below:

WARNING: DATA RACE
Write at 0x00c000dd8180 by goroutine 276:
  runtime.mapdelete_faststr()
      /usr/local/go/src/runtime/map_faststr.go:301 +0x0
  k8s.io/apimachinery/pkg/util/sets.Set[go.shape.string].Delete()
      /root/go/pkg/mod/k8s.io/apimachinery@v0.26.4/pkg/util/sets/set.go:62 +0x2ae
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*ruleSyncTracker).Run()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:570 +0x1d2
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*fqdnController).runRuleSyncTracker()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:584 +0x4a
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1.4()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:495 +0x44

Previous read at 0x00c000dd8180 by goroutine 271:
  reflect.maplen()
      /usr/local/go/src/runtime/map.go:1411 +0x0
  reflect.Value.lenNonSlice()
      /usr/local/go/src/reflect/value.go:1720 +0x324
  reflect.Value.Len()
      /usr/local/go/src/reflect/value.go:1709 +0x158f
  reflect.deepValueEqual()
      /usr/local/go/src/reflect/deepequal.go:139 +0x1571
  reflect.DeepEqual()
      /usr/local/go/src/reflect/deepequal.go:237 +0x38b
  github.com/stretchr/testify/assert.ObjectsAreEqual()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:65 +0x172
  github.com/stretchr/testify/assert.Equal()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:414 +0x1f7
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:517 +0xb6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x238
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Signed-off-by: Quan Tian <qtian@vmware.com>
tnqn added a commit to luolanzone/antrea that referenced this pull request Oct 17, 2023
ruleSyncTracker.Run() can update ruleSyncTracker.dirtyRules in-place,
while ruleSyncTracker.getDirtyRules() returns the pointer of the set
which could be read by other goroutines and leads to a data race like
below:

WARNING: DATA RACE
Write at 0x00c000dd8180 by goroutine 276:
  runtime.mapdelete_faststr()
      /usr/local/go/src/runtime/map_faststr.go:301 +0x0
  k8s.io/apimachinery/pkg/util/sets.Set[go.shape.string].Delete()
      /root/go/pkg/mod/k8s.io/apimachinery@v0.26.4/pkg/util/sets/set.go:62 +0x2ae
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*ruleSyncTracker).Run()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:570 +0x1d2
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*fqdnController).runRuleSyncTracker()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:584 +0x4a
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1.4()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:495 +0x44

Previous read at 0x00c000dd8180 by goroutine 271:
  reflect.maplen()
      /usr/local/go/src/runtime/map.go:1411 +0x0
  reflect.Value.lenNonSlice()
      /usr/local/go/src/reflect/value.go:1720 +0x324
  reflect.Value.Len()
      /usr/local/go/src/reflect/value.go:1709 +0x158f
  reflect.deepValueEqual()
      /usr/local/go/src/reflect/deepequal.go:139 +0x1571
  reflect.DeepEqual()
      /usr/local/go/src/reflect/deepequal.go:237 +0x38b
  github.com/stretchr/testify/assert.ObjectsAreEqual()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:65 +0x172
  github.com/stretchr/testify/assert.Equal()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:414 +0x1f7
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:517 +0xb6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x238
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Signed-off-by: Quan Tian <qtian@vmware.com>
tnqn added a commit to luolanzone/antrea that referenced this pull request Oct 17, 2023
ruleSyncTracker.Run() can update ruleSyncTracker.dirtyRules in-place,
while ruleSyncTracker.getDirtyRules() returns the pointer of the set
which could be read by other goroutines and leads to a data race like
below:

WARNING: DATA RACE
Write at 0x00c000dd8180 by goroutine 276:
  runtime.mapdelete_faststr()
      /usr/local/go/src/runtime/map_faststr.go:301 +0x0
  k8s.io/apimachinery/pkg/util/sets.Set[go.shape.string].Delete()
      /root/go/pkg/mod/k8s.io/apimachinery@v0.26.4/pkg/util/sets/set.go:62 +0x2ae
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*ruleSyncTracker).Run()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:570 +0x1d2
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*fqdnController).runRuleSyncTracker()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:584 +0x4a
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1.4()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:495 +0x44

Previous read at 0x00c000dd8180 by goroutine 271:
  reflect.maplen()
      /usr/local/go/src/runtime/map.go:1411 +0x0
  reflect.Value.lenNonSlice()
      /usr/local/go/src/reflect/value.go:1720 +0x324
  reflect.Value.Len()
      /usr/local/go/src/reflect/value.go:1709 +0x158f
  reflect.deepValueEqual()
      /usr/local/go/src/reflect/deepequal.go:139 +0x1571
  reflect.DeepEqual()
      /usr/local/go/src/reflect/deepequal.go:237 +0x38b
  github.com/stretchr/testify/assert.ObjectsAreEqual()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:65 +0x172
  github.com/stretchr/testify/assert.Equal()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:414 +0x1f7
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:517 +0xb6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x238
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Signed-off-by: Quan Tian <qtian@vmware.com>
tnqn added a commit that referenced this pull request Oct 17, 2023
ruleSyncTracker.Run() can update ruleSyncTracker.dirtyRules in-place,
while ruleSyncTracker.getDirtyRules() returns the pointer of the set
which could be read by other goroutines and leads to a data race like
below:

WARNING: DATA RACE
Write at 0x00c000dd8180 by goroutine 276:
  runtime.mapdelete_faststr()
      /usr/local/go/src/runtime/map_faststr.go:301 +0x0
  k8s.io/apimachinery/pkg/util/sets.Set[go.shape.string].Delete()
      /root/go/pkg/mod/k8s.io/apimachinery@v0.26.4/pkg/util/sets/set.go:62 +0x2ae
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*ruleSyncTracker).Run()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:570 +0x1d2
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*fqdnController).runRuleSyncTracker()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:584 +0x4a
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1.4()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:495 +0x44

Previous read at 0x00c000dd8180 by goroutine 271:
  reflect.maplen()
      /usr/local/go/src/runtime/map.go:1411 +0x0
  reflect.Value.lenNonSlice()
      /usr/local/go/src/reflect/value.go:1720 +0x324
  reflect.Value.Len()
      /usr/local/go/src/reflect/value.go:1709 +0x158f
  reflect.deepValueEqual()
      /usr/local/go/src/reflect/deepequal.go:139 +0x1571
  reflect.DeepEqual()
      /usr/local/go/src/reflect/deepequal.go:237 +0x38b
  github.com/stretchr/testify/assert.ObjectsAreEqual()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:65 +0x172
  github.com/stretchr/testify/assert.Equal()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:414 +0x1f7
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:517 +0xb6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x238
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Signed-off-by: Quan Tian <qtian@vmware.com>
tnqn added a commit that referenced this pull request Oct 17, 2023
ruleSyncTracker.Run() can update ruleSyncTracker.dirtyRules in-place,
while ruleSyncTracker.getDirtyRules() returns the pointer of the set
which could be read by other goroutines and leads to a data race like
below:

WARNING: DATA RACE
Write at 0x00c000dd8180 by goroutine 276:
  runtime.mapdelete_faststr()
      /usr/local/go/src/runtime/map_faststr.go:301 +0x0
  k8s.io/apimachinery/pkg/util/sets.Set[go.shape.string].Delete()
      /root/go/pkg/mod/k8s.io/apimachinery@v0.26.4/pkg/util/sets/set.go:62 +0x2ae
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*ruleSyncTracker).Run()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:570 +0x1d2
  antrea.io/antrea/pkg/agent/controller/networkpolicy.(*fqdnController).runRuleSyncTracker()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn.go:584 +0x4a
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1.4()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:495 +0x44

Previous read at 0x00c000dd8180 by goroutine 271:
  reflect.maplen()
      /usr/local/go/src/runtime/map.go:1411 +0x0
  reflect.Value.lenNonSlice()
      /usr/local/go/src/reflect/value.go:1720 +0x324
  reflect.Value.Len()
      /usr/local/go/src/reflect/value.go:1709 +0x158f
  reflect.deepValueEqual()
      /usr/local/go/src/reflect/deepequal.go:139 +0x1571
  reflect.DeepEqual()
      /usr/local/go/src/reflect/deepequal.go:237 +0x38b
  github.com/stretchr/testify/assert.ObjectsAreEqual()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:65 +0x172
  github.com/stretchr/testify/assert.Equal()
      /root/go/pkg/mod/github.com/stretchr/testify@v1.8.4/assert/assertions.go:414 +0x1f7
  antrea.io/antrea/pkg/agent/controller/networkpolicy.TestSyncDirtyRules.func1()
      /root/antrea/pkg/agent/controller/networkpolicy/fqdn_test.go:517 +0xb6f
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x238
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Signed-off-by: Quan Tian <qtian@vmware.com>
@tnqn tnqn added the kind/bug Categorizes issue or PR as related to a bug. label Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/backport Indicates a PR that requires backports. kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants