Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

beacon-chain/sync: fix flaky segfault in tests #7822

Merged

Conversation

james-rms
Copy link
Contributor

@james-rms james-rms commented Nov 16, 2020

What type of PR is this?
Bug fix

What does this PR do? Why is it needed?
Observed this segfault running all tests on master, occurring
in around 2-3 out of 10 test runs.

FAIL: //beacon-chain/sync:go_default_test (shard 3 of 4, run 1 of 10) (see /home/j/.cache/bazel/_bazel_j/1ba834ca9d49f27aeb8f0bbb6f28fdf3/execroot/prysm/bazel-out/k8-fastbuild/testlogs/beacon-chain/sync/go_default_test/shard_3_of_4_run_1_of_10/test.log)
INFO: From Testing //beacon-chain/sync:go_default_test (shard 3 of 4, run 1 of 10):
==================== Test output for //beacon-chain/sync:go_default_test (shard 3 of 4, run 1 of 10):
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x138eea6]

goroutine 1660 [running]:
github.com/prysmaticlabs/prysm/shared/abool.(*AtomicBool).IsSet(...)
	shared/abool/abool.go:39
github.com/prysmaticlabs/prysm/beacon-chain/sync.(*Service).subscribeStaticWithSubnets.func1(0xc002dd4400, 0xc002990940, 0x17bca26, 0x1e)
	beacon-chain/sync/subscriber.go:207 +0xe6
created by github.com/prysmaticlabs/prysm/beacon-chain/sync.(*Service).subscribeStaticWithSubnets
	beacon-chain/sync/subscriber.go:200 +0x172
================================================================================

TestStaticSubnets was testing a Service with an uninitialized
chainStarted value. This commit initializes chainStarted explicitly
in all tests that construct a Service. This reduces the observed flake
rate to 0/10 runs. This was verified with:

./bazel.sh test //beacon-chain/sync:go_default_test --runs_per_test 10

Which issues(s) does this PR fix?

N/A

Other notes for review

Observed this segfault running all tests on mater, occurring
in around 2-3 out of 10 test runs.

```
FAIL: //beacon-chain/sync:go_default_test (shard 3 of 4, run 1 of 10) (see /home/j/.cache/bazel/_bazel_j/1ba834ca9d49f27aeb8f0bbb6f28fdf3/execroot/prysm/bazel-out/k8-fastbuild/testlogs/beacon-chain/sync/go_default_test/shard_3_of_4_run_1_of_10/test.log)
INFO: From Testing //beacon-chain/sync:go_default_test (shard 3 of 4, run 1 of 10):
==================== Test output for //beacon-chain/sync:go_default_test (shard 3 of 4, run 1 of 10):
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x138eea6]

goroutine 1660 [running]:
github.com/prysmaticlabs/prysm/shared/abool.(*AtomicBool).IsSet(...)
	shared/abool/abool.go:39
github.com/prysmaticlabs/prysm/beacon-chain/sync.(*Service).subscribeStaticWithSubnets.func1(0xc002dd4400, 0xc002990940, 0x17bca26, 0x1e)
	beacon-chain/sync/subscriber.go:207 +0xe6
created by github.com/prysmaticlabs/prysm/beacon-chain/sync.(*Service).subscribeStaticWithSubnets
	beacon-chain/sync/subscriber.go:200 +0x172
================================================================================
```

TestStaticSubnets was testing a Service with an uninitialized
chainStarted value. This commit initializes chainStarted explicitly
in all tests that construct a Service. This reduces the observed flake
rate to 0/10 runs. This was verified with:

```
./bazel.sh test //beacon-chain/sync:go_default_test --runs_per_test 10
```
@CLAassistant
Copy link

CLAassistant commented Nov 16, 2020

CLA assistant check
All committers have signed the CLA.

@james-rms james-rms changed the title beacon-chain: fix segfault beacon-chain/sync: fix flaky segfault in tests Nov 16, 2020
@rkapka
Copy link
Contributor

rkapka commented Nov 16, 2020

Thank you for the contribution!

@rkapka rkapka merged commit 758ec96 into prysmaticlabs:master Nov 16, 2020
@james-rms james-rms deleted the jrms/beacon-chain-segfault-fix branch November 17, 2020 03:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants