Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rollkit/p2p: unit test failure with data race flag #910

Closed
Bidon15 opened this issue Apr 27, 2023 · 5 comments · Fixed by #916
Closed

rollkit/p2p: unit test failure with data race flag #910

Bidon15 opened this issue Apr 27, 2023 · 5 comments · Fixed by #916
Assignees
Labels
P:high Priority: High T:bug Something isn't working

Comments

@Bidon15
Copy link
Member

Bidon15 commented Apr 27, 2023

Found during rebasing #903

Reproduction rate: seldom

==================
WARNING: DATA RACE
Read at 0x00c0058a9d83 by goroutine 2931:
  testing.(*common).logDepth()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:992 +0xc4
  testing.(*common).log()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:985 +0x84
  testing.(*common).Log()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:1026 +0x5b
  github.com/rollkit/rollkit/log/test.(*Logger).Error()
      /home/runner/work/rollkit/rollkit/log/test/loggers.go:46 +0x216
  github.com/rollkit/rollkit/p2p.(*Gossiper).ProcessMessages()
      /home/runner/work/rollkit/rollkit/p2p/gossip.go:96 +0x16c
  github.com/rollkit/rollkit/p2p.(*Client).setupGossiping.func1()
      /home/runner/work/rollkit/rollkit/p2p/client.go:389 +0x58

Previous write at 0x00c0058a9d83 by goroutine 2682:
  testing.tRunner.func1()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:1563 +0x82d
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.20.3/x64/src/runtime/panic.go:476 +0x32
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:1629 +0x47

Goroutine 2931 (running) created at:
  github.com/rollkit/rollkit/p2p.(*Client).setupGossiping()
      /home/runner/work/rollkit/rollkit/p2p/client.go:389 +0x3e9
  github.com/rollkit/rollkit/p2p.(*Client).startWithHost()
      /home/runner/work/rollkit/rollkit/p2p/client.go:140 +0x6d6
  github.com/rollkit/rollkit/p2p.startTestNetwork()
      /home/runner/work/rollkit/rollkit/p2p/utils_test.go:117 +0xb84
  github.com/rollkit/rollkit/p2p.TestGossiping()
      /home/runner/work/rollkit/rollkit/p2p/client_test.go:133 +0x9ba
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:1576 +0x216
  testing.(*T).Run.func1()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:1629 +0x47

Goroutine 2682 (running) created at:
  testing.(*T).Run()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:1629 +0x805
  testing.runTests.func1()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:2036 +0x8d
  testing.tRunner()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:1576 +0x216
  testing.runTests()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:2034 +0x87c
  testing.(*M).Run()
      /opt/hostedtoolcache/go/1.20.3/x64/src/testing/testing.go:1906 +0xb44
  main.main()
      _testmain.go:55 +0x2e9
==================
FAIL
FAIL	github.com/rollkit/rollkit/p2p	2.162s
@Bidon15 Bidon15 added the T:bug Something isn't working label Apr 27, 2023
@nashqueue nashqueue added the P:high Priority: High label Apr 27, 2023
@nashqueue
Copy link
Member

Where did this CI job fail?

@smuu
Copy link
Contributor

smuu commented Apr 27, 2023

Where did this CI job fail?

https://github.com/rollkit/rollkit/actions/runs/4820708549/jobs/8585581264

Agreed with @gupadhyaya that he will fix that issue and then create a patch release.

@gupadhyaya
Copy link
Member

not able to reproduce this locally using go test -run ./...
also tried upgrading my go to v1.20.3, but no luck.
also, currently the ci/cd is failing for different reason (shown below): @smuu
Screenshot 2023-04-28 at 12 35 04 PM

@gupadhyaya
Copy link
Member

@smuu not seeing the data race issue recently. for example, after re-run of the jobs, no longer data race in: #891

we still have the bind: address already in use failure as stated above.

@Manav-Aggarwal
Copy link
Member

Manav-Aggarwal commented May 5, 2023

Took a look at this randomly while working on #916. Was able to reproduce it locally using go test -race ./p2p flag. Found the same issue open in golang/go: golang/go#40343.

So the problem is that we're passing in a logger associated with a *testing.T object, t, to a goroutine spawned within that test, t. However, the test can exit but the goroutine can keep running so when the goroutine tries to log, it sees that the test is already done, can't access logger associated to it and fails. So a simple fix here is to just use a different logger for goroutines spawned within the test. Added that fix in #916 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P:high Priority: High T:bug Something isn't working
Projects
None yet
5 participants