Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: incorrect ordering on sync.RWMutex, possible optimization/defer bug #36864

Closed
james-lawrence opened this issue Jan 29, 2020 · 3 comments
Closed

Comments

@james-lawrence
Copy link
Contributor

@james-lawrence james-lawrence commented Jan 29, 2020

I just want to preface that I fully expect this to be user error, given that
this class of bug would cause people to scream from the rooftops.

But I'm completely baffled. so putting this out there in case someone else has seen something similar.

this happens far more frequently in tests than actually running the program.

What version of Go are you using (go version)?

1.13.7

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

GOBIN=""
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"

What did you do?

I couldn't reproduce on the playground.

git clone git@github.com:usertorrent.git debug
cd debug; git checkout debug
go test -v -run TestClientTransferRateLimitedUpload . 2>&1 | tee log.txt

What did you expect to see?

expected the test complete with a standard pass/fail.

What did you see instead?

a panic due to unlock of an unlocked mutex.

what is important to note here is the line:
2020/01/29 06:27:58 chunks.go:214: 0xc000169060 unlock initiated - 4

there is no preceding log message relating to the lock being acquired or the function being invoked.
both log messages are executed before the unlock of the function occurs.

this simply shouldn't be happening, and I'm completely baffled as to why.

things I've checked:

  1. the functions all work in isolation and generate the proper output.
  2. there are an even number of locks/unlocks for this particular lock, and the majority use defer to unlock.
  3. this lock is only used by the functions on the struct. and isn't accessed anywhere else.
  4. I switched to only using .Lock instead of a mix of Lock and RLock.
  5. I've attempted to disable inlining and escaping using magic comments which resulted in no change in output (not 100% sure i did this correctly)
  6. I've checked both go 1.12 and 1.13.
  7. occurs with or without the locking debug code.
  8. reproduces in the test frequently but not as common when running the application.
2020/01/29 06:27:58 chunks.go:501: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc000169060
2020/01/29 06:27:58 chunks.go:503: 0xc000169060 lock initiated - 1
2020/01/29 06:27:58 chunks.go:503: 0xc000169060 lock completed - 1
2020/01/29 06:27:58 chunks.go:506: 0xc000169060 unlock initiated - 1
2020/01/29 06:27:58 chunks.go:506: 0xc000169060 unlock completed - 1
2020/01/29 06:27:58 chunks.go:506: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc000169060
2020/01/29 06:27:58 chunks.go:501: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc000169060
2020/01/29 06:27:58 chunks.go:503: 0xc000169060 lock initiated - 2
2020/01/29 06:27:58 chunks.go:503: 0xc000169060 lock completed - 2
2020/01/29 06:27:58 chunks.go:506: 0xc000169060 unlock initiated - 2
2020/01/29 06:27:58 chunks.go:506: 0xc000169060 unlock completed - 2
2020/01/29 06:27:58 chunks.go:506: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc000169060
2020/01/29 06:27:58 chunks.go:501: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc000169060
2020/01/29 06:27:58 chunks.go:503: 0xc000169060 lock initiated - 3
2020/01/29 06:27:58 chunks.go:503: 0xc000169060 lock completed - 3
2020/01/29 06:27:58 chunks.go:506: 0xc000169060 unlock initiated - 3
2020/01/29 06:27:58 chunks.go:506: 0xc000169060 unlock completed - 3
2020/01/29 06:27:58 chunks.go:214: 0xc000169060 unlock initiated - 4
2020/01/29 06:27:58 chunks.go:506: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc000169060
2020/01/29 06:27:58 chunks.go:300: github.com/anacrolix/torrent.(*chunks).ChunksRelease - initiated: 0xc000169060
2020/01/29 06:27:58 chunks.go:302: 0xc000169060 lock initiated - 4
2020/01/29 06:27:58 chunks.go:501: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc000169060
2020/01/29 06:27:58 chunks.go:503: 0xc000169060 lock initiated - 5
goroutine 119 [running]:
runtime.throw(0xbcd5cf, 0x20)
	/usr/lib/go/src/runtime/panic.go:774 +0x72 fp=0xc0004a3798 sp=0xc0004a3768 pc=0x439e52
sync.throw(0xbcd5cf, 0x20)
	/usr/lib/go/src/runtime/panic.go:760 +0x35 fp=0xc0004a37b8 sp=0xc0004a3798 pc=0x439dd5
sync.(*RWMutex).Unlock(0xc000169060)
	/usr/lib/go/src/sync/rwmutex.go:129 +0xbf fp=0xc0004a37f0 sp=0xc0004a37b8 pc=0x487d5f
github.com/anacrolix/torrent.(*DebugLock).Unlock(0xc000430c60)
	/shared/user/development/torrent/debug.go:143 +0x168 fp=0xc0004a38c8 sp=0xc0004a37f0 pc=0x94bff8
github.com/anacrolix/torrent.(*chunks).ChunksAvailable(0xc00017c200, 0x1, 0x0)
	/shared/user/development/torrent/chunks.go:214 +0x383 fp=0xc0004a3a10 sp=0xc0004a38c8 pc=0x925ef3
github.com/anacrolix/torrent.(*torrent).pieceAllDirty(0xc0004e4380, 0x1, 0x61dee600)
	/shared/user/development/torrent/torrent.go:1017 +0x48 fp=0xc0004a3a48 sp=0xc0004a3a10 pc=0x960728
github.com/anacrolix/torrent.(*torrent).onIncompletePiece(0xc0004e4380, 0x1)
	/shared/user/development/torrent/torrent.go:1786 +0x39 fp=0xc0004a3a78 sp=0xc0004a3a48 pc=0x9673a9
github.com/anacrolix/torrent.(*torrent).pieceHashed(0xc0004e4380, 0x1, 0xc98840, 0xc000474700, 0x0, 0x0)
	/shared/user/development/torrent/torrent.go:1759 +0x7e1 fp=0xc0004a3cf0 sp=0xc0004a3a78 pc=0x966d71
github.com/anacrolix/torrent.(*Client).newTorrent.func2(0x1, 0xc98840, 0xc000474700)
	/shared/user/development/torrent/client.go:1033 +0x7e fp=0xc0004a3dd8 sp=0xc0004a3cf0 pc=0x99542e
github.com/anacrolix/torrent.(*digests).check(0xc0004e4678, 0x1)
	/shared/user/development/torrent/digests.go:70 +0x157 fp=0xc0004a3f70 sp=0xc0004a3dd8 pc=0x94c9e7
github.com/anacrolix/torrent.(*digests).verify.func1(0xc0004e4678)
	/shared/user/development/torrent/digests.go:50 +0x91 fp=0xc0004a3fd8 sp=0xc0004a3f70 pc=0x9974f1
runtime.goexit()
	/usr/lib/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc0004a3fe0 sp=0xc0004a3fd8 pc=0x4698b1
created by github.com/anacrolix/torrent.(*digests).verify
	/shared/user/development/torrent/digests.go:48 +0x95
2020/01/29 07:12:36 chunks.go:498: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc00017d120
2020/01/29 07:12:36 chunks.go:500: 0xc00017d120 lock initiated - 1
2020/01/29 07:12:36 chunks.go:500: 0xc00017d120 lock completed - 1
2020/01/29 07:12:36 chunks.go:503: 0xc00017d120 unlock initiated - 1
2020/01/29 07:12:36 chunks.go:503: 0xc00017d120 unlock completed - 1
2020/01/29 07:12:36 chunks.go:503: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc00017d120
2020/01/29 07:12:36 chunks.go:498: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc00017d120
2020/01/29 07:12:36 chunks.go:500: 0xc00017d120 lock initiated - 2
2020/01/29 07:12:36 chunks.go:500: 0xc00017d120 lock completed - 2
2020/01/29 07:12:36 chunks.go:503: 0xc00017d120 unlock initiated - 2
2020/01/29 07:12:36 chunks.go:503: 0xc00017d120 unlock completed - 2
2020/01/29 07:12:36 chunks.go:503: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc00017d120
2020/01/29 07:12:36 chunks.go:498: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc00017d120
2020/01/29 07:12:36 chunks.go:500: 0xc00017d120 lock initiated - 3
2020/01/29 07:12:36 chunks.go:500: 0xc00017d120 lock completed - 3
2020/01/29 07:12:36 chunks.go:503: 0xc00017d120 unlock initiated - 3
2020/01/29 07:12:36 chunks.go:503: 0xc00017d120 unlock completed - 3
2020/01/29 07:12:36 chunks.go:503: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc00017d120
2020/01/29 07:12:36 chunks.go:215: 0xc00017d120 unlock initiated - 4
fatal error: sync: Unlock of unlocked RWMutex

goroutine 15 [running]:
runtime.throw(0xba496f, 0x20)
	/usr/lib/go/src/runtime/panic.go:774 +0x72 fp=0xc00011fb58 sp=0xc00011fb28 pc=0x439e52
sync.throw(0xba496f, 0x20)
	/usr/lib/go/src/runtime/panic.go:760 +0x35 fp=0xc00011fb78 sp=0xc00011fb58 pc=0x439dd5
sync.(*RWMutex).Unlock(0xc00017d120)
	/usr/lib/go/src/sync/rwmutex.go:129 +0xbf fp=0xc00011fbb0 sp=0xc00011fb78 pc=0x487b8f
github.com/anacrolix/torrent.(*DebugLock).Unlock(0xc0001fe720)
	/shared/user/development/torrent/debug.go:143 +0xf1 fp=0xc00011fc20 sp=0xc00011fbb0 pc=0x93f7b1
github.com/anacrolix/torrent.(*chunks).ChunksAvailable(0xc00047e480, 0x0, 0x0)
	/shared/user/development/torrent/chunks.go:215 +0x292 fp=0xc00011fd08 sp=0xc00011fc20 pc=0x922242
github.com/anacrolix/torrent.(*torrent).pieceAllDirty(...)
	/shared/user/development/torrent/torrent.go:1017
github.com/anacrolix/torrent.(*torrent).onIncompletePiece(0xc000124700, 0x0)
	/shared/user/development/torrent/torrent.go:1786 +0x40 fp=0xc00011fd30 sp=0xc00011fd08 pc=0x952730
github.com/anacrolix/torrent.(*torrent).pieceHashed(0xc000124700, 0x0, 0xc6d6a0, 0xc0000be900, 0x0, 0x0)
	/shared/user/development/torrent/torrent.go:1759 +0x324 fp=0xc00011fe58 sp=0xc00011fd30 pc=0x951f04
github.com/anacrolix/torrent.(*Client).newTorrent.func2(0x0, 0xc6d6a0, 0xc0000be900)
	/shared/user/development/torrent/client.go:1033 +0x5d fp=0xc00011feb8 sp=0xc00011fe58 pc=0x974eed
github.com/anacrolix/torrent.(*digests).check(0xc0001249f8, 0x0)
	/shared/user/development/torrent/digests.go:70 +0x404 fp=0xc00011ffa8 sp=0xc00011feb8 pc=0x940134
github.com/anacrolix/torrent.(*digests).verify.func1(0xc0001249f8)
	/shared/user/development/torrent/digests.go:50 +0x5b fp=0xc00011ffd8 sp=0xc00011ffa8 pc=0x97644b
runtime.goexit()
	/usr/lib/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc00011ffe0 sp=0xc00011ffd8 pc=0x4698b1
created by github.com/anacrolix/torrent.(*digests).verify
	/shared/user/development/torrent/digests.go:48 +0x70

example using synx.RWMutex directly without debugging code.

2020/01/29 07:20:00 chunks.go:498: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc000458b80
2020/01/29 07:20:00 chunks.go:503: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc000458b80
2020/01/29 07:20:00 chunks.go:498: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc000458b80
2020/01/29 07:20:00 chunks.go:503: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc000458b80
2020/01/29 07:20:00 chunks.go:498: github.com/anacrolix/torrent.(*chunks).Missing - initiated: 0xc000458b80
2020/01/29 07:20:00 chunks.go:503: github.com/anacrolix/torrent.(*chunks).Missing - completed: 0xc000458b80
fatal error: sync: Unlock of unlocked RWMutex
2020/01/29 07:20:00 chunks.go:281: github.com/anacrolix/torrent.(*chunks).ChunksPend - initiated: 0xc000458b80

goroutine 16 [running]:
runtime.throw(0xba3733, 0x20)
	/usr/lib/go/src/runtime/panic.go:774 +0x72 fp=0xc0001dfb98 sp=0xc0001dfb68 pc=0x439e52
sync.throw(0xba3733, 0x20)
	/usr/lib/go/src/runtime/panic.go:760 +0x35 fp=0xc0001dfbb8 sp=0xc0001dfb98 pc=0x439dd5
sync.(*RWMutex).Unlock(0xc000458b80)
	/usr/lib/go/src/sync/rwmutex.go:129 +0xbf fp=0xc0001dfbf0 sp=0xc0001dfbb8 pc=0x487b8f
github.com/anacrolix/torrent.(*chunks).ChunksComplete(0xc00013a540, 0x1, 0xc000458800)
	/shared/user/development/torrent/chunks.go:240 +0x1c9 fp=0xc0001dfcc0 sp=0xc0001dfbf0 pc=0x922619
github.com/anacrolix/torrent.(*Piece).queuedForHash(...)
	/shared/user/development/torrent/piece.go:210
github.com/anacrolix/torrent.(*torrent).wantPieceIndex(0xc0000ba700, 0x1, 0x0)
	/shared/user/development/torrent/torrent.go:946 +0xbb fp=0xc0001dfd08 sp=0xc0001dfcc0 pc=0x94c11b
github.com/anacrolix/torrent.(*torrent).onIncompletePiece(0xc0000ba700, 0x1)
	/shared/user/development/torrent/torrent.go:1789 +0x92 fp=0xc0001dfd30 sp=0xc0001dfd08 pc=0x9515d2
github.com/anacrolix/torrent.(*torrent).pieceHashed(0xc0000ba700, 0x1, 0xc6c2e0, 0xc0000f6a20, 0x0, 0x1)
	/shared/user/development/torrent/torrent.go:1758 +0x324 fp=0xc0001dfe58 sp=0xc0001dfd30 pc=0x950d54
github.com/anacrolix/torrent.(*Client).newTorrent.func2(0x1, 0xc6c2e0, 0xc0000f6a20)
	/shared/user/development/torrent/client.go:1033 +0x5d fp=0xc0001dfeb8 sp=0xc0001dfe58 pc=0x973d3d
github.com/anacrolix/torrent.(*digests).check(0xc0000ba9f8, 0x1)
	/shared/user/development/torrent/digests.go:70 +0x404 fp=0xc0001dffa8 sp=0xc0001dfeb8 pc=0x93ef94
github.com/anacrolix/torrent.(*digests).verify.func1(0xc0000ba9f8)
	/shared/user/development/torrent/digests.go:50 +0x5b fp=0xc0001dffd8 sp=0xc0001dffa8 pc=0x97529b
runtime.goexit()
	/usr/lib/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc0001dffe0 sp=0xc0001dffd8 pc=0x4698b1
created by github.com/anacrolix/torrent.(*digests).verify
@ALTree
Copy link
Member

@ALTree ALTree commented Jan 29, 2020

Where is the reproducer? The

git clone git@github.com:usertorrent.git debug

line doesn't look correct. Also does running go test with -race print any race detector warning?

@ALTree ALTree changed the title incorrect ordering on sync.RWMutex, possible optimization/defer bug runtime: incorrect ordering on sync.RWMutex, possible optimization/defer bug Jan 29, 2020
@mibes
Copy link

@mibes mibes commented Jan 29, 2020

I suppose @james-lawrence is referring to this one: https://github.com/james-lawrence/torrent.git

@ALTree
Copy link
Member

@ALTree ALTree commented Jan 29, 2020

To answer my own question: if the reproducer is the one at james-lawrence/torrent, it has indeed a data race.

I ran:

go test -race -v -run TestClientTransferRateLimitedUpload .

and the output had this in it (among other things):

==================
WARNING: DATA RACE
Write at 0x00c0002f87a8 by goroutine 31:
  github.com/anacrolix/torrent.(*torrent).pieceHashed()
      /home/alberto/debug/torrent.go:1723 +0x320
  github.com/anacrolix/torrent.(*Client).newTorrent.func2()
      /home/alberto/debug/client.go:1033 +0x8b
  github.com/anacrolix/torrent.(*digests).check()
      /home/alberto/debug/digests.go:79 +0x173
  github.com/anacrolix/torrent.(*digests).verify.func1()
      /home/alberto/debug/digests.go:50 +0x8a

Previous read at 0x00c0002f87a8 by goroutine 14:
  github.com/anacrolix/torrent.(*Piece).VerifyData()
      /home/alberto/debug/piece.go:201 +0x1ed
  github.com/anacrolix/torrent.(*torrent).VerifyData()
      /home/alberto/debug/torrent.go:1805 +0x43
  github.com/anacrolix/torrent.testClientTransfer()
      /home/alberto/debug/client_test.go:340 +0x430
  github.com/anacrolix/torrent.TestClientTransferRateLimitedUpload()
      /home/alberto/debug/client_test.go:198 +0x19c
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199

Goroutine 31 (running) created at:
  github.com/anacrolix/torrent.(*digests).verify()
      /home/alberto/debug/digests.go:48 +0xb2
  github.com/anacrolix/torrent.(*digests).Enqueue()
      /home/alberto/debug/digests.go:39 +0x7a
  github.com/anacrolix/torrent.(*Piece).VerifyData()
      /home/alberto/debug/piece.go:198 +0x179
  github.com/anacrolix/torrent.(*torrent).VerifyData()
      /home/alberto/debug/torrent.go:1805 +0x43
  github.com/anacrolix/torrent.testClientTransfer()
      /home/alberto/debug/client_test.go:340 +0x430
  github.com/anacrolix/torrent.TestClientTransferRateLimitedUpload()
      /home/alberto/debug/client_test.go:198 +0x19c
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199

Goroutine 14 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:960 +0x651
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1202 +0xa6
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1200 +0x521
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1117 +0x2ff
  github.com/anacrolix/torrent.TestMain()
      /home/alberto/debug/main_test.go:32 +0x38
  main.main()
      _testmain.go:194 +0x223
==================
2020/01/29 19:34:59 chunks.go:238: 0xc00041c100 lock initiated - 36
==================
WARNING: DATA RACE
Read at 0x00c0002f87a8 by goroutine 14:
  github.com/anacrolix/torrent.(*Piece).VerifyData()
      /home/alberto/debug/piece.go:201 +0x1ed
  github.com/anacrolix/torrent.(*torrent).VerifyData()
      /home/alberto/debug/torrent.go:1805 +0x43
  github.com/anacrolix/torrent.testClientTransfer()
      /home/alberto/debug/client_test.go:340 +0x430
  github.com/anacrolix/torrent.TestClientTransferRateLimitedUpload()
      /home/alberto/debug/client_test.go:198 +0x19c
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199

Previous write at 0x00c0002f87a8 by goroutine 31:
  github.com/anacrolix/torrent.(*torrent).pieceHashed()
      /home/alberto/debug/torrent.go:1723 +0x320
  github.com/anacrolix/torrent.(*Client).newTorrent.func2()
      /home/alberto/debug/client.go:1033 +0x8b
  github.com/anacrolix/torrent.(*digests).check()
      /home/alberto/debug/digests.go:79 +0x173
  github.com/anacrolix/torrent.(*digests).verify.func1()
      /home/alberto/debug/digests.go:50 +0x8a

Goroutine 14 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:960 +0x651
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1202 +0xa6
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1200 +0x521
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1117 +0x2ff
  github.com/anacrolix/torrent.TestMain()
      /home/alberto/debug/main_test.go:32 +0x38
  main.main()
      _testmain.go:194 +0x223

Goroutine 31 (running) created at:
  github.com/anacrolix/torrent.(*digests).verify()
      /home/alberto/debug/digests.go:48 +0xb2
  github.com/anacrolix/torrent.(*digests).Enqueue()
      /home/alberto/debug/digests.go:39 +0x7a
  github.com/anacrolix/torrent.(*Piece).VerifyData()
      /home/alberto/debug/piece.go:198 +0x179
  github.com/anacrolix/torrent.(*torrent).VerifyData()
      /home/alberto/debug/torrent.go:1805 +0x43
  github.com/anacrolix/torrent.testClientTransfer()
      /home/alberto/debug/client_test.go:340 +0x430
  github.com/anacrolix/torrent.TestClientTransferRateLimitedUpload()
      /home/alberto/debug/client_test.go:198 +0x19c
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199
==================

We cannot accept bug reports with racy reproducers (a program with a data race is not a valid Go program); so I'm closing this issue.

If you can still reproduce the crash after you've fixed all the data-races, please comment and we'll re-open. Thanks.

@ALTree ALTree closed this Jan 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants
You can’t perform that action at this time.