Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: fatal error: timer data corruption #60050

Closed
prattmic opened this issue May 8, 2023 · 6 comments
Closed

runtime: fatal error: timer data corruption #60050

prattmic opened this issue May 8, 2023 · 6 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@prattmic
Copy link
Member

prattmic commented May 8, 2023

#!watchflakes
post <- pkg == "runtime" && log ~ `fatal error: timer data corruption|fatal error: runtimer: bad p`

Existing failure from #55308 (comment): https://build.golang.org/log/83ab44e57523849b0c1ce7a9c63644943c54fbfd

fatal error: timer data corruption

runtime stack:
runtime.throw({0xa4296a?, 0x0?})
	/workdir/go/src/runtime/panic.go:1077 +0x5c fp=0x7f531fffeb78 sp=0x7f531fffeb48 pc=0x47169c
runtime.badTimer()
	/workdir/go/src/runtime/time.go:1143 +0x1f fp=0x7f531fffeb98 sp=0x7f531fffeb78 pc=0x495b3f
runtime.siftdownTimer({0xc0001ca100, 0x10, 0x0?}, 0x0)
	/workdir/go/src/runtime/time.go:1102 +0x65 fp=0x7f531fffebc0 sp=0x7f531fffeb98 pc=0x495945
runtime.dodeltimer0(0xc000038500)
	/workdir/go/src/runtime/time.go:426 +0xe6 fp=0x7f531fffebf0 sp=0x7f531fffebc0 pc=0x4940c6
runtime.runtimer(0xc000038500, 0x1c9cea19f9)
	/workdir/go/src/runtime/time.go:813 +0x177 fp=0x7f531fffec20 sp=0x7f531fffebf0 pc=0x494f57
runtime.checkTimers(0xc000038500, 0x7a?)
	/workdir/go/src/runtime/proc.go:3475 +0x11b fp=0x7f531fffec68 sp=0x7f531fffec20 pc=0x47a9bb
runtime.stealWork(0xc000036000?)
	/workdir/go/src/runtime/proc.go:3087 +0x1b2 fp=0x7f531fffece8 sp=0x7f531fffec68 pc=0x479d32
runtime.findRunnable()
	/workdir/go/src/runtime/proc.go:2796 +0x32a fp=0x7f531fffedf0 sp=0x7f531fffece8 pc=0x47920a
runtime.schedule()
	/workdir/go/src/runtime/proc.go:3376 +0xb1 fp=0x7f531fffee28 sp=0x7f531fffedf0 pc=0x47a6f1
runtime.goexit0(0xc000505a00)
	/workdir/go/src/runtime/proc.go:3703 +0x266 fp=0x7f531fffee68 sp=0x7f531fffee28 pc=0x47b606
runtime.mcall()
	/workdir/go/src/runtime/asm_amd64.s:458 +0x4e fp=0x7f531fffee80 sp=0x7f531fffee68 pc=0x4af98e
@prattmic prattmic added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 8, 2023
@prattmic prattmic added this to the Backlog milestone May 8, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 8, 2023
@prattmic
Copy link
Member Author

prattmic commented May 8, 2023

One more result from greplogs. Failing in approximately the same place:

2023-05-05T19:20:01-f90bb8a/linux-amd64-race
2023-04-18T15:42:18-56c4422/freebsd-amd64-race

@mknyszek
Copy link
Contributor

In triage, we'll wait for more failures.

@mknyszek mknyszek added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label May 10, 2023
@bcmills
Copy link
Member

bcmills commented May 17, 2023

In triage, we'll wait for more failures.

Note that you have to add the issue to the Test Flakes project for watchflakes to triage it. 😅

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "runtime" && log ~ `fatal error: timer data corruption`
2023-04-18 15:42 freebsd-amd64-race go@56c44227 runtime (log)
goos: freebsd
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
BenchmarkCallers/cached-4                 	   69872	      1704 ns/op
BenchmarkCallers/inlined-4                	   54068	      2193 ns/op
BenchmarkCallers/no-cache-4               	   10000	     10040 ns/op
BenchmarkFPCallers/cached-4               	 2720203	        43.58 ns/op
BenchmarkMakeChan/Byte-4                  	 1129864	       109.9 ns/op
BenchmarkMakeChan/Int-4                   	 1000000	       138.4 ns/op
...
runtime.goexit0(0xc0004e9040)
	/tmp/workdir/go/src/runtime/proc.go:3685 +0x266 fp=0x7fffdf9faf38 sp=0x7fffdf9faef8 pc=0x4983e6
runtime.mcall()
	/tmp/workdir/go/src/runtime/asm_amd64.s:458 +0x4e fp=0x7fffdf9faf50 sp=0x7fffdf9faf38 pc=0x4cba4e

goroutine 3527 [running]:
	goroutine running on other thread; stack unavailable
created by testing.(*B).doBench in goroutine 1
	/tmp/workdir/go/src/testing/benchmark.go:284 +0xa5
exit status 2
2023-05-05 19:20 linux-amd64-race go@f90bb8a3 runtime (log)
goos: linux
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
BenchmarkCallers/cached-4                 	   69576	      1721 ns/op
BenchmarkCallers/inlined-4                	   52701	      2280 ns/op
BenchmarkCallers/no-cache-4               	   10000	     10032 ns/op
BenchmarkFPCallers/cached-4               	 2843337	        42.79 ns/op
BenchmarkMakeChan/Byte-4                  	 1186437	        92.75 ns/op
BenchmarkMakeChan/Int-4                   	  938323	       132.2 ns/op
...
runtime.goexit0(0xc000505a00)
	/workdir/go/src/runtime/proc.go:3703 +0x266 fp=0x7f531fffee68 sp=0x7f531fffee28 pc=0x47b606
runtime.mcall()
	/workdir/go/src/runtime/asm_amd64.s:458 +0x4e fp=0x7f531fffee80 sp=0x7f531fffee68 pc=0x4af98e

goroutine 4549 [running]:
	goroutine running on other thread; stack unavailable
created by testing.(*B).doBench in goroutine 1
	/workdir/go/src/testing/benchmark.go:284 +0xa5
exit status 2

watchflakes

@aclements
Copy link
Member

This trybot failure is very likely to be the same root cause: https://storage.googleapis.com/go-build-log/345b5fc2/linux-amd64-race_40fbb72b.log. It's in the same benchmark as the other failures, and it's much more likely the timer.pp got stomped than runtimer moved to another P.

@gopherbot
Copy link

Change https://go.dev/cl/497215 mentions this issue: runtime: fix race in BenchmarkSetType* benchmarks

@mknyszek mknyszek added NeedsFix The path to resolution is known, but the work has not been done. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels May 24, 2023
@mknyszek mknyszek self-assigned this May 24, 2023
Nasfame pushed a commit to golangFame/go that referenced this issue Jun 4, 2023
Currently the BenchmarkSetType* benchmarks are racy: they call
heapBitsSetType on an allocation that might be in a span in-use for
allocation on another P. Because heap bits are bits but are written
byte-wise non-atomically (because a P assumes it has total ownership of
a span's bits), two threads can race writing the same heap bitmap byte
creating incorrect metadata.

Fix this by forcing every value we're writing heap bits for into a large
object. Large object spans will never be written to concurrently unless
they're freed first.

Also, while we're here, refactor the benchmarks a bit. Use generics to
eliminate the reflect nastiness in gc_test.go, and pass b.ResetTimer
down into the test to get slightly more accurate results.

Fixes golang#60050.

Change-Id: Ib7d6249b321963367c8c8ca88385386c8ae9af1c
Reviewed-on: https://go-review.googlesource.com/c/go/+/497215
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
Status: Done
Status: Done
Development

No branches or pull requests

5 participants