Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: fatal error: timer data corruption #60050

Open
prattmic opened this issue May 8, 2023 · 11 comments
Open

runtime: fatal error: timer data corruption #60050

prattmic opened this issue May 8, 2023 · 11 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@prattmic
Copy link
Member

prattmic commented May 8, 2023

#!watchflakes
post <- pkg == "runtime" && log ~ `fatal error: timer data corruption|fatal error: runtimer: bad p`

Existing failure from #55308 (comment): https://build.golang.org/log/83ab44e57523849b0c1ce7a9c63644943c54fbfd

fatal error: timer data corruption

runtime stack:
runtime.throw({0xa4296a?, 0x0?})
	/workdir/go/src/runtime/panic.go:1077 +0x5c fp=0x7f531fffeb78 sp=0x7f531fffeb48 pc=0x47169c
runtime.badTimer()
	/workdir/go/src/runtime/time.go:1143 +0x1f fp=0x7f531fffeb98 sp=0x7f531fffeb78 pc=0x495b3f
runtime.siftdownTimer({0xc0001ca100, 0x10, 0x0?}, 0x0)
	/workdir/go/src/runtime/time.go:1102 +0x65 fp=0x7f531fffebc0 sp=0x7f531fffeb98 pc=0x495945
runtime.dodeltimer0(0xc000038500)
	/workdir/go/src/runtime/time.go:426 +0xe6 fp=0x7f531fffebf0 sp=0x7f531fffebc0 pc=0x4940c6
runtime.runtimer(0xc000038500, 0x1c9cea19f9)
	/workdir/go/src/runtime/time.go:813 +0x177 fp=0x7f531fffec20 sp=0x7f531fffebf0 pc=0x494f57
runtime.checkTimers(0xc000038500, 0x7a?)
	/workdir/go/src/runtime/proc.go:3475 +0x11b fp=0x7f531fffec68 sp=0x7f531fffec20 pc=0x47a9bb
runtime.stealWork(0xc000036000?)
	/workdir/go/src/runtime/proc.go:3087 +0x1b2 fp=0x7f531fffece8 sp=0x7f531fffec68 pc=0x479d32
runtime.findRunnable()
	/workdir/go/src/runtime/proc.go:2796 +0x32a fp=0x7f531fffedf0 sp=0x7f531fffece8 pc=0x47920a
runtime.schedule()
	/workdir/go/src/runtime/proc.go:3376 +0xb1 fp=0x7f531fffee28 sp=0x7f531fffedf0 pc=0x47a6f1
runtime.goexit0(0xc000505a00)
	/workdir/go/src/runtime/proc.go:3703 +0x266 fp=0x7f531fffee68 sp=0x7f531fffee28 pc=0x47b606
runtime.mcall()
	/workdir/go/src/runtime/asm_amd64.s:458 +0x4e fp=0x7f531fffee80 sp=0x7f531fffee68 pc=0x4af98e
@prattmic prattmic added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 8, 2023
@prattmic prattmic added this to the Backlog milestone May 8, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label May 8, 2023
@prattmic
Copy link
Member Author

prattmic commented May 8, 2023

One more result from greplogs. Failing in approximately the same place:

2023-05-05T19:20:01-f90bb8a/linux-amd64-race
2023-04-18T15:42:18-56c4422/freebsd-amd64-race

@mknyszek
Copy link
Contributor

In triage, we'll wait for more failures.

@mknyszek mknyszek added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label May 10, 2023
@bcmills
Copy link
Contributor

bcmills commented May 17, 2023

In triage, we'll wait for more failures.

Note that you have to add the issue to the Test Flakes project for watchflakes to triage it. 😅

@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "runtime" && log ~ `fatal error: timer data corruption`
2023-04-18 15:42 freebsd-amd64-race go@56c44227 runtime (log)
goos: freebsd
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
BenchmarkCallers/cached-4                 	   69872	      1704 ns/op
BenchmarkCallers/inlined-4                	   54068	      2193 ns/op
BenchmarkCallers/no-cache-4               	   10000	     10040 ns/op
BenchmarkFPCallers/cached-4               	 2720203	        43.58 ns/op
BenchmarkMakeChan/Byte-4                  	 1129864	       109.9 ns/op
BenchmarkMakeChan/Int-4                   	 1000000	       138.4 ns/op
...
runtime.goexit0(0xc0004e9040)
	/tmp/workdir/go/src/runtime/proc.go:3685 +0x266 fp=0x7fffdf9faf38 sp=0x7fffdf9faef8 pc=0x4983e6
runtime.mcall()
	/tmp/workdir/go/src/runtime/asm_amd64.s:458 +0x4e fp=0x7fffdf9faf50 sp=0x7fffdf9faf38 pc=0x4cba4e

goroutine 3527 [running]:
	goroutine running on other thread; stack unavailable
created by testing.(*B).doBench in goroutine 1
	/tmp/workdir/go/src/testing/benchmark.go:284 +0xa5
exit status 2
2023-05-05 19:20 linux-amd64-race go@f90bb8a3 runtime (log)
goos: linux
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
BenchmarkCallers/cached-4                 	   69576	      1721 ns/op
BenchmarkCallers/inlined-4                	   52701	      2280 ns/op
BenchmarkCallers/no-cache-4               	   10000	     10032 ns/op
BenchmarkFPCallers/cached-4               	 2843337	        42.79 ns/op
BenchmarkMakeChan/Byte-4                  	 1186437	        92.75 ns/op
BenchmarkMakeChan/Int-4                   	  938323	       132.2 ns/op
...
runtime.goexit0(0xc000505a00)
	/workdir/go/src/runtime/proc.go:3703 +0x266 fp=0x7f531fffee68 sp=0x7f531fffee28 pc=0x47b606
runtime.mcall()
	/workdir/go/src/runtime/asm_amd64.s:458 +0x4e fp=0x7f531fffee80 sp=0x7f531fffee68 pc=0x4af98e

goroutine 4549 [running]:
	goroutine running on other thread; stack unavailable
created by testing.(*B).doBench in goroutine 1
	/workdir/go/src/testing/benchmark.go:284 +0xa5
exit status 2

watchflakes

@aclements
Copy link
Member

This trybot failure is very likely to be the same root cause: https://storage.googleapis.com/go-build-log/345b5fc2/linux-amd64-race_40fbb72b.log. It's in the same benchmark as the other failures, and it's much more likely the timer.pp got stomped than runtimer moved to another P.

@gopherbot
Copy link

Change https://go.dev/cl/497215 mentions this issue: runtime: fix race in BenchmarkSetType* benchmarks

@mknyszek mknyszek added NeedsFix The path to resolution is known, but the work has not been done. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels May 24, 2023
@mknyszek mknyszek self-assigned this May 24, 2023
Nasfame pushed a commit to golangFame/go that referenced this issue Jun 4, 2023
Currently the BenchmarkSetType* benchmarks are racy: they call
heapBitsSetType on an allocation that might be in a span in-use for
allocation on another P. Because heap bits are bits but are written
byte-wise non-atomically (because a P assumes it has total ownership of
a span's bits), two threads can race writing the same heap bitmap byte
creating incorrect metadata.

Fix this by forcing every value we're writing heap bits for into a large
object. Large object spans will never be written to concurrently unless
they're freed first.

Also, while we're here, refactor the benchmarks a bit. Use generics to
eliminate the reflect nastiness in gc_test.go, and pass b.ResetTimer
down into the test to get slightly more accurate results.

Fixes golang#60050.

Change-Id: Ib7d6249b321963367c8c8ca88385386c8ae9af1c
Reviewed-on: https://go-review.googlesource.com/c/go/+/497215
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Michael Knyszek <mknyszek@google.com>
Auto-Submit: Michael Knyszek <mknyszek@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot gopherbot reopened this Aug 18, 2023
@gopherbot
Copy link

Found new dashboard test flakes for:

#!watchflakes
post <- pkg == "runtime" && log ~ `fatal error: timer data corruption|fatal error: runtimer: bad p`
2023-08-16 17:16 darwin-amd64-race go@095cfba8 runtime (log)
goos: darwin
goarch: amd64
pkg: runtime
BenchmarkMakeChan/Byte-4                  	  895909	       224.3 ns/op
BenchmarkMakeChan/Int-4                   	 1000000	       324.4 ns/op
BenchmarkMakeChan/Ptr-4                   	  699759	       383.7 ns/op
BenchmarkMakeChan/Struct/0-4              	 1000000	       221.6 ns/op
BenchmarkMakeChan/Struct/32-4             	  237309	       688.2 ns/op
BenchmarkMakeChan/Struct/40-4             	  373726	       927.1 ns/op
BenchmarkChanNonblocking-4                	 7272085	        15.62 ns/op
...
runtime.park_m(0xc000006d00?)
	/tmp/buildlet/go/src/runtime/proc.go:3529 +0x12d fp=0x700005289f50 sp=0x700005289f20 pc=0x109644d
runtime.mcall()
	/tmp/buildlet/go/src/runtime/asm_amd64.s:452 +0x43 fp=0x700005289f60 sp=0x700005289f50 pc=0x10cc2c3

goroutine 4591 [running]:
	goroutine running on other thread; stack unavailable
created by testing.(*B).doBench
	/tmp/buildlet/go/src/testing/benchmark.go:284 +0xa5
exit status 2

watchflakes

@bcmills
Copy link
Contributor

bcmills commented Aug 23, 2023

Note that commit 095cfba is on release-branch.go1.20 (#59483). This was fixed in 1.21?

(Would it make sense to backport the fix?)

@ruyi789
Copy link

ruyi789 commented Jan 3, 2024

`fatal error: timer data corruption

runtime stack:
runtime.throw({0x7ff6b52f0949?, 0x0?})
D:/Go/go/src/runtime/panic.go:1077 +0x65 fp=0x66ac7ff408 sp=0x66ac7ff3d8 pc=0x7ff6b47541e5
runtime.badTimer()
D:/Go/go/src/runtime/time.go:1143 +0x1f fp=0x66ac7ff428 sp=0x66ac7ff408 pc=0x7ff6b4775c7f
runtime.siftupTimer({0xc000046f40, 0x4, 0x66ac7ff498?}, 0x3)
D:/Go/go/src/runtime/time.go:1076 +0x65 fp=0x66ac7ff450 sp=0x66ac7ff428 pc=0x7ff6b4775925
runtime.doaddtimer(0xc00004ca00, 0xc00007c910)
D:/Go/go/src/runtime/time.go:303 +0x125 fp=0x66ac7ff4a8 sp=0x66ac7ff450 pc=0x7ff6b4773d05
runtime.runtimer(0xc00004ca00, 0x167f4db2cb8)
D:/Go/go/src/runtime/time.go:814 +0x1ae fp=0x66ac7ff4e0 sp=0x66ac7ff4a8 pc=0x7ff6b47750ee
runtime.checkTimers(0xc00004ca00, 0x0)
D:/Go/go/src/runtime/proc.go:3693 +0x15f fp=0x66ac7ff538 sp=0x66ac7ff4e0 pc=0x7ff6b475ce5f
runtime.stealWork(0x0)
D:/Go/go/src/runtime/proc.go:3293 +0x1e6 fp=0x66ac7ff5b8 sp=0x66ac7ff538 pc=0x7ff6b475c1c6
runtime.findRunnable()
D:/Go/go/src/runtime/proc.go:2999 +0x2fc fp=0x66ac7ff6d8 sp=0x66ac7ff5b8 pc=0x7ff6b475b6bc
runtime.schedule()
D:/Go/go/src/runtime/proc.go:3582 +0xbd fp=0x66ac7ff710 sp=0x66ac7ff6d8 pc=0x7ff6b475cafd
runtime.park_m(0xc000084540?)
D:/Go/go/src/runtime/proc.go:3745 +0x105 fp=0x66ac7ff740 sp=0x66ac7ff710 pc=0x7ff6b475d065
traceback: unexpected SPWRITE function runtime.mcall
runtime.mcall()
D:/Go/go/src/runtime/asm_amd64.s:458 +0x53 fp=0x66ac7ff758 sp=0x66ac7ff740 pc=0x7ff6b4785473`

@ruyi789
Copy link

ruyi789 commented Jan 3, 2024

1
2

3

@ruyi789
Copy link

ruyi789 commented Jan 3, 2024

go version go1.21.5 windows/amd64:Error with go thread stack extension

4
5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done.
Projects
Status: In Progress
Status: Done
Development

No branches or pull requests

6 participants