runtime: SIGSEGV in preemptone (riscv64) #68862

Open
gopherbot opened this issue Aug 13, 2024 · 4 comments
Labels: arch-riscv, compiler/runtime, help wanted, NeedsInvestigation, Tools

Comments

@gopherbot
Contributor

gopherbot commented Aug 13, 2024

#!watchflakes
default <- goarch == "riscv64" && builder == "linux-riscv64-mengzhuo" && `sigcode=1 addr=0xc0`  

Original flakes

#!watchflakes
default <- pkg == "golang.org/x/tools/internal/imports" && test == "TestModReplace2"

Issue created automatically to collect these failures.

Example (log):

=== RUN   TestModReplace2
SIGSEGV: segmentation violation
PC=0x54520 m=12 sigcode=1 addr=0xc0

goroutine 0 gp=0x3f504421c0 m=12 mp=0x3f50526708 [idle]:
runtime.preemptone(0x3f504421c0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/proc.go:6297 +0x38 fp=0x3f5043ff28 sp=0x3f5043ff10 pc=0x54520
runtime.preemptall()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/proc.go:6275 +0x60 fp=0x3f5043ff50 sp=0x3f5043ff28 pc=0x544c0
runtime.forEachPInternal(0x2fa878)
...
a3  0x223abf02	a4  0x3f98bb8000
a5  0x31a7d99	a6  0x29ab75fd
a7  0x1187f0	s2  0x3f5043fed0
s3  0x3f50526708	s4  0x3f50474000
s5  0x3f50241500	s6  0xffffffff
s7  0x4	s8  0x3f50038688
s9  0x3f5043fdc8	s10 0x2fa878
s11 0x3f504421c0	t3  0x2eb2a46908caf
t4  0xffffffffffffffff	t5  0x1913e15049b3
t6  0x3f50038408	pc  0x54520

watchflakes

@gopherbot gopherbot added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Aug 13, 2024
@gopherbot
Contributor Author

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "golang.org/x/tools/internal/imports" && test == "TestModReplace2"
2024-08-13 17:00 x_tools-go1.23-linux-riscv64 tools@c1241b9c release-branch.go1.23@6885bad7 x/tools/internal/imports.TestModReplace2 [ABORT] (log)
=== RUN   TestModReplace2
SIGSEGV: segmentation violation
PC=0x54520 m=12 sigcode=1 addr=0xc0

goroutine 0 gp=0x3f504421c0 m=12 mp=0x3f50526708 [idle]:
runtime.preemptone(0x3f504421c0?)
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/proc.go:6297 +0x38 fp=0x3f5043ff28 sp=0x3f5043ff10 pc=0x54520
runtime.preemptall()
	/home/swarming/.swarming/w/ir/x/w/goroot/src/runtime/proc.go:6275 +0x60 fp=0x3f5043ff50 sp=0x3f5043ff28 pc=0x544c0
runtime.forEachPInternal(0x2fa878)
...
a3  0x223abf02	a4  0x3f98bb8000
a5  0x31a7d99	a6  0x29ab75fd
a7  0x1187f0	s2  0x3f5043fed0
s3  0x3f50526708	s4  0x3f50474000
s5  0x3f50241500	s6  0xffffffff
s7  0x4	s8  0x3f50038688
s9  0x3f5043fdc8	s10 0x2fa878
s11 0x3f504421c0	t3  0x2eb2a46908caf
t4  0xffffffffffffffff	t5  0x1913e15049b3
t6  0x3f50038408	pc  0x54520

watchflakes

@gopherbot gopherbot added the Tools This label describes issues relating to any tools in the x/tools repository. label Aug 13, 2024
@gopherbot gopherbot added this to the Unreleased milestone Aug 13, 2024
@adonovan adonovan changed the title from "x/tools/internal/imports: TestModReplace2 failures" to "runtime: SIGSEGV in preemptone" Aug 14, 2024
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 14, 2024
@mknyszek mknyszek removed their assignment Aug 15, 2024
@mknyszek mknyszek added arch-riscv Issues solely affecting the riscv64 architecture. help wanted labels Aug 15, 2024
@mknyszek
Contributor

CC @golang/riscv64

@mengzhuo
Contributor

mengzhuo commented Aug 23, 2024

Update:
I've closed the flake issues that match the query "addr=0xc0 riscv64".
All of these failures are related to the same bad builder: linux-riscv64-mengzhuo--cm2

I found some interesting logs in this builder's dmesg:
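
For anyone triaging similar logs, the fields that this query keys on can be pulled out of the crash header mechanically. The following is a minimal sketch, not part of watchflakes or the Go runtime: the names `faultHeader`, `parseFaultHeader`, and `looksLikeNilDeref` are hypothetical. It parses a line such as `PC=0x54520 m=12 sigcode=1 addr=0xc0`. On Linux, sigcode 1 for SIGSEGV is SEGV_MAPERR (address not mapped), and a fault address below one page is often, though not always, a field access through a nil pointer. The 4096-byte threshold is an assumption for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// faultHeader holds the fields of a Go runtime crash header line.
// Hypothetical helper type for this sketch.
type faultHeader struct {
	PC      uint64
	M       int
	Sigcode int
	Addr    uint64
}

// headerRE matches lines like "PC=0x54520 m=12 sigcode=1 addr=0xc0".
var headerRE = regexp.MustCompile(
	`^PC=(0x[0-9a-fA-F]+) m=(\d+) sigcode=(\d+) addr=(0x[0-9a-fA-F]+)$`)

// parseFaultHeader extracts the four fields from a crash header line.
func parseFaultHeader(line string) (faultHeader, error) {
	m := headerRE.FindStringSubmatch(line)
	if m == nil {
		return faultHeader{}, fmt.Errorf("not a fault header: %q", line)
	}
	pc, err := strconv.ParseUint(m[1], 0, 64) // base 0 honors the 0x prefix
	if err != nil {
		return faultHeader{}, err
	}
	mID, err := strconv.Atoi(m[2])
	if err != nil {
		return faultHeader{}, err
	}
	code, err := strconv.Atoi(m[3])
	if err != nil {
		return faultHeader{}, err
	}
	addr, err := strconv.ParseUint(m[4], 0, 64)
	if err != nil {
		return faultHeader{}, err
	}
	return faultHeader{PC: pc, M: mID, Sigcode: code, Addr: addr}, nil
}

// looksLikeNilDeref reports whether the fault is an unmapped-address fault
// (sigcode 1, SEGV_MAPERR on Linux) at an address below an assumed
// 4096-byte page, which is typical of a nil pointer plus a field offset.
func looksLikeNilDeref(h faultHeader) bool {
	const assumedPageSize = 4096 // assumption, not read from the system
	return h.Sigcode == 1 && h.Addr < assumedPageSize
}

func main() {
	h, err := parseFaultHeader("PC=0x54520 m=12 sigcode=1 addr=0xc0")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("pc=%#x m=%d sigcode=%d addr=%#x nil-like=%v\n",
		h.PC, h.M, h.Sigcode, h.Addr, looksLikeNilDeref(h))
}
```

This only classifies the header line. It says nothing about why the pointer was nil, which is the open question in this issue.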

[12901.072321] INFO: task gc-stress:730941 blocked for more than 614 seconds.
[12901.079356]       Not tainted 6.6.36 #2.0~rc3.2+20240815152052
[12901.085355] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[12901.100078] task:gc-stress       state:D stack:0     pid:730941 ppid:730855 flags:0x00000004
[12901.100120] Call Trace:
[12901.100126] [<ffffffff810c173a>] __schedule+0x28c/0x848
[12901.100153] [<ffffffff810c1d3e>] schedule+0x48/0xd2
[12901.100161] [<ffffffff810c20c0>] schedule_preempt_disabled+0x16/0x28
[12901.100170] [<ffffffff810c4f96>] rwsem_down_write_slowpath+0x220/0x58e
[12901.100183] [<ffffffff810c5374>] down_write+0x70/0x72
[12901.100191] [<ffffffff801a8ae8>] vma_expand+0x46/0x1ca
[12901.100202] [<ffffffff801ac086>] mmap_region+0x3c0/0x6b0
[12901.100212] [<ffffffff801ac596>] do_mmap+0x220/0x39e
[12901.100219] [<ffffffff8018649a>] vm_mmap_pgoff+0x8c/0x118
[12901.100232] [<ffffffff801a968a>] ksys_mmap_pgoff+0x3a/0x158
[12901.100240] [<ffffffff80005562>] __riscv_sys_mmap+0x2a/0x36
[12901.100250] [<ffffffff810bf224>] do_trap_ecall_u+0xba/0x12e
[12901.100259] [<ffffffff810c8722>] ret_from_exception+0x0/0x6e
[12901.100273] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings

The stack trace above shows that all of these failures are related to the mmap call.

The bad builder is the 4G RAM version of the bananapi-f3, so I've updated sysctl with:

vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
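
To confirm that settings like the two above are live on a Linux builder, the values can be read back from `/proc/sys`, where a dotted sysctl key such as `vm.dirty_ratio` corresponds to the path `/proc/sys/vm/dirty_ratio`. The following is a minimal sketch under that assumption. The helper names `sysctlPath` and `readSysctl` are hypothetical, and the simple dot-to-slash mapping is only claimed to hold for keys like these two, whose components contain no dots of their own.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// sysctlPath maps a dotted sysctl key to its file under /proc/sys.
// Assumption: no key component itself contains a dot.
func sysctlPath(key string) string {
	return "/proc/sys/" + strings.ReplaceAll(key, ".", "/")
}

// readSysctl returns the current value of a sysctl key as trimmed text.
func readSysctl(key string) (string, error) {
	b, err := os.ReadFile(sysctlPath(key))
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(b)), nil
}

func main() {
	for _, key := range []string{"vm.dirty_background_ratio", "vm.dirty_ratio"} {
		v, err := readSysctl(key)
		if err != nil {
			// Expected on non-Linux systems or in restricted sandboxes.
			fmt.Println(key, "unreadable:", err)
			continue
		}
		fmt.Println(key, "=", v)
	}
}
```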

Note: this builder was still able to respond to prometheus-node-exporter, so I didn't get any warning :(

@mengzhuo
Contributor

mengzhuo commented Aug 28, 2024

Update, Aug 28:
I contacted SpacemiT staff, who confirmed that the hardware litmus test works as expected after a two-day run.

I've also upgraded two builders with a backported kernel patch related to the mmap scheduler hang:
https://lore.kernel.org/all/20231213203001.179237-5-alexghiti@rivosinc.com/

Unfortunately, it didn't help.

For now, I've suspended these two builders and am waiting for a kernel fix from SpacemiT.

@adonovan adonovan changed the title from "runtime: SIGSEGV in preemptone" to "runtime: SIGSEGV in preemptone (riscv64)" Aug 28, 2024