runtime: fatal error: unaligned sysUnused on Linux/PPC64LE #35445

Closed
cherrymui opened this issue Nov 8, 2019 · 9 comments

@cherrymui cherrymui commented Nov 8, 2019

What version of Go are you using (go version)?

tip (33dfd35)

Does this issue reproduce with the latest release?

No

What operating system and processor architecture are you using (go env)?

Linux/PPC64LE

What did you do?

https://build.golang.org/log/fb55a05d088fb47d6fd65e7a10a685dea7aea1fb

fatal error: unaligned sysUnused

runtime stack:
runtime.throw(0x344d85, 0x13)
	/workdir/go/src/runtime/panic.go:1106 +0x5c
runtime.sysUnused(0xc00100a000, 0x66000)
	/workdir/go/src/runtime/mem_linux.go:102 +0x220
runtime.(*pageAlloc).scavengeRangeLocked(0x5a7a68, 0x30004, 0x5, 0x33)
	/workdir/go/src/runtime/mgcscavenge.go:488 +0xac
runtime.(*pageAlloc).scavengeOne(0x5a7a68, 0x66000, 0x30001, 0xf0000)
	/workdir/go/src/runtime/mgcscavenge.go:412 +0x39c
runtime.(*pageAlloc).scavenge(0x5a7a68, 0x3e6000, 0x400001, 0x5a7a68)
	/workdir/go/src/runtime/mgcscavenge.go:330 +0x5c
runtime.(*mheap).grow(0x5a7a60, 0x179, 0x0)
	/workdir/go/src/runtime/mheap.go:1188 +0xf0
runtime.(*mheap).allocSpanLocked(0x5a7a60, 0x179, 0x5c5648, 0x1c)
	/workdir/go/src/runtime/mheap.go:1093 +0x1ac
runtime.(*mheap).alloc_m(0x5a7a60, 0x179, 0x30100, 0x48a60)
	/workdir/go/src/runtime/mheap.go:881 +0x114
runtime.(*mheap).alloc.func1()
	/workdir/go/src/runtime/mheap.go:969 +0x4c
runtime.systemstack(0xc000071ee8)
	/workdir/go/src/runtime/asm_ppc64x.s:295 +0xd4
runtime.(*mheap).alloc(0x5a7a60, 0x179, 0x73b5d3010100, 0x73b5d3ac6650)
	/workdir/go/src/runtime/mheap.go:968 +0x7c
runtime.largeAlloc(0x2f0800, 0x688a500000001, 0x73b5d3ad8fc0)
	/workdir/go/src/runtime/malloc.go:1141 +0x8c
runtime.mallocgc.func1()
	/workdir/go/src/runtime/malloc.go:1036 +0x48
runtime.systemstack(0x45000)
	/workdir/go/src/runtime/asm_ppc64x.s:269 +0x94
runtime.mstart()
	/workdir/go/src/runtime/proc.go:1069

I also ran into this on Linux/PPC64LE gomote while testing async preemption.

cc @mknyszek

@bradfitz bradfitz added this to the Go1.14 milestone Nov 8, 2019

@mknyszek mknyszek commented Nov 8, 2019

Looking into it. I expect this to be fairly easy to diagnose and fix.

@mknyszek mknyszek commented Nov 8, 2019

OK, so the error is definitely correct: the problem is that findScavengeCandidate is returning an unaligned base + length. I verified the physical page size on this builder is 64 KiB, so this indicates a broader problem for systems which have larger page sizes. Despite all the tests, something is not quite right.

However, I can probably figure out what's wrong just with some diagnostics on crash, and this seems to be reproducing relatively easily. I'm going to dig deeper now.
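
To make the "unaligned base + length" concrete, here is a minimal sketch that checks the two arguments of runtime.sysUnused from the stack trace in this issue against the 64 KiB physical page size mentioned above. The helper isAligned is a hypothetical name for illustration, not the runtime's own check.

```go
package main

import "fmt"

// isAligned reports whether v is a multiple of align.
// align must be a power of two. Hypothetical helper, not runtime code.
func isAligned(v, align uint64) bool {
	return v&(align-1) == 0
}

func main() {
	const physPageSize = 64 << 10 // 64 KiB, as verified on the builder

	// Arguments of runtime.sysUnused copied from the stack trace.
	addr, size := uint64(0xc00100a000), uint64(0x66000)

	fmt.Printf("addr aligned: %v (remainder %#x)\n", isAligned(addr, physPageSize), addr%physPageSize)
	fmt.Printf("size aligned: %v (remainder %#x)\n", isAligned(size, physPageSize), size%physPageSize)
	// Output:
	// addr aligned: false (remainder 0xa000)
	// size aligned: false (remainder 0x6000)
}
```

Both values are multiples of 8 KiB but not of 64 KiB, which matches the observation that the problem only shows up with larger physical pages.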

@mknyszek mknyszek commented Nov 9, 2019

Unfortunately I've been unable to reproduce so far running all.bash in a gomote...

But, I've found the problem. The max/min logic in findScavengeCandidate is slightly wrong.

Fixed, and added a test. Will upload shortly.
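
The thread does not spell out what was wrong with the max/min logic, so the following is only a guess at the shape of such a bug, not the runtime's code. It assumes a free run of pages that is already aligned to min, a limit max that is not a multiple of min, and a search that takes pages from the top of the run. The function candidate, its parameters, and the free run [0, 56) are all hypothetical. The value 51 pages corresponds to the 0x66000 bytes seen in the trace, assuming 8 KiB pages.

```go
package main

import "fmt"

// alignUp rounds n up to a multiple of a (a must be a power of two).
func alignUp(n, a uint) uint { return (n + a - 1) &^ (a - 1) }

// candidate returns at most max pages from the top of the free run
// [start, start+size), where start and size are multiples of min.
// With alignMax set, max is first rounded up to a multiple of min.
// Hypothetical simplification for illustration only.
func candidate(start, size, min, max uint, alignMax bool) (uint, uint) {
	if alignMax {
		max = alignUp(max, min)
	}
	if size > max {
		// Keep only the top max pages of the run.
		start += size - max
		size = max
	}
	return start, size
}

func main() {
	const min = 8 // 64 KiB physical page / 8 KiB page (assumed)

	start, n := candidate(0, 56, min, 51, false)
	fmt.Println(start, n, start%min == 0 && n%min == 0) // 5 51 false

	start, n = candidate(0, 56, min, 51, true)
	fmt.Println(start, n, start%min == 0 && n%min == 0) // 0 56 true
}
```

In this sketch, clamping to an unaligned max yields base 5 and length 51, the same numbers that appear as arguments to scavengeRangeLocked in the trace. Rounding max up to a multiple of min before clamping keeps the result aligned.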

@gopherbot gopherbot commented Nov 9, 2019

Change https://golang.org/cl/206277 mentions this issue: runtime: fix min/max logic in findScavengeCandidate

@mknyszek mknyszek added NeedsFix and removed NeedsInvestigation labels Nov 9, 2019

@mengzhuo mengzhuo commented Nov 10, 2019

@ceseo ceseo commented Nov 11, 2019

@laboger FYI

@laboger laboger commented Nov 11, 2019

I'm sure this is because of CLs 201765 and 195700, since we never saw this error before they landed. I believe CL 206277 fixes it on ppc64le.

@gopherbot gopherbot closed this in f511467 Nov 11, 2019

@mknyszek mknyszek commented Nov 11, 2019

@laboger @mengzhuo The problem occurs on machines with a physical page size > 8 KiB. It should now be fixed on all such platforms.
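
A short sketch of why 8 KiB is the threshold, assuming it corresponds to the Go runtime's own 8 KiB page size: when the physical page is no larger than that, every runtime page boundary is also a valid physical page boundary, so the required alignment is a single page and cannot be violated. The names runtimePageSize and minPages are hypothetical.

```go
package main

import "fmt"

// runtimePageSize is the 8 KiB threshold quoted in the comment above,
// assumed here to be the runtime's page size.
const runtimePageSize = 8 << 10

// minPages returns how many 8 KiB pages a scavenged range must be
// aligned to for a given physical page size (never less than 1).
func minPages(physPageSize uint) uint {
	if physPageSize <= runtimePageSize {
		return 1
	}
	return physPageSize / runtimePageSize
}

func main() {
	for _, p := range []uint{4 << 10, 8 << 10, 16 << 10, 64 << 10} {
		fmt.Printf("physical page %2d KiB -> alignment of %d page(s)\n", p>>10, minPages(p))
	}
}
```

With 4 KiB or 8 KiB physical pages the alignment is 1 page, so an off-by-alignment mistake stays invisible. With 64 KiB it is 8 pages, which is where the crash appeared.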
