runtime: 'unexpected fault address 0x0' in crypto/sha256.blockGeneric on linux/mips64le #42229

Open
XiaodongLoong opened this issue Oct 27, 2020 · 18 comments


@XiaodongLoong XiaodongLoong commented Oct 27, 2020

What version of Go are you using (go version)?

$ go version
go1.15.3 linux/mips64le

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
From build logs:
linux-mips64le-mengzhuo at c305e49e96deafe54a8e43010ea76fead6da0a98

:: Running /tmp/workdir-host-linux-mipsle-mengzhuo/go/src/make.bash with args ["/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/make.bash"] and env ["LANG=zh_CN.UTF-8" "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin" "HOME=/home/gopher" "LOGNAME=gopher" "USER=gopher" "SHELL=/bin/bash" "INVOCATION_ID=f1b33fc87cb8493ca3d2a22025741093" "JOURNAL_STREAM=8:33728938" "GO_BUILDER_ENV=host-linux-mipsle-mengzhuo" "WORKDIR=/tmp/workdir-host-linux-mipsle-mengzhuo" "HTTPS_PROXY=http://proxy:8123" "HTTP_PROXY=http://proxy:8123" "GOPROXY=https://goproxy.io,direct" "GO_STAGE0_NET_DELAY=800ms" "GO_STAGE0_DL_DELAY=300ms" "GOROOT_BOOTSTRAP=/tmp/workdir-host-linux-mipsle-mengzhuo/go1.4" "GO_BUILDER_NAME=linux-mips64le-mengzhuo" "GO_BUILDER_FLAKY_NET=1" "GOROOT_BOOTSTRAP=/usr/lib/golang" "GOMIPS64=hardfloat" "GOARCH=mips64le" "GOHOSTARCH=mips64le" "GOBIN=" "TMPDIR=/tmp/workdir-host-linux-mipsle-mengzhuo/tmp" "GOCACHE=/tmp/workdir-host-linux-mipsle-mengzhuo/gocache"] in dir /tmp/workdir-host-linux-mipsle-mengzhuo/go/src

What did you do?

I built Go and ran the tests on linux/mips64le. The run reported this error:
Logs URL: https://build.golang.org/log/e160e033f5373a9699024cf8de6234d420215579

ok  	cmd/go/internal/str	0.011s
unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x120259a80]

goroutine 1440 [running]:
runtime.throw(0x12065289b, 0x5)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/panic.go:1112 +0x6c fp=0xc000602d68 sp=0xc000602d40 pc=0x12004189c
runtime.sigpanic()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/signal_unix.go:748 +0x51c fp=0xc000602dc0 sp=0xc000602d68 pc=0x12005de24
crypto/sha256.blockGeneric(0xc0003d2700, 0xc000c68000, 0x8000, 0x8000)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/crypto/sha256/sha256block.go:95 +0x1b0 fp=0xc000602ee0 sp=0xc000602dc8 pc=0x120259a80
crypto/sha256.(*digest).Write(0xc0003d2700, 0xc000c68000, 0x8000, 0x8000, 0x8000, 0x0, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/crypto/sha256/sha256.go:198 +0x158 fp=0xc000602f30 sp=0xc000602ee0 pc=0x120259138
io.copyBuffer(0x7fffc89299f8, 0xc0003d2700, 0x120724468, 0xc0000a0de8, 0xc000c68000, 0x8000, 0x8000, 0x12013115c, 0xc00038b801, 0xc0001edf40)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/io/io.go:425 +0x208 fp=0xc000602fa8 sp=0xc000602f30 pc=0x1200ef108
io.Copy(...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/io/io.go:382
cmd/go/internal/cache.FileHash(0xc0001edf40, 0x41, 0x0, 0x0, 0x0, 0x0, 0x41, 0xc0001edf40)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/cache/hash.go:149 +0x384 fp=0xc0006030a8 sp=0xc000602fa8 pc=0x12038ed6c
cmd/go/internal/work.(*Builder).fileHash(0xc0004695e0, 0xc0001edf40, 0x41, 0xc0001edf40, 0x41)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/buildid.go:403 +0x44 fp=0xc000603110 sp=0xc0006030a8 pc=0x1204cda64
cmd/go/internal/work.(*Builder).buildActionID(0xc0004695e0, 0xc000466000, 0x0, 0x0, 0x0, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:344 +0xa80 fp=0xc0006034d8 sp=0xc000603110 pc=0x1204d1750
cmd/go/internal/work.(*Builder).build(0xc0004695e0, 0x12072ef00, 0xc0000240f0, 0xc000466000, 0x0, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:425 +0x48d0 fp=0xc000603d98 sp=0xc0006034d8 pc=0x1204d8028
cmd/go/internal/work.(*Builder).Do.func2(0x12072ef00, 0xc0000240f0, 0xc000466000)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:136 +0x56c fp=0xc000603f08 sp=0xc000603d98 pc=0x12050881c
cmd/go/internal/work.(*Builder).Do.func3(0xc000273a10, 0xc0002ed010, 0xc0004695e0, 0xc0006edad0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:198 +0xb4 fp=0xc000603fb8 sp=0xc000603f08 pc=0x120508a24
runtime.goexit()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/asm_mips64x.s:631 +0x4 fp=0xc000603fb8 sp=0xc000603fb8 pc=0x12007fc4c
created by cmd/go/internal/work.(*Builder).Do
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:184 +0x418

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0xc0002ed018)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/runtime/sema.go:56 +0x4c
sync.(*WaitGroup).Wait(0xc0002ed010)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/sync/waitgroup.go:130 +0xac
cmd/go/internal/work.(*Builder).Do(0xc0004695e0, 0x12072ef00, 0xc0000240f0, 0xc00050edc0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:207 +0x440
cmd/go/internal/test.runTest(0x12072ef00, 0xc0000240f0, 0x1209a6a00, 0xc0000200b0, 0x7, 0x7)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/test/test.go:808 +0xa7c
main.main()
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/main.go:194 +0x99c

goroutine 1474 [runnable]:
syscall.Syscall(0x1388, 0x10, 0xc000442d30, 0x8, 0x0, 0x2, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/syscall/asm_linux_mips64x.s:16 +0x10
syscall.readlen(0x10, 0xc000442d30, 0x8, 0x14, 0xc0005fa180, 0x30)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/syscall/zsyscall_linux_mips64le.go:935 +0x5c
syscall.forkExec(0xc00059d950, 0x4a, 0xc00045f7c0, 0x13, 0x14, 0xc000442e60, 0xc3a8d82600020300, 0xc00010b500, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/syscall/exec_unix.go:221 +0x418
syscall.StartProcess(...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/syscall/exec_unix.go:263
os.startProcess(0xc00059d950, 0x4a, 0xc00045f7c0, 0x13, 0x14, 0xc000442ff0, 0xc0004dc7b0, 0x1, 0xc000410160)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/os/exec_posix.go:53 +0x1a0
os.StartProcess(0xc00059d950, 0x4a, 0xc00045f7c0, 0x13, 0x14, 0xc000442ff0, 0x2f, 0x29, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/os/exec.go:102 +0x7c
os/exec.(*Cmd).Start(0xc000410160, 0x34d325d5, 0x1ed59463e3b30)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/os/exec/exec.go:422 +0x354
os/exec.(*Cmd).Run(0xc000410160, 0x169e08c9, 0x1209ee100)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/os/exec/exec.go:338 +0x3c
cmd/go/internal/work.(*Builder).runOut(0xc0004695e0, 0xc00036bb80, 0xc00002a1e0, 0x27, 0x0, 0x0, 0x0, 0xc00045f680, 0x10, 0x14, ...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:2013 +0x52c
cmd/go/internal/work.gcToolchain.gc(0xc0004695e0, 0xc00036bb80, 0xc00059d900, 0x4a, 0xc00013c480, 0x2dc, 0x46c, 0x0, 0x0, 0xc00059d900, ...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/gc.go:172 +0xbcc
cmd/go/internal/work.(*Builder).build(0xc0004695e0, 0x12072ef00, 0xc0000240f0, 0xc00036bb80, 0x0, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:706 +0x122c
cmd/go/internal/work.(*Builder).Do.func2(0x12072ef00, 0xc0000240f0, 0xc00036bb80)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:136 +0x56c
cmd/go/internal/work.(*Builder).Do.func3(0xc000273a10, 0xc0002ed010, 0xc0004695e0, 0xc0006edad0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:198 +0xb4
created by cmd/go/internal/work.(*Builder).Do
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:184 +0x418

goroutine 1475 [select]:
cmd/go/internal/work.(*Builder).Do.func3(0xc000273a10, 0xc0002ed010, 0xc0004695e0, 0xc0006edad0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:188 +0x110
created by cmd/go/internal/work.(*Builder).Do
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:184 +0x418

goroutine 1441 [runnable]:
syscall.Syscall(0x1388, 0xe, 0xc000445640, 0x8, 0x0, 0x2, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/syscall/asm_linux_mips64x.s:16 +0x10
syscall.readlen(0xe, 0xc000445640, 0x8, 0x3, 0xc0005fa000, 0x2f)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/syscall/zsyscall_linux_mips64le.go:935 +0x5c
syscall.forkExec(0xc00025e320, 0x46, 0xc000094c40, 0x2, 0x2, 0xc000445770, 0xe3bc695b00000300, 0xc00010a700, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/syscall/exec_unix.go:221 +0x418
syscall.StartProcess(...)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/syscall/exec_unix.go:263
os.startProcess(0xc00025e320, 0x46, 0xc000094c40, 0x2, 0x2, 0xc000445900, 0xc000207080, 0x0, 0x0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/os/exec_posix.go:53 +0x1a0
os.StartProcess(0xc00025e320, 0x46, 0xc000094c40, 0x2, 0x2, 0xc000445900, 0x2e, 0x12001039c, 0x7fffcb43dcc8)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/os/exec.go:102 +0x7c
os/exec.(*Cmd).Start(0xc000410000, 0xc0002ee001, 0xc0001c5ce0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/os/exec/exec.go:422 +0x354
os/exec.(*Cmd).Run(0xc000410000, 0xc0001c5ce0, 0x2d)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/os/exec/exec.go:338 +0x3c
cmd/go/internal/work.(*Builder).toolID(0xc0004695e0, 0x120651e21, 0x3, 0xc0000b01b1, 0xc)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/buildid.go:192 +0x450
cmd/go/internal/work.(*Builder).vet(0xc0004695e0, 0x12072ef00, 0xc0000240f0, 0xc000423cc0, 0x12072ef00, 0xc0000240f0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:1042 +0x178
cmd/go/internal/work.(*Builder).Do.func2(0x12072ef00, 0xc0000240f0, 0xc000423cc0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:136 +0x56c
cmd/go/internal/work.(*Builder).Do.func3(0xc000273a10, 0xc0002ed010, 0xc0004695e0, 0xc0006edad0)
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:198 +0xb4
created by cmd/go/internal/work.(*Builder).Do
	/tmp/workdir-host-linux-mipsle-mengzhuo/go/src/cmd/go/internal/work/exec.go:184 +0x418
go tool dist: Failed: exit status 2

What did you expect to see?

The bug can be fixed.

What did you see instead?

Member

@bcmills bcmills commented Oct 27, 2020

I suspect that this is a bug in the compiler or runtime specific to mips64le or perhaps MIPS in general.

The symptom is similar to #34835.

@bcmills bcmills changed the title from "cmd/go/internal: signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x120259a80" to "runtime: 'unexpected fault address 0x0' in crypto/sha256.blockGeneric on linux/mips64le" Oct 27, 2020
@bcmills bcmills added this to the Backlog milestone Oct 27, 2020
Member

@bcmills bcmills commented Oct 27, 2020

Contributor

@randall77 randall77 commented Oct 29, 2020

I don't understand this panic. As far as I can tell, the instruction reported can't panic.
Offset of 0x1b0 in sha256.blockGeneric is:

  1f3980:	0270982d 	daddu	s3,s3,s0

Of course that doesn't correspond to the PC reported, so I'm not sure how reliable that deduction is. Is ASLR or other relocation going on for mips64le?

I did

gomote create linux-mips64le-mengzhuo
gomote push user-khr-linux-mips64le-mengzhuo-0 // with GOROOT set to a 1.15.3 tree
gomote run user-khr-linux-mips64le-mengzhuo-0 go/src/make.bash
gomote run user-khr-linux-mips64le-mengzhuo-0 go/bin/go test -c cmd/go/internal/work
gomote run user-khr-linux-mips64le-mengzhuo-0 /usr/bin/objdump -d go/bin/work.test > tmp

Then looked for sha256.blockGeneric in the resulting file.

If a mips64 person can check my work, that would be great.

Contributor

@cherrymui cherrymui commented Oct 29, 2020

@randall77 The other day I looked at it and came to the same conclusion: that instruction cannot panic.

> Is ASLR or other relocation going on for mips64le?

No.

Contributor

@randall77 randall77 commented Oct 29, 2020

If there is no funky remapping going on, then I think we may not be syncing to what the OP was looking at. I would expect the low 12 bits of any PCs to match with the OP's run and our disassembly, and they don't.
Or this is like #34835 and the machine is just flaky.
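The low-12-bit comparison can be checked with a few lines of Go. This is only a sketch of the arithmetic, using the pc= value from the original report, the objdump address quoted earlier in the thread, and the +0x1b0 offset from the traceback; the helper name `pageOffset` is made up for illustration.

```go
package main

import "fmt"

// pageOffset returns the low 12 bits of an address, i.e. its offset
// within a 4 KiB page. A binary loaded without relocation should keep
// these bits identical between a crash report and a disassembly.
func pageOffset(addr uint64) uint64 { return addr & 0xfff }

func main() {
	const (
		tracePC  uint64 = 0x120259a80 // pc= from the original crash report
		disasmPC uint64 = 0x1f3980    // instruction address in the objdump output
		funcOff  uint64 = 0x1b0       // +0x1b0 into sha256.blockGeneric (traceback)
	)

	fmt.Printf("trace:  entry=%#x low12=%#x\n", tracePC-funcOff, pageOffset(tracePC))
	fmt.Printf("disasm: entry=%#x low12=%#x\n", disasmPC-funcOff, pageOffset(disasmPC))

	if pageOffset(tracePC) != pageOffset(disasmPC) {
		fmt.Println("low 12 bits differ: the two binaries are probably not the same build")
	}
}
```

If the two values differ, the disassembly was most likely taken from a binary laid out differently from the one that crashed, so the instruction at that offset may not be the one that faulted.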

Contributor

@mengzhuo mengzhuo commented Nov 5, 2020

> If there is no funky remapping going on, then I think we may not be syncing to what the OP was looking at. I would expect the low 12 bits of any PCs to match with the OP's run and our disassembly, and they don't.
> Or this is like #34835 and the machine is just flaky.

Maybe. It's an out-of-tree kernel from Loongson.

Author

@XiaodongLoong XiaodongLoong commented Nov 6, 2020

> Maybe. It's an out-of-tree kernel from Loongson.

You should know exactly which kernel version is running on linux-mips64le-mengzhuo. Please clarify this.
Thanks!

Contributor

@mengzhuo mengzhuo commented Nov 6, 2020

It's just a pure guess.

This kernel is a fork of Linux 5.4.38 made by Lemote (a MIPS vendor).
However, there is no PC-related patch in it AFAIK.
The Lemote custom kernel is far more stable than the Loongnix one. One more thing: the Loongnix kernel hangs on any vDSO syscall in a Go program due to a Linux bug (#39046).

@XiaodongLoong
You can download and try this kernel yourself, since the builder can only be accessed by the Go team.
http://mirror.lemote.com:8000/fedora/releases/?repo=fedora-$releasever&arch=$basearch

Contributor

@mengzhuo mengzhuo commented Dec 11, 2020

New failure: https://build.golang.org/log/65f0559e504202b1f1859c3b4f588724000b231e
It must be something related to sha256 that failed.

This one shows sha256block.go:95 and the first failure log shows sha256block.go:101:

for i := 16; i < 64; i++ {
	v1 := w[i-2]
	t1 := (bits.RotateLeft32(v1, -17)) ^ (bits.RotateLeft32(v1, -19)) ^ (v1 >> 10)
	v2 := w[i-15]
	t2 := (bits.RotateLeft32(v2, -7)) ^ (bits.RotateLeft32(v2, -18)) ^ (v2 >> 3)
	w[i] = t1 + w[i-7] + t2 + w[i-16]
}
a, b, c, d, e, f, g, h := h0, h1, h2, h3, h4, h5, h6, h7
for i := 0; i < 64; i++ {
	t1 := h + ((bits.RotateLeft32(e, -6)) ^ (bits.RotateLeft32(e, -11)) ^ (bits.RotateLeft32(e, -25))) + ((e & f) ^ (^e & g)) + _K[i] + w[i]
	t2 := ((bits.RotateLeft32(a, -2)) ^ (bits.RotateLeft32(a, -13)) ^ (bits.RotateLeft32(a, -22))) + ((a & b) ^ (a & c) ^ (b & c))
	h = g

Neither of these lines looks like it can fault to me. However, the second failure log shows RotateLeft32 failing on an access to addr=0x0. Maybe a register-addressing bug in the compiler?

Contributor

@randall77 randall77 commented Dec 12, 2020

This is the instruction that address faulted:

   120257a10:	0008c03c 	dsll32	t8,a4,0x0

That instruction can't address fault. Contradictory logic cycle ensues...
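For anyone who wants to double-check the two instruction words quoted in this thread, here is a small sketch that splits a MIPS R-type word into its fields. The type and function names are made up for illustration. Both words have opcode 0 (SPECIAL); funct 0x2d is DADDU and funct 0x3c is DSLL32, and neither form has a memory operand.

```go
package main

import "fmt"

// rType holds the fields of a MIPS R-type (register-register) instruction.
type rType struct {
	Opcode, Rs, Rt, Rd, Sa, Funct uint32
}

// decodeRType splits a 32-bit instruction word into the R-type fields:
// opcode[31:26] rs[25:21] rt[20:16] rd[15:11] sa[10:6] funct[5:0].
func decodeRType(word uint32) rType {
	return rType{
		Opcode: word >> 26,
		Rs:     (word >> 21) & 0x1f,
		Rt:     (word >> 16) & 0x1f,
		Rd:     (word >> 11) & 0x1f,
		Sa:     (word >> 6) & 0x1f,
		Funct:  word & 0x3f,
	}
}

func main() {
	// 0x0270982d: daddu s3,s3,s0 (from the first disassembly)
	// 0x0008c03c: dsll32 t8,a4,0x0 (from the second failure)
	for _, w := range []uint32{0x0270982d, 0x0008c03c} {
		fmt.Printf("%#x -> %+v\n", w, decodeRType(w))
	}
}
```

The register numbers that come out (19, 16 and 8, 24) correspond to s3, s0 and a4, t8 in the n64 naming used by objdump, which matches the disassembly above.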

Contributor

@mengzhuo mengzhuo commented Dec 22, 2020

This bug also affects mipsle:
https://build.golang.org/log/041b067186699bcb8aa0fbbb03fa0e6f922abf7e

What interests me is that all of the failures come from cmd/go/internal/cache.


@gopherbot gopherbot commented Dec 22, 2020

Change https://golang.org/cl/279632 mentions this issue: cmd/go/internal/fsys: add package level mutex


@draganmladjenovic draganmladjenovic commented Jan 10, 2021

@mengzhuo Are huge pages perhaps enabled on your machine? I seem to remember that there was a series of similar failures on the rtrk machines until huge page support was fixed.

Contributor

@mengzhuo mengzhuo commented Jan 10, 2021


@draganmladjenovic draganmladjenovic commented Jan 10, 2021

I believe that it was fixed by this commit [1]. If it is missing from your kernel, you can disable THP ($ echo never > /sys/kernel/mm/transparent_hugepage/enabled) and leave it for some time to see if it fixes the problem.

[1] torvalds/linux@b42aa3f#diff-d20543288b2266f423cf078db1e846e6c1ed83b226368f6bf477a65cb5179783
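To confirm which THP mode is actually in effect before and after that change, the sysfs file can be read back. The sketch below assumes the usual format of that file, where the active mode is the bracketed word (for example `always madvise [never]`); the helper name `activeTHPMode` is made up.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// activeTHPMode extracts the bracketed token from sysfs content such as
// "always madvise [never]". It reports false if no bracketed token exists.
func activeTHPMode(content string) (string, bool) {
	start := strings.IndexByte(content, '[')
	end := strings.IndexByte(content, ']')
	if start < 0 || end < start {
		return "", false
	}
	return content[start+1 : end], true
}

func main() {
	const path = "/sys/kernel/mm/transparent_hugepage/enabled"
	raw, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("cannot read THP setting:", err)
		return
	}
	if mode, ok := activeTHPMode(string(raw)); ok {
		fmt.Println("active THP mode:", mode)
	} else {
		fmt.Printf("unrecognized content: %q\n", raw)
	}
}
```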

Member

@bcmills bcmills commented Feb 22, 2021

Contributor

@mengzhuo mengzhuo commented Feb 24, 2021

@draganmladjenovic I had THP disabled when this failure occurred.
@bcmills It always happens while the build tool is computing buildid hashes and the read fails :(


@draganmladjenovic draganmladjenovic commented Feb 24, 2021

@mengzhuo @bcmills That's bad. Another hint that may help is that these SIGSEGVs arrive with si_code = SI_KERNEL (0x80), so they probably come from some force_sig(SIGSEGV) code path in the kernel. Is there a way to instrument this?
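To sort the builder logs by si_code, something like the following could pull the code= field out of the runtime's signal line and name it. It is only a sketch: the function names are made up, and the table covers just the few Linux values relevant here (SI_USER 0, SEGV_MAPERR 1, SEGV_ACCERR 2, SI_KERNEL 0x80).

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// codeRE matches the code=0x... field in a Go runtime signal line.
var codeRE = regexp.MustCompile(`code=(0x[0-9a-fA-F]+)`)

// parseSiCode extracts the numeric si_code from a line such as
// "[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=...]".
func parseSiCode(line string) (int64, error) {
	m := codeRE.FindStringSubmatch(line)
	if m == nil {
		return 0, fmt.Errorf("no code= field in %q", line)
	}
	return strconv.ParseInt(m[1], 0, 64) // base 0 accepts the 0x prefix
}

// siCodeName names a handful of Linux si_code values seen with SIGSEGV.
func siCodeName(code int64) string {
	switch code {
	case 0:
		return "SI_USER"
	case 1:
		return "SEGV_MAPERR"
	case 2:
		return "SEGV_ACCERR"
	case 0x80:
		return "SI_KERNEL"
	}
	return "unknown"
}

func main() {
	line := "[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x120259a80]"
	code, err := parseSiCode(line)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("si_code=%#x (%s)\n", code, siCodeName(code))
}
```

An ordinary nil dereference would normally show SEGV_MAPERR with a meaningful fault address, so SI_KERNEL with addr=0x0 is consistent with the signal being raised by the kernel itself.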

7 participants