runtime: floating point error panic #15234

Open
sirnewton01 opened this Issue Apr 11, 2016 · 14 comments

Contributor

sirnewton01 commented Apr 11, 2016

  1. What version of Go are you using (go version)?
    1.5.3
  2. What operating system and processor architecture are you using (go env)?
    plan9/386
  3. What did you do?
    I am building Go 1.5.3 on plan9/386 (latest 9front), bootstrapping from a Go 1.4 build on the same system. I set the GOROOT_BOOTSTRAP variable and ran the make.rc script.
  4. What did you expect to see?
    I expected that the bootstrapping process would allow me to build go 1.5.3.
  5. What did you see instead?
    Partway through building Go 1.5, floating point error panics stop the build. The stack is virtually identical to the one in this email thread:
    https://www.mail-archive.com/9fans@9fans.net/msg34991.html

Here are the results of my initial investigation so far:

I temporarily worked around that particular problem by changing the code and ran into another floating point error later on in the build process.

I managed to create and build a simple Go program that passes with the 1.4 compiler and fails with the 1.5 compiler (built enough to test it). The failure panic is very similar to what I saw in the link above ("floating point error").
http://play.golang.org/p/rtYF4uasa2

I modified the 1.5 compiler slightly to try to get a better idea of the note sent to the process that resulted in the internal _SIGFLOAT. Here is the full panic with the note text sent from the operating system:

sys: fp: precision loss fppc=0x1098 status=0x20 pc=0x00001098
PC=0x1098

goroutine 1 [running]:
main.main()
/usr/glenda/tmp/test.go:7 +0x38 fp=0x102e9f88 sp=0x102e9f44
runtime.main()
/usr/glenda/go1.5/go/src/runtime/proc.go:111 +0x272 fp=0x102e9fb0 sp=0x102e9f88
runtime.goexit()
/usr/glenda/go1.5/go/src/runtime/asm_386.s:1662 +0x1 fp=0x102e9fb4 sp=0x102e9fb0

ax 0x102540e0
bx 0x0
cx 0x102540e0
dx 0x20
di 0x10252be4
si 0x10294068
bp 0x137ac0
sp 0x102e9f44
pc 0x1098
flags 0x10206
cs 0x23
fs 0x0
gs 0x0

I disassembled the main function in both the 1.4 version and the 1.5 version:
1.4:
runtime.text+0xf 0x0000102f JHI runtime.text+0x18(SB)
runtime.text+0x11 0x00001031 CALL runtime.morestack_noctxt(SB)
runtime.text+0x16 0x00001036 JMP runtime.text(SB)
runtime.text+0x18 0x00001038 SUBL $0x58,SP
runtime.text+0x1b 0x0000103b FMOVD $f64.4024000000000000(SB),F0
runtime.text+0x21 0x00001041 FMOVDP F0,0x0(SP)
runtime.text+0x24 0x00001044 CALL math.Log2(SB)
runtime.text+0x29 0x00001049 FMOVD 0x8(SP),F0
runtime.text+0x2d 0x0000104d FMOVDP F0,0x34(SP)
runtime.text+0x31 0x00001051 FMOVD 0x34(SP),F0
runtime.text+0x35 0x00001055 FDIVRD $f64.4080000000000000(SB),F0
runtime.text+0x3b 0x0000105b FMOVDP F0,0x2c(SP)
runtime.text+0x3f 0x0000105f FMOVD 0x2c(SP),F0
runtime.text+0x43 0x00001063 FSTCW 0x22(SP)
runtime.text+0x47 0x00001067 MOVW $0xf7f,0x20(SP)
runtime.text+0x4e 0x0000106e FLDCW 0x20(SP)
runtime.text+0x52 0x00001072 FMOVLP F0,0x24(SP)
runtime.text+0x56 0x00001076 FLDCW 0x22(SP)
runtime.text+0x5a 0x0000107a MOVL 0x24(SP),BX
runtime.text+0x5e 0x0000107e MOVL BX,0x28(SP)
runtime.text+0x62 0x00001082 LEAL 0x44(SP),BX
runtime.text+0x66 0x00001086 MOVL $0x0,0x0(BX)
runtime.text+0x6c 0x0000108c MOVL $0x0,0x4(BX)
runtime.text+0x73 0x00001093 LEAL 0x44(SP),BX
runtime.text+0x77 0x00001097 CMPL BX,$0x0
runtime.text+0x7a 0x0000109a JEQ runtime.text+0x102(SB)
runtime.text+0x80 0x000010a0 MOVL $0x1,0x24(SP)

1.5:
main.main 0x00001060 MOVL _privates(SB),CX
main.main+0x6 0x00001066 MOVL 0x0(CX),CX
main.main+0xc 0x0000106c CMPL 0x8(CX),SP
main.main+0xf 0x0000106f JLS main.main+0xf6(SB)
main.main+0x15 0x00001075 SUBL $0x40,SP
main.main+0x18 0x00001078 MOVSD $f64.4024000000000000(SB),X0
main.main+0x20 0x00001080 MOVSD X0,0x0(SP)
main.main+0x25 0x00001085 CALL math.Log2(SB)
main.main+0x2a 0x0000108a MOVSD 0x8(SP),X0
main.main+0x30 0x00001090 MOVSD $f64.4080000000000000(SB),X1
main.main+0x38 0x00001098 DIVSD X0,X1
main.main+0x3c 0x0000109c CVTTSD2SL X1,BX
main.main+0x40 0x000010a0 MOVL BX,main.autotmp_0001+0x20(SP)
main.main+0x44 0x000010a4 XORL BX,BX
main.main+0x46 0x000010a6 MOVL BX,main.autotmp_0005+0x2c(SP)
main.main+0x4a 0x000010aa MOVL BX,0x30(SP)
main.main+0x4e 0x000010ae LEAL main.autotmp_0005+0x2c(SP),BX
main.main+0x52 0x000010b2 CMPL BX,$0x0
main.main+0x55 0x000010b5 JEQ main.main+0xef(SB)
main.main+0x5b 0x000010bb MOVL $0x1,0x38(SP)
main.main+0x63 0x000010c3 MOVL $0x1,0x3c(SP)
main.main+0x6b 0x000010cb MOVL BX,main.autotmp_0002+0x34(SP)
main.main+0x6f 0x000010cf MOVL $0xa3d40,0x0(SP)
main.main+0x76 0x000010d6 LEAL main.autotmp_0001+0x20(SP),BX
main.main+0x7a 0x000010da MOVL BX,0x4(SP)
main.main+0x7e 0x000010de MOVL $0x0,0x8(SP)
main.main+0x86 0x000010e6 CALL runtime.convT2E(SB)
main.main+0x8b 0x000010eb MOVL 0xc(SP),CX

When I debug the 1.5 binary, the floating point loss of precision occurs at the DIVSD X0,X1 instruction at main.main+0x38 (address 0x1098, matching the pc in the note). The 1.4 version uses a different instruction for the division (FDIVRD). I don't know enough about the x86 instruction set to know why DIVSD would raise a loss-of-precision error where FDIVRD does not.

Contributor

sirnewton01 commented Apr 11, 2016

I hope the report has enough of the right kind of detail.

Member

bradfitz commented Apr 11, 2016

Sorry, the plan9 port of Go is buggy and incomplete. Try using a tip Go build for your bootstrap (e.g. https://storage.googleapis.com/go-builder-data/gobootstrap-plan9-386.tar.gz) and building Go 1.6 or Go tip instead of Go 1.5.3.

/cc @0intro

@bradfitz bradfitz closed this Apr 11, 2016

@bradfitz bradfitz added this to the Unplanned milestone Apr 11, 2016

Contributor

sirnewton01 commented Apr 11, 2016

Thanks @bradfitz I will give that a try.

Member

bradfitz commented Apr 11, 2016

Feel free to file more issues if you find problems at tip. I closed this because plan9 support is always very bleeding edge and bug reports against old versions aren't very interesting since they're probably fixed and there are probably new bugs.

Contributor

sirnewton01 commented Apr 11, 2016

I tried the version from the link you provided and even launching "go version" throws a floating point loss of precision error on a MULSD instruction inside the go runtime. The rest of the plan9 install that I have seems stable and fine. I've tried some simple floating point in a C program and that seems to work ok too.

I'm thinking of trying to switch over to plan9/amd64 to see if I have better luck there. Is there a plan9 bootstrap tarball for amd64 so that I can try compiling at tip?

Thanks

Member

bradfitz commented Apr 11, 2016

I've never used plan9, so I'll defer to @0intro.

Contributor

sirnewton01 commented Apr 11, 2016

Thanks @bradfitz I'll try making a bootstrap from Mac/Linux.

Member

minux commented Apr 11, 2016

Contributor

sirnewton01 commented Apr 12, 2016

Thanks @minux that resolved the problem for me.

It seems that something is wrong with the sse2 FP on my machine.

Member

0intro commented Apr 12, 2016

There is no plan9/amd64 binary package available, since you can easily cross-compile Go for plan9/amd64, as you figured out.

Can you reproduce this issue when building Go tip and bootstrapping with Go tip on plan9/386?
If so, could you post a backtrace?

Also, which Plan 9 flavor are you running?

Contributor

sirnewton01 commented Apr 12, 2016

@0intro thank you. I am up and running now with the trick from @minux. I set GO386=387 to force the compiler to use 387 instructions instead of sse2. I'm assuming that will also work when bootstrapping from Linux/Mac but I haven't tried that yet.

I am still unsure why sse2 isn't working for me in this context. It could be that virtualbox is not emulating sse2 the way the Go compiler expects. It could also be a legitimate bug in the Go compiler on 386, although I doubt that, since I'm sure many people use linux/386 and windows/386, and floating point is used quite a bit in the Go runtime.

I'm running the newest plan9 front (aka 9front) from a week or two ago. My understanding is that it is still compatible with plan9. Would you like me to try bootstrapping Go from tip without the GO386=387 variable? I believe that I am on the latest virtualbox (5.0.16).

Member

0intro commented Apr 12, 2016

> Would you like me to try bootstrapping Go from tip without the GO386=387 variable?

Yes. This way we could know if Go is still affected by this issue or not.

Contributor

sirnewton01 commented Apr 12, 2016

@0intro I tested with tip, bootstrapped from Mac, and ran into the same kind of floating point problem when I try to run "go version" on my plan9 install in virtualbox. It's a floating point precision loss on a MULSD instruction in runtime.nextSample (determined by running acid).

Anything else you want me to try? I don't want to waste any more of your time.

Member

0intro commented Apr 12, 2016

I think we can reopen this issue, since the current Go tip seems to be affected as well.

I suspect the issue might be related to the 9front kernel. However, I don't know of any difference between the Plan 9 and 9front kernels regarding SSE. They both save the SSE state when the fxsr flag is set.

@0intro 0intro reopened this Apr 12, 2016

@ALTree ALTree changed the title from Floating point error panic to runtime: floating point error panic Feb 5, 2017

@ALTree ALTree added the OS-Plan9 label Feb 5, 2017
