Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: floating-point broken on android-arm-corellium builder #36830

Closed
bcmills opened this issue Jan 28, 2020 · 15 comments
Closed

cmd/compile: floating-point broken on android-arm-corellium builder #36830

bcmills opened this issue Jan 28, 2020 · 15 comments

Comments

@bcmills bcmills changed the title math/rand: TestFloat32 failing frequently on android-arm-corellium builder math/rand: floating-point tests broken on android-arm-corellium builder Jan 28, 2020
@bcmills bcmills changed the title math/rand: floating-point tests broken on android-arm-corellium builder math/rand: floating-point broken on android-arm-corellium builder Jan 28, 2020
@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 28, 2020

@benshi001 Ben, can you take a look?
As far as I can tell floating-point compares are broken. The values printed in the error messages look correct.

@bcmills bcmills changed the title math/rand: floating-point broken on android-arm-corellium builder cmd/compile: floating-point broken on android-arm-corellium builder Jan 28, 2020
@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 28, 2020

Seems to be failing for GOARCH=arm GOARM=6. GOARM=7 is also broken. GOARM=5 is fine.

$ GOARCH=arm GOARM=6 ../bin/go test -c math/rand
$ gomote push user-khr-android-arm-corellium-0
$ gomote run user-khr-android-arm-corellium-0 go/src/rand.test -test.run=Float32 -test.short -test.v
@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 28, 2020

Here's a simple reproducer:

package main

import "fmt"

//go:noinline
func geq(x, y float32) bool {
	return x >= y
}

func main() {
	var x float32 = 3.0
	var y float32 = 4.0
	for i := 0; i < 1e8; i++ {
		if geq(x, y) {
			fmt.Printf("bad at iteration %d\n", i)
		}
	}
}

Compile this with GOARCH=arm GOARM=7, copy to builder, and run it. We get:

bad at iteration 189792
bad at iteration 251153
bad at iteration 3142582
bad at iteration 8808617
bad at iteration 12425345
bad at iteration 16630084
bad at iteration 16636778
bad at iteration 22789256
bad at iteration 29942954
bad at iteration 32523884
bad at iteration 32766014
bad at iteration 50490757
bad at iteration 60613276
bad at iteration 61778799
bad at iteration 71722383
bad at iteration 72913204
bad at iteration 76562474
bad at iteration 76569282
bad at iteration 79995821

Something is seriously wrong with the floating point comparison hardware on this machine.

The code for geq is just:

TEXT main.geq(SB) /usr/local/google/home/khr/gowork/tmp1.go
  tmp1.go:7             0x9d620                 ed9d0a01                MOVF 0x4(R13), F0       
  tmp1.go:7             0x9d624                 ed9d1a02                MOVF 0x8(R13), F1       
  tmp1.go:7             0x9d628                 eeb40ac1                CMPF F1, F0             
  tmp1.go:7             0x9d62c                 eef1fa10                MOVW FPSCR, APSR_NZCV   
  tmp1.go:7             0x9d630                 e3a00000                MOVW $0, R0             
  tmp1.go:7             0x9d634                 a3a00001                MOVW.GE $1, R0          
  tmp1.go:7             0x9d638                 e5cd000c                MOVB R0, 0xc(R13)       
  tmp1.go:7             0x9d63c                 e28ef000                ADD $0, R14, R15        
@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 28, 2020

... or maybe the OS isn't saving/restoring floating point comparison flags? That would explain the erraticness...

@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 28, 2020

Seems to be all the comparisons, occasionally.
The error is always consistent. For ==, it's always a false negative. For all the other comparisons it is always a false positive.

package main

import "fmt"

//go:noinline
func geq(x, y float32) bool {
	return x >= y
}

//go:noinline
func leq(x, y float32) bool {
	return x <= y
}

//go:noinline
func gt(x, y float32) bool {
	return x > y
}

//go:noinline
func lt(x, y float32) bool {
	return x < y
}

//go:noinline
func eq(x, y float32) bool {
	return x == y
}

//go:noinline
func ne(x, y float32) bool {
	return x != y
}

func main() {
	for _, t := range []struct {
		name   string
		f      func(float32, float32) bool
		tx, ty float32 // inputs that should result in true
		fx, fy float32 // inputs that should result in false
	}{
		{">=", geq, 4, 3, 3, 4},
		{"<=", leq, 3, 4, 4, 3},
		{">", gt, 4, 3, 3, 4},
		{"<", lt, 3, 4, 4, 3},
		{"==", eq, 3, 3, 3, 4},
		{"!=", ne, 3, 4, 3, 3},
	} {
		for i := 0; i < 2e7; i++ {
			if !t.f(t.tx, t.ty) {
				fmt.Printf("bad %f %s %f, want true: %d\n", t.tx, t.name, t.ty, i)
			}
			if t.f(t.fx, t.fy) {
				fmt.Printf("bad %f %s %f, want false: %d\n", t.fx, t.name, t.fy, i)
			}
		}
	}
}
bad 3.000000 >= 4.000000, want false: 25635
bad 3.000000 >= 4.000000, want false: 69076
bad 3.000000 >= 4.000000, want false: 4916526
bad 3.000000 >= 4.000000, want false: 9937096
bad 3.000000 >= 4.000000, want false: 12789690
bad 3.000000 >= 4.000000, want false: 19958632
bad 4.000000 <= 3.000000, want false: 462589
bad 4.000000 <= 3.000000, want false: 14222185
bad 4.000000 <= 3.000000, want false: 15495103
bad 3.000000 > 4.000000, want false: 5855504
bad 3.000000 > 4.000000, want false: 7465563
bad 3.000000 > 4.000000, want false: 14899503
bad 3.000000 > 4.000000, want false: 16291141
bad 3.000000 > 4.000000, want false: 16402125
bad 3.000000 > 4.000000, want false: 17907133
bad 4.000000 < 3.000000, want false: 9664194
bad 4.000000 < 3.000000, want false: 13493741
bad 4.000000 < 3.000000, want false: 16757572
bad 3.000000 == 3.000000, want true: 2776863
bad 3.000000 == 3.000000, want true: 6072085
bad 3.000000 == 3.000000, want true: 10121837
bad 3.000000 == 3.000000, want true: 12263554
bad 3.000000 == 3.000000, want true: 15621010
bad 3.000000 == 3.000000, want true: 18710498
bad 3.000000 != 3.000000, want false: 1481130
bad 3.000000 != 3.000000, want false: 1484425
bad 3.000000 != 3.000000, want false: 1487792
bad 3.000000 != 3.000000, want false: 3018792
bad 3.000000 != 3.000000, want false: 19224977
@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 28, 2020

Same behavior for the compare-with-zero instruction, and for float64.

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Jan 28, 2020

Does it work if we disable async preemption? GODEBUG=asyncpreemptoff=1

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Jan 28, 2020

GOARM=5 implies softfloat. I don't think android-arm is a softfloat builder. But we do have some softfloat tests that runs everywhere.

The arm5 builder is a softfloat builder (and the only one).

@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 28, 2020

Does it work if we disable async preemption? GODEBUG=asyncpreemptoff=1

Still buggy with that set. Same with GOGC=off.

arm64 binaries on the same machine work fine.

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Jan 28, 2020

Thanks. So it is probably indeed a hardware/OS bug...

@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 28, 2020

I can reproduce with C.

#include <stdio.h>

int geq(double x, double y) {
  return x >= y;
}

double x = 3;
double y = 4;

int main(int argc, char *argv[]) {
  for (int i = 0; i < 100000000; i++) {
    if (geq(x, y)) {
          printf("failed at iteration %d\n", i);
        }
    }
}
failed at iteration 532580
failed at iteration 19932913
failed at iteration 29020612
failed at iteration 32054460
failed at iteration 67123803
failed at iteration 70001529
failed at iteration 87241841
failed at iteration 89238362
failed at iteration 94347886

Note that it fails when compiled with arm-linux-gnueabihf-gcc (hard float) and not with arm-linux-gnueabi-gcc (soft float).

Anyone know how to contact Corellium and file a bug? Their website is not very helpful.

@cherrymui
Copy link
Contributor

@cherrymui cherrymui commented Jan 28, 2020

@randall77
Copy link
Contributor

@randall77 randall77 commented Jan 29, 2020

I got in contact with the Corellium folks and they've opened an internal issue for it. Likely it will be fixed in their next release, eta unknown.
I'll remove the 1.14 milestone. There's no more action required by us on this, other than to verify the fix when it is released.

@randall77
Copy link
Contributor

@randall77 randall77 commented Feb 24, 2020

Looks like this was fixed by an update on Corellium's part on our about 2/17.

@randall77 randall77 closed this Feb 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.