Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: math.Float32frombits and math.Float32bits round-trip mismatch #27193

Closed
dsnet opened this issue Aug 24, 2018 · 11 comments

Comments

Projects
None yet
5 participants
@dsnet
Copy link
Member

commented Aug 24, 2018

Consider the following:

var got, want bool
for i := uint32(0x7f8ddc7e); i <= uint32(0x7f8ddc7e); i++ {
	got = i == math.Float32bits(math.Float32frombits(i))
}
i := uint32(0x7f8ddc7e)
want = i == math.Float32bits(math.Float32frombits(i))
if got != want {
	print("FAIL")
}

I expect this to work fine.
However, this currently prints "FAIL".

Bisect identifies https://go-review.googlesource.com/62250 as the cause.

\cc @cherrymui @mundaym

@dsnet dsnet changed the title math: Float3frombits and Float32bits round-trip mismatch cmd/compile: math.Float3frombits and math.Float32bits round-trip mismatch Aug 24, 2018

@dsnet dsnet added this to the Go1.12 milestone Aug 24, 2018

@dsnet

This comment has been minimized.

Copy link
Member Author

commented Aug 24, 2018

Setting Go1.12, since this was introduced in Go1.10; no need to block Go1.11.

@mundaym

This comment has been minimized.

Copy link
Member

commented Aug 24, 2018

A runnable link: https://play.golang.org/p/k8rl-o_h97k.

This is a weird one. The constant propagated version of the round trip changes 0x7f8ddc7e into 0x7fcddc7e, effectively an OR with 0x00400000.

It is probably worth backporting the eventual fix to 1.11.1 and 1.10.x since it is a codegen bug but we can figure that out once we have a fix.

@mundaym mundaym changed the title cmd/compile: math.Float3frombits and math.Float32bits round-trip mismatch cmd/compile: math.Float32frombits and math.Float32bits round-trip mismatch Aug 24, 2018

@mundaym

This comment has been minimized.

Copy link
Member

commented Aug 24, 2018

I think the problem is that the float32 bit pattern 0x7f8ddc7e corresponds to an sNaN. During constant propagation we convert that value to a float64 so that it can be stored in an AuxInt. This conversion produces the bit pattern 0x7ff9bb8fc0000000. So during the conversion the processor has set the MSB of the mantissa to 1 to produce a qNaN. I see two possible remedies:

  1. We write our own fully reversible float32 -> float64 and float64 -> float32 implementations using integer operations and use these wherever we need to ensure that the conversion is bit exact.
  2. Store Float32 values in the AuxInt directly without first converting them to float64.

I'm leaning towards option (2) since option (1) requires us to remember to use the special conversion functions in future CLs. On the other hand option (1) is probably less work and we probably don't need full reversibility in very many places (sNaNs can't be expected to survive any arithmetic operations, so this really only applies to freshly created constants). Any thoughts @cherrymui @randall77?

This is probably obscure enough to wait for Go 1.12.

@mundaym mundaym self-assigned this Aug 24, 2018

@cherrymui

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2018

I'm also leaning towards option (2). So we don't do a conversion unless the input program needs one. I think this is less tricky.

Or maybe we could just store the bit pattern as an uint32 AuxInt? And do Float32frombits when we need to do arithmetics on them.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2018

I like option 2 as well.
Go 1.12 is fine. Obscure + easy workaround.

Just so I understand, the underlying problem is that the float32->float64 conversion changes a signaling NaN to a quiet NaN?

Joe, was there some real code that triggered this bug?

@dsnet

This comment has been minimized.

Copy link
Member Author

commented Aug 24, 2018

Joe, was there some real code that triggered this bug?

I was writing unit tests for protobuf serialization and wanted to make sure float fields preserved the bits exactly as is round-trip. I don't think this bug is affecting anyone in production.

I should note that a similar bug also affects the float64 versions of these functions, so I'm not sure why a float32 <-> float64 conversion is the cause.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2018

I tried modifying your example for 64-bit and it passes.
Do you have a failing 64-bit example handy?

@dsnet

This comment has been minimized.

Copy link
Member Author

commented Aug 24, 2018

Hmmm, I just re-ran my tests without the 64-bit workaround and it passes, so it only affects the float32 versions. My mistake.

@mundaym

This comment has been minimized.

Copy link
Member

commented Sep 3, 2018

Just so I understand, the underlying problem is that the float32->float64 conversion changes a signaling NaN to a quiet NaN?

Yeah, that's right. Happens on both the platforms I've investigated (amd64 and s390x). The behavior on Intel is documented in the architecture manual, section 4.8.3.5. On s390x the behavior is specified by the result tables for the relevant instructions in the principles of operation.

@gopherbot

This comment has been minimized.

Copy link

commented Sep 3, 2018

Change https://golang.org/cl/132956 mentions this issue: cmd/compile: fix store-to-load forwarding of 32-bit sNaNs

@mundaym

This comment has been minimized.

Copy link
Member

commented Sep 3, 2018

I sent the above CL for now. It's a small change that should fix the issue and can be backported if needed.

The assemblers expect float64 constants to be encoded as float64 values so option (2) is a much bigger change. We can either do the conversions in the compiler (in which case we'll need something equivalent to the functions added in CL 132956 anyway) or we can modify the assemblers to expect float32 values. Either way I think I'd rather fix this bug first and then make the bigger change a bit later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.