Discuss: FP accuracy loss by performance improvement on ARM64

My last change, https://go-review.googlesource.com/c/go/+/94901, improved FP computation performance by about
9% on ARM64, but introduced a small loss of accuracy.

The main idea is to fuse a pair of FMUL/FADD instructions into a single FMADD. The benefits are:
1. It saves a register for the intermediate multiplication result.
2. It saves CPU cycles.

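However, it also changes results slightly, which shows up as accuracy loss against the expected values: FMADD rounds only once, after the add, while the FMUL/FADD sequence rounds after each operation, so the last digit can differ. For example: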

	float32(0.6046603 * 0.9405091) + 0.6645601, expected 1.2332485, got 1.2332486
	float32(0.67908466 * 0.21855305) + 0.20318687, expected 0.3516029, got 0.35160288
	...


As a result, the test case go/src/cmd/compile/internal/gc/testdata/fp.go fails.

There are two possible solutions:
1. Roll back to the less optimized FMUL/FADD sequence.
2. Modify the test case to accept a small difference, something like pattern matching:

	float32(0.6046603 * 0.9405091) + 0.6645601 == 1.2332485
	float32(0.6046603 * 0.9405091) + 0.6645601 == 1.233248*
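One way the second option could look in Go is a tolerance check in units in the last place (ULP) instead of textual pattern matching. This is only a sketch: `ulpDistance` and `withinULP` are hypothetical helpers, not functions from fp.go, and the distance calculation assumes finite values of the same sign.

```go
package main

import (
	"fmt"
	"math"
)

// ulpDistance returns how many float32 values apart a and b are.
// Assumption: a and b are finite and have the same sign, so their
// bit patterns are ordered the same way as their values.
func ulpDistance(a, b float32) uint32 {
	ia, ib := math.Float32bits(a), math.Float32bits(b)
	if ia > ib {
		return ia - ib
	}
	return ib - ia
}

// withinULP reports whether got is at most maxULP float32 steps from want.
func withinULP(got, want float32, maxULP uint32) bool {
	return ulpDistance(got, want) <= maxULP
}

func main() {
	// The two mismatches reported above, checked with a 1-ULP tolerance.
	fmt.Println(withinULP(1.2332486, 1.2332485, 1))
	fmt.Println(withinULP(0.35160288, 0.3516029, 1))
}
```

A 1-ULP tolerance would accept both the fused and the unfused result while still catching larger errors.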


What is your opinion?


Discuss: FP accuracy loss by performance improvement on ARM64 #24033
