Discuss: FP accuracy loss by performance improvement on ARM64 #24033

@benshi001

My last change, https://go-review.googlesource.com/c/go/+/94901, improved FP computation performance by about
9% on ARM64, but introduced a small loss of accuracy.

The main idea is to fuse a pair of FMUL/FADD instructions into a single FMADD. Its benefits are:

  1. save a register for the intermediate mul result
  2. save CPU ticks
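
To make the difference between the two instruction sequences concrete, here is a minimal Go sketch. The helper names `unfused` and `fusedEmulated` are hypothetical and not part of the change. `fusedEmulated` only approximates a float32 FMADD: it relies on the product of two float32 values being exact in float64, but the float64 add followed by the conversion to float32 can, in rare cases, round differently from a true single-rounding float32 FMADD.

```go
package main

import (
	"fmt"
	"math"
)

// unfused rounds the product to float32 before the add, which is what a
// separate FMUL followed by FADD computes. The explicit float32 conversion
// asks the compiler to round the intermediate product.
func unfused(a, b, c float32) float32 {
	return float32(a*b) + c
}

// fusedEmulated approximates a float32 FMADD on any platform (assumption:
// double rounding through float64 is acceptable for illustration).
// The product is not rounded to float32 before the add.
func fusedEmulated(a, b, c float32) float32 {
	return float32(math.FMA(float64(a), float64(b), float64(c)))
}

func main() {
	// Operands taken from the first failing case in this issue.
	a, b, c := float32(0.6046603), float32(0.9405091), float32(0.6645601)

	u := unfused(a, b, c)
	f := fusedEmulated(a, b, c)

	fmt.Printf("unfused: %v (bits %#x)\n", u, math.Float32bits(u))
	fmt.Printf("fused:   %v (bits %#x)\n", f, math.Float32bits(f))
}
```

Because the only extra step in the unfused version is one rounding of the product, the two results can differ by at most one unit in the last place for operands like these.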

However, it also introduces a loss of accuracy. For example:

float32(0.6046603 * 0.9405091) + 0.6645601, expected 1.2332485, got 1.2332486
float32(0.67908466 * 0.21855305) + 0.20318687, expected 0.3516029, got 0.35160288
...

As a result, the test case go/src/cmd/compile/internal/gc/testdata/fp.go fails.

There are two possible solutions:

  1. Roll back to the less optimized fmul/fadd

  2. Modify the test case to accept a small difference in the last digit, something like pattern matching:

    float32(0.6046603 * 0.9405091) + 0.6645601 == 1.2332485
    float32(0.6046603 * 0.9405091) + 0.6645601 == 1.233248*
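
One way the second option could be expressed in Go is a comparison that tolerates a difference of a few units in the last place (ULPs) instead of matching on decimal digits. This is only a sketch of that idea: `orderedBits` and `withinULP` are hypothetical helpers, not functions that exist in fp.go, and the tolerance of 1 ULP is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// orderedBits maps a float32 to an integer whose ordering matches the
// numeric ordering of finite floats, so adjacent floats differ by exactly 1.
func orderedBits(f float32) int64 {
	b := int64(math.Float32bits(f))
	if b&0x80000000 != 0 {
		// Negative values: larger magnitude must map to a smaller integer.
		return -(b & 0x7fffffff)
	}
	return b
}

// withinULP reports whether got and want are at most maxULP representable
// float32 values apart. NaN never matches anything.
func withinULP(got, want float32, maxULP int64) bool {
	if got != got || want != want {
		return false
	}
	d := orderedBits(got) - orderedBits(want)
	if d < 0 {
		d = -d
	}
	return d <= maxULP
}

func main() {
	// Expected and observed values quoted in this issue.
	want := float32(1.2332485)
	got := float32(1.2332486)

	fmt.Println("exact match: ", got == want)
	fmt.Println("within 1 ULP:", withinULP(got, want, 1))
}
```

Compared with matching a decimal prefix such as `1.233248*`, a ULP bound scales with the magnitude of the value and does not depend on how the number happens to print.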

What is your opinion?
