strconv: optimize ParseFloat with new refloat algorithm #66327

@sugawarayuuta

Floating-point conversions are used in many places, such as encoding/json. While the current Eisel-Lemire implementation is efficient, it is possible to use a smaller lookup table and reduce execution time at the same time.
I implemented this algorithm in Go from scratch, and it is now available under the BSD-3-Clause license: https://github.com/sugawarayuuta/refloat

  • Benchmarks
$ go test -bench "\d$" -benchmem -benchtime 1m
goos: linux
goarch: amd64
pkg: github.com/sugawarayuuta/refloat
cpu: Intel(R) Core(TM) i3-10110U CPU @ 2.10GHz
BenchmarkParseFloat64/strconv/bits-4            648877453              112.3 ns/op             0 B/op          0 allocs/op
BenchmarkParseFloat64/strconv/norm-4            729803588               95.37 ns/op            0 B/op          0 allocs/op
BenchmarkParseFloat64/refloat/bits-4            773428314               93.41 ns/op            0 B/op          0 allocs/op
BenchmarkParseFloat64/refloat/norm-4            1000000000              67.75 ns/op            0 B/op          0 allocs/op
BenchmarkParseFloat32/strconv/bits-4            894441672               82.30 ns/op            0 B/op          0 allocs/op
BenchmarkParseFloat32/strconv/norm-4            1000000000              67.97 ns/op            0 B/op          0 allocs/op
BenchmarkParseFloat32/refloat/bits-4            870175646               83.18 ns/op            3 B/op          0 allocs/op
BenchmarkParseFloat32/refloat/norm-4            1000000000              50.24 ns/op            0 B/op          0 allocs/op
PASS
ok      github.com/sugawarayuuta/refloat        612.972s
  • Observation

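We can see that refloat is faster in most cases, both on normally distributed floats (norm) and on floats with uniformly random bit patterns (bits). The one exception above is ParseFloat32 on bits, where it is about 1% slower (83.18 vs. 82.30 ns/op).
The benchmarks were run on multiple machines and produced very similar results.
The gain is largest for normally distributed float64 inputs (95.37 vs. 67.75 ns/op, roughly 41% higher throughput), a distribution that matches many real use cases.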

  • Tests

refloat passes parse-number-fxx-test-data and the standard library's tests. In addition, I am actively fuzz testing it and comparing its outputs against the current strconv. So far, the only discrepancy detected appears to be issue 42436, which is a problem in strconv only.

  • Notes

I used an external tool to generate the polynomial approximations (the tool is not distributed under the same license as the project itself). Please tell me if that could be a problem.

Labels: NeedsFix (the path to resolution is known, but the work has not been done), Performance
