Refactor Float.round/2 algorithm for better performance#15329

Merged
josevalim merged 7 commits into elixir-lang:main from PJUllrich:float-round-fast-path
May 1, 2026

Conversation

@PJUllrich
Contributor

@PJUllrich PJUllrich commented Apr 30, 2026

I would like to tentatively put forward this PR for review. I am by no means an expert on floating-point arithmetic and this contribution is significantly outside my expertise. My hope is that this PR can open the discussion about how to replace the inefficient Float.round/2 algorithm with something more performant. Feel free to reject if useless.

Summary

Refactors the bignum-heavy Float.round/2 (and the shared internal round/3 used by Float.floor/2 and Float.ceil/2) for better performance.

Result: ~6-7× faster across the precision range, with zero behavior change - including the documented tie cases like Float.round(5.5675, 3) == 5.567.
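The tie case mentioned above can be checked with exact integer arithmetic. The sketch below is only an illustration, not part of the PR: it uses Float.ratio/1 to get the exact fraction stored for the literal 5.5675 and compares it against 55675/10000 by cross-multiplication. The variable names are made up for this example.

```elixir
# Float.ratio/1 returns the exact value of a float as an integer fraction.
{numerator, denominator} = Float.ratio(5.5675)

# Compare the stored value against the decimal 5.5675 = 55675/10000 using
# exact integer arithmetic (cross-multiplication, no floats involved).
below_tie? = numerator * 10_000 < 55_675 * denominator

IO.inspect(below_tie?, label: "stored value below 5.5675")
IO.inspect(Float.round(5.5675, 3), label: "Float.round(5.5675, 3)")
```

Because the stored double sits just below the decimal tie, rounding it down to 5.567 is the correctly rounded result rather than a quirk.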

Background

The previous implementation explicitly noted the trade-off:

# This implementation is slow since it relies on big integers.
# Faster implementations are available on more recent papers
# and could be implemented in the future.

It computed m * power_of_5(count) where count could grow to ~104, producing 250-bit bignums for a single division. The cost grew with both float magnitude and target precision.
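To get a feel for the integer sizes involved, here is a small sketch that measures the bit length of 5^104 and of that power multiplied by the largest 53-bit mantissa. The BitLength module is a hypothetical helper written for this illustration; it does not appear in the PR or in Elixir's float.ex.

```elixir
defmodule BitLength do
  # Number of bits needed to represent a positive integer.
  def of(n) when is_integer(n) and n > 0 do
    n |> Integer.digits(2) |> length()
  end
end

# 5^104 corresponds to the largest `count` mentioned above.
power_of_5 = Integer.pow(5, 104)

# Largest possible 53-bit mantissa of a double.
largest_mantissa = Integer.pow(2, 53) - 1

IO.inspect(BitLength.of(power_of_5), label: "bits in 5^104")
IO.inspect(BitLength.of(largest_mantissa * power_of_5), label: "bits in m * 5^104")
```

Both values are far beyond a 64-bit machine word, so every multiplication and division on them goes through the VM's arbitrary-precision integer routines.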

The current code contains a lot of comments to make the review easier. I can remove those before the final merge.

Benchmark results

11 different float workloads, measured with Benchee on Apple M2. Reported times are the median μs of one iteration over all 11 workloads.

| precision | current | new (this PR) | native_double | new vs current | native_double vs current | native_double vs new |
|---|---|---|---|---|---|---|
| 0 | 0.25 μs | 0.25 μs | 0.25 μs | 1.00x | 1.00x | 1.00x |
| 1 | 2.42 μs | 0.33 μs | 0.33 μs | 7.33x | 7.33x | 1.00x |
| 2 | 2.67 μs | 0.63 μs | 0.33 μs | 4.24x | 8.09x | 1.91x |
| 3 | 4.29 μs | 0.63 μs | 0.33 μs | 6.81x | 13.00x | 1.91x |
| 5 | 5.50 μs | 0.67 μs | 0.33 μs | 8.21x | 16.67x | 2.03x |
| 8 | 7.21 μs | 0.96 μs | 0.38 μs | 7.51x | 18.97x | 2.53x |
| 12 | 7.75 μs | 0.96 μs | 0.38 μs | 8.6x | 25.7x | 3.0x |
| 15 | 8.58 μs | 1.58 μs | 0.38 μs | 5.7x | 28.3x | 5.0x |

Median μs per iteration over the 11-float mixed workload (per-call median ≈ value / 11).

  • current is the bignum-based implementation we're replacing
  • new is this PR
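  • native_double is included as a "what does this look like in pure native float arithmetic" reference (:erlang.round(f * pow) / pow). In the table above it is roughly 2-5× faster than new at precision 2 and higher, but it is incorrect on tie inputs: the double nearest to 5.5675 lies slightly below 5.5675, yet 5.5675 * 1000 rounds to exactly 5567.5 in IEEE 754 arithmetic, and :erlang.round/1 then bumps that to 5568. This double rounding (one rounding in the multiplication, one in the integer conversion) is why we can't simply use the obvious approach.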
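The double-rounding problem with the native_double reference can be reproduced in a few lines. This is a minimal sketch: the NativeDouble module and its round_native/2 function are hypothetical names for this illustration, built from the :erlang.round(f * pow) / pow expression quoted in the list above.

```elixir
defmodule NativeDouble do
  # Hypothetical helper: the "obvious" approach of scaling, rounding to an
  # integer, and scaling back, all in native float arithmetic.
  def round_native(f, precision) when is_float(f) and precision in 0..15 do
    pow = Integer.pow(10, precision)
    :erlang.round(f * pow) / pow
  end
end

# The multiplication itself rounds, landing exactly on the .5 tie.
IO.inspect(5.5675 * 1000, label: "5.5675 * 1000")

# :erlang.round/1 then rounds that tie away from zero.
IO.inspect(NativeDouble.round_native(5.5675, 3), label: "native_double")

# Float.round/2 works from the exact stored value instead.
IO.inspect(Float.round(5.5675, 3), label: "Float.round/2")
```

The two results differ in the last digit, which is the behavior change the PR set out to avoid.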

Floats benchmarked

floats = [
  {"small_pi",       3.141592653589793},   # typical small value
  {"common_money",   5.5675},               # tie-boundary case
  {"large_e10",      1.2345e10},            # large magnitude
  {"tiny_e_neg8",    1.2345e-8},            # very small magnitude
  {"near_zero",      0.00001},              # close to zero
  {"negative",       -123.456789},          # negative sign
  {"integer_valued", 12.0},                 # exactly representable integer
  {"tie_half",       12.5},                 # half-tie
  {"tie_5675",       5.5675},               # documented tie example
  {"large_e30",      1.234e30},             # very large
  {"max_finite_ish", 1.7e308}               # near max double
]
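For readers who want a quick timing without adding Benchee as a dependency, the following sketch measures a median per-iteration time over a list of floats. It swaps Benchee for System.monotonic_time/1, so it is not the harness used for the table above and its numbers will not be directly comparable. RoundTiming and median_ns/3 are hypothetical names, and the float list is a subset of the one above.

```elixir
defmodule RoundTiming do
  # Hypothetical helper: median wall-clock nanoseconds for one pass of
  # Float.round/2 over all floats at the given precision.
  def median_ns(floats, precision, runs \\ 1_001) do
    samples =
      for _ <- 1..runs do
        started = System.monotonic_time(:nanosecond)
        Enum.each(floats, &Float.round(&1, precision))
        System.monotonic_time(:nanosecond) - started
      end

    # With an odd number of runs, the middle element is the median.
    samples |> Enum.sort() |> Enum.at(div(runs, 2))
  end
end

floats = [3.141592653589793, 5.5675, 1.2345e10, 1.2345e-8, -123.456789, 12.5]

for precision <- [0, 3, 8, 15] do
  IO.puts("precision #{precision}: #{RoundTiming.median_ns(floats, precision)} ns")
end
```

A hand-rolled loop like this has no warmup and no garbage-collection control, so treat its output as a rough indication only.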

Comment thread lib/elixir/lib/float.ex Outdated
Comment thread lib/elixir/lib/float.ex Outdated
@josevalim
Member

In bignum_to_float/3, mantissa may become exactly 2^53, either because align/3 returns an upper-bound quotient or because rounding carries:

<<result::float>> =
  <<sign::1, exp + 1023::11, mantissa - @power_of_2_to_52::52>>

If mantissa == 2^53, then:

mantissa - @power_of_2_to_52 == 2^52

which does not fit in a 52-bit field. These tests should catch it:

assert Float.round(:math.pow(2, 50), 1) === :math.pow(2, 50)
assert Float.ceil(:math.pow(2, 50), 1) === :math.pow(2, 50)
assert Float.floor(:math.pow(2, 50), 1) === :math.pow(2, 50)

I believe you need to handle the carry/upper-bound normalization before packing:

{mantissa, exp} =
  if mantissa == @power_of_2_to_52 <<< 1 do
    {@power_of_2_to_52, exp + 1}
  else
    {mantissa, exp}
  end

But please double check with the Go implementation too.
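The overflow described in this comment can be reproduced in isolation. In the sketch below, PackDemo is a hypothetical stand-in for the packing step, not the actual bignum_to_float/3 from the PR. It relies on Elixir silently truncating an integer that is too wide for its bit field: a mantissa of 2^53 leaves 2^52 in the 52-bit field, which wraps to 0.

```elixir
defmodule PackDemo do
  import Bitwise

  @power_of_2_to_52 1 <<< 52

  # Hypothetical packing step: sign bit, biased exponent, and the mantissa
  # with its implicit leading bit removed.
  def pack(sign, exp, mantissa) do
    <<result::float>> =
      <<sign::1, (exp + 1023)::11, (mantissa - @power_of_2_to_52)::52>>

    result
  end

  # The normalization suggested in the comment above: a mantissa of exactly
  # 2^53 becomes 2^52 with the exponent bumped by one.
  def normalize(mantissa, exp) do
    if mantissa == @power_of_2_to_52 <<< 1 do
      {@power_of_2_to_52, exp + 1}
    else
      {mantissa, exp}
    end
  end
end

two_to_53 = Integer.pow(2, 53)

# Intended value is 2^53 * 2^(0 - 52) = 2.0, but the field overflows.
IO.inspect(PackDemo.pack(0, 0, two_to_53), label: "packed without normalization")

{mantissa, exp} = PackDemo.normalize(two_to_53, 0)
IO.inspect(PackDemo.pack(0, exp, mantissa), label: "packed after normalization")
```

Without normalization the result is off by a factor of two and no error is raised, which is why exact power-of-two assertions are a good way to catch it.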

@josevalim
Member

Or in general test these:

for k <- 0..60 do
  f = :math.pow(2, k)
  assert Float.round(f, 1) === f
  assert Float.floor(f, 1) === f
  assert Float.ceil(f, 1) === f
end

Make sure to remove other assertions that already check powers of two, as this will effectively verify all of them.

Comment thread lib/elixir/lib/float.ex Outdated
@josevalim
Member

I have left some comments. If you have used AI for implementing it, please disclose so! And, in such cases, it may be useful to ask different models to review the code, both with and without the reference Go implementation. :)

PJUllrich added 2 commits May 1, 2026 12:27
… of Cox, but only a refactor of the existing exact rational scaling approach.
Fix bug when mantissa is exactly `2^53`.
Fix Float.round/2 slow-path mantissa overflow when shift_adjust < 0 and the integer quotient lands in [2^53, 2^54)
Comment thread lib/elixir/lib/float.ex Outdated
@PJUllrich
Contributor Author

@josevalim thank you for your comments and review ❤️ And my apologies: I read the relevant parts of the paper and worked with Opus to get a high-level understanding of how it works, but then prompted Opus 4.7 to do the translation from paper to code with the Go implementation as reference because that work exceeded my abilities. I focused on verifying the implementation through the existing tests and expanded them to catch new (and previously uncovered) edge cases, identified by both me and Opus. I also focused on benchmarking different variants of this implementation and alternatives such as the native_double variant. But in the end, I had to admit to myself that I did not understand the implementation and needed someone smarter to look over it, which is why I opened this PR (I probably should have made it a Draft first, and I'd like to apologize for that as well).

I'd like to close this PR for now. I do not feel good proposing a new algorithm if I myself don't understand it. I already had very mixed feelings about this before I opened the PR. For documentation purposes, I updated the PR with your suggestions and fixed one bug that GPT-5.5 identified. I also updated the function comment extensively to reflect that this is in fact not a direct implementation of Cox 2026, but merely a refactor of the existing exact rational scaling approach based on David M. Gay. I hope that this PR can serve as inspiration for someone smarter than me who actually understands this stuff to improve the rounding algorithm in the future 🙏

@PJUllrich PJUllrich closed this May 1, 2026
@PJUllrich PJUllrich changed the title from "Replace Float.round/2 algorithm with Cox 2026" to "Refactor Float.round/2 algorithm for better performance" May 1, 2026
@josevalim
Member

Hi @PJUllrich, I have reviewed the pull request and it overall looks good to me, so if you don't mind, you can reopen it and I will be responsible for merging it.

@PJUllrich PJUllrich reopened this May 1, 2026
@PJUllrich
Contributor Author

I don't mind as long as you don't blame me at the next ElixirConf for breaking Elixir 1.20 :D

@josevalim josevalim merged commit 11f7d8f into elixir-lang:main May 1, 2026
15 checks passed
@josevalim
Member

💚 💙 💜 💛 ❤️

@PJUllrich PJUllrich deleted the float-round-fast-path branch May 1, 2026 11:39