std.math.big.int: Improvements to toString and setString speed #24220

Draft: wants to merge 11 commits into master

samy-00007
Contributor

@samy-00007 samy-00007 commented Jun 18, 2025

This PR is ready to be reviewed but depends on functions defined in #23848, so I am marking it as a draft until the other one is merged (hopefully).

This PR massively improves the speed of toString and setString.
For the former, a subquadratic algorithm has been added, and the quadratic one has also been improved.
For the latter, the previous implementation was, to put it nicely, aiming for accuracy rather than speed, and was therefore very easy to speed up massively.
I have also used stackFallback in two places (AstGen and ZonGen) to reduce the number of allocations when parsing small numbers.
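
For readers unfamiliar with the idea, here is a minimal sketch of the general divide-and-conquer shape that subquadratic base conversion usually takes. It is written in Python for brevity and is not the code from this PR: the function names, the `threshold_bits` cutoff, and the way the split point is chosen are all assumptions made for illustration. The recursion only beats the digit-by-digit loop asymptotically when the underlying multiplication and division are themselves subquadratic.

```python
def to_decimal_quadratic(n: int) -> str:
    """Schoolbook conversion: one division by 10 per output digit."""
    if n == 0:
        return "0"
    digits = []
    while n:
        n, r = divmod(n, 10)
        digits.append(chr(ord("0") + r))
    return "".join(reversed(digits))


def to_decimal_dc(n: int, threshold_bits: int = 256) -> str:
    """Divide-and-conquer conversion of a non-negative integer (illustrative)."""
    # Small inputs: fall back to the simple quadratic loop.
    if n.bit_length() <= threshold_bits:
        return to_decimal_quadratic(n)
    # Estimate the decimal digit count (log10(2) ~ 0.30103) and split
    # roughly in the middle, so 10**k is close to sqrt(n).
    k = max(1, (n.bit_length() * 30103 // 100000) // 2)
    hi, lo = divmod(n, 10 ** k)
    # The low half must be zero-padded to exactly k digits.
    return to_decimal_dc(hi, threshold_bits) + to_decimal_dc(lo, threshold_bits).rjust(k, "0")
```

A production version would typically cache the powers of ten instead of recomputing `10 ** k` at every level, and would tune the cutoff by benchmarking.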

Graphs for toString: [figure: toString_500_final], [figure: toString (long)]
In the last graph (the long one), the current std implementation of toString takes 1 min 10 s for n = 90000 on my machine.

For setString, this graph plots the time taken by the implementation in master divided by the time taken by the new implementation, for a BigInt of n limbs. In short, it shows how many times faster the new implementation is.
[figure: setString]

Ast-check of compiler_rt/udivmodti4_test.zig comparison (master vs new):

Benchmark 1 (17 runs): /home/samy/Downloads/zig/zig-x86_64-linux-0.15.0-dev.833+5f7780c53/zig ast-check ../lib/compiler_rt/udivmodti4_test.zig
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           295ms ± 2.77ms     292ms …  302ms          1 ( 6%)        0%
  peak_rss            105MB ± 1.50MB     103MB …  108MB          0 ( 0%)        0%
  cpu_cycles          986M  ± 1.76M      984M  …  990M           0 ( 0%)        0%
  instructions       3.01G  ±  190      3.01G  … 3.01G           0 ( 0%)        0%
  cache_references   4.35M  ± 22.2K     4.30M  … 4.39M           1 ( 6%)        0%
  cache_misses       2.08M  ±  104K     1.99M  … 2.33M           2 (12%)        0%
  branch_misses      2.29M  ± 8.90K     2.27M  … 2.31M           1 ( 6%)        0%
Benchmark 2 (27 runs): stage3/bin/zig ast-check ../lib/compiler_rt/udivmodti4_test.zig
  measurement          mean ± σ            min … max           outliers         delta
  wall_time           186ms ± 1.73ms     183ms …  192ms          2 ( 7%)        ⚡- 37.0% ±  0.5%
  peak_rss            101MB ± 2.54MB    97.2MB …  104MB          0 ( 0%)        ⚡-  4.2% ±  1.3%
  cpu_cycles          592M  ±  693K      591M  …  594M           1 ( 4%)        ⚡- 40.0% ±  0.1%
  instructions       1.77G  ±  266      1.77G  … 1.77G           0 ( 0%)        ⚡- 41.2% ±  0.0%
  cache_references   3.65M  ± 50.0K     3.56M  … 3.75M           0 ( 0%)        ⚡- 16.2% ±  0.6%
  cache_misses       1.82M  ± 92.3K     1.73M  … 2.16M           1 ( 4%)        ⚡- 12.5% ±  2.9%
  branch_misses      2.31M  ± 3.72K     2.30M  … 2.32M           1 ( 4%)          +  0.9% ±  0.2% 
