Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uint: div_mod_word use intrinsic u128 division(#406) #478

Merged
merged 1 commit into from
Feb 4, 2022

Conversation

frostRed
Copy link
Contributor

The fuzz test is ok, but I don't see any div_mod_word bench test because it's a private function.

Copy link
Member

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please report the results of the division benchmarks in uint:

cd uint
cargo bench -- div

Comment on lines 25 to 27
let x = (u128::from(hi) << 64) + u128::from(lo);
let y = u128::from(y);
((x / y) as u64, (x % y) as u64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please revert this change. we're comparing two implementations here, they should produce identical results, it doesn't make sense to compare it with itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem.

@frostRed frostRed closed this Dec 11, 2020
@frostRed frostRed deleted the div-mod-word branch December 11, 2020 11:13
@ordian
Copy link
Member

ordian commented Dec 11, 2020

@frostRed to be clear, I didn't mean the PR should be closed, just the change in fuzz tests should be reverted.

@frostRed frostRed restored the div-mod-word branch December 11, 2020 11:25
@frostRed
Copy link
Contributor Author

Sorry for misunderstand, I will just revert fuzz tests.

@frostRed frostRed reopened this Dec 11, 2020
@frostRed
Copy link
Contributor Author

Hardware:CPU 8 cores, Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz, 16G Memory
OS:Linux version 5.4.80-2-MANJARO
Rust:stable-x86_64-unknown-linux-gnu (default)

cargo bench -- div result:

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running /home/paysonl/Work/TurboGeth/parity-common/target/release/deps/bigint-955fd06249dc3a95
Gnuplot not found, using plotters backend
u256_div                time:   [79.113 ns 80.665 ns 82.545 ns]                     
                        change: [-6.8722% -4.0399% -1.3118%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  9 (9.00%) high severe

u512_div_mod//(1340780792994259709957402499820584612747936582059239337772356144372176403007354697680...                                                                            
                        time:   [94.804 ns 97.305 ns 100.24 ns]
                        change: [-8.3888% -5.2581% -2.1371%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  6 (6.00%) high mild
  4 (4.00%) high severe
u512_div_mod//(18446744073709551615, 4294967295)                                                                            
                        time:   [80.631 ns 85.097 ns 89.689 ns]
                        change: [+21.595% +27.837% +34.164%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild
u512_div_mod//(18446744073709551615, 18446744073709551614)                                                                            
                        time:   [93.279 ns 95.371 ns 98.026 ns]
                        change: [-1.3642% +0.6383% +3.0009%] (p = 0.57 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
u512_div_mod//(18446744073709551615, 18446744073709551614) #2                                                                            
                        time:   [98.748 ns 102.06 ns 105.45 ns]
                        change: [+2.2879% +5.0513% +8.0426%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  6 (6.00%) high mild
  7 (7.00%) high severe
u512_div_mod//(3759751734479964094783137206182536765532905409829204647089173492, 2167484464668298946...                                                                            
                        time:   [133.25 ns 137.63 ns 142.49 ns]
                        change: [+18.953% +22.845% +26.439%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
u512_div_mod//(1942668892225729070919461906823518906642406839052139521251812409738904285205208498175...                                                                            
                        time:   [105.52 ns 107.42 ns 109.72 ns]
                        change: [+5.6022% +8.5035% +11.329%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
u512_div_mod//(2045869129935088668758243560517249470135401278776915493427057105060083622743875339830...                                                                            
                        time:   [375.91 ns 384.56 ns 394.69 ns]
                        change: [+0.0208% +2.8799% +5.8761%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

u512_div                time:   [109.21 ns 110.54 ns 112.29 ns]                     
                        change: [-13.816% -11.509% -9.1624%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high severe

u128_div//(0, 18446744073709551615, 100)                                                                             
                        time:   [6.4493 ns 6.6085 ns 6.8338 ns]
                        change: [-0.2883% +2.3483% +5.4248%] (p = 0.12 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe
u128_div//(18446744073709551615, 18446744073709551615, 99)                                                                            
                        time:   [306.36 ns 309.53 ns 313.88 ns]
                        change: [-52.692% -49.737% -46.654%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe
u128_div//(42, 42, 100500)                                                                            
                        time:   [245.77 ns 249.00 ns 253.12 ns]
                        change: [-22.851% -20.064% -16.626%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 15 outliers among 100 measurements (15.00%)
  1 (1.00%) high mild
  14 (14.00%) high severe

@ordian
Copy link
Member

ordian commented Dec 25, 2020

Unfortunately, the benchmarks are inconclusive. There are some severe regressions, would need to investigate further. This will likely not go into the next release, but we can publish it later as a patch release.

@ordian ordian linked an issue Jul 11, 2021 that may be closed by this pull request
@ordian ordian mentioned this pull request Dec 12, 2021
@ordian
Copy link
Member

ordian commented Feb 4, 2022

Getting good results on my laptop with rust 1.58 and CARGO_PROFILE_RELEASE_LTO=true cargo bench -- div:

u256_div                time:   [64.181 ns 64.274 ns 64.369 ns]                     
                        change: [-2.8874% -1.0136% +0.4653%] (p = 0.30 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

u512_div_mod//(1340780792994259709957402499820584612747936582059239337772356144372176403007354697680...                                                                            
                        time:   [76.674 ns 76.833 ns 77.016 ns]
                        change: [-59.160% -58.841% -58.557%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  6 (6.00%) high severe
u512_div_mod//(18446744073709551615, 4294967295)                                                                            
                        time:   [73.206 ns 73.451 ns 73.761 ns]
                        change: [-48.940% -48.548% -47.952%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
u512_div_mod//(18446744073709551615, 18446744073709551614)                                                                            
                        time:   [79.133 ns 79.251 ns 79.376 ns]
                        change: [-43.538% -43.420% -43.307%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) low mild
  4 (4.00%) high mild
u512_div_mod//(18446744073709551615, 18446744073709551614) #2                                                                            
                        time:   [79.176 ns 79.251 ns 79.327 ns]
                        change: [-43.839% -43.687% -43.540%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe
u512_div_mod//(3759751734479964094783137206182536765532905409829204647089173492, 2167484464668298946...                                                                            
                        time:   [89.617 ns 89.767 ns 89.941 ns]
                        change: [+1.5136% +1.8538% +2.2394%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  6 (6.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe
u512_div_mod//(1942668892225729070919461906823518906642406839052139521251812409738904285205208498175...                                                                            
                        time:   [75.859 ns 75.938 ns 76.024 ns]
                        change: [-11.705% -11.482% -11.263%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) low mild
  5 (5.00%) high mild
  1 (1.00%) high severe
u512_div_mod//(2045869129935088668758243560517249470135401278776915493427057105060083622743875339830...                                                                            
                        time:   [190.73 ns 191.19 ns 191.71 ns]
                        change: [-3.4927% -3.2306% -2.9578%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high severe

u512_div                time:   [86.340 ns 87.108 ns 88.241 ns]                     
                        change: [+3.1944% +4.0582% +5.2751%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe

@ordian ordian added the changelog Needs to be added to the changelog label Feb 4, 2022
@ordian ordian merged commit 2d7f7e2 into paritytech:master Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog Needs to be added to the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[uint] remove a workaround for 128-bit division
2 participants