refactor: Optimize polynomial operations with parallel scan #330

huitseeker · 2024-02-17T17:22:01Z

- Optimized performance by introducing parallel scan methods across multiple components, focusing on the `rlc<T, F>` method and the `powers` function.
- Switched from `DoubleEndedIteratorExt` to `IndexedParallelIteratorExt` across several files so that iterators are processed in parallel, and updated the corresponding imports.
- Kept `DoubleEndedIteratorExt` as an option, since it is more efficient on small polynomials.
- Added `rayon-scan` to the general dependencies.
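For readers unfamiliar with the technique, here is a minimal sketch of a chunked two-pass prefix scan used to compute a `powers`-style vector. It is not the code from this PR: it uses `std::thread::scope` instead of `rayon-scan`, a toy prime modulus `P` instead of a real field, and the names `mul`, `scan_seq`, `scan_chunked`, and the `chunks` parameter are all assumptions made for illustration.

```rust
use std::thread;

/// Toy prime modulus standing in for a real scalar field (assumption).
const P: u64 = 2_147_483_647; // 2^31 - 1

/// Modular multiplication; operands are reduced first, so the product fits in u64.
fn mul(a: u64, b: u64) -> u64 {
    (a % P) * (b % P) % P
}

/// Sequential baseline: inclusive prefix products of `input`.
fn scan_seq(input: &[u64]) -> Vec<u64> {
    let mut acc = 1;
    input
        .iter()
        .map(|&x| {
            acc = mul(acc, x);
            acc
        })
        .collect()
}

/// Two-pass chunked scan: the same result as `scan_seq`, with the per-chunk work
/// done on separate threads.
fn scan_chunked(input: &[u64], chunks: usize) -> Vec<u64> {
    let mut out = input.to_vec();
    if out.is_empty() {
        return out;
    }
    let chunks = chunks.max(1);
    let chunk_len = (out.len() + chunks - 1) / chunks;

    // Pass 1: each chunk computes its own local prefix products in place.
    thread::scope(|scope| {
        for chunk in out.chunks_mut(chunk_len) {
            scope.spawn(move || {
                let mut acc = 1;
                for slot in chunk.iter_mut() {
                    acc = mul(acc, *slot);
                    *slot = acc;
                }
            });
        }
    });

    // Sequential step: the offset of a chunk is the product of all earlier chunks.
    let mut offsets = Vec::new();
    let mut carry = 1;
    for chunk in out.chunks(chunk_len) {
        offsets.push(carry);
        carry = mul(carry, *chunk.last().unwrap());
    }

    // Pass 2: each chunk multiplies its local results by its offset.
    thread::scope(|scope| {
        for (chunk, &offset) in out.chunks_mut(chunk_len).zip(&offsets) {
            scope.spawn(move || {
                for slot in chunk.iter_mut() {
                    *slot = mul(offset, *slot);
                }
            });
        }
    });
    out
}

/// [1, s, s^2, ..., s^(n-1)] expressed as a scan over [1, s, s, ..., s].
fn powers(s: u64, n: usize, chunks: usize) -> Vec<u64> {
    let mut input = vec![s; n];
    if let Some(first) = input.first_mut() {
        *first = 1;
    }
    scan_chunked(&input, chunks)
}
```

For example, `powers(3, 6, 4)` returns `[1, 3, 9, 27, 81, 243]`. The sketch also hints at why a sequential path can win on small inputs: the chunked version does roughly twice the multiplications and pays for thread coordination, which only amortizes once the input is large enough.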

Benches

https://gist.github.com/huitseeker/9094bf6b1c68d83f4c0290e68c6c5beb

Excerpt:

Benchmarking PCS-Proving 10/arecibo::provider::hyperkzg::EvaluationEngine<halo2cur
                        time:   [5.3010 ms 5.3071 ms 5.3111 ms]
                        change: [-1.9609% -1.7245% -1.5197%] (p = 0.00 < 0.05)
                        Performance has improved.
Benchmarking PCS-Proving 10/arecibo::provider::shplonk::EvaluationEngine<halo2curv
                        time:   [5.0915 ms 5.1006 ms 5.1113 ms]
                        change: [-1.4249% -0.9489% -0.4561%] (p = 0.00 < 0.05)
                        Change within noise threshold.

Benchmarking PCS-Verifying 10/arecibo::provider::hyperkzg::EvaluationEngine<halo2c
                        time:   [1.7398 ms 1.7941 ms 1.8332 ms]
                        change: [+1.4410% +5.8646% +10.456%] (p = 0.03 < 0.05)
                        Performance has regressed.
Benchmarking PCS-Verifying 10/arecibo::provider::shplonk::EvaluationEngine<halo2cu
                        time:   [1.6421 ms 1.6721 ms 1.7167 ms]
                        change: [-25.528% -23.891% -22.609%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking PCS-Proving 12/arecibo::provider::hyperkzg::EvaluationEngine<halo2cur
                        time:   [13.860 ms 13.897 ms 13.922 ms]
                        change: [-2.9355% -2.0591% -1.1507%] (p = 0.00 < 0.05)
                        Performance has improved.
Benchmarking PCS-Proving 12/arecibo::provider::shplonk::EvaluationEngine<halo2curv
                        time:   [12.516 ms 12.894 ms 13.414 ms]
                        change: [-1.9522% -0.1449% +2.1939%] (p = 0.92 > 0.05)
                        No change in performance detected.

Benchmarking PCS-Verifying 12/arecibo::provider::hyperkzg::EvaluationEngine<halo2c
                        time:   [1.7331 ms 1.7738 ms 1.8319 ms]
                        change: [+4.0118% +8.5040% +13.413%] (p = 0.00 < 0.05)
                        Performance has regressed.
Benchmarking PCS-Verifying 12/arecibo::provider::shplonk::EvaluationEngine<halo2cu
                        time:   [1.6813 ms 1.7043 ms 1.7406 ms]
                        change: [-27.955% -25.826% -23.707%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking PCS-Proving 14/arecibo::provider::hyperkzg::EvaluationEngine<halo2cur
                        time:   [45.807 ms 46.013 ms 46.171 ms]
                        change: [-1.8652% -1.1684% -0.4207%] (p = 0.01 < 0.05)
                        Change within noise threshold.
Benchmarking PCS-Proving 14/arecibo::provider::shplonk::EvaluationEngine<halo2curv
                        time:   [40.215 ms 40.674 ms 41.234 ms]
                        change: [-1.5516% -0.1620% +1.3679%] (p = 0.84 > 0.05)
                        No change in performance detected.

Benchmarking PCS-Verifying 14/arecibo::provider::hyperkzg::EvaluationEngine<halo2c
                        time:   [1.6864 ms 1.7380 ms 1.7911 ms]
                        change: [+3.3281% +6.5028% +9.7707%] (p = 0.00 < 0.05)
                        Performance has regressed.
Benchmarking PCS-Verifying 14/arecibo::provider::shplonk::EvaluationEngine<halo2cu
                        time:   [1.6934 ms 1.7149 ms 1.7339 ms]
                        change: [-38.309% -36.955% -35.256%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking PCS-Proving 16/arecibo::provider::hyperkzg::EvaluationEngine<halo2cur
                        time:   [56.960 ms 57.401 ms 57.911 ms]
                        change: [-1.9650% -0.4809% +1.1637%] (p = 0.60 > 0.05)
                        No change in performance detected.
Benchmarking PCS-Proving 16/arecibo::provider::shplonk::EvaluationEngine<halo2curv
                        time:   [144.40 ms 146.11 ms 147.10 ms]
                        change: [-3.0841% -1.3092% +0.1681%] (p = 0.16 > 0.05)
                        No change in performance detected.

Benchmarking PCS-Verifying 16/arecibo::provider::hyperkzg::EvaluationEngine<halo2c
                        time:   [1.7697 ms 1.8230 ms 1.8705 ms]
                        change: [+8.8669% +11.972% +14.877%] (p = 0.00 < 0.05)
                        Performance has regressed.
Benchmarking PCS-Verifying 16/arecibo::provider::shplonk::EvaluationEngine<halo2cu
                        time:   [1.7504 ms 1.7893 ms 1.8288 ms]
                        change: [-39.552% -38.348% -37.334%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking PCS-Proving 18/arecibo::provider::hyperkzg::EvaluationEngine<halo2cur
                        time:   [92.503 ms 94.145 ms 95.466 ms]
                        change: [-3.4856% -1.2100% +1.0332%] (p = 0.34 > 0.05)
                        No change in performance detected.
Benchmarking PCS-Proving 18/arecibo::provider::shplonk::EvaluationEngine<halo2curv
                        time:   [126.80 ms 127.90 ms 129.16 ms]
                        change: [-0.2031% +0.6876% +1.6003%] (p = 0.17 > 0.05)
                        No change in performance detected.

Benchmarking PCS-Verifying 18/arecibo::provider::hyperkzg::EvaluationEngine<halo2c
                        time:   [1.8290 ms 1.8495 ms 1.8756 ms]
                        change: [+5.9647% +9.8855% +14.680%] (p = 0.00 < 0.05)
                        Performance has regressed.
Benchmarking PCS-Verifying 18/arecibo::provider::shplonk::EvaluationEngine<halo2cu
                        time:   [1.7422 ms 1.7515 ms 1.7584 ms]
                        change: [-45.244% -44.721% -44.209%] (p = 0.00 < 0.05)
                        Performance has improved.

Benchmarking PCS-Proving 20/arecibo::provider::hyperkzg::EvaluationEngine<halo2cur
                        time:   [197.93 ms 198.95 ms 199.83 ms]
                        change: [+1.0665% +1.9132% +2.8369%] (p = 0.00 < 0.05)
                        Performance has regressed.
Benchmarking PCS-Proving 20/arecibo::provider::shplonk::EvaluationEngine<halo2curv
                        time:   [333.84 ms 335.52 ms 337.42 ms]
                        change: [-0.7533% +0.0132% +0.8489%] (p = 0.98 > 0.05)
                        No change in performance detected.

Benchmarking PCS-Verifying 20/arecibo::provider::hyperkzg::EvaluationEngine<halo2c
                        time:   [1.8954 ms 1.9183 ms 1.9365 ms]
                        change: [+0.5823% +5.3112% +9.8971%] (p = 0.04 < 0.05)
                        Change within noise threshold.
Benchmarking PCS-Verifying 20/arecibo::provider::shplonk::EvaluationEngine<halo2cu
                        time:   [1.6963 ms 1.7098 ms 1.7195 ms]
                        change: [-47.135% -46.221% -45.047%] (p = 0.00 < 0.05)
                        Performance has improved.

Follows #260.
Fixes #223


storojs72

LGTM! Will need to rebase #326

Happy to see that verification in shplonk is becoming a little bit faster.

It is also odd to see that shplonk's prover "sags" on the larger benchmark sizes (16, 18, 20). Hopefully this will be eliminated by solving #302.

huitseeker requested review from adr1anh and storojs72 February 17, 2024 17:22

huitseeker force-pushed the rayon_parscan branch from 48c12dc to 80f6366 Compare February 17, 2024 17:23

storojs72 reviewed Feb 19, 2024

storojs72 self-requested a review February 19, 2024 17:20

storojs72 approved these changes Feb 19, 2024

huitseeker added this pull request to the merge queue Feb 19, 2024

Merged via the queue into lurk-lab:dev with commit 8d2bb89 Feb 19, 2024
9 checks passed

huitseeker deleted the rayon_parscan branch February 19, 2024 18:03

storojs72 mentioned this pull request Feb 20, 2024

Improving Shplonk implementation #326

Merged

huitseeker mentioned this pull request Feb 20, 2024

Replace unary SNARK / PP-SNARK with their batched variants. #331

Open
