perf: parallelize workload after threshold (thrpt ~60% on large input) #6
Merged
Test Results
12 tests ±0, 12 ✔️ ±0, 0s ⏱️ ±0s
Results for commit a78a8d4. ± Comparison against base commit 344e89f.
This pull request removes 1 and adds 1 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results.
Profiling Report
encode/3 time: [50.253 ns 50.373 ns 50.503 ns]
thrpt: [56.650 MiB/s 56.796 MiB/s 56.932 MiB/s]
change:
+ time: [-4.3126% -3.9796% -3.6412%] (p = 0.00 < 0.05)
+ thrpt: [+3.7788% +4.1446% +4.5070%]
+ Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
encode/50 time: [117.68 ns 118.00 ns 118.32 ns]
thrpt: [403.02 MiB/s 404.10 MiB/s 405.18 MiB/s]
change:
+ time: [-8.0142% -6.4541% -5.1326%] (p = 0.00 < 0.05)
+ thrpt: [+5.4103% +6.8994% +8.7124%]
+ Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe
encode/100 time: [168.59 ns 169.26 ns 169.91 ns]
thrpt: [561.27 MiB/s 563.45 MiB/s 565.68 MiB/s]
change:
+ time: [-11.021% -10.676% -10.299%] (p = 0.00 < 0.05)
+ thrpt: [+11.481% +11.953% +12.386%]
+ Performance has improved.
Found 17 outliers among 100 measurements (17.00%)
6 (6.00%) low mild
7 (7.00%) high mild
4 (4.00%) high severe
encode/500 time: [467.47 ns 468.79 ns 470.06 ns]
thrpt: [1014.4 MiB/s 1017.2 MiB/s 1020.0 MiB/s]
change:
+ time: [-6.3477% -5.8551% -5.3883%] (p = 0.00 < 0.05)
+ thrpt: [+5.6952% +6.2192% +6.7779%]
+ Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
encode/3072 time: [2.4967 µs 2.5042 µs 2.5120 µs]
thrpt: [1.1390 GiB/s 1.1425 GiB/s 1.1459 GiB/s]
change:
+ time: [-2.0523% -1.5864% -1.1121%] (p = 0.00 < 0.05)
+ thrpt: [+1.1246% +1.6120% +2.0953%]
+ Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) low mild
1 (1.00%) high mild
encode/51200 time: [41.380 µs 41.704 µs 42.038 µs]
thrpt: [1.1343 GiB/s 1.1434 GiB/s 1.1523 GiB/s]
change:
- time: [+1.8121% +2.6548% +3.5304%] (p = 0.00 < 0.05)
- thrpt: [-3.4100% -2.5862% -1.7799%]
- Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
encode/102400 time: [80.357 µs 80.587 µs 80.818 µs]
thrpt: [1.1800 GiB/s 1.1834 GiB/s 1.1868 GiB/s]
change:
time: [-1.2965% -0.8447% -0.4286%] (p = 0.00 < 0.05)
thrpt: [+0.4304% +0.8519% +1.3135%]
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
encode/512000 time: [898.93 µs 901.68 µs 904.62 µs]
thrpt: [539.77 MiB/s 541.53 MiB/s 543.18 MiB/s]
change:
+ time: [-6.3577% -4.5873% -2.2842%] (p = 0.00 < 0.05)
+ thrpt: [+2.3376% +4.8079% +6.7893%]
+ Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
2 (2.00%) high severe
encode/1048576 time: [1.8108 ms 1.8168 ms 1.8233 ms]
thrpt: [548.47 MiB/s 550.43 MiB/s 552.25 MiB/s]
change:
time: [-2.1512% -1.3803% -0.6367%] (p = 0.00 < 0.05)
thrpt: [+0.6408% +1.3996% +2.1984%]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
encode/5242880 time: [8.9506 ms 8.9840 ms 9.0211 ms]
thrpt: [554.26 MiB/s 556.55 MiB/s 558.62 MiB/s]
change:
time: [-2.2373% -1.3248% -0.4598%] (p = 0.00 < 0.05)
thrpt: [+0.4619% +1.3426% +2.2885%]
Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
encode/10485760 time: [19.277 ms 19.376 ms 19.475 ms]
thrpt: [513.47 MiB/s 516.11 MiB/s 518.75 MiB/s]
change:
time: [-0.6666% +0.0632% +0.8543%] (p = 0.87 > 0.05)
thrpt: [-0.8471% -0.0631% +0.6711%]
No change in performance detected.
encode/20971520 time: [41.736 ms 42.028 ms 42.330 ms]
thrpt: [472.48 MiB/s 475.87 MiB/s 479.21 MiB/s]
change:
- time: [+2.2732% +3.1148% +4.0849%] (p = 0.00 < 0.05)
- thrpt: [-3.9245% -3.0207% -2.2227%]
- Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
decode/3 time: [72.410 ns 72.675 ns 72.929 ns]
thrpt: [39.230 MiB/s 39.367 MiB/s 39.511 MiB/s]
change:
time: [-1.3030% -0.7631% -0.1942%] (p = 0.01 < 0.05)
thrpt: [+0.1945% +0.7689% +1.3202%]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
decode/50 time: [106.16 ns 107.17 ns 108.27 ns]
thrpt: [440.41 MiB/s 444.95 MiB/s 449.15 MiB/s]
change:
- time: [+1.5482% +2.3274% +3.1441%] (p = 0.00 < 0.05)
- thrpt: [-3.0483% -2.2745% -1.5246%]
- Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
4 (4.00%) high mild
3 (3.00%) high severe
decode/100 time: [153.14 ns 153.65 ns 154.15 ns]
thrpt: [618.67 MiB/s 620.69 MiB/s 622.73 MiB/s]
change:
time: [-0.2101% +0.2724% +0.8374%] (p = 0.31 > 0.05)
thrpt: [-0.8304% -0.2717% +0.2106%]
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) low mild
2 (2.00%) high mild
2 (2.00%) high severe
decode/500 time: [567.25 ns 569.20 ns 571.10 ns]
thrpt: [834.95 MiB/s 837.74 MiB/s 840.61 MiB/s]
change:
- time: [+3.6298% +4.2192% +4.8432%] (p = 0.00 < 0.05)
- thrpt: [-4.6194% -4.0484% -3.5026%]
- Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) low mild
3 (3.00%) high mild
decode/3072 time: [3.0063 µs 3.0175 µs 3.0288 µs]
thrpt: [967.29 MiB/s 970.90 MiB/s 974.51 MiB/s]
change:
time: [-0.2432% +0.2310% +0.7220%] (p = 0.33 > 0.05)
thrpt: [-0.7168% -0.2304% +0.2438%]
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) low mild
decode/51200 time: [47.498 µs 47.611 µs 47.726 µs]
thrpt: [1023.1 MiB/s 1.0015 GiB/s 1.0039 GiB/s]
change:
time: [-0.2951% +0.1118% +0.5115%] (p = 0.58 > 0.05)
thrpt: [-0.5089% -0.1117% +0.2960%]
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
6 (6.00%) low mild
4 (4.00%) high mild
decode/102400 time: [94.769 µs 95.053 µs 95.344 µs]
thrpt: [1.0002 GiB/s 1.0033 GiB/s 1.0063 GiB/s]
change:
time: [-0.1792% +0.2886% +0.7451%] (p = 0.23 > 0.05)
thrpt: [-0.7396% -0.2877% +0.1795%]
No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low severe
2 (2.00%) low mild
5 (5.00%) high mild
decode/512000 time: [684.70 µs 686.86 µs 689.50 µs]
thrpt: [708.16 MiB/s 710.89 MiB/s 713.13 MiB/s]
change:
+ time: [-4.9462% -3.5109% -2.1901%] (p = 0.00 < 0.05)
+ thrpt: [+2.2392% +3.6386% +5.2036%]
+ Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
2 (2.00%) low severe
2 (2.00%) high mild
3 (3.00%) high severe
decode/1048576 time: [1.3988 ms 1.4096 ms 1.4226 ms]
thrpt: [702.96 MiB/s 709.40 MiB/s 714.89 MiB/s]
change:
time: [-0.8079% +0.4409% +1.6216%] (p = 0.49 > 0.05)
thrpt: [-1.5957% -0.4390% +0.8145%]
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
decode/5242880 time: [6.9969 ms 7.0504 ms 7.1088 ms]
thrpt: [703.36 MiB/s 709.18 MiB/s 714.60 MiB/s]
change:
- time: [+1.2388% +2.4243% +3.5346%] (p = 0.00 < 0.05)
- thrpt: [-3.4139% -2.3669% -1.2236%]
- Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
decode/10485760 time: [14.085 ms 14.194 ms 14.324 ms]
thrpt: [698.13 MiB/s 704.53 MiB/s 710.00 MiB/s]
change:
- time: [+1.0710% +1.9180% +2.9093%] (p = 0.00 < 0.05)
- thrpt: [-2.8270% -1.8819% -1.0597%]
- Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
decode/20971520 time: [28.734 ms 28.842 ms 28.952 ms]
thrpt: [690.79 MiB/s 693.44 MiB/s 696.04 MiB/s]
change:
time: [-1.1755% -0.2375% +0.5374%] (p = 0.61 > 0.05)
thrpt: [-0.5345% +0.2380% +1.1895%]
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
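The `thrpt` figures in the report above appear to follow directly from the input size in the benchmark name and the measured time, and the `thrpt` change is the reciprocal of the `time` change. The sketch below reproduces that arithmetic for `encode/3`. The function names are hypothetical helpers written for this illustration, not part of any benchmarking library, and the sketch assumes the number after the slash is the input size in bytes.

```rust
/// Bytes per mebibyte.
const MIB: f64 = 1024.0 * 1024.0;

/// Throughput in MiB/s for `bytes` processed in `nanos` nanoseconds.
/// (Hypothetical helper, assuming the benchmark id is the input size in bytes.)
fn throughput_mib_s(bytes: u64, nanos: f64) -> f64 {
    bytes as f64 / (nanos * 1e-9) / MIB
}

/// Relative throughput change implied by a relative time change.
/// Both values are fractions, e.g. -0.04 means "4% less time".
fn thrpt_change(time_change: f64) -> f64 {
    1.0 / (1.0 + time_change) - 1.0
}

fn main() {
    // encode/3: 3 bytes in a median of 50.373 ns, reported as about 56.8 MiB/s.
    println!("{:.2} MiB/s", throughput_mib_s(3, 50.373));

    // encode/3: median time change of -3.9796%, reported as about +4.14% thrpt.
    println!("{:+.2}%", thrpt_change(-0.039796) * 100.0);
}
```

Small differences in the last digit against the report are expected, since the report is computed from unrounded measurements.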
uhmarcel force-pushed the feature/parallelize branch from a78a8d4 to ee81d51 on November 20, 2022 16:00
No description provided.
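Since the PR carries no description, here is a minimal sketch of what "parallelize workload after threshold" could look like in general. It is not the code from this PR: `PARALLEL_THRESHOLD`, `transform`, and `transform_chunk` are hypothetical names, the threshold value is an arbitrary assumption, and the per-byte XOR stands in for whatever the real encode and decode steps do. It uses only the standard library's scoped threads.

```rust
use std::thread;

/// Hypothetical cut-over point: inputs smaller than this stay single-threaded.
const PARALLEL_THRESHOLD: usize = 512 * 1024;

/// Placeholder for the real per-chunk work (assumed to be independent per byte).
fn transform_chunk(input: &[u8]) -> Vec<u8> {
    input.iter().map(|b| b ^ 0x5A).collect()
}

/// Runs sequentially below the threshold and splits the input across
/// scoped worker threads at or above it.
fn transform(input: &[u8]) -> Vec<u8> {
    if input.len() < PARALLEL_THRESHOLD {
        return transform_chunk(input);
    }

    // One chunk per available core, rounding the chunk length up.
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let chunk_len = (input.len() + workers - 1) / workers;

    thread::scope(|s| {
        // Spawn every worker first, then join in order so the output stays ordered.
        let handles: Vec<_> = input
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || transform_chunk(chunk)))
            .collect();

        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let small = vec![1u8; 16];
    let large = vec![1u8; PARALLEL_THRESHOLD * 2];
    // Both paths must agree with the sequential result.
    assert_eq!(transform(&small), transform_chunk(&small));
    assert_eq!(transform(&large), transform_chunk(&large));
    println!("sequential and parallel paths agree");
}
```

The point of a threshold in a design like this is that spawning and joining threads has a fixed cost, so small inputs are usually faster on a single thread. Where the cut-over belongs is something benchmarks such as the report in this PR would have to decide.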