perf: parallelize workload after threshold (thrpt ~60% on large input) #6

Merged: 4 commits, Nov 20, 2022

Conversation

uhmarcel
Owner

No description provided.
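
The PR title describes switching to a parallel path once the input passes a size threshold. The sketch below illustrates that general idea only and is not taken from this repository: `PARALLEL_THRESHOLD`, `encode_chunk`, the `workers` parameter, and the use of hex encoding as the per-chunk work are all hypothetical stand-ins.

```rust
use std::thread;

// Hypothetical cutoff: inputs shorter than this stay on the sequential path,
// where thread start-up cost would outweigh any gain.
const PARALLEL_THRESHOLD: usize = 64 * 1024;

const HEX: &[u8; 16] = b"0123456789abcdef";

// Stand-in for the real per-chunk work: hex-encode a byte slice.
// Each input byte maps to two output bytes, so chunks are independent.
fn encode_chunk(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len() * 2);
    for &b in input {
        out.push(HEX[(b >> 4) as usize]);
        out.push(HEX[(b & 0x0f) as usize]);
    }
    out
}

fn encode(input: &[u8], workers: usize) -> Vec<u8> {
    if input.len() < PARALLEL_THRESHOLD || workers < 2 {
        return encode_chunk(input);
    }
    // Ceiling division so every byte lands in exactly one chunk.
    let chunk_len = (input.len() + workers - 1) / workers;
    thread::scope(|s| {
        let handles: Vec<_> = input
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || encode_chunk(chunk)))
            .collect();
        // Join in spawn order so the output preserves the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

Calling `encode(&data, 4)` on a buffer below the threshold runs on the calling thread; larger buffers are split into at most four chunks. Where the cutoff should sit is exactly the kind of question the benchmark sizes in the profiling report are meant to answer.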


github-actions bot commented Nov 20, 2022

Test Results

12 tests  ±0   12 ✔️ ±0   0s ⏱️ ±0s
 4 suites ±0    0 💤 ±0
 1 files  ±0    0 ❌ ±0

Results for commit a78a8d4. ± Comparison against base commit 344e89f.

This pull request removes 1 test and adds 1 test. Note that renamed tests count towards both.
Removed: tests ‑ should_construct_matching_encode_decode_tables
Added: common::tests ‑ should_construct_matching_encode_decode_tables

♻️ This comment has been updated with latest results.


github-actions bot commented Nov 20, 2022

Profiling Report

encode/3                time:   [50.253 ns 50.373 ns 50.503 ns]
                        thrpt:  [56.650 MiB/s 56.796 MiB/s 56.932 MiB/s]
                 change:
+                        time:   [-4.3126% -3.9796% -3.6412%] (p = 0.00 < 0.05)
+                        thrpt:  [+3.7788% +4.1446% +4.5070%]
+                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
encode/50               time:   [117.68 ns 118.00 ns 118.32 ns]
                        thrpt:  [403.02 MiB/s 404.10 MiB/s 405.18 MiB/s]
                 change:
+                        time:   [-8.0142% -6.4541% -5.1326%] (p = 0.00 < 0.05)
+                        thrpt:  [+5.4103% +6.8994% +8.7124%]
+                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe
encode/100              time:   [168.59 ns 169.26 ns 169.91 ns]
                        thrpt:  [561.27 MiB/s 563.45 MiB/s 565.68 MiB/s]
                 change:
+                        time:   [-11.021% -10.676% -10.299%] (p = 0.00 < 0.05)
+                        thrpt:  [+11.481% +11.953% +12.386%]
+                        Performance has improved.
Found 17 outliers among 100 measurements (17.00%)
  6 (6.00%) low mild
  7 (7.00%) high mild
  4 (4.00%) high severe
encode/500              time:   [467.47 ns 468.79 ns 470.06 ns]
                        thrpt:  [1014.4 MiB/s 1017.2 MiB/s 1020.0 MiB/s]
                 change:
+                        time:   [-6.3477% -5.8551% -5.3883%] (p = 0.00 < 0.05)
+                        thrpt:  [+5.6952% +6.2192% +6.7779%]
+                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  3 (3.00%) high mild
encode/3072             time:   [2.4967 µs 2.5042 µs 2.5120 µs]
                        thrpt:  [1.1390 GiB/s 1.1425 GiB/s 1.1459 GiB/s]
                 change:
+                        time:   [-2.0523% -1.5864% -1.1121%] (p = 0.00 < 0.05)
+                        thrpt:  [+1.1246% +1.6120% +2.0953%]
+                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) low mild
  1 (1.00%) high mild
encode/51200            time:   [41.380 µs 41.704 µs 42.038 µs]
                        thrpt:  [1.1343 GiB/s 1.1434 GiB/s 1.1523 GiB/s]
                 change:
-                        time:   [+1.8121% +2.6548% +3.5304%] (p = 0.00 < 0.05)
-                        thrpt:  [-3.4100% -2.5862% -1.7799%]
-                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
encode/102400           time:   [80.357 µs 80.587 µs 80.818 µs]
                        thrpt:  [1.1800 GiB/s 1.1834 GiB/s 1.1868 GiB/s]
                 change:
                        time:   [-1.2965% -0.8447% -0.4286%] (p = 0.00 < 0.05)
                        thrpt:  [+0.4304% +0.8519% +1.3135%]
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
encode/512000           time:   [898.93 µs 901.68 µs 904.62 µs]
                        thrpt:  [539.77 MiB/s 541.53 MiB/s 543.18 MiB/s]
                 change:
+                        time:   [-6.3577% -4.5873% -2.2842%] (p = 0.00 < 0.05)
+                        thrpt:  [+2.3376% +4.8079% +6.7893%]
+                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high severe
encode/1048576          time:   [1.8108 ms 1.8168 ms 1.8233 ms]
                        thrpt:  [548.47 MiB/s 550.43 MiB/s 552.25 MiB/s]
                 change:
                        time:   [-2.1512% -1.3803% -0.6367%] (p = 0.00 < 0.05)
                        thrpt:  [+0.6408% +1.3996% +2.1984%]
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
encode/5242880          time:   [8.9506 ms 8.9840 ms 9.0211 ms]
                        thrpt:  [554.26 MiB/s 556.55 MiB/s 558.62 MiB/s]
                 change:
                        time:   [-2.2373% -1.3248% -0.4598%] (p = 0.00 < 0.05)
                        thrpt:  [+0.4619% +1.3426% +2.2885%]
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
encode/10485760         time:   [19.277 ms 19.376 ms 19.475 ms]
                        thrpt:  [513.47 MiB/s 516.11 MiB/s 518.75 MiB/s]
                 change:
                        time:   [-0.6666% +0.0632% +0.8543%] (p = 0.87 > 0.05)
                        thrpt:  [-0.8471% -0.0631% +0.6711%]
                        No change in performance detected.
encode/20971520         time:   [41.736 ms 42.028 ms 42.330 ms]
                        thrpt:  [472.48 MiB/s 475.87 MiB/s 479.21 MiB/s]
                 change:
-                        time:   [+2.2732% +3.1148% +4.0849%] (p = 0.00 < 0.05)
-                        thrpt:  [-3.9245% -3.0207% -2.2227%]
-                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
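
The `thrpt` rows follow from the `time` rows and the input size in the benchmark name: bytes divided by elapsed time. A minimal sketch of that arithmetic, assuming the number after `encode/` is the input length in bytes (the helper name is illustrative, not part of Criterion):

```rust
// Throughput in MiB/s for `bytes` processed in `time_ns` nanoseconds.
fn throughput_mib_per_s(bytes: u64, time_ns: f64) -> f64 {
    let bytes_per_s = bytes as f64 / (time_ns * 1e-9);
    bytes_per_s / (1024.0 * 1024.0)
}
```

For `encode/3`, 3 bytes in 50.373 ns gives about 56.80 MiB/s, in line with the reported 56.796 MiB/s; for `encode/20971520`, 20 MiB in 42.028 ms gives about 475.87 MiB/s.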

decode/3                time:   [72.410 ns 72.675 ns 72.929 ns]
                        thrpt:  [39.230 MiB/s 39.367 MiB/s 39.511 MiB/s]
                 change:
                        time:   [-1.3030% -0.7631% -0.1942%] (p = 0.01 < 0.05)
                        thrpt:  [+0.1945% +0.7689% +1.3202%]
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
decode/50               time:   [106.16 ns 107.17 ns 108.27 ns]
                        thrpt:  [440.41 MiB/s 444.95 MiB/s 449.15 MiB/s]
                 change:
-                        time:   [+1.5482% +2.3274% +3.1441%] (p = 0.00 < 0.05)
-                        thrpt:  [-3.0483% -2.2745% -1.5246%]
-                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe
decode/100              time:   [153.14 ns 153.65 ns 154.15 ns]
                        thrpt:  [618.67 MiB/s 620.69 MiB/s 622.73 MiB/s]
                 change:
                        time:   [-0.2101% +0.2724% +0.8374%] (p = 0.31 > 0.05)
                        thrpt:  [-0.8304% -0.2717% +0.2106%]
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe
decode/500              time:   [567.25 ns 569.20 ns 571.10 ns]
                        thrpt:  [834.95 MiB/s 837.74 MiB/s 840.61 MiB/s]
                 change:
-                        time:   [+3.6298% +4.2192% +4.8432%] (p = 0.00 < 0.05)
-                        thrpt:  [-4.6194% -4.0484% -3.5026%]
-                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
decode/3072             time:   [3.0063 µs 3.0175 µs 3.0288 µs]
                        thrpt:  [967.29 MiB/s 970.90 MiB/s 974.51 MiB/s]
                 change:
                        time:   [-0.2432% +0.2310% +0.7220%] (p = 0.33 > 0.05)
                        thrpt:  [-0.7168% -0.2304% +0.2438%]
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) low mild
decode/51200            time:   [47.498 µs 47.611 µs 47.726 µs]
                        thrpt:  [1023.1 MiB/s 1.0015 GiB/s 1.0039 GiB/s]
                 change:
                        time:   [-0.2951% +0.1118% +0.5115%] (p = 0.58 > 0.05)
                        thrpt:  [-0.5089% -0.1117% +0.2960%]
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  6 (6.00%) low mild
  4 (4.00%) high mild
decode/102400           time:   [94.769 µs 95.053 µs 95.344 µs]
                        thrpt:  [1.0002 GiB/s 1.0033 GiB/s 1.0063 GiB/s]
                 change:
                        time:   [-0.1792% +0.2886% +0.7451%] (p = 0.23 > 0.05)
                        thrpt:  [-0.7396% -0.2877% +0.1795%]
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  5 (5.00%) high mild
decode/512000           time:   [684.70 µs 686.86 µs 689.50 µs]
                        thrpt:  [708.16 MiB/s 710.89 MiB/s 713.13 MiB/s]
                 change:
+                        time:   [-4.9462% -3.5109% -2.1901%] (p = 0.00 < 0.05)
+                        thrpt:  [+2.2392% +3.6386% +5.2036%]
+                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) low severe
  2 (2.00%) high mild
  3 (3.00%) high severe
decode/1048576          time:   [1.3988 ms 1.4096 ms 1.4226 ms]
                        thrpt:  [702.96 MiB/s 709.40 MiB/s 714.89 MiB/s]
                 change:
                        time:   [-0.8079% +0.4409% +1.6216%] (p = 0.49 > 0.05)
                        thrpt:  [-1.5957% -0.4390% +0.8145%]
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe
decode/5242880          time:   [6.9969 ms 7.0504 ms 7.1088 ms]
                        thrpt:  [703.36 MiB/s 709.18 MiB/s 714.60 MiB/s]
                 change:
-                        time:   [+1.2388% +2.4243% +3.5346%] (p = 0.00 < 0.05)
-                        thrpt:  [-3.4139% -2.3669% -1.2236%]
-                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
decode/10485760         time:   [14.085 ms 14.194 ms 14.324 ms]
                        thrpt:  [698.13 MiB/s 704.53 MiB/s 710.00 MiB/s]
                 change:
-                        time:   [+1.0710% +1.9180% +2.9093%] (p = 0.00 < 0.05)
-                        thrpt:  [-2.8270% -1.8819% -1.0597%]
-                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
decode/20971520         time:   [28.734 ms 28.842 ms 28.952 ms]
                        thrpt:  [690.79 MiB/s 693.44 MiB/s 696.04 MiB/s]
                 change:
                        time:   [-1.1755% -0.2375% +0.5374%] (p = 0.61 > 0.05)
                        thrpt:  [-0.5345% +0.2380% +1.1895%]
                        No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
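
The `change` blocks list time and throughput deltas separately, but the figures above are consistent with throughput being the reciprocal of time: a relative time change `t` corresponds to a throughput change of `1 / (1 + t) - 1`, with the interval bounds swapped. A small sketch of the conversion (the function name is illustrative):

```rust
// Convert a relative time change (fraction, e.g. -0.039796 for -3.9796%)
// into the corresponding relative throughput change.
fn throughput_change(time_change: f64) -> f64 {
    1.0 / (1.0 + time_change) - 1.0
}
```

For `encode/3`, a time change of -3.9796% maps to roughly +4.1445% throughput, and the upper time bound of -3.6412% maps to the lower throughput bound of +3.7788%.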


1 participant