@britto britto commented Jul 11, 2020

This PR supersedes #121, adding more performance improvements and also refactoring some parts:

  • Make property-based tests agnostic to implementation details by preventing direct access to private concerns.
  • Extract conversion between Proto and Wire format into its own Protobuf.Wire module.
  • Extract Varint encoding/decoding logic into its own Protobuf.Wire.Varint module.
  • Streamline and better isolate group-skipping logic.
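
For readers unfamiliar with the wire format, here is a minimal sketch of base-128 varint encoding and decoding. It is not the code in Protobuf.Wire.Varint: the module name VarintExample and its function signatures are assumptions made for illustration, and it only handles non-negative integers.

```elixir
defmodule VarintExample do
  # Hypothetical module for illustration; not the PR's Protobuf.Wire.Varint.
  import Bitwise

  # Values below 128 fit in one byte with the continuation bit cleared.
  def encode(n) when is_integer(n) and n >= 0 and n < 128, do: <<n>>

  # Otherwise emit the low 7 bits with the continuation bit set and recurse
  # on the remaining bits (least-significant group first).
  def encode(n) when is_integer(n) and n >= 128 do
    <<1::1, n::7, encode(n >>> 7)::binary>>
  end

  # Returns {value, rest}, where rest is whatever follows the varint.
  def decode(bin) when is_binary(bin), do: decode(bin, 0, 0)

  # Continuation bit set: accumulate this 7-bit group and keep reading.
  defp decode(<<1::1, chunk::7, rest::binary>>, shift, acc) do
    decode(rest, shift + 7, acc ||| (chunk <<< shift))
  end

  # Continuation bit cleared: this is the last group.
  defp decode(<<0::1, chunk::7, rest::binary>>, shift, acc) do
    {acc ||| (chunk <<< shift), rest}
  end
end
```

With this sketch, VarintExample.encode(300) yields the two bytes <<172, 2>>, and decoding those bytes gives back 300 together with any trailing bytes.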

Benchmark results

##### With input google_message3_5 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)             398.30 K        2.51 μs  ±5473.04%           1 μs           4 μs
decode (optimize-packed-b11d326-decode)      343.00 K        2.92 μs  ±5263.15%           1 μs           3 μs
decode (master-3428d3a-decode)               287.16 K        3.48 μs  ±4507.40%           2 μs           4 μs

Comparison: 
decode (refactor-1cfbf81-decode)             398.30 K
decode (optimize-packed-b11d326-decode)      343.00 K - 1.16x slower +0.40 μs
decode (master-3428d3a-decode)               287.16 K - 1.39x slower +0.97 μs

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)                  632 B
decode (optimize-packed-b11d326-decode)           920 B - 1.46x memory usage +288 B
decode (master-3428d3a-decode)                   1280 B - 2.03x memory usage +648 B
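
As a sanity check on how to read the "x slower" column, the ratio for this first input can be recomputed from the ips column: it is the reference job's ips divided by the other job's ips. The sketch below hard-codes the three ips values from the google_message3_5 decode table; the module name SlowdownSketch and the atom keys are hypothetical.

```elixir
defmodule SlowdownSketch do
  # ips values copied from the google_message3_5 decode table (in K ips).
  @ips %{refactor: 398.30, optimize_packed: 343.00, master: 287.16}

  # How many times slower `job` is than `reference`, rounded to two decimals.
  def slower_by(job, reference \\ :refactor) do
    Float.round(Map.fetch!(@ips, reference) / Map.fetch!(@ips, job), 2)
  end
end
```

SlowdownSketch.slower_by(:optimize_packed) and SlowdownSketch.slower_by(:master) reproduce the 1.16x and 1.39x figures reported for this input.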

##### With input google_message3_4 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)             382.05 K        2.62 μs  ±5187.68%           1 μs           4 μs
decode (optimize-packed-b11d326-decode)      336.45 K        2.97 μs  ±4952.23%           1 μs           4 μs
decode (master-3428d3a-decode)               286.98 K        3.48 μs  ±4476.92%           2 μs           4 μs

Comparison: 
decode (refactor-1cfbf81-decode)             382.05 K
decode (optimize-packed-b11d326-decode)      336.45 K - 1.18x slower +0.46 μs
decode (master-3428d3a-decode)               286.98 K - 1.39x slower +0.97 μs

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)                  632 B
decode (optimize-packed-b11d326-decode)           920 B - 1.46x memory usage +288 B
decode (master-3428d3a-decode)                   1280 B - 2.03x memory usage +648 B

##### With input google_message3_3 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)             203.75 K        4.91 μs  ±2697.16%           3 μs           6 μs
decode (optimize-packed-b11d326-decode)      180.60 K        5.54 μs  ±2681.47%           3 μs           5 μs
decode (master-3428d3a-decode)               151.92 K        6.58 μs  ±2447.36%           3 μs           6 μs

Comparison: 
decode (refactor-1cfbf81-decode)             203.75 K
decode (optimize-packed-b11d326-decode)      180.60 K - 2.21x slower +3.03 μs
decode (master-3428d3a-decode)               151.92 K - 2.62x slower +4.07 μs

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)                1.32 KB
decode (optimize-packed-b11d326-decode)         1.88 KB - 3.05x memory usage +1.27 KB
decode (master-3428d3a-decode)                  2.52 KB - 4.09x memory usage +1.91 KB

##### With input google_message4 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)             159.32 K        6.28 μs  ±2074.80%           4 μs           8 μs
decode (optimize-packed-b11d326-decode)      143.57 K        6.97 μs  ±2091.18%           4 μs           6 μs
decode (master-3428d3a-decode)               121.78 K        8.21 μs  ±1876.36%           4 μs           9 μs

Comparison: 
decode (refactor-1cfbf81-decode)             159.32 K
decode (optimize-packed-b11d326-decode)      143.57 K - 2.77x slower +4.45 μs
decode (master-3428d3a-decode)               121.78 K - 3.27x slower +5.70 μs

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)                1.66 KB
decode (optimize-packed-b11d326-decode)         2.31 KB - 3.75x memory usage +1.70 KB
decode (master-3428d3a-decode)                  3.02 KB - 4.90x memory usage +2.41 KB

##### With input google_message1_proto2 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)              66.30 K       15.08 μs   ±924.67%           9 μs          22 μs
decode (optimize-packed-b11d326-decode)       60.27 K       16.59 μs   ±980.27%           8 μs          32 μs
decode (master-3428d3a-decode)                51.89 K       19.27 μs   ±883.01%           9 μs          37 μs

Comparison: 
decode (refactor-1cfbf81-decode)              66.30 K
decode (optimize-packed-b11d326-decode)       60.27 K - 6.61x slower +14.08 μs
decode (master-3428d3a-decode)                51.89 K - 7.68x slower +16.76 μs

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)                5.23 KB
decode (optimize-packed-b11d326-decode)         7.03 KB - 11.39x memory usage +6.41 KB
decode (master-3428d3a-decode)                  8.48 KB - 13.75x memory usage +7.87 KB

##### With input google_message1_proto3 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)              65.87 K       15.18 μs   ±921.78%           9 μs          23 μs
decode (optimize-packed-b11d326-decode)       64.86 K       15.42 μs  ±1047.99%           8 μs          16 μs
decode (master-3428d3a-decode)                54.42 K       18.38 μs   ±939.82%           9 μs          20 μs

Comparison: 
decode (refactor-1cfbf81-decode)              65.87 K
decode (optimize-packed-b11d326-decode)       64.86 K - 6.14x slower +12.91 μs
decode (master-3428d3a-decode)                54.42 K - 7.32x slower +15.87 μs

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)                5.23 KB
decode (optimize-packed-b11d326-decode)         7.03 KB - 11.39x memory usage +6.41 KB
decode (master-3428d3a-decode)                  8.48 KB - 13.75x memory usage +7.87 KB

##### With input google_message2 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)              1383.84        0.72 ms     ±9.27%        0.70 ms        0.98 ms
decode (optimize-packed-b11d326-decode)        853.24        1.17 ms     ±3.54%        1.17 ms        1.30 ms
decode (master-3428d3a-decode)                 687.49        1.45 ms     ±6.05%        1.45 ms        1.68 ms

Comparison: 
decode (refactor-1cfbf81-decode)              1383.84
decode (optimize-packed-b11d326-decode)        853.24 - 466.82x slower +1.17 ms
decode (master-3428d3a-decode)                 687.49 - 579.36x slower +1.45 ms

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)               22.90 KB
decode (optimize-packed-b11d326-decode)        23.74 KB - 38.47x memory usage +23.13 KB
decode (master-3428d3a-decode)                731.05 KB - 1184.48x memory usage +730.43 KB

##### With input google_message3_2 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)               106.70        9.37 ms     ±4.21%        9.32 ms       10.99 ms
decode (optimize-packed-b11d326-decode)         89.18       11.21 ms     ±4.47%       11.12 ms       12.69 ms
decode (master-3428d3a-decode)                  75.38       13.27 ms     ±2.65%       13.23 ms       14.03 ms

Comparison: 
decode (refactor-1cfbf81-decode)               106.70
decode (optimize-packed-b11d326-decode)         89.18 - 4466.43x slower +11.21 ms
decode (master-3428d3a-decode)                  75.38 - 5283.91x slower +13.26 ms

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)                2.34 MB
decode (optimize-packed-b11d326-decode)         2.88 MB - 4779.73x memory usage +2.88 MB
decode (master-3428d3a-decode)                  7.26 MB - 12046.59x memory usage +7.26 MB

##### With input google_message3_1 #####
Name                                              ips        average  deviation         median         99th %
decode (refactor-1cfbf81-decode)                0.141         7.10 s     ±3.53%         6.97 s         7.39 s
decode (optimize-packed-b11d326-decode)         0.139         7.17 s     ±1.68%         7.19 s         7.28 s
decode (master-3428d3a-decode)                  0.118         8.50 s     ±0.44%         8.52 s         8.53 s

Comparison: 
decode (refactor-1cfbf81-decode)                0.141
decode (optimize-packed-b11d326-decode)         0.139 - 2856588.76x slower +7.17 s
decode (master-3428d3a-decode)                  0.118 - 3386587.08x slower +8.50 s

Memory usage statistics:

Name                                       Memory usage
decode (refactor-1cfbf81-decode)                1.33 GB
decode (optimize-packed-b11d326-decode)         1.64 GB - 2792543.27x memory usage +1.64 GB
decode (master-3428d3a-decode)                  2.05 GB - 3479007.04x memory usage +2.05 GB

##### With input google_message1_proto3 #####
Name                                              ips        average  deviation         median         99th %
encode (optimize-packed-b11d326-encode)       19.06 K       52.48 μs  ±3762.13%          11 μs          20 μs
encode (master-3428d3a-encode)                18.78 K       53.24 μs  ±3800.24%          11 μs       22.69 μs
encode (refactor-1cfbf81-encode)              18.61 K       53.72 μs  ±3717.12%       11.90 μs       21.90 μs

Comparison: 
encode (optimize-packed-b11d326-encode)       19.06 K
encode (master-3428d3a-encode)                18.78 K - 1.01x slower +0.77 μs
encode (refactor-1cfbf81-encode)              18.61 K - 1.02x slower +1.25 μs

Memory usage statistics:

Name                                       Memory usage
encode (optimize-packed-b11d326-encode)         2.78 KB
encode (master-3428d3a-encode)                  2.78 KB - 1.00x memory usage +0 KB
encode (refactor-1cfbf81-encode)                2.78 KB - 1.00x memory usage +0 KB

##### With input google_message3_3 #####
Name                                              ips        average  deviation         median         99th %
encode (optimize-packed-b11d326-encode)       15.74 K       63.53 μs  ±3680.21%           4 μs          19 μs
encode (refactor-1cfbf81-encode)              13.16 K       75.98 μs  ±3388.69%        4.90 μs       12.90 μs
encode (master-3428d3a-encode)                12.60 K       79.36 μs  ±3500.22%           4 μs          30 μs

Comparison: 
encode (optimize-packed-b11d326-encode)       15.74 K
encode (refactor-1cfbf81-encode)              13.16 K - 1.45x slower +23.51 μs
encode (master-3428d3a-encode)                12.60 K - 1.51x slower +26.89 μs

Memory usage statistics:

Name                                       Memory usage
encode (optimize-packed-b11d326-encode)         1.82 KB
encode (refactor-1cfbf81-encode)                1.82 KB - 0.65x memory usage -0.96094 KB
encode (master-3428d3a-encode)                  1.82 KB - 0.65x memory usage -0.96094 KB

##### With input google_message1_proto2 #####
Name                                              ips        average  deviation         median         99th %
encode (master-3428d3a-encode)                11.49 K       87.02 μs  ±2973.81%          18 μs          26 μs
encode (optimize-packed-b11d326-encode)       11.13 K       89.84 μs  ±2914.48%          18 μs          24 μs
encode (refactor-1cfbf81-encode)              10.94 K       91.40 μs  ±2854.25%       18.90 μs       59.90 μs

Comparison: 
encode (master-3428d3a-encode)                11.49 K
encode (optimize-packed-b11d326-encode)       11.13 K - 1.71x slower +37.36 μs
encode (refactor-1cfbf81-encode)              10.94 K - 1.74x slower +38.92 μs

Memory usage statistics:

Name                                       Memory usage
encode (master-3428d3a-encode)                  7.48 KB
encode (optimize-packed-b11d326-encode)         7.48 KB - 2.69x memory usage +4.70 KB
encode (refactor-1cfbf81-encode)                7.48 KB - 2.69x memory usage +4.70 KB

##### With input google_message3_5 #####
Name                                              ips        average  deviation         median         99th %
encode (optimize-packed-b11d326-encode)        8.00 K      125.06 μs  ±2658.48%           2 μs           7 μs
encode (master-3428d3a-encode)                 7.71 K      129.73 μs  ±2857.39%           3 μs          10 μs
encode (refactor-1cfbf81-encode)               6.68 K      149.79 μs  ±2480.70%        3.90 μs       20.90 μs

Comparison: 
encode (optimize-packed-b11d326-encode)        8.00 K
encode (master-3428d3a-encode)                 7.71 K - 2.47x slower +77.25 μs
encode (refactor-1cfbf81-encode)               6.68 K - 2.85x slower +97.32 μs

Memory usage statistics:

Name                                       Memory usage
encode (optimize-packed-b11d326-encode)          1016 B
encode (master-3428d3a-encode)                   1016 B - 0.36x memory usage -1832 B
encode (refactor-1cfbf81-encode)                 1016 B - 0.36x memory usage -1832 B

##### With input google_message2 #####
Name                                              ips        average  deviation         median         99th %
encode (refactor-1cfbf81-encode)               5.29 K      188.96 μs  ±2165.60%        8.90 μs       31.90 μs
encode (optimize-packed-b11d326-encode)        5.17 K      193.27 μs  ±2114.16%           8 μs          49 μs
encode (master-3428d3a-encode)                 4.75 K      210.35 μs  ±2202.98%           8 μs          50 μs

Comparison: 
encode (refactor-1cfbf81-encode)               5.29 K
encode (optimize-packed-b11d326-encode)        5.17 K - 3.68x slower +140.79 μs
encode (master-3428d3a-encode)                 4.75 K - 4.01x slower +157.88 μs

Memory usage statistics:

Name                                       Memory usage
encode (refactor-1cfbf81-encode)                3.80 KB
encode (optimize-packed-b11d326-encode)         3.80 KB - 1.37x memory usage +1.02 KB
encode (master-3428d3a-encode)                  3.80 KB - 1.37x memory usage +1.02 KB

##### With input google_message3_4 #####
Name                                              ips        average  deviation         median         99th %
encode (optimize-packed-b11d326-encode)         92.84       10.77 ms   ±275.15%      0.0250 ms       91.76 ms
encode (refactor-1cfbf81-encode)                91.43       10.94 ms   ±280.27%      0.0474 ms       96.86 ms
encode (master-3428d3a-encode)                  90.93       11.00 ms   ±288.18%      0.0240 ms      109.00 ms

Comparison: 
encode (optimize-packed-b11d326-encode)         92.84
encode (refactor-1cfbf81-encode)                91.43 - 208.42x slower +10.88 ms
encode (master-3428d3a-encode)                  90.93 - 209.57x slower +10.94 ms

Memory usage statistics:

Name                                       Memory usage
encode (optimize-packed-b11d326-encode)         1.10 KB
encode (refactor-1cfbf81-encode)                1.10 KB - 0.40x memory usage -1.67969 KB
encode (master-3428d3a-encode)                  1.10 KB - 0.40x memory usage -1.67969 KB

##### With input google_message3_2 #####
Name                                              ips        average  deviation         median         99th %
encode (optimize-packed-b11d326-encode)         45.31       22.07 ms   ±171.20%        3.32 ms       98.83 ms
encode (refactor-1cfbf81-encode)                41.30       24.22 ms   ±164.22%        3.39 ms      100.80 ms
encode (master-3428d3a-encode)                  37.32       26.80 ms   ±166.62%        3.68 ms      151.63 ms

Comparison: 
encode (optimize-packed-b11d326-encode)         45.31
encode (refactor-1cfbf81-encode)                41.30 - 461.46x slower +24.16 ms
encode (master-3428d3a-encode)                  37.32 - 510.67x slower +26.75 ms

Memory usage statistics:

Name                                       Memory usage
encode (optimize-packed-b11d326-encode)         1.10 MB
encode (refactor-1cfbf81-encode)                1.10 MB - 404.75x memory usage +1.10 MB
encode (master-3428d3a-encode)                  1.10 MB - 404.76x memory usage +1.10 MB

##### With input google_message4 #####
Name                                              ips        average  deviation         median         99th %
encode (optimize-packed-b11d326-encode)          5.96      167.72 ms    ±77.42%      268.05 ms      322.36 ms
encode (refactor-1cfbf81-encode)                 5.84      171.25 ms    ±79.08%      283.25 ms      302.25 ms
encode (master-3428d3a-encode)                   5.62      177.79 ms    ±81.30%      277.33 ms      329.11 ms

Comparison: 
encode (optimize-packed-b11d326-encode)          5.96
encode (refactor-1cfbf81-encode)                 5.84 - 3263.49x slower +171.20 ms
encode (master-3428d3a-encode)                   5.62 - 3388.10x slower +177.74 ms

Memory usage statistics:

Name                                       Memory usage
encode (optimize-packed-b11d326-encode)         2.80 KB
encode (refactor-1cfbf81-encode)                2.80 KB - 1.01x memory usage +0.0234 KB
encode (master-3428d3a-encode)                  2.80 KB - 1.01x memory usage +0.0234 KB

##### With input google_message3_1 #####
Name                                              ips        average  deviation         median         99th %
encode (refactor-1cfbf81-encode)                 0.26         3.91 s    ±22.97%         3.46 s         5.42 s
encode (optimize-packed-b11d326-encode)          0.25         3.99 s    ±29.14%         3.27 s         5.82 s
encode (master-3428d3a-encode)                   0.24         4.17 s    ±20.84%         3.87 s         5.21 s

Comparison: 
encode (refactor-1cfbf81-encode)                 0.26
encode (optimize-packed-b11d326-encode)          0.25 - 76024.14x slower +3.99 s
encode (master-3428d3a-encode)                   0.24 - 79410.94x slower +4.17 s

Memory usage statistics:

Name                                       Memory usage
encode (refactor-1cfbf81-encode)              439.80 MB
encode (optimize-packed-b11d326-encode)       439.80 MB - 161924.52x memory usage +439.79 MB
encode (master-3428d3a-encode)                439.80 MB - 161924.85x memory usage +439.80 MB


body =
  quote do
    var!(value) = unquote(expression)

I will need some help here, metaprogramming is definitely not my forte 😅

I feel there is a better way of doing this. Right now my skip_varint function triggers a compilation warning because it does not use the value variable in its body. I wonder what the correct way of fixing that is.
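
One common Elixir idiom for silencing an unused-variable warning is to match the variable against `_`, which counts as a use. The sketch below applies that idiom inside a macro. It is not the PR's code: the defdecoderp macro here is a simplified, hypothetical stand-in that only handles single-byte varints, and the VarintSketch and VarintSketch.Demo modules exist purely for illustration.

```elixir
defmodule VarintSketch do
  # Hypothetical, simplified macro: defines a private function whose body can
  # refer to `value` (the decoded varint) and `rest` (the remaining bytes).
  defmacro defdecoderp(name_and_args, do: body) do
    {name, args} = Macro.decompose_call(name_and_args)

    quote do
      defp unquote(name)(<<0::1, byte::7, rest::bits>>, unquote_splicing(args)) do
        # var!/1 binds the variables in the caller's context so the body sees them.
        var!(value) = byte
        var!(rest) = rest
        # Reading `value` once marks it as used, even if the body ignores it.
        _ = var!(value)
        unquote(body)
      end
    end
  end
end

defmodule VarintSketch.Demo do
  import VarintSketch

  # Uses both `value` and `rest`.
  defdecoderp decode_one(acc) do
    {[value | acc], rest}
  end

  # Ignores `value`, like a skip function would.
  defdecoderp skip_one() do
    rest
  end

  def decode(bin), do: decode_one(bin, [])
  def skip(bin), do: skip_one(bin)
end
```

An alternative with the same effect is to write `_ = value` at the top of the body of the function that ignores the value, which keeps the macro itself unchanged.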

britto added 6 commits May 31, 2021 16:10
Make property-based tests agnostic to implementation details by preventing direct access to private concerns.
This version improves group-skipping by avoiding unnecessary allocation of
intermediate values and better separating that logic. It also reduces memory
consumption by preventing the creation of an intermediate list with all read
values. Now the message is updated as soon as values are read from the wire.
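
The difference between building an intermediate list and updating the message as values are read can be sketched as follows. This is a toy illustration, not the library's decoder: the wire format (a sequence of <<field_number, value>> byte pairs), the plain map standing in for the message, and the DecodeSketch module are all assumptions.

```elixir
defmodule DecodeSketch do
  # Toy wire format (assumed for illustration): <<field_number, value>> byte pairs.

  # Two passes: materialise every pair in a list, then fold it into the message.
  def decode_via_list(bin, message) do
    pairs = for <<field, value <- bin>>, do: {field, value}

    Enum.reduce(pairs, message, fn {field, value}, acc ->
      Map.put(acc, field, value)
    end)
  end

  # Single pass: update the message as soon as each value is read,
  # so no intermediate list is allocated.
  def decode_streaming(<<field, value, rest::binary>>, message) do
    decode_streaming(rest, Map.put(message, field, value))
  end

  def decode_streaming(<<>>, message), do: message
end
```

Both functions return the same result; the second avoids allocating the list of pairs.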
@britto britto marked this pull request as ready for review June 7, 2021 13:10
Compilation warning is still there though :(
@whatyouhide

We're merging this and we'll fix the compilation warning later on before releasing a new version.

@whatyouhide whatyouhide merged commit 21ec7c5 into elixir-protobuf:master Jun 7, 2021
@britto britto deleted the refactor branch October 14, 2021 17:45