Array Encoding/Decoding performance improvements (continued) #185

Merged
merged 36 commits into from
Jul 14, 2021

danielmarbach
Contributor

@danielmarbach danielmarbach commented May 11, 2021

This is a work in progress to show how array encoding/decoding of
primitive types can be improved.

Before

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19042.985 (20H2/October2020Update)
AMD Ryzen 9 3950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=6.0.100-preview.4.21255.9
  [Host]   : .NET 5.0.5 (5.0.521.16609), X64 RyuJIT
  ShortRun : .NET 5.0.5 (5.0.521.16609), X64 RyuJIT

Job=ShortRun  Runtime=.NET 5.0  IterationCount=3  
LaunchCount=1  WarmupCount=3  
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | Code Size |
|---|---:|---:|---:|---:|---:|---:|---:|---:|
| ArrayAmqpSymbolDecode_1M_MAA | 535.44 μs | 149.503 μs | 8.195 μs | 76.1719 | 26.3672 | - | 647,048 B | 276 B |
| ArrayAmqpSymbolDecode_1K_MAA | 52.14 μs | 8.871 μs | 0.486 μs | 7.6904 | 0.9155 | - | 64,480 B | 276 B |
| ArrayAmqpSymbolEncode_100K_MAA | 1,496.76 μs | 192.350 μs | 10.543 μs | 97.6563 | - | - | 819,048 B | 226 B |
| ArrayAmqpSymbolEncode_1K_MAA | 146.51 μs | 40.250 μs | 2.206 μs | 9.7656 | - | - | 81,928 B | 226 B |
| Bytes_Encode_MAA | 23.43 μs | 0.445 μs | 0.024 μs | - | - | - | - | 326 B |
| Bytes_Decode_MAA | 320.12 μs | 605.266 μs | 33.177 μs | 2.9297 | 2.9297 | 2.9297 | 1,048,600 B | 555 B |
| ArrayInt32Encode_MAA_1M | 101,466.66 μs | 28,031.501 μs | 1,536.501 μs | 6000.0000 | - | - | 50,331,760 B | 320 B |
| ArrayInt32Encode_MAA_1K | 98.50 μs | 16.658 μs | 0.913 μs | 5.8594 | - | - | 49,264 B | 320 B |
| ArrayInt32Decode_1M_MAA | 51,164.37 μs | 4,089.352 μs | 224.151 μs | 3000.0000 | - | - | 29,361,027 B | 276 B |
| ArrayInt32Decode_1K_MAA | 54.88 μs | 3.563 μs | 0.195 μs | 3.4180 | - | - | 28,696 B | 276 B |
| ArrayUInt32Encode_MAA_1M | 105,372.92 μs | 32,508.110 μs | 1,781.879 μs | 6000.0000 | - | - | 50,331,760 B | 320 B |
| ArrayUInt32Encode_MAA_1K | 97.29 μs | 19.274 μs | 1.056 μs | 5.8594 | - | - | 49,264 B | 320 B |
| ArrayUInt32Decode_1M_MAA | 57,301.39 μs | 3,303.465 μs | 181.074 μs | 3000.0000 | - | - | 29,361,148 B | 276 B |
| ArrayUInt32Decode_1K_MAA | 58.39 μs | 6.783 μs | 0.372 μs | 3.4180 | - | - | 28,696 B | 276 B |
| ArrayInt64Encode_MAA_1M | 105,115.89 μs | 27,232.659 μs | 1,492.714 μs | 6000.0000 | - | - | 50,331,760 B | 320 B |
| ArrayInt64Encode_MAA_1K | 98.41 μs | 43.838 μs | 2.403 μs | 5.8594 | - | - | 49,264 B | 320 B |
| ArrayInt64Decode_1M_MAA | 51,023.07 μs | 17,808.911 μs | 976.166 μs | 3000.0000 | - | - | 33,554,456 B | 276 B |
| ArrayInt64Decode_1K_MAA | 47.36 μs | 3.820 μs | 0.209 μs | 3.9063 | 0.0610 | - | 32,792 B | 276 B |
| ArrayUInt64Encode_MAA_1M | 102,957.17 μs | 18,212.982 μs | 998.315 μs | 6000.0000 | - | - | 50,331,760 B | 320 B |
| ArrayUInt64Encode_MAA_1K | 96.94 μs | 59.427 μs | 3.257 μs | 5.8594 | - | - | 49,264 B | 320 B |
| ArrayUInt64Decode_1M_MAA | 54,329.93 μs | 30,664.377 μs | 1,680.818 μs | 3000.0000 | - | - | 33,554,456 B | 276 B |
| ArrayUInt64Decode_1K_MAA | 50.64 μs | 24.464 μs | 1.341 μs | 3.9063 | 0.0610 | - | 32,792 B | 276 B |
| ArrayBoolEncode_MAA_1M | 104,928.10 μs | 29,551.077 μs | 1,619.794 μs | 6000.0000 | - | - | 50,331,760 B | 320 B |
| ArrayBoolEncode_MAA_1K | 96.20 μs | 16.166 μs | 0.886 μs | 5.8594 | - | - | 49,264 B | 320 B |
| ArrayBoolDecode_1M_MAA | 44,793.29 μs | 2,699.893 μs | 147.990 μs | 3000.0000 | - | - | 26,214,424 B | 276 B |
| ArrayBoolDecode_1K_MAA | 44.57 μs | 4.078 μs | 0.224 μs | 3.0518 | - | - | 25,624 B | 276 B |
| ArrayDecimalEncode_MAA_1M | 132,444.94 μs | 73,831.727 μs | 4,046.966 μs | 18000.0000 | - | - | 150,995,072 B | 320 B |
| ArrayDecimalEncode_MAA_1K | 129.31 μs | 70.490 μs | 3.864 μs | 17.5781 | - | - | 147,584 B | 320 B |
| ArrayDecimalDecode_1M_MAA | 112,627.93 μs | 13,433.443 μs | 736.332 μs | 9000.0000 | - | - | 92,274,712 B | 276 B |
| ArrayDecimalDecode_1K_MAA | 108.87 μs | 10.047 μs | 0.551 μs | 10.7422 | 0.6104 | - | 90,136 B | 276 B |
| ArrayDoubleEncode_MAA_1M | 101,110.83 μs | 22,499.287 μs | 1,233.262 μs | 6000.0000 | - | - | 50,331,760 B | 320 B |
| ArrayDoubleEncode_MAA_1K | 98.77 μs | 36.658 μs | 2.009 μs | 5.8594 | - | - | 49,264 B | 320 B |
| ArrayDoubleDecode_1M_MAA | 50,673.58 μs | 18,513.259 μs | 1,014.774 μs | 3000.0000 | - | - | 33,554,456 B | 276 B |
| ArrayDoubleDecode_1K_MAA | 45.88 μs | 2.758 μs | 0.151 μs | 3.9063 | 0.0610 | - | 32,792 B | 276 B |
| ArrayFloatEncode_MAA_1M | 100,870.10 μs | 14,169.486 μs | 776.677 μs | 6000.0000 | - | - | 50,331,760 B | 320 B |
| ArrayFloatEncode_MAA_1K | 100.00 μs | 20.726 μs | 1.136 μs | 5.8594 | - | - | 49,264 B | 320 B |
| ArrayFloatDecode_1M_MAA | 50,928.70 μs | 10,357.976 μs | 567.756 μs | 3000.0000 | - | - | 29,360,227 B | 276 B |
| ArrayFloatDecode_1K_MAA | 49.35 μs | 3.641 μs | 0.200 μs | 3.4180 | - | - | 28,696 B | 276 B |

After

BenchmarkDotNet=v0.13.0, OS=Windows 10.0.19042.985 (20H2/October2020Update)
AMD Ryzen 9 3950X, 1 CPU, 32 logical and 16 physical cores
.NET SDK=6.0.100-preview.4.21255.9
  [Host]   : .NET 5.0.5 (5.0.521.16609), X64 RyuJIT
  ShortRun : .NET 5.0.5 (5.0.521.16609), X64 RyuJIT

Job=ShortRun  Runtime=.NET 5.0  IterationCount=3  
LaunchCount=1  WarmupCount=3  
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | Code Size |
|---|---:|---:|---:|---:|---:|---:|---:|---:|
| ArrayAmqpSymbolDecode_1M_MAA | 337,431.5 ns | 70,315.51 ns | 3,854.23 ns | 47.3633 | 15.6250 | - | 401,288 B | 276 B |
| ArrayAmqpSymbolDecode_1K_MAA | 31,422.2 ns | 2,298.39 ns | 125.98 ns | 4.7607 | 0.4883 | - | 39,904 B | 276 B |
| ArrayAmqpSymbolEncode_100K_MAA | 600,549.1 ns | 22,404.81 ns | 1,228.08 ns | - | - | - | - | 233 B |
| ArrayAmqpSymbolEncode_1K_MAA | 58,396.3 ns | 2,661.36 ns | 145.88 ns | - | - | - | - | 233 B |
| Bytes_Encode_MAA | 22,882.8 ns | 2,215.89 ns | 121.46 ns | - | - | - | - | 411 B |
| Bytes_Decode_MAA | 342,778.9 ns | 890,252.17 ns | 48,797.73 ns | 2.9297 | 2.9297 | 2.9297 | 1,048,601 B | 555 B |
| ArrayInt32Encode_MAA_1M | 851,541.3 ns | 39,990.25 ns | 2,192.00 ns | - | - | - | - | 327 B |
| ArrayInt32Encode_MAA_1K | 950.1 ns | 51.39 ns | 2.82 ns | - | - | - | - | 327 B |
| ArrayInt32Decode_1M_MAA | 2,019,578.2 ns | 745,896.04 ns | 40,885.08 ns | 11.7188 | 11.7188 | 11.7188 | 4,194,332 B | 276 B |
| ArrayInt32Decode_1K_MAA | 972.9 ns | 106.33 ns | 5.83 ns | 0.4921 | - | - | 4,120 B | 276 B |
| ArrayUInt32Encode_MAA_1M | 848,060.4 ns | 12,011.16 ns | 658.37 ns | - | - | - | - | 327 B |
| ArrayUInt32Encode_MAA_1K | 940.0 ns | 115.65 ns | 6.34 ns | - | - | - | - | 327 B |
| ArrayUInt32Decode_1M_MAA | 2,040,378.3 ns | 564,903.81 ns | 30,964.28 ns | 11.7188 | 11.7188 | 11.7188 | 4,194,332 B | 276 B |
| ArrayUInt32Decode_1K_MAA | 1,010.7 ns | 94.37 ns | 5.17 ns | 0.4921 | - | - | 4,120 B | 276 B |
| ArrayInt64Encode_MAA_1M | 907,390.7 ns | 89,501.36 ns | 4,905.87 ns | - | - | - | - | 327 B |
| ArrayInt64Encode_MAA_1K | 952.1 ns | 115.76 ns | 6.35 ns | - | - | - | - | 327 B |
| ArrayInt64Decode_1M_MAA | 3,383,721.9 ns | 3,818,522.36 ns | 209,306.10 ns | 23.4375 | 23.4375 | 23.4375 | 8,388,640 B | 276 B |
| ArrayInt64Decode_1K_MAA | 1,115.4 ns | 77.21 ns | 4.23 ns | 0.9804 | - | - | 8,216 B | 276 B |
| ArrayUInt64Encode_MAA_1M | 929,740.7 ns | 614,016.69 ns | 33,656.33 ns | - | - | - | - | 327 B |
| ArrayUInt64Encode_MAA_1K | 946.0 ns | 35.88 ns | 1.97 ns | - | - | - | - | 327 B |
| ArrayUInt64Decode_1M_MAA | 3,450,412.7 ns | 1,869,324.49 ns | 102,463.98 ns | 23.4375 | 23.4375 | 23.4375 | 8,388,640 B | 276 B |
| ArrayUInt64Decode_1K_MAA | 1,106.5 ns | 90.17 ns | 4.94 ns | 0.9804 | - | - | 8,216 B | 276 B |
| ArrayBoolEncode_MAA_1M | 1,012,373.1 ns | 53,120.62 ns | 2,911.72 ns | - | - | - | 1 B | 327 B |
| ArrayBoolEncode_MAA_1K | 1,063.6 ns | 44.50 ns | 2.44 ns | - | - | - | - | 327 B |
| ArrayBoolDecode_1M_MAA | 1,198,136.6 ns | 191,227.15 ns | 10,481.81 ns | - | - | - | 1,048,601 B | 276 B |
| ArrayBoolDecode_1K_MAA | 1,073.1 ns | 145.35 ns | 7.97 ns | 0.1240 | - | - | 1,048 B | 276 B |
| ArrayDecimalEncode_MAA_1M | 22,354,117.7 ns | 7,959,300.14 ns | 436,276.10 ns | 5000.0000 | - | - | 41,943,049 B | 327 B |
| ArrayDecimalEncode_MAA_1K | 20,883.4 ns | 1,987.14 ns | 108.92 ns | 4.8828 | - | - | 40,960 B | 327 B |
| ArrayDecimalDecode_1M_MAA | 57,522,114.8 ns | 12,834,405.35 ns | 703,497.06 ns | - | - | - | 16,777,391 B | 276 B |
| ArrayDecimalDecode_1K_MAA | 52,401.2 ns | 1,246.96 ns | 68.35 ns | 1.9531 | - | - | 16,408 B | 276 B |
| ArrayDoubleEncode_MAA_1M | 1,053,240.8 ns | 823,822.43 ns | 45,156.49 ns | - | - | - | 1 B | 327 B |
| ArrayDoubleEncode_MAA_1K | 1,064.5 ns | 39.81 ns | 2.18 ns | - | - | - | - | 327 B |
| ArrayDoubleDecode_1M_MAA | 3,339,528.0 ns | 2,736,238.70 ns | 149,982.48 ns | 23.4375 | 23.4375 | 23.4375 | 8,388,640 B | 276 B |
| ArrayDoubleDecode_1K_MAA | 1,092.7 ns | 250.16 ns | 13.71 ns | 0.9804 | - | - | 8,216 B | 276 B |
| ArrayFloatEncode_MAA_1M | 1,256,692.6 ns | 45,829.78 ns | 2,512.08 ns | - | - | - | 1 B | 327 B |
| ArrayFloatEncode_MAA_1K | 1,315.5 ns | 44.92 ns | 2.46 ns | - | - | - | - | 327 B |
| ArrayFloatDecode_1M_MAA | 2,094,115.0 ns | 3,181,908.62 ns | 174,411.15 ns | 7.8125 | 7.8125 | 7.8125 | 4,194,331 B | 276 B |
| ArrayFloatDecode_1K_MAA | 1,096.8 ns | 282.48 ns | 15.48 ns | 0.4921 | - | - | 4,120 B | 276 B |
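
Note that the Before results are reported in μs and the After results in ns, so the two Mean columns cannot be compared directly. As a rough aid for reading them side by side, here is a small Python sketch that converts units and computes the speedup and the allocation difference for a handful of rows. The figures are copied from the results above; the helper names (`speedup`, `summarize`) are made up for this illustration and are not part of the PR or of BenchmarkDotNet. A `-` in the Allocated column is treated as 0 bytes.

```python
# Illustrative only: values copied from the Before/After results above.
# Tuple layout: (before mean in us, after mean in ns,
#                before allocated bytes, after allocated bytes)
ROWS = {
    "ArrayInt32Encode_MAA_1M": (101_466.66, 851_541.3, 50_331_760, 0),
    "ArrayInt32Decode_1M_MAA": (51_164.37, 2_019_578.2, 29_361_027, 4_194_332),
    "ArrayDecimalDecode_1M_MAA": (112_627.93, 57_522_114.8, 92_274_712, 16_777_391),
    "Bytes_Decode_MAA": (320.12, 342_778.9, 1_048_600, 1_048_601),
}


def speedup(before_us, after_ns):
    """Return before/after as a ratio, converting microseconds to nanoseconds."""
    return before_us * 1000.0 / after_ns


def summarize(rows):
    """Map each benchmark name to (speedup rounded to 1 decimal, bytes saved)."""
    return {
        name: (round(speedup(b_us, a_ns), 1), b_bytes - a_bytes)
        for name, (b_us, a_ns, b_bytes, a_bytes) in rows.items()
    }


if __name__ == "__main__":
    for name, (ratio, saved) in summarize(ROWS).items():
        print(f"{name}: {ratio}x faster, {saved:,} fewer bytes allocated")
```

With these numbers, `ArrayInt32Encode_MAA_1M` comes out at roughly 119x faster with no measured allocations, while `Bytes_Decode_MAA` is essentially unchanged. Since this is a ShortRun job with only 3 iterations and some rows show wide error margins, the ratios are best read as orders of magnitude rather than precise figures.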

@danielmarbach
Contributor Author

@xinchen10 I think I'm still missing one primitive here, and I still need to rerun the benchmarks to get the latest results (which is quite a bit of effort).

Can you have a look and tell me what you think, or whether anything is missing from your side?

@xinchen10
Member

@danielmarbach I took a quick look at the changes and I think they look good. I will look more closely at the changes later.

This pull request was closed.