Removed Hopac dependency #75

Merged
merged 5 commits into fsprojects:dev on Aug 31, 2016


@Horusiath
Contributor
Horusiath commented Aug 31, 2016 edited

This PR removes the Hopac (#41) dependency from the project. It goes back to F#'s Async<> with special sugar in the form of AsyncVal<> - a dedicated structure that simplifies the most common scenario, where data is retrieved synchronously.
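The core idea can be sketched roughly as follows (a simplified illustration only - the actual AsyncVal<> merged here is a dedicated structure tuned for the synchronous path, so the names and layout below should not be read as the real implementation):

```fsharp
// Illustrative sketch, not the merged implementation: an AsyncVal<'T> is either
// a value that is already available or a computation that still has to be awaited.
type AsyncVal<'T> =
    | Value of 'T              // synchronous result, no Async machinery involved
    | Deferred of Async<'T>    // genuinely asynchronous result

module AsyncVal =
    /// Wraps an already-computed value without allocating an Async.
    let wrap (v: 'T) : AsyncVal<'T> = Value v

    /// Lifts a regular F# Async into an AsyncVal.
    let ofAsync (a: Async<'T>) : AsyncVal<'T> = Deferred a

    /// Converts back to a plain Async when a uniform type is needed.
    let toAsync (x: AsyncVal<'T>) : Async<'T> =
        match x with
        | Value v -> async.Return v
        | Deferred a -> a
```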

Some benchmarks:

Current dev

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-6300HQ CPU 2.30GHz, ProcessorCount=4
Frequency=2250003 ticks, Resolution=444.4439 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1080.0

Type=SimpleExecutionBenchmark  Mode=Throughput  
| Method | Median | StdDev | Mean | Min | Max | Op/s | Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
|--------|--------|--------|------|-----|-----|------|-------|-------|-------|--------------------|
| BenchmarkSimpleQueryUnparsed | 157.3269 us | 5.6500 us | 158.2112 us | 148.5248 us | 172.2940 us | 6320.66 | 24.02 | - | - | 5 605,29 |
| BenchmarkSimpleQueryParsed | 146.4599 us | 8.1424 us | 148.6951 us | 134.4616 us | 168.3049 us | 6725.17 | 12.21 | - | - | 3 693,64 |
| BenchmarkSimpleQueryPlanned | 181.3533 us | 7.7570 us | 182.6536 us | 168.9308 us | 201.0036 us | 5474.84 | 9.94 | - | - | 2 994,85 |
| BenchmarkFlatQueryUnparsed | 1,026.3858 us | 179.5066 us | 981.0037 us | 468.7164 us | 1,309.8351 us | 1019.36 | 61.85 | - | - | 16 243,82 |
| BenchmarkFlatQueryParsed | 595.1850 us | 120.4812 us | 582.7553 us | 259.1585 us | 897.2696 us | 1715.99 | 51.53 | - | - | 12 600,62 |
| BenchmarkFlatQueryPlanned | 322.9032 us | 88.0148 us | 342.8616 us | 216.6768 us | 745.2664 us | 2916.63 | 41.38 | - | - | 10 396,88 |
| BenchmarkNestedQueryUnparsed | 847.3400 us | 26.5663 us | 846.2497 us | 800.5788 us | 903.7332 us | 1181.68 | 215.00 | - | - | 55 156,48 |
| BenchmarkNestedQueryParsed | 259.7336 us | 4.2310 us | 259.9037 us | 251.7067 us | 271.2696 us | 3847.58 | 165.29 | - | - | 37 102,93 |
| BenchmarkNestedQueryPlanned | 250.5577 us | 2.2592 us | 250.7361 us | 247.6127 us | 255.7093 us | 3988.26 | 149.10 | - | - | 33 618,53 |

AsyncVals

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-6300HQ CPU 2.30GHz, ProcessorCount=4
Frequency=2250003 ticks, Resolution=444.4439 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1080.0

Type=SimpleExecutionBenchmark  Mode=Throughput  
| Method | Median | StdDev | Mean | Min | Max | Op/s | Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
|--------|--------|--------|------|-----|-----|------|-------|-------|-------|--------------------|
| BenchmarkSimpleQueryUnparsed | 56.0447 us | 1.2391 us | 55.9036 us | 53.4428 us | 59.2214 us | 17887.92 | 16.82 | - | - | 2 892,18 |
| BenchmarkSimpleQueryParsed | 37.5909 us | 0.8322 us | 37.7219 us | 36.4099 us | 39.9745 us | 26509.8 | 11.19 | - | - | 1 903,26 |
| BenchmarkSimpleQueryPlanned | 36.7179 us | 0.6371 us | 36.8037 us | 35.8933 us | 38.0362 us | 27171.21 | 8.62 | - | - | 1 519,35 |
| BenchmarkFlatQueryUnparsed | 118.6983 us | 4.4283 us | 120.1088 us | 116.6450 us | 134.9538 us | 8325.78 | 29.80 | 14.31 | - | 7 330,10 |
| BenchmarkFlatQueryParsed | 94.5388 us | 4.1212 us | 94.7174 us | 88.7297 us | 100.8304 us | 10557.73 | 4.69 | 4.22 | - | 3 419,43 |
| BenchmarkFlatQueryPlanned | 87.3928 us | 2.0692 us | 87.5802 us | 82.0037 us | 92.1323 us | 11418.11 | 4.73 | 2.17 | - | 2 339,76 |
| BenchmarkNestedQueryUnparsed | 683.2747 us | 23.5373 us | 682.2727 us | 597.9384 us | 716.7725 us | 1465.69 | 136.62 | - | - | 23 553,46 |
| BenchmarkNestedQueryParsed | 541.7645 us | 13.5400 us | 544.4594 us | 523.8335 us | 577.7102 us | 1836.68 | 107.00 | - | - | 19 198,92 |
| BenchmarkNestedQueryPlanned | 510.8088 us | 2.9108 us | 510.2895 us | 502.5219 us | 513.8596 us | 1959.67 | 106.02 | - | - | 18 659,60 |

Description

As we can see, the good news is that heap allocations have gone down by 35%-50% on average with the new approach.

When it comes to operations per second, the values differ significantly:

  • For simple queries, the new approach is almost 3 times faster - fairly close to the branch where Async<> was used before Hopac.
  • For queries with many fields at the same level of nesting, the new approach is around 7 times faster.
  • For queries with a deep level of nesting, the new approach is a little faster than the old one (around 20%).

These comparisons use the unparsed versions of the tests - i.e. the case where we receive a raw query string. For cases where we have a prefetched query, Hopac can be faster than the new approach; the reason for that is unknown and needs further investigation.


Comparison of Async<> vs AsyncVal<> in common scenarios

Host Process Environment Information:
BenchmarkDotNet.Core=v0.9.9.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-6300HQ CPU 2.30GHz, ProcessorCount=4
Frequency=2250003 ticks, Resolution=444.4439 ns, Timer=TSC
CLR=MS.NET 4.0.30319.42000, Arch=32-bit RELEASE
GC=Concurrent Workstation
JitModules=clrjit-v4.6.1080.0

Type=AsyncValBenchmark  Mode=Throughput  
| Method | Median | StdDev | Mean | Min | Max | Op/s | Gen 0 | Gen 1 | Gen 2 | Bytes Allocated/Op |
|--------|--------|--------|------|-----|-----|------|-------|-------|-------|--------------------|
| AsyncValImmediate | 2.0300 ns | 0.2742 ns | 2.1011 ns | 1.7335 ns | 3.1936 ns | 475947646.76 | - | - | - | 0,00 |
| AsyncValAwaiting | 13.2080 ns | 1.1620 ns | 13.7225 ns | 12.8344 ns | 17.7117 ns | 72873061.83 | 0.49 | - | - | 16,40 |
| AsyncReturnImmediatelly | 6,024.7790 ns | 166.8665 ns | 6,064.8655 ns | 5,762.1858 ns | 6,406.6484 ns | 164884.12 | 5.81 | - | - | 219,33 |
| AsyncValCollectionAllSync | 495.9702 ns | 16.3894 ns | 494.1212 ns | 473.6449 ns | 524.1607 ns | 2023794.80 | 5.93 | - | - | 204,86 |
| AsyncValCollectionAllAsync | 102,270.2629 ns | 3,614.2131 ns | 100,790.5879 ns | 92,410.3586 ns | 105,359.9941 ns | 9921.56 | 609.00 | 18.00 | - | 23 723,62 |
| AsyncCollection | 52,520.7091 ns | 239.5974 ns | 52,563.3468 ns | 52,274.1273 ns | 53,388.1667 ns | 19024.66 | 467.80 | 9.11 | - | 18 529,63 |
| AsyncValCollectionMixed90x10 | 19,946.5473 ns | 135.0883 ns | 19,983.9279 ns | 19,837.2945 ns | 20,282.5522 ns | 50040.21 | 86.77 | 2.22 | - | 3 109,31 |

Description

For the most common case - returning a value immediately - using AsyncVal.wrap instead of async.Return results in ~3000 times faster execution with no heap allocations!
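In code, reusing the illustrative sketch above (the resolver names here are made up for the example):

```fsharp
// Hypothetical field resolvers returning a constant value.
// Old path: even a constant pays for the Async allocation and trampoline.
let totalCountOld () : Async<int> = async.Return 42

// New path: AsyncVal.wrap stores the value directly, no Async is created.
let totalCountNew () : AsyncVal<int> = AsyncVal.wrap 42
```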

When flattening collections of Async/AsyncVal, the numbers differ:

  • If all collection values can be retrieved synchronously, AsyncVal is over 100 times faster than Async.
  • If all collection values are asynchronous, AsyncVal can be twice as slow - I believe we can improve this inside the implementation.
  • If we have a mix of sync/async values (ratio: 90% sync, 10% async), AsyncVal is over two times faster. However, the exact numbers may vary - AsyncVal will be faster if the async values are placed at the end of the array. To make it even faster, we could partition the input array into sync/async parts and optimize the path for each case - this, however, cannot be done in the presented approach, as input indexes must match output indexes (see the sketch after this list).
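For illustration, a naive flattening helper over the sketched AsyncVal type could look like this (again only a sketch of the index-preservation constraint, not the code merged in this PR):

```fsharp
// Naive sketch built on the illustrative AsyncVal type above. The point of the
// last bullet: every output must land at the same index as its input, so we
// cannot simply partition the array into sync and async halves up front.
let collect (values: AsyncVal<'T> []) : AsyncVal<'T []> =
    let allSync =
        values |> Array.forall (function Value _ -> true | Deferred _ -> false)
    if allSync then
        // Fast path: nothing to await, so no Async is started at all.
        values
        |> Array.map (function Value v -> v | Deferred _ -> failwith "unreachable")
        |> Value
    else
        Deferred (async {
            let results = Array.zeroCreate values.Length
            for i in 0 .. values.Length - 1 do
                match values.[i] with
                | Value v -> results.[i] <- v   // sync value keeps its slot for free
                | Deferred a ->
                    let! v = a                  // async value is awaited in place
                    results.[i] <- v
            return results })
```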
Horusiath added some commits Aug 30, 2016
@Horusiath Horusiath created AsyncVal<> and removed Hopac 4a4d880
@Horusiath Horusiath added benchmark tests for AsyncVal 79a224c
@Horusiath Horusiath asyncVal.bind return from async dc8fe1f
@Horusiath Horusiath fixed all failing tests 3f566e8
@Horusiath Horusiath fixed F# core version for benchmarks 57d3a3d
@Horusiath Horusiath merged commit b11f062 into fsprojects:dev Aug 31, 2016

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed