Migrate from F# to C# when possible #68
Comments
I have cleaned up the existing code and it should all stay. I will use internals + InternalsVisibleTo if I need implementation details.
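For the record, the internals + InternalsVisibleTo approach could look roughly like the sketch below. The friend assembly name `Spreads.Extensions`, the namespace and the `SeriesSketch` type are all hypothetical and only illustrate the attribute; they are not taken from the Spreads code base.

```csharp
using System.Runtime.CompilerServices;

// Assumption: "Spreads.Extensions" is a hypothetical friend assembly.
// Code compiled into an assembly with that name can see the `internal` members below.
[assembly: InternalsVisibleTo("Spreads.Extensions")]

namespace Spreads.Sketch
{
    // Hypothetical type, not part of Spreads.
    public class SeriesSketch
    {
        // Implementation detail: hidden from ordinary consumers,
        // visible to the friend assembly named above.
        internal int[] Buffer = new int[16];

        // Public surface that does not leak the buffer itself.
        public int Capacity => Buffer.Length;
    }
}
```

The public API stays small, while a separate assembly that needs the implementation details can still reach them without reflection.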
Keeping it here for the record: a post from my nuked blog that originated from this issue.

Functional programming is a tool for thought, imperative programming is a tool for hacking

"Why I am taking a pause with F# development (until tooling is fixed or I buy 32 core/64 GB machine)"

TL;DR; F# is great for algorithms and analytics. It could work for libraries, interop and high-performance

I have been a functional programming aficionado for a while. At some point, I took its advantages over

But lately, I started to dig even deeper and closer to the metal, to use unsafe code, to do many small

F# is probably the best choice for end product development, but not the one for libraries that require

Tooling

Recently I have bought the very top model of MacBook Air with 8 GB of RAM for development. Specifically,

Refactoring and simple edits take a very noticeable pause for type inference to finish and for the IDE to become responsive, and this pause grows non-linearly with project size. I understand

Project recompilation happens on every build, even if there are no changes in the code. And it is slow.

Each problem is minor in isolation, but when this happens very often, then most of the development time is spent on tooling,

Generated code performance

Generated code performance is also an issue: to get it right in F# one has to write really ugly

Native interop and unsafe could be done in F#, but this is ugly and clumsy. The

The recent addition of the Task Parallel Library is the best thing since sliced bread in the .NET universe. Async/await came from F#.

File and type ordering is an F# limitation that magically turns into a blessing most of the time. But when it doesn't,

Implicit allocations and IL generation are almost fixed in the latest versions and have proper attention from the core team,

One could say I should fix F# itself since it is open source, but for the same reasons above

Despite the issues above, there is nothing wrong with F# itself and there is still a way to write efficient code.

.NET Core interop

This is both a tooling and a language issue. I often feel that Microsoft almost said: "F^&k you, come back in a year". When .NET Core was in alpha/beta, it was OK. Now it is 1.0 and there should be no excuses.

If you think this is just F# criticism and a rant, don't get me wrong. This is an open question and a call for action.

F# is great and functional programming has many good parts. F# is still faster than Scala,

This post started this morning from an issue on GitHub.
F# is 30+% slower. `maxstack` is 5 vs 3: `unit` vars are placed twice, and F# generates `nop`s even in Release. The empty `try{}` has `ldnull` and `stloc.2` for unit `()` (F# cannot have a completely empty `try..finally`; "everything is an expression" has this small cost). F# uses `ldc.i8 1` instead of `ldc.i4.1` + `conv.i8` (http://stackoverflow.com/a/40726190/801189). Most importantly, it uses `callvirt instance` whereas C# uses `call instance` (we do not need a null check when a child class calls an instance method of its base class: the caller is not null, or it could not make an instance call). The BeforWrite/AfterWrite methods from a C# assembly are properly inlined by the JIT, but these other small things accumulate.

That is sad and adds to #68. Since a major rewrite is underway, rewriting the two main collections (SM & SCM) using Span is the right thing. With Span we could use mmapped memory for series later. Initially Span will be slower (but probably not by 30%), but later it should be on par with arrays.

F# code
======

```
[<MethodImpl(MethodImplOptions.AggressiveInlining)>]
member this.Increment() : unit =
  let mutable v2 = 0L
  try
    try
      ()
    finally
      v2 <- this.BeforWrite()
    this.counter <- this.counter + 1L
  finally
    this.AfterWrite(v2, true)
```

F# IL
=======

```
.method /*06000008*/ public hidebysig instance void Increment() cil managed
{
  .maxstack 5
  .locals /*11000011*/ init (
    [0] int64 v2,
    [1] class [FSharp.Core/*23000003*/]Microsoft.FSharp.Core.Unit V_1,
    [2] class [FSharp.Core/*23000003*/]Microsoft.FSharp.Core.Unit V_2
  )

  // [17 7 - 17 26]
  IL_0000: nop
  IL_0001: ldc.i8 0 // 0x0000000000000000
  IL_000a: stloc.0 // v2
  .try
  {
    // [18 7 - 18 10]
    IL_000b: nop
    .try
    {
      // [19 9 - 19 12]
      IL_000c: nop
      // [20 11 - 20 13]
      IL_000d: ldnull
      IL_000e: stloc.2 // V_2
      IL_000f: leave.s IL_001c
    } // end of .try
    finally
    {
      // [21 9 - 21 16]
      IL_0011: nop
      // [22 11 - 22 34]
      IL_0012: ldarg.0 // this
      IL_0013: callvirt instance int64 class [Spreads.Core/*23000002*/]Spreads.BaseSeries`2<int32, int32>::BeforWrite()
      IL_0018: stloc.0 // v2
      IL_0019: ldnull
      IL_001a: pop
      IL_001b: endfinally
    } // end of finally
    IL_001c: ldloc.2 // V_2
    IL_001d: pop
    // [23 9 - 23 42]
    IL_001e: ldarg.0 // this
    IL_001f: ldarg.0 // this
    IL_0020: ldfld int64 Spreads.Tests.Profile.LockTestSeries/*02000003*/::counter/*04000006*/
    IL_0025: ldc.i8 1 // 0x0000000000000001
    IL_002e: add
    IL_002f: stfld int64 Spreads.Tests.Profile.LockTestSeries/*02000003*/::counter/*04000006*/
    IL_0034: ldnull
    IL_0035: stloc.1 // V_1
    IL_0036: leave.s IL_0044
  } // end of .try
  finally
  {
    // [24 7 - 24 14]
    IL_0038: nop
    // [25 9 - 25 34]
    IL_0039: ldarg.0 // this
    IL_003a: ldloc.0 // v2
    IL_003b: ldc.i4.1
    IL_003c: callvirt instance void class [Spreads.Core/*23000002*/]Spreads.BaseSeries`2<int32, int32>::AfterWrite(int64, bool)
    IL_0041: ldnull
    IL_0042: pop
    IL_0043: endfinally
  } // end of finally
  IL_0044: ldloc.1 // V_1
  IL_0045: pop
  IL_0046: ret
} // end of method LockTestSeries::Increment
```

C# code
=========

```
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public void Increment()
{
    long v2 = 0L;
    try
    {
        try
        {
        }
        finally
        {
            v2 = this.BeforWrite();
        }
        _counter++;
    }
    finally
    {
        AfterWrite(v2, true);
    }
}
```

C# IL
===========

```
.method /*060000C3*/ public hidebysig instance void Increment() cil managed
{
  .maxstack 3
  .locals /*11000012*/ init (
    [0] int64 v2
  )

  // [36 17 - 36 29]
  IL_0000: ldc.i4.0
  IL_0001: conv.i8
  IL_0002: stloc.0 // v2
  .try
  {
    .try
    {
      // [39 27 - 39 28]
      IL_0003: leave.s IL_000d
    } // end of .try
    finally
    {
      // [42 25 - 42 48]
      IL_0005: ldarg.0 // this
      IL_0006: call instance int64 class [Spreads.Core/*23000002*/]Spreads.BaseSeries`2<int32, int32>::BeforWrite()
      IL_000b: stloc.0 // v2
      // [43 21 - 43 22]
      IL_000c: endfinally
    } // end of finally
    // [44 21 - 44 32]
    IL_000d: ldarg.0 // this
    IL_000e: ldarg.0 // this
    IL_000f: ldfld int64 Spreads.Core.Tests.LockFreeTests/*02000007*//LockTestSeries/*02000028*/::_counter/*04000013*/
    IL_0014: ldc.i4.1
    IL_0015: conv.i8
    IL_0016: add
    IL_0017: stfld int64 Spreads.Core.Tests.LockFreeTests/*02000007*//LockTestSeries/*02000028*/::_counter/*04000013*/
    // [45 17 - 45 18]
    IL_001c: leave.s IL_0027
  } // end of .try
  finally
  {
    // [48 21 - 48 42]
    IL_001e: ldarg.0 // this
    IL_001f: ldloc.0 // v2
    IL_0020: ldc.i4.1
    IL_0021: call instance void class [Spreads.Core/*23000002*/]Spreads.BaseSeries`2<int32, int32>::AfterWrite(int64, bool)
    // [49 17 - 49 18]
    IL_0026: endfinally
  } // end of finally
  // [50 13 - 50 14]
  IL_0027: ret
} // end of method LockTestSeries::Increment
```
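To make the C# `Increment` above runnable in isolation, here is a minimal sketch with a stand-in base class. The semantics of `BeforWrite`/`AfterWrite` are an assumption (a spin lock plus a version counter); the real `Spreads.BaseSeries` may do something different, and the `*Sketch` types and `LockSketchDemo` are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for Spreads.BaseSeries with ASSUMED semantics:
// BeforWrite takes a spin lock and returns the next version,
// AfterWrite publishes the version and releases the lock.
public class BaseSeriesSketch
{
    private int _locker;   // 0 = free, 1 = taken
    private long _version;

    protected long BeforWrite()
    {
        var spinner = new SpinWait();
        // Spin until we atomically flip the lock from 0 to 1.
        while (Interlocked.CompareExchange(ref _locker, 1, 0) != 0)
        {
            spinner.SpinOnce();
        }
        return _version + 1;
    }

    protected void AfterWrite(long version, bool doIncrement)
    {
        if (doIncrement) Volatile.Write(ref _version, version);
        Volatile.Write(ref _locker, 0); // release the lock
    }

    public long Version => Volatile.Read(ref _version);
}

public sealed class LockTestSeriesSketch : BaseSeriesSketch
{
    private long _counter;
    public long Counter => Volatile.Read(ref _counter);

    // Same shape as the C# method in the comparison above.
    public void Increment()
    {
        long v2 = 0L;
        try
        {
            try { }
            finally { v2 = BeforWrite(); } // lock is taken inside a finally block
            _counter++;
        }
        finally
        {
            AfterWrite(v2, true);
        }
    }
}

public static class LockSketchDemo
{
    // Runs `threads` parallel workers, each calling Increment `perThread` times.
    public static long Run(int threads, int perThread)
    {
        var series = new LockTestSeriesSketch();
        Parallel.For(0, threads, _ =>
        {
            for (int i = 0; i < perThread; i++) series.Increment();
        });
        return series.Counter;
    }
}
```

With the lock held around `_counter++`, no updates are lost, so `LockSketchDemo.Run(4, 1000)` returns 4000. Taking the lock inside a `finally` block is a pattern commonly used on .NET Framework so that a thread abort cannot land between acquiring the lock and entering the outer `try`; whether that is the motivation in Spreads is my assumption.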
Existing F# code will mostly remain in Spreads, at least the collections. Immutable collections are a perfect
fit for F#; they were the initial reason why it was used at all. But I almost never use these collections in the
real world, because in some cases they are slower than even the SQLite-based persistent implementation of series,
and they take 60+ bytes per key-value pair. SortedMap and SortedChunkedMap are too polished and well tested
to risk rewriting them and spending time on it (and they are also good examples of high-performance imperative F#).