Migrate from F# to C# when possible #68

Closed
2 of 4 tasks
buybackoff opened this issue Oct 19, 2016 · 2 comments

Comments

@buybackoff
Member

buybackoff commented Oct 19, 2016

  • Freeze any new F# development until tooling and interop are fixed (in VS15?)
  • Move common functionality from Collections to Core project
  • Move base Series and Cursor classes to the Core project and eliminate recursive types
  • Rename Spreads.Extensions project to Spreads, remove existing empty Spreads project. Logically Extensions should "extend" Spreads package and contain non-core functionality, e.g. time-zone conversion, advanced cursors implementation.

Existing F# code will mostly remain in Spreads, at least the collections. Immutable collections are a perfect
fit for F#; they were the initial reason why it was used at all. But I almost never use these collections in the
real world, because in some cases they are slower than even the SQLite-based persistent implementation of series,
and they take 60+ bytes per key-value pair. SortedMap and SortedChunkedMap are too polished and well tested
to risk rewriting them and spending time on it (they are also good examples of high-performance imperative F#).

@buybackoff
Member Author

I have cleaned up the existing code and all of it should stay. I will use internals + InternalsVisibleTo if I need implementation details.

@buybackoff
Member Author

buybackoff commented Dec 19, 2016

Keeping this here for the record: a post from my nuked blog that originated from this issue.

Functional programming is a tool for thought, imperative programming is a tool for hacking

"Why I am taking a pause with F# development (until tooling is fixed or I buy 32 core/64 Gb machine)"

TL;DR; F# is great for algorithms and analytics. It could work for libraries, interop and high-performance
code as well, but the experience is far from great and I do not want to fight with it anymore. At least I tried...

I have been a functional programming aficionado for a while. At some point, I took its advantages over
mutable imperative code almost religiously. I fell in love with F#. Then I fell in love with
mechanical sympathy, and F# - by being a multi-paradigm and .NET language - still allowed me to get
almost all performance I needed, even if it required abusing it somewhat.

But lately, I started to dig even deeper and closer to the metal, to use unsafe code, to do many small
experiments and microbenchmarks, to use .NET Core, and started to question my choice of F# as the
primary language for .NET library development. Tooling, generated code performance, and .NET Core
integration could be better. In fact, I have hardly touched my F# code over the last several
months while programming exclusively on my laptop.

F# is probably the best choice for end product development, but not the one for libraries that require
performance, native interop, and (at this moment) .NET Core interop.

Tooling

Recently I bought the top model of the MacBook Air with 8 Gb of RAM for development, specifically
for F# development, because the F# IDE experience has always been slower than the C# one. But even on this machine,
the IDE is still unresponsive and compilation is slow. While C# projects fly on this machine even with R#,
F# ones crawl even for a project as small as Spreads.Collections.

Refactoring and simple edits cause a very noticeable pause while type inference finishes and the IDE becomes responsive again, and this pause grows non-linearly with project size. I understand
that type inference is expensive and is almost a recompilation step, but when refactoring is done
frequently it takes too much time.

Project recompilation happens on every build, even if there are no changes in the code. And it is slow.
I searched the Internet for a solution, but what I found didn't
help. The recompilation has happened on all 4 of my machines over the past couple of years. And the funny thing is
that F# projects do not see changes in referenced C# projects when compiled separately rather than as part of a solution.

Each problem is minor in isolation, but when they occur this often, most of the development time is spent on tooling,
not coding. In a workflow where I make many small refactorings, add tests and run microbenchmarks to see
the results of the changes, the total time cost becomes too significant. It is like a program whose garbage collection
takes most of the execution time. In an era when live editing and reload are the norm, slow recompilation significantly
impedes productivity. On the same machine, strongly-typed Angular2 (TypeScript) with WebPack
reloads in just several seconds, and VS Code also flies. I wonder why F# compilation couldn't be slower than C#'s by just a few percent rather than several times!?

They say that Russian programmers used to be very good because they lacked frequent access to machines in the 80s and
had to think a lot and perfect their code before getting a chance to actually run it. That contrasts with modern
days, when people start by opening an editor/IDE and think over a blank screen like a writer thinks over a blank sheet of paper.
On my laptop, F# nearly forces me to do such Hammock-driven development.
This is not bad per se, and for an end product, I think this is good. For complex algorithm development, when most of the time
is spent thinking about the domain and not about generated IL, F# type inference in the background
is invaluable for correctness. And from my own experience, I can confirm that when F# code compiles it is very often already correct.

Generated code performance

Generated code performance is also an issue: to get it right in F# one has to write really ugly,
verbose mutable code, inspect the IL and run a profiler to check for implicit allocations, which are
sometimes hidden somewhere in the compiled code. This really defeats F#'s elegance and terseness.
(Also there is an interesting discussion here).
Additions such as struct records/DU and `fixed` are kind of overdue, especially now that I
have already invested heavily in unsafe C#, going as far as an IL rewriting hack taken from corefxlab.

Native interop and unsafe code can be done in F#, but it is ugly and clumsy. The recent addition of the
`fixed` keyword is nice, but there is still no support for it in VS2015, and the NativePtr API
is painful to work with compared to C-like pointers in unsafe C#.
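
To make the NativePtr point concrete, here is a minimal sketch of summing a pinned byte array in F#. It assumes a compiler that supports `fixed` (F# 4.1 or later); `sumBytes` is a hypothetical helper written for this illustration and is not taken from Spreads.

```fsharp
#nowarn "9" // pointer code produces the "unverifiable IL" warning

open Microsoft.FSharp.NativeInterop

/// Hypothetical helper: sums the bytes of an array through a pinned pointer.
let sumBytes (data: byte[]) : int =
    // `fixed` pins the array for the lifetime of `ptr`; ptr : nativeptr<byte>
    use ptr = fixed data
    let mutable total = 0
    for i in 0 .. data.Length - 1 do
        // Unsafe C# would simply write: total += ptr[i];
        total <- total + int (NativePtr.get ptr i)
    total
```

Every read goes through a `NativePtr` function call instead of indexer or dereference syntax, which is the clumsiness described above.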

The Task Parallel Library is the best thing since sliced bread in the .NET universe. Async/await came from F#.
Yet there is still a big overhead when interacting with TPL from F#, and there is no native,
idiomatic way to do so. I tried to use the Task computation expressions from the FSharpx project,
but they explode like a Galaxy S7 in for/while loops and other non-trivial constructs. Simple
Bind and Return work well with recursion, but in the end manual usage of GetAwaiter,
OnCompleted callbacks and TaskCompletionSource is more reliable. I believe TPL integration
at the core language level should be a priority if F# is to be perceived as a sibling on the .NET platform,
and not as an adoptee.
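
The manual approach mentioned above can be sketched roughly as follows. `mapTask` is a hypothetical helper name, not something from Spreads or FSharpx, and the sketch deliberately ignores details such as propagating cancellation as a cancelled (rather than faulted) task.

```fsharp
open System
open System.Threading.Tasks

/// Hypothetical helper: applies `f` to the result of `source` without blocking,
/// using only GetAwaiter/OnCompleted and TaskCompletionSource.
let mapTask (f: 'a -> 'b) (source: Task<'a>) : Task<'b> =
    let tcs = TaskCompletionSource<'b>()
    let awaiter = source.GetAwaiter()
    awaiter.OnCompleted(fun () ->
        try
            // GetResult rethrows the original exception if the task failed
            tcs.SetResult(f (awaiter.GetResult()))
        with ex ->
            tcs.SetException(ex))
    tcs.Task

// Usage: the continuation runs once the source task has completed.
let doubled = Task.FromResult 21 |> mapTask (fun x -> x * 2)
printfn "%d" doubled.Result // prints 42
```

It is verbose, but there is no hidden state machine or builder involved, so what executes is exactly what is written.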

File and type ordering is an F# limitation that magically turns into a blessing most of the time. But when it doesn't,
workarounds are awkward.
I spent some time trying to find a different architecture, but often recursive types across files are just the right thing.
With the recent introduction of recursive modules and namespaces, I could put all my code in a single file,
and sometimes I really think about doing this. #sarcasm
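
For readers unfamiliar with the constraint, here is a small sketch of what mutual recursion requires in F#: both types must live in the same file and be joined with `and` (or sit inside a `rec` module or namespace). The `Series` and `Cursor` shapes below are simplified stand-ins invented for this example, not the actual Spreads types.

```fsharp
// Two mutually recursive types: each one mentions the other,
// so they have to be declared together in a single `type ... and ...` group.
type Series<'K, 'V>(items: ('K * 'V) list) =
    member this.Items : ('K * 'V) list = items
    // Refers to Cursor, which is declared later in the same group
    member this.GetCursor() : Cursor<'K, 'V> = Cursor<'K, 'V>(this)

and Cursor<'K, 'V>(source: Series<'K, 'V>) =
    let mutable remaining = source.Items
    member this.Source : Series<'K, 'V> = source
    /// Returns the next key-value pair, or None when the series is exhausted.
    member this.TryMoveNext() : ('K * 'V) option =
        match remaining with
        | head :: tail ->
            remaining <- tail
            Some head
        | [] -> None

// Usage
let cursor = Series([ (1, "a"); (2, "b") ]).GetCursor()
printfn "%A" (cursor.TryMoveNext())
```

Splitting these two types into separate files is not possible, which is why a base class and its cursor tend to pull the rest of the hierarchy into the same file.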

Implicit allocations and IL generation are almost fixed in the latest versions and get proper attention from the core team,
but sometimes a profiler shows allocations in places where one would not expect them. E.g. Array.average
was allocating objects until just recently. After a lot of
work on eliminating allocations in my Spreads library, it was a surprise to find this. Generated IL code is not as easy
to reason about as with C#, which is almost "what you write is what you get" and is to IL what C is to assembly.
F# sometimes surprises.
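
As an illustration of the trade-off, the sketch below puts the one-line idiomatic call next to the kind of hand-written mutable loop one falls back to when allocations matter. Both function names are made up for this example, and nothing here has been measured against a particular FSharp.Core version.

```fsharp
/// Idiomatic version: short, but what it allocates depends on the FSharp.Core implementation.
let averageIdiomatic (xs: float[]) : float =
    Array.average xs

/// Imperative version: more code, but it is easy to see that it only
/// touches a local accumulator and the input array.
let averageImperative (xs: float[]) : float =
    if xs.Length = 0 then invalidArg "xs" "The input array is empty."
    let mutable sum = 0.0
    for i in 0 .. xs.Length - 1 do
        sum <- sum + xs.[i]
    sum / float xs.Length
```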

One could say I should fix F# itself since it is open source, but for the same reasons as above
just setting up and compiling the VisualFSharp repo from a clean clone is a half-day exercise on a really
powerful workstation, and editing is slow even if it eventually compiles with tests.
By the way, I really tried to fix the array issue above and almost succeeded. However, the
tooling was a great impediment in the process, and I could hardly set up and compile the project
after several attempts. I also reported the IL issue but had no idea how to fix it.
I believe such contributions are optional and a matter of good will, not an obligation when I have my own work to do
(even though I would love to contribute more if I could, in terms of both time and technical ability).
As they say in Russian, "Вам с шашечками или ехать?"/"Do you need limo or lift?", and C# gives a good ride.

Despite the issues above, there is nothing wrong with F# itself and there is still a way to write efficient code.
But to do so, I have to make many small experiments and changes very often, run tests and benchmarks, and repeat...
In such a workflow, tooling again becomes the main obstacle to overcoming these issues.

.NET Core interop

This is both a tooling and a language issue. I often feel that Microsoft almost said: "F^&k you, come back in a year".
It is just not there yet. It doesn't work with a C#/F# project mix. #dontnetcore

When .NET Core was in alpha/beta, it was OK. Now it is 1.0 and there should be no excuses.
It feels again that F# is a side project for MSFT.


If you think this is just F# criticism and a rant, don't get me wrong. This is an open question and a call to action.
I am watching the development of the CoreFX and CoreFXLab projects,
and really like the recent trend of making C# even more performant and even less safe
than with the existing `unsafe` keyword. It feels like all the "fairy tales" about C# as a systems programming language
are slowly materializing. At the same time, C# aggressively takes the good parts from F# and is becoming
more and more functional (in both senses of the word). With Roslyn, it added interactive execution. If only they added `if` as an expression
and a compiler option to disable implicit type conversions, similar to checked arithmetic...
It looks like F#'s relevance will keep diminishing unless Microsoft invests in its tooling.

F# is great and functional programming has many good parts. F# is still faster than Scala,
which takes forever to compile "Hello, World". F# is great for an end product, like trading algorithms
or analytical code. Its absence of implicit conversions and presence of units of measure should
not be taken lightly. (I had a real bug due to implicit int conversions that took a long time and many
sanity checks to find; "select was not broken"). In addition to trading and analytics, there is a kind of library where F# shines like a supernova:

  • metaprogramming. I have a project that implements a very basic query language with FParsec, and it blows my mind
    how easy it is to implement a new language. (Small Basic blog posts are a great start).
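
Returning to the earlier point about units of measure and the absence of implicit conversions, here is a short sketch of what the compiler enforces. The unit names and values are invented for illustration, and the commented-out lines show what I would expect the compiler to reject.

```fsharp
[<Measure>] type usd
[<Measure>] type share

let price = 101.5<usd/share>
let quantity = 200.0<share>

// Multiplication cancels the units, so the result is float<usd>
let notional : float<usd> = price * quantity

// let broken = price + quantity     // rejected: usd/share cannot be added to share

let lots = 3                         // an int
// let scaled = notional * lots      // rejected: int is not silently converted to float
let scaled = notional * float lots   // the conversion has to be spelled out

printfn "%f" (float scaled)
```

The units exist only at compile time, so this checking adds no runtime cost.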

This post started this morning from an issue on GitHub
after I was once again fighting with .NET Core C#/F# interop and many of the issues above. But instead of writing bullets
on what to change in the library, I couldn't help but start writing
about the issues and experiences I have had recently. I invested quite a lot in F# and tried to make
it work where it doesn't fit. But now I am tired and emotional, as if I had drunk a lot of it and were experiencing a hangover.
I am still addicted to it, though. It tastes great and has an elegant flavor, makes programming fun, protects from many classes of errors,
and will remain my #1 choice for end products where correctness is paramount. Oh, and F# interactive is awesome for
such scenarios, but I still could not find a way to use FSI productively for a solution with multiple C#/F# projects
and my workflow. What am I doing wrong?

"Functional programming is a tool for thought, imperative programming is a tool for hacking." (c) Erik Meijer

buybackoff added a commit that referenced this issue Apr 12, 2017
F# is 30+% slower, maxstack is 5 vs 3 - `unit` vars are placed twice, generates `nop`s even in Release, empty `try{}` has `ldnull` and `stloc.2` for unit `()` (cannot have a completely empty `try..finally`, "everything is an expression" has this small cost), uses `ldc.i8 1` instead of `ldc.i4.1 + conv.i8` (http://stackoverflow.com/a/40726190/801189). Probably most importantly, it uses `callvirt instance` whereas C# uses `call instance` (we do not need a null check if we, as a child class, call an instance method of a base class: the caller is not null or it couldn't make an instance call).

BeforWrite/AfterWrite methods from a C# assembly are properly inlined by the JIT, but these other small things accumulate. That is sad and adds to #68. Since a major rewrite is underway, rewriting the two main collections (SM & SCM) using Span is the right thing. With Span we could use mmapped memory for series later. Initially Span will be slower (but probably not by 30%), but later it should be on par with arrays.

F# code
======

```
  [<MethodImpl(MethodImplOptions.AggressiveInlining)>]
  member this.Increment() : unit =
      let mutable v2 = 0L;
      try
        try
          ()
        finally
          v2 <- this.BeforWrite()
        this.counter <- this.counter + 1L
      finally
        this.AfterWrite(v2, true);
```

F# IL
=======

```
.method /*06000008*/ public hidebysig instance void
    Increment() cil managed
  {
    .maxstack 5
    .locals /*11000011*/ init (
      [0] int64 v2,
      [1] class [FSharp.Core/*23000003*/]Microsoft.FSharp.Core.Unit V_1,
      [2] class [FSharp.Core/*23000003*/]Microsoft.FSharp.Core.Unit V_2
    )

    // [17 7 - 17 26]
    IL_0000: nop
    IL_0001: ldc.i8       0 // 0x0000000000000000
    IL_000a: stloc.0      // v2
    .try
    {

      // [18 7 - 18 10]
      IL_000b: nop
      .try
      {

        // [19 9 - 19 12]
        IL_000c: nop

        // [20 11 - 20 13]
        IL_000d: ldnull
        IL_000e: stloc.2      // V_2
        IL_000f: leave.s      IL_001c
      } // end of .try
      finally
      {

        // [21 9 - 21 16]
        IL_0011: nop

        // [22 11 - 22 34]
        IL_0012: ldarg.0      // this
        IL_0013: callvirt     instance int64 class [Spreads.Core/*23000002*/]Spreads.BaseSeries`2<int32, int32>::BeforWrite()
        IL_0018: stloc.0      // v2
        IL_0019: ldnull
        IL_001a: pop
        IL_001b: endfinally
      } // end of finally
      IL_001c: ldloc.2      // V_2
      IL_001d: pop

      // [23 9 - 23 42]
      IL_001e: ldarg.0      // this
      IL_001f: ldarg.0      // this
      IL_0020: ldfld        int64 Spreads.Tests.Profile.LockTestSeries/*02000003*/::counter/*04000006*/
      IL_0025: ldc.i8       1 // 0x0000000000000001
      IL_002e: add
      IL_002f: stfld        int64 Spreads.Tests.Profile.LockTestSeries/*02000003*/::counter/*04000006*/
      IL_0034: ldnull
      IL_0035: stloc.1      // V_1
      IL_0036: leave.s      IL_0044
    } // end of .try
    finally
    {

      // [24 7 - 24 14]
      IL_0038: nop

      // [25 9 - 25 34]
      IL_0039: ldarg.0      // this
      IL_003a: ldloc.0      // v2
      IL_003b: ldc.i4.1
      IL_003c: callvirt     instance void class [Spreads.Core/*23000002*/]Spreads.BaseSeries`2<int32, int32>::AfterWrite(int64, bool)
      IL_0041: ldnull
      IL_0042: pop
      IL_0043: endfinally
    } // end of finally
    IL_0044: ldloc.1      // V_1
    IL_0045: pop
    IL_0046: ret

  } // end of method LockTestSeries::Increment

```

C# code
=========

```
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public void Increment()
{
    long v2 = 0L;
    try
    {
        try { }
        finally
        {
            v2 = this.BeforWrite();
        }
        _counter++;
    }
    finally
    {
        AfterWrite(v2, true);
    }
}
```

C# IL
===========

```
.method /*060000C3*/ public hidebysig instance void
      Increment() cil managed
    {
      .maxstack 3
      .locals /*11000012*/ init (
        [0] int64 v2
      )

      // [36 17 - 36 29]
      IL_0000: ldc.i4.0
      IL_0001: conv.i8
      IL_0002: stloc.0      // v2
      .try
      {
        .try
        {

          // [39 27 - 39 28]
          IL_0003: leave.s      IL_000d
        } // end of .try
        finally
        {

          // [42 25 - 42 48]
          IL_0005: ldarg.0      // this
          IL_0006: call         instance int64 class [Spreads.Core/*23000002*/]Spreads.BaseSeries`2<int32, int32>::BeforWrite()
          IL_000b: stloc.0      // v2

          // [43 21 - 43 22]
          IL_000c: endfinally
        } // end of finally

        // [44 21 - 44 32]
        IL_000d: ldarg.0      // this
        IL_000e: ldarg.0      // this
        IL_000f: ldfld        int64 Spreads.Core.Tests.LockFreeTests/*02000007*//LockTestSeries/*02000028*/::_counter/*04000013*/
        IL_0014: ldc.i4.1
        IL_0015: conv.i8
        IL_0016: add
        IL_0017: stfld        int64 Spreads.Core.Tests.LockFreeTests/*02000007*//LockTestSeries/*02000028*/::_counter/*04000013*/

        // [45 17 - 45 18]
        IL_001c: leave.s      IL_0027
      } // end of .try
      finally
      {

        // [48 21 - 48 42]
        IL_001e: ldarg.0      // this
        IL_001f: ldloc.0      // v2
        IL_0020: ldc.i4.1
        IL_0021: call         instance void class [Spreads.Core/*23000002*/]Spreads.BaseSeries`2<int32, int32>::AfterWrite(int64, bool)

        // [49 17 - 49 18]
        IL_0026: endfinally
      } // end of finally

      // [50 13 - 50 14]
      IL_0027: ret

    } // end of method LockTestSeries::Increment
```