Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tail call optimization for partially applied functions and functions coming from a pattern match #3301

Closed
pema99 opened this issue Dec 15, 2022 · 15 comments · Fixed by #3302
Closed

Comments

@pema99
Copy link

pema99 commented Dec 15, 2022

Description

Tail calls are not optimized in some cases where I hoped they would be, which can lead to stack overflows given large enough input.

Repro code

Consider this simple function. It get's optimized to a while-loop:

let rec foo x a =
    if x <= 0 then a
    else
        foo (x-1) (a+1)

However, this does not, as the function is partially applied:

let rec foo x a =
    if x <= 0 then a
    else
        let bb = foo (x-1)
        bb (a+1)

And this also doesn't, because we had to pattern match to get the function:

let rec foo x a =
    if x <= 0 then a
    else
        let (bb, _) = (foo, 1)
        bb (x-1) (a+1)

In general, it seems any kind of delayed tail call isn't optimized. I assume this is just something nobody has implemented yet, but I am curious if it is tracked, and if there are any plans to do so? FWIW, Microsoft's main F# compiler currently optimizes all 3 of these.

Expected and actual results

All 3 functions are tail recursive, and should be optimized into while-loops to not overflow the stack. One might also consider using a trampoline in these cases, if this is deemed too difficult. That tends to be easier to implement, but slower.

@pema99 pema99 changed the title Tail call optimization for functions partially applied functions and functions coming from a pattern match Tail call optimization for partially applied functions and functions coming from a pattern match Dec 15, 2022
@ncave
Copy link
Collaborator

ncave commented Dec 16, 2022

@pema99 It's possible that these cases are tail-call optimized by the .NET JIT, not by the F# compiler.
Unfortunately there are no such optimizations in most JavaScript engines at the moment. There are JavaScript workarounds with varying performance implications (trampolines, etc.), but none are implemented out of the box in Fable (besides what the F# compiler already does). See also #2392.

It is, of course, also possible that the tail-call elimination analysis that Fable does is just not taking into account delayed functions at tail position. If you have a proposal or an idea for improving it, by all means share it, contributions are always welcome.

@ncave
Copy link
Collaborator

ncave commented Dec 16, 2022

@pema99 Looks like you're correct, here is what the first one compiles to in .NET:

let rec foo x a =
    if x <= 0 then a
    else
        let bb = foo (x-1)
        bb (a+1)

Compiles to (C#):

public static int foo(int x, int a)
{
    while (x > 0)
    {
        int num = x - 1;
        a++;
        x = num;
    }
    return a;
}

So we have to figure out how the F# compiler codegen does that for delayed functions.

@pema99
Copy link
Author

pema99 commented Dec 16, 2022

I think one approach is to do a kind of abstract interpretation of the syntax tree for whatever intermediate representation is most relevant at this point (not familiar with Fable code base).

Just as type inference can be viewed as abstract interpretation where values are types, I think you can treat this problem as abstract interpretation where values are objects carrying information about whether a recursive call was found, and how many arguments have been applied to it so far. You'd keep adding arguments to the object as you find applications, and once these reach target arity, you eliminate the call if in tail position.

Another option is to let the user manually annotate functions they want trampolined, or something like that. Trampolines are very easy to implement in JS with downside being bad performance for small inputs.

I admit that first description is pretty vague, but it's probably how I would start trying to solve this.

@pema99
Copy link
Author

pema99 commented Dec 16, 2022

It just seems a little spoopy to me that you can take perfectly fine dotnet F# code and compile with fable, and then start seeing stack overflows from doing so :9

If I wanted to look at how fables current tail call analysis is done, do you know where I should start looking?

@pema99
Copy link
Author

pema99 commented Dec 16, 2022

Ah, just saw the issue you linked. Didn't realize Fable also doesn't optimize mutual tail recursion or CPS. I don't have much experience using Fable yet, so perhaps my expectations were just set a bit too high. Those seem more pressing, I suppose :9. Was primarily curious if the lack of handling for delayed tail recursion was something you were aware of at all.

@ncave
Copy link
Collaborator

ncave commented Dec 16, 2022

@pema99 Yes, I believe spoopy is the correct technical term. Looks like the problem actually happens before the tail call position check, in that the beta reduction of bb is not happening, so the call is not in a tail position. I'll take a look.

ncave added a commit to ncave/Fable that referenced this issue Dec 16, 2022
@ncave ncave mentioned this issue Dec 16, 2022
@pema99
Copy link
Author

pema99 commented Dec 16, 2022

@ncave Oh, interesting. Does this fix both the failing cases I mentioned? If so, fix seems much simpler than what I was proposing - but then again, I really don't know what infrastructure is already present in the compiler :P

@ncave
Copy link
Collaborator

ncave commented Dec 16, 2022

@pema99 Right now it only fixes the first, I'm looking at the second (tuple binding).

@ncave
Copy link
Collaborator

ncave commented Dec 16, 2022

@pema99 I think I fixed both, looks like both were regressions that used to work before. Thanks for contributing this issue!

@pema99
Copy link
Author

pema99 commented Dec 16, 2022

Great success! :D

@MangelMaxime
Copy link
Member

@ncave I opened an issue where the fixes for this issue is only applied when Fable is run in release mode.

See #3522

Is it possible to support it when in watch mode too? Otherwise, it means we have disparity in terms of feature depending which mode Fable is running in.

@ncave
Copy link
Collaborator

ncave commented Sep 11, 2023

@MangelMaxime The default Fable CLI -c or --configuration option says that "default is 'Debug' in watch mode, or 'Release' otherwise", which means that yes, it will be different (as Debug builds are quite different than Release builds, even on .NET).

So perhaps the -c or --configuration option can be used together with --watch to force Release mode.

@MangelMaxime
Copy link
Member

So perhaps the -c or --configuration option can be used together with --watch to force Release mode.

Using --watch --configuration Release does fix the problem with the tests.

But this means that some features are supported in Release and not in Debug mode. When in an ideal world it should work in both case.

Because, if I am a normal user, and have my code falling I will think of a bug and open an issue.

@ncave
Copy link
Collaborator

ncave commented Sep 11, 2023

@MangelMaxime

some features are supported in Release and not in Debug mode

This is a F# compiler behavior, it produces different code in Debug vs Release mode.
IMO it's reasonable to expect different code being generated when optimizations are turned off (i.e. in DEBUG mode). That's just how it works for most compilers and most languages.

I don't know why the default mode for --watch was set to DEBUG, there was probably a reason, but perhaps it can be set to Release.

@MangelMaxime
Copy link
Member

IMO it's reasonable to expect different code being generated when optimizations are turned off (i.e. in DEBUG mode). That's just how it works for most compilers and most languages.

I understand that part :) But in most compilers, I think both compile valid code.

I don't know why the default mode for --watch was set to DEBUG, there was probably a reason, but perhaps it can be set to Release.

Historically, it has been set to DEBUG so library like Elmish.HMR which needs a different code generated between dev and release mode could rely on the DEBUG flag. Today, I would preferred to have a FABLE_WATCH flag to avoid situation where we rely on an external flag but I think the breaking changes is too risky as it could break a lot of libraries / programs that rely on the current behaviour.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants