[Proposal] Support tail recursion #2544

gafter · 2015-03-12T22:02:38Z

gafter
Mar 12, 2015

For performance reasons, it would be very helpful if some method invocations were to generate tail calls, in which the caller's stack frame is discarded and replaced by the callee's. When a method calls itself in tail position, the compiler can instead generate a jump to the start of the method.
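As a rough illustration of the self-call case, here is a minimal sketch. The class and method names (`SelfTailCallSketch`, `SumRecursive`, `SumLoop`) are hypothetical and not part of the proposal. The second method is a hand-written version of what "a jump to the start of the method" amounts to: the arguments are overwritten and control returns to the top.

```csharp
public static class SelfTailCallSketch
{
    // Tail-recursive form: the recursive call is the last thing the method does,
    // and its result is returned unchanged.
    public static long SumRecursive(int n, long acc)
    {
        if (n == 0)
            return acc;
        return SumRecursive(n - 1, acc + n);
    }

    // Hand-written equivalent of "jump to the start of the method":
    // overwrite the parameters with the new argument values, then loop.
    public static long SumLoop(int n, long acc)
    {
        while (true)
        {
            if (n == 0)
                return acc;
            acc += n;   // new value of the second argument (uses the old n)
            n -= 1;     // new value of the first argument
        }
    }
}
```

Both methods compute the same value for small inputs; only the loop form is certain to run in constant stack space regardless of what the compiler or JIT decides to do with the recursive call.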

There are a few ways one might designate that one desires tail calls in the generated code:

1. The compiler could always generate tail calls when applicable.
2. The compiler could generate tail calls when applicable in production builds but not debug builds. Equivalently, the compiler could recognize a command-line option (or project-wide option) that causes the generation of tail calls wherever possible.
3. The language could provide a syntax for tail invocations.
4. The compiler could recognize an attribute on the enclosing method and generate tail calls in the method's body wherever possible.

The advantage of a tail invocation is that it reduces the amount of stack used in a recursive call chain. That may improve performance due to improved memory locality. For some kinds of recursive code, it would allow the code to run with unbounded recursion; that makes it practical (without loss of performance) to replace loops with equivalent recursive code where the recursive code is clearer.

The disadvantages are that it makes debugging a bit harder, as stack frames with tail calls disappear from the call stack. Similarly they disappear from an exception stack trace. If the compiler does it automatically, this would be a behavioral change from previous releases, and would "break" any client that depends on the shape of the stack trace.

gafter · 2015-03-12T22:03:14Z

gafter
Mar 12, 2015
Author

/cc @pharring @MattGertz

0 replies

svick · 2015-03-12T22:15:55Z

svick
Mar 12, 2015
Collaborator

AFAIK the CLR already supports tail recursion in some cases. What exactly are the cases where it doesn't and wouldn't it make sense to improve the CLR for those cases, instead of the compiler?

0 replies

MattGertz · 2015-03-12T22:25:19Z

MattGertz
Mar 12, 2015
Collaborator

Yes, the CLR supports the tail. prefix. However, the Roslyn compiler currently has no facility for emitting tail. at all, regardless of what the CLR currently does. This is unlike F#, for example, where tail recursion is leveraged.

0 replies

VSadov · 2015-03-12T23:41:24Z

VSadov
Mar 12, 2015
Collaborator

Generally, the JIT emits tail calls when it finds that profitable. In some sense it is a hardware-specific optimization. The same call may benefit from tailcalling on x64 (where stack frames are larger) and not so much on x86 (where the register set is limited). Unsurprisingly, the JIT is much more aggressive with tail calls on x64.
I think for perf-oriented tailcalls it may still be better to leave the decision to the JIT, as the compiler that is closer to the metal.

There is another set of situations where tailcalls are used for correctness/safety. This is the case where a long chain of methods, one deferring to another (recursively or not) is expected. This is common in F# and is also a pattern used by DLR callsite binders. (http://dlr.codeplex.com/SourceControl/latest#DLR_Main/Runtime/Microsoft.Scripting.Core/Compiler/LambdaCompiler.Expressions.cs look at EmitMethodCall method)

These are the cases where the OpCodes.Tailcall IL prefix is used to force tailcalling. The prefix basically says "please tailcall this even if that might not be profitable, since we expect a long chain of calls".
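For readers who have not seen the prefix in use, here is a minimal sketch of emitting it through System.Reflection.Emit. It is not the DLR code referenced above; the class name `TailPrefixSketch` and the generated `SumTo` method are made up for illustration. Whether the runtime actually honors the request depends on the runtime version and platform.

```csharp
using System;
using System.Reflection.Emit;

public static class TailPrefixSketch
{
    // Builds the equivalent of:
    //   long SumTo(int n, long acc) => n == 0 ? acc : SumTo(n - 1, acc + n);
    // with the recursive call marked by the tail. prefix.
    public static Func<int, long, long> BuildSumTo()
    {
        var method = new DynamicMethod(
            "SumTo", typeof(long), new[] { typeof(int), typeof(long) },
            typeof(TailPrefixSketch).Module);
        ILGenerator il = method.GetILGenerator();
        Label recurse = il.DefineLabel();

        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Brtrue, recurse);   // n != 0: take the recursive path
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Ret);               // n == 0: return acc

        il.MarkLabel(recurse);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldc_I4_1);
        il.Emit(OpCodes.Sub);               // first argument: n - 1
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Conv_I8);
        il.Emit(OpCodes.Add);               // second argument: acc + (long)n
        il.Emit(OpCodes.Tailcall);          // prefix: request a tail call
        il.Emit(OpCodes.Call, method);      // the recursive call itself
        il.Emit(OpCodes.Ret);               // ret must directly follow the call

        return (Func<int, long, long>)method.CreateDelegate(typeof(Func<int, long, long>));
    }
}
```

Calling `TailPrefixSketch.BuildSumTo()(10, 0)` evaluates the generated method. The point of the sketch is only that the prefix is reachable from IL generation APIs, while C# source has no way to ask for it.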

This is a part that is not expressible in C#. Unlike inlining, which can be controlled via attributes, tailcalling currently cannot be forced. Anyone who needs to write code like that emitted by EmitMethodCall cannot use C#.

In this light it might be interesting to consider options 3 or 4 because, while rare, there are indeed known cases where the user has reasons to force tailcalling.

0 replies

gafter · 2015-03-13T00:17:13Z

gafter
Mar 13, 2015
Author

The runtime generally does not do this automatically as that would make it impossible for the runtime to produce stack traces.

0 replies

tmat · 2015-03-13T00:33:13Z

tmat
Mar 13, 2015
Collaborator

Seems like the cleanest way of doing this would be to add MethodImplOptions.AggressiveTailCall (similar to AggressiveInlining). Then either the compiler would emit the tailcall prefix or the JIT would just understand the flag.
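To make the suggestion concrete, here is a small sketch. `MethodImplOptions.AggressiveInlining` is a real, existing flag; `AggressiveTailCall` is hypothetical and appears only in a comment. The class and method names are illustrative.

```csharp
using System.Runtime.CompilerServices;

public static class MethodImplSketch
{
    // Existing flag: asks the JIT to inline this method where it can.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Square(int x) => x * x;

    // Hypothetical: MethodImplOptions.AggressiveTailCall does not exist today.
    // Under the suggestion above it would be applied the same way, e.g.
    //   [MethodImpl(MethodImplOptions.AggressiveTailCall)]
    // and the call in tail position below is what it would affect.
    public static long Count(int n, long acc) => n == 0 ? acc : Count(n - 1, acc + 1);
}
```

A method-level flag like this cannot say which call inside the method is meant, which is one reason a per-call syntax is also on the table.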

0 replies

gafter · 2015-03-13T00:51:10Z

gafter
Mar 13, 2015
Author

@tmat Yes, that was option (4).

0 replies

tmat · 2015-03-13T01:16:05Z

tmat
Mar 13, 2015
Collaborator

@gafter Kind of. I was trying to point out 1) that we can use an existing attribute and 2) the JIT could do this, not the compiler.

0 replies

gafter · 2015-03-13T01:24:39Z

gafter
Mar 13, 2015
Author

@tmat Oh, I didn't realize a suitable attribute already existed. When you said "add" I thought you meant we'd add the attribute.

Does the JIT already do this? I suspect not, since it affects the stack trace, but perhaps the attribute is license to do so.

0 replies

leppie · 2015-03-13T06:55:30Z

leppie
Mar 13, 2015

@tmat Just attributing a method is not sufficient. You could have 2 paths where one is tail callable and the other not. This is probably best left to the JIT to decide. Tail calls in most cases are not cheap.

OTOH, if you could explicitly indicate a tail call, the C# compiler could perform tail call elimination (in other words, turn it into a while loop, like F#/IronScheme do), which is a very good optimization.

0 replies

tmat · 2015-03-13T06:58:46Z

tmat
Mar 13, 2015
Collaborator

@gafter I meant to add a new value to an existing enum that is used by an existing attribute.

0 replies

paulomorgado · 2015-03-13T15:46:11Z

paulomorgado
Mar 13, 2015

A way of explicitly requesting tail calls could help with debugging. Maybe some MethodImplOptions value.

That way it would be a developer-requested action. It would be reasonable to expect that a developer who requested it will know about it when debugging. It would also be explicit for anyone else working on that code.

0 replies

VSadov · 2015-03-13T16:10:10Z

VSadov
Mar 13, 2015
Collaborator

@gafter The JIT definitely does tailcalls when running optimized code and not debugging.

The following program works and terminates normally on x64:

    class Program
    {
        static void Main(string[] args)
        {
            System.Console.WriteLine(Ping(int.MaxValue, 0));
        }

        public static long Ping(int cnt, long val)
        {
            if (cnt-- == 0)
                return val;

            return Pong(cnt, val + cnt);
        }
        public static long Pong(int cnt, long val)
        {
            if (cnt-- == 0)
                return val;

            return Ping(cnt, val + cnt);
        }
    }

Make sure the code is not forced to run on 32bit via "32bit preferred" (where tailcalling is considered generally not profitable) and is not running under debugger (which suppresses JIT optimizations).

===== Output:
2305843005992468481
Press any key to continue . . .

===== the native codegen for the Ping:

            if (cnt-- == 0)
00000000  mov         eax,ecx 
00000002  lea         ecx,[rax-1] 
00000005  test        eax,eax 
00000007  jne         000000000000000D 
                return val;
00000009  mov         rax,rdx 
0000000c  ret 
0000000d  movsxd      rax,ecx 
00000010  add         rdx,rax 
00000013  mov         rax,7FFDDD600088h // address of Pong
0000001d  jmp         rax               // <-- tailcalling Pong here

This does interfere with stack traces/debugging, but that is basically a release/debug tradeoff. Besides, there are existing attributes to selectively suppress JIT optimizations if needed.

Note that as far as performance is concerned, the JIT can do a better job than C# here by choosing the fastest calling convention for the actual execution environment.
Only if we have extra whole-program knowledge that Ping/Pong are mutually recursive, and want to ensure that this code works regardless of whether it runs under a debugger or on x86, do we want a way to force tailcalls.
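Absent a way to force tailcalls, one common workaround for mutual recursion is a trampoline: each method returns a description of the next call instead of making it, and a driver loop runs the steps. This is a different technique from the tail. prefix, sketched here for the Ping/Pong program under the assumption that allocation cost is acceptable. The `TrampolineSketch`, `Step`, and `Run` names are illustrative.

```csharp
using System;

public static class TrampolineSketch
{
    // A step either carries a final result or a thunk that produces the next step.
    public sealed class Step
    {
        public readonly long Result;
        public readonly Func<Step> Next;   // null when the computation is finished
        public Step(long result) { Result = result; }
        public Step(Func<Step> next) { Next = next; }
    }

    public static Step Ping(int cnt, long val)
    {
        if (cnt-- == 0)
            return new Step(val);
        return new Step(() => Pong(cnt, val + cnt));   // deferred, not called here
    }

    public static Step Pong(int cnt, long val)
    {
        if (cnt-- == 0)
            return new Step(val);
        return new Step(() => Ping(cnt, val + cnt));   // deferred, not called here
    }

    // Runs the steps in a loop, so the stack depth stays constant
    // no matter how long the Ping/Pong chain is.
    public static long Run(Step step)
    {
        while (step.Next != null)
            step = step.Next();
        return step.Result;
    }
}
```

`TrampolineSketch.Run(TrampolineSketch.Ping(1000000, 0))` walks a million-call chain without deep recursion, independent of debugger, bitness, or JIT heuristics. The price is one allocation per step, which is part of why a language-level guarantee is attractive.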

0 replies

Sebazzz · 2015-04-04T10:04:34Z

Sebazzz
Apr 4, 2015

Isn't the point here that the generation (or not) of the tail call should be deterministic, and not based on the environment or the JIT? The only way to guarantee tail calls is if the compiler generates them, not if the JIT does (which cannot be predicted).

0 replies

sharwell · 2015-04-04T15:53:06Z

sharwell
Apr 4, 2015
Collaborator

@gafter I think the first thing you need to decide is whether you want C# developers to be able to use tail call elimination as a key factor in algorithm implementation. For this to happen, developers must be able to rely on tail call elimination, which means two things:

The compiler must insert the tail. prefix on these calls
The source code must be able to express a semantic requirement that tail. be inserted on a specific call

If you don't want to provide the above, then really what you have is a behind-the-scenes optimization like inlining that C# developers may benefit from but shouldn't assume will or will not happen during any particular execution of an application.

Also keep in mind that stack traces during most production executions are not deterministic, so changing them isn't really a breaking change. They have already changed in the past.

1 reply

madelson Jan 21, 2023

The source code must be able to express a semantic requirement that tail. be inserted on a specific call

This is really important. Because of stack overflow, whether or not this optimization is applied can make the difference between a program working vs. crashing. If I had some code that relied on tail calls to avoid stack overflow, I'd want a way to have the compiler assert that it is actually generating tail instructions, so that if someone comes along later and breaks the tail call they'll know about it.

MkazemAkhgary · 2017-12-16T13:15:53Z

MkazemAkhgary
Dec 16, 2017

@psibernetic I think it doesn't require a keyword or special syntax, because this is only an optimization on certain types of recursive methods that can be replaced by loops or jumps.

@gafter We are nearly in 2018 and this is not yet supported? By the way, to me options 2 and 4 seem most desirable.

Option 4 has no critical drawbacks; option 5 is to introduce a compiler option.

0 replies

jaykrell · 2018-07-21T04:44:23Z

jaykrell
Jul 21, 2018
Collaborator

This is a surprisingly complex topic and many people (including me?) only understand some of it.
People often ignore one part and exaggerate another.

Let me give what I know.

Sometimes tailcall is a performance win-win.
It can save CPU.
jmp is cheaper than call/ret
It can save stack.
Touching less stack makes for better locality.

Sometimes tailcall is a performance loss, stack win.
The CLR has a complex mechanism in which to pass more parameters
to the callee than the caller received. I mean specifically
more stack space for parameters. This is slow. But it conserves stack.
It will only do this with the tail. prefix.

If the caller parameters are stack-larger than callee parameters,
it is usually a pretty easy win-win transform.
There might be factors like parameter-position changing from managed
to integer/float, and generating precise StackMaps and such.

Now, there is another angle, that of algorithms that demand tailcall
elimination, for purposes of being able to process arbitrarily large
data with fixed/small stack. This is not about performance, but about
ability to run at all.

People spoke of loops.
One must consider that tailcalls can be self-tailcalls or tailcall-to-other.
Typically, "when it matters", tailcall-to-other is a mutual recursion of 2
or more functions, forming a loop.

For portability, across languages (C#, Java, C++), runtimes (Mono, CoreCLR),
compiler and runtime optimization flags, it is difficult to rely on opportunistic
tailcall elimination. I have not heard of anything but .NET even having
the tail. prefix and its "strong encouragement". But perhaps Java has it?
I understand Scheme mandates, but what are the Scheme implementations?
For example, anything translating to C++ will be at the mercy of that which
they do not control, unless they transform to loops.

Some C++ programmers, as well, are trained to avoid recursion in order to use
bounded machine stack. Perhaps that is the best answer for its portability
and performance.

But it'd still be nice to see C# attempt to address this area more formally,
to allow the idiom of self-recursion and mutual-recursion to be executable
in fixed amount of stack.

While F# might be a great language, one should not have to leave C# to gain this feature.
I kinda wish there was some delineation in the language of "fast" and "expensive" tailcalls.
But it isn't just the stacked parameter size.
But maybe this is too low level for a programmer to think about.
The runtimes also have "extra parameters" for generics and, at least Mono, interface calls,
and "generic shared with value type", which provide additional tailcall challenges on some
architectures, e.g. arm32.

0 replies

jaykrell · 2018-12-13T21:40:28Z

jaykrell
Dec 13, 2018
Collaborator

There are 2 kinds of tailcalls in .NET. There are fast ones, strictly faster & conserve stack. And there are slow ones that are slow and conserve stack.

You will only get the slow ones with explicit tail. prefix.

C++ surely only will do the fast ones.

The slow ones are very complicated, involving helpers and stack unwind.
They handle arbitrary caller/callee signature mismatch. I didn’t think this was possible but it is.

The fast ones are very simple, for when caller & callee signature match or roughly match. This is what most people probably think of.

0 replies

orthoxerox · 2018-12-14T07:29:51Z

orthoxerox
Dec 14, 2018

@jaykrell Are the fast ones you're talking about the jmp <method> ones?

0 replies

YairHalberstadt · 2019-01-07T22:16:11Z

YairHalberstadt
Jan 7, 2019
Collaborator

There are two different things being discussed here. One is emitting tail. calls in the compiler, and the other is tail call elimination by the compiler. Which one was this issue intended to discuss, or were they both on the table?

Is it worth splitting them out into 2 separate issues?

0 replies

orthoxerox · 2019-01-08T10:28:10Z

orthoxerox
Jan 8, 2019

@YairHalberstadt tail. calls are probably the wrong way forward. They are ignored by the runtime in debug configuration, compress the call stack, and make it impossible to "break on the closing brace".

I'd rather have compiler support for explicitly self-recursive methods that are rewritten into loops, a la F#.

0 replies

YairHalberstadt · 2019-01-08T10:58:39Z

YairHalberstadt
Jan 8, 2019
Collaborator

@orthoxerox
I agree. But this discussion doesn't make that clear. Is it worth creating a separate issue to track tail call elimination in the compiler via an attribute?

0 replies

YairHalberstadt · 2019-03-03T18:14:02Z

YairHalberstadt
Mar 3, 2019
Collaborator

Handling local tail calls is an uninteresting corner case that can be done by the front-end compiler and has no effect on Roslyn as a consequence so I don't think it is worth discussing here.

Roslyn includes two front end compilers. One for C# and one for VB

0 replies

samuel-vidal · 2019-05-17T23:05:59Z

samuel-vidal
May 17, 2019

Following comment moved from: #2304 (comment)

I think it is an excellent idea

With C# supporting a very comfortable style of functional programming, it becomes increasingly valuable to have a construct that helps guarantee stack safety.

If we agree this is the purpose of the proposal, I think we should specify its semantics as follows:

The proposed 'tail' syntax should mean exactly this: the compiler is asked to 1) check that the call is a tail call and 2) guarantee that the call will be optimized (either by the JIT, through emitting the necessary IL, or otherwise).

In my opinion it would be preferable to make the IL tail. prefix work than to make this a compiler code rewrite (which is way too narrow, and way more complex).

A tail call is one that is performed just before exiting a method: either returning directly (with no value) if the call returns void, or returning the result of the call if it returns something. No operation may be performed on the return value (not even an implicit cast, although an upcast should be fine).

(In particular, not all recursive calls are tail-recursive, and not all tail calls need to be part of a recursion.)

Another key point is that for a call to be a tail call, both methods have to share the same return type (this can be relaxed a bit with covariance).

Also, assuming the criterion is implemented in the compiler, the tail. IL prefix should be emitted even in the absence of the tail constraint.

proposed syntax :
tail qualifiedmethodname(parameters); // (the syntax is not tied to the return statement).

examples

void method1()
{
	tail method1();		// tail call
	return;
	...
	method2();		// tail call
	return;
}

void method2()
{
	method1();		// tail call
	return;
	...
	method3();		// Not tail
	return;
}

int method3()
{
	method1();		// Not tail
	return 1;
	...
	return method3();		// tail call
	...
	return method4() + 1;		// Not tail
}

int method4()
{
	return tail method3();		// tail call
	...
	tail method3();		// compilation fails with an error (or a warning)
	return 1;
	...
	return x == j ? tail method3() : method4() + 1;	// only the first call is tail.
}

A few points:

The real difficulty here, I think, will be the interaction with ref/in/out arguments (let alone stackalloc etc.) rather than the tail call being to the same method.
A point to keep in mind is that a virtual method calling itself is actually performing another virtual dispatch (which could be confusing for the user).

Scope: The compilation should fail or produce a warning if the conditions for guaranteeing tail call optimization by the JIT are not met (or are too hard to figure out, e.g. in the case of in/out/ref args etc.).

0 replies

gafter · 2019-05-19T22:23:25Z

gafter
May 19, 2019
Author

/cc @agocke

0 replies

Korporal · 2019-08-03T18:59:11Z

Korporal
Aug 3, 2019

Isn't the primary motive for tail recursion to avoid stack overflow? Recursion is the norm in functional languages but far less common in most imperative languages. So just how many people or scenarios would benefit from this?

This sounds like a feature that would benefit just a tiny fraction of the use cases where C# is used. The effort should be devoted to more pressing and immediate language enhancements that yield a bigger payoff.

0 replies

spydacarnage · 2019-08-03T19:30:44Z

spydacarnage
Aug 3, 2019

I would use recursion a lot more if tail recursion were a thing we could have.

0 replies

gafter · 2019-08-03T21:51:08Z

gafter
Aug 3, 2019
Author

@Korporal Many Roslyn compiler failures can be traced to the fact that the compilers are recursive on the shape of the program. Eliminating unnecessary frames would relax or eliminate many program size and shape constraints that cause the compiler to blow up. So it would at least be very useful for the Roslyn compiler, and for every user of the Roslyn compiler, even if they don't use the feature directly.

0 replies

AartBluestoke · 2019-08-06T23:24:29Z

AartBluestoke
Aug 6, 2019

The problem is that the following pattern is fundamentally dangerous unless you can GUARANTEE either:
a) quick termination of recursion (limited by an unknown stack size limitation)
b) very deep recursion, AND tail recursion (originally written as "an eventual termination, AND tail recursion")
Edit: responding to the follow-up comment by awshelley-beckman: the guarantee concerned the picking of a) or b), not the termination of the recursion. At the moment we do a) and explode with a stack overflow, whereas with b) tail recursion would allow potentially infinite recursive functional programming patterns.

The first eliminates a large number of use cases (would you annotate a function "probably crashes with stack overflow if passed a list of more than 32k elements, or half that if you use this in ASP.NET"?).

This feature would be great. A DEMAND for tail recursion, with a compilation failure or warning otherwise, would allow us to actually use "potentially" tail-recursive code in scenarios where you know it could loop 1 million times.

At the moment, even if Roslyn reliably supported tail recursion, using tail recursion would still be a "gosh, I hope this compile actually produces the specific byte code pattern that I REQUIRE for sane program behaviour".

When the creator of a program requires a specific compile pattern for correctness, it would be helpful if the language/compiler supported that request.

This results in recommendations to not use tail recursion unless your algorithm is O(log n) (eg https://softwareengineering.stackexchange.com/a/189537/111157 )

Code pattern many people would like to guarantee to be safe:

thing f(something){
..
return f(modifiedSomething);
}

3 replies

awshelley-beckman Mar 19, 2021

b) an eventual termination, AND tail recursion

I would argue that this isn't necessary. The language has no problem with while (true) loops that never terminate. Why should tail call recursion be any different?

At the moment even if roslyn reliably supported tail recursion, Using tail recursion would still be a "gosh, i hope this compile actually produces the specific byte code pattern that i REQUIRE for sane program behaviour".

Perhaps a special "tail" keyword could be added, which tells the compiler that you want this call to be a tail call.

timcassell Feb 12, 2022

Perhaps a special "tail" keyword could be added, which tells the compiler that you want this call to be a tail call.

I like that idea.

agocke Feb 12, 2022
Collaborator

#2304
