Proper tail calls #1888
DemiMarie
added some commits
Feb 7, 2017
camlorn
Feb 7, 2017
It should be possible to implement tail calls as some sort of transformation in rustc itself, predicated on the backend allowing manipulation of the stack and supporting some form of goto. I assume that WebAssembly at least would allow for us to write our own method calls, but haven't looked at it.
C is a problem, save in the case that the become keyword is used on the function we are in (there is a name for this that I'm forgetting). In that case, you may be able to just reassign to the arguments and reuse variables. More advanced constructions might be possible by wrapping the tail calls in some sort of outer running loop and returning codes as to which to call next, but this doesn't adequately address how to go about passing arguments around. I wouldn't rule out being able to support this in ANSI C, it's just incredibly tricky.
camlorn
Feb 7, 2017
Okay, an outline for a specific scheme in C:
- Post-monomorphization, find all functions that use become. Build a list of the possible tail calls that may be reached from each of these functions.
- Declare a C variable for all variables in all the functions making up the set. Add a variable, code, that says which function we're in; a code of 0 means we're done. Add another variable, ret, to hold the return value.
- Give each function a nonzero code and copy the body into a while loop that dispatches based on the code variable. When the while loop exits, return ret. Any instances of return are translated into an assignment to ret and setting code to 0.
- When entering a tailcall function, redirect the call to the special version instead.
I believe this scheme works in all cases. We can deal with the issue of things that have Drop impls by dropping them: the variable can stay around without a problem, as long as the impl gets called. The key point is that we're declaring the slots up front so that the stack doesn't keep growing. The biggest disadvantage is that we have to declare all the slots for all the functions, and consequently the combined stack frame is potentially (much?) larger than if we had done it in the way the RFC currently proposes. If making sure there is parity in terms of performance is a concern, this could be the scheme used in all backends. Nonetheless, it works for any backend which C can be compiled to.
Unless I'm missing something obvious, anyway.
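To make the outline above concrete, here is a minimal sketch of the transformation, written in stable Rust rather than C. The functions is_even and is_odd, the constants, and the run helper are hypothetical illustrations, not anything taken from the RFC or from rustc; they stand in for two functions that would tail-call each other with become.

```rust
// Conceptually, the two source functions would be:
//   is_even(n): if n == 0 { true }  else { become is_odd(n - 1) }
//   is_odd(n):  if n == 0 { false } else { become is_even(n - 1) }

// Function codes; 0 means "done", as in the outline.
const DONE: u8 = 0;
const IS_EVEN: u8 = 1;
const IS_ODD: u8 = 2;

/// Combined body: every slot is declared up front and a loop dispatches on
/// `code`, so the stack does not grow however many "calls" are made.
fn run(entry: u8, arg: u64) -> bool {
    let mut code = entry;
    let mut n = arg; // shared slot for the argument of both functions
    let mut ret = false; // slot for the eventual return value
    while code != DONE {
        match code {
            IS_EVEN => {
                if n == 0 {
                    // `return true` becomes: assign `ret`, set `code` to 0.
                    ret = true;
                    code = DONE;
                } else {
                    // `become is_odd(n - 1)`: reassign the slot, switch code.
                    n -= 1;
                    code = IS_ODD;
                }
            }
            IS_ODD => {
                if n == 0 {
                    ret = false;
                    code = DONE;
                } else {
                    n -= 1;
                    code = IS_EVEN;
                }
            }
            _ => unreachable!("unknown function code"),
        }
    }
    ret
}

// Entry points are redirected to the combined version.
pub fn is_even(n: u64) -> bool {
    run(IS_EVEN, n)
}

pub fn is_odd(n: u64) -> bool {
    run(IS_ODD, n)
}
```

Here both functions happen to share one argument slot; in general each function's variables would need their own slots, which is where the larger combined frame mentioned above comes from.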
glaebhoerl
Feb 7, 2017
Contributor
@camlorn Does that work for function pointers? I don't immediately see any reason it wouldn't, just the "Build a list of the possible tail calls that may be reached from each of these functions." snippet which sticks out otherwise, because in the interesting cases it's presumably "all of them"?
(Also, this feels very like defunctionalization? Is it?)
camlorn
Feb 7, 2017
@glaebhoerl
Can you become a function pointer? This was not how I read the RFC, though it would make sense if this were the case. Nonetheless, you are correct: if function pointers are allowed, this probably does indeed break my scheme. It might be possible to get around it, somehow.
I don't know what defunctionalization is. Is this defunctionalization? I'll get back to you once I learn a new word.
ranma42
Feb 7, 2017
Contributor
Is there any benchmarking data regarding the callee-pops calling convention?
AFAICT Windows uses such a calling convention stdcall for most APIs.
I have repeatedly looked for benchmarks comparing stdcall to cdecl, but I have only found minor differences (in either direction, possibly related to the interaction with optimisations) and I was unable to find something providing a conclusive answer on which one results in better performance.
camlorn
Feb 7, 2017
@ranma42
I'm not sure why there would be a difference: either you do your jmp for return and then pop or you pop and then do your jmp for return, but in either case someone is popping the same amount of stuff?
Also, why does it matter here?
camlorn
Feb 7, 2017
@glaebhoerl
Apparently today is idea day:
Instead of making the outer loop be inside a function that declares all the needed variables, make the outer loop something that expects a struct morally equivalent to the tuple (int code, void* ptr, void* args), then have it cast ptr to the appropriate function pointer type by switching on code, cast args to a function-pointer-specific argument structure, then call the function pointer. It should be possible to get the args struct to be inline as opposed to an additional level of indirection somehow, but I'm not sure how to do it without violating strict aliasing. This has the advantage of making the stack frame roughly the same size as what it would be in the LLVM backend, but the disadvantage of being slower (but maybe we can sometimes use the faster while-loop with switch statement approach).
I don't think this is defunctionalization, based off a quick google of that term.
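A rough sketch of the outer-loop idea above, in stable Rust instead of C: an enum plays the role of the (code, args) pair, so no pointer casts are needed, and each function returns a description of the next call instead of making it. The names Next, even, odd, dispatch and collatz_steps are hypothetical and chosen only for illustration.

```rust
/// What a function asks the driver to do next. The variant plays the role
/// of `code`; its fields play the role of the per-function argument struct.
enum Next {
    Done(u32),
    Even { n: u64, steps: u32 },
    Odd { n: u64, steps: u32 },
}

/// Picks the next function to run. This only builds a description of the
/// call; it does not perform it.
fn dispatch(n: u64, steps: u32) -> Next {
    if n % 2 == 0 {
        Next::Even { n, steps }
    } else {
        Next::Odd { n, steps }
    }
}

fn even(n: u64, steps: u32) -> Next {
    // Stands in for a tail call on n / 2.
    dispatch(n / 2, steps + 1)
}

fn odd(n: u64, steps: u32) -> Next {
    if n == 1 {
        Next::Done(steps)
    } else {
        // Stands in for a tail call on 3n + 1.
        dispatch(3 * n + 1, steps + 1)
    }
}

/// The outer loop: switch on the tag, unpack the arguments, call the function.
pub fn collatz_steps(start: u64) -> u32 {
    assert!(start >= 1, "the sequence is only defined for positive integers");
    let mut next = dispatch(start, 0);
    loop {
        next = match next {
            Next::Done(steps) => return steps,
            Next::Even { n, steps } => even(n, steps),
            Next::Odd { n, steps } => odd(n, steps),
        };
    }
}
```

Each function keeps its own small frame, which exists only while that function runs, at the cost of an extra return and dispatch per tail call.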
ranma42
Feb 7, 2017
Contributor
@camlorn That is my opinion, too, but it is mentioned as "one major drawback of proper tail calls" in the current RFC
0000-template.md
Later phases in the compiler assert that these requirements are met.
New nodes are added in HIR and HAIR to correspond to `become`. In MIR, however,
a new flag is added to the `TerminatorKind::Call` varient. This flag is only
DemiMarie
Feb 7, 2017
@camlorn @ranma42 The drawback of a callee-pops calling convention is that for caller-pops calling conventions, much of the stack pointer motion can be eliminated by the optimizer, since it is all in one function. However, with a callee-pops calling convention, you might be able to do the same thing in the callee – but I don't think you gain anything except on Windows, due to the red zone which Windows doesn't have.
I really don't know what I am talking about on the performance front though. Easy way to find out would be to patch the LLVM bindings that Rust uses to always enable tail calls at the LLVM level, then build the compiler, and finally see if the modified compiler is faster or slower than the original.
DemiMarie
Feb 7, 2017
@camlorn My intent was that one can become any function or method that uses the Rust ABI or the rust-call ABI (both of which lower to LLVM fastcc), provided that the return types match. Haven't thought about function pointers, but I believe that tail calls on trait object methods are an equivalent problem.
camlorn
Feb 7, 2017
@DemiMarie
Good point. They are.
I think my latest idea works out, but I'm not quite sure where you put the supporting structs without heap allocation. I do agree that not being able to do it in all backends might sadly be a deal breaker.
Is there a reason that Rustc doesn't already always enable tail calls in release mode?
0000-template.md
[implementation]: #implementation
A current, mostly-functioning implementation can be found at
[DemiMarie/rust/tree/explicit-tailcalls](/DemiMarie/rust/tree/explicit-tailcalls).
cramertj
Feb 8, 2017
Member
Is there any particular reason this RFC specifies that become should be implemented at an LLVM level rather than through some sort of MIR transformation? I don't know how they work, but it seems like maybe StorageLive and StorageDead could be used to mark the callee's stack as expired prior to the function call.
archshift
suggested changes
Feb 8, 2017
Just a note:
You shouldn't be changing the template file, but rather copying the template to a new file (0000-proper-tail-calls.md) and changing that!
archshift
Feb 8, 2017
Contributor
I wonder if one can simulate the behavior of computed goto dispatch using these tail calls. That would be pretty neat indeed!
DemiMarie
Feb 8, 2017
@archshift There is a better way to do that (get rustc to emit the appropriate LLVM IR for a loop wrapped around a match when told to do so, perhaps by an attribute).
DemiMarie
added some commits
Feb 8, 2017
DemiMarie
Feb 8, 2017
@archshift done.
Stebalien
Feb 8, 2017
Contributor
As a non-FP/non-PL person, it would be really nice to see some concrete examples of where become is nicer than a simple while loop. Personally, I only ever use recursion when I want a stack.
ranma42
Feb 8, 2017
Contributor
@Stebalien a case where they are typically nicer than a loop is when they are used to encode (the states of a) state machine. That is because instead of explicitly looping and changing the state, it is sufficient to call the appropriate function (i.e. the state is implicitly encoded by the function being run at that time). Note that this often makes it easier for the compiler to detect optimisation opportunities, as in some cases a state can trivially be inlined.
Stebalien
Feb 8, 2017
Contributor
@ranma42 I see. Usually, I'd just put the state in an enum and use a while + match loop but I can see how become with a bunch of individual functions could be cleaner. Thanks!
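For readers who have not seen it, the enum-plus-loop style described here might look like the following minimal sketch. The State enum and count_words function are hypothetical examples, not code from the RFC. With become, each state would instead be its own function that tail-calls the function for the next state.

```rust
/// The explicit state that the function-per-state style would encode
/// implicitly in "which function is currently running".
#[derive(Clone, Copy)]
enum State {
    Between, // between words (or at the start of the input)
    InWord,  // inside a word
}

/// Counts whitespace-separated words with an explicit state machine.
pub fn count_words(text: &str) -> usize {
    let mut state = State::Between;
    let mut words = 0;
    let mut chars = text.chars();
    while let Some(c) = chars.next() {
        // One transition per character: match on (state, input class).
        state = match (state, c.is_whitespace()) {
            (State::Between, false) => {
                words += 1; // a new word starts here
                State::InWord
            }
            (State::Between, true) => State::Between,
            (State::InWord, true) => State::Between,
            (State::InWord, false) => State::InWord,
        };
    }
    words
}
```

For example, count_words("proper tail calls") walks through both states and counts three words.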
sgrif
Feb 8, 2017
Contributor
Should this RFC include at least one example of what this syntax looks like in use? (e.g. an entire function body)
arthurprs
Feb 8, 2017
A good example snippet would go a long way.
DemiMarie
Feb 8, 2017
Pinging @thepowersgang because they are the only person working on an alternative Rust compiler to the best of my knowledge, and because since their compiler (mrustc) compiles via C they would need to implement one of the above solutions.
mrhota
Feb 9, 2017
Isn't it clearer and more correct to call this explicit tail call optimization/elimination? A "proper" or "explicit" tail call is nothing more than a tail call, perhaps with explicit annotation.
But what the RFC discusses is optimizing explicitly annotated tail calls, right?
My confusion compounds: in the Portability section, we learn that LLVM does not support "proper tail calls" for MIPS and WebAssembly. Does that mean LLVM will not accept a call as the final instruction before a ret on those platforms? Or does that mean that it will not optimize the call in the way described above?
mrhota
Feb 9, 2017
Do we want to support optimizing explicit mutual tail recursion? If so, can we see an example in the RFC using become?
DemiMarie
Feb 9, 2017
@mrhota LLVM will fail to turn a call into a jump for MIPS and WebAssembly, even if the call has the tail prefix and is followed immediately by a ret.
aturon added the T-lang label
Feb 9, 2017
aturon assigned solson and eddyb
Feb 9, 2017
mrhota
Feb 10, 2017
@DemiMarie I see. My nit was with the choice of terminology. call ...; ret (or tail call ...; ret) is the very definition of a ("proper") tail call, no matter what LLVM or some other compiler does (or doesn't do) with it. Optimizing it to a jump is called tail call optimization/elimination.
andrestesti
Feb 10, 2017
A #[tailrec] attribute decorating the function doesn't require a reserved word, and is easier to read and declare than an intrusive statement like become. It is also declarative: you don't need to modify your executable code to check an optimization. If the compiler couldn't optimize a #[tailrec] function into a loop, it would raise an error (and maybe a suggestion). You will never get a non-optimizable function, since it won't compile. The tailrec annotation works fine in Scala; I think Rust should follow the same approach.
Another quirk of the become keyword is that it is not symmetric with omitting the return keyword. It clashes with expression-oriented, functional code style, even though recursion is itself a very functional construct.
fn foo(x: i32, accu: i32) -> i32 {
    if x < 0 {
        // return omission, functional-like style
        foo(-x, 1) + accu
    } else {
        // asymmetry, imperative-like style
        become foo(x - 1, x * accu);
    }
}
thepowersgang
Aug 31, 2017
Contributor
musttail appears to have a very restricted set of valid uses - mainly that the caller and callee must have (almost) the same signature. I assume that outside of that set it can't be defined for all platforms (and will probably error in IR validation?)
- The caller and callee prototypes must match. Pointer types of parameters or return types may differ in pointee type, but not in address space.
jhjourdan
Sep 1, 2017
Right, sorry, I should have read the whole paragraph following musttail in the docs. It seems like the purpose of this attribute is to force tail call optimization even if not using the fastcc calling convention.
But still, the tail attribute does not have this restriction and has the guarantee of succeeding under some not-so-terrible restrictions.
DemiMarie
Sep 2, 2017
And that is the problem.
Personally, I am not sure that this RFC should be accepted unless we can get LLVM to support tail calls on all platforms, or are willing to use hand-written assembler trampolines to solve the problem.
On Aug 31, 2017 8:38 AM, "Jiří Zárevúcky" wrote:
@jhjourdan
Is there really no way to force LLVM to compile a tail call as a tail call?
Short answer: There isn't.
https://llvm.org/docs/CodeGenerator.html#tail-call-section
Long answer: It's only supported on some platforms, which means guaranteeing this behavior in Rust would inconveniently restrict the number of platforms we can support (unless we patch LLVM for it).
thepowersgang
Sep 2, 2017
Contributor
There is the option of accepting this RFC in a weaker form - where become is an optimisation that moves all destructor calls to before the returning function call (opening up the way for LLVM/other to make it a tail call).
jhjourdan
Sep 2, 2017
I don't think that this is a better solution compared to supporting become only on platforms that do support TCO. Indeed, what you are proposing essentially corresponds to delaying a failure from compile time (i.e., become is rejected) to runtime (i.e., stack overflow), which is usually not Rust's philosophy.
rkruppe
Sep 2, 2017
Contributor
Accepting only the become keyword and its largest semantic impact (early drops) right now does not mean we can't later guarantee tail calls, either everywhere or in certain cases (platforms, restrictions on calling conventions, only for known callees, etc.). In the latter case, there could be an opt-in warning/error for code that doesn't fall in those cases, so that programmers can make sure that code they write gets TCO.
And in the meantime, or even afterwards in cases where we don't guarantee TCO, become could still be useful to sometimes allow a slight optimization.
I do think, though, that an accepted RFC along those lines should come with a commitment that some sort of guaranteed TCO is coming (and ideally, an idea of when it will be possible). Reserving a keyword and messing with drop order doesn't seem worthwhile if it would just occasionally, maybe, shave a few instructions off a call.
To be quite honest, after the implementation difficulties described in this thread, I am personally unsure if there realistically can be a satisfactory implementation of guaranteed TCO. At the least, I would consider anything that places strict requirements on function signatures, or introduces costs such as trampolines, to be deeply unsatisfactory. So I'm not really arguing for anything here, I'm just saying provisional acceptance without a settled implementation could potentially be useful.
le-jzr
Sep 3, 2017
I don't see a reason to add a keyword for something you can already do with an extra pair of curly braces. Making sure that rustc generates tail-optimizable code wherever possible is a separate issue that can be solved without adding keywords (maybe an attribute would work better, like #[tail], used the same way as the existing #[inline]).
RalfJung
Sep 3, 2017
Member
I don't see a reason to add a keyword for something you can already do with an extra pair of curly braces.
Can it, though? You also need to manually drop all your arguments that are not forwarded.
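The difference can be observed in stable Rust today. The sketch below is a hypothetical illustration (the Noisy type, the Log alias and all function names are made up for it): an ordinary call in tail position keeps locals and arguments alive until the callee returns, while the hand-written version needs both the extra braces for locals and an explicit drop for arguments that are not forwarded.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Records its name in the shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn callee(log: &Log) -> u32 {
    log.borrow_mut().push("callee runs");
    42
}

/// Ordinary call in tail position: `_local` and `_arg` are still alive while
/// `callee` runs, so the caller's frame cannot be released beforehand.
fn plain(_arg: Noisy, log: &Log) -> u32 {
    let _local = Noisy { name: "local", log: log.clone() };
    callee(log)
}

/// Hand-written early drops: everything not forwarded is gone before the call.
fn early(arg: Noisy, log: &Log) -> u32 {
    {
        let _local = Noisy { name: "local", log: log.clone() };
        // ... work with `_local` ...
    } // the extra pair of braces ends `_local` here
    drop(arg); // arguments need an explicit drop
    callee(log)
}

/// Runs both variants and returns the order of events each one produced.
pub fn drop_orders() -> (Vec<&'static str>, Vec<&'static str>) {
    let run = |f: fn(Noisy, &Log) -> u32| {
        let log: Log = Rc::new(RefCell::new(Vec::new()));
        f(Noisy { name: "arg", log: log.clone() }, &log);
        let events = log.borrow().clone();
        events
    };
    (run(plain), run(early))
}
```

In the plain version the callee runs first and the drops follow; in the early version both drops happen before the callee runs, which is the ordering a guaranteed tail call would require.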
le-jzr
Sep 3, 2017
Good point. I forgot about those. Still, a keyword seems a bit much.
RalfJung
Sep 3, 2017
Member
However, an attribute doesn't solve the problem where some returns are tail calls and some are not.
cramertj
Sep 3, 2017
Member
I'm not knowledgeable enough about trans to know the answer: would it be possible to compile a tail-recursive function to "musttail" when possible, and a trampoline otherwise? It sounds complicated, for sure, but is it possible? It'd be nice to offer trampolines as a less performant but stack-destroying fallback from native tail calls.
rkruppe
Sep 3, 2017
Contributor
Detecting whether musttail will be applicable seems feasible. However, caller and callee both need to be clued in on the trampoline (you can't call a thunk-returning function expecting it to return its return value, or vice versa), so this strategy would still have some unfortunate limitations. For example, it couldn't in general support tail calls to function pointers or trait object methods (unless we eat the huge cost of generating a trampoline-enabled variant of every function that has its address taken).
Besides these technical problems, I am also philosophically unhappy with paying the cost of trampolines at all, especially if it happens silently and as commonly as it would with the severe restrictions on musttail. While some people may only care about not overflowing the stack, in many cases (e.g., for state machines) the tail call must be cheaper than some alternative implementation strategy (in the state machine case, loop + match) to be really useful.
jhjourdan
Sep 4, 2017
@rkruppe The point is that tail has fewer restrictions than musttail, but still, when the llvm compiler is given the right options on the right architectures (and when using the right calling convention), the tail calls are guaranteed.
le-jzr
Sep 4, 2017
(unless we eat the huge cost of generating a trampoline-enabled variant of every function that has its address taken)
A combination of an attribute and a keyword/intrinsic could work. So you'd use become or analogous to make the tail call, but the caller/callee/both would additionally have to be annotated with #[tail], which would generate the necessary sauce on platforms where LLVM doesn't support tail calls natively. Naturally, the compiler would select the most efficient strategy on a given platform.
I don't think the overhead of trampolines is a dealbreaker here. It's better than just failing to compile on some platforms, and vastly better than exhausting stack at runtime.
rkruppe
Sep 4, 2017
Contributor
@rkruppe The point is that tail has fewer restrictions than musttail, but still, when the llvm compiler is given the right options on the right architectures (and when using the right calling convention), the tail calls are guaranteed.
I'm not sure which point you're referring to, are you talking about @cramertj's proposal?
A combination of an attribute and a keyword/intrinsic could work. So you'd use become or analogous to make the tail call, but the caller/callee/both would additionally have to be annotated with #[tail], which would generate the necessary sauce on platforms where LLVM doesn't support tail calls natively. Naturally, the compiler would select the most efficient strategy on a given platform.
This still wouldn't work with unknown callees (as you don't know if the callee has been annotated with that attribute), which is what I was talking about in the part you quote. If the callee is known, you can already generate the trampoline variant lazily (i.e., when there's actually a become call to that callee).
I don't think the overhead of trampolines is a dealbreaker here. It's better than just failing to compile on some platforms, and vastly better than exhausting stack at runtime.
It may not be for your use cases, but as I said, for other use cases -- such as efficient state machines -- trampolines are unsuitable. That is not to say exhausting the stack or failing to compile would be better, but there are other alternatives that use constant stack space, work on all platforms (with consistent performance, unlike sometimes-automagically generated trampolines), and are likely faster than trampolines.
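The comment does not say which alternatives are meant. One common candidate, assumed here purely for illustration, is an explicit loop over an enum of states: each iteration stands in for one tail call, so stack use stays constant on every platform. The `State` enum and `count_words` function below are hypothetical names, not part of the RFC.

```rust
/// Hypothetical states of a tiny machine that counts words in a byte string.
#[derive(Clone, Copy)]
enum State {
    Between, // currently between words
    InWord,  // currently inside a word
}

/// Each loop iteration plays the role of one tail call: the next "callee"
/// is chosen by assigning to `state`, so no stack frames accumulate.
fn count_words(input: &[u8]) -> usize {
    let mut state = State::Between;
    let mut words = 0;
    for &byte in input {
        state = match (state, byte.is_ascii_whitespace()) {
            // A non-space byte after a gap starts a new word.
            (State::Between, false) => {
                words += 1;
                State::InWord
            }
            (State::Between, true) => State::Between,
            (State::InWord, true) => State::Between,
            (State::InWord, false) => State::InWord,
        };
    }
    words
}
```

The cost of this style is that all states have to live in one function (or one dispatch loop), which is the ergonomic problem proper tail calls are meant to remove.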
Furthermore, while trampolines are annoying to write by hand, they don't require nearly as much integration with the compiler (edit: ... as proper tail calls) to generate automatically. If you're okay with trampolines, and would be okay with modifying functions that would return thunks (e.g., adding an attribute), you can already generate working trampolines with some macros and slightly uglier syntax. So I am not convinced that we need the become keyword and its semantic implications if it would only satisfy the people who are okay with trampolines.
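For readers unfamiliar with the technique, here is a minimal hand-written trampoline in today's Rust. It is a sketch under assumptions, not what the RFC or any macro would generate: `Step`, `run`, `is_even` and `is_odd` are hypothetical names, and boxing every thunk is only one of several possible encodings.

```rust
/// What a trampolined function returns: either a final value or a thunk
/// describing the next call to make.
enum Step<T> {
    Done(T),
    Call(Box<dyn FnOnce() -> Step<T>>),
}

/// The trampoline itself: runs thunks in a loop, so the stack never grows
/// no matter how long the chain of "tail calls" is.
fn run<T>(mut step: Step<T>) -> T {
    loop {
        match step {
            Step::Done(value) => return value,
            Step::Call(thunk) => step = thunk(),
        }
    }
}

/// Mutually recursive functions rewritten to return thunks instead of
/// calling each other directly.
fn is_even(n: u64) -> Step<bool> {
    if n == 0 {
        Step::Done(true)
    } else {
        Step::Call(Box::new(move || is_odd(n - 1)))
    }
}

fn is_odd(n: u64) -> Step<bool> {
    if n == 0 {
        Step::Done(false)
    } else {
        Step::Call(Box::new(move || is_even(n - 1)))
    }
}
```

Calling `run(is_even(100_000))` completes in constant stack space. In this encoding every step pays for a heap allocation and an indirect call, which is the kind of overhead the state-machine objection above is about.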
jhjourdan
Sep 4, 2017
I'm not sure which point you're referring to, are you talking about @cramertj's proposal?
No. I am just saying that musttail is not the only way to get guaranteed tail calls in LLVM. More precisely, I am referring to the following paragraph in LLVM docs:
Tail call optimization for calls marked tail is guaranteed to occur if the following conditions are met:
Caller and callee both have the calling convention fastcc.
The call is in tail position (ret immediately follows call and ret uses value of call or is void).
Option -tailcallopt is enabled, or llvm::GuaranteedTailCallOpt is true.
Platform-specific constraints are met.
rkruppe
Sep 4, 2017
Contributor
Okay. I am aware of that, but have (lazily) talked about musttail because @cramertj did. I don't believe the additional cases where TCO is guaranteed by tail significantly shift the balance.
scottmcm
Sep 4, 2017
Member
What about an initial version of this that only allows tail recursion, not tail calls? That trivially meets the musttail requirements, and is quite useful when dealing with exclusive borrows, particularly &mut [T].
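A sketch of the kind of function this would cover, written in today's Rust without become. The names `fill_from` and `fill_from_loop` are hypothetical. The recursive form hands the exclusive borrow of the remaining slice straight to the next call, and because caller and callee are the same function their signatures trivially match. The loop form shown for comparison is one way to write the same thing by hand, here moving the borrow out with `std::mem::take`.

```rust
/// Self-recursive version: the call in tail position is the one that would
/// be written with `become` if a tail-recursion-only feature existed.
fn fill_from(slice: &mut [u32], next: u32) {
    if let Some((head, rest)) = slice.split_first_mut() {
        *head = next;
        // Tail call to the same function, passing on the `&mut` borrow.
        fill_from(rest, next.wrapping_add(1))
    }
}

/// Hand-written loop with the same behavior. The exclusive borrow is moved
/// out of `slice` each iteration and the remainder is stored back.
fn fill_from_loop(mut slice: &mut [u32], mut next: u32) {
    loop {
        let current = std::mem::take(&mut slice);
        match current.split_first_mut() {
            None => return,
            Some((head, rest)) => {
                *head = next;
                next = next.wrapping_add(1);
                slice = rest;
            }
        }
    }
}
```

Without guaranteed tail calls, the recursive version may use stack proportional to the slice length, so only the loop is safe for large inputs today.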
likeabbas
Jan 24, 2018
I apologize if this is not the place for me to post this, but I was wondering if there have been any updates on this RFC? A basic version that only allows for tail recursion, as described by @scottmcm, would be a huge improvement to the language imo.
isiahmeadows referenced this pull request in fantasyland/fantasy-land on Jan 25, 2018: Fantasy Land proposal process for ECMAScript #204 (Open)
nikomatsakis
Jan 26, 2018
Contributor
I think that if we were to do tail-calls, this RFC is roughly how I would want to do them. For example, I prefer the idea of running destructors and dropping state early, and I like using the become keyword.
However, I do not think it's really an option to change to a "callee pops" style. I think it's a crucial selling point of Rust that it compiles down to code that is basically the same as C -- it's ok for us to diverge slightly in our calling convention, but we have to be very careful there, particularly if it can lead to a performance hit.
That said, I'd like a point of clarification. Could we in some way "contain" the callee-pops convention? For example, imagine that we had to declare functions as tail recursive, and that allowed the function to contain a become or to be the target of a become, and we disallow 'indirect' become for now. Then perhaps the "callee pops" effect could be quarantined to the tail recursive bit of your program?
I am still kind of wary in general, just because this seems like a semi-niche feature that will add complexity and maintenance burden across the board. The portability hazards are significant as well. Then again, JS supposedly has tail recursion now, so maybe people are becoming familiar with the concept (and I know some things are much nicer when you can tail recurse). (One final point is that I am not sure how problematic the borrowing restrictions and so forth would prove to be, though obviously I see why we need some such restrictions.)
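To make the "running destructors and dropping state early" point concrete, the sketch below contrasts the drop order of an ordinary call in tail position with the order this comment attributes to become. It is an emulation in today's Rust using an explicit `drop`, and `Guard`, `callee`, `ordinary` and `early_drop` are hypothetical names introduced for illustration.

```rust
use std::cell::RefCell;

/// A value that records in a shared log when it is dropped.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn callee(log: &RefCell<Vec<&'static str>>) {
    log.borrow_mut().push("callee runs");
}

/// Ordinary call in tail position: `_guard` is dropped only after `callee`
/// returns, so the caller's frame has to stay alive during the call.
fn ordinary(log: &RefCell<Vec<&'static str>>) {
    let _guard = Guard { name: "guard dropped", log };
    callee(log)
}

/// Emulation of the early-drop order: locals are destroyed before control
/// transfers, so nothing in the caller's frame is needed afterwards.
fn early_drop(log: &RefCell<Vec<&'static str>>) {
    let guard = Guard { name: "guard dropped", log };
    drop(guard);
    callee(log)
}
```

The second ordering is what lets the caller's frame be reused for the callee. It is also why borrowing restrictions come up: the callee cannot be handed a reference to a local that has already been destroyed.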
aturon
Feb 7, 2018
Member
Given our goals for 2018, I don't think work in this area is on the docket near-term. Thus, I move to postpone:
@rfcbot fcp postpone
rfcbot
Feb 7, 2018
Team member @aturon has proposed to postpone this. The next step is review by the rest of the tagged teams:
No concerns currently listed.
Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
rfcbot added the proposed-final-comment-period label Feb 7, 2018
rfcbot
commented
Feb 14, 2018
rfcbot added final-comment-period and removed proposed-final-comment-period labels Feb 14, 2018
likeabbas
Feb 23, 2018
A few defining characteristics stick out when I think of a language: how to define a variable, how to write an if statement, and how to iterate through an array. In fact, Rust is introduced by iterating through a list of greetings. That introduction has a lasting impact on what people believe is the correct way to write Rust code.
With regard to the 2018 goal
Ship an epoch release: Rust 2018
I believe adding tail calls would be one of the most defining characteristics of Rust. It is a characteristic that separates Rust even further from C and C++, and gives it a cleaner feeling that I associate with functional programming. If one of the major goals of 2018 is giving Rust its defining characteristics, then I believe this is an RFC that cannot be ignored.
Pauan
Feb 24, 2018
Member
@nikomatsakis Then again, JS supposedly has tail recursion now, so maybe people are becoming familiar with the concept.
Technically the ES6 spec mandates tail-calls, but the situation in reality is more complicated than that.
The only browser that actually supports tail calls is Safari (and WebKit). And the Edge team has said that it's unlikely that they will implement tail calls (for similar reasons as Rust: they currently use the Windows ABI calling convention, which doesn't work well with tail calls).
Therefore, tail calls in JS are still very controversial, even to this day:
Microsoft/ChakraCore#796
kangax/compat-table#819
https://github.com/tc39/proposal-ptc-syntax
tc39/proposal-ptc-syntax#22
https://v8project.blogspot.com/2016/04/es6-es7-and-beyond.html
https://www.chromestatus.com/features/5516876633341952
https://bugs.chromium.org/p/v8/issues/detail?id=4698#c75
https://github.com/rwaldron/tc39-notes/blob/master/es7/2016-05/may-24.md#syntactic-tail-calls-bt
So for now you cannot rely upon tail calls in JS, and given the controversy you might never be able to rely upon them.
Personally I love tail calls, but I can accept the technical reasons for not implementing them.
P.S. Just to be clear, the Edge team is against implicit tail-calls for all functions, but they're in favor of tail-calls-with-an-explicit-keyword (similar to this RFC).
rfcbot
commented
Feb 24, 2018
The final comment period is now complete.
Centril added the postponed label Feb 24, 2018
Closing since FCP with a motion to postpone is now complete. |
Centril closed this Feb 24, 2018
"Postponed" issue - #271. |
petrochenkov removed the postponed label Feb 24, 2018
Centril added the postponed label May 15, 2018
bbarker
Jun 23, 2018
Possibly of interest: https://twitter.com/edwinbrady/status/1009929654289956869?s=19
ehaliewicz
Jul 30, 2018
@bbarker yep, that's a similar solution to what WebKit does. It's a classic trick.
And also used by Chicken Scheme.
DemiMarie commented Feb 7, 2017 (edited Feb 8, 2017)
Rendered