Clang should provide annotations to force or verify tail call optimization #46765

maximecb · 2020-09-04T17:31:29Z


Bugzilla Link	47421
Resolution	FIXED
Resolved on	Apr 17, 2021 10:35
Version	unspecified
OS	All
CC	@davidbolvansky,@dwblaikie,@DougGregor,@efriedma-quic,@zygoloid
Fixed by commit(s)	rG8344675908424ee532d4ae30e5043c5a5834e02c

Extended Description

It's possible to use tail calls to implement bytecode interpreters in C/C++ in a way that is much cleaner than with computed GOTOs, and potentially more efficient as well. This is because, with computed GOTOs, you end up with a large function body with very many labeled code blocks, such that every code block can potentially jump to every other. This makes it hard for any compiler to do register allocation effectively.

With tail calls, you can define a tail-recursive function for every opcode in your interpreter such that every instruction simply calls the next one. Tail calls also potentially give us a clean ABI (function call boundaries) where an interpreter could efficiently dispatch to code generated by a JIT compiler. However, this is only possible if one can ensure that tail recursion is in fact optimized by the C/C++ compiler.

Clang currently does a fairly good job at optimizing tail calls, however, it's not something developers can rely on. It happens behind the scenes, and to know whether tail call optimization was performed or not, one has to carefully study the disassembly. In our case, we have tried to convert the Ruby interpreter to a tail recursive implementation, and found that tail call optimization was performed for some, but not all Ruby opcodes. The Ruby codebase is very large, which makes it difficult to tell which opcodes did not get optimized and why.

The usefulness of tail call optimization is clearly recognized in other programming languages, but the fact that we currently can't rely on them severely limits their usefulness. It would be very desirable to have some kind of annotation to force the compiler to make specific function calls into tail calls (eg: must_tail), or to force every control flow path leaving a given function to become tail calls, or to at least produce a compilation error if this cannot be done, so that the outcome is visible to the programmer.

I believe it's in the spirit of C to allow "unsafe" operations to be performed if the programmer specifies this is what must happen, and so forcing tail calls seems acceptable to me, so long as function call signatures are compatible, but if that isn't possible, at least forcing a compiler error message (or even just a warning) to be produced would already be a useful step forward.

davidbolvansky · 2020-09-04T17:51:42Z

Maybe there are some optimization remarks in -Rpass=.* / -Rpass-missed=.* related to tail call optimization (so you can see why tco failed for some functions)

maximecb · 2020-09-04T20:18:40Z

Maybe there are some optimization remarks in -Rpass=.* / -Rpass-missed=.*
related to tail call optimization (so you can see why tco failed for some
functions)

I've tried compiling with:
-foptimize-sibling-calls -Rpass='tailcallelim'

Unfortunately, it doesn't tell you every function that got optimized with tail calls, for some reason, only functions that are self-recursive?

I think this approach isn't great. It would be much more robust if there could be annotations in the code. At the very least to verify that tail call optimization happens for a specific function, but ideally to force tail calls in that function.

efriedma-quic · 2020-09-04T23:14:48Z

The IR has support for "musttail", which unconditionally forces a tail call. We currently don't expose that directly to C code, though. Maybe we could expose it somehow, I guess?

Note that this only works if the types of the caller and callee functions precisely match. Would that be sufficient for your use-case?

maximecb · 2020-09-08T17:01:09Z

The IR has support for "musttail", which unconditionally forces a tail call.
We currently don't expose that directly to C code, though. Maybe we could
expose it somehow, I guess?

Note that this only works if the types of the caller and callee functions
precisely match. Would that be sufficient for your use-case?

Yes, that sounds perfect for the use case of implementing an interpreter using tail calls. Our interpreter, and every interpreter looking to do this will likely implement instructions that all have the same function signature.

Exposing "musttail" to force tail calls to happen would do the trick. Is there a way to annotate individual return call() statements in LLVM, or can we only allocate whole functions? Either way, for our use case (and most interpreters), signalling that every return from a given function must be a tail call would be sufficient.

Can you implement something like this? This could definitely have an impact on the way interpreters are typically implemented in C/C++.

efriedma-quic · 2020-09-09T17:43:55Z

It's per-call on the LLVM side.

Strawman syntax: "void foo(int x) { __attribute((musttail)) return bar(x); }"

I don't think I'll have time to work on this in the near future, but I can answer any questions.

maximecb · 2020-09-10T14:31:06Z

It's per-call on the LLVM side.

Strawman syntax: "void foo(int x) { __attribute((musttail)) return bar(x); }"

I don't think I'll have time to work on this in the near future, but I can
answer any questions.

That syntax would work very well for us.

Do you think the clang maintainers would be likely to accept a pull request for this feature? How difficult do you think this is to implement?

efriedma-quic · 2020-09-11T00:44:59Z

Not hard to implement, relatively speaking. It should be possible to leverage the existing attribute infrastructure without making any big changes to existing code.

In terms of how likely it is to be accepted, see https://clang.llvm.org/get_involved.html . I see two sticking points:

It's sort of niche; unless you're writing a few specific sorts of compilers/interpreters, it's not useful. But usefulness isn't just raw number of users; if we expect that major projects like the Ruby interpreter would take advantage of it, that would be enough to justify it.
There isn't any C/C++ standard proposal. But obviously that could change.

Probably would want to send an email to cfe-dev before getting too deep into it.

davidbolvansky · 2021-04-16T00:28:45Z

https://reviews.llvm.org/D99517

So seems like resolved?

maximecb · 2021-04-16T14:36:15Z

https://reviews.llvm.org/D99517

So seems like resolved?

That looks exactly like what we wanted, awesome. This is merged then? It's official?

davidbolvansky · 2021-04-16T15:48:39Z

Yes, it should be in clang 13.

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021

This issue was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clang should provide annotations to force or verify tail call optimization #46765

Clang should provide annotations to force or verify tail call optimization #46765

maximecb mannequin commented Sep 4, 2020

davidbolvansky commented Sep 4, 2020

maximecb mannequin commented Sep 4, 2020

efriedma-quic commented Sep 4, 2020

maximecb mannequin commented Sep 8, 2020

efriedma-quic commented Sep 9, 2020

maximecb mannequin commented Sep 10, 2020

efriedma-quic commented Sep 11, 2020

davidbolvansky commented Apr 16, 2021

maximecb mannequin commented Apr 16, 2021

davidbolvansky commented Apr 16, 2021

Clang should provide annotations to force or verify tail call optimization #46765

Clang should provide annotations to force or verify tail call optimization #46765

Comments

maximecb mannequin commented Sep 4, 2020

Extended Description

davidbolvansky commented Sep 4, 2020

maximecb mannequin commented Sep 4, 2020

efriedma-quic commented Sep 4, 2020

maximecb mannequin commented Sep 8, 2020

efriedma-quic commented Sep 9, 2020

maximecb mannequin commented Sep 10, 2020

efriedma-quic commented Sep 11, 2020

davidbolvansky commented Apr 16, 2021

maximecb mannequin commented Apr 16, 2021

davidbolvansky commented Apr 16, 2021