
MethodImplOptions.Flatten? #89789

Open · risc26z opened this issue Aug 1, 2023 · 8 comments
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)

Comments

@risc26z commented Aug 1, 2023

I wonder if it would be possible to add 'Flatten' to the MethodImplOptions enum? A similar attribute exists in several C/C++ compilers (e.g. __attribute__((flatten)) in GCC and Clang). When method A is marked as 'flattened', it hints to the compiler that it should inline the methods called by A, whether or not those methods are themselves marked as inline.

This is often very efficient, as it means that the performance-critical paths can be inlined very aggressively while less critical parts of the program needn't have their calls inlined at all. At present, methods marked as AggressiveInlining are theoretically inlined everywhere they're used, whether or not it matters.

Ideally, there would also be an AggressiveFlatten, which would inline recursively, with much greedier inlining than AggressiveInlining would ordinarily produce. In effect, this would say to the JIT, "this is going to be a much bigger method than you usually want to generate, but trust me on this." A use case I have in mind is any kind of interpreter of the 'fetch & big switch' type, for which it's very difficult to generate efficient code at present.
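A minimal sketch of what the proposed hint might look like in source, contrasted with today's AggressiveInlining. MethodImplOptions.Flatten does not exist, so that attribute is left commented out, and the Interpreter type, opcode set and Add helper are purely illustrative:

```csharp
using System;
using System.Runtime.CompilerServices;

static class Interpreter
{
    // Today: the *callee* opts in, and is then a candidate for inlining at
    // every call site, whether or not that site is performance-critical.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    static int Add(int a, int b) => a + b;

    // Proposed: the *caller* opts in, asking the JIT to inline the calls it
    // makes. MethodImplOptions.Flatten is hypothetical (the subject of this
    // issue), so the attribute is commented out here.
    // [MethodImpl(MethodImplOptions.Flatten)]
    public static int Execute(byte[] code, int[] stack)
    {
        int ip = 0, sp = 0;
        while (true)
        {
            switch (code[ip++])   // 'fetch & big switch' dispatch
            {
                case 0:  // push next byte as a constant
                    stack[sp++] = code[ip++];
                    break;
                case 1:  // add the top two stack values
                    stack[sp - 2] = Add(stack[sp - 2], stack[sp - 1]);
                    sp--;
                    break;
                case 2:  // return the top of the stack
                    return stack[--sp];
                default:
                    throw new InvalidOperationException($"Unknown opcode {code[ip - 1]}");
            }
        }
    }
}
```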

@ghost added the untriaged label (new issue has not been triaged by the area owner) on Aug 1, 2023
@dotnet-issue-labeler bot added the needs-area-label label on Aug 1, 2023
@danmoseley added the area-CodeGen-coreclr label and removed the needs-area-label label on Aug 1, 2023
@ghost commented Aug 1, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


@danmoseley (Member)

Is there a scenario where you'd want to inline only specific callees into a method? Or would you refactor in that case?

@SingleAccretion (Contributor)

risc26z wrote:

A use case I have in mind is any kind of interpreter of the 'fetch & big switch' type, for which it's very difficult to generate efficient code at present.

The Jit has internal limits on optimization, so any strategy that encourages forming very large methods is likely to actually be a deoptimization. One can read the above as the Jit being generally bad at compiling "interpreter loops", and this is true (even in the absence of limits, linear register allocation can struggle in such methods).

API-wise, we are unlikely to use a MethodImpl bit for such a toggle as there are not many free ones left.

@SingleAccretion added this to the Future milestone on Aug 1, 2023
@ghost removed the untriaged label on Aug 1, 2023
@jkotas (Member) commented Aug 1, 2023

We have MethodImplOptions.AggressiveOptimization, which was meant for a very similar purpose.
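For reference, a minimal sketch of how that existing flag is applied (the VirtualMachine type and Run method are illustrative). Per its documentation, the flag causes the method to be jitted with full optimizations up front instead of going through tiered compilation; it does not by itself promise a larger inlining budget:

```csharp
using System.Runtime.CompilerServices;

class VirtualMachine   // illustrative type, not part of any real API
{
    // Asks the runtime to always fully optimize this method: it is jitted
    // with optimizations enabled on first call rather than starting in the
    // unoptimized tier of tiered compilation.
    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    public int Run(byte[] code)
    {
        // ... dispatch loop would go here ...
        return 0;
    }
}
```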

@AndyAyersMS (Member)

Also note that PGO can accomplish more or less the same thing.

The inliner is quite a bit more aggressive with PGO and has a pretty good idea which callees are important.
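For context, dynamic (tiered) PGO is opt-in on .NET 7 and enabled by default starting with .NET 8. Assuming an SDK-style project, one way to turn it on explicitly is the TieredPGO project property:

```xml
<!-- .csproj: let the JIT collect profile data in tier 0 and use it when
     re-jitting hot methods at tier 1 (default on .NET 8 and later). -->
<PropertyGroup>
  <TieredPGO>true</TieredPGO>
</PropertyGroup>
```

The same switch can also be set at run time with the DOTNET_TieredPGO=1 environment variable.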

@huoyaoyuan (Member)

Also worth pointing out that some large and rarely used callees are intentionally not inlined.

@risc26z (Author) commented Aug 2, 2023

SingleAccretion wrote:

The Jit has internal limits on optimization, so any strategies which encourage very large method formation are likely to actually be deoptimizations.

Most of the time, it looks as though those limits/heuristics work really well, as is evidenced by the performance of the system. But there are a few situations where it'd be really helpful to be able to say: please boost those limits here. And there are situations such as AOT where there's plenty of time to do really expensive inlining.

SingleAccretion wrote:

One can read the above as the Jit being generally bad at compiling "interpreter loops", and this is true (even in the absence of limits, linear register allocation can struggle in such methods).

I've done some limited tests of inlining by copy and paste. It's a maintenance headache, obviously, but performance is generally acceptable. It's not as good as a C++ compiler would produce, but it wouldn't be reasonable to expect that from a JIT.

.NET is a wonderful platform for implementing virtual machines (in the loose sense: anywhere there's an 'analyse and plan, then execute multiple times' pattern). It has almost everything you might want: good performance, a rich library, reflection for diagnostics, and the ability, where appropriate, to generate code at runtime that's fully integrated with the system. But the performance of interpreter-style loops is the weak point.

To offer some random ideas that may or may not be practical: if it were possible to AOT compile part of the program (say, one assembly) in a highly optimised way, while still benefiting from the JIT elsewhere, that might work. (And the 'flatten' tag could then be utilised by the AOT compiler but ignored by the JIT.) Or, could there be a kind of DynamicMethod.CreateDelegateViaLLVM that could perform expensive optimisations? One could then generate at runtime a delegate that invokes the interpreter with lots of optimisations enabled.
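On the last idea: DynamicMethod.CreateDelegateViaLLVM does not exist, but the "generate code at runtime that's fully integrated with the system" part is already available today through expression trees or System.Reflection.Emit. A minimal sketch (the add-one lambda is purely illustrative); note that the resulting delegate is still compiled by the same JIT and so is subject to the same optimization limits discussed above:

```csharp
using System;
using System.Linq.Expressions;

class RuntimeCodegenSketch
{
    static void Main()
    {
        // Build the expression x => x + 1 at run time...
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        Expression<Func<int, int>> addOne =
            Expression.Lambda<Func<int, int>>(
                Expression.Add(x, Expression.Constant(1)), x);

        // ...and compile it to a delegate, which is jitted like any other method.
        Func<int, int> compiled = addOne.Compile();
        Console.WriteLine(compiled(41));   // prints 42
    }
}
```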

@risc26z (Author) commented Aug 2, 2023

jkotas wrote:

We have MethodImplOptions.AggressiveOptimization, which was meant for a very similar purpose.

Would it be possible to improve the documentation? At present, it's rather vague about what sort of optimisations the flag actually enables. It'd be useful to know whether it affects the potential inlining of called methods.
