
MethodImplOptions.Flatten? #89789

Open · risc26z opened this issue Aug 1, 2023 · 8 comments
Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)

Comments

@risc26z commented Aug 1, 2023

I wonder if it would be possible to add 'Flatten' to the MethodImplOptions enum? A similar attribute exists in several C/C++ compilers (e.g. __attribute__((flatten)) in GCC and Clang). When method A is marked as 'flattened', it hints to the compiler that it should inline the methods called by A, whether or not those methods are themselves marked as inline.

This is often very efficient, as it means that the performance-critical paths can be inlined very aggressively while less critical parts of the program needn't have their calls inlined at all. At present, methods marked as AggressiveInlining are theoretically inlined everywhere they're used, whether or not it matters.

Ideally, there would also be an AggressiveFlatten, which would inline recursively, with much greedier inlining than AggressiveInlining would ordinarily produce. In effect, this would say to the JIT, "this is going to be a much bigger method than you usually want to generate, but trust me on this." A use case I have in mind is any kind of interpreter of the 'fetch & big switch' type, for which it's very difficult to generate efficient code at present.
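A minimal sketch of what the proposed hint might look like in source, contrasted with today's AggressiveInlining. MethodImplOptions.Flatten does not exist, so that attribute is left commented out, and the Interpreter type, opcode set and Add helper are purely illustrative:

```csharp
using System;
using System.Runtime.CompilerServices;

static class Interpreter
{
    // Today: the *callee* opts in, and is then a candidate for inlining at
    // every call site, whether or not that site is performance-critical.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    static int Add(int a, int b) => a + b;

    // Proposed: the *caller* opts in, asking the JIT to inline the calls it
    // makes. MethodImplOptions.Flatten is hypothetical (the subject of this
    // issue), so the attribute is commented out here.
    // [MethodImpl(MethodImplOptions.Flatten)]
    public static int Execute(byte[] code, int[] stack)
    {
        int ip = 0, sp = 0;
        while (true)
        {
            switch (code[ip++])   // 'fetch & big switch' dispatch
            {
                case 0:  // push next byte as a constant
                    stack[sp++] = code[ip++];
                    break;
                case 1:  // add the top two stack values
                    stack[sp - 2] = Add(stack[sp - 2], stack[sp - 1]);
                    sp--;
                    break;
                case 2:  // return the top of the stack
                    return stack[--sp];
                default:
                    throw new InvalidOperationException($"Unknown opcode {code[ip - 1]}");
            }
        }
    }
}
```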

@ghost added the untriaged label (new issue has not been triaged by the area owner) on Aug 1, 2023
@dotnet-issue-labeler bot added the needs-area-label label on Aug 1, 2023
@danmoseley added the area-CodeGen-coreclr label and removed the needs-area-label label on Aug 1, 2023
@ghost commented Aug 1, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


@danmoseley (Member)

Is there a scenario where you'd want to inline only specific callees into a method? Or would you refactor in that case?

@SingleAccretion (Contributor)

risc26z wrote:

A use case I have in mind is any kind of interpreter of the 'fetch & big switch' type, for which it's very difficult to generate efficient code at present.

The Jit has internal limits on optimization, so any strategy that encourages forming very large methods is likely to actually be a deoptimization. One can read the above as the Jit being generally bad at compiling "interpreter loops", and this is true (even in the absence of limits, linear register allocation can struggle in such methods).

API-wise, we are unlikely to use a MethodImpl bit for such a toggle as there are not many free ones left.

@SingleAccretion added this to the Future milestone on Aug 1, 2023
@ghost removed the untriaged label on Aug 1, 2023
@jkotas (Member) commented Aug 1, 2023

We have MethodImplOptions.AggressiveOptimization, which was meant for a very similar purpose.
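For reference, a minimal sketch of how that existing flag is applied (the VirtualMachine type and Run method are illustrative). Per its documentation, the flag causes the method to be jitted with full optimizations up front instead of going through tiered compilation; it does not by itself promise a larger inlining budget:

```csharp
using System.Runtime.CompilerServices;

class VirtualMachine   // illustrative type, not part of any real API
{
    // Asks the runtime to always fully optimize this method: it is jitted
    // with optimizations enabled on first call rather than starting in the
    // unoptimized tier of tiered compilation.
    [MethodImpl(MethodImplOptions.AggressiveOptimization)]
    public int Run(byte[] code)
    {
        // ... dispatch loop would go here ...
        return 0;
    }
}
```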

@AndyAyersMS (Member)

Also note that PGO can accomplish more or less the same thing.

The inliner is quite a bit more aggressive with PGO and has a pretty good idea which callees are important.
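For context, dynamic (tiered) PGO is opt-in on .NET 7 and enabled by default starting with .NET 8. Assuming an SDK-style project, one way to turn it on explicitly is the TieredPGO project property:

```xml
<!-- .csproj: let the JIT collect profile data in tier 0 and use it when
     re-jitting hot methods at tier 1 (default on .NET 8 and later). -->
<PropertyGroup>
  <TieredPGO>true</TieredPGO>
</PropertyGroup>
```

The same switch can also be set at run time with the DOTNET_TieredPGO=1 environment variable.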

@huoyaoyuan (Member)

Also worth pointing out that some large and rarely used callees are intentionally not inlined.

@risc26z (Author) commented Aug 2, 2023

SingleAccretion wrote:

The Jit has internal limits on optimization, so any strategies which encourage very large method formation are likely to actually be deoptimizations.

Most of the time, it looks as though those limits/heuristics work really well, as is evidenced by the performance of the system. But there are a few situations where it'd be really helpful to be able to say: please boost those limits here. And there are situations such as AOT where there's plenty of time to do really expensive inlining.

SingleAccretion wrote:

One can read the above as the Jit being generally bad at compiling "interpreter loops", and this is true (even in the absence of limits, linear register allocation can struggle in such methods).

I've done some limited tests of inlining by copy and paste. It's a maintenance headache, obviously, but performance is generally acceptable. It's not as good as a C++ compiler would produce, but it wouldn't be reasonable to expect that from a JIT.

.NET is a wonderful platform for implementing virtual machines (in the loose sense: anywhere there's an 'analyse and plan, then execute multiple times' pattern). It has almost everything you might want: good performance, a rich library, reflection for diagnostics, and the ability, where appropriate, to generate code at runtime that's fully integrated with the system. But the performance of interpreter-style loops is the weak point.

To offer some random ideas that may or may not be practical: if it were possible to AOT compile part of the program (say, one assembly) in a highly optimised way, while still benefiting from the JIT elsewhere, that might work. (And the 'flatten' tag could then be utilised by the AOT compiler but ignored by the JIT.) Or, could there be a kind of DynamicMethod.CreateDelegateViaLLVM that could perform expensive optimisations? One could then generate at runtime a delegate that invokes the interpreter with lots of optimisations enabled.
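On the last idea: DynamicMethod.CreateDelegateViaLLVM does not exist, but the "generate code at runtime that's fully integrated with the system" part is already available today through expression trees or System.Reflection.Emit. A minimal sketch (the add-one lambda is purely illustrative); note that the resulting delegate is still compiled by the same JIT and so is subject to the same optimization limits discussed above:

```csharp
using System;
using System.Linq.Expressions;

class RuntimeCodegenSketch
{
    static void Main()
    {
        // Build the expression x => x + 1 at run time...
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        Expression<Func<int, int>> addOne =
            Expression.Lambda<Func<int, int>>(
                Expression.Add(x, Expression.Constant(1)), x);

        // ...and compile it to a delegate, which is jitted like any other method.
        Func<int, int> compiled = addOne.Compile();
        Console.WriteLine(compiled(41));   // prints 42
    }
}
```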

@risc26z (Author) commented Aug 2, 2023

jkotas wrote:

We have MethodImplOptions.AggressiveOptimization, which was meant for a very similar purpose.

Would it be possible to improve the documentation? At present, it's rather vague about what sort of optimisations the flag actually enables. It'd be useful to know whether it affects the potential inlining of called methods.
