Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: allow some aggressive inlines to go over budget #38163

Merged
merged 2 commits into from
Jun 23, 2020

Conversation

AndyAyersMS
Copy link
Member

If we have a very small root method that calls a large method that is marked with
AggressiveInlining, we may fail to inline because of a budget check.

Allow such inlines to bypass the check.

Closes #38106.

If we have a very small root method that calls a large method that is marked with
AggressiveInlining, we may fail to inline because of a budget check.

Allow such inlines to bypass the check.

Closes dotnet#38106.
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 19, 2020
@AndyAyersMS
Copy link
Member Author

@erozenfeld PTAL cc @dotnet/jit-contrib

Jit diffs shows 3 diffs, but they're spurious Vector<Vector<>> cases:

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for x64 default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: 525 (0.001% of base)
    diff is a regression.
Top file regressions (bytes):
         525 : System.Private.CoreLib.dasm (0.012% of base)
1 total files with Code Size differences (0 improved, 1 regressed), 266 unchanged.
Top method regressions (bytes):
         191 (129.932% of base) : System.Private.CoreLib.dasm - Vector:EqualsAll(Vector`1,Vector`1):bool (6 methods)
         182 (110.976% of base) : System.Private.CoreLib.dasm - Vector`1:op_Inequality(Vector`1,Vector`1):bool (6 methods)
         152 (81.720% of base) : System.Private.CoreLib.dasm - Vector`1:Equals(Vector`1):bool:this (6 methods)
Top method regressions (percentages):
         191 (129.932% of base) : System.Private.CoreLib.dasm - Vector:EqualsAll(Vector`1,Vector`1):bool (6 methods)
         182 (110.976% of base) : System.Private.CoreLib.dasm - Vector`1:op_Inequality(Vector`1,Vector`1):bool (6 methods)
         152 (81.720% of base) : System.Private.CoreLib.dasm - Vector`1:Equals(Vector`1):bool:this (6 methods)
3 total methods with Code Size differences (0 improved, 3 regressed), 243010 unchanged.

@tannergooding
Copy link
Member

Is there any cheap analysis we can do to know that the method is likely cheaper because the generic constraint is where T : struct and it has a sequence of if (typeof(T) == typeof(x)) checks (and therefore most of the IL is dead code)?

@AndyAyersMS
Copy link
Member Author

We might be able to heuristically recognize type-specializing opcode sequences and branching patterns during our initial IL scan/profitability check. At this level we would not know what methods get called and whether or not they are intrinsic. It would be pretty ad-hoc as we can't even do simple dataflow at that level today.

Assuming we could guess right much of the time, we could "discount" the cost of the candidate to be the estimated cost of one of the branch cases plus the cost of any "unconditional" IL, and provisionally accept the inline. I have written heuristics like this in the past and they're tricky to get right without fairly strong models for control flow; we don't have that here either.

Similar perhaps for size-specializing patterns.

When we actually go to import the inlinee, if it then turned out we were wrong and things didn't simplify, we could reject then to keep code size small. But we'd have burned a fair bit of jit time and memory in the process.

So the key would be to get good recognition accuracy, so that jit throughput is not unduly impacted by things that look promising but don't pan out.

An alternative is to try and do this recognition either in the language compiler or in some other precompilation tool and just leave breadcrumbs for the jit.

@tannergooding
Copy link
Member

I'm assuming the JIT time is still a concern for Tier 1 optimizations? Perhaps its something we could leave a breadcrumb on from the initial Tier 0 pass?

(Mostly just brainstorming here since its a fairly common pattern for things in numerics land and will impact new things like Vector2<T>).

In either case, I think this will cover the major scenarios around the issue file, so 👍 from me.

However, I was wondering for a case like the following, Method2 would likely not be inlined as it has a call depth of 1, is that correct?

void Method()
{
     Method1<byte>();
}

void Method1<T>()
{
    if (typeof(T) == typeof(byte))
    {
        Method2<byte>();
    }
    else if (typeof(T) == ...)
    {
    }
}

void Method2<T>()
{
    if (typeof(T) == typeof(byte))
    {
        //  code
    }
    else if (typeof(T) == ...)
    {
    }
}

@AndyAyersMS
Copy link
Member Author

Perhaps its something we could leave a breadcrumb on from the initial Tier 0 pass?

Yeah, not a bad idea. Assuming we start doing intrinsic recognition and opts in Tier0 (which I think is a reasonable idea) then we'd know for sure which methods simplified and roughly by how much. We don't have a general way for Tier0 to feed info to Tier1 currently, but we can work on that in a manner similar to how OSR does things.

for a case like the following, Method2 would likely not be inlined as it has a call depth of 1, is that correct?

I'll look into a case like that -- once we've inlined Method1 the budget expands (see InlineStrategy::NoteOutcome) so there may now be enough room for Method2.

Another option here is to follow the pattern suggested in NoteOutcome and allow either top-level inline, or an inline where all parent contexts are also AggressiveInlining. But without some kind of budgetary constraint it is easy to abuse this all this and create very large methods.

@EgorBo
Copy link
Member

EgorBo commented Jun 20, 2020

Yeah

bool M<T1,T2>() => typeof(T1) == typeof(T2);

^ is 3 calls (Type::GetTypeFromHandle, Type::GetTypeFromHandle and Type::op_Equality) we have to resolve in the inliner to be able to recognize and change the benefit multiplier/decrease the estimated codegen size

Copy link
Member

@erozenfeld erozenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Co-authored-by: bradmarder <bradmarder@users.noreply.github.com>
@AndyAyersMS AndyAyersMS merged commit 0256379 into dotnet:master Jun 23, 2020
@AndyAyersMS AndyAyersMS deleted the InlineBudgetRevision branch June 23, 2020 18:55
@sandreenko
Copy link
Contributor

sandreenko commented Jun 24, 2020

I suspect this PR has caused new failures like (jitstress=1):

Assert failure(PID 2440 [0x00000988], Thread: 9332 [0x2474]): Assertion failed 'm_CallsiteDepth > 0' in '<>c:<GetType>b__1_0(System.String):System.Type:this' during 'Morph - Inlining' (IL size 7)

    File: F:\git\runtime\src\coreclr\src\jit\inlinepolicy.cpp Line: 388
    Image: F:\git\runtime\artifacts\bin\testhost\net5.0-Windows_NT-Release-x64\dotnet.exe

see https://dev.azure.com/dnceng/public/_build/results?buildId=701636&view=ms.vss-test-web.build-test-results-tab&runId=21759702&paneView=debug&resultId=105581 for more examples.

@AndyAyersMS
Copy link
Member Author

Suspect the random policy needs a tweak; I'll take a look.

AndyAyersMS added a commit to AndyAyersMS/runtime that referenced this pull request Jun 25, 2020
In dotnet#38163 I added a depth field to one of the inline policy base classes.
A derived class already had a similar field. Unify in favor of the base class
field.

The inline policies implemented by these derived policy classes are sometimes
used during jit stress (eg for random inlining).

Fixes dotnet#38374.
AndyAyersMS added a commit that referenced this pull request Jun 25, 2020
In #38163 I added a depth field to one of the inline policy base classes.
A derived class already had a similar field. Unify in favor of the base class
field.

The inline policies implemented by these derived policy classes are sometimes
used during jit stress (eg for random inlining).

Fixes #38374.
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT known generic casts not elided
7 participants