JIT: improve types for single def locals and temps #10471

Merged
merged 3 commits into dotnet:master from AndyAyersMS:TrackSingleDefLocals on Mar 29, 2017

6 participants
@AndyAyersMS
Member

commented Mar 24, 2017

Track whether a local has a single definition, and if so, if it has
a reference type, try and update its type from the declared type to
a better type taken from the value being assigned to the local.

Obtain types for some of the 'short-lived' ref type temps that should
have a single definition. Use both the tree and the eval stack as sources
of type information (the latter can be phased out if/when all tree nodes
can return rich type information).

Refactor the code that sets or updates lvClassHnd into utilities
to provide better auditing of type flow and make the set/update process
a bit more rigorous.

Clean up the code that passes argument values a bit by commoning redundant
argument lookup expressions.
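
As a rough illustration of the first paragraph, here is a minimal C++ sketch of single-definition type refinement. It is a simplified model and not the JIT's code: `ClassInfo`, `LocalDesc`, `refineOnStore`, and `canDevirtualize` are hypothetical names, and a string plus a flag stands in for the real class handle and exactness bit.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a class handle plus an "exact type" bit.
struct ClassInfo
{
    std::string name;
    bool        isExact; // true when the runtime type is known precisely
};

// Hypothetical per-local descriptor.
struct LocalDesc
{
    ClassInfo type;         // starts out as the declared (signature) type
    bool      hasSingleDef; // assumed to come from an earlier scan of the IL
};

// Called when a store to the local is seen. If the local has only one
// definition, the stored value's type can replace the declared type.
// Otherwise the declared type is kept, because other stores may supply
// values of different types.
inline bool refineOnStore(LocalDesc& local, const ClassInfo& valueType)
{
    assert(!valueType.name.empty()); // the value's type must be known
    if (!local.hasSingleDef)
    {
        return false;
    }
    local.type = valueType;
    return true;
}

// In this model a virtual call can be bound directly only when the
// receiver's type is exact.
inline bool canDevirtualize(const LocalDesc& receiver)
{
    return receiver.type.isExact;
}
```

With a local declared as a base type and a single store of an exactly typed derived value, `refineOnStore` leaves the local typed as the derived class, which is what allows a later virtual call on it to be bound directly in this model.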

@AndyAyersMS

Member Author

commented Mar 24, 2017

@JosephTremoulet PTAL
cc @dotnet/jit-contrib

Impacts 114 methods in the jit-diff framework set. Overall devirt stats for System.Private.CoreLib:

CallKind   Success    Fail   Total   Succ %
Virtual        798    7173    7971  10.01 %
Interface       98    4207    4305   2.28 %
Total          896   11380   12276   7.30 %

Baseline rates were 9.42%, 2.23%, and 6.89%.
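
The percentages above can be recomputed from the raw counts. The helper below is a hypothetical `successPercent`, written only for this check and not part of any JIT tooling. Against the baseline, the deltas come to about 0.59, 0.05, and 0.41 percentage points for virtual, interface, and total calls respectively.

```cpp
#include <cassert>
#include <cmath>

// Success rate as a percentage: successes / (successes + failures) * 100.
inline double successPercent(unsigned success, unsigned fail)
{
    assert(success + fail > 0); // avoid dividing by zero
    return 100.0 * success / (success + fail);
}
```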

This also lets us devirtualize the obvious things one would hope, eg the call to F() below.

class B 
{
    public virtual int F() { return 3; }
}

class D : B
{
    public override int F() { return 5; }
}

class X
{
    public static int Main(string[] args)
    {
        B b = new D();
        return b.F() + 95;
    }
}
@briansull

Contributor

commented Mar 25, 2017

Can you leverage the existing:
unsigned char lvSingleDef : 1; // variable has a single def

@AndyAyersMS

Member Author

commented Mar 25, 2017

It looks like lvSingleDef means something slightly different -- it asserts that there is just one defining value. For the type propagation's "single definition" we're interested in knowing if there is one point at which the variable value is defined (ignoring any implicit null definition or reaching undefined value); and if so we set the type as the type common to the possible values at the definition point. We probably don't get a type off of a general question op today but we could do so in principle, and we do this for assignments where the value comes from the inline castclass/isinst, which would not be considered lvSingleDef material.

@JosephTremoulet
Contributor

left a comment

Looks good with a few comments

- unsigned char lvArgWrite : 1; // variable is a parameter and STARG was used on it
  unsigned char lvIsTemp : 1; // Short-lifetime compiler temp
+ unsigned char lvArgWrite : 1; // there is at least one STLOC or STARG on this local
+ unsigned char lvMultipleArgWrite : 1; // there is more than one STLOC on this local

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

IMO these names are confusing now, especially the fact that lvMultipleArgWrite doesn't apply to args/starg. Maybe something more like HasStore/HasMultipleStores?

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

There's an asymmetry between args and locals that I wrestled with -- args are implicitly written once, so a single starg or ldarga creates a multiple-def arg, whereas for locals we need to see two of stloc or ldloca to have a multiple def local. I could resolve this by setting lvArgWrite when the arg temp is allocated and update the logic for args and then the fields would have same meaning. Does that seem preferable?

"Store" seems a bit too specific for the name since these also apply to address-of operators. I'd use definition but then the overlap with lvSingleDef would be even more confusing. How about HasSingleWrite and HasMultipleWrites?

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

I'd use definition but then the overlap with lvSingleDef would be even more confusing. How about HasSingleWrite and HasMultipleWrites?

I thought about definition and the same thing occurred to me. I picked "store" because it's the verb used in the IL opcodes, and to my mind the key point is that these properties track which IL opcodes get used and how often, so that you can take advantage of them when importing that single IL operation... I don't think that "write" conveys that (though I'll admit that "store" doesn't do a great job of conveying it). Maybe something along the lines of lvSingleILDef or lvSingleILStoreOp?

I could resolve this by setting lvArgWrite when the arg temp is allocated and update the logic for args and then the fields would have same meaning. Does that seem preferable?

Maybe? On the one hand, I feel OK with what you have (modulo naming). On the other hand, I had to read over the change a few times to figure out that the asymmetry was intentional and due to the implicit arg def at the callsite. I guess whatever fits with the names is preferable to me: if you keep names matching the opcodes, then what you have seems preferable, but if you switch to something more generic like Def/Write, then reworking it to count the implicit arg defs at callsites as defs/writes would make more sense.

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

Bringing in IL seems like a good improvement, so lvSingleILStoreOp and lvMultipleILStoreOp? And then similarly for the local bools that get introduced....

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

so lvSingleILStoreOp and lvMultipleILStoreOp?

SGTM

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

Hmm, lvSingleILStoreOp is sounding more precise than it is, it does not mean exactly one. Maybe lvHasILStoreOp and lvHasMultipleILStoreOp ?

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

Also am going to split the Set/Update method into two methods instead of passing in a bool.

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

Maybe lvHasILStoreOp and lvHasMultipleILStoreOp ?

Sure.
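
For readers following the naming discussion, here is a minimal sketch of how an IL prescan might maintain the two bits and how the arg/local asymmetry falls out. The field names follow those agreed on in this thread; everything else (`ILOp`, `VarFlags`, `noteILOp`, `isSingleDef`) is hypothetical and only models the idea, not the actual importer logic.

```cpp
#include <cassert>

// Hypothetical subset of IL opcodes relevant to this discussion.
enum class ILOp { Stloc, Ldloca, Starg, Ldarga, Other };

// Hypothetical per-variable flags.
struct VarFlags
{
    bool isArg                  = false;
    bool lvHasILStoreOp         = false; // at least one store or address-of seen
    bool lvHasMultipleILStoreOp = false; // more than one seen
};

// Record one opcode that targets this variable during the prescan.
inline void noteILOp(VarFlags& v, ILOp op)
{
    const bool touchesLocal = !v.isArg && (op == ILOp::Stloc || op == ILOp::Ldloca);
    const bool touchesArg   = v.isArg && (op == ILOp::Starg || op == ILOp::Ldarga);
    if (!touchesLocal && !touchesArg)
    {
        return;
    }
    if (v.lvHasILStoreOp)
    {
        v.lvHasMultipleILStoreOp = true;
    }
    v.lvHasILStoreOp = true;
}

// Args are implicitly defined once at the callsite, so any IL store op
// makes them multiple-def. Locals need exactly one store op to be single-def.
inline bool isSingleDef(const VarFlags& v)
{
    assert(!v.lvHasMultipleILStoreOp || v.lvHasILStoreOp); // multiple implies at least one
    if (v.isArg)
    {
        return !v.lvHasILStoreOp;
    }
    return v.lvHasILStoreOp && !v.lvHasMultipleILStoreOp;
}
```

Treating `ldloca`/`ldarga` like stores is deliberate in this sketch: once the address escapes, later writes can no longer be counted by looking at opcodes alone.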

if (inlArgInfo[argNum].argHasTmp)
const InlArgInfo& argInfo = inlArgInfo[argNum];
const bool argIsSingleDef = !argInfo.argHasLdargaOp && !argInfo.argHasStargOp;
GenTree* const argNode = inlArgInfo[argNum].argNode;

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

Shorten to argInfo.argNode?

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

Thanks, not sure how I missed these. Will clean them up.

{
- assert(inlArgInfo[argNum].argNode->OperIsConst() || inlArgInfo[argNum].argNode->gtOper == GT_ADDR);
+ assert(argInfo.argNode->OperIsConst() || argInfo.argNode->gtOper == GT_ADDR);

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

use your new local argNode here?

- (inlArgInfo[argNum].argNode->gtOper != GT_LCL_VAR ||
- (inlArgInfo[argNum].argNode->gtFlags & GTF_GLOB_REF)));
+ noway_assert((argInfo.argIsLclVar == 0) ==
+ (argInfo.argNode->gtOper != GT_LCL_VAR || (argInfo.argNode->gtFlags & GTF_GLOB_REF)));

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

argNode?


- if (inlArgInfo[argNum].argNode->gtOper == GT_OBJ ||
- inlArgInfo[argNum].argNode->gtOper == GT_MKREFANY)
+ if (argInfo.argNode->gtOper == GT_OBJ || argInfo.argNode->gtOper == GT_MKREFANY)

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

argNode? (and again once in each branch of the if/else in the next 10 lines)

// We should have seen a stloc in our IL prescan.
assert(lvaTable[lclNum].lvArgWrite);

const bool isSingleDefLocal =

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

Since this has the differences you mentioned vs lvSingleDef, maybe this should be something more like isSingleStlocLocal?

}

tiRetVal = verMakeTypeInfo(resolvedToken.hClass);

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

Why did you move this line?

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

We can use the stack type handle to set temp types; this makes the more precise type available more generally.

@@ -6491,6 +6601,14 @@ void Compiler::lvaDumpEntry(unsigned lclNum, FrameLayoutState curState, size_t r
{
printf(" stack-byref");
}
if (varDsc->lvClassHnd != nullptr)
{
printf(" class-hnd");

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

Did you want to actually dump the class handle (or look up a name for it and dump that)?

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

Class names get pretty long and there should now be notes left earlier in the dump stream when this field is set that correlate the variable, the name of the type, and the corresponding handle address. So my instinct is to leave this as is.

I could add this as a dump option, though we seem to have perhaps too many dump options already.

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

I could add this as a dump option, though we seem to have perhaps too many dump options already

My inclination would be not to add a dump option, and leave how you have it -- thanks for explaining.

}

// Are we updating the type?
if (varDsc->lvClassHnd != clsHnd)

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

From the descriptions (in the comment on this method and commit message), I was expecting more of a meet operation here, not "the new one must be better" -- i.e. that we could have multiple sources of information and call this with each... would it make sense to do something like add a debug check that this only gets called once per local?

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

The comment is misleading, sorry about that. This is not a meet (where abstractly we'd be moving from subtypes towards supertypes), but is simply modelling assignment. I had been thinking of handling the types for the block boundary spill temps when I wrote this but subsequently realized it can't be done yet. I'll reword this.

Not sure if you are using "local" here in the specific or general sense -- this might be called twice for callee arg temps (which are also caller locals): once to set the type based on the signature, and then again to "improve" the type based on the expression for the caller supplied value. I can add more state to track whether a temp is a callee arg temp and then assert that the update cases only happen where expected.

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

Let me refine that last paragraph -- updates can happen for single-def locals too, as we move from the signature type to the type of the initializing value. So the only cases of temps-corresponding-to-IL variables we won't update are the args in the root method; root method locals, callee args and locals all might be updated.

Compiler temps should never get updated; their types should only be set when the temp is allocated.

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

I was using "local" in the more general sense and didn't realize you have the two cases that can apply to the same callee arg temp. Yes if there's a reasonable way to add checks along these lines (maybe an "is callee local" bit like you suggest, or maybe just have a way to assert this isn't getting called twice but also a way to exempt that callee arg callsite), I'd think that would be useful. Ok without if that gets too unwieldy.

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

updates can happen for single-def locals too, as we move from the signature type to the type of the initializing value

So generally you'd expect zero or one calls to the overload that doesn't take a tree, followed by zero or one calls to the overload that does take a tree?

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

Yes.

Maybe it would be simpler to do this: track for each var if a type update has happened, and assert it can only happen once. Pass an "update expected" flag down from the calling context since there are just two specific locations where we do updates: when we assign to the arg temp (where we won't have a stack handle), and when we see a STLOC for a single IL store local (where we will).

So we could check that updates only happened when expected and only happened once.

@JosephTremoulet

JosephTremoulet Mar 27, 2017

Contributor

Sure, that sounds good (I'm assuming that by "update" you mean "get a non-null class handle passed in and already had a non-null class handle stored on the type"?)

@AndyAyersMS

AndyAyersMS Mar 27, 2017

Author Member

Yes, exactly that -- even if it turns out we don't alter the handle or exactness bit.
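
The checking scheme discussed here could look roughly like the sketch below: one routine for the initial set, one for the update, and a debug bit recording that an update happened. This is an assumption-laden model, not the code in the PR: `ClassHandle`, `LocalTypeState`, `setClass`, and `updateClass` are hypothetical names, and the "update expected" flag is modeled simply by the caller choosing `updateClass`.

```cpp
#include <cassert>

using ClassHandle = const void*; // hypothetical stand-in for a class handle

struct LocalTypeState
{
    ClassHandle classHnd   = nullptr;
    bool        isExact    = false;
    bool        wasUpdated = false; // would be debug-only state in practice
};

// Initial set: only valid while no handle has been recorded.
inline void setClass(LocalTypeState& s, ClassHandle hnd, bool isExact)
{
    assert(hnd != nullptr);
    assert(s.classHnd == nullptr);
    s.classHnd = hnd;
    s.isExact  = isExact;
}

// Update: requires an existing handle and may happen at most once.
// It counts as an update even if the handle and exactness do not change.
inline void updateClass(LocalTypeState& s, ClassHandle hnd, bool isExact)
{
    assert(hnd != nullptr);
    assert(s.classHnd != nullptr);
    assert(!s.wasUpdated);
    s.wasUpdated = true;
    s.classHnd   = hnd;
    s.isExact    = isExact;
}
```

A callee arg temp would go through `setClass` with the signature type and then `updateClass` with the type of the caller-supplied value; a second `updateClass` on the same local would trip the assert.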

@AndyAyersMS

Member Author

commented Mar 27, 2017

@JosephTremoulet hopefully this covers most of your feedback. It reads better now so I appreciate your review.

{
CORINFO_CLASS_HANDLE stkHnd = verCurrentState.esStack[level].seTypeInfo.GetClassHandle();
lvaSetClass(tnum, tree, stkHnd);
}

@briansull

briansull Mar 28, 2017

Contributor

I did see a comment in the code base that lvIsTemp variables are single def, but nothing enforces that, and I spotted at least one place where it isn't true:
genReturnLocal = lvaGrabTemp(true DEBUGARG("Single return block return value"));

@AndyAyersMS

AndyAyersMS Mar 28, 2017

Author Member

Right, I don't rely on lvIsTemp == "single def" to be true in general (otherwise I would do a more comprehensive type capture over in impAssignTempGen). But in this specific context the temps are single def when lvIsTemp is true.

Seems like we ought to fix up those places where lvIsTemp does not imply a single def with an eye towards ultimately making this reliable, at least through the importer / inliner.

@briansull

briansull Mar 28, 2017

Contributor

I would prefer that we use the lvIsSingleDef property when we need that behavior to be true.
We may want to deprecate lvIsTemp or define its meaning more carefully. Its origin was as a hint to the old register allocator that we had a short-lifetime temp and that we should try a bit harder to both track it and place it in a register. It didn't mean that there was a single definition.

@briansull

briansull Mar 28, 2017

Contributor

That case that I pointed out may be of interest to your devirtualization work.
If we have many return blocks (5 or more) we create a new LclVar to hold the return value
and each return block gets converted into an assignment of the return value into this new LclVar and a branch to the "single" return block. Thus we end up with a new LclVar that has 5 or more defs and a single use. It has a "short lifetime", as the JIT can typically find a register (usually RAX) to hold the value assigned and immediately returned. If you were to treat this as having a single def you might infer that the return type is known when in fact there could be different return types.
This would probably only impact the inlining case and we probably don't try to inline a method with so many return blocks, but I'm not sure about that.
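
To make the hazard concrete: with several definitions, the most a temp's type can safely be is a common base of all the defined values' types. The sketch below computes that bound in a toy single-inheritance hierarchy. It is purely illustrative: `Hierarchy`, `isSameOrDerived`, and `commonBase` are hypothetical, and it ignores interfaces, exactness, and shared generic types entirely.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical single-inheritance hierarchy: class name -> parent name
// (an empty parent marks the root).
using Hierarchy = std::map<std::string, std::string>;

// True if `type` is `base` or derives from it.
inline bool isSameOrDerived(const Hierarchy& h, std::string type, const std::string& base)
{
    while (!type.empty())
    {
        if (type == base)
        {
            return true;
        }
        auto it = h.find(type);
        type    = (it == h.end()) ? std::string() : it->second;
    }
    return false;
}

// Most derived class that every definition's type is or derives from.
// Returns an empty string if the types share no base in the hierarchy.
inline std::string commonBase(const Hierarchy& h, const std::vector<std::string>& defTypes)
{
    assert(!defTypes.empty());
    std::string candidate = defTypes.front();
    while (!candidate.empty())
    {
        bool allMatch = true;
        for (const std::string& t : defTypes)
        {
            if (!isSameOrDerived(h, t, candidate))
            {
                allMatch = false;
                break;
            }
        }
        if (allMatch)
        {
            return candidate;
        }
        auto it   = h.find(candidate);
        candidate = (it == h.end()) ? std::string() : it->second;
    }
    return std::string();
}
```

If two return blocks assign values of sibling types to the shared return temp, only their common parent is a safe type for the temp; taking the type from whichever definition happened to be seen first would be wrong.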

@AndyAyersMS

AndyAyersMS Mar 29, 2017

Author Member

For devirt we can't use lvIsSingleDef as currently constituted, for a couple of reasons:

  • It is computed from a global tree walk, and so is not available when importing code when we are trying to devirtualize
  • It currently does not consider some constructs as "single def" that can be treated that way during type propagation -- for example, question ops

If you're suggesting that we eliminate/deprecate/redefine lvIsTemp and start setting lvIsSingleDef during the importer (and update subsequent code to verify that if it was set during importation it was set correctly) then perhaps we can use it, if we can reconcile the question ops issue. But that is future work.

I'll update the code here to capture the type for newly introduced temps instead of looking at lvIsTemp; this better captures the intent anyways.

As for return temps -- I have not gone and comprehensively tried to capture types for all temps, in part because some (like this one) may have multiple definitions, and we don't have a general way of computing bounds on types when we have multiple definitions (specifically for cases where we have mixtures of exact and shared types).

We use the return temp more often these days, since it's also used to avoid interference from the post-inline unpin/gc ref nulling, and so quite often it may be a single def. So we could get that case and we can always use the type from the method signature as the initial approximation. I'll leave those as future enhancements.

@JosephTremoulet

Contributor

commented Mar 28, 2017

Updates look good.

@briansull

Contributor

commented Mar 28, 2017

LGTM with comments

@AndyAyersMS

Member Author

commented Mar 29, 2017

OSX now hitting a java remoting error (log)

FATAL: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@7f84e947[name=dci-mac-build-050]
10:53:38 hudson.remoting.RequestAbortedException: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@7f84e947[name=dci-mac-build-050]

Will retry.

@dotnet-bot retest OSX10.12 x64 Checked Build and Test

@AndyAyersMS

Member Author

commented Mar 29, 2017

Looks like OSX is just generally in a bad way lately, the last 6 or so attempts by any PR have all failed.

@mmitche can you take a look?

@mmitche

Member

commented Mar 29, 2017

@AndyAyersMS Took a look. A particular machine was acting up, though I haven't seen the particular error in a while. I rebooted it.

@mmitche

Member

commented Mar 29, 2017

@dotnet-bot test OSX10.12 x64 Checked Build and Test

@briansull

Contributor

commented Mar 29, 2017

lgtm

@AndyAyersMS AndyAyersMS merged commit e99037f into dotnet:master Mar 29, 2017

16 checks passed

CentOS7.1 x64 Debug Build and Test: Build finished.
FreeBSD x64 Checked Build: Build finished.
OSX10.12 x64 Checked Build and Test: Build finished.
Tizen armel Cross Debug Build: Build finished.
Tizen armel Cross Release Build: Build finished.
Ubuntu arm Cross Release Build: Build finished.
Ubuntu x64 Checked Build and Test: Build finished.
Ubuntu x64 Formatting: Build finished.
Ubuntu16.04 arm Cross Debug Build: Build finished.
Windows_NT arm Cross Debug Build: Build finished.
Windows_NT arm Cross Release Build: Build finished.
Windows_NT arm64 Cross Debug Build: Build finished.
Windows_NT x64 Debug Build and Test: Build finished.
Windows_NT x64 Formatting: Build finished.
Windows_NT x64 Release Priority 1 Build and Test: Build finished.
Windows_NT x86 Checked Build and Test: Build finished.

@AndyAyersMS AndyAyersMS deleted the AndyAyersMS:TrackSingleDefLocals branch Mar 29, 2017

@AndyAyersMS AndyAyersMS referenced this pull request Jun 21, 2017

Open

JIT: devirtualization next steps #9908

5 of 23 tasks complete

@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
