Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

JIT: improve types for single def locals and temps #10471

Merged
merged 3 commits into from
Mar 29, 2017

Conversation

AndyAyersMS
Copy link
Member

Track whether a local has a single definition, and if so, if it has
a reference type, try and update its type from the declared type to
a better type taken from the value being assigned to the local.

Obtain types for some of the 'short-lived' ref type temps that should
have a single definition. Use both the tree and the eval stack as sources
of type information (the latter can be phased out if/when all tree nodes
can return rich type information).

Refactor the code that sets or updates lvClassHnd into utilities
to provide better auditing of type flow and make the set/update process
a bit more rigorous.

Cleanup the code that passes argument values a bit by commoning redundant
argument lookup expressions.

Track whether a local has a single definition, and if so, if it has
a reference type, try and update its type from the declared type to
a better type taken from the value being assigned to the local.

Obtain types for some of the 'short-lived' ref type temps that should
have a single definition. Use both the tree and the eval stack as sources
of type information (the latter can be phased out if/when all tree nodes
can return rich type information).

Refactor the code that sets or updates lvClassHnd into utilities
to provide better auditing of type flow and make the set/update process
a bit more rigorous.

Cleanup the code that passes argument values a bit by commoning redundant
argument lookup expressions.
@AndyAyersMS
Copy link
Member Author

@JosephTremoulet PTAL
cc @dotnet/jit-contrib

Impacts 114 methods in the jit-diff framework set. Overall devirt stats for System.Private.CoreLib:

CallKind Success Fail Total Succ %
Virtual 798 7173 7971 10.01 %
Interface 98 4207 4305 2.28 %
Total 896 11380 12276 7.30 %

Baseline rates were 9.42%, 2.23%, and 6.89%.

This also lets us devirtualize the obvious things one would hope, eg the call to F() below.

class B 
{
    public virtual int F() { return 3; }
}

class D : B
{
    public override int F() { return 5; }
}

class X
{
    public static int Main(string[] args)
    {
        B b = new D();
        return b.F() + 95;
    }
}

@briansull
Copy link

Can you leverage the existing:
unsigned char lvSingleDef : 1; // variable has a single def

@AndyAyersMS
Copy link
Member Author

It looks like lvSingleDef means something slightly different -- it asserts that there is just one defining value. For the type propagation's "single definition" we're interested in knowing if there is one point at which the variable value is defined (ignoring any implicit null definition or reaching undefined value); and if so we set the type as the type common to the possible values at the definition point. We probably don't get a type off of a general question op today but we could do so in principle, and we do this for assignments where the value comes from the inline castclass/isinst, which would not be considered lvSingleDef material.

Copy link

@JosephTremoulet JosephTremoulet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good with a few comments

unsigned char lvArgWrite : 1; // variable is a parameter and STARG was used on it
unsigned char lvIsTemp : 1; // Short-lifetime compiler temp
unsigned char lvArgWrite : 1; // there is at least one STLOC or STARG on this local
unsigned char lvMultipleArgWrite : 1; // there is more than one STLOC on this local

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO these names are confusing now, especially the fact that lvMultipleArgWrite doesn't apply to args/starg. Maybe something more like HasStore/HasMultipleStores?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an asymmetry between args and locals that I wrestled with -- args are implicitly written once, so a single starg or ldarga creates a multiple-def arg, whereas for locals we need to see two of stloc or ldloca to have a multiple def local. I could resolve this by setting lvArgWrite when the arg temp is allocated and update the logic for args and then the fields would have same meaning. Does that seem preferable?

store seems a bit too specific for the name since these also apply to address-of operators. I'd use definition but then the overlap with lvISingleDef would be even more confusing. How about HasSingleWrite and HasMultipleWrites?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use definition but then the overlap with lvISingleDef would be even more confusing. How about HasSingleWrite and HasMultipleWrites?

I thought about definition and the same thing occurred to me. I picked "store" because it's the verb used in the IL opcodes, and to my mind they key point is that these properties track which IL opcodes get used and how often, so that you can take advantage of them when importing that single IL operation... I don't think that "write" conveys that (though I'll admit that "store" doesn't do a great job of conveying it). Maybe something along the lines of lvSingleILDef or lvSingleILStoreOp?

I could resolve this by setting lvArgWrite when the arg temp is allocated and update the logic for args and then the fields would have same meaning. Does that seem preferable?

maybe? On the one hand, I feel ok with what you have (modulo naming). On the other hand, I had to read over the change a few times to figure out that the asymmetry was intentional and because of the implicit arg def at the callsite... I guess whatever fits with the names is preferable to me, so if you keep them with names matching the opcodes then like you have it seems preferable, but if you switch to something more generic like Def/Write then yeah, reworking it to count the implicit arg defs at callsites as defs/writes would make more sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bringing in IL seems like a good improvement, so lvSingleILStoreOp and lvMultipleILStoreOp? And then similarly for the local bools that get introduced....

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so lvSingleILStoreOp and lvMultipleILStoreOp?

SGTM

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, lvSingleILStoreOp is sounding more precise than it is, it does not mean exactly one. Maybe lvHasILStoreOp and lvHasMultipleILStoreOp ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also am going to split the Set/Update method into two methods instead of passing in a bool.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe lvHasILStoreOp and lvHasMultipleILStoreOp ?

Sure.

if (inlArgInfo[argNum].argHasTmp)
const InlArgInfo& argInfo = inlArgInfo[argNum];
const bool argIsSingleDef = !argInfo.argHasLdargaOp && !argInfo.argHasStargOp;
GenTree* const argNode = inlArgInfo[argNum].argNode;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shorten to argInfo.argNode?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, not sure how I missed these. Will clean them up.

{
assert(inlArgInfo[argNum].argNode->OperIsConst() || inlArgInfo[argNum].argNode->gtOper == GT_ADDR);
assert(argInfo.argNode->OperIsConst() || argInfo.argNode->gtOper == GT_ADDR);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use your new local argNode here?

(inlArgInfo[argNum].argNode->gtOper != GT_LCL_VAR ||
(inlArgInfo[argNum].argNode->gtFlags & GTF_GLOB_REF)));
noway_assert((argInfo.argIsLclVar == 0) ==
(argInfo.argNode->gtOper != GT_LCL_VAR || (argInfo.argNode->gtFlags & GTF_GLOB_REF)));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argNode?


if (inlArgInfo[argNum].argNode->gtOper == GT_OBJ ||
inlArgInfo[argNum].argNode->gtOper == GT_MKREFANY)
if (argInfo.argNode->gtOper == GT_OBJ || argInfo.argNode->gtOper == GT_MKREFANY)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argNode? (and again once in each branch of the if/else in the next 10 lines)

// We should have seen a stloc in our IL prescan.
assert(lvaTable[lclNum].lvArgWrite);

const bool isSingleDefLocal =

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this has the differences you mentioned vs lvSingleDef, maybe this should be something more like isSingleStlocLocal?

}

tiRetVal = verMakeTypeInfo(resolvedToken.hClass);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you move this line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use the stack type handle to set temp types; this makes the more precise type available more generally.

@@ -6491,6 +6601,14 @@ void Compiler::lvaDumpEntry(unsigned lclNum, FrameLayoutState curState, size_t r
{
printf(" stack-byref");
}
if (varDsc->lvClassHnd != nullptr)
{
printf(" class-hnd");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to actually dump the class handle (or look up a name for it and dump that)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class names get pretty long and there should now be notes left earlier in the dump stream when this field is set that correlate the variable, the name of the type, and the corresponding handle address. So my instinct is to leave this as is.

I could add this as a dump option, though we seem to have perhaps too many dump options already.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could add this as a dump option, though we seem to have perhaps too many dump options already

My inclination would be not to add a dump option, and leave how you have it -- thanks for explaining.

}

// Are we updating the type?
if (varDsc->lvClassHnd != clsHnd)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the descriptions (in the comment on this method and commit message), I was expecting more of a meet operation here, not "the new one must be better" -- i.e. that we could have multiple sources of information and call this with each... would it make sense to do something like add a debug check that this only gets called once per local?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is misleading, sorry about that. This is not a meet (where abstractly we'd be moving from subtypes towards supertypes), but is simply modelling assignment. I had been thinking of handling the types for the block boundary spill temps when I wrote this but subsequently realized it can't be done yet. I'll reword this.

Not sure if you are using "local" here in the specific or general sense -- this might be called twice for callee arg temps (which are also caller locals): once to set the type based on the signature, and then again to "improve" the type based on the expression for the caller supplied value. I can add more state to track whether a temp is an callee arg temp and then assert that the update cases only happen where expected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me refine that last paragraph -- updates can happen for single-def locals too, as we move from the signature type to the type of the initializing value. So the only cases of temps-corresponding-to-IL variables we won't update are the args in the root method; root method locals, callee args and locals all might be updated.

Compiler temps should never get updated; their types should only be set when the temp is allocated.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using "local" in the more general sense and didn't realize you have the two cases that can apply to the same callee arg temp. Yes if there's a reasonable way to add checks along these lines (maybe an "is callee local" bit like you suggest, or maybe just have a way to assert this isn't getting called twice but also a way to exempt that callee arg callsite), I'd think that would be useful. Ok without if that gets too unwieldy.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updates can happen for single-def locals too, as we move from the signature type to the type of the initializing value

So generally you'd expect zero or one calls to the overload that doesn't take a tree, followed by zero or one calls to the overload that does take a tree?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Maybe it would be simpler to do this: track for each var if a type update has happened, and assert it can only happen once. Pass an "update expected" flag down from the calling context since there are just two specific locations where we do updates: when we assign to the arg temp (where we won't have a stack handle), and when we see a STLOC for a single IL store local (where we will).

So we could check that updates only happened when expected and only happened once.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that sounds good (I'm assuming that by "update" you mean "get a non-null class handle passed in and already had a non-null class handle stored on the type?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly that -- even if it turns out we don't alter the handle or exactness bit.

@AndyAyersMS
Copy link
Member Author

@JosephTremoulet hopefully this covers most of your feedback. It reads better now so I appreciate your review.

{
CORINFO_CLASS_HANDLE stkHnd = verCurrentState.esStack[level].seTypeInfo.GetClassHandle();
lvaSetClass(tnum, tree, stkHnd);
}
Copy link

@briansull briansull Mar 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did see a comment in the code base that lvIsTemp variables are single def, but there is nothing that enforces that and I spotted at least in one place where that isn't true.
genReturnLocal = lvaGrabTemp(true DEBUGARG("Single return block return value"));

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I don't rely on lvIsTemp == "single def" to be true in general (otherwise I would do a more comprehensive type capture over in impAssignTempGen). But in this specific context the temps are single def when lvIsTemp is true.

Seems like we ought to fix up those places where lvIsTemp does not imply a single def with an eye towards ultimately making this reliable, at least through the importer / inliner.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer that we use the lvIsSingleDef property when we need that behavior to be true.
We may want to deprecate the lvIsTemp or define is mean more carefully. It origin was as a
hint to the old register allocator that we had a short-lifetime temp and that we should try a bit harder to both track it and place it in a register. It didn't mean that there was a single definition.

Copy link

@briansull briansull Mar 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That case that I pointed out may be of interest to your devirtualization work.
If we have many return blocks (5 or more) we create a new LclVar to hold the return value
and each return block gets converted into an assignment of the return value into this new LclVar and a branch to the "single" return block. Thus we end up with a new LclVar that have 5 or more defs and a single use. It has a "short-lifetime" as the JIT can typically find a register usually RAX to hold the value assigned and immediately returned. If you were to treat this as having a single def you might infer that the return type is known when in fact there could be different return types.
This would probably only impact the inlining case and we probably don't try to inline a method with so many return blocks, but I'm not sure about that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For devirt we can't use lvIsSingleDef as currently constituted, for a couple of reasons:

  • It is computed from a global tree walk, and so is not available when importing code when we are trying to devirtualize
  • It currently does not consider some constructs as "single def" that can be treated that way during type propagation -- for example, question ops

If you're suggesting that we eliminate/deprecate/redefine lvIsTemp and start setting lvIsSingleDef during the importer (and update subsequent code to verify that if it was set during importation it was set correctly) then perhaps we can use it, if we can reconcile the question ops issue. But that is future work.

I'll update the code here to capture the type for newly introduced temps instead of looking at lvIsTemp; this better captures the intent anyways.

As for return temps -- I have not gone and comprehensively tried to capture types for all temps, in part because some (like this one) may have multiple definitions, and we don't have a general way of computing bounds on types when we have multiple definitions (specifically for cases where we have mixtures of exact and shared types).

We use the return temp more often these days, since it's also used to avoid interference from the post-inline unpin/gc ref nulling, and so quite often it may be a single def. So we could get that case and we can always use the type from the method signature as the initial approximation. I'll leave those as future enhancements.

@JosephTremoulet
Copy link

Updates look good.

@briansull
Copy link

LGTM with comments

@AndyAyersMS
Copy link
Member Author

OSX now hitting a java remoting error (log)

FATAL: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@7f84e947[name=dci-mac-build-050]
10:53:38 hudson.remoting.RequestAbortedException: java.io.IOException: Connection aborted: org.jenkinsci.remoting.nio.NioChannelHub$MonoNioTransport@7f84e947[name=dci-mac-build-050]

Will retry.

@dotnet-bot retest OSX10.12 x64 Checked Build and Test

@AndyAyersMS
Copy link
Member Author

Looks like OSX is just generally in a bad way lately, the last 6 or so attempts by any PR have all failed.

@mmitche can you take a look?

@mmitche
Copy link
Member

mmitche commented Mar 29, 2017

@AndyAyersMS Took a look. A particular machine was acting up, though I haven't seen the particular error in a while. I rebooted it.

@mmitche
Copy link
Member

mmitche commented Mar 29, 2017

@dotnet-bot test OSX10.12 x64 Checked Build and Test

@briansull
Copy link

lgtm

@AndyAyersMS AndyAyersMS merged commit e99037f into dotnet:master Mar 29, 2017
@AndyAyersMS AndyAyersMS deleted the TrackSingleDefLocals branch March 29, 2017 21:17
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
6 participants