Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Fix GT_LOCKADD register specification. #8697

Merged
merged 1 commit into from
Dec 21, 2016
Merged

Conversation

sivarv
Copy link
Member

@sivarv sivarv commented Dec 21, 2016

The following PR has uncovered an assert failure in SuperPMI playback
#8532

The repro method has the following IR pattern in one of the basic blocks:

// V1 is in esi at the beginning of the basic block
// Lower reg specification marks op1 as IsDelayFree=true and op2 is marked as contained immed.
// Further LOCKADD consumes its operands and produces no result.
GT_LOCKADD(op1=long static fied addr, op2 = 1)

// thisptr = rcx
GT_PUTARG_REG(v1)
ControlExpr = indir(lea(indir(lea(v1), 32)))
GT_CALL (thisptr, contrl expr)

--------------------------------+----+----+----+----+----+----+----+----+----+----+----+----+
Loc RP#  Name Type  Action Reg  |rax |rcx |rdx |rbx |rsp |rbp |rsi |rdi |r8  |r9  |r10 |r11 |
--------------------------------+----+----+----+----+----+----+----+----+----+----+----+----+
 91.#116 BB3 PredBB1            |    |    |    |----|----|----|V1 a|----|    |    |    |    |
 94.#117 C36  Def    Alloc rcx  |    |C36a|    |----|----|----|V1 a|----|    |    |    |    |
 97.#118 C36  Use *D Keep  rcx  |    |C36a|    |----|----|----|V1 a|----|    |    |    |    |
103.#119 rcx  Fixd   Keep  rcx  |    |C36a|    |----|----|----|V1 a|----|    |    |    |    |
103.#120 V1   Use    Copy  rcx  |    |V1 a|    |----|----|----|V1 a|----|    |    |    |    |
104.#121 rcx  Fixd   Keep  rcx  |    |V1 i|    |----|----|----|V1 i|----|    |    |    |    |
104.#122 I37  Def    Alloc rcx  |    |I37a|    |----|----|----|V1 i|----|    |    |    |    |

RefPos 117 and 118 correspond to Def and Use of op1 of GT_LOCKADD. After allocating 118, allocateRegisters() puts rcx in delayRegsToFree since it is the last Use. While processing FixedReg RefPos 119, allocateRegisters() assigns delayRegsToFree to regsToFree and sets the former to RBM_NONE.
While allocating RefPos 120, allocateRegister() finds that the location hasn't changed and hence will not free rcx (given by regsToFree local). Since V1 is rsi but needs to be in rcx, RefPos 120 is allocated rcx and marked as copyReg=true. Note that at the time of allocating rcx to RefPos 120, rcx is considered a busy reg. Before processing RefPos 121, rcx gets freed by allocateRegisters() since the location has changed. As a result, V1 gets incorrectly marked as inactive.

In summary GT_LOCKADD is a node with one of its op1 marked as IsDelayFree=true but has no Def position. As a result, op1's reg is getting freed an incorrect location.

Fix:
After importatation IR has GT_XADD node which gets morphed into GT_LOCKADD of TYP_VOID (see gtExtractSideEffList()). Further Lower register specification sets its dstCount to zero. As a result, no Def position gets created.

Now lower register specification will mark GT_LOCKADD/XADD/XCHG nodes as IsLocalDefUse=true if its type is set to TYP_VOID by front-end. This would result in a Def position created but not considered consumed by its parent node.

Here is the reg allocation for the repro method with the fix

--------------------------------+----+----+----+----+----+----+----+----+----+----+----+----+
Loc RP#  Name Type  Action Reg  |rax |rcx |rdx |rbx |rsp |rbp |rsi |rdi |r8  |r9  |r10 |r11 |
--------------------------------+----+----+----+----+----+----+----+----+----+----+----+----+
 91.#117 BB3 PredBB1            |    |    |    |----|----|----|V1 a|----|    |    |    |    |
 94.#118 C37  Def    Alloc rcx  |    |C37a|    |----|----|----|V1 a|----|    |    |    |    |
 97.#119 C37  Use *D Keep  rcx  |    |C37a|    |----|----|----|V1 a|----|    |    |    |    |
 98.#120 I38  Def *  Alloc rax  |I38a|C37a|    |----|----|----|V1 a|----|    |    |    |    |
103.#121 rcx  Fixd   Keep  rcx  |    |    |    |----|----|----|V1 a|----|    |    |    |    |
103.#122 V1   Use    Copy  rcx  |    |V1 a|    |----|----|----|V1 a|----|    |    |    |    |
104.#123 rcx  Fixd   Keep  rcx  |    |V1 a|    |----|----|----|V1 a|----|    |    |    |    |
104.#124 I39  Def    Alloc rcx  |    |I39a|    |----|----|----|V1 a|----|    |    |    |    |
109.#125 V1   Use *  Keep  rsi  |    |I39a|    |----|----|----|V1 a|----|    |    |    |    |
110.#126 I40  Def    Alloc rax  |I40a|I39a|    |----|----|----|    |----|    |    |    |    |
113.#127 I40  Use *  Keep  rax  |I40a|I39a|    |----|----|----|    |----|    |    |    |    |
114.#128 I41  Def    Alloc rax  |I41a|I39a|    |----|----|----|    |----|    |    |    |    |

SuperPMI asm Diffs:
CqPerf has no asm diffs.
Mscorlib has 4 methods impacted due to this change. Since GT_LOCKADD Def position is created, its op1's reg gets freed early enough and at the right location. This resulted in reg allocation difference.

00.46:         <filename>                                            baseline        diff improvement  %improvement    #funcs
00.46:         mscorlib_Clean_Thin_Unique_all.dasm                       2767        2771          -4         -0.14         4
00.46:            2 functions regressed (4 total bytes regression)
00.46:               Regression # 1 goes from   128 to   131, diff of     3 ( 2.29%) :: System.Runtime.MemoryFailPoint:Dispose(bool):this
00.46:               Regression # 2 goes from  2162 to  2163, diff of     1 ( 0.05%) :: System.Runtime.MemoryFailPoint:.ctor(int):this
00.46:         TOTAL                                                     2767        2771          -4         -0.14         4

@sivarv
Copy link
Member Author

sivarv commented Dec 21, 2016

@CarolEidt - Please review this
CC @dotnet/jit-contrib

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

assert(tree->gtLsraInfo.dstCount == 0);

// Give it an artificial type and mark it isLocalDefUse = true.
// This would result in a Def position created but not considered

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: "This would" => "This will"

@sivarv sivarv merged commit 77561c8 into dotnet:master Dec 21, 2016
@sivarv sivarv deleted the lockAddFix branch December 21, 2016 18:30
@karelz karelz modified the milestones: 2.1.0, 2.0.0 Aug 28, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Fix GT_LOCKADD register specification.

Commit migrated from dotnet/coreclr@77561c8
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants