This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
The following PR has uncovered an assert failure in SuperPMI playback
#8532
The repro method has the following IR pattern in one of the basic blocks:
RefPos 117 and 118 correspond to Def and Use of op1 of GT_LOCKADD. After allocating 118, allocateRegisters() puts rcx in delayRegsToFree since it is the last Use. While processing FixedReg RefPos 119, allocateRegisters() assigns delayRegsToFree to regsToFree and sets the former to RBM_NONE.
While allocating RefPos 120, allocateRegisters() finds that the location hasn't changed and hence does not free rcx (held in the regsToFree local). Since V1 is in rsi but needs to be in rcx, RefPos 120 is allocated rcx and marked as copyReg=true. Note that at the time rcx is allocated to RefPos 120, rcx is still considered a busy reg. Before processing RefPos 121, allocateRegisters() frees rcx since the location has changed. As a result, V1 gets incorrectly marked as inactive.
In summary, GT_LOCKADD is a node whose op1 is marked IsDelayFree=true but which has no Def position. As a result, op1's reg is freed at an incorrect location.
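The timing described above can be reproduced with a small toy model. This is only a sketch and not the JIT's code: `ToyRefPos`, `simulateFrees`, the register bit chosen for `RCX`, and the location numbers are all made up for illustration. The only behavior it borrows from the description is that registers in regsToFree are freed when the location advances, at which point delayRegsToFree is moved into regsToFree and reset to RBM_NONE.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using RegMask = std::uint32_t;
constexpr RegMask RBM_NONE = 0;
constexpr RegMask RCX      = 1u << 1; // arbitrary bit for this toy, not the JIT's encoding

// Hypothetical, heavily simplified stand-in for a ref position.
struct ToyRefPos
{
    unsigned location;     // position in the linear order (non-decreasing)
    RegMask  lastUseReg;   // register whose last use is here, or RBM_NONE
    bool     delayRegFree; // true if that register must be freed one step later
};

// For each ref position, returns the registers freed just before it is processed.
std::vector<RegMask> simulateFrees(const std::vector<ToyRefPos>& refs)
{
    std::vector<RegMask> freedBefore;
    RegMask  regsToFree      = RBM_NONE;
    RegMask  delayRegsToFree = RBM_NONE;
    unsigned prevLocation    = 0;

    for (const ToyRefPos& rp : refs)
    {
        assert(rp.location >= prevLocation);
        RegMask freedNow = RBM_NONE;
        if (rp.location > prevLocation)
        {
            // Location changed: free the pending set, then promote the delayed set.
            freedNow        = regsToFree;
            regsToFree      = delayRegsToFree;
            delayRegsToFree = RBM_NONE;
        }
        freedBefore.push_back(freedNow);

        if (rp.delayRegFree)
            delayRegsToFree |= rp.lastUseReg;
        else
            regsToFree |= rp.lastUseReg;
        prevLocation = rp.location;
    }
    return freedBefore;
}

// Toy analogue of RefPos 117..121: Def, delay-free last Use of rcx,
// FixedReg, Use that wants rcx (same location), then the next location.
std::vector<RegMask> toyRepro()
{
    return simulateFrees({{8, RBM_NONE, false},
                          {10, RCX, true},
                          {12, RBM_NONE, false},
                          {12, RBM_NONE, false},
                          {14, RBM_NONE, false}});
}
```

In this toy, `toyRepro()` reports that rcx is not freed before the fourth entry (the Use that wants rcx) and is freed only before the fifth, which is the ordering the analysis above attributes to RefPos 120 and 121. Inserting an extra entry at an intermediate location, say 11, between the delay-free Use and the FixedReg entry moves the free to just before location 12.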
Fix:
After importation, the IR has a GT_XADD node, which gets morphed into a GT_LOCKADD of TYP_VOID (see gtExtractSideEffList()). Lower register specification then sets its dstCount to zero. As a result, no Def position gets created.
Now lower register specification will mark GT_LOCKADD/XADD/XCHG nodes as IsLocalDefUse=true if their type is set to TYP_VOID by the front end. This results in a Def position that is created but not considered consumed by the parent node.
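The rule can be pictured with a minimal sketch. It is not the actual change: `ToyOper`, `ToyType`, `ToyNodeInfo`, `toySpecifyRegisters`, and the two predicates are hypothetical stand-ins, and the way the predicates combine `dstCount` and `isLocalDefUse` is an assumption made only to mirror the two sentences above.

```cpp
#include <cassert>

// Toy stand-ins; the JIT's real node and register-info types are far richer.
enum ToyOper { TOY_LOCKADD, TOY_XADD, TOY_XCHG, TOY_ADD };
enum ToyType { TOY_TYP_VOID, TOY_TYP_INT };

struct ToyNodeInfo
{
    int  dstCount      = 0;
    bool isLocalDefUse = false;
};

bool toyIsInterlockedOper(ToyOper oper)
{
    return oper == TOY_LOCKADD || oper == TOY_XADD || oper == TOY_XCHG;
}

// Mirrors the rule in the text: a TYP_VOID interlocked node keeps dstCount == 0
// but is flagged as a local def-use.
ToyNodeInfo toySpecifyRegisters(ToyOper oper, ToyType type)
{
    ToyNodeInfo info;
    if (type == TOY_TYP_VOID)
    {
        info.dstCount      = 0;
        info.isLocalDefUse = toyIsInterlockedOper(oper);
    }
    else
    {
        info.dstCount = 1; // assumption: a typed node produces one value
    }
    return info;
}

// Assumption of this sketch: a Def position exists if the node produces a value
// or is flagged as a local def-use.
bool toyCreatesDef(const ToyNodeInfo& info)
{
    return info.dstCount > 0 || info.isLocalDefUse;
}

// Assumption of this sketch: only a produced, non-local value is consumed by the parent.
bool toyConsumedByParent(const ToyNodeInfo& info)
{
    return info.dstCount > 0 && !info.isLocalDefUse;
}

// Small self-check showing the intended reading of the rule.
void toyDemo()
{
    ToyNodeInfo lockAdd = toySpecifyRegisters(TOY_LOCKADD, TOY_TYP_VOID);
    assert(toyCreatesDef(lockAdd));
    assert(!toyConsumedByParent(lockAdd));
}
```

Under these assumptions, a TYP_VOID `TOY_LOCKADD` gets a Def that its parent does not consume, while a TYP_VOID node of any other kind still gets no Def at all.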
Here is the reg allocation for the repro method with the fix
SuperPMI asm Diffs:
CqPerf has no asm diffs.
Mscorlib has 4 methods impacted by this change. Since the GT_LOCKADD Def position is now created, its op1's reg gets freed early enough and at the right location. This results in reg allocation differences.