Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Use faster write barriers when we know the target address is on the heap #97953

Merged
merged 9 commits into from
Feb 8, 2024

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Feb 5, 2024

Closes #97534

TL;DR: JIT has an optimization that tries to simplify/remove write barriers, but that optimization happens too late to rely on VN/SSA/Assertions, so CSE can easily ruin it by moving addresses or values to locals and forcing that opt to give up and use the checked write barrier. This PR assists that optimization from assertion prop (#97901 did it for nongc objects as values for such indirs), Example:

public struct Slot
{
    public object Item;
    public int SequenceNumber;
}

static void Test(Slot[] arr, object o)
{
    arr[0].Item = o;
    arr[0].SequenceNumber = 1;
}

Codegen for Test:

; Method LdtokenRepro:Test(Slot[],System.Object) (FullOpts)
G_M35662_IG01:  ;; offset=0x0000
       push     rbx
       sub      rsp, 32
G_M35662_IG02:  ;; offset=0x0005
       cmp      dword ptr [rcx+0x08], 0
       jbe      SHORT G_M35662_IG04
       lea      rbx, bword ptr [rcx+0x10]
       mov      rcx, rbx
-      call     CORINFO_HELP_CHECKED_ASSIGN_REF
+      call     CORINFO_HELP_ASSIGN_REF
       mov      dword ptr [rbx+0x08], 1
G_M35662_IG03:  ;; offset=0x001E
       add      rsp, 32
       pop      rbx
       ret      
G_M35662_IG04:  ;; offset=0x0024
       call     CORINFO_HELP_RNGCHKFAIL
       int3     
; Total bytes of code: 42

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 5, 2024
@ghost ghost assigned EgorBo Feb 5, 2024
@ghost
Copy link

ghost commented Feb 5, 2024

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Closes #97534

TL;DR: JIT has an optimization that tries to simplify/remove write barriers, but that optimization happens to late to rely on VN/SSA/Assertions, so CSE can easily ruin it by moving addresses or values to locals and forcing that opt to give up and use the checked write barrier. This PR assists that optimization from assertion prop, Example:

public struct Slot
{
    public object Item;
    public int SequenceNumber;
}

static void Test(Slot[] arr, object o)
{
    arr[0].Item = o;
    arr[0].SequenceNumber = 1;
}

Codegen for Test:

; Method LdtokenRepro:Test(Slot[],System.Object) (FullOpts)
G_M35662_IG01:  ;; offset=0x0000
       push     rbx
       sub      rsp, 32
G_M35662_IG02:  ;; offset=0x0005
       cmp      dword ptr [rcx+0x08], 0
       jbe      SHORT G_M35662_IG04
       lea      rbx, bword ptr [rcx+0x10]
       mov      rcx, rbx
-      call     CORINFO_HELP_CHECKED_ASSIGN_REF
+      call     CORINFO_HELP_ASSIGN_REF
       mov      dword ptr [rbx+0x08], 1
G_M35662_IG03:  ;; offset=0x001E
       add      rsp, 32
       pop      rbx
       ret      
G_M35662_IG04:  ;; offset=0x0024
       call     CORINFO_HELP_RNGCHKFAIL
       int3     
; Total bytes of code: 42
Author: EgorBo
Assignees: EgorBo
Labels:

area-CodeGen-coreclr

Milestone: -

@EgorBo EgorBo marked this pull request as ready for review February 5, 2024 12:51
@EgorBo
Copy link
Member Author

EgorBo commented Feb 5, 2024

@AndyAyersMS @jakobbotsch cc @SingleAccretion @dotnet/jit-contrib PTAL. Diffs. Diffs aren't too big because CORINFO_HELP_CHECKED_ASSIGN_REF -> CORINFO_HELP_ASSIGN_REF change has 0 size diff, but should improve perf since the latter doesn't need to check whether pointer targets heap or not (two comparisons). Most size diffs come from PtrToLoc -> NoWritebarrier. A small TP regression due to GetVNFunc in a loop (the regression was much bigger before I minimized it), but, hopefully, faster write barriers are worth it.

Closes #97534

@EgorBo
Copy link
Member Author

EgorBo commented Feb 5, 2024

@MihuBot

@EgorBo
Copy link
Member Author

EgorBo commented Feb 7, 2024

ping @AndyAyersMS - I presume it should help you with CSE work since CORINFO_HELP_ASSIGN_REF and CORINFO_HELP_CHECKED_ASSIGN_REF have the same asm and PerfScore, so CSE changes migth silently regress them.

I analyzed jit-diff on top of BCL and this PR removed ~2400 checked write barriers.

if (arg1Type != GCInfo::WriteBarrierForm::WBF_BarrierUnknown)
{
return arg1Type;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: this has an assumption that "addressOfLocal + addressOfHeap" is UB

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made it a bit more conservative (it didn't affect diffs) by requiring the 2nd argument to be a non-handle constant. So, if we see address being GT_ADD(op1, op2) and one of the arguments is either address-of-local or address-within-heap, the other argument must be a non-handle constant.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

src/coreclr/jit/assertionprop.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/assertionprop.cpp Show resolved Hide resolved
Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for fixing this.

We could / should have perf score reflect the cost of helper calls (likewise with gtCostEx).

@EgorBo
Copy link
Member Author

EgorBo commented Feb 8, 2024

We could / should have perf score reflect the cost of helper calls (likewise with gtCostEx).

I agree, I assume currently all unknown calls have the same cost (assuming same arguments)? Do we want to make all helpers cheaper than unknown calls or more expensive? I presume this could result in some diffs

@EgorBo
Copy link
Member Author

EgorBo commented Feb 8, 2024

Failures are known + PR passed all checks before adding debug JITDUMPs

@EgorBo EgorBo merged commit cd8057f into dotnet:main Feb 8, 2024
121 of 129 checks passed
@EgorBo EgorBo deleted the assertprop-wb-fix-3 branch February 8, 2024 14:42
@github-actions github-actions bot locked and limited conversation to collaborators Mar 10, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JIT: CSE of an address computation can result in a write barrier change
3 participants