Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Neighboring bounds checks aren't collapsed #109983

Open
colejohnson66 opened this issue Nov 19, 2024 · 3 comments
Open

Neighboring bounds checks aren't collapsed #109983

colejohnson66 opened this issue Nov 19, 2024 · 3 comments
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Milestone

Comments

@colejohnson66
Copy link

colejohnson66 commented Nov 19, 2024

Description

Given the following code:

public static uint Test(ReadOnlySpan<uint> span) =>
    span[0] + span[1] + span[2] + span[3];

The only way for this method to fail a bounds/range check is if span.Length < 4. As such, I'd expect the runtime to optimize to something like this:

public static uint Test(ReadOnlySpan<uint> span)
{
    if (span.Length <= 3)
        CORINFO_HELP_RNGCHKFAIL();
    return span[0] + span[1] + span[2] + span[3];
}

Instead, the runtime generates this:

// coreclr trunk-20241119+e33be4d53ab951272c9161c50b549e5e7db60262

Klass:Test(System.ReadOnlySpan`1[uint]):uint (FullOpts):
       push     rax
       test     esi, esi
       je       SHORT G_M30269_IG04
       mov      eax, dword ptr [rdi]
       cmp      esi, 1
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x04]
       cmp      esi, 2
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x08]
       cmp      esi, 3
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x0C]
       add      rsp, 8
       ret      
G_M30269_IG04:  ;; offset=0x0024
       call     CORINFO_HELP_RNGCHKFAIL
       int3     

Note the four range checks against esi. Failures all call into CORIFNO_HELP_RNGCHKFAIL with no indication of which failed. As such, I'd expect the four compare+branch blocks to collapse into a singular one like this:

       cmp esi, 3
       jbe SHORT_G_M30269_IG04
       mov eax, dword ptr [rdi]
       add eax, dword ptr [rdi+0x04]
       add eax, dword ptr [rdi+0x08]
       add eax, dword ptr [rdi+0x0C]
       ret      
G_M30269_IG04:  ;; offset=0x00__
       call     CORINFO_HELP_RNGCHKFAIL
       int3     

Configuration

Godbolt Compiler Explorer against "trunk" (e33be4d)

Regression?

No. Present in net7.0.19 and net8.0.5.

Notes

Interestingly, a naive loop generates nearly the same assembly, but sets up and tears down a frame:

public static uint Test(ReadOnlySpan<uint> span)
{
    uint sum = 0;
    for (int i = 0; i < 4; i++)
        sum += span[i];
    return sum;
}
// coreclr trunk-20241119+e33be4d53ab951272c9161c50b549e5e7db60262

Klass:Test(System.ReadOnlySpan`1[uint]):uint (FullOpts):
       push     rbp
       mov      rbp, rsp
       test     esi, esi
       je       SHORT G_M30269_IG04
       mov      eax, dword ptr [rdi]
       cmp      esi, 1
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x04]
       cmp      esi, 2
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x08]
       cmp      esi, 3
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x0C]
       pop      rbp
       ret      
G_M30269_IG04:  ;; offset=0x0024
       call     CORINFO_HELP_RNGCHKFAIL
       int3     
@colejohnson66 colejohnson66 added the tenet-performance Performance related issue label Nov 19, 2024
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 19, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Nov 19, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@EgorBo EgorBo added this to the Future milestone Nov 19, 2024
@EgorBo EgorBo removed the untriaged New issue has not been triaged by the area owner label Nov 19, 2024
@AndyAyersMS
Copy link
Member

As far as I know, we don't yet do any analysis to prove that an execution that might do some things and then throw is equivalent to doing an up-front check and just throwing. Typically, there other are observable side effects from the "things" so these are not equivalent.

It would be interesting to try and figure out how often we see this. Something like: if we have two bounds checks A and B, where A dominates B, and B implies A (eg n > 3 implies n > 2), and there are no other observable side effects between A and B, then we can change the check at A to be B, and remove B. Repeated application of this (or starting with the most dominated check and pushing it up as far as it can go) would get us the "optimized" version above.

@colejohnson66
Copy link
Author

Neither LLVM (with Rust) nor GCC (with C++ and std::vector::at) perform this optimization, so it’s not surprising CoreCLR doesn’t either. I figured I’d bring it up as an interesting optimization because there’s quite a bit of code that contains logic similar to this (future code implying that prior code must succeed and there’s no side effects between them).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

3 participants