Neighboring bounds checks aren't collapsed #109983

colejohnson66 · 2024-11-19T21:03:26Z

Description

Given the following code:

public static uint Test(ReadOnlySpan<uint> span) =>
    span[0] + span[1] + span[2] + span[3];

The only way for this method to fail a bounds/range check is if span.Length < 4. As such, I'd expect the runtime to optimize to something like this:

public static uint Test(ReadOnlySpan<uint> span)
{
    if (span.Length <= 3)
        CORINFO_HELP_RNGCHKFAIL();
    return span[0] + span[1] + span[2] + span[3];
}

Instead, the runtime generates this:

// coreclr trunk-20241119+e33be4d53ab951272c9161c50b549e5e7db60262

Klass:Test(System.ReadOnlySpan`1[uint]):uint (FullOpts):
       push     rax
       test     esi, esi
       je       SHORT G_M30269_IG04
       mov      eax, dword ptr [rdi]
       cmp      esi, 1
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x04]
       cmp      esi, 2
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x08]
       cmp      esi, 3
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x0C]
       add      rsp, 8
       ret      
G_M30269_IG04:  ;; offset=0x0024
       call     CORINFO_HELP_RNGCHKFAIL
       int3

Note the four range checks against esi. Failures all call into CORIFNO_HELP_RNGCHKFAIL with no indication of which failed. As such, I'd expect the four compare+branch blocks to collapse into a singular one like this:

       cmp esi, 3
       jbe SHORT_G_M30269_IG04
       mov eax, dword ptr [rdi]
       add eax, dword ptr [rdi+0x04]
       add eax, dword ptr [rdi+0x08]
       add eax, dword ptr [rdi+0x0C]
       ret      
G_M30269_IG04:  ;; offset=0x00__
       call     CORINFO_HELP_RNGCHKFAIL
       int3

Configuration

Godbolt Compiler Explorer against "trunk" (e33be4d)

Regression?

No. Present in net7.0.19 and net8.0.5.

Notes

Interestingly, a naive loop generates nearly the same assembly, but sets up and tears down a frame:

public static uint Test(ReadOnlySpan<uint> span)
{
    uint sum = 0;
    for (int i = 0; i < 4; i++)
        sum += span[i];
    return sum;
}

// coreclr trunk-20241119+e33be4d53ab951272c9161c50b549e5e7db60262

Klass:Test(System.ReadOnlySpan`1[uint]):uint (FullOpts):
       push     rbp
       mov      rbp, rsp
       test     esi, esi
       je       SHORT G_M30269_IG04
       mov      eax, dword ptr [rdi]
       cmp      esi, 1
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x04]
       cmp      esi, 2
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x08]
       cmp      esi, 3
       jbe      SHORT G_M30269_IG04
       add      eax, dword ptr [rdi+0x0C]
       pop      rbp
       ret      
G_M30269_IG04:  ;; offset=0x0024
       call     CORINFO_HELP_RNGCHKFAIL
       int3

The text was updated successfully, but these errors were encountered:

dotnet-policy-service · 2024-11-19T21:04:01Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

AndyAyersMS · 2024-11-20T16:46:51Z

As far as I know, we don't yet do any analysis to prove that an execution that might do some things and then throw is equivalent to doing an up-front check and just throwing. Typically, there other are observable side effects from the "things" so these are not equivalent.

It would be interesting to try and figure out how often we see this. Something like: if we have two bounds checks A and B, where A dominates B, and B implies A (eg n > 3 implies n > 2), and there are no other observable side effects between A and B, then we can change the check at A to be B, and remove B. Repeated application of this (or starting with the most dominated check and pushing it up as far as it can go) would get us the "optimized" version above.

colejohnson66 · 2024-11-22T13:49:31Z

Neither LLVM (with Rust) nor GCC (with C++ and std::vector::at) perform this optimization, so it’s not surprising CoreCLR doesn’t either. I figured I’d bring it up as an interesting optimization because there’s quite a bit of code that contains logic similar to this (future code implying that prior code must succeed and there’s no side effects between them).

colejohnson66 added the tenet-performance Performance related issue label Nov 19, 2024

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 19, 2024

dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Nov 19, 2024

EgorBo added this to the Future milestone Nov 19, 2024

EgorBo removed the untriaged New issue has not been triaged by the area owner label Nov 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Neighboring bounds checks aren't collapsed #109983

Neighboring bounds checks aren't collapsed #109983

colejohnson66 commented Nov 19, 2024 •

edited

Loading

dotnet-policy-service bot commented Nov 19, 2024

AndyAyersMS commented Nov 20, 2024

colejohnson66 commented Nov 22, 2024

Neighboring bounds checks aren't collapsed #109983

Neighboring bounds checks aren't collapsed #109983

Comments

colejohnson66 commented Nov 19, 2024 • edited Loading

Description

Configuration

Regression?

Notes

dotnet-policy-service bot commented Nov 19, 2024

AndyAyersMS commented Nov 20, 2024

colejohnson66 commented Nov 22, 2024

colejohnson66 commented Nov 19, 2024 •

edited

Loading