Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate bounds checking in slice::Windows #77617

Merged
merged 2 commits into from
Oct 7, 2020

Conversation

AnthonyMikh
Copy link
Contributor

@AnthonyMikh AnthonyMikh commented Oct 6, 2020

This is how <core::slice::Windows as Iterator>::next looks right now:

fn next(&mut self) -> Option<&'a [T]> {
    if self.size > self.v.len() {
        None
    } else {
        let ret = Some(&self.v[..self.size]);
        self.v = &self.v[1..];
        ret
    }
}

The line with self.v = &self.v[1..]; relies on assumption that self.v is definitely not empty at this point. Else branch is taken when self.size <= self.v.len(), so self.v can be empty if self.size is zero. In practice, since Windows is never created directly but rather trough [T]::windows which panics when size is zero, self.size is never zero. However, the compiler doesn't know about this check, so it keeps the code which checks bounds and panics.

Using NonZeroUsize lets the compiler know about this invariant and reliably eliminate bounds checking without unsafe on -O2. Here is assembly of Windows<'a, u32>::next before and after this change (goldbolt):

Before
example::next:
        push    rax
        mov     rcx, qword ptr [rdi + 8]
        mov     rdx, qword ptr [rdi + 16]
        cmp     rdx, rcx
        jbe     .LBB0_2
        xor     eax, eax
        pop     rcx
        ret
.LBB0_2:
        test    rcx, rcx
        je      .LBB0_5
        mov     rax, qword ptr [rdi]
        mov     rsi, rax
        add     rsi, 4
        add     rcx, -1
        mov     qword ptr [rdi], rsi
        mov     qword ptr [rdi + 8], rcx
        pop     rcx
        ret
.LBB0_5:
        lea     rdx, [rip + .L__unnamed_1]
        mov     edi, 1
        xor     esi, esi
        call    qword ptr [rip + core::slice::slice_index_order_fail@GOTPCREL]
        ud2

.L__unnamed_2:
        .ascii  "./example.rs"

.L__unnamed_1:
        .quad   .L__unnamed_2
        .asciz  "\f\000\000\000\000\000\000\000\016\000\000\000\027\000\000"
After
example::next:
        mov     rcx, qword ptr [rdi + 8]
        mov     rdx, qword ptr [rdi + 16]
        cmp     rdx, rcx
        jbe     .LBB0_2
        xor     eax, eax
        ret
.LBB0_2:
        mov     rax, qword ptr [rdi]
        lea     rsi, [rax + 4]
        add     rcx, -1
        mov     qword ptr [rdi], rsi
        mov     qword ptr [rdi + 8], rcx
        ret

Note the lack of call to core::slice::slice_index_order_fail in second snippet.

Possible reasons not to merge this PR:

  • this changes the error message on panic in [T]::windows. However, AFAIK this messages are not covered by backwards compatibility policy.

@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 6, 2020
@lcnr
Copy link
Contributor

lcnr commented Oct 6, 2020

Would you mind adding a codegen test which checks that windows().next() is unable to panic?

@lcnr
Copy link
Contributor

lcnr commented Oct 6, 2020

I expect that you need something like this: https://github.com/bugadani/rust/blob/1d157ce797dddcee16a577796199b1144b4f7f34/src/test/codegen/enum-bounds-check-issue-13926.rs (potentially without min-llvm-version, depending on how stable this optimization is)

@jyn514 jyn514 added I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Oct 6, 2020
@AnthonyMikh
Copy link
Contributor Author

Would you mind adding a codegen test which checks that windows().next() is unable to panic?

@lcnr I can add it. However, I see no way to run codegen check. How can I do it?

@lcnr
Copy link
Contributor

lcnr commented Oct 6, 2020

./x.py test src/test/codegen/name_of_your_test.rs --stage 0 should work afaik

@AnthonyMikh
Copy link
Contributor Author

@lcnr I've added test but I am not sure if it is correct. Or should I test Windows::next directly?

@lcnr
Copy link
Contributor

lcnr commented Oct 6, 2020

seems good to me. Does this test fail without your changes?

@AnthonyMikh
Copy link
Contributor Author

seems good to me. Does this test fail without your changes?

It... Passes without my changes.

Apparently I don't know how to write codegen tests

@lcnr
Copy link
Contributor

lcnr commented Oct 6, 2020

Looks like you need slice_start_index_len_fail https://rust.godbolt.org/z/eb3jfo

@AnthonyMikh AnthonyMikh force-pushed the slice_windows_no_bounds_checking branch from 0f52d1b to b5096c1 Compare October 6, 2020 18:34
@AnthonyMikh
Copy link
Contributor Author

@lcnr without my changes this codegen test fails on --stage 1 but passes with my changes. Thanks for directing.

@AnthonyMikh
Copy link
Contributor Author

BTW I only now realised that my rather mechanical changes in other methods seem also to eliminate bounds check in next_back, last and nth_back. Should I make codegen tests for them as well?

@AnthonyMikh AnthonyMikh force-pushed the slice_windows_no_bounds_checking branch from b5096c1 to a8f098b Compare October 6, 2020 18:47
@lcnr
Copy link
Contributor

lcnr commented Oct 6, 2020

hmm, it probably won't hurt to use explicitly extend the test to also add functions for these methods 🤷 feel free to do so

r? @lcnr
otherwise r=me

@rust-highfive rust-highfive assigned lcnr and unassigned Mark-Simulacrum Oct 6, 2020
@AnthonyMikh AnthonyMikh force-pushed the slice_windows_no_bounds_checking branch from a8f098b to 61fca88 Compare October 7, 2020 09:58
@AnthonyMikh
Copy link
Contributor Author

@lcnr I've added codegen tests for Windows methods which benefit from my changes. PR is ready to merge.

@lcnr
Copy link
Contributor

lcnr commented Oct 7, 2020

Looking at the emitted code on https://godbolt.org/z/T716cx

It looks to me like next calls core::slice::index::slice_start_index_len_fail on the currently nightly and next_back calls core::slice::index::slice_end_index_len_fail rn.
Does next and next_back alone fail with the current nightly?

In general this looks to me like these tests might not be as useful as I thought, considering that method renames make them somewhat useless. Does it work to instead test with the following?

// CHECK-NOT: panic
// CHECK-NOT: fail

@AnthonyMikh AnthonyMikh force-pushed the slice_windows_no_bounds_checking branch from 61fca88 to e699e83 Compare October 7, 2020 13:17
@AnthonyMikh
Copy link
Contributor Author

Indeed, it correctly checks with panic and fail as well.

@lcnr
Copy link
Contributor

lcnr commented Oct 7, 2020

@bors r+

@bors
Copy link
Contributor

bors commented Oct 7, 2020

📌 Commit e699e83 has been approved by lcnr

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 7, 2020
@@ -751,7 +752,7 @@ impl<T> [T] {
#[stable(feature = "rust1", since = "1.0.0")]
#[inline]
pub fn windows(&self, size: usize) -> Windows<'_, T> {
assert_ne!(size, 0);
let size = NonZeroUsize::new(size).expect("size is zero");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have guideline about what the phrase used in expect call ?

Suggested change
let size = NonZeroUsize::new(size).expect("size is zero");
let size = NonZeroUsize::new(size).expect("`size` cannot be zero");

Copy link
Contributor Author

@AnthonyMikh AnthonyMikh Oct 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure, but before my change panic message was assertion failed: size != 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is bike-shredding so don't worry much about this.

@AnthonyMikh
Copy link
Contributor Author

AnthonyMikh commented Oct 7, 2020

BTW while inspecting the assembly on order to determine which changed methods benefit from my changes I checked nth_back, and I was surprised to see the panicking functions still remain in resulting binary even with -C opt-level=3. After my changes nth_back looks like so:

// impl DoubleEndedIteator ...
pub fn nth_back(&mut self, n: usize) -> Option<&'a [T]> {
    let (end, overflow) = self.v.len().overflowing_sub(n);
    if end < self.size.get() || overflow {
        self.v = &[];
        None
    } else {
        let ret = &self.v[end - self.size.get()..end];
        self.v = &self.v[..end - 1];
        Some(ret)
    }
}

Here else branch is taken when both conditions in if are false. !overflow means that n <= self.v.len() and thus end <= self.v.len(). !(end < self.size.get()) means end >= self.size.get() (so end - self.size.get() is greater or equal to zero and doesn't underflow) and, since self.size is NonZeroUsize, by transitivity we get end > 0. These two invariants guarantees that both slicings of slice.v are performed in-bounds.

The compiler understands that end > 0 -- in fact, putting assert!(end > 0) in the beginning of else branch doesn't change assembly code at all. However, it fails to understand the first invariant -- putting assert!(end <= self.v.len()) generates assembly with panic-related code. Calling unsafe {core::intrinsics::assume(end <= self.v.len()); } doesn't elide panic-related code (though it merges two code paths to them). I believe that it can be called a compiler bug since both invariants can be established via data-flow analysis. Should I file an issue for this?

@scottmcm
Copy link
Member

scottmcm commented Oct 7, 2020

TBH I was surprised that NonZeroUSize was working for this, since it works reliably for layout optimizations but often doesn't work for code optimization, see #49572

@bors
Copy link
Contributor

bors commented Oct 7, 2020

⌛ Testing commit e699e83 with merge 28928c7...

@bors
Copy link
Contributor

bors commented Oct 7, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: lcnr
Pushing 28928c7 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 7, 2020
@bors bors merged commit 28928c7 into rust-lang:master Oct 7, 2020
@rustbot rustbot added this to the 1.49.0 milestone Oct 7, 2020
@AnthonyMikh AnthonyMikh deleted the slice_windows_no_bounds_checking branch October 8, 2020 08:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code. merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants