Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust fails to deduce enumerate() range #92174

Open
AngelicosPhosphoros opened this issue Dec 21, 2021 · 16 comments
Open

Rust fails to deduce enumerate() range #92174

AngelicosPhosphoros opened this issue Dec 21, 2021 · 16 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@AngelicosPhosphoros
Copy link
Contributor

I tried this code:

pub fn iter_too(v: &[u32]){
    for (i, _) in v.iter().enumerate(){
        assert!(i <= v.len());
    }
}

I expected to see this happen: Assertions is removed because it is always false.

Instead, this happened: Generated code contains loop, check and panic.

example::iter_too:
        lea     rax, [4*rsi + 4]
        mov     rcx, -1
.LBB0_1:
        add     rax, -4
        je      .LBB0_4
        add     rcx, 1
        cmp     rcx, rsi
        jbe     .LBB0_1
        push    rax
        lea     rdi, [rip + .L__unnamed_1]
        lea     rdx, [rip + .L__unnamed_2]
        mov     esi, 30
        call    qword ptr [rip + core::panicking::panic@GOTPCREL]
        ud2
.LBB0_4:
        ret

Surprisingly, compiler is able to understand that i < v.len()

This code

pub fn iter_too(v: &[u32]){
    for (i, _) in v.iter().enumerate(){
        assert!(i < v.len());
    }
}

compiles into

example::iter_too:
        ret

godbolt link

Meta

rustc --version --verbose:

rustc 1.59.0-nightly (91a0600a5 2021-12-18)
binary: rustc
commit-hash: 91a0600a5c22b9d159e3c57526af83e71d1120f8
commit-date: 2021-12-18
host: x86_64-unknown-linux-gnu
release: 1.59.0-nightly
LLVM version: 13.0.0
Compiler returned: 0

stable 1.57 affected too.

@AngelicosPhosphoros AngelicosPhosphoros added the C-bug Category: This is a bug. label Dec 21, 2021
@AngelicosPhosphoros
Copy link
Contributor Author

I got this error when tried to work with such code:

for (i, vi) in arr.iter().enumerate(){
   for (j, vj) in arr[..i].iter().enumerate(){
        ...
   }
}

I wrote minimized version in the issue.

@the8472
Copy link
Member

the8472 commented Dec 21, 2021

silly workaround for the moment:

pub fn iter_too(v: &[u32]){
    for (_, i) in v.iter().zip(0..v.len()) {
        assert!(i <= v.len());
    }
}

@AngelicosPhosphoros
Copy link
Contributor Author

Another workaround:

pub fn iter_too(v: &[u32]){
    for (i, _) in v.iter().enumerate(){
        assert!(i < v.len());
        assert!(i <= v.len());
    }
}

@BGR360
Copy link
Contributor

BGR360 commented Dec 22, 2021

Does this really fall under the category of bug? Seems like it could be considered an enhancement to optimization.

Either way, here's some findings.

The resulting optimized MIR is identical apart from Lt vs Le:

With <

    bb6: {
        StorageLive(_11);                
        _11 = (((_7 as Some).0: (usize, &u32)).0: usize); 
        StorageLive(_12);                
        StorageLive(_13);                
        StorageLive(_14);                
        _14 = _11;                       
        StorageLive(_15);               
        StorageLive(_16);                
        _16 = _1;                        
        _15 = Len((*_16));
        StorageDead(_16);                
        _13 = Lt(move _14, move _15);    // <----
        StorageDead(_15);                
        StorageDead(_14);               
        _12 = Not(move _13);
        StorageDead(_13);                
        switchInt(move _12) -> [false: bb10, otherwise: bb9];
    }

With <=

    bb6: {
        StorageLive(_11);
        _11 = (((_7 as Some).0: (usize, &u32)).0: usize);
        StorageLive(_12);
        StorageLive(_13);
        StorageLive(_14);
        _14 = _11;
        StorageLive(_15);
        StorageLive(_16);
        _16 = _1;
        _15 = Len((*_16));
        StorageDead(_16);
        _13 = Le(move _14, move _15);    // <----
        StorageDead(_15);
        StorageDead(_14);
        _12 = Not(move _13);
        StorageDead(_13);
        switchInt(move _12) -> [false: bb10, otherwise: bb9];
    }

But the resulting LLVM IR is vastly different:

With <

; playground::iter_too
; Function Attrs: nonlazybind uwtable
define void @_ZN10playground8iter_too17he8dda29a6929fd5aE([0 x i32]* noalias nonnull readonly align 4 %v.0, i64 %v.1) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
bb6:
  ret void
}

With <=

; playground::iter_too
; Function Attrs: nonlazybind uwtable
define void @_ZN10playground8iter_too17he8dda29a6929fd5aE([0 x i32]* noalias nonnull readonly align 4 %v.0, i64 %v.1) unnamed_addr #0 personality i32 (i32, i32, i64, %"unwind::libunwind::_Unwind_Exception"*, %"unwind::libunwind::_Unwind_Context"*)* @rust_eh_personality {
start:
  %0 = getelementptr [0 x i32], [0 x i32]* %v.0, i64 0, i64 0
  %1 = getelementptr inbounds [0 x i32], [0 x i32]* %v.0, i64 0, i64 %v.1
  br label %bb4

bb4:                                              ; preds = %bb8, %start
  %iter.sroa.0.0 = phi i32* [ %0, %start ], [ %2, %bb8 ]
  %iter.sroa.7.0 = phi i64 [ 0, %start ], [ %_11.0.i, %bb8 ]
  %_12.i.i = icmp eq i32* %iter.sroa.0.0, %1
  br i1 %_12.i.i, label %bb6, label %bb8

bb6:                                              ; preds = %bb4
  ret void

bb8:                                              ; preds = %bb4
  %_11.0.i = add nuw nsw i64 %iter.sroa.7.0, 1
  %2 = getelementptr inbounds i32, i32* %iter.sroa.0.0, i64 1
  %_16.not = icmp ugt i64 %iter.sroa.7.0, %v.1
  br i1 %_16.not, label %bb9, label %bb4

bb9:                                              ; preds = %bb8
; call core::panicking::panic
  tail call void @_ZN4core9panicking5panic17h0ba7146865b2f9d6E([0 x i8]* noalias nonnull readonly align 1 bitcast (<{ [30 x i8] }>* @alloc19 to [0 x i8]*), i64 30, %"core::panic::location::Location"* noalias readonly align 8 dereferenceable(24) bitcast (<{ i8*, [16 x i8] }>* @alloc21 to %"core::panic::location::Location"*)) #2
  unreachable
}

So I think maybe this points to a problem with constant evaluation.

@rustbot label +A-const-eval +T-compiler

@rustbot rustbot added A-const-eval Area: constant evaluation (mir interpretation) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 22, 2021
@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. and removed C-bug Category: This is a bug. A-const-eval Area: constant evaluation (mir interpretation) labels Dec 22, 2021
@nikic
Copy link
Contributor

nikic commented Dec 22, 2021

The basic problem here is that the loop has two induction variables, a counter and a pointer. The loop condition is on the pointer, and SCEV apparently is unable to determine the correlation between them in this case. Full IR for reference: https://llvm.godbolt.org/z/P8a8nr714

@the8472
Copy link
Member

the8472 commented Dec 22, 2021

The basic problem here is that the loop has two induction variables, a counter and a pointer.

I think that could be improved on the libs side. At least when iter.for_each() is used instead of for _ in

@AngelicosPhosphoros
Copy link
Contributor Author

clang fails to optimize it too.

#include <cstddef>
#include <vector>

void do_checks(const std::vector<int> v){
    size_t idx = 0;
    for(auto& item : v){
        v.at(idx);
        ++idx;
    }
}

compiles to

do_checks(std::vector<int, std::allocator<int> >):          # @do_checks(std::vector<int, std::allocator<int> >)
        push    rax
        mov     rax, qword ptr [rdi]
        mov     rcx, qword ptr [rdi + 8]
        mov     rdx, rcx
        sub     rdx, rax
        je      .LBB0_4
        sar     rdx, 2
        mov     rsi, -1
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
        add     rsi, 1
        cmp     rdx, rsi
        je      .LBB0_5
        add     rax, 4
        cmp     rax, rcx
        jne     .LBB0_2
.LBB0_4:
        pop     rax
        ret
.LBB0_5:
        mov     edi, offset .L.str
        mov     rsi, rdx
        xor     eax, eax
        call    std::__throw_out_of_range_fmt(char const*, ...)
.L.str:
        .asciz  "vector::_M_range_check: __n (which is %zu) >= this->size() (which is %zu)"

@AngelicosPhosphoros
Copy link
Contributor Author

@the8472

I think that could be improved on the libs side.

I think, it is not optimal approach. We would need specializations of Enumerate<I> for all possible cases and still fail in some cases. LLVM-side optimization would be more robust.

@the8472
Copy link
Member

the8472 commented Dec 22, 2021

It's actually not that difficult, we already have a specialization trait that should cover most cases.

@AngelicosPhosphoros
Copy link
Contributor Author

This still would increase complexity and harm maintainability. It is fine to fix the issue that way but I think we should try to fix LLVM first and increase complexity of library only if we fail.

@BGR360
Copy link
Contributor

BGR360 commented Dec 23, 2021

@nikic or @AngelicosPhosphoros can either of you file the appropriate LLVM issue?

@AngelicosPhosphoros
Copy link
Contributor Author

Done.
llvm/llvm-project#52851

@BGR360
Copy link
Contributor

BGR360 commented Dec 24, 2021

Interesting @AngelicosPhosphoros in the example you posted on the LLVM issue, you used < and it still didn't remove the bounds check. In the Rust example, using < resulted in well-optimized code, but <= didn't. What could be causing this difference?

@the8472
Copy link
Member

the8472 commented Apr 27, 2022

The linked LLVM issue was closed and the commit that closed it can be found in the current rust-lang/llvm-project tree, so it should be available on nightly.

But the reproducer still shows the panic on godbolt

@AngelicosPhosphoros
Copy link
Contributor Author

Filed another issue to mainstream: llvm/llvm-project#56612

@nikic
Copy link
Contributor

nikic commented Apr 3, 2023

Still a problem with LLVM 16.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants