Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redundant bounds check is not elided #58388

Open
FaultyRAM opened this issue Feb 12, 2019 · 2 comments
Open

Redundant bounds check is not elided #58388

FaultyRAM opened this issue Feb 12, 2019 · 2 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@FaultyRAM
Copy link

Nightly Playground link

Splitting a slice in two such that left.len() <= right.len(), then looping left.len() times over the elements of left and right cannot overflow, but a bounds check is emitted for right[i] regardless:

pub fn foo(bytes: &[u8]) -> u8 {
    let (left, right) = bytes.split_at(bytes.len() >> 1);
    let mut total: u8 = 0;
    for i in 0..left.len() {
        let a = left[i];
        let b = right[i];
        let n = a.wrapping_add(b);
        total = total.wrapping_add(n);
    }
    total
}
playground::foo: # @playground::foo
# %bb.0:
	pushq	%rax
	movq	%rsi, %rcx
	shrq	%rcx
	je	.LBB0_1
# %bb.3:
	movq	%rsi, %r8
	subq	%rcx, %r8
	leaq	(%rdi,%rcx), %r9
	xorl	%eax, %eax
	xorl	%esi, %esi

.LBB0_4:                                # =>This Inner Loop Header: Depth=1
	cmpq	%r8, %rsi
	jae	.LBB0_6
# %bb.5:                                #   in Loop: Header=BB0_4 Depth=1
	addb	(%rdi,%rsi), %al
	addb	(%r9,%rsi), %al
	leaq	1(%rsi), %rdx
	movq	%rdx, %rsi
	cmpq	%rcx, %rdx
	jb	.LBB0_4
# %bb.2:
                                        # kill: def $al killed $al killed $rax
	popq	%rcx
	retq

.LBB0_1:
	xorl	%eax, %eax
                                        # kill: def $al killed $al killed $rax
	popq	%rcx
	retq

.LBB0_6:
	leaq	.Lanon.f6dd5f9df35562231dce3b61698e08b8.0(%rip), %rdi
	movq	%r8, %rdx
	callq	*core::panicking::panic_bounds_check@GOTPCREL(%rip)
	ud2
                                        # -- End function

str.0:
	.ascii	"src/lib.rs"

.Lanon.f6dd5f9df35562231dce3b61698e08b8.0:
	.quad	str.0
	.quad	10                      # 0xa
	.long	6                       # 0x6
	.long	17                      # 0x11
@jonas-schievink jonas-schievink added I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 12, 2019
@scottmcm
Copy link
Member

FWIW, the usual hack mostly fixes this:

pub fn foo(bytes: &[u8]) -> u8 {
    let (left, right) = bytes.split_at(bytes.len() >> 1);
    let right = &right[..left.len()]; // <--
    let mut total: u8 = 0;
    for i in 0..left.len() {
        let a = left[i];
        let b = right[i];
        let n = a.wrapping_add(b);
        total = total.wrapping_add(n);
    }
    total
}

A simpler repro:

pub fn foo(x: &[i32]) -> &[i32] {
    let (a, b) = x.split_at(x.len() / 2);
    &b[..a.len()]
}

Probably an LLVM issue, not something practical to fix in Rust.

@nikic
Copy link
Contributor

nikic commented Mar 25, 2019

Minimal IR test case:

define i1 @test(i32 %x) {
  %y = lshr i32 %x, 1
  %z = sub i32 %x, %y
  %r = icmp ult i32 %z, %y
  ret i1 %r
}

Here %r should fold to i1 false.

While it would be easy to add an instcombine pattern for this specific case, I don't see an obvious way to express this as a more generic optimization...

@nikic nikic added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Apr 13, 2019
@jonas-schievink jonas-schievink added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Mar 17, 2020
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants