Missing optimization with signed pointer offset #56057

TheIronBorn · 2018-11-19T02:13:30Z

I am trying to elide the pointer offset of a slice indexing operation.

I tried this code:

pub fn index(table: &[u128; 4], idx: i32) -> u128 {
    table[(idx as usize & 0b11_0000) >> 4]
}

with RUST_BACKTRACE=full RUSTFLAGS='--emit=asm' cargo build --release.

I expected to see this happen:

example::index:
  and esi, 48
  mov rax, qword ptr [rsi + rdi]
  mov rdx, qword ptr [rsi + rdi + 8]
  ret

(selects two bits, already in the pointer offset position)

Instead, this happened:

example::index:
  shr esi, 4
  and esi, 3
  shl rsi, 4
  mov rax, qword ptr [rdi + rsi]
  mov rdx, qword ptr [rdi + rsi + 8]
  ret

A godbolt link for comparison with an unsafe version which does apply the optimization: https://godbolt.org/z/0QsA3z
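The godbolt snippet is not reproduced in this thread, so the following is only a sketch of what such an unsafe version might look like. The function name `index_unchecked` and the raw-pointer approach are assumptions, not copied from the link. It keeps the masked bits in place and uses them directly as a byte offset:

```rust
/// Hypothetical unsafe variant (an assumption, not taken from the godbolt link):
/// mask the index into a byte offset and read the element through a raw pointer.
pub fn index_unchecked(table: &[u128; 4], idx: i32) -> u128 {
    // Bits 4 and 5 of `idx` select the element. Left in place they already
    // form the byte offset (0, 16, 32 or 48), because size_of::<u128>() == 16.
    let byte_offset = (idx as usize) & 0b11_0000;
    // SAFETY: byte_offset is a multiple of 16 and at most 48, so the 16-byte
    // read stays inside the 64-byte array and the pointer stays aligned.
    unsafe {
        let base = table.as_ptr() as *const u8;
        (base.add(byte_offset) as *const u128).read()
    }
}
```

For every `idx` this should return the same element as the safe `index` above; only the generated address computation is expected to differ.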

Meta

rustc --version --verbose
rustc 1.32.0-nightly (6b9b97bd9 2018-11-15)
binary: rustc
commit-hash: 6b9b97bd9b704f85f0184f7a213cc4d62bd9654c
commit-date: 2018-11-15
host: x86_64-apple-darwin
release: 1.32.0-nightly
LLVM version: 8.0

Backtrace:
none

TheIronBorn · 2018-11-19T02:15:35Z

Note that

pub fn index(table: &[u128; 4], idx: usize) -> u128 {
    table[(idx & 0b11_0000) >> 4]
}

does apply the optimization. The compiler seems to be worried about the sign-bit, despite the bit-and.

nikic · 2018-12-01T12:07:54Z

It seems that the problematic factor here is not the signedness, but the integer size. If the index uses isize, it optimizes as expected. With i32 there is an extra zext that seems to inhibit this optimization.
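For anyone who wants to reproduce the comparison, here is a minimal sketch with the three index types side by side. The function names are made up for illustration. Per the comments above, only the i32 variant is reported to keep the shr/and/shl sequence; the exact output may vary with the rustc and LLVM version.

```rust
// i32 index: the form from the original report (extra zext, fold is missed).
pub fn index_i32(table: &[u128; 4], idx: i32) -> u128 {
    table[(idx as usize & 0b11_0000) >> 4]
}

// isize index: signed but pointer-sized, reported to optimize as expected.
pub fn index_isize(table: &[u128; 4], idx: isize) -> u128 {
    table[(idx as usize & 0b11_0000) >> 4]
}

// usize index: unsigned and pointer-sized, reported to optimize as expected.
pub fn index_usize(table: &[u128; 4], idx: usize) -> u128 {
    table[(idx & 0b11_0000) >> 4]
}
```

All three select the same element for the same bit pattern in bits 4 and 5, so any difference in the emitted assembly comes from the integer width rather than from the semantics.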

nikic · 2018-12-01T20:03:33Z

Just looked into this... With usize the relevant part of SelectionDAG looks like

      t2: i64,ch = CopyFromReg t0, Register:i64 %0
            t4: i64,ch = CopyFromReg t0, Register:i64 %1
          t7: i64 = srl t4, Constant:i8<4>
        t9: i64 = and t7, Constant:i64<3>
      t10: i64 = shl t9, Constant:i64<4>
    t11: i64 = add t2, t10

and is DAGCombined into

      t2: i64,ch = CopyFromReg t0, Register:i64 %0
        t4: i64,ch = CopyFromReg t0, Register:i64 %1
      t26: i64 = and t4, Constant:i64<48>
    t11: i64 = add t2, t26

With i32 instead we have

    t2: i64,ch = CopyFromReg t0, Register:i64 %0
            t4: i32,ch = CopyFromReg t0, Register:i32 %1
          t7: i32 = srl t4, Constant:i8<4>
        t9: i32 = and t7, Constant:i32<3>
      t10: i64 = zero_extend t9
    t12: i64 = shl t10, Constant:i64<4>
  t13: i64 = add t2, t12

The additional zero_extend between the shl and the and inhibits the optimization.

The relevant combine for this is https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L6177.
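As a sanity check that the fold would still be valid with the zero_extend in the middle, the two offset computations can be compared directly. This is a small illustrative sketch; the helper names are hypothetical and it models only the arithmetic, not the DAGCombiner itself.

```rust
/// Offset as computed in the i32 case: shift and mask in 32 bits,
/// zero-extend to 64 bits, then scale by 16.
pub fn offset_unfolded(x: u32) -> u64 {
    (((x >> 4) & 3) as u64) << 4
}

/// Offset after the desired fold: a single mask, then zero-extend.
pub fn offset_folded(x: u32) -> u64 {
    (x & 48) as u64
}
```

Both functions only ever look at bits 4 and 5 of `x`, so they agree for every input, including values with the top bit set.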

nikic · 2018-12-01T20:59:57Z

Reported as https://bugs.llvm.org/show_bug.cgi?id=39855.

steveklabnik · 2019-12-07T18:13:28Z

triage: playground still reports

playground::index:
	shrl	$4, %esi
	andl	$3, %esi
	shlq	$4, %rsi
	movq	(%rdi,%rsi), %rax
	movq	8(%rdi,%rsi), %rdx
	retq

nikic added the A-LLVM and I-slow labels on Dec 1, 2018

Nilstrieb added the T-compiler label on Apr 5, 2023
