Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Get LLVM to stop generating dead assembly in next_power_of_two #42556

Merged
merged 2 commits into from Jun 10, 2017

Conversation

Projects
None yet
6 participants
@scottmcm
Copy link
Member

scottmcm commented Jun 9, 2017

It turns out that LLVM can turn @llvm.ctlz.i64(_, true) into @llvm.ctlz.i64(_, false) (ctlz) where valuable, but never does the opposite. That leads to some silly assembly getting generated in certain cases.

A contrived-but-clear example https://is.gd/VAIKuC:

fn foo(x:u64) -> u32 {
    if x == 0 { return !0; }
    x.leading_zeros()
}

Generates

	testq	%rdi, %rdi
	je	.LBB0_1
	je	.LBB0_3    ; <-- wha?
	bsrq	%rdi, %rax
	xorq	$63, %rax
	retq
.LBB0_1:
	movl	$-1, %eax
	retq
.LBB0_3:
	movl	$64, %eax  ; <-- dead
	retq

I noticed this in next_power_of_two, which without this PR generates the following:

	cmpq	$2, %rcx
	jae	.LBB1_2
	movl	$1, %eax
	retq
.LBB1_2:
	decq	%rcx
	je	.LBB1_3
	bsrq	%rcx, %rcx
	xorq	$63, %rcx
	jmp	.LBB1_5
.LBB1_3:
	movl	$64, %ecx  ; <-- dead
.LBB1_5:
	movq	$-1, %rax
	shrq	%cl, %rax
	incq	%rax
	retq

And with this PR becomes

	cmpq	$2, %rcx
	jae	.LBB0_2
	movl	$1, %eax
	retq
.LBB0_2:
	decq	%rcx
	bsrq	%rcx, %rcx
	xorl	$63, %ecx
	movq	$-1, %rax
	shrq	%cl, %rax
	incq	%rax
	retq
@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Jun 9, 2017

r? @BurntSushi

(rust_highfive has picked a reviewer for you, use r? to override)

@@ -280,6 +280,11 @@ pub fn trans_intrinsic_call<'a, 'tcx>(bcx: &Builder<'a, 'tcx>,
let llfn = ccx.get_intrinsic(&format!("llvm.{}.i{}", name, width));
bcx.call(llfn, &[llargs[0], y], None)
}
"ctlz_nonzero" | "cttz_nonzero" => {
let y = C_bool(bcx.ccx, true);
let llfn = ccx.get_intrinsic(&format!("llvm.{}.i{}", &name[..4], width));

This comment has been minimized.

@kennytm

kennytm Jun 9, 2017

Member
[00:03:22] tidy error: /checkout/src/librustc_trans/intrinsic.rs:285: line longer than 100 chars

This comment has been minimized.

@scottmcm

scottmcm Jun 9, 2017

Author Member

Yup; currently waiting on local double-check for the fix. Thanks for the targeted comment!

scottmcm added some commits May 28, 2017

Add ctlz_nonzero & cttz_nonzero intrinsics
LLVM currently doesn't remove the "bypass if argument is zero" assembly inside branches where the value is known to be non-zero, pessimizing code that uses uN::leading_zeros

@scottmcm scottmcm force-pushed the scottmcm:ctz-nz branch from c13a31a to 6d86f0c Jun 9, 2017

@BurntSushi

This comment has been minimized.

Copy link
Member

BurntSushi commented Jun 9, 2017

LGTM. Thanks! @bors r+

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jun 9, 2017

📌 Commit 6d86f0c has been approved by BurntSushi

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jun 10, 2017

⌛️ Testing commit 6d86f0c with merge 995f741...

bors added a commit that referenced this pull request Jun 10, 2017

Auto merge of #42556 - scottmcm:ctz-nz, r=BurntSushi
Get LLVM to stop generating dead assembly in next_power_of_two

It turns out that LLVM can turn `@llvm.ctlz.i64(_, true)` into `@llvm.ctlz.i64(_, false)` ([`ctlz`](http://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic)) where valuable, but never does the opposite.  That leads to some silly assembly getting generated in certain cases.

A contrived-but-clear example https://is.gd/VAIKuC:
```rust
fn foo(x:u64) -> u32 {
    if x == 0 { return !0; }
    x.leading_zeros()
}
```
Generates
```asm
	testq	%rdi, %rdi
	je	.LBB0_1
	je	.LBB0_3    ; <-- wha?
	bsrq	%rdi, %rax
	xorq	$63, %rax
	retq
.LBB0_1:
	movl	$-1, %eax
	retq
.LBB0_3:
	movl	$64, %eax  ; <-- dead
	retq
```

I noticed this in `next_power_of_two`, which without this PR generates the following:
```asm
	cmpq	$2, %rcx
	jae	.LBB1_2
	movl	$1, %eax
	retq
.LBB1_2:
	decq	%rcx
	je	.LBB1_3
	bsrq	%rcx, %rcx
	xorq	$63, %rcx
	jmp	.LBB1_5
.LBB1_3:
	movl	$64, %ecx  ; <-- dead
.LBB1_5:
	movq	$-1, %rax
	shrq	%cl, %rax
	incq	%rax
	retq
```

And with this PR becomes
```asm
	cmpq	$2, %rcx
	jae	.LBB0_2
	movl	$1, %eax
	retq
.LBB0_2:
	decq	%rcx
	bsrq	%rcx, %rcx
	xorl	$63, %ecx
	movq	$-1, %rax
	shrq	%cl, %rax
	incq	%rax
	retq
```
@bors

This comment has been minimized.

Copy link
Contributor

bors commented Jun 10, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: BurntSushi
Pushing 995f741 to master...

@bors bors merged commit 6d86f0c into rust-lang:master Jun 10, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details

@scottmcm scottmcm deleted the scottmcm:ctz-nz branch Jun 21, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.