Unfortunate optimization with Option<bool> vs Option<uint> #20149

alexcrichton · 2014-12-23T00:11:15Z

Currently, given this program:

#[inline(never)]
fn get_bool(slot: &mut Option<bool>) -> &mut bool {
    if slot.is_none() {
        *slot = Some(true);
    }
    slot.as_mut().unwrap()
}

#[inline(never)]
fn get_uint(slot: &mut Option<uint>) -> &mut uint {
    if slot.is_none() {
        *slot = Some(1);
    }
    slot.as_mut().unwrap()
}

fn main() {
    println!("{}", get_bool(&mut None));
    println!("{}", get_uint(&mut None));
}

It generates this IR.

The "unfortunate" part here is that a call to failure is generated for the get_bool function where no failure is generated for the get_uint function.

There's a lot going on here so I'm not sure if it's our codegen or LLVM's codegen, but I would personally expect both cases to optimize down to never unwinding because it's known that neither Option can ever be None.

The text was updated successfully, but these errors were encountered:

alexcrichton · 2014-12-23T00:11:36Z

cc @luqmana, @dotdash, @Aatch

dotdash · 2014-12-23T00:22:27Z

Seems to happen for all integer types smaller than int/uint

dotdash · 2014-12-23T00:28:48Z

Noticable difference is that the uint version does two independent stores, one for the discriminant and one for the value itself, but the versions with the smaller data types have a store for the whole Option value at once:

  store %"enum.core::option::Option<[bool]>[#3]" { i8 1, [0 x i8] undef, [1 x i8] c"\01" }, %"enum.core::option::Option<[bool]>[#3]"* %0, align 1

And I guess the pass responsible for optimizing the icmp away can't handle that.

Since we originally emit independent stores, I guess this should be handled in LLVM upstream.

dotdash · 2014-12-23T00:39:36Z

OTOH I can't reproduce the problem with clang. So maybe we could do something about the way unions are represented in LLVM types.

dotdash · 2014-12-23T00:51:09Z

OK, it's something completely different. The assignment happens via store for Option<bool> etc. but via memcpy for Option<uint>. Only the memcpy gets optimized.

Aatch · 2014-12-23T02:31:58Z

Because Option<bool> is only two bytes, it's an immediate. Hence why we don't use memcpy to copy it. I don't know why LLVM can optimize one and not the other though. All I can think is that the ordering of the passes is such that it combines the stores for the bool case before LLVM realises that the panic case is impossible.

Aatch · 2014-12-23T02:51:23Z

There is a comment in expr.rs that seems suspicious. Commenting out the line causes get_bool to optimised as expected. Since it references morestack, I'm curious if the comment is still accurate today, as it suggests that it may be related to segmented stacks.

Maybe whoever wrote the comment could chime in? The library split has made it difficult to find out who though.

Aatch · 2014-12-23T02:59:07Z

This commit added the comment in question: ee4ba449

@nikomatsakis can you confirm that it was related to segmented stacks?

dotdash · 2015-01-02T21:36:25Z

http://reviews.llvm.org/D6829 -- This teaches LLVM to handle the aggregate and extract the loaded value from it, allowing it to optimize the code even without #20161. It also handles the example I presented there where the deferred load breaks an optimization.

alexcrichton · 2015-01-02T21:40:05Z

Wow, thanks so much for looking into this @dotdash!

Currently, small aggregates are passed to functions as immediate values as is. This has two consequences. One is that aggregates are passed component-wise by LLVM, so e.g. a struct containing four u8 values (e.g. an RGBA struct) will be passed as four individual values. The other is that LLVM isn't very good at optimizing loads/stores of first class attributes. What clang does is converting the aggregate to an appropriately sized integer type (e.g. i32 for the four u8 values), and using that for the function argument. This allows LLVM to create code that is a lot better. Fixes rust-lang#20450 rust-lang#20149 rust-lang#16506 rust-lang#13927

Currently, small aggregates are passed to functions as immediate values as is. This has two consequences. One is that aggregates are passed component-wise by LLVM, so e.g. a struct containing four u8 values (e.g. an RGBA struct) will be passed as four individual values. The other is that LLVM isn't very good at optimizing loads/stores of first class attributes. What clang does is converting the aggregate to an appropriately sized integer type (e.g. i32 for the four u8 values), and using that for the function argument. This allows LLVM to create code that is a lot better. Fixes #20450 #20149 #16506 #13927

huonw · 2015-01-11T09:07:21Z

Closed by #20755.

alexcrichton added the A-codegen Area: Code generation label Dec 23, 2014

Aatch mentioned this issue Dec 23, 2014

Defer immediate loads in rvalue expressions #20161

Closed

dotdash mentioned this issue Jan 3, 2015

Stop using first class aggregates #20450

Closed

dotdash mentioned this issue Jan 8, 2015

Avoid loads/stores of first class aggregates #20755

Merged

huonw closed this as completed Jan 11, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unfortunate optimization with Option<bool> vs Option<uint> #20149

Unfortunate optimization with Option<bool> vs Option<uint> #20149

alexcrichton commented Dec 23, 2014

alexcrichton commented Dec 23, 2014

dotdash commented Dec 23, 2014

dotdash commented Dec 23, 2014

dotdash commented Dec 23, 2014

dotdash commented Dec 23, 2014

Aatch commented Dec 23, 2014

Aatch commented Dec 23, 2014

Aatch commented Dec 23, 2014

dotdash commented Jan 2, 2015

alexcrichton commented Jan 2, 2015

huonw commented Jan 11, 2015

Unfortunate optimization with Option<bool> vs Option<uint> #20149

Unfortunate optimization with Option<bool> vs Option<uint> #20149

Comments

alexcrichton commented Dec 23, 2014

alexcrichton commented Dec 23, 2014

dotdash commented Dec 23, 2014

dotdash commented Dec 23, 2014

dotdash commented Dec 23, 2014

dotdash commented Dec 23, 2014

Aatch commented Dec 23, 2014

Aatch commented Dec 23, 2014

Aatch commented Dec 23, 2014

dotdash commented Jan 2, 2015

alexcrichton commented Jan 2, 2015

huonw commented Jan 11, 2015