
Add an expect intrinsic #1131

Merged
merged 2 commits into rust-lang:master on Jun 10, 2015

Conversation

@Aatch
Contributor

Aatch commented May 20, 2015

`if expect(foo == bar, false) { .. }`.

The expected value is required to be a constant value.
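As a rough sketch of what the proposal describes (the helper below is only a placeholder standing in for the intrinsic; a real implementation would lower to llvm.expect rather than being an ordinary function):

#[inline(always)]
fn expect(value: bool, expected: bool) -> bool {
    let _ = expected; // the hint is consumed by the optimizer, not at runtime
    value
}

fn lookup(foo: u32, bar: u32) -> u32 {
    // Hint that the equality is usually false, so codegen can move the
    // rarely-taken block out of the hot path.
    if expect(foo == bar, false) {
        0 // cold path
    } else {
        foo.wrapping_sub(bar) // hot path
    }
}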

@seanmonstar

seanmonstar May 20, 2015

Contributor

I assume you mean to use something like llvm.expect.i8 and convert the booleans to i8s, but this could be included in the Detailed Design.

Likewise, when I tried this myself by creating extern fn expect_i8 etc. functions and linking against LLVM, I couldn't find the optimizations in the asm. It'd be nice to see, in this RFC, the improvements that such a function could possibly bring.

@Aatch

Aatch May 20, 2015

Author Contributor

@seanmonstar actually, you can use llvm.expect.i1. As for improvements, that's hard to show since benchmark code tends to warm up the prediction table pretty quickly. It also isn't always visible in the asm if the code was already structured such that the blocks were ordered appropriately.

@seanmonstar

seanmonstar May 20, 2015

Contributor

Right, but we don't have an i1; the smallest integer type in Rust is i8.

True about the benchmarks. I meant the compiled asm having the jump altered, before and after using expect.

@Aatch

Aatch May 20, 2015

Author Contributor

@seanmonstar actually, bools are now treated as i1s when generating LLVM IR.

@seanmonstar

seanmonstar May 20, 2015

Contributor

Ah, very cool. That confusion may be even more reason to include the implementation details.

@Aatch

Aatch May 20, 2015

Author Contributor

Oh, and I left it out of the detailed design because I don't think the precise implementation details for a specific backend are important. Just the semantics of the intrinsic.

@seanmonstar

Contributor

seanmonstar commented May 20, 2015

However, 👍 assuming there is a gain to be found in there. This is one of the bigger pieces missing so that httparse can be as fast as picohttpparser.

@nrc nrc added the T-lang label May 20, 2015

@nrc

Member

nrc commented May 20, 2015

This is something I'd like to see in Rust. However, the suggested 'syntax' seems weird (though somewhat traditional and, I'm guessing, close to LLVM) from both a use and an implementation point of view: why is the intrinsic on the condition when the hint is about which branch is taken? If we start doing our own optimisations, I assume we'd have to be careful about any kind of code motion here in order to not break hinting to LLVM.

Could we instead have an attribute on the if expression? Or could we make guarantees not to re-order branches and rely on llvm always treating the if case as more likely than the else case? (I realise that is not very discoverable, but this is a feature for advanced developers).

@pythonesque

Contributor

pythonesque commented May 20, 2015

why is the intrinsic on the condition when the hint is about which branch is taken

IIRC, this is because expect is somewhat more general than likely / unlikely (so you can tell LLVM to expect a value even if a jump isn't immediately dependent on it).

I'm fully in favor of this RFC... this is very important for getting optimal codegen in many cases and I've personally been wishing for it. Right now I've been resorting to #[cold] to get the desired behavior in a few cases, but it's too blunt of an instrument to use everywhere.
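A hypothetical sketch of that generality: the expectation is attached to a value (as LLVM's llvm.expect allows for integers), not only to a branch condition. None of these names come from the RFC text; they are purely illustrative.

fn expect_u32(value: u32, expected: u32) -> u32 {
    let _ = expected; // a real intrinsic would forward this hint to the optimizer
    value
}

fn dispatch(tag: u32) -> &'static str {
    // The expectation rides on the value; whatever branch eventually consumes
    // it can still benefit, even though no jump appears directly on the call.
    match expect_u32(tag, 0) {
        0 => "common case",
        1 => "uncommon case",
        _ => "rare case",
    }
}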

@nagisa

Contributor

nagisa commented May 20, 2015

fn(bool, bool) -> bool does not make it easy to “expect” a particular variant of an enum unless is_variant methods are provided.

@quantheory

Contributor

quantheory commented May 20, 2015

To me, the fact that a constant expression is necessary seems to argue for likely/unlikely. (Or maybe for making it an expect! macro, which can insert something to assign the RHS to a constant, and thus force the compiler to prove that it gets a constant expression.) In any case, the RFC should probably at least mention what happens if you incorrectly use a non-constant expression as an argument to expect. I.e., is that an error that the compiler detects, or is the hint just ignored?

@nagisa The enum variant case could be done with dependent types, sort of:

fn expect<const C: T, T = bool>(expr: T) -> T {
    /* ... */
    expr
}
if expect::<true>( /* some expr */ ) { /* ... */ }
match expect::<Variant1, MyEnum>( /* some expr */ ) {
    Variant1 => { /* ... */ }
    _ => { /* ... */ }
}

But that's a bit clunky and doesn't deal with patterns (e.g. if you want to state that a branch for Variant1 | Variant2 is likely). I think that match patterns probably could only be dealt with as in @nrc's solution, i.e. somehow annotating the construct that's doing the branching, rather than an expression used in the condition being tested.

Edit: My original example didn't make any sense at all. Hopefully it's better now.

@Aatch

Contributor Author

Aatch commented May 20, 2015

@quantheory yeah, I'd be happy with likely/unlikely too, it's pretty much how we handled similar restrictions for atomic operations (since the ordering has to be a constant).

@nrc Ok, where would the attribute go? On the if itself? Where would you put it if you wanted to hint an else if branch? What if you're taking advantage of short-circuiting and only want to hint the second part of an and or an or? The only reasonable place is to add the attribute to the condition, since you're not hinting the branch, you're hinting the likelihood of the condition the branch is dependent on. Unconditional branches don't need hinting.

@nagisa most of the time use of branch hinting doesn't help at all. The fact that you can't hint about match expressions isn't really a concern I have. This is quite simply to help tune code where the penalty of a misprediction can end up quite large. Most of the time, the only difference this will make to codegen is block reordering (for example, we hint the overflow checks as being unlikely, which pushes the panics down to the bottom of the function).
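An illustrative sketch of the overflow-check pattern described above, with a stand-in unlikely helper (the real thing would be an intrinsic that lowers to llvm.expect):

fn unlikely(b: bool) -> bool { b } // stand-in; a real intrinsic hints LLVM

fn add_checked(a: u32, b: u32) -> u32 {
    let (sum, overflowed) = a.overflowing_add(b);
    // Marking the check as unlikely lets codegen sink the panic to the bottom
    // of the function, keeping the hot path straight-line.
    if unlikely(overflowed) {
        panic!("attempt to add with overflow");
    }
    sum
}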

@llogiq

Contributor

llogiq commented May 21, 2015

@quantheory 👍 for likely(b: bool)/unlikely(b: bool) – it's certainly more readable than if expect!(my_predicate, true) { … }, does not need any special handling and will be completely sufficient to get the majority of the (rare) cases where giving the compiler a hint about branches matters.
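For comparison, the two spellings being weighed here, written out with placeholder helpers (whether these end up as intrinsics or thin wrappers is exactly what the thread is deciding):

fn likely(b: bool) -> bool { b }   // hint: `b` is usually true
fn unlikely(b: bool) -> bool { b } // hint: `b` is usually false
fn expect(value: bool, expected: bool) -> bool { let _ = expected; value }

fn demo(my_predicate: bool) -> u32 {
    // The readability argument: the first form states the expectation directly,
    // the second encodes it as a constant argument.
    if unlikely(my_predicate) { 1 } else { 0 }
    // versus: if expect(my_predicate, false) { 1 } else { 0 }
}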

@nrc

Member

nrc commented May 21, 2015

@Aatch Seems reasonable to have the annotation on the condition. What about the idea of always assuming the if branch is the expected one? Do you think that is impractical?

@seanmonstar

Contributor

seanmonstar commented May 21, 2015

I'd rather it not be an attribute, as that would make it harder when an if statement includes different conditions with different likelihoods. Take this example: https://github.com/h2o/picohttpparser/blob/master/picohttpparser.c#L173
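A small sketch of that situation in Rust (helpers hypothetical, and only loosely modelled on the linked C): two sub-conditions of one if carry opposite hints, which a single attribute on the whole if could not express.

fn likely(b: bool) -> bool { b }
fn unlikely(b: bool) -> bool { b }

fn is_acceptable_byte(byte: u8) -> bool {
    // Printable ASCII is the common case; the tolerated control byte is rare.
    if likely(byte >= 0x20 && byte < 0x7f) || unlikely(byte == b'\t') {
        true
    } else {
        false
    }
}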

@Stebalien

Contributor

Stebalien commented May 22, 2015

@nrc

What about the idea of always assuming the if branch is the expected one? Do you think that is impractical?

That would lead to bad codegen when checking error conditions:

if some_unexpected_error_condition {
    return Err(...);
}

Aside:

It would also be nice to get likely/unlikely attributes for enum variants:

enum Result<V, E> {
    #[likely]
    Ok(V),
    #[unlikely]
    Err(E)
}

These wouldn't override an explicit likely/unlikely call on a branch but would provide sane defaults. Unfortunately, this would likely take quite a bit more work to implement.

@Aatch

Contributor Author

Aatch commented May 22, 2015

@nrc having the attribute on the condition is ultimately identical to having a function call wrapping the condition, no? I'd argue that an attribute on the condition itself would be really unclear, especially since you'd probably need to wrap the condition in parens anyway to be sure that it's applying to the correct expression.

What this means is that instead of if likely(foo == bar) { .. } you have if #[likely] (foo == bar) { .. }. How is that any better?

@seanmonstar

Contributor

seanmonstar commented May 22, 2015

I really like the attribute on variants idea by @Stebalien

@nrc

Member

nrc commented May 22, 2015

@Aatch sorry, I was asking about the idea of always assuming the if branch is expected, not the attribute idea. But I think @Stebalien answered pretty well. I'm pretty much convinced that the suggested path is the best one, but I just want to check there isn't some way the implicit solution could work - could we guarantee that if there is an if and else branch then we expect the if branch otherwise there is no expectation? AIUI, this is the way that optimising compilers and branch prediction hardware work (plus a whole bunch of heuristics on top of that). I imagine that any level of guarantees for this kind of optimisation is just making things complicated between Rust and LLVM though. I guess I'm just a little bit afraid of preserving behaviour here if we add our own optimising passes in the future - having a special case function call that the compiler needs to treat differently to everything else scares me.

@Aatch

Contributor Author

Aatch commented May 22, 2015

@nrc, I don't think we should have any guarantees for this feature beyond "doesn't change the observable behaviour of the function", meaning we could just ignore it if we wanted without technically doing anything wrong. In fact, there's no guarantee that it will do anything anyway. LLVM might have ordered the blocks the same way with or without the hint.

I guess what I mean is that there isn't any behaviour to preserve for branch hinting. The point is that stuff like ordering blocks is pretty much arbitrary (especially in LLVM IR where you have to have a terminator, no fallthrough) so compilers make a best guess at how to do it. This just gives them another piece of information they can use to do so. If using expect, likely or unlikely changes what your code actually does, that's a bug.

@bluss


bluss commented May 22, 2015

It would also be nice to get likely/unlikely attributes for enum variants:

@Stebalien These hints shouldn't be used in a blanket way like this. The default should always be to let the branch prediction do its job.

likely and unlikely seem like the most readable API. I would like it if the eventual docs recommend profiling and not using these indiscriminately.

I'm skeptical of this. Are we giving programmers the illusion of control? Isn't profile-guided optimization the correct approach? It also appears that branch prediction hints are being made obsolete by newer microarchitectures.

@bluss


bluss commented May 22, 2015

I tested the basic benchmark in picohttpparser with and without the likely/unlikely hints, just to check whether it makes a significant difference. I don't know the specifics of this code. Both clang-3.5 and gcc-4.9 reach the same performance with hints.

clang-3.5 -O3 -DNDEBUG -Wall bench.c picohttpparser.c
perf stat -r 3 -- ./a.out

clang-3.5 with likely/unlikely

       8726,429083      task-clock (msec)         #    0,999 CPUs utilized            ( +-  0,48% )
                15      context-switches          #    0,002 K/sec                    ( +- 13,33% )
                 6      cpu-migrations            #    0,001 K/sec                    ( +- 25,64% )
                42      page-faults               #    0,005 K/sec                  
    14 896 309 916      cycles                    #    1,707 GHz                      ( +-  0,76% )
       665 293 679      stalled-cycles-frontend   #    4,47% frontend cycles idle     ( +- 17,51% )
       217 128 725      stalled-cycles-backend    #    1,46% backend  cycles idle     ( +- 18,56% )
    50 496 839 534      instructions              #    3,39  insns per cycle        
                                                  #    0,01  stalled cycles per insn  ( +-  0,00% )
    13 091 246 811      branches                  # 1500,184 M/sec                    ( +-  0,00% )
        10 015 406      branch-misses             #    0,08% of all branches          ( +-  0,25% )

       8,732213677 seconds time elapsed                                          ( +-  0,48% )

clang-3.5 without likely/unlikely

       9119,914947      task-clock (msec)         #    0,999 CPUs utilized            ( +-  0,06% )
                37      context-switches          #    0,004 K/sec                    ( +- 58,36% )
                 8      cpu-migrations            #    0,001 K/sec                    ( +- 35,59% )
                43      page-faults               #    0,005 K/sec                    ( +-  1,56% )
    15 524 708 328      cycles                    #    1,702 GHz                      ( +-  0,05% )
       228 164 325      stalled-cycles-frontend   #    1,47% frontend cycles idle     ( +-  2,36% )
       217 936 659      stalled-cycles-backend    #    1,40% backend  cycles idle     ( +-  0,34% )
    51 077 255 602      instructions              #    3,29  insns per cycle
                                                  #    0,00  stalled cycles per insn  ( +-  0,00% )
    13 581 323 829      branches                  # 1489,194 M/sec                    ( +-  0,00% )
        10 611 136      branch-misses             #    0,08% of all branches          ( +-  3,26% )

       9,126001095 seconds time elapsed                                          ( +-  0,06% )

The difference is small, but it's there.

@nikomatsakis

Contributor

nikomatsakis commented May 22, 2015

First, this seems like an important area to have support for in general. When I was at the NY meetup, I was approached by a few people working in the "HFT" area who had all sorts of questions about low-level codegen details like this. In most cases, the answer was basically "well, doing that optimization would be LLVM's job, but we'd probably have to expose some primitive".

Anyway, I am currently in favor of likely/unlikely (in preference to the current text) as well, but @Aatch I think it would be great to see some discussion of the approaches taken by other compilers or projects. We've seen already what picohttpparser uses, but I know that many codebases (the Linux kernel, the Firefox codebase, etc.) incorporate some sort of hints, and it'd be nice to know what they use.

  • Data point: Firefox uses MOZ_LIKELY and MOZ_UNLIKELY
@nikomatsakis

Contributor

nikomatsakis commented May 22, 2015

Oh, the other question I was wondering: how fragile are LLVM's optimizations with regard to the expected value? (i.e., with regard to @nrc's point about us having to be careful when we reorder code etc). I would sort of expect that the expectation is associated with an SSA value produced by the expect intrinsic, or something similar, and that hence it is fairly robust.

@nikomatsakis nikomatsakis self-assigned this May 22, 2015

@Aatch

Contributor Author

Aatch commented May 23, 2015

@nikomatsakis well, C/C++ exposes __builtin_expect, which matches the signature of the LLVM intrinsic (probably not a coincidence). Most projects I've seen define likely and unlikely macros around it. I don't know about other languages; a quick search suggests that C/C++ are the only languages that provide these hints.

As for how LLVM actually does it, it alters the weights of branches where the condition is the result of the intrinsic. It processes all the branches in the function first before replacing llvm.expect with the passed value.

@nikomatsakis

Contributor

nikomatsakis commented May 26, 2015

@Aatch I think you mean gcc exposes __builtin_expect, not C/C++, right? But I was wondering whether any projects actually use it directly: as you say, all examples we've seen so far do not. I am also therefore in favor of exposing likely and unlikely instead.

@huonw

Member

huonw commented May 27, 2015

It seems to me that likely(x == 20) is essentially the same as expect(x, 20)? (It certainly seems like a peephole optimisation/normalisation that LLVM might do, or, if not, rustc could do.)

@bluss


bluss commented May 27, 2015

likely(x == 20) is the same as expect(x == 20, true), the expectation is on the conditional.

@huonw

Member

huonw commented May 27, 2015

Yeah, I understand the desugaring, I'm more interested in the semantics. (If you're talking about semantics, what's the difference between expectation on the conditional vs. expectation on the values?)

@eternaleye


eternaleye commented May 27, 2015

@bluss, @huonw

On the one hand, LLVM does have the i8 expect intrinsic. In that sense, @huonw may be correct.

On the other hand, that will in many cases get expanded right back out again for the underlying machine, to some variety of jump-if-equal (or a branch-prediction hint of some sort taking similar arguments).

@bluss


bluss commented May 27, 2015

@huonw consider likely(x > 0).

@huonw

Member

huonw commented May 27, 2015

@bluss while true, I'm actually focusing on the == case. It seems to me that likely/unlikely are more natural in many ways (e.g. your example), so I'm trying to explore reasons for why we might possibly want a raw expect. The only reason I can see is if expect(x, y) is somehow "better" than likely(x == y), but, as my parenthetical stated, I don't think it would be, since, e.g. the compiler can untangle it. (Basically just looking for confirmation.) In other words, does implementing expect in terms of likely lose us anything?

I'm only second guessing myself because the C compilers expose it as expect rather than likely/unlikely.
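The equivalence under discussion, spelled out with hypothetical helpers (the expectation can sit on the boolean condition or on the value it is computed from):

fn likely(b: bool) -> bool { b }
fn expect_u32(value: u32, expected: u32) -> u32 { let _ = expected; value }

fn demo(x: u32) -> bool {
    let on_condition = likely(x == 20);     // expectation on the conditional
    let on_value = expect_u32(x, 20) == 20; // expectation on the value
    // If the compiler can rewrite one form into the other, a raw expect
    // arguably buys nothing over likely/unlikely for branch hints.
    on_condition && on_value
}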

@cmr

Member

cmr commented Jun 3, 2015

@Aatch my bad, I assumed based on most of the initial comments and the title of the RFC that this was actually about expect, without realizing the contents of the RFC had changed to likely/unlikely. 🐳

@cmr

Member

cmr commented Jun 3, 2015

Actually, never mind, that isn't the post I thought it was either.

@eternaleye


eternaleye commented Jun 4, 2015

Michael Kerrisk's post regarding the performance impact of likely()/unlikely() may be worth mining for documentation notes.

He maintains the Linux man-pages project, and one particularly useful point in his post is where the break-even point falls on the benchmark he provides: 1/10^5 (EDIT: typo, actually 1/10^4) incorrect predictions is about where likely() and unlikely() begin to hurt performance more than they help.

@Aatch

Contributor Author

Aatch commented Jun 5, 2015

@eternaleye that is interesting. The fact that 1/10^5 is the "break-even" point is surprising. At any rate, there's a reason my motivating example is a branch not taken in about 1/(2*10^19) cases. I'll keep stuff like this in mind when documenting the intrinsics though.

@Aatch

Contributor Author

Aatch commented Jun 6, 2015

Since the end of the final comment period is approaching and it's likely that this will be accepted when it ends, I started on the implementation. It turns out that the way we translated function calls meant that the intrinsics didn't actually work, but I improved that, so it all works now. The branch is here: https://github.com/Aatch/rust/tree/branch-hinting

@llogiq

Contributor

llogiq commented Jun 6, 2015

👍 after having read the diff.

@brson

Contributor

brson commented Jun 8, 2015

I'm ok with the unstable intrinsics but still share @nrc's concern about how these are exposed. In Rust they will just look like functions, which I think is pretty damn misleading. At least in gcc they look like magic functions.

@Aatch

Contributor Author

Aatch commented Jun 9, 2015

@brson sure, the lower-level __builtin_expect looks a little bit special, but the vast majority of projects create macros that wrap it so that it looks exactly like a function call. Furthermore, how does gcc putting __builtin at the front make it seem any more magic than our std::intrinsics?

In gcc, nobody uses __builtin_expect directly in their code. All code I've seen using branch hinting has if (LIKELY(cond)) { ... } or similar. Should I make the intrinsics capitalised? Not that all projects use capitalised macros anyway.

I only think them being functions is misleading if you don't actually know what they do. In which case I ask: why the hell are you using them then?! The actual use of them is pretty clear: if likely(cond) { .. }, don't deviate from that and you'll be fine. And remember, the absolute worst case scenario for using these incorrectly is that they change nothing at all. Unlike, say, assume where using it incorrectly can break your program completely!
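A sketch of the contrast drawn here, with a hypothetical stand-in for likely: a wrong hint costs at most some performance, while a wrong assume is a promise the optimizer may exploit.

fn likely(b: bool) -> bool { b } // the real version would lower to llvm.expect

fn first_or_zero(v: &[u8]) -> u8 {
    if likely(!v.is_empty()) {
        v[0] // the bounds check stays; a bad hint only affects code layout
    } else {
        // An incorrect `assume(!v.is_empty())` here, by contrast, could let
        // the optimizer elide checks and break the program.
        0
    }
}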

@Valloric


Valloric commented Jun 9, 2015

Agreed with @Aatch. Let's not forget why gcc uses a __builtin_ prefix here: the C & C++ standards say that identifiers that start with __ are reserved for the compiler. builtin is used as a namespacing prefix on top of __. So the only reason why we don't have namespace::expect() in C is because there's no namespacing in C.

Let's not try to infer some sort of grand design behind the weird-looking names of intrinsic functions in C; there's no such thing. The names are weird because of technical restrictions that Rust thankfully doesn't have.

@nikomatsakis

Contributor

nikomatsakis commented Jun 10, 2015

It's official... the language subteam has decided to accept this RFC. Thanks @Aatch!

@nikomatsakis

Contributor

nikomatsakis commented Jun 10, 2015

Tracking issue: rust-lang/rust#26179

@nikomatsakis nikomatsakis merged commit 31d230b into rust-lang:master Jun 10, 2015


@ticki

Contributor

ticki commented Jan 16, 2016

What is the state of this?

@Aatch

Contributor Author

Aatch commented Jan 26, 2016

@ticki, the implementation turned out to be rather hard, if you actually wanted the intrinsics to do anything at least. It turned out to simply be easier to wait for work on MIR to progress and build them on top of that.

@ticki

Contributor

ticki commented Jan 27, 2016

@Aatch, the intrinsics will currently just send an instruction to LLVM to assume a statement to be true (I was thinking of the assume intrinsics).

@Aatch

Contributor Author

Aatch commented Jan 29, 2016

@ticki the macro you added? These intrinsics aren't implemented in the compiler at all yet, so I'm not sure what your macro is doing. The likely/unlikely intrinsics work differently from assume, and LLVM is very finicky about the relationship between the generated expect and the branch being hinted. The way the compiler currently generates LLVM code isn't what it wants, so the intrinsics end up just being ignored.

@ticki

Contributor

ticki commented Jan 29, 2016

Oh! My bad. I was speaking about the assume intrinsics.

I wonder why the intrinsics can't just use LLVM's branch weight metadata.

@jonas-schievink

Member

jonas-schievink commented Aug 30, 2016

The feature name of the RFC still says expect_intrinsic.

@chriskrycho chriskrycho referenced this pull request Feb 8, 2017

Closed

Document all features in the reference #38643


@chriskrycho chriskrycho referenced this pull request Mar 11, 2017

Closed

Document all features #9

@ofek


ofek commented May 5, 2017

the implementation turned out to be rather hard, if you actually wanted the intrinsics to do anything at least. It turned out to simply be easier to wait for work on MIR to progress and build them on top of that.

@Aatch Does this do anything now that we have MIR?

@joshlf

Contributor

joshlf commented Sep 6, 2017

Are there thoughts on how to get this to work with if let? I haven't seen any mention of if let in the doc or this discussion.

@cmr

Member

cmr commented Sep 6, 2017
