Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC changing the overflow behavior for usize in release builds to panic #2635

Open
wants to merge 5 commits into
base: master
from

Conversation

Projects
None yet
@alex
Copy link

alex commented Feb 10, 2019

Rendered

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Feb 11, 2019

I suspect this change is not feasible without also providing some way to opt-out of these runtime checks, even in release mode. An obvious option is to add a set of unchecked_add/sub/mul/etc methods to usize, though it would be a little weird to have both a checked_add and an unchecked_add method neither of which match the behavior of +.

@alex

This comment has been minimized.

Copy link
Author

alex commented Feb 11, 2019

What makes you say it's not feasible, performance or some other concern? How would the behavior of unchecked_* differ from wrapped_*? (Is it the method I hypothesized that matches the current debug_assert! behavior?)

@Ixrec

This comment has been minimized.

Copy link
Contributor

Ixrec commented Feb 11, 2019

Yeah, performance. My understanding is that enforcing any specific overflow semantics has a measurable cost on some platforms, even if it's zero-cost on other platforms, because different hardware has different "default" overflow behavior. Rust is less weird than C because it requires two's complement, but as far as I know that doesn't guarantee a specific overflow behavior (at least, I can't imagine we'd have chosen the current behavior otherwise) (EDIT: huh, apparently we do already guarantee wrapping in release builds). Someone more familiar with these kinds of performance portability issues would have to comment on how big of a deal that is in practice.

@comex

This comment has been minimized.

Copy link

comex commented Feb 11, 2019

I think this RFC fundamentally needs actual benchmark results (ideally a large variety of them), to allow everyone to have an informed discussion about costs versus benefits.

@steveklabnik

This comment has been minimized.

Copy link
Member

steveklabnik commented Feb 11, 2019

I like this RFC, and I believe that is the direction we want to move in, ideally. A few things:

  • RFC 560 defined the rules for integer overflow, mentioning it would be good
  • Additionally, note that the RFC text is a bit inaccurate in a sense; it's not "in debug mode" but "when debug_assertions are enabled. Basing it off of the RFC 560/reference text would be better.

Furthermore, I agree with @comex; RFC 560 mentions performance a number of times as the reason to make the decisions we made for the current behavior; if the performance were acceptable, I think there'd be broad support for checking in every case, let alone for just these sizes.


@Ixrec

My understanding is that enforcing any specific overflow semantics has a measurable cost on some platforms, even if it's zero-cost on other platforms, because different hardware has different "default" overflow behavior.

Yes, this is true, but we decided that we don't care as much about supporting that hardware as we do having consistent semantics. Non-two's compliment hardware is mostly legacy that we don't support particularly well in the first place, see discussions like https://gankro.github.io/blah/rust-layouts-and-abis/

NOTE: this is not a normative document and the Rust devs haven't been very diligent in committing to these claims, so be a bit wary of relying on a property here that lacks a citation.

For Rust to support a platform at all, its standard C dialect must:

  • Have integers be two's complement

As the note says, it's not normative, but I think, in this moment, it's roughly true.

@alex

This comment has been minimized.

Copy link
Author

alex commented Feb 11, 2019

I agree that measurements will ultimately be a necessity to accepting/merging this idea. In writing the RFC first, I wanted to address the chicken/egg scenario of writing the code before it was clear the idea was acceptable -- truth be told I expected people to find the idea of special casing usize to be more unusual than the feedback has been thus far!

There probably is performance differences on different architectures beyond the concern for those which don't have 2s complement arithmetic. x86 and ARM both have dedicated instructions for jump-on-overflow, however PPC and MIPS do not, so there's a non-trivial difference on the amount of code generated on these platforms for overflow checks.

That's an excellent point about the current behavior being keyed on debug_assertion, not debug. I'll update the text accordingly.

@hdevalence

This comment has been minimized.

Copy link

hdevalence commented Feb 11, 2019

I think this is a great proposal and I'd love to see it adopted by Rust. Having an additional security layer is a huge advantage, and as noted in the description, it would have prevented exploitation of a number of past security issues.

Currently it's possible to enable overflow checks for all integer types using something like

[profile.release]
overflow-checks = true

in the Cargo.toml. However, this can't be used to test the proposed default, because it enables checks for all integer types, not just for usize/isize.

Perhaps a good way to move forward on this would be to add a (possibly unstable) mechanism, like usize-overflow-checks, which would allow enabling overflow checks just for usize/isize, and then running ecosystem-wide benchmarks (like lolbench) to see what the actual impact on real code is.

This mechanism might be useful independently of whether the behaviour proposed in the RFC is adopted; for instance, I would choose to use it in my projects.

@leonardo-m

This comment has been minimized.

Copy link

leonardo-m commented Feb 11, 2019

If you want to introduce this I'd like it with an optimization pass as done by the Swift compiler.

@alex

This comment has been minimized.

Copy link
Author

alex commented Feb 12, 2019

@leonardo-m Can you explain what you mean by that? I don't see overflow-checking as being something that'd be implemented as a compiler optimization. Generally the compiler optimization piece of this is removing overflow checks where there are range bounds on the inputs.

@leonardo-m

This comment has been minimized.

@alex

This comment has been minimized.

Copy link
Author

alex commented Feb 12, 2019

Yes, as I mentioned in the Future Plans section, I suspect this work will spur desires for new and improved optimizations, either in Rust or LLVM. It is my hope that such optimizations will be nice to haves and not hard requirements for this to be practical.

There seems to be a lot of interest (unsurprisingly) in quantifying this. I'll take on trying to put together a proof of concept of this for Rustc that's good enough to get some measurements.

on `usize` values: `usize` is consistently used in `core`/`std` APIs which deal
in lengths, buffer sizes, etc., the kind of values where overflow can be
dangerous, so it's no coincidence that historic integer-overflow vulnerabilities
occured with `usize` values.

This comment has been minimized.

@scottmcm

scottmcm Feb 13, 2019

Member

As a small thing, I think this should also happen for isize -- code that's using offset might be calculating in that type instead, and just generally I'd like the two types to be as similar as possible.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Feb 13, 2019

I really like the nuance in doing this just for the types that are most likely the problem. And the scariest perf problem would be something like a[i+1], but LLVM already needs to be proving that that addition doesn't wrap in order to remove the bounds check anyway, so I suspect this might not be quite as scary as it seems. (If it can't remove the wrapping check, it can't remove the bounds check either, so two highly-predictable branches might not be materially worse than the already-existing one.)

That said, LLVM still can't fold sequential checked_adds (https://rust.godbolt.org/z/z7LUMt -- but it can saturating ones, thank you rust-lang/rust#58003!), so there's definitely low-hanging work here that could improve perf. (Maybe I should re-open rust-lang/rust#52203.)

As an upper-bound perf experiment, consider just changing the default in release to be panic, and we can run a try+perf on the experimental PR to see how far off things might be.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Feb 14, 2019

ring: Near-miss integer overflow, which would have led to heap buffer overflow.

There was never such a bug in ring and the necessary checks are done there, so I don't think it makes sense to mention it here. (in_prefix_len > in_out.len() is a condition that is complete nonsense in that codebase.) If anything ring's track record is evidence that this RFC is unnecessary.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Feb 14, 2019

CrosVM: Integer overflow leading to heap buffer overflow in ChromeOS's hypervisor.

According to the bug report "This didn't end up being an actual issue as it was covered by an early bounds check."

I didn't check the other two examples cited as motivation. Also, I am mostly supportive of this proposal. However, I think the citations suggesting non-vulnerabilities are vulnerabilities should be removed.

@alex

This comment has been minimized.

Copy link
Author

alex commented Feb 14, 2019

In Ring's case, that's why I described it as a "near miss" -- since it was unreachable, but the check wasn't near the overflow, and there was no type or other structure for ensuring the invariant. It was included because it demonstrated the potential pattern of overflow on usize + unsafe code. If you still think that's misleading, I can remove it.

In CrosVM's case, the overflow was probably unexploitable (at least, none of us figured out a way to), because of the size of the input data was bounded, not because there was any checking on the actual result of the arithmetic. This one definitely makes sense to include, in my view.

@briansmith

This comment has been minimized.

Copy link

briansmith commented Feb 14, 2019

I think in both cases, the code that prevents the overflow is far away from the code that it is protecting, and minimizing that distance is generally good.

I don't know all the details regarding the CrosVM case. In the case of ring, at least, it's not accidental that overflow is avoided. Bounding the input size very early to prevent integer overflows later is a pretty common practice. For example, this is one reason why mozilla::pkix (C++) limits its input sizes to 0xffff (uint16_t) or less. The untrusted library does something similar, but not as drastic. These libraries do things this way partly to minimize any potential performance impact; i.e. as an alternative to what's being proposed here.

Anyway, again, I'm not trying to argue against this RFC; just I think the way ring is referenced is very misleading.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Feb 14, 2019

I didn't check the other two examples cited as motivation.

As the one who found the first one, it's definitely exactly this, and was fixed by making the overflow panic: https://github.com/rust-lang/rust/pull/54399/files#diff-773807740b9d7f176c85b4e2e34b2607R434

@tomwhoiscontrary

This comment has been minimized.

Copy link

tomwhoiscontrary commented Feb 14, 2019

There probably is performance differences on different architectures beyond the concern for those which don't have 2s complement arithmetic.

There is a proposal floating around in C++ space to declare that Signed Integers are Two’s Complement. In that proposal, there is a survey of signed integer representations which concludes that there are no current non-two's-complement hardware architectures. The closest thing is some Unisys product line which has an ASIC to support code written for its '60s mainframes, which were ones' complement (and had 36 bit words!).

So, i would suggest that it's safe to not consider non-two's complement machines at all.

@scottmcm scottmcm self-assigned this Feb 14, 2019

@joshtriplett

This comment has been minimized.

Copy link
Member

joshtriplett commented Feb 14, 2019

We discussed this briefly in the lang team meeting, and generally agreed with the thread on the following:

  • We should add a way to enable overflow-checks just for isize/usize and not other types.
  • Once we have that, we should get some data on the performance and security impact of using that option: does enabling overflow-checks only for isize/usize provide a substantial security/correctness improvement with a disproportionately low performance cost?
  • Given that data, we could decide what the default should be, but in the meantime people could opt into this new mode.
@alex

This comment has been minimized.

Copy link
Author

alex commented Feb 15, 2019

Ok, I have a very straightforward implementation for measuring here: rust-lang/rust#58475

alexgaynor@penguin ~/p/rust> rustc -C opt-level=2 t.rs
alexgaynor@penguin ~/p/rust> ./t
199
alexgaynor@penguin ~/p/rust> ./build/x86_64-unknown-linux-gnu/stage2/bin/rustc -C opt-level=2 t.rs
alexgaynor@penguin ~/p/rust> ./t
thread 'main' panicked at 'attempt to add with overflow', t.rs:2:5
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
alexgaynor@penguin ~/p/rust> cat t.rs 
fn f(x: usize, y: usize) -> usize {
    x + y
}

fn main() {
    println!("{}", f(usize::max_value(), 200usize));
}

bors pushed a commit to rust-lang/rust that referenced this pull request Feb 15, 2019

[WIP] Make usize overflow always have debug-assertions semantics
This is a prototype for rust-lang/rfcs#2635 to enable us to collect some measurements.

bors added a commit to rust-lang/rust that referenced this pull request Feb 15, 2019

Auto merge of #58475 - alex:usize-overflow-panic, r=<try>
[WIP] Make usize overflow always have debug-assertions semantics

This is a prototype for rust-lang/rfcs#2635 to enable us to collect some measurements.
@vorner

This comment has been minimized.

Copy link

vorner commented Feb 18, 2019

I do find usize acting differently than eg. u8 a bit weird and usually I really dislike inconsistencies. But that being said, I think this instance of weirdness is quite OK ‒ rust's stance to overflowing is mostly „don't do it, but I might not check every time you don't, it's up to you“. So tweaking the rules when this „might not check“ happens sounds good to me, especially if it tends to prevent bugs.

But I'd like to raise a concern that I haven't seen here. Is this backwards compatible change? Not everyone necessarily shares my understanding of „please don't overflow“ and might have decided to a) turn off the checks in Cargo.toml for debug too (or ever use release only builds), b) rely on the overflow not panicking. By upgrading the compiler to never version, wouldn't such code stop working? Or is the general stance that every overflow is actually a bug therefore that code should have never worked in the first place?

@steveklabnik

This comment has been minimized.

Copy link
Member

steveklabnik commented Feb 18, 2019

@mark-i-m

This comment has been minimized.

Copy link
Contributor

mark-i-m commented Feb 18, 2019

Perhaps an unresolved question could be added to request a crater run?

alex added a commit to alex/rust that referenced this pull request Feb 23, 2019

[WIP] Make usize overflow always have debug-assertions semantics
This is a prototype for rust-lang/rfcs#2635 to enable us to collect some measurements.

bors added a commit to rust-lang/rust that referenced this pull request Feb 23, 2019

Auto merge of #58475 - alex:usize-overflow-panic, r=<try>
[WIP] Make usize overflow always have debug-assertions semantics

This is a prototype for rust-lang/rfcs#2635 to enable us to collect some measurements.
@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Feb 24, 2019

Team member @scottmcm has proposed to postpone this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@scottmcm

This comment has been minimized.

Copy link
Member

scottmcm commented Feb 24, 2019

While I like this conceptually, it seems like current measurements of experimenting with this are showing about a 2-6% impact to the speed of the compiler. As such, I propose that we postpone considering changing the release default for now, pending further work.

@rfcbot fcp postpone

Process notes: I'm only proposing postponing the T-lang change here. I (personally) would absolutely like to see work here continue, as better optimizations around checked math will help even if we never change a default here (for example, in Range::nth). And if T-compiler decides that this is a useful option to expose for general opt-in use, I would be happy for that too. But AFAIK those aren't lang decisions.

@alex

This comment has been minimized.

Copy link
Author

alex commented Feb 24, 2019

First missed optimization I've found: rust-lang/rust#58692

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.