Avoiding integer overflow #45

bill-myers · 2014-04-14T18:58:45Z

This is an RFC for avoiding integer overflow, which first discusses various options, and then focuses on and explains a specific proposal that does not make use of failure, and introduces a new kind of integer type to handle things like loop counters.

brendanzab · 2014-04-14T19:18:13Z

One idea I had for making using the Checked* traits easier to work with was to add a Checked<T> adaptor struct. For example:

pub struct Checked<T>(pub T);

impl<T> Checked<T> {
    /// Retrieves the stored value
    pub fn get(self) -> T {
        match self { Checked(x) => x }
    }
}

impl<T: CheckedAdd> Add<Checked<T>, Option<T>> for Checked<T> {
    fn add(&self, other: &Checked<T>) -> Option<T> {
        match (self, other) { (&Checked(ref x), &Checked(ref y)) => x.checked_add(y) }
    }
}

impl<T: CheckedSub> Sub<Checked<T>, Option<T>> for Checked<T> {
    fn sub(&self, other: &Checked<T>) -> Option<T> {
        match (self, other) { (&Checked(ref x), &Checked(ref y)) => x.checked_sub(y) }
    }
}

impl<T: CheckedMul> Mul<Checked<T>, Option<T>> for Checked<T> {
    fn mul(&self, other: &Checked<T>) -> Option<T> {
        match (self, other) { (&Checked(ref x), &Checked(ref y)) => x.checked_mul(y) }
    }
}

impl<T: CheckedDiv> Div<Checked<T>, Option<T>> for Checked<T> {
    fn div(&self, other: &Checked<T>) -> Option<T> {
        match (self, other) { (&Checked(ref x), &Checked(ref y)) => x.checked_div(y) }
    }
}

Could this be a simpler solution?

Anyway, despite the above, I think both range and mod types would be extremely valuable for a safety conscious language like Rust.

I think it might be worth looking at how other languages do range and mod types. For example Ada:

type Score is range 1 .. 10;
type Hours is mod 24;

I asked a question on stack overflow a while back regarding how these are implemented. This is probably already known to you though.

pczarn · 2014-04-19T14:35:07Z

Don't forget about arithmetic with saturation, there's very basic Saturating trait.

akdjka · 2014-04-28T08:20:46Z

Nimrod has 2 types on integers. The default ones are not allowed to overflow. If they do in a debug build, there's warranted exception. In the release mode behaviour is undefined.
To be precise, these are just like in C, signed and unsigned. I call signed the default because unsigned is not even a part of the language, it's in the stdlib.

I like this dependable crash in debug with the default integer type, but no check in release. Very fast when it matters and much more secure than simple wrapping.

Also, I fully agree with bjz that both range and mod types would be very valuable.

ghost · 2014-04-28T12:09:32Z

Relevant: the isl library used in GCC's graphite optimizations. I believe it's only used for optimizations because the generic case is exactly the Halting Problem.

As for alternatives, an i64 is big enough that an accidental overflow is a bug somewhere else. If it was a domain where such overflows are even possible, the programmer would use checks or a bigint. We could just warn/unsafe 32-bit types, and allow/wrap the unsafety in Checked<T>.

rpjohnst · 2014-04-28T15:48:23Z

The paper Dependent Types in Practical Programming (http://www.cs.cmu.edu/~fp/papers/popl99.pdf) is a good example of this (and some other things it would be useful for), although it doesn't really deal with the size of the integer types (especially the issues with runtime-closed range types, which feel rather unpleasant).

thestinger · 2014-04-28T20:06:02Z

@akdjka: Rust is never going to have checks that are left out at runtime and lead to undefined behaviour. It doesn't belong in the language outside of unsafe blocks. I don't think checks for debug builds are really relevant here because forgetting to handle overflow is almost by definition something you're not going to be testing for. A subset of these bugs could be caught by fuzzing, but not all. Even C has checks on overflow for debug builds via the GCC/clang sanitizers and it doesn't really change the safety of the language.

akdjka · 2014-04-29T10:54:17Z

@thestinger
I have not said that I'm for undefined behaviour but for detecting overflow in debug builds.
If you're testing edge conditions, you're likely to find a large part of your overflows.

thestinger · 2014-04-29T18:28:11Z

Rust isn't going to allow catching undefined behaviour only in debug builds and allowing it in release builds. Adding sanitizer-like flags to rustc for unsafe code is possible, but adding undefined behaviour to release builds is really not.

dobkeratops · 2014-04-29T21:27:53Z

Could rust get another build mode with equivalent (dangerous) functionality to c++ release, but in rust terminology it would be an 'unsafe-performance' or something.

I get that rusts focus is safe, you might add yet more checks for a 'debug' build' .. but for me it would be interesting to consider rust's default semantics as basically a "debug build on steroids" with lots of extra "assertions" built into the language semantics.. but for "unsafe-performance" matching C++ you dont want a single branch out of place.

.. changes would be things like unwrap not testing for failiure, no bounds-check on array acess, probably some different numeric assumptions ..

IMO Rust has a lot to offer besides safety - I looked for this language mostly because I dont like the way classes and headers work in c++, and was interested in 'a more FP style', whilst not being able to tolerate a GC language.
I'm actually ok with C++'s dangerous behaviour - its essential for some domains. Whats limiting is how the language has everything shoe-horned into C syntax (eg so you cant have propper tuples because of how ,'s are used.. or multiple return values.. issues with [] syntax..), header files, the asymetry between methods and functions.

Bear in mind the language shootouts... perhaps it would be nice to show that rust can match C++ 1:1, if you want it to. It shouldn't involve having to write uglier code than C++ (unwraps, not getting nice array sugar, extra nesting for unsafe or making your code incompatible with mainstream rust code, having to use nonstandard types like 'real' instead of float, etc)

bstrie · 2014-05-06T22:07:47Z

@dobkeratops , an "unsafe-performance" compiler flag is not acceptable. Thoughtlessly disabling safety features for the sake of performance is contrary to the language's philosophy.

nrc · 2014-06-06T01:26:28Z

I would love some data on the performance impact of checked arithmetic. I don't think using the Clang flag is close enough. I would like to have fail on overflow implemented in Rust and profile Servo (or some other large code base) to see the performance impact. The downside to this is we might need some non-trivial optimisation in Rust code to make checking performant.

thestinger · 2014-06-06T01:42:19Z

Nothing short of hardware support is going to make it perform well. LLVM can't even deal with totally obvious value propagation within a local scope with only local pointers. It struggles just to optimize out the iterators built out of pairs of raw pointers because it's unable to optimize out a null check right next to code proving it's not null by dereferencing it. Doing something almost reasonable with code that looks like what a C programmer would write is about all you can expect from LLVM.

thestinger · 2014-06-06T01:54:18Z

Here's a trivial test case using the most efficient implementation of checked arithmetic that's available (LLVM intrinsics explicitly performing it and compiling down to near optimal machine code):

extern crate test;

use std::intrinsics::{abort, u32_add_with_overflow};
use test::Bencher;

#[inline(never)]
pub fn no_overflow_check_inner(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        *x += 1
    }
}

#[inline(never)]
pub fn overflow_check_inner(xs: &mut [u32]) {
    for x in xs.mut_iter() {
        let i = *x;
        let (result, overflow) = unsafe { u32_add_with_overflow(i, 10) };
        if overflow { unsafe { abort() } }
        *x = result;
    }
}

#[bench]
pub fn no_overflow_check(b: &mut Bencher) {
    let mut xs = [0, ..300000];
    b.iter(|| {
        no_overflow_check_inner(xs);
    })
}

#[bench]
pub fn overflow_check(b: &mut Bencher) {
    let mut xs = [0, ..300000];
    b.iter(|| {
        overflow_check_inner(xs);
    })
}

test no_overflow_check ... bench:     63441 ns/iter (+/- 1123)
test overflow_check    ... bench:    291665 ns/iter (+/- 2326)

This is not atypical of what the hot loops in an application look like. Servo will likely have many loops like this (loops or series of scalar integer operations), and performing these checks makes it far slower than Java or even JavaScript. It's not possible to do this and claim Rust is a systems language - it will be fighting for the performance crown with PyPy, not C++. It's all well and good to pretend there's a sufficiently smart compiler but there is not; hardware support is required to make this perform well and even if it became available on modern x86 CPUs it wouldn't be used by the x86 binaries rustc needs to produce by default.

If someone fixes this test case, and the next one, and the next one, and many far more complicated cases with all kinds of pointer aliasing then perhaps it would start making more sense. This case should be trivial to optimize, because it's clear that every one of these is going to overflow at the same time from some interprocedural. Real world cases are far more complex than this, and LLVM isn't even able to begin tackling the Hello World of interprocedural optimization.

nrc · 2014-06-06T02:23:06Z

I don't think that a compiler will help us, not easily at least (the kind of optimisation I imagine are not the kind that LLVM would do, but that we would do in rustc, e.g., hoist the overflow check out of the loop in overflow_check_inner since all inputs are unsigned, but agree such things are difficult bordering on impossible).

I know that adding an overflow check for this kind of micro-benchmark will be ruinous to performance, but the question is really how many tight loops like this exist in Servo and how often they get executed. Even if performance in such a loop degrades 100% or more in the worst case, we have no idea how this will affect real performance in real software.

thestinger · 2014-06-06T02:29:30Z

don't think that a compiler will help us, not easily at least (the kind of optimisation I imagine are not the kind that LLVM would do, but that we would do in rustc

Optimizing in rustc means making another compiler IR, and compile-time will be even worse. It's already slower than C++, which is known as having incredible slow compile-time.

hoist the overflow check out of the loop in overflow_check_inner since all inputs are unsigned, but agree such things are difficult bordering on impossible)

You first have to prove that they're all equal via interprocedural analysis, which is trivial in this case but impossible in most cases.

Even if performance in such a loop degrades 100% or more in the worst case, we have no idea how this will affect real performance in real software.

This is far from the worst case. I can write a bigger program where the code bloat from all the branches on nearly every primitive operation causes far pipeline stalls.

thestinger · 2014-06-06T02:44:53Z

I don't see a point in Rust existing if it doesn't take performance seriously. In my opinion, Rust is already missing the performance and compile-time speed it needs to be a competitor to C and C++. At the moment, the abstractions hit too many missed optimizations and optimal code ends up requiring lots of custom unsafe blocks. I don't think the value prospect for the complexity it comes with (references with lifetimes, move/liveness checking) is there yet.

The array bounds checks are a killer performance issue and working around it is very ugly. The extensive use of unwinding is leading to ridiculously long compile-time and causing missed optimizations all over the place too. There's a very good reason for performance-conscious projects like LLVM disabling unwinding and Rust isn't viable in that niche right now. As far as I can tell, Swift completely avoids exceptions and unwinding and as a language designed by the lead developer of LLVM that says a lot.

Rust can bury its head in the sand and ignore the current compile-time and performance issues, but C++ programmers won't care about the excuses. They won't take a second look at the language if it branches on every single integer arithmetic operation. It's already pretty hard to sell the sacrifices the languages is making. I'm not sure how viable a C++ replacement the language is when anything doing array indexing is going to perform poorly without lots of unsafe code.

nrc · 2014-06-06T02:55:14Z

@thestinger if you are right (about the performance impact), than I would not support this, but I want to see evidence. Currently we have no data to indicate any performance hit in real software (and likewise no data to indicate there won't be a performance hit). Dismissing the feature because of poor performance in a micro-benchmark is premature optimisation of every program ever to be written in Rust.

Comments on array bounds checks, unwinding, and other performance concerns are off-topic for this RFC, please take them elsewhere.

thestinger · 2014-06-06T02:56:33Z

You're proposing unwinding via a branch on every arithmetic operation, so comments about the performance and compile-time woes caused by unwinding are very relevant. It's already a 2x compile-time hit, and adding more unwinding and lots of new branches will make it worse.

poor performance in a micro-benchmark is premature optimisation of every program ever to be written in Rust.

Rust should stop being advertised as a C competitor / systems language if it doesn't care about the obvious code mapping down to good machine code. If it's easier to write fast numeric code in Java, then the language isn't a systems language.

bill-myers · 2014-06-06T04:52:56Z

In addition to performance, and the fact that it makes operations not associative, overflow checking is bad because it solves the wrong problem.

If one adds two integers that don't fit into a 32/64-bit integer, the ideal behavior is generally not to abort the program, but rather to simply store the result in a type that is large enough, even if that means the integer type is larger than 32/64 bits.

The RFC does propose that, and applied to @thestinger test case, would allow to easily express a program that at the LLVM bytecode level uses an array of 64-bit to 128-bit integers (depending on settings and whether the 1/10 addend is constant or not) instead of 32-bit integers, making overflow impossible.

ghost · 2014-06-06T05:30:05Z

@bill-myers wasting half the registers and cache is not much better than excessive branching.

bstrie · 2014-06-06T15:05:54Z

I'm in agreement with @thestinger. We've long since decided to eat the cost of bounds checking (which nobody is proposing that we remove), and that eats up the entirety of our imposed-overhead budget. I'm also starting to be convinced that the cost of unwinding will be unacceptable in the long run. Given these, we just don't have space for a decadent feature like this, especially one that doesn't close any memory safety holes. Moving optimization passes into the frontend sounds like a bad idea as well.

Having well-defined integer overflow is already a step forward from C and C++. Baby steps.

brson · 2014-07-03T22:17:28Z

Per today's RFC triage we're going to close this. The default unchecked behavior is unlikely to change, and there are several other related proposals whose approach would be preferred over this.

Make stream::Fuse return None after finishing, not NotReady

Solicit feedback about IE8 and IE9 support in Ember 2.x

Add RFC on avoiding integer overflow.

61969f5

brson closed this Jul 3, 2014

nikomatsakis mentioned this pull request Aug 5, 2014

Rename Share to Sync rust-lang/rust#16281

Closed

ticki mentioned this pull request May 20, 2016

Numerical Constrained Types #1621

Open

withoutboats pushed a commit to withoutboats/rfcs that referenced this pull request Jan 15, 2017

Merge pull request rust-lang#45 from Stebalien/fix-stream-poll

c4af380

Make stream::Fuse return None after finishing, not NotReady

wycats pushed a commit to wycats/rust-rfcs that referenced this pull request Mar 5, 2019

Merge pull request rust-lang#45 from emberjs/drop-ie8

8774be9

Solicit feedback about IE8 and IE9 support in Ember 2.x

wycats pushed a commit to wycats/rust-rfcs that referenced this pull request Mar 5, 2019

Add rust-lang#45 to active rfcs

278e0ea

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Avoiding integer overflow #45

Avoiding integer overflow #45

bill-myers commented Apr 14, 2014

brendanzab commented Apr 14, 2014

pczarn commented Apr 19, 2014

akdjka commented Apr 28, 2014

ghost commented Apr 28, 2014

rpjohnst commented Apr 28, 2014

thestinger commented Apr 28, 2014

akdjka commented Apr 29, 2014

thestinger commented Apr 29, 2014

dobkeratops commented Apr 29, 2014

bstrie commented May 6, 2014

nrc commented Jun 6, 2014

thestinger commented Jun 6, 2014

thestinger commented Jun 6, 2014

nrc commented Jun 6, 2014

thestinger commented Jun 6, 2014

thestinger commented Jun 6, 2014

nrc commented Jun 6, 2014

thestinger commented Jun 6, 2014

bill-myers commented Jun 6, 2014

ghost commented Jun 6, 2014

bstrie commented Jun 6, 2014

brson commented Jul 3, 2014

Avoiding integer overflow #45

Avoiding integer overflow #45

Conversation

bill-myers commented Apr 14, 2014

brendanzab commented Apr 14, 2014

pczarn commented Apr 19, 2014

akdjka commented Apr 28, 2014

ghost commented Apr 28, 2014

rpjohnst commented Apr 28, 2014

thestinger commented Apr 28, 2014

akdjka commented Apr 29, 2014

thestinger commented Apr 29, 2014

dobkeratops commented Apr 29, 2014

bstrie commented May 6, 2014

nrc commented Jun 6, 2014

thestinger commented Jun 6, 2014

thestinger commented Jun 6, 2014

nrc commented Jun 6, 2014

thestinger commented Jun 6, 2014

thestinger commented Jun 6, 2014

nrc commented Jun 6, 2014

thestinger commented Jun 6, 2014

bill-myers commented Jun 6, 2014

ghost commented Jun 6, 2014

bstrie commented Jun 6, 2014

brson commented Jul 3, 2014