Range is odd when used with floating point types #17010

Veedrac · 2014-09-05T05:47:49Z

For one,

fn main() {
    // The .take is needed to make this terminate
    let mut bad_range = range(0f32, 1e8).take(1000_000_000);

    let (min, max) = bad_range.size_hint();
    let max = max.unwrap();

    println!(
        "min: {:e} ≤ length: {:e} ≤ max: {:e}",
        min               .to_f64().unwrap(),
        bad_range.count() .to_f64().unwrap(),
        max               .to_f64().unwrap()
    );
}

outputs min: 1e8 ≤ length: 1e9 ≤ max: 1e8.

There's no real decision about what to do when rounding errors means r.next() == r.next(), such as with:

fn main() {
    let mut bad_range = range(1e8f32, 1e8+8.0);
    println!("{}", bad_range.next() == bad_range.next());
}

which currently results in an infinite iterator.

Personally I don't see a big problem with it, but unless floating ranges are banned it should at least avoid breaking size_hint.

Edit:

Further, rounding currently means that the size hints can be wrong. Both the upper bound:

fn main() {
    println!("{}", range(0f32, 1.5f32).count()); // 2
    println!("{}", range(0f32, 1.5f32).size_hint()); // (1, Some(1))
}

and the lower bound:

fn main() {
    println!("{}", range(4194303.8f32, 4194305f32).count()); // 1
    println!("{}", range(4194303.8f32, 4194305f32).size_hint()); // (2, Some(2))
}

The text was updated successfully, but these errors were encountered:

thestinger · 2014-09-05T05:59:41Z

It should just be ported to used a trait with successor / predecessor instead of addition. This would also allow it to work with char despite it not being an arithmetic type. The only thing that would make sense with floats is going to the next possible value.

vks · 2014-09-05T06:25:53Z

The only thing that would make sense with floats is going to the next possible value.

I disagree. I think this is almost never what you want. Usually you want some finite step size.

Using multiplication instead of addition fixes the rounding errors.

thestinger · 2014-09-05T06:27:48Z

@vks: It doesn't matter if it's almost never what you want for floating point types, it's the sane definition of the basic generic range function. There's a range_step function if you want to specify a different step than simply the next value. It should work for all enumerated types, including ones that aren't numbers. It doesn't make sense to define it based on increments of the number one.

Veedrac · 2014-09-05T06:28:23Z

It's worth noting that there aren't any rounding error per se; multiplication would only help as long as you were using a datatype that could hold the required precision. Consider instead 1e300; no normal-sized integer will help you with that.

And then instead of multiplication you might as well just cast to the float.

vks · 2014-09-05T07:09:45Z

@thestinger That is not what range currently means. It is equivalent to range_step with step size 1. I think what you are proposing should be called differently, because it does not make sense for real numbers (you cannot iterate over them).

@Veedrac I'm not sure I understand. How is 1e300 related to 'normal-sized integers'? What do you mean by 'casting to the float'?

I still think multiplication would solve all the issues you mentioned, because it would yield exactly the expected number of elements (even if some are identical do to lacking precision).

Veedrac · 2014-09-05T07:11:25Z

@vks So you have X * 1.0f64, where X == 1e300. What's the type of X? If it's integer, it has either overflowed or is big and thus slow. It it's a floating point, how are you incrementing it?

thestinger · 2014-09-05T07:42:40Z

@vks: The only reason that range currently uses a step size of 1 is because a trait for successors / predecessors didn't exist when it was originally defined. It was always intended to be updated to use a trait like that. Anyway, the implementation of range has to be the same for all types because that's how Rust's generics work. What kind of trait do you propose using instead?

vks · 2014-09-05T16:20:40Z

@Veedrac I think I understand your point now. The problem seems to be the implementation of size_hint: if you remove it, the code works fine.

@thestinger

It was always intended to be updated to use a trait like that.

Really? That surprises me.

Anyway, the implementation of range has to be the same for all types because that's how Rust's generics work.

Currently it is only defined for types that implement Add, PartialOrd, Clone, One and ToPrimitive (the latter is only needed for size_hint). Why does it have to be implemented for all types?

I'm not convinced that it should be generalized to types that don't implement Add and One. Python does not do that either.

If it is decided to implement it using some kind of successor trait, then I think it should not be implemented for floating point numbers (unless the "successor" of a float x is defined as x + 1).

thestinger · 2014-09-05T16:24:49Z

Python only has an equivalent to range_step, not range. The range function isn't just a wrapper around range_step, it provides a double-ended iterator intended to be based on predecessors / successors. It uses One and Add as a mostly working hack because the necessary trait didn't exist.

vks · 2014-09-05T16:47:19Z

About how to fix this issue:
I think there are five possibilities:

Add an assertion so that range fails if adding one does nothing. (This may have a huge performance impact.)
Implement range for integers only.
Implement range using a successor trait. Do or do not implement that trait for floats. (This has the advantage of being able to define range for chars.)
Do nothing and accept that size_hint may be wrong and the iterator may not terminate for floats. This behavior should be mentioned in the documentation however.
Make size_hint not return an upper bound. (This only fixes the size hint. It would be more correct, because currently the code implicitly assumes x + One::one() != x.)

(I would favor 2. or 3.)

Edit: @thestinger I removed the paragraph about Python's range, because it was talking past each other.

The range function isn't just a wrapper around range_step, it provides a double-ended iterator intended to be based on predecessors / successors.

I don't know what the intention was when range was first implemented. But you could implement (a bit redundantly) DoubleEndedIterator for RangeStep as well, so I don't really see why that shows any intention to be different than range_step.

Veedrac · 2014-09-05T18:41:37Z

Instead of 1. we could just have range check start != start + one && stop - one != stop on instantiation, which would be fast in most cases. I don't know whether this would optimise out for integers, but if not it should.

thestinger · 2014-09-05T18:50:53Z

It's not going to optimize out for big integers, and might not optimize out for primitive integers.

vks · 2014-09-05T18:56:45Z

Instead of 1. we could just have range check start != start + 1 && stop - 1 != stop on instantiation, which would be fast in most cases. I don't know whether this would optimise out for integers, but if not it should.

This is a good idea. It does not really matter whether it optimizes out, because it is only done once. (You don't call range very often.)

thestinger · 2014-09-05T19:01:59Z

Code size does matter, and making the loop end condition more complex is going to make it harder for LLVM to optimize it down to the same code as the equivalent C loop.

Veedrac · 2014-09-05T20:26:13Z

The loop would be the same; it's just the creation of the range that would have the check. In other words, it would go here: https://github.com/rust-lang/rust/blob/master/src/libcore/iter.rs#L1952.

vks · 2014-09-05T20:39:01Z

I also don't see how that assertion might affect the loop. I could imagine that it leads to problems when nesting loops though.

At the very least the wrong size hint should be fixed (see 4.).

Veedrac · 2014-09-05T21:05:29Z

Note also that the start != start + 1 && stop - 1 != stop check could instead be added to size_hint, in which case it wouldn't return an upper bound.

Personally I prefer putting it in the constructor because I feel it should optimise out in the case that I care about (primitive integers) and the other cases are very unlikely to be bottlenecks. In other words, I want to support:

loop { for _ in range(0i, n) {} } // n small

as construction times could (in theory) matter, but any case with big integers, floats or similar probably won't ever occur in a similar context.

`size_hint` was implicitly assuming that for `t: T` the following holds: t + One::one() != t This is not true, i.e. if `T` is a float. Due to this fact, `range` can return an infinite iterator. However, `size_hint` would always claim the iterator is finite. This commit fixes `size_hint` by not specifying an upper bound. Updates rust-lang#17010.

vks · 2014-09-05T21:24:36Z

A little problem of start != start + 1 && stop - 1 != stop is that it requires an additional trait (Sub).

(That's why I did not use it in my branch that fixes the size hint.)

Veedrac · 2014-09-05T21:54:03Z

You can use start != start + 1 && stop - 1 != stop as long as you don't mind the false positive of stop == 2^mantissa. It's at least more lenient than your current patch, which is a false positive for everything.

But we would then also have the problem where we aren't rounding intelligently, such as with

fn main() {
    println!("{}", range(0f32, 1.5f32).collect::<Vec<f32>>());
    println!("{}", range(0f32, 1.5f32).count());
    println!("{}", range(0f32, 1.5f32).size_hint());
}

giving

[0, 1]
2
(1, Some(1))

so the problem is more systemic than I thought.

Veedrac · 2014-09-05T22:10:05Z

fn main() {
    println!("{}", range(4194303.8f32, 4194305f32).collect::<Vec<f32>>());
    println!("{}", range(4194303.8f32, 4194305f32).count());
    println!("{}", range(4194303.8f32, 4194305f32).size_hint());
}

gives

[4194303.75]
1
(2, Some(2))

Since floats seem to also have an improper minimum size for their hint, I'm starting to doubt whether supporting size_hint at all for floats is a good idea, in which case supporting floating ranges doesn't seem to be a good idea. Which is sad.

vks · 2014-09-06T21:21:17Z

I'm fine with not supporting float ranges (don't think that this is used anyway). Note that for float ranges you often want non-integer step sizes.

For floats you want to use something like numpy's linspace anyway:

>>> linspace(start=2.0, stop=3.0, num=5)
array([ 2.  ,  2.25,  2.5 ,  2.75,  3.  ])

(The point is that it guarantees you to get exactly num numbers.)

Veedrac · 2014-09-07T15:59:39Z

This brings us back to

Implement range for integers only.
Implement range using a successor trait. Do or do not implement that trait for floats. (This has the advantage of being able to define range for chars.)
Do nothing and accept that size_hint may be wrong and the iterator may not terminate for floats. This behavior should be mentioned in the documentation however.

I don't think anyone is in favour doing nothing, but I wouldn't know which of the first two is best. A trait would be possibly more extensible to other types but it might just be pointless overhead if it's never used.

vks · 2014-09-07T16:58:49Z

Naively I would assume that the compiler can optimize away the successor abstraction. If not, I think implementing range for integers only is perfectly reasonable. (Python is doing great with that approach, and for floats people use linspace.)

Veedrac · 2014-09-07T17:02:32Z

Sorry, I meant implementation overhead. If the only things usable in range are integers and characters and nobody iterates over characters there's simply no point adding more methods for people to implement.

thestinger · 2014-09-07T17:03:38Z

A trait for enumerated types is useful beyond range... other languages like Haskell have support for this.

Veedrac · 2014-09-07T23:30:02Z

A successor() method will probably want to return an Option<T> to support small enumerations. Will this check optimize out for the case of range(0, n), or should int just always return Some(n+1)? It's not obvious to me how this would pan out.

vks · 2014-11-07T21:16:04Z

I think we should implement range for integers only. range should not be used for floats, and it is not very clear how range would be defined for unicode characters. If we really think we need it, a successor trait could be implemented later in a backwards-compatible way.

Alternatively, the size hint should only be implemented for integers.

aturon · 2015-01-23T04:37:02Z

range currently is bounded by Int, and it is also likely to be removed in favor of range notation (x .. y) which already produces iterators for the primtive integer types (and can easily be extended with new types like bignums).

…ykril Add `toggle_async_sugar` assist code action Implement code action for sugaring and de-sugaring asynchronous functions. This code action does not import `Future` trait when de-sugaring and does not touch function boby, I guess this can be implemented later if needed. This action also does not take into consideration other bounds because IMO it's usually "let me try to use sugared version here". Feel free to request changes, that's my first code action implementation 😄 Closes rust-lang#17010 Relates to rust-lang#16195

vks mentioned this issue Nov 8, 2014

Unfortunate type bound on Range #10414

Closed

aturon closed this as completed Jan 23, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Range is odd when used with floating point types #17010

Range is odd when used with floating point types #17010

Veedrac commented Sep 5, 2014

thestinger commented Sep 5, 2014

vks commented Sep 5, 2014

thestinger commented Sep 5, 2014

Veedrac commented Sep 5, 2014

vks commented Sep 5, 2014

Veedrac commented Sep 5, 2014

thestinger commented Sep 5, 2014

vks commented Sep 5, 2014

thestinger commented Sep 5, 2014

vks commented Sep 5, 2014

Veedrac commented Sep 5, 2014

thestinger commented Sep 5, 2014

vks commented Sep 5, 2014

thestinger commented Sep 5, 2014

Veedrac commented Sep 5, 2014

vks commented Sep 5, 2014

Veedrac commented Sep 5, 2014

vks commented Sep 5, 2014

Veedrac commented Sep 5, 2014

Veedrac commented Sep 5, 2014

vks commented Sep 6, 2014

Veedrac commented Sep 7, 2014

vks commented Sep 7, 2014

Veedrac commented Sep 7, 2014

thestinger commented Sep 7, 2014

Veedrac commented Sep 7, 2014

vks commented Nov 7, 2014

aturon commented Jan 23, 2015

Range is odd when used with floating point types #17010

Range is odd when used with floating point types #17010

Comments

Veedrac commented Sep 5, 2014

thestinger commented Sep 5, 2014

vks commented Sep 5, 2014

thestinger commented Sep 5, 2014

Veedrac commented Sep 5, 2014

vks commented Sep 5, 2014

Veedrac commented Sep 5, 2014

thestinger commented Sep 5, 2014

vks commented Sep 5, 2014

thestinger commented Sep 5, 2014

vks commented Sep 5, 2014

Veedrac commented Sep 5, 2014

thestinger commented Sep 5, 2014

vks commented Sep 5, 2014

thestinger commented Sep 5, 2014

Veedrac commented Sep 5, 2014

vks commented Sep 5, 2014

Veedrac commented Sep 5, 2014

vks commented Sep 5, 2014

Veedrac commented Sep 5, 2014

Veedrac commented Sep 5, 2014

vks commented Sep 6, 2014

Veedrac commented Sep 7, 2014

vks commented Sep 7, 2014

Veedrac commented Sep 7, 2014

thestinger commented Sep 7, 2014

Veedrac commented Sep 7, 2014

vks commented Nov 7, 2014

aturon commented Jan 23, 2015