Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Range is odd when used with floating point types #17010

Closed
Veedrac opened this issue Sep 5, 2014 · 28 comments
Closed

Range is odd when used with floating point types #17010

Veedrac opened this issue Sep 5, 2014 · 28 comments

Comments

@Veedrac
Copy link
Contributor

Veedrac commented Sep 5, 2014

For one,

fn main() {
    // The .take is needed to make this terminate
    let mut bad_range = range(0f32, 1e8).take(1000_000_000);

    let (min, max) = bad_range.size_hint();
    let max = max.unwrap();

    println!(
        "min: {:e} ≤ length: {:e} ≤ max: {:e}",
        min               .to_f64().unwrap(),
        bad_range.count() .to_f64().unwrap(),
        max               .to_f64().unwrap()
    );
}

outputs min: 1e8 ≤ length: 1e9 ≤ max: 1e8.

There's no real decision about what to do when rounding errors means r.next() == r.next(), such as with:

fn main() {
    let mut bad_range = range(1e8f32, 1e8+8.0);
    println!("{}", bad_range.next() == bad_range.next());
}

which currently results in an infinite iterator.

Personally I don't see a big problem with it, but unless floating ranges are banned it should at least avoid breaking size_hint.

Edit:

Further, rounding currently means that the size hints can be wrong. Both the upper bound:

fn main() {
    println!("{}", range(0f32, 1.5f32).count()); // 2
    println!("{}", range(0f32, 1.5f32).size_hint()); // (1, Some(1))
}

and the lower bound:

fn main() {
    println!("{}", range(4194303.8f32, 4194305f32).count()); // 1
    println!("{}", range(4194303.8f32, 4194305f32).size_hint()); // (2, Some(2))
}
@thestinger
Copy link
Contributor

It should just be ported to used a trait with successor / predecessor instead of addition. This would also allow it to work with char despite it not being an arithmetic type. The only thing that would make sense with floats is going to the next possible value.

@vks
Copy link
Contributor

vks commented Sep 5, 2014

The only thing that would make sense with floats is going to the next possible value.

I disagree. I think this is almost never what you want. Usually you want some finite step size.

Using multiplication instead of addition fixes the rounding errors.

@thestinger
Copy link
Contributor

@vks: It doesn't matter if it's almost never what you want for floating point types, it's the sane definition of the basic generic range function. There's a range_step function if you want to specify a different step than simply the next value. It should work for all enumerated types, including ones that aren't numbers. It doesn't make sense to define it based on increments of the number one.

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 5, 2014

It's worth noting that there aren't any rounding error per se; multiplication would only help as long as you were using a datatype that could hold the required precision. Consider instead 1e300; no normal-sized integer will help you with that.

And then instead of multiplication you might as well just cast to the float.

@vks
Copy link
Contributor

vks commented Sep 5, 2014

@thestinger That is not what range currently means. It is equivalent to range_step with step size 1. I think what you are proposing should be called differently, because it does not make sense for real numbers (you cannot iterate over them).

@Veedrac I'm not sure I understand. How is 1e300 related to 'normal-sized integers'? What do you mean by 'casting to the float'?

I still think multiplication would solve all the issues you mentioned, because it would yield exactly the expected number of elements (even if some are identical do to lacking precision).

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 5, 2014

@vks So you have X * 1.0f64, where X == 1e300. What's the type of X? If it's integer, it has either overflowed or is big and thus slow. It it's a floating point, how are you incrementing it?

@thestinger
Copy link
Contributor

@vks: The only reason that range currently uses a step size of 1 is because a trait for successors / predecessors didn't exist when it was originally defined. It was always intended to be updated to use a trait like that. Anyway, the implementation of range has to be the same for all types because that's how Rust's generics work. What kind of trait do you propose using instead?

@vks
Copy link
Contributor

vks commented Sep 5, 2014

@Veedrac I think I understand your point now. The problem seems to be the implementation of size_hint: if you remove it, the code works fine.

@thestinger

It was always intended to be updated to use a trait like that.

Really? That surprises me.

Anyway, the implementation of range has to be the same for all types because that's how Rust's generics work.

Currently it is only defined for types that implement Add, PartialOrd, Clone, One and ToPrimitive (the latter is only needed for size_hint). Why does it have to be implemented for all types?

I'm not convinced that it should be generalized to types that don't implement Add and One. Python does not do that either.

If it is decided to implement it using some kind of successor trait, then I think it should not be implemented for floating point numbers (unless the "successor" of a float x is defined as x + 1).

@thestinger
Copy link
Contributor

Python only has an equivalent to range_step, not range. The range function isn't just a wrapper around range_step, it provides a double-ended iterator intended to be based on predecessors / successors. It uses One and Add as a mostly working hack because the necessary trait didn't exist.

@vks
Copy link
Contributor

vks commented Sep 5, 2014

About how to fix this issue:
I think there are five possibilities:

  1. Add an assertion so that range fails if adding one does nothing. (This may have a huge performance impact.)
  2. Implement range for integers only.
  3. Implement range using a successor trait. Do or do not implement that trait for floats. (This has the advantage of being able to define range for chars.)
  4. Do nothing and accept that size_hint may be wrong and the iterator may not terminate for floats. This behavior should be mentioned in the documentation however.
  5. Make size_hint not return an upper bound. (This only fixes the size hint. It would be more correct, because currently the code implicitly assumes x + One::one() != x.)

(I would favor 2. or 3.)

Edit: @thestinger I removed the paragraph about Python's range, because it was talking past each other.

The range function isn't just a wrapper around range_step, it provides a double-ended iterator intended to be based on predecessors / successors.

I don't know what the intention was when range was first implemented. But you could implement (a bit redundantly) DoubleEndedIterator for RangeStep as well, so I don't really see why that shows any intention to be different than range_step.

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 5, 2014

Instead of 1. we could just have range check start != start + one && stop - one != stop on instantiation, which would be fast in most cases. I don't know whether this would optimise out for integers, but if not it should.

@thestinger
Copy link
Contributor

It's not going to optimize out for big integers, and might not optimize out for primitive integers.

@vks
Copy link
Contributor

vks commented Sep 5, 2014

Instead of 1. we could just have range check start != start + 1 && stop - 1 != stop on instantiation, which would be fast in most cases. I don't know whether this would optimise out for integers, but if not it should.

This is a good idea. It does not really matter whether it optimizes out, because it is only done once. (You don't call range very often.)

@thestinger
Copy link
Contributor

Code size does matter, and making the loop end condition more complex is going to make it harder for LLVM to optimize it down to the same code as the equivalent C loop.

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 5, 2014

The loop would be the same; it's just the creation of the range that would have the check. In other words, it would go here: https://github.com/rust-lang/rust/blob/master/src/libcore/iter.rs#L1952.

@vks
Copy link
Contributor

vks commented Sep 5, 2014

I also don't see how that assertion might affect the loop. I could imagine that it leads to problems when nesting loops though.

At the very least the wrong size hint should be fixed (see 4.).

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 5, 2014

Note also that the start != start + 1 && stop - 1 != stop check could instead be added to size_hint, in which case it wouldn't return an upper bound.

Personally I prefer putting it in the constructor because I feel it should optimise out in the case that I care about (primitive integers) and the other cases are very unlikely to be bottlenecks. In other words, I want to support:

loop { for _ in range(0i, n) {} } // n small

as construction times could (in theory) matter, but any case with big integers, floats or similar probably won't ever occur in a similar context.

vks added a commit to vks/rust that referenced this issue Sep 5, 2014
`size_hint` was implicitly assuming that for `t: T` the following holds:

    t + One::one() != t

This is not true, i.e. if `T` is a float.
Due to this fact, `range` can return an infinite iterator. However,
`size_hint` would always claim the iterator is finite.

This commit fixes `size_hint` by not specifying an upper bound.

Updates rust-lang#17010.
@vks
Copy link
Contributor

vks commented Sep 5, 2014

A little problem of start != start + 1 && stop - 1 != stop is that it requires an additional trait (Sub).

(That's why I did not use it in my branch that fixes the size hint.)

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 5, 2014

You can use start != start + 1 && stop - 1 != stop as long as you don't mind the false positive of stop == 2^mantissa. It's at least more lenient than your current patch, which is a false positive for everything.

But we would then also have the problem where we aren't rounding intelligently, such as with

fn main() {
    println!("{}", range(0f32, 1.5f32).collect::<Vec<f32>>());
    println!("{}", range(0f32, 1.5f32).count());
    println!("{}", range(0f32, 1.5f32).size_hint());
}

giving

[0, 1]
2
(1, Some(1))

so the problem is more systemic than I thought.

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 5, 2014

fn main() {
    println!("{}", range(4194303.8f32, 4194305f32).collect::<Vec<f32>>());
    println!("{}", range(4194303.8f32, 4194305f32).count());
    println!("{}", range(4194303.8f32, 4194305f32).size_hint());
}

gives

[4194303.75]
1
(2, Some(2))

Since floats seem to also have an improper minimum size for their hint, I'm starting to doubt whether supporting size_hint at all for floats is a good idea, in which case supporting floating ranges doesn't seem to be a good idea. Which is sad.

@vks
Copy link
Contributor

vks commented Sep 6, 2014

I'm fine with not supporting float ranges (don't think that this is used anyway). Note that for float ranges you often want non-integer step sizes.

For floats you want to use something like numpy's linspace anyway:

>>> linspace(start=2.0, stop=3.0, num=5)
array([ 2.  ,  2.25,  2.5 ,  2.75,  3.  ])

(The point is that it guarantees you to get exactly num numbers.)

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 7, 2014

This brings us back to

  • Implement range for integers only.
  • Implement range using a successor trait. Do or do not implement that trait for floats. (This has the advantage of being able to define range for chars.)
  • Do nothing and accept that size_hint may be wrong and the iterator may not terminate for floats. This behavior should be mentioned in the documentation however.

I don't think anyone is in favour doing nothing, but I wouldn't know which of the first two is best. A trait would be possibly more extensible to other types but it might just be pointless overhead if it's never used.

@vks
Copy link
Contributor

vks commented Sep 7, 2014

Naively I would assume that the compiler can optimize away the successor abstraction. If not, I think implementing range for integers only is perfectly reasonable. (Python is doing great with that approach, and for floats people use linspace.)

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 7, 2014

Sorry, I meant implementation overhead. If the only things usable in range are integers and characters and nobody iterates over characters there's simply no point adding more methods for people to implement.

@thestinger
Copy link
Contributor

A trait for enumerated types is useful beyond range... other languages like Haskell have support for this.

@Veedrac
Copy link
Contributor Author

Veedrac commented Sep 7, 2014

A successor() method will probably want to return an Option<T> to support small enumerations. Will this check optimize out for the case of range(0, n), or should int just always return Some(n+1)? It's not obvious to me how this would pan out.

@vks
Copy link
Contributor

vks commented Nov 7, 2014

I think we should implement range for integers only. range should not be used for floats, and it is not very clear how range would be defined for unicode characters. If we really think we need it, a successor trait could be implemented later in a backwards-compatible way.

Alternatively, the size hint should only be implemented for integers.

@aturon
Copy link
Member

aturon commented Jan 23, 2015

range currently is bounded by Int, and it is also likely to be removed in favor of range notation (x .. y) which already produces iterators for the primtive integer types (and can easily be extended with new types like bignums).

@aturon aturon closed this as completed Jan 23, 2015
lnicola pushed a commit to lnicola/rust that referenced this issue Jun 2, 2024
…ykril

Add `toggle_async_sugar` assist code action

Implement code action for sugaring and de-sugaring asynchronous functions.

This code action does not import `Future` trait when de-sugaring and does not touch function boby, I guess this can be implemented later if needed. This action also does not take into consideration other bounds because IMO it's usually "let me try to use sugared version here".

Feel free to request changes, that's my first code action implementation 😄

Closes rust-lang#17010
Relates to rust-lang#16195
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants