Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amend 1192 (RangeInclusive) to use an enum. #1320

Merged
merged 3 commits into from Jan 22, 2016

Conversation

Projects
None yet
10 participants
@Stebalien
Copy link
Contributor

Stebalien commented Oct 13, 2015

This PR proposes that RangeInclusive be an enum with Empty/NonEmpty variants instead of a struct with a finished field:

pub enum RangeInclusive<T> {
    Empty {
        at: T,
    },
    NonEmpty {
        start: T,
        end: T,
    }
}

Rational:

  1. finished is very iterator specific. Regardless of what happens, I think this field should be called empty.
  2. start/end don't make sense if the range is empty. Using an enum prevents users from using the start/end of spent ranges. Basically, this makes it impossible for users to do something like foo(my_range.take(10)); bar(my_range) and forget to check finished in bar.
  3. If we ever get more space optimizations (specifically, utf8 code point ones) 'a'...'z' should be the same size as 'a'..'z'.
  4. Don't have to allocate the next start when the end of the range is reached (slight constant factor gain..., maybe).
Amend 1192 (RangeInclusive) to use an enum.
Rational:

1. The word "finished" is very iterator specific. Really, this field is
trying to indicate that the range is actually empty.

2. `start`/`end` don't make sense if the range is empty. Using an enum
prevents coders from using the `start`/`end` of spent ranges. Basically,
this makes it impossible for the coder to do something like
`foo(my_range.take(10)); bar(my_range)` and forget to check `finished`
in `bar`.

3. If we ever get better enum optimizations (specifically, utf8 code
point ones) `'a'...'z'` should be the same size as `'a'..'z'`; the Empty
variant can be encoded as an invalid code point.
@huonw

This comment has been minimized.

Copy link
Member

huonw commented Oct 28, 2015

Hm, I wonder if there's any use for storing the end point, i.e. Empty { end: T } (or even Empty { start: T, end :T }). This at least respects ownership better, in that you don't lose the Ts.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Oct 29, 2015

To expand a little/respond to your rational:

  1. empty is definitely nicer/less opinionated than finished

  2. people are still prevented from dismissing the empty case even if it has fields, and I think it doens't not make no sense, e.g. 0...1 represents decreasing segments as you walk through it:

    [ 0 1 ] 2 3 ..
    0 [ 1 ] 2 3 ...
    0 1 [ ] 2 3 ...
    

    Once at that point, one can expand it in either direction, e.g. 0 1 [ 2 3 ] ....

  3. the space optimisations don't seem imperative for this type: I imagine it will generally either be transient, or, if it is being stored, the Empty case is important (e.g. being stored and used as an iterator). (And, it's easy to store (T, T) instead, if the space is important.)

@Stebalien

This comment has been minimized.

Copy link
Contributor Author

Stebalien commented Oct 29, 2015

I agree that keeping the end is useful but I don't really get the ownership argument for keeping both.

@bluss

This comment has been minimized.

Copy link

bluss commented Nov 3, 2015

I think we should back out of RangeInclusive. We need fully general ranges instead which are open/closed/unbounded of both ends.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Nov 3, 2015

@Stebalien it can be expensive to create/copy/destroy values in Rust (this is in contrast to managed languages, where copying is often just copying a single pointer, or at most manipulating some reference counts), so minimising how often this happens implicitly/without programmer control is good. E.g. if you have a RangeInclusive<BigInt>, it is unfortunate to destroy one end-point when you get to Empty and then have to copy the other if the range is extended.

@bluss I agree that having fully general ranges would be nice, but there has been a lot of discussion/thinking about it and AFAIK there has been no nice (and backwards compatible) syntax raised.

@bluss

This comment has been minimized.

Copy link

bluss commented Nov 3, 2015

Syntax is a second-order issue for me, which types we want to introduce into the core of rust is much more important.

@Stebalien

This comment has been minimized.

Copy link
Contributor Author

Stebalien commented Nov 3, 2015

@huonw std::iter::Step already allocates new objects so you'd have to change that interface to make storing both endpoints useful.

We need fully general ranges instead which are open/closed/unbounded of both ends.

I (obviously) agree. However, it may be a bit late for that given that we're stuck with Range as-is but I'd love to hear proposals.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Nov 3, 2015

I think std::iter::Step essentially creates as few extra objects as it can, i.e. it only duplicates things when it absolutely has to, in order satisfy the type signatures of the traits it is used in (i.e. Iterator::next cannot return references pointing into the Range itself), whereas dropping/reduplicating an object in this case isn't necessary.

@Stebalien

This comment has been minimized.

Copy link
Contributor Author

Stebalien commented Nov 3, 2015

Scratch that, I thought Step was used for iterating in general. Not storing start actually saves an allocation because you don't have to allocate a new one when transitioning from NonEmpty to Empty.

@bluss

This comment has been minimized.

Copy link

bluss commented Nov 3, 2015

I (obviously) agree. However, it may be a bit late for that given that we're stuck with Range as-is but I'd love to hear proposals.

Range<T> is just a struct of two values, what the endpoints mean can be changed by convention (without breaking existing practice). Among the possibilities is to let a...b or another inclusive range syntax produce Range { start: Bound::Include(a), end: Bound::Exclude(b) }.

Range<usize> has a pretty fixed interpretation as it is now, but Range<Bound<usize>> need not have. These are just rather loose ideas.

@Stebalien

This comment has been minimized.

Copy link
Contributor Author

Stebalien commented Nov 13, 2015

@huonw, I've updated the proposal to avoid throwing away the endpoint. I do only need to keep one because, on the last iteration, I can just return the start as-is without advancing it leaving me with the end only.

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Nov 18, 2015

I can just return the start as-is without advancing it leaving me with the end only.

Oh, that's a good point.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 15, 2016

Hear ye, hear ye! This RFC is now entering final comment period.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 15, 2016

I am 👍 on this RFC but I would request that the author add a "# History" or "# Amendments" section and simply note the change that is occurring. It's nice when reading the text of an RFC if you don't have to consult the git history to know when it was updated. For example:

# Amendments

- In rust-lang/rfcs#1320, this RFC was amended to change the `RangeInclusive` type from a struct with a `finished` field to an enum.
@bluss

This comment has been minimized.

Copy link

bluss commented Jan 15, 2016

Where did the discussion of more complete range syntax go? Mentioned in #1254

@huonw

This comment has been minimized.

Copy link
Member

huonw commented Jan 15, 2016

That topic seems more relevant for stabilisation of the actual ... sugar, orthogonal to this particular RFC which is mostly an implementation detail (i.e. this RFC is an improvement for the ... sugar, even if we don't end up going with it in the end).

(Maybe you're thinking of https://internals.rust-lang.org/t/vs-for-inclusive-ranges/1539 ?)

@bluss

This comment has been minimized.

Copy link

bluss commented Jan 15, 2016

I'm not thinking of that, that's an old discussion. #1254 was much more recent, indicating aturon and lang team wanted to look at this.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Jan 19, 2016

@bluss Yes, I am on the same page: I want to back out ... an instead settle on a general syntax that can cover inclusive/exclusive on both sides, just like mathematical range notation. I just haven't had time to write an RFC for it. (I'd be happy to collaborate on an RFC if you're up for that...)

However, as @huonw said, this RFC is just an amendment to the previous one, so it's somewhat orthogonal to the larger question.

@aturon

This comment has been minimized.

Copy link
Member

aturon commented Jan 20, 2016

Libs team consensus: this is a clear improvement over the original RFC.

@@ -37,12 +41,11 @@ pub struct RangeToInclusive<T> {
}
```

Writing `a...b` in an expression desugars to `std::ops::RangeInclusive
{ start: a, end: b, finished: false }`. Writing `...b` in an
Writing `a...b` in an expression desugars to `std::ops::RangeInclusive::NonEmpty { start: a, end: b }`. Writing `...b` in an
expression desugars to `std::ops::RangeToInclusive { end: b }`.

This comment has been minimized.

@liigo

liigo Jan 21, 2016

Contributor

what?

This comment has been minimized.

@liigo

liigo Jan 21, 2016

Contributor

oops. soorry!

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 22, 2016

Huzzah! The lang team has decided to accept this RFC.

@nikomatsakis

This comment has been minimized.

Copy link
Contributor

nikomatsakis commented Jan 22, 2016

I kept the same tracking issue, but added a "to do" item to implement the changes suggested here.

@nikomatsakis nikomatsakis merged commit 2025389 into rust-lang:master Jan 22, 2016

@durka

This comment has been minimized.

Copy link
Contributor

durka commented Jan 26, 2016

Implemented in (in progress) rust-lang/rust#30884.

@eddyb

This comment has been minimized.

Copy link
Member

eddyb commented Apr 5, 2016

I wonder whether, from a performance perspective, how does the enum approach compare to using x+1...x as the empty state and having a MAX...MAX-1 special case when x == MAX?

I can't tell if this was suggested at some point, but even if it wasn't, I doubt the implementation will change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.