Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Amend 1192 (RangeInclusive) to use an enum. #1320

Merged
merged 3 commits into from
Jan 22, 2016

Conversation

Stebalien
Copy link
Contributor

This PR proposes that RangeInclusive be an enum with Empty/NonEmpty variants instead of a struct with a finished field:

pub enum RangeInclusive<T> {
    Empty {
        at: T,
    },
    NonEmpty {
        start: T,
        end: T,
    }
}

Rational:

  1. finished is very iterator specific. Regardless of what happens, I think this field should be called empty.
  2. start/end don't make sense if the range is empty. Using an enum prevents users from using the start/end of spent ranges. Basically, this makes it impossible for users to do something like foo(my_range.take(10)); bar(my_range) and forget to check finished in bar.
  3. If we ever get more space optimizations (specifically, utf8 code point ones) 'a'...'z' should be the same size as 'a'..'z'.
  4. Don't have to allocate the next start when the end of the range is reached (slight constant factor gain..., maybe).

Rational:

1. The word "finished" is very iterator specific. Really, this field is
trying to indicate that the range is actually empty.

2. `start`/`end` don't make sense if the range is empty. Using an enum
prevents coders from using the `start`/`end` of spent ranges. Basically,
this makes it impossible for the coder to do something like
`foo(my_range.take(10)); bar(my_range)` and forget to check `finished`
in `bar`.

3. If we ever get better enum optimizations (specifically, utf8 code
point ones) `'a'...'z'` should be the same size as `'a'..'z'`; the Empty
variant can be encoded as an invalid code point.
@nrc nrc added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Oct 15, 2015
@huonw
Copy link
Member

huonw commented Oct 28, 2015

Hm, I wonder if there's any use for storing the end point, i.e. Empty { end: T } (or even Empty { start: T, end :T }). This at least respects ownership better, in that you don't lose the Ts.

@huonw
Copy link
Member

huonw commented Oct 29, 2015

To expand a little/respond to your rational:

  1. empty is definitely nicer/less opinionated than finished

  2. people are still prevented from dismissing the empty case even if it has fields, and I think it doens't not make no sense, e.g. 0...1 represents decreasing segments as you walk through it:

    [ 0 1 ] 2 3 ..
    0 [ 1 ] 2 3 ...
    0 1 [ ] 2 3 ...
    

    Once at that point, one can expand it in either direction, e.g. 0 1 [ 2 3 ] ....

  3. the space optimisations don't seem imperative for this type: I imagine it will generally either be transient, or, if it is being stored, the Empty case is important (e.g. being stored and used as an iterator). (And, it's easy to store (T, T) instead, if the space is important.)

@Stebalien
Copy link
Contributor Author

I agree that keeping the end is useful but I don't really get the ownership argument for keeping both.

@bluss
Copy link
Member

bluss commented Nov 3, 2015

I think we should back out of RangeInclusive. We need fully general ranges instead which are open/closed/unbounded of both ends.

@huonw
Copy link
Member

huonw commented Nov 3, 2015

@Stebalien it can be expensive to create/copy/destroy values in Rust (this is in contrast to managed languages, where copying is often just copying a single pointer, or at most manipulating some reference counts), so minimising how often this happens implicitly/without programmer control is good. E.g. if you have a RangeInclusive<BigInt>, it is unfortunate to destroy one end-point when you get to Empty and then have to copy the other if the range is extended.

@bluss I agree that having fully general ranges would be nice, but there has been a lot of discussion/thinking about it and AFAIK there has been no nice (and backwards compatible) syntax raised.

@bluss
Copy link
Member

bluss commented Nov 3, 2015

Syntax is a second-order issue for me, which types we want to introduce into the core of rust is much more important.

@Stebalien
Copy link
Contributor Author

@huonw std::iter::Step already allocates new objects so you'd have to change that interface to make storing both endpoints useful.

We need fully general ranges instead which are open/closed/unbounded of both ends.

I (obviously) agree. However, it may be a bit late for that given that we're stuck with Range as-is but I'd love to hear proposals.

@huonw
Copy link
Member

huonw commented Nov 3, 2015

I think std::iter::Step essentially creates as few extra objects as it can, i.e. it only duplicates things when it absolutely has to, in order satisfy the type signatures of the traits it is used in (i.e. Iterator::next cannot return references pointing into the Range itself), whereas dropping/reduplicating an object in this case isn't necessary.

@Stebalien
Copy link
Contributor Author

Scratch that, I thought Step was used for iterating in general. Not storing start actually saves an allocation because you don't have to allocate a new one when transitioning from NonEmpty to Empty.

@bluss
Copy link
Member

bluss commented Nov 3, 2015

I (obviously) agree. However, it may be a bit late for that given that we're stuck with Range as-is but I'd love to hear proposals.

Range<T> is just a struct of two values, what the endpoints mean can be changed by convention (without breaking existing practice). Among the possibilities is to let a...b or another inclusive range syntax produce Range { start: Bound::Include(a), end: Bound::Exclude(b) }.

Range<usize> has a pretty fixed interpretation as it is now, but Range<Bound<usize>> need not have. These are just rather loose ideas.

@Stebalien
Copy link
Contributor Author

@huonw, I've updated the proposal to avoid throwing away the endpoint. I do only need to keep one because, on the last iteration, I can just return the start as-is without advancing it leaving me with the end only.

@huonw
Copy link
Member

huonw commented Nov 18, 2015

I can just return the start as-is without advancing it leaving me with the end only.

Oh, that's a good point.

@huonw huonw added T-lang Relevant to the language team, which will review and decide on the RFC. I-nominated labels Jan 14, 2016
@nikomatsakis
Copy link
Contributor

Hear ye, hear ye! This RFC is now entering final comment period.

@nikomatsakis nikomatsakis added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed I-nominated labels Jan 15, 2016
@nikomatsakis
Copy link
Contributor

I am 👍 on this RFC but I would request that the author add a "# History" or "# Amendments" section and simply note the change that is occurring. It's nice when reading the text of an RFC if you don't have to consult the git history to know when it was updated. For example:

# Amendments

- In rust-lang/rfcs#1320, this RFC was amended to change the `RangeInclusive` type from a struct with a `finished` field to an enum.

@bluss
Copy link
Member

bluss commented Jan 15, 2016

Where did the discussion of more complete range syntax go? Mentioned in #1254

@huonw
Copy link
Member

huonw commented Jan 15, 2016

That topic seems more relevant for stabilisation of the actual ... sugar, orthogonal to this particular RFC which is mostly an implementation detail (i.e. this RFC is an improvement for the ... sugar, even if we don't end up going with it in the end).

(Maybe you're thinking of https://internals.rust-lang.org/t/vs-for-inclusive-ranges/1539 ?)

@bluss
Copy link
Member

bluss commented Jan 15, 2016

I'm not thinking of that, that's an old discussion. #1254 was much more recent, indicating aturon and lang team wanted to look at this.

@aturon
Copy link
Member

aturon commented Jan 19, 2016

@bluss Yes, I am on the same page: I want to back out ... an instead settle on a general syntax that can cover inclusive/exclusive on both sides, just like mathematical range notation. I just haven't had time to write an RFC for it. (I'd be happy to collaborate on an RFC if you're up for that...)

However, as @huonw said, this RFC is just an amendment to the previous one, so it's somewhat orthogonal to the larger question.

@aturon
Copy link
Member

aturon commented Jan 20, 2016

Libs team consensus: this is a clear improvement over the original RFC.

@@ -37,12 +41,11 @@ pub struct RangeToInclusive<T> {
}
```

Writing `a...b` in an expression desugars to `std::ops::RangeInclusive
{ start: a, end: b, finished: false }`. Writing `...b` in an
Writing `a...b` in an expression desugars to `std::ops::RangeInclusive::NonEmpty { start: a, end: b }`. Writing `...b` in an
expression desugars to `std::ops::RangeToInclusive { end: b }`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops. soorry!

@nikomatsakis
Copy link
Contributor

Huzzah! The lang team has decided to accept this RFC.

@nikomatsakis
Copy link
Contributor

I kept the same tracking issue, but added a "to do" item to implement the changes suggested here.

@nikomatsakis nikomatsakis merged commit 2025389 into rust-lang:master Jan 22, 2016
@durka
Copy link
Contributor

durka commented Jan 26, 2016

Implemented in (in progress) rust-lang/rust#30884.

@eddyb
Copy link
Member

eddyb commented Apr 5, 2016

I wonder whether, from a performance perspective, how does the enum approach compare to using x+1...x as the empty state and having a MAX...MAX-1 special case when x == MAX?

I can't tell if this was suggested at some point, but even if it wasn't, I doubt the implementation will change.

@Centril Centril added the A-ranges Proposals relating to ranges. label Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ranges Proposals relating to ranges. final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants