Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: overloaded slice notation #198

Merged
merged 0 commits into from Sep 11, 2014

Conversation

Projects
None yet
@aturon
Copy link
Member

aturon commented Aug 14, 2014

This RFC adds overloaded slice notation:

  • foo[] for foo.as_slice()
  • foo[n..m] for foo.slice(n, m)
  • foo[n..] for foo.slice_from(n)
  • foo[..m] for foo.slice_to(m)
  • mut variants of all the above

via two new traits, Slice and SliceMut.

It also changes the notation for range match patterns to ..., to
signify that they are inclusive whereas .. in slices are exclusive.

Rendered

Active: https://github.com/rust-lang/rfcs/blob/master/text/0198-slice-notation.md

@metajack

This comment has been minimized.

Copy link

metajack commented Aug 14, 2014

I like this. +1

@metajack

This comment has been minimized.

Copy link

metajack commented Aug 14, 2014

It occurs to me that other languages tend to use foo[:] for the as_slice part.

@steveklabnik

This comment has been minimized.

Copy link
Member

steveklabnik commented Aug 14, 2014

Super 👍 here. I'm not sure I'm aware of precedent either way for [] vs [:], but I don't mind either syntax. foo[..] is another variant that might be more symmetric?

Also, in Ruby, .. vs ... is a question of inclusive vs exclusive, is that worth having?

@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 14, 2014

I think foo[:] is used in Python, where : is used like .. here.

I agree that [..] would be reasonable, but since as_slice is so very common, IMO the shorter the better (and I think [] is reasonably clear).

I think inclusive/exclusive is probably overkill here.

@metajack

This comment has been minimized.

Copy link

metajack commented Aug 14, 2014

Python uses [:], but so does Julia, Matlab, and probably a few others. I'm fine with [] I think, I just thought I'd bring up what other syntaxes exist in case it hadn't been considered.

@Valloric

This comment has been minimized.

Copy link

Valloric commented Aug 14, 2014

+1. Would love to see this!


```rust
trait Slice<Idx, S> {
fn as_slice<'a>(&'a self) -> &'a S;

This comment has been minimized.

@huonw

huonw Aug 14, 2014

Member

AFAICT, returning a reference like this limits the trait to contiguous-memory slices, e.g. it wouldn't be possible to write x[n..m] on a RingBuf, or rope, or a hypothetical StridedSlice<'a, T> { data: &'a [T], step: uint } type that represents every stepth element of data (i.e. strided[i] == data[i * step]).

I think this may mean it is only possible to use with [T]/str and Vec<T>/String.

This comment has been minimized.

@aturon

aturon Aug 14, 2014

Author Member

It should be possible to use this with other unsized types as well (i.e., structs with embedded [T] fields).

There are two ways to get the kind of generality you have in mind:

  • HKT. In that case, the type "S" here could instead be a lifetime-indexed family of types, so you could use StridedSlice for example.
  • Lifetime lifting. That is, rather than have the trait methods take &self, have them take self and impl the trait on reference types. (That's more generally a way to encode HKTs.) Problem is, auto-deref and auto-borrowing for method receivers doesn't work for this kind of case: if foo: Vec<T>, you'd have to write (&foo)[] to get a full slice, for example. You can hack around this in various ways, but it quickly gets pretty ugly.

I'm not sure if there's a design that would allow us to smoothly transition to an HKT type parameter later on. Worth thinking about.

This comment has been minimized.

@huonw

huonw Aug 14, 2014

Member

It should be possible to use this with other unsized types as well (i.e., structs with embedded [T] fields).

How many such types will we have for which slicing makes sense? I can only think of minor variations on the 4 types above e.g. different allocators (which Vec<T, Allocator> would solve anyway) and interned strings ala libsyntax's InternedString.

(My point here is this RFC isn't particularly flexible; it's really nice for the places it does work.)

This comment has been minimized.

@brendanzab

brendanzab Aug 14, 2014

Member

Add this to the list of things that HKT will allow us ;)

This comment has been minimized.

@SiegeLord

SiegeLord Aug 14, 2014

I don't understand exactly why HKT is invoked here. I just did some testing with these traits returning just S and everything seemed to work fine.

Again, I am going to protest this RFC just like the Index traits one because it does not support lazy evaluation, which is critical to making efficient mathematical libraries. In particular, it seems clear that if this specific design is kept, it will need to be supplemented by SliceSet and SliceGet like Index/IndexMut should be.

This comment has been minimized.

@aturon

aturon Aug 14, 2014

Author Member

@SiegeLord The traits as given work fine. My point about HKT is that you'd need it to make the traits general enough to do what @huonw was describing.

This comment has been minimized.

@aturon

aturon Aug 14, 2014

Author Member

@SiegeLord I wasn't around when the Index traits were designed. Can you either point me to that earlier discussion, or explain your concerns in more detail?

This comment has been minimized.

@brendanzab

brendanzab Aug 19, 2014

Member

@SiegeLord were you referring to #111?


# Alternatives

For improving the ergonomics of `as_slice`, there are two main alternatives.

This comment has been minimized.

@alexcrichton

alexcrichton Aug 14, 2014

Member

All of the methods on slices currently come through traits, so another possible alternative would be implementing the traits for String and Vec directly? I would expect all methods would be default in terms of as_slice(), so the trait implementations for Vec and String wouldn't be too lengthy.

This would only solve half the problem, though. Calling methods would be much nicer, but passing arguments would still require a manual .as_slice(). I think I prefer slicing syntax over this, but it may be worth considering as an alternative.

This comment has been minimized.

@aturon

aturon Aug 14, 2014

Author Member

Yes, my hope is that we do not implement Deref on e.g. Vec, and instead add this slicing notation and implement the slice traits directly on Vec as you're suggesting.

But the ergonomics issue I'm most worried about is the need for as_slice in arguments.

This comment has been minimized.

@alexcrichton

alexcrichton Aug 14, 2014

Member

Sounds good to me!

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Aug 14, 2014

I would expect a common question about this syntax to be "Why did you choose .. over :?", or conversely "Why did you choose : over ..?". I'm curious, but what led you to ..? It may be worth adding some rationale for the choice of syntax (whatever it ends up being).

Of the other two bits of syntax introduced, I would predict fewer questions. I would expect that the usage of brackets to be unanimously expected as brackets convey "slice" in rust right now. I think it's definitely necessary to have a mutable slicing variant, and I think that the syntax you've proposed is likely the best one available.

@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 14, 2014

@alexcrichton Updated with rationale.

The `as_slice` operator is particularly important. Since we've moved away from
auto-slicing in coercions, explicit `as_slice` calls have become extremely
common, and are one of the
[leading ergonomic/first impression](https://github.com/rust-lang/rust/issues/14983)

This comment has been minimized.

@huonw

huonw Aug 14, 2014

Member

I'm not 100% sure that e.g. foo(some_string[]); is better from a first impressions/beginners perspective, it's rather cryptic/hard-to-google (sigil like?) syntax, unlike foo(some_string.as_slice()) or just foo(some_string).

severely limiting the ability of programmers to usefully implement new traits
for `Vec`.

* The idea of `Vec` as a smart pointer around a slice, and the use of `&*v` as

This comment has been minimized.

@huonw

huonw Aug 14, 2014

Member

I personally rather like it... (Something like: a Vec<T> is just a DST sequence type with two words of metadata rather than one like &[T], with Vec being the pointer, like Box.)

This comment has been minimized.

@aturon

aturon Aug 14, 2014

Author Member

I'm not necessarily opposed, but as I tried to explain in the RFC, there may be very practical reasons not to implement Deref on Vec<T>, depending on our final resolution rules.

My point is just that if we can solve the ergonomic issues without tying it to Deref, we'll have more freedom for the design, even if we do ultimately go the Deref route.

@brendanzab

This comment has been minimized.

Copy link
Member

brendanzab commented Aug 14, 2014

Awesome, looks good! I prefer vec[..] for vec.as_slice() because it is more consistent, but it is a minor thing.

@brendanzab

This comment has been minimized.

Copy link
Member

brendanzab commented Aug 14, 2014

Could you include a mention of what standard library types will implement these traits?

@liigo

This comment has been minimized.

Copy link
Contributor

liigo commented Aug 14, 2014

I prefer .. range syntax, while : should always used as name:type.


*Mutable slicing*

- `foo[mut]` for `foo.as_mut_slice()`

This comment has been minimized.

@SiegeLord

SiegeLord Aug 14, 2014

This puzzles me. Why doesn't indexing use a[mut 0] for this same effect? Is there a between-the-lines proposal here to make IndexMut also require a mut annotation? If not, why does this need an annotation and IndexMut not?

@bluss

This comment has been minimized.

Copy link

bluss commented Aug 14, 2014

.. is good, but let me note that match does use it for inclusive ranges, which is a bit of an allround inconsistency. i.e. match ch { 'a' .. 'z' => {} }

Obligatory Python-inspired question, what about multiple dimensions and strides? Python has xs[:, 1:3, :], does it make sense for Rust to mimic that with xs[.., 1..3, ..] for the same purpose? (With this example, [..] is the winner over [] for the full slice syntax.)

And a strided slice would be less efficient and not implementable as &[T]. Python's xs[a:b:c] could be maybe xs[a..b..c] in Rust.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Aug 14, 2014

I have always wanted this sugar, but I always was afraid that it'd be implemented in this very restricted fashion, making it primarily useful for built-in types and Vec. Personally I'd prefer for this not to be if it's so restricted, and rather for macros to be enhanced to support something like the foo.s![1..2] syntax, which would convinently take care of all the stride/multiple dimension queries.

A reference for the multi-dimenstional slicing is D's solution, which might be doable in Rust: http://dlang.org/operatoroverloading#Slice

@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord It would be helpful if you could spell out the restrictions you're worried about here, and what you'd like to see instead. (Or if you've written about this previously, please post a link.)

@nikomatsakis nikomatsakis referenced this pull request Aug 14, 2014

Closed

Syntax for slices #13

@lilyball

This comment has been minimized.

Copy link
Contributor

lilyball commented Aug 14, 2014

👍

@bluss I agree that this is mildly inconsistent with match, but really it's match that's doing it wrong, .. is traditionally a half-open range.

@SiegeLord The support here should not interfere with your ability to write foo.s![1..2] if we ever get method macros. And I too want to know what restrictions you're talking about. I can't imagine how slicing is ever supposed to work on e.g. a RingBuf, unless you really just want a convenient syntax to produce iterators.

@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 14, 2014

@huonw A quick note regarding the more general version: I talked with @nikomatsakis about it this morning, and it should be possible to add a SliceHKT trait later with a more general (HKT) return type. We would provide a blanket impl of SliceHKT for anything implementing Slice, and switch the notation to use SliceHKT. So it should be backwards-compatible.

That should suffice to eventually handle striding, ropes, etc.

I'll amend the RFC with these points and a few others soon.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Aug 14, 2014

Well, here's a very simple illustration of what would be possible with 'better' operator overloading. This snippet (made very non-generic for illustrative purposes) adds two vectors elementwise and then slices them without allocating a temporary vector. Note how there is nothing to return the reference to in the slice_from and index methods. Additionally, the slice_from takes self by value in order for lifetimes to work out. This isn't a contrived example, and I've built a small linear algebra library based on these idioms. Notably, this is exactly the same thing that Iterator does.

struct Vector
{
    data: Vec<f64>,
}

impl Vector
{
    fn add<'l>(&'l self, other: &'l Vector) -> Adder<'l>
    {
        Adder{ lhs: self, rhs: other }
    }

    fn index(&self, idx: uint) -> f64
    {
        self.data[idx]
    }
}

struct Adder<'l>
{
    lhs: &'l Vector,
    rhs: &'l Vector
}

impl<'l> Adder<'l>
{
    fn index(&self, idx: uint) -> f64
    {
        self.lhs.index(idx) + self.rhs.index(idx)
    }

    fn slice_from(self, from: uint) -> Slicer<'l>
    {
        Slicer{ base: self, from: from }
    }
}

struct Slicer<'l>
{
    base: Adder<'l>,
    from: uint
}

impl<'l> Slicer<'l>
{
    fn index(&self, idx: uint) -> f64
    {
        self.base.index(self.from + idx)
    }
}

fn main()
{
    let a = Vector{ data: vec![1.0, 2.0, 3.0] };
    let b = Vector{ data: vec![4.0, 5.0, 6.0] };
    let res = a.add(&b).slice_from(1);

    assert_eq!(res.index(0), 7.0);
    assert_eq!(res.index(1), 9.0);
}
@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord Thanks. This seems essentially the same as the point as @huonw raised. In both cases, the problem is that we need HKT in order to abstract over a type like Slicer (which is actually a family of types indexed by lifetimes).

See my comment above to @huonw, namely that we will be able to add this more general form of slice syntax later, after we have HKT, in a backwards-compatible way. I will add details to the RFC as soon as I can, but the basic point is that accepting this syntax now doesn't preclude us from doing the more general version later.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Aug 14, 2014

Well, that's where my question comes in. I managed to create a fully functional library without HKT today (perhaps it could be more functional with HKT?) but it's not clear to me that HKT is necessary.

Here are the relevant traits and implementations from my library:

pub trait MatrixSlice
{
    fn slice(self, start: uint, end: uint) -> Slice<Self>;
}

impl<T: MatrixShape>
MatrixSlice for
T
{
    fn slice(self, start: uint, end: uint) -> Slice<T>
    {
        Slice::new(self, start, end)
    }
}

pub trait MatrixShape
{
    fn ncol(&self) -> uint;
    fn nrow(&self) -> uint;
}

impl<'l>
MatrixShape for
&'l Matrix
{
    ...
}

You'll note that I did specialize my Slice trait to have a specific return type, but I just tried using a generic return type and everything seems to still work:

pub trait MatrixSlice2<T>
{
    fn slice2(self, start: uint, end: uint) -> T;
}

impl<T: MatrixShape + Collection>
MatrixSlice2<Slice<T>> for
T
{
    fn slice2(self, start: uint, end: uint) -> Slice<T>
    {
        Slice::new(self, start, end)
    }
}
@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord That works because you're taking self by value. (See my first comment to Huon.) This is effectively a way of encoding HKT by "lifting" the lifetime parameter to the type implementing the trait.

In general, though, we want indexing and slicing operations to take &self or &mut self, and in that case the lifetime of the slice returned needs to depend on the lifetime of the self parameter.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Aug 14, 2014

It has to be self in order to support chaining, as far as I understand. I.e. today this doesn't pass borrow-check:

let a = vec![1u];
let s = a.as_slice().as_slice();

My by value self varieties work fine, however (aside from the occasional explicit borrow that you mentioned to Huon).

@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord That's a problem with the rules for borrowing and temporaries, which will be fixed when the implementation of @kballard's RFC lands.

@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 14, 2014

@SiegeLord To be more clear, if we used self for Index and Slice, you'd have to write things like:

let a = vec![1u];
let x = (&a)[0];
let s = (&a)[];

which seems less than ideal. I think with the HKT-based traits and better rules for temporaries (which we've already committed to doing), we should be able to have good ergonomics and still handle your and @huonw's examples.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Aug 14, 2014

Alright, I'll defer until those things are implemented and maybe try to port my library away from self to see how it works out.

@glaebhoerl

This comment has been minimized.

Copy link
Contributor

glaebhoerl commented Aug 24, 2014

Thinking about it, I don't see why we couldn't have .. be an inclusive range. Scrolling through the discussion I see some strongly-worded claims that it would be confusing, but not much justification.

Naively I would expect that if foo[0] is one element, then foo[0..1] is two elements.

Then we're also consistent with match, don't need to think about adding ..., and so on. If the issue is that it would be inconsistent with range(), then I think that's a smaller problem, because range() is a library function and doesn't use .. syntax.

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Aug 25, 2014

Exclusive ranges are a lot easier to deal with:

let slice = some_vec[a..a + 2];
assert_eq!(slice.len(), 2);
let m = some_vec.len() / 2;
// These two slices have no overlap
let slice1 = some_vec[..m];
let slice2 = some_vec[m..];

You'd have to sprinkle lots of + 1 or - 1 everywhere if that syntax was inclusive. Having used slices extensively in D and Python (where they are exclusive), having them be exclusive is very natural and convenient. Having used them in Octave/MATLAB where they are inclusive, I consider that to be one of the worst mistakes made in MATLAB (right next to 1-based indexing).

@lilyball

This comment has been minimized.

Copy link
Contributor

lilyball commented Aug 25, 2014

I think slicing pretty much requires exclusive ranges. It's confusing otherwise.

Honestly, maybe it's not really that bad to have .. in a slice be an exclusive range and .. in a match arm be an inclusive range. It's weird, sure, and I think it makes more sense to adjust match arms to use ..., but if there's some reason people don't like that, we could just use .. in both places. The contexts are different, and the semantics of the .. are different (one is basically shorthand for a bunch of alternate arms, the other is defining a range to index over).

@pcwalton

This comment has been minimized.

Copy link
Contributor

pcwalton commented Aug 25, 2014

Agreed with @kballard.

@CloudiDust

This comment has been minimized.

Copy link
Contributor

CloudiDust commented Aug 25, 2014

@kballard @pcwalton, it is still better to have more consistency, and honestly I don't see why one would oppose this change (other than maybe he/she prefers the Ruby way, then he/she will oppose the slice notation too). This is not Rust's first breaking change after all. :P

@lilyball

This comment has been minimized.

Copy link
Contributor

lilyball commented Aug 25, 2014

@CloudiDust At this point, I think we should modify match. However, that can be done separately to introducing slicing. My claim right now is that we can proceed with the RFC for overloaded slice notation without modifying match, and a separate RFC can be introduced later to change the match syntax to use ... instead.

@CloudiDust

This comment has been minimized.

Copy link
Contributor

CloudiDust commented Aug 26, 2014

@kballard, I agree with you.

@liigo

This comment has been minimized.

Copy link
Contributor

liigo commented Aug 26, 2014

agree with @kballard (his latest comment).

@SiegeLord

This comment has been minimized.

Copy link

SiegeLord commented Aug 27, 2014

Incidentally, I want to mention an important use case for exclusive ranges in matches (i.e. I'm suggesting allowing both .. and ... in them with the former being exclusive like in slices, and the latter being inclusive): floating point ranges:

// .. means exclusive here
match 1.0f32
{
    0.0 .. 1.0 => (),
    1.0 .. 2.0 => (),
    _ => ()
}

It is basically impossible to write the above match with inclusive ranges, so while we support floating point ranges, there is some impetus to providing exclusive ranges with matches.

That said, floating point ranges are kind of... strange anyway, so an alternate solution would be the non-solution of nuking this problematic feature.

@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 29, 2014

I've amended the RFC to clarify some points discussed in the comments, and to change the match notation to ... (for inclusive matching).

@lilyball

This comment has been minimized.

Copy link
Contributor

lilyball commented Aug 29, 2014

👍, although I think the Summary should also include a mention of the match change.

@aturon

This comment has been minimized.

Copy link
Member Author

aturon commented Aug 29, 2014

@kballard Thanks, updated.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Sep 11, 2014

Discussed a few weeks ago and the decision was to merge this.

@John-Nagle

This comment has been minimized.

Copy link

John-Nagle commented Jan 26, 2015

Did this make it into the pre-1.0 Alpha? The reference manual doesn't mention it. Currently, where s is a "&str", the old way works but is deprecated:

256:34 warning: use of deprecated item: use slice notation [a..] instead, #[warn(deprecated)] on by default
src/parsedatetimestd.rs:256     let sdigits = s.slice_from(1);  

The new way doesn't work yet.

src/parsedatetimestd.rs:257:9: 257:16 error: the trait `core::marker::Sized` is not implemented for the type `str` [E0277]
src/parsedatetimestd.rs:257     let sdigits = s[1..];  

Something seems to be out of sync.

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Jan 26, 2015

@John-Nagle let sdigits = &s[1..];

@John-Nagle

This comment has been minimized.

Copy link

John-Nagle commented Jan 26, 2015

OK. Still needs to go in the reference manual, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.