Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Add for Vec #38573

Closed
wants to merge 1 commit into from
Closed

Conversation

michaelsproul
Copy link
Contributor

In a bid to gain a little more delicious syntactic sugar, this PR implements + for vectors.

Some operations are perfectly suited to using plus, rather than creating a mutable variable, extending it, and then returning it, like so:

fn foo() -> Vec<u32> {
    let mut v = vec![1, 2, 3];
    v.extend(other_vec_func());
    v
}

Compared to using plus:

fn foo() -> Vec<u32> {
    vec![1, 2, 3] + other_vec_func()
}

Rust's + is already noncommutative due to the implementations for strings, i.e. "hello" + "world" ≠ "world" + "hello", so no worries there.

@est31
Copy link
Member

est31 commented Dec 23, 2016

I think its not delicious, as its non obvious. + can mean element wise addition just as well.

@KalitaAlexey
Copy link
Contributor

KalitaAlexey commented Dec 23, 2016

@est31,
There are a lot of languages where binary + applied to two arrays means concatenation of them.
This is why I think this is obvious.

@michaelsproul
Copy link
Contributor Author

@est31, @KalitaAlexey: Python and Ruby for example. I thought JavaScript might, but it's just hilariously weird as usual... [1] + [2] === "12", nice.

@michaelsproul
Copy link
Contributor Author

Other points to discuss:

  • If Vec implements Add should it also implement AddAssign?
  • Is it sufficient to have impls for Vec<T> + Vec<T> and Vec<T> + &[T]? Other RHS types that might be desirable include arrays ([T; n]) and mutable slices (&mut [T]). My current thinking is that neither are necessary as its usually a really easy change to get a &[T] from an array or a mutable slice.

@michaelsproul
Copy link
Contributor Author

@est31: On the topic of element-wise addition, I don't think Vec is the right type for that sort of thing, as you'd have to rely on the lengths being the same at runtime. In a mathematical/linear-algebra context, fixed length vectors would be more appropriate, something like Vect3(x: T, y: T, z: T). If Rust eventually gets support for integer dependent types, you'd be able to write nice types like Vect<T, n>.

@michaelsproul
Copy link
Contributor Author

This test failed, but I think whatever it was once testing has been lost in the annals of time... https://github.com/rust-lang/rust/blob/master/src/test/compile-fail/vec-res-add.rs

(when it was first commited in 2011 it looked like this: cb4e99b#diff-d776888cfcb96ac7c2c3eed0a499e137)

@Mark-Simulacrum
Copy link
Member

I think implementing Add for Vec isn't a good idea. While some languages allow this, Rust's ownership semantics make it somewhat more complicated.

Depending on the specific scenario, you could want Add to extend i (but that's not really Add's semantic meaning), and you could also potentially want Add to clone the first (and maybe the second?) vectors.

However, since we did this for Strings, maybe we should do it for Vec too for consistency. It's not really clear exactly what the right approach here should be.

@frewsxcv
Copy link
Member

To me, I would have expected:

fn foo() -> Vec<u32> {
    vec![1, 2, 3] + other_vec_func()
}

...to be equivalent to:

fn foo() -> Vec<u32> {
    let v = vec![];
    v.extend(vec![1, 2, 3]);
    v.extend(other_vec_func());
    v
}

It seems misleading that Add mutates self (the left-hand-side). That seems like something AddAssign should do.

@michaelsproul
Copy link
Contributor Author

michaelsproul commented Dec 23, 2016

@frewsxcv: Starting with an empty vector and extending it twice is semantically equivalent to extending the first one by the second. The mut self argument in my current impl doesn't change how the function is called at all, it just makes the self value that was moved into the function locally mutable.

@frewsxcv
Copy link
Member

The mut self argument in my current impl doesn't change how the function is called at all, it just makes the self value that was moved into the function locally mutable.

Ah, I missed this. I thought the Add trait was taking a &mut self parameter, not mut self.

@frewsxcv frewsxcv added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Dec 24, 2016
@leonardo-m
Copy link

Having a handy way to concatenate vectors is a nice, despite the "+" symbol is not the best for this. But the compiler has to optimize well the chains like "A+B+C+D".

@alexcrichton alexcrichton self-assigned this Dec 26, 2016
@alexcrichton
Copy link
Member

Thanks for the PR @michaelsproul! I feel like we've had this proposed before but I unfortunately can't seem to dig that up any more.

I believe others on @rust-lang/libs have regretted the Add for String implementation in the past and have desired to avoid an impl such as this. That being said Add for String exists and its stable, so I personally at least would be ok adding this impl. I'm curious what others think on @rust-lang/libs, though.

Also, as a libs policy, we do indeed recommend AddAssign whenever Add is implemented, so if you'd like to add those impls here that'd be great!

@michaelsproul
Copy link
Contributor Author

Given the mostly negative feedback so far, I'm happy to consider dropping this PR. I've summarised some points for and against below.

For

  • It might make Rust friendly to new-comers from Python/Ruby/etc who are used to being able to + lists together.
  • It gives seasoned Rust programmers a more concise syntax to use in some situations where it could aid readability (and elegance).
  • It is consistent with the implementations of Add for String.

Against

  • Plus is usually a commutative operation.
  • Enabling + for Vec breaks the universality of extend as the recommended way to add things to a collection in bulk. Beginners who are likely to start off using Vecs won't discover extend until later. We may end up with code like vec += iter.collect() instead of vec.extend(iter).
  • In a similar vein to the point above, changing the recommended way to concatenate vectors would require changing style guides, lints and documentation.
  • Adding lists and strings together can quickly turn into a mess (I'm looking at you, Java). String addition is already stable, but perhaps we should draw the line and say no more Add implementations for list-like collections (aside: an Add implementation for Vec probably also warrants one for VecDeque and LinkedList).

All things considered, I could go either way...

@leonardo-m
Copy link

leonardo-m commented Dec 28, 2016

The "cons" seem about as strong as the "pro".

Regarding Python, in it you can add items to the tail of a list (dynamic array) in several ways:

>>> a = [1]
>>> a += [2]
>>> a
[1, 2]
>>> b = [1]
>>> b.append(2)
>>> b
[1, 2]
>>> c = [1]
>>> c.extend([2])
>>> c
[1, 2]

The Python Zen rule "There should be one-- and preferably only one --obvious way to do it." in this case suggests to use the append().

D language is cleaner because it lacks an "extend" or "append" method, you just use the noncommutative operator with a single item:

void main() {
    auto a = [1];
    a ~= 2; // Idiomatic way.
    auto b = [1];
    b ~= [2];
}

@aturon
Copy link
Member

aturon commented Jan 4, 2017

Thanks for the summary, @michaelsproul!

On the whole, I don't think we should merge this PR. For me, the two major factors are:

  • The fact that + on vectors has a couple of natural meanings, so someone is likely to be surprised.
  • The fact that using extend is already quite ergonomic and consistent.

I'm not too swayed by the argument that the + impl for String means we should have it for Vec too; they're different types conceptually, and + has a much clearer interpretation for strings.

@leonardo-m
Copy link

The fact that + on vectors has a couple of natural meanings, so someone is likely to be surprised.
The fact that using extend is already quite ergonomic and consistent.

The extend you refer to is the +=, that mutates in places, that you can do in some ways:

fn main() {
    let mut data = vec![10, 20];
    data.extend(vec![30, 40]);
    println!("{:?}", data);
    data.extend_from_slice(&[50, 60]);
    println!("{:?}", data);
}

The + is a different thing, it's a concatenation, it doesn't modify the input values (so it's good for functional-style programming, with immutables and inside expressions), and currently you can perform it with something like:

fn main() {
    let mut data = vec![10, 20];
    println!("{:?}", [data.as_slice(), &[30, 40]].concat());
}

This thread discusses about adding some more expressiveness to Rust:

https://internals.rust-lang.org/t/roadmap-2017-productivity-learning-curve-and-/4097

The introduction of the + between vectors allows shorter and more expressive code:

println!("{:?}", data + &[30, 40]);

The + has indeed two meanings, to concatenate (in Python, etc) and to perform itemwise sum (in D language and others). For a person that's not expert of Rust this can lead to some confusion.

@bluss
Copy link
Member

bluss commented Jan 4, 2017

In Python + is either concatenation or elementwise addition (list vs numpy.ndarray) and it is common for code to mix lists and ndarrays and create some confusion. (numpy is an extremely common package to use, depending on application I guess, even if it's not in Python's stdlib itself.)

In this sense I think Rust, even if it has a completely different type system than Python, should not end up in this situation where Vec uses + for concatenate and other things called "Vector" or maybe "Array" use it for elementwise sum. I'm not sure what to call this, it's a near-collision of semantics.(*)

For Rust examples, rulinalg defines a Vector which uses + for elementwise sum, nalgebra a DVector. My project ndarray has an Array (which can be one-dimensional) that does this, and so on.

Rulinalg, ndarray, or others can make the clash yet worse by defining addition between their own Vector types and &[T]. I don't know if any numerical crate does this yet, but I can imagine they might do that for convenience (Even if that's the kind of slippery slope of type confusion my comment here wants us to avoid).

Moreover the current situation for String is deeply unsatisfying, and I would not want to extend this problem to Vec:
Addition is not associative or monoidal because only String + &str => String is defined; The user should not have to memorize which exact impl clauses exist, or which exact incantation of string addition that works. IMO function signature details like pass by reference or pass by value come across more naturally in methods than with operators.

@leonardo-m
Copy link

I agree.

So, is it a good idea to add a way to concatenate two vects (returning the result) in a nicer and shorter way compared to the concat() usage I've shown above?

data.conj(&[30, 40])

@michaelsproul
Copy link
Contributor Author

Ok, in light of all the feedback, I'm going to close this PR. I think the reasons presented here are convincing, and should serve as good documentation for why this change isn't a good idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants