Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add joining slices of slices with a slice separator, not just a single item #62528

Open
wants to merge 7 commits into
base: master
from

Conversation

Projects
None yet
8 participants
@SimonSapin
Copy link
Contributor

commented Jul 9, 2019

#27747 (comment)

It's kinda annoying to be able to join strings with a str (which can have multiple chars), but joining a slice of slices, you can only join with a single element.

This turns out to be fixable, with some possible inference regressions.

TL;DR

Related trait(s) are unstable and tracked at #27747, but the [T]::join method that is being extended here is already stable.

Example use of the new insta-stable functionality:

let nested: Vec<Vec<Foo>> = /* … */;
let separator: &[Foo] = /* … */;  // Previously: could only be a single &Foo
nested.join(separator)

Complete API affected by this PR, after changes:

impl<T> [T] {
    pub fn concat<Item: ?Sized>(&self) -> <Self as Concat<Item>>::Output
        where Self: Concat<Item>
    {
        Concat::concat(self)
    }
    pub fn join<Separator>(&self, sep: Separator) -> <Self as Join<Separator>>::Output
        where Self: Join<Separator>
    {
        Join::join(self, sep)
    }
}

// The `Item` parameter is only useful for the the slice-of-slices impl.
pub trait Concat<Item: ?Sized> {
    type Output;
    fn concat(slice: &Self) -> Self::Output;
}

pub trait Join<Separator> {
    type Output;
    fn join(slice: &Self, sep: Separator) -> Self::Output;
}

impl<T: Clone, V: Borrow<[T]>> Concat<T> for [V] {
    type Output = Vec<T>;
}

impl<T: Clone, V: Borrow<[T]>> Join<&'_ T> for [V] {
    type Output = Vec<T>;
}

// New functionality here!
impl<T: Clone, V: Borrow<[T]>> Join<&'_ [T]> for [V] {
    type Output = Vec<T>;
}

impl<S: Borrow<str>> Concat<str> for [S] {
    type Output = String;
}

impl<S: Borrow<str>> Join<&'_ str> for [S] {
    type Output = String;
}

Details

After #62403 but before this PR, the API is:

impl<T> [T] {
    pub fn concat<Separator: ?Sized>(&self) -> T::Output
        where T: SliceConcat<Separator>
    {
        SliceConcat::concat(self)
    }

    pub fn join<Separator: ?Sized>(&self, sep: &Separator) -> T::Output
        where T: SliceConcat<Separator>
    {
        SliceConcat::join(self, sep)
    }
}

pub trait SliceConcat<Separator: ?Sized>: Sized {
    type Output;
    fn concat(slice: &[Self]) -> Self::Output;
    fn join(slice: &[Self], sep: &Separator) -> Self::Output;
}

impl<T: Clone, V: Borrow<[T]>> SliceConcat<T> for V {
    type Output = Vec<T>;
}

impl<S: Borrow<str>> SliceConcat<str> for S {
    type Output = String;
}

By adding a trait impl we should be able to accept a slice of T as the separator, as an alternative to a single T value.

In a some_slice.join(some_separator) call, trait resolution will pick an impl or the other based on the type of some_separator. In some_slice.concat() however there is no separator, so this call would become ambiguous. Some regression in type inference or trait resolution may be acceptable on principle, but requiring a turbofish for every single call to concat isn’t great.

The solution to that is splitting the SliceConcat trait into two Concat and Join traits, one for each eponymous method. Only Join would gain a new impl, so that some_slice.concat() would not become ambiguous.

Now, at the trait level the Concat trait does not need a Separator parameter anymore. However, simply removing it causes one of the impls not to be accepted anymore:

error[E0207]: the type parameter `T` is not constrained by the impl trait, self type, or predicates
  --> src/liballoc/slice.rs:608:6
    |
608 | impl<T: Clone, V: Borrow<[T]>> Concat for [V] {
    |      ^ unconstrained type parameter

This makes sense: if [V]::concat is a method that is itself not generic, then its return type (which is the Concat::Output associated type) needs to be determined based on solely V. And although there is no such type in the standard library, there is nothing stopping another crate from defining a V type that implements both Borrow<[Foo]> and Borrow<[Bar]>. It might not be a good idea, but it’s possible. Both would apply here, and there would be no way to determine T.

This could be a warning sign that this API is too generic. Perhaps we’d be better off having one less type variable, and only implement Concat for [&'_ [T]] and Concat for [Vec<T>] etc. However this aspect of [V]::concat is already stable, so we’re stuck with it.

The solution is to keep a dummy type parameter on the Concat trait. That way, if a type has multiple Borrow<[_]> impls, it’ll end up with multiple corresponding Concat<_> impls.

In impl<S: Borrow<str>> Concat<str> for [S], the second occurrence of str is not meaningful. It could be any type. As long as there is only once such type with an applicable impl, trait resolution will be appeased without demanding turbofishes.

Joining strings with char

For symmetry I also tried adding this impl (because why not):

impl<S: Borrow<str>> Join<char> for [S] {
    type Output = String;
}

This immediately caused an inference regression in a dependency of rustc:

error[E0277]: the trait bound `std::string::String: std::borrow::Borrow<[std::string::String]>` is not satisfied
   --> /home/simon/.cargo/registry/src/github.com-1ecc6299db9ec823/getopts-0.2.19/src/lib.rs:595:37
    |
595 |             row.push_str(&desc_rows.join(&desc_sep));
    |                                     ^^^^ the trait `std::borrow::Borrow<[std::string::String]>` is not implemented for `std::string::String`
    |
    = help: the following implementations were found:
              <std::string::String as std::borrow::Borrow<str>>
    = note: required because of the requirements on the impl of `std::slice::Join<&std::string::String>` for `[std::string::String]`

In the context of this code, two facts are known:

  • desc_rows is a Vec<String>
  • desc_sep is a String

Previously the first fact alone reduces the resolution of join to only one solution, where its argument it expected to be &str. Then, &String is coerced to &str.

With the new Join impl, the first fact leavs two applicable impls where the separator can be either &str or char. But &String is neither of these things. It appears that possible coercions are not accounted for, in the search for a solution in trait resolution.

I have not included this new impl in this PR. It’s still possible to add later, but the getopts breakage does not need to block the rest of the PR. And the functionality easy for end-user to duplicate: slice_of_strings.join(&*char_separator.encode_utf8(&mut [0_u8, 4]))

The &* part of that last code snippet is another case of the same issue: encode_utf8 returns &mut str which can be coerced to &str, but isn’t when trait resolution is ambiguous.

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Jul 9, 2019

r? @dtolnay

(rust_highfive has picked a reviewer for you, use r? to override)

@SimonSapin

This comment has been minimized.

Copy link
Contributor Author

commented Jul 9, 2019

While the SliceConcat trait is unstable, this PR includes new insta-stable functionality:

let nested: Vec<Vec<Foo>> = /* … */;
let separator: &[Foo] = /* … */;  // Previously: could only be a single &Foo
nested.join(separator)

@rfcbot fcp merge

Because there may be inference regressions, we may want to do a Crater run:

@bors try

@bors

This comment has been minimized.

Copy link
Contributor

commented Jul 9, 2019

⌛️ Trying commit b62a77b with merge 13fab3f...

bors added a commit that referenced this pull request Jul 9, 2019

Auto merge of #62528 - SimonSapin:concat, r=<try>
Add joining slices of slices with a slice separator, not just a single item

#27747 (comment)
> It's kinda annoying to be able to join strings with a str (which can have multiple chars), but joining a slice of slices, you can only join with a single element.

This turns out to be fixable, with some possible inference regressions.

# TL;DR

Related trait(s) are unstable and tracked at #27747, but the `[T]::join` method that is being extended here is already stable.

Example use of the new insta-stable functionality:

```rust
let nested: Vec<Vec<Foo>> = /* … */;
let separator: &[Foo] = /* … */;  // Previously: could only be a single &Foo
nested.join(separator)
```

Complete API affected by this PR, after changes:

```rust
impl<T> [T] {
    pub fn concat<Item: ?Sized>(&self) -> <Self as Concat<Item>>::Output
        where Self: Concat<Item>
    {
        Concat::concat(self)
    }
    pub fn join<Separator>(&self, sep: Separator) -> <Self as Join<Separator>>::Output
        where Self: Join<Separator>
    {
        Join::join(self, sep)
    }
}

// The `Item` parameter is only useful for the the slice-of-slices impl.
pub trait Concat<Item: ?Sized> {
    type Output;
    fn concat(slice: &Self) -> Self::Output;
}

pub trait Join<Separator> {
    type Output;
    fn join(slice: &Self, sep: Separator) -> Self::Output;
}

impl<T: Clone, V: Borrow<[T]>> Concat<T> for [V] {
    type Output = Vec<T>;
}

impl<T: Clone, V: Borrow<[T]>> Join<&'_ T> for [V] {
    type Output = Vec<T>;
}

// New functionality here!
impl<T: Clone, V: Borrow<[T]>> Join<&'_ [T]> for [V] {
    type Output = Vec<T>;
}

impl<S: Borrow<str>> Concat<str> for [S] {
    type Output = String;
}

impl<S: Borrow<str>> Join<&'_ str> for [S] {
    type Output = String;
}
```

# Details

After #62403 but before this PR, the API is:

```rust
impl<T> [T] {
    pub fn concat<Separator: ?Sized>(&self) -> T::Output
        where T: SliceConcat<Separator>
    {
        SliceConcat::concat(self)
    }

    pub fn join<Separator: ?Sized>(&self, sep: &Separator) -> T::Output
        where T: SliceConcat<Separator>
    {
        SliceConcat::join(self, sep)
    }
}

pub trait SliceConcat<Separator: ?Sized>: Sized {
    type Output;
    fn concat(slice: &[Self]) -> Self::Output;
    fn join(slice: &[Self], sep: &Separator) -> Self::Output;
}

impl<T: Clone, V: Borrow<[T]>> SliceConcat<T> for V {
    type Output = Vec<T>;
}

impl<S: Borrow<str>> SliceConcat<str> for S {
    type Output = String;
}
```

By adding a trait impl we should be able to accept a slice of `T` as the separator, as an alternative to a single `T` value.

In a `some_slice.join(some_separator)` call, trait resolution will pick an impl or the other based on the type of `some_separator`. In `some_slice.concat()` however there is no separator, so this call would become ambiguous. Some regression in type inference or trait resolution may be acceptable on principle, but requiring a turbofish for every single call to `concat` isn’t great.

The solution to that is splitting the `SliceConcat` trait into two `Concat` and `Join` traits, one for each eponymous method. Only `Join` would gain a new impl, so that `some_slice.concat()` would not become ambiguous.

Now, at the trait level the `Concat` trait does not need a `Separator` parameter anymore. However, simply removing it causes one of the impls not to be accepted anymore:

```rust
error[E0207]: the type parameter `T` is not constrained by the impl trait, self type, or predicates
  --> src/liballoc/slice.rs:608:6
    |
608 | impl<T: Clone, V: Borrow<[T]>> Concat for [V] {
    |      ^ unconstrained type parameter
```

This makes sense: if `[V]::concat` is a method that is itself not generic, then its return type (which is the `Concat::Output` associated type) needs to be determined based on solely `V`. And although there is no such type in the standard library, there is nothing stopping another crate from defining a `V` type that implements both `Borrow<[Foo]>` and `Borrow<[Bar]>`. It might not be a good idea, but it’s possible. Both would apply here, and there would be no way to determine `T`.

This could be a warning sign that this API is too generic. Perhaps we’d be better off having one less type variable, and only implement `Concat for [&'_ [T]]` and `Concat for [Vec<T>]` etc. However this aspect of `[V]::concat` is already stable, so we’re stuck with it.

The solution is to keep a dummy type parameter on the `Concat` trait. That way, if a type has multiple `Borrow<[_]>` impls, it’ll end up with multiple corresponding `Concat<_>` impls.

In `impl<S: Borrow<str>> Concat<str> for [S]`, the second occurrence of `str` is not meaningful. It could be any type. As long as there is only once such type with an applicable impl, trait resolution will be appeased without demanding turbofishes.

# Joining strings with `char`

For symmetry I also tried adding this impl (because why not):

```rust
impl<S: Borrow<str>> Join<char> for [S] {
    type Output = String;
}
```

This immediately caused an inference regression in a dependency of rustc:

```rust
error[E0277]: the trait bound `std::string::String: std::borrow::Borrow<[std::string::String]>` is not satisfied
   --> /home/simon/.cargo/registry/src/github.com-1ecc6299db9ec823/getopts-0.2.19/src/lib.rs:595:37
    |
595 |             row.push_str(&desc_rows.join(&desc_sep));
    |                                     ^^^^ the trait `std::borrow::Borrow<[std::string::String]>` is not implemented for `std::string::String`
    |
    = help: the following implementations were found:
              <std::string::String as std::borrow::Borrow<str>>
    = note: required because of the requirements on the impl of `std::slice::Join<&std::string::String>` for `[std::string::String]`
```

In the context of this code, two facts are known:

* `desc_rows` is a `Vec<String>`
* `desc_sep` is a `String`

Previously the first fact alone reduces the resolution of `join` to only one solution, where its argument it expected to be `&str`. Then, `&String` is coerced to `&str`.

With the new `Join` impl, the first fact leavs two applicable impls where the separator can be either `&str` or `char`. But `&String` is neither of these things. It appears that possible coercions are not accounted for, in the search for a solution in trait resolution.

I have not included this new impl in this PR. It’s still possible to add later, but the `getopts` breakage does not need to block the rest of the PR. And the functionality easy for end-user to duplicate: `slice_of_strings.join(&*char_separator.encode_utf8(&mut [0_u8, 4]))`

The `&*` part of that last code snippet is another case of the same issue: `encode_utf8` returns `&mut str` which can be coerced to `&str`, but isn’t when trait resolution is ambiguous.
@rfcbot

This comment has been minimized.

Copy link

commented Jul 9, 2019

Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@bors

This comment has been minimized.

Copy link
Contributor

commented Jul 9, 2019

☀️ Try build successful - checks-azure, checks-travis
Build commit: 13fab3f

Show resolved Hide resolved src/liballoc/slice.rs Outdated
Show resolved Hide resolved src/liballoc/slice.rs Outdated
Show resolved Hide resolved src/liballoc/str.rs Outdated

SimonSapin and others added some commits Jul 9, 2019

Update src/liballoc/slice.rs
Co-Authored-By: Mazdak Farrokhzad <twingoow@gmail.com>
Update src/liballoc/slice.rs
Co-Authored-By: Mazdak Farrokhzad <twingoow@gmail.com>
Update src/liballoc/str.rs
Co-Authored-By: Mazdak Farrokhzad <twingoow@gmail.com>
@SimonSapin

This comment has been minimized.

Copy link
Contributor Author

commented Jul 9, 2019

@craterbot run mode=check-only

@craterbot

This comment has been minimized.

Copy link
Collaborator

commented Jul 9, 2019

👌 Experiment pr-62528 created and queued.
🤖 Automatically detected try build 13fab3f
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@rfcbot

This comment has been minimized.

Copy link

commented Jul 12, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@craterbot

This comment has been minimized.

Copy link
Collaborator

commented Jul 13, 2019

🚧 Experiment pr-62528 is now running on agent aws-2.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot

This comment has been minimized.

Copy link
Collaborator

commented Jul 13, 2019

🚨 Experiment pr-62528 has encountered an error: docker is not running
🛠 If the error is fixed use the retry command.

🆘 Can someone from the infra team check in on this? @rust-lang/infra
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@pietroalbini

This comment has been minimized.

Copy link
Member

commented Jul 13, 2019

@craterbot retry

@craterbot

This comment has been minimized.

Copy link
Collaborator

commented Jul 13, 2019

🛠 Experiment pr-62528 queued again.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@pietroalbini

This comment has been minimized.

Copy link
Member

commented Jul 16, 2019

@craterbot assign=agent:aws-1

@craterbot

This comment has been minimized.

Copy link
Collaborator

commented Jul 16, 2019

📝 Configuration of the pr-62528 experiment changed.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot

This comment has been minimized.

Copy link
Collaborator

commented Jul 18, 2019

🚧 Experiment pr-62528 is now running on agent aws-1.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.