Tracking Issue for Extend::{extend_one,extend_reserve} #72631

cuviper · 2020-05-26T21:14:32Z

This is a tracking issue for the trait methods Extend::extend_one and extend_reserve.
The feature gate for the issue is #![feature(extend_one)].

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also uses as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Implement the API
Adjust documentation (see instructions on rustc-dev-guide)
Stabilization PR (see instructions on rustc-dev-guide)

Unresolved Questions

Are these method names appropriate?
- Note that Extend is in the prelude, so the explicit extend_ prefix avoids ambiguity.
Is Extend the best place for just methods?
- One goal is to improve Iterator::unzip, which only has Default + Extend, so there's not really much choice.

Implementation history

Initial implementation in Add Extend::{extend_one,extend_reserve} #72162.

The text was updated successfully, but these errors were encountered:

dtolnay · 2020-05-28T01:40:09Z

Preserving a relevant concern from @SimonSapin in #72162 (comment):

These methods look very out of place in the Extend trait, and the extend_ prefix in their name is apparently unrelated to the meaning of the name of the extend method.

From an API design perspective, this seems to be asking for new Push and Reserve traits. This might deserve to be part of a larger overall design for traits for generic collections. The standard library has mostly avoided doing this so far, if fact Extend is the odd one out and arguably should not have been a trait.

the8472 · 2021-05-11T22:45:52Z

Additional concern:

reserve calls into allocation machinery. extend_one is not fallible, so it will also call into the allocation machinery in case not enough capacity has been reserved. This can lead to code bloat. Instead it should be something more along the lines of #84649

est31 · 2021-08-05T02:28:41Z

With arrays implementing the IntoIterator trait, I'm not sure extend_one is still pulling its weight? It seems the reasoning was that extend_one(v) is nicer than extend(Some(v))? To me at least, extend([v]) seems even nicer.

cuviper · 2021-08-05T02:59:30Z

It's not just how nice it looks, but also simpler implementation, like Vec is just a push. We call this repeatedly in unzip's own loop, so it's a benefit to keep that simple instead of adding an inner loop (even const 1).

est31 · 2021-08-05T03:14:29Z

@cuviper can't this be also solved with specialization? Adding special code for Option<T>, iter::Once<T>, and [T;1]?

cuviper · 2021-08-05T03:29:03Z

Yeah, I suppose specialization could do that too. Note that only [T;1] is always one item though.

est31 · 2021-08-05T03:47:20Z

Note that only [T;1] is always one item though.

Good point! [T;1] should be enough though.

joshtriplett · 2021-09-08T20:07:48Z

Summarizing some real-time discussion from just now:

extend_one seems fine to me.

For extend_reserve, I'd prefer to at least see the default implementation be the one @dtolnay suggested: extend with a dummy iterator that has a size_hint of the appropriate size. We should also document the expected semantics: should extend_reserve panic if it can't allocate (like reserve methods), or is it a hint that should ignore failure?

(I'd still prefer to have it as a free function that can't be overridden, rather than a default method, but with the above documentation and default impl I'd just be -0 on it and wouldn't oppose adding it.)

cuviper · 2021-09-08T20:30:04Z

For extend_reserve, I'd prefer to at least see the default implementation be the one @dtolnay suggested: extend with a dummy iterator that has a size_hint of the appropriate size.

So, returning a hint (size, Some(size)), but no items? Often the lower bound is what collections use for reservation. I guess it's technically safe to lie about this though. (edit: I see #88761 is already there!)

(I'd still prefer to have it as a free function that can't be overridden, rather than a default method, but with the above documentation and default impl I'd just be -0 on it and wouldn't oppose adding it.)

You mean by using that hint trick in the free function? I could do that.

Actually, if we go that route, we can also let unzip and Extend for (A, B) pass the complete size hint from the source iterator, rather than just additional from the lower bound. That seems compelling.

(Note: hashbrown will have to remove its extend_reserve before we can actually kill it.)

the8472 · 2021-09-08T20:40:24Z

So, returning a hint (size, Some(size)), but no items? Often the lower bound is what collections use for reservation. I guess it's technically safe to lie about this though.

That's a good point. Iterator::size_hint docs say

A buggy iterator may yield less than the lower bound or more than the upper bound of elements. [...]

That said, the implementation should provide a correct estimation, because otherwise it would be a violation of the trait’s protocol.

So it implies that it's kind of buggy behavior to have an iterator that doesn't return its lower bound. At the very least it leaves a bad taste.

It would be nicer if we just passed in a single iterator where a call to next() spins the outer loop of whatever is driving Extend (generators were mentioned in the meeting).

qm3ster · 2021-09-20T09:13:18Z

Shoulld extend_reserve have Vec::reserve or Vec::reserve_exact semantics?
I see #72162 uses reserve everywhere, but has this been examined and justified?

Stargateur · 2021-11-15T02:24:48Z

I would also like a Push and TryPush and TryExtend trait:

pub trait Push {
  type Item;
  fn push<'a>(&'a mut self, item: Self::Item) -> &'a Self::Item;
}

pub trait TryPush {
  type Item;
  fn try_push<'a>(&'a mut self, item: Self::Item) -> Result<&'a Self::Item>;
}

pub trait TryExtend {
  type Item;
  fn try_extend<Iter: IntoIterator<Item = Self::Item>>(&mut self, iter: Iter) -> Result<()>;
}

qm3ster · 2021-11-16T23:19:14Z

@Stargateur:

Existing push and insert on eg Vec don't return a reference. Is this a bad legacy decision? I don't know.
Are push and try_push returning an immutable reference despite requiring a mutable reference to the container intentional? Are there containers in the wild that need this constraint?
Should they prescribe &Self::Item as a return? What about eg a HashSet that gets extended with (K, V) tuples? I imagine you'd want a push on that to return a &mut V.

Stargateur · 2021-11-16T23:54:44Z

@qm3ster

Oh It was just a quite and dirty example, my main idea is I don't see why we can't have a trait for Push one item on alloc collection. Maybe split Push and TryPush is not need (same for extend). But I think Extend and Push should be two separate trait. My may issue with current design is you need to try_reserve THEN you can use extend. It's two step when it's should be one step for the user. Since we loose the battle in early Rust to have fallible allocation I would like to have trait where at least are nice to simple to use and not error prone. Having a trait TryPush and TryExtend will remove all user the burden to check at every call try_reserve().

yeah, I don't like to not have access to item I just pushed this is a thing I want to have almost everytime, one will say "call x.last()" but this require on unwrap() when push could just return it. Also this is probably cost free every function should return something expect few thing like drop. And even if this is not cost free I expect the compiler will optimize it away since it's very simple to optimize a not used reference (same argument is say about user to call vec::last that is optimize away after a Vec::push but again better optimize away the not used return value and this remove unwrap in user code). I give something to a function and it give me nothing back ? steal !
oh yeah sure nice catch
Maybe just don't implement Push on HashMap is a possibility too. We are not force to do it. But it a valid point, maybe have a second associate type like Output allowing HashMap to return a (&K, &V) instead of &(K, V) that will be not possible for several implementation of such. Problem of this it probably require Gats to compile.

So maybe:

pub trait Push {
  type Item;
  type Output;
  fn push<'a>(&'a mut self, item: Self::Item) -> Self::Output;
}

pub trait TryPush {
  type Item;
  type Output;
  // the error should return the item cause it didn't consume it
  fn try_push<'a>(&'a mut self, item: Self::Item) -> Result<Self::Output, ErrorFoo<Self::Item>>;
}

I don't know GaTs enough to know exactly but I often design thing where compiler tell me GATS as error. So if someone know better to make this trait compilable with or without gats. The idea is here, have traits to push and extend and return the item for push. Also, need to design the error type, should it be fixed or generic.

KamilaBorowska · 2021-12-13T11:49:54Z

A GAT API for pushing to HashMap could possibly look like this:

#![feature(entry_insert, generic_associated_types)]

use std::collections::HashMap;
use std::hash::{BuildHasher, Hash};

pub trait Push {
    type Item;
    type Output<'a>
    where
        Self: 'a;
    fn push<'a>(&'a mut self, item: Self::Item) -> Self::Output<'a>;
}

impl<K, V, S> Push for HashMap<K, V, S>
where
    K: Eq + Hash,
    S: BuildHasher,
{
    type Item = (K, V);
    type Output<'a>
    where
        Self: 'a,
    = &'a mut V;
    fn push<'a>(&'a mut self, (key, value): (K, V)) -> &'a mut V {
        self.entry(key).insert(value).into_mut()
    }
}

Kixunil · 2022-01-25T21:31:16Z

I'm writing something that uses size hint similar to iterators and was wondering how it works in std. I found some surprises related to this. As @qm3ster points out there's a question about use of reserve vs reserve_exact but there's also issue of upper bound of Iterator::size_hint() not being used.

I'm thinking about these scenarios:

iter.collect() or collection.extend(), then store the collection for a long time - best to avoid over-allocating
iter.collect() or collection.extend(), then push some new elements without iterator (~~is this ever reasonable? I couldn't find a good scenario~~ useful for things coming from async and such)
maximum is known and reasonably likely
maximum is known and unlikely or unknown likelihood - this is the case of Iterator::take() which can have very high upper bound (typical scenario is DoS protection if the iterator is untrusted input)

I'm still unsure about the best solution but if there are people smarter than me interested I'd like to know what they are thinking.

qm3ster · 2022-01-26T15:01:06Z

@Kixunil I would like to note that repeatedly calling collection.extend() with small iterators is common. There's also cases when a collection is deserialized and extended immediately, or deserialized and then extended indefinitely (in the latter case, probably with some removals, so resizing may be amortized regardless).

Kixunil · 2022-01-26T16:00:20Z

@qm3ster I believe all such cases can be solved using Iterator::chain which also happens to be less tedious than manually calculating the right reserve size. If you have a situation that can not be converted I'm curious to learn!

qm3ster · 2022-01-26T18:36:19Z

Of course!
I am talking about calling it at runtime, in an event/message-driven loop, not statically multiple times in a row in a block of code.

Kixunil · 2022-01-26T22:58:37Z

Ah, finally a good case when it's needed!

coolreader18 · 2022-11-03T19:29:34Z

Perhaps another solution is some kind of fn allocation_hint(&self) -> Option<usize>; method on iterators. That would fix the "lying about size_hint" problem, and you could get the functionality of extend_reserve with just .extend(itertools::allocation_hint(n)) or something. I've wanted this in the context of fallible iterators before, since for .collect::<Result<Vec<_>, _>>() the iterator constructed by try_process has a lower-bound size hint of 0, so the vec always uses an exponential allocation strategy, even though the iterator being collected from might have an exact size_hint.

conradludgate · 2022-11-21T09:46:15Z

For the record, I've just ran some simple benchmarks against vec.extend(Some(x)) and vec.extend_one(x) and the extend_some variant performed 20% faster on average. The main difference I can see from the codegen is the reserve call, and the order of the branching

example::extend_one:
        push    rbx
        mov     rbx, rdi
        mov     rsi, qword ptr [rdi + 16]
        cmp     rsi, qword ptr [rdi + 8]
        jne     .LBB3_2
        mov     rdi, rbx
        call    alloc::raw_vec::RawVec<T,A>::reserve_for_push
        mov     rsi, qword ptr [rbx + 16]
.LBB3_2:
        mov     rax, qword ptr [rbx]
        mov     byte ptr [rax + rsi], 0
        inc     rsi
        mov     qword ptr [rbx + 16], rsi
        pop     rbx
        ret

example::extend_some:
        push    rbx
        mov     rbx, rdi
        mov     rsi, qword ptr [rdi + 16]
        cmp     qword ptr [rdi + 8], rsi
        je      .LBB4_1
.LBB4_2:
        mov     rax, qword ptr [rbx]
        mov     byte ptr [rax + rsi], 0
        inc     rsi
        mov     qword ptr [rbx + 16], rsi
        pop     rbx
        ret
.LBB4_1:
        mov     edx, 1
        mov     rdi, rbx
        call    alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle
        mov     rsi, qword ptr [rbx + 16]
        jmp     .LBB4_2

My judgement from reading the raw_vec code is that extend_some, by using reserve which has allocation marked as #[cold], reorders the branching to be more helpful. Whereas the extend_one uses reserve_for_push which is not #[cold] and therefore suffers.

cuviper added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels May 26, 2020

cuviper mentioned this issue May 26, 2020

Add Extend::{extend_one,extend_reserve} #72162

Merged

taiki-e mentioned this issue Jul 10, 2020

Update stream rfc rust-lang/wg-async#15

Merged

KodrAus added A-collections Area: std::collections. Libs-Tracked Libs issues that are tracked on the team's project board. A-iterators Area: Iterators labels Jul 29, 2020

dingelish mentioned this issue Nov 9, 2020

New 1.1.3 breaks older mesalock packages? apache/incubator-teaclave-sgx-sdk#282

Open

blasrodri mentioned this issue Dec 5, 2020

stream: add stream extend function tokio-rs/tokio#3010

Closed

m-ou-se added the I-nominated label Sep 8, 2021

m-ou-se removed the I-nominated label Sep 8, 2021

dtolnay mentioned this issue Sep 8, 2021

Smarter default implementation of Extend::extend_reserve #88761

Closed

taiki-e mentioned this issue Jul 2, 2022

Collect/TryCollect does not use Stream::size_hint rust-lang/futures-rs#2617

Closed

conradludgate mentioned this issue Nov 21, 2022

remove reserve_for_push #104668

Closed

Dylan-DPC mentioned this issue Mar 4, 2024

Tracking issues for unstable library features used by std #94971

Open

32 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tracking Issue for Extend::{extend_one,extend_reserve} #72631

Tracking Issue for Extend::{extend_one,extend_reserve} #72631

cuviper commented May 26, 2020

dtolnay commented May 28, 2020

the8472 commented May 11, 2021

est31 commented Aug 5, 2021

cuviper commented Aug 5, 2021

est31 commented Aug 5, 2021

cuviper commented Aug 5, 2021

est31 commented Aug 5, 2021

joshtriplett commented Sep 8, 2021

cuviper commented Sep 8, 2021 •

edited

the8472 commented Sep 8, 2021

qm3ster commented Sep 20, 2021 •

edited

Stargateur commented Nov 15, 2021

qm3ster commented Nov 16, 2021

Stargateur commented Nov 16, 2021 •

edited

KamilaBorowska commented Dec 13, 2021

Kixunil commented Jan 25, 2022 •

edited

qm3ster commented Jan 26, 2022

Kixunil commented Jan 26, 2022

qm3ster commented Jan 26, 2022 •

edited

Kixunil commented Jan 26, 2022

coolreader18 commented Nov 3, 2022

conradludgate commented Nov 21, 2022

Tracking Issue for Extend::{extend_one,extend_reserve} #72631

Tracking Issue for Extend::{extend_one,extend_reserve} #72631

Comments

cuviper commented May 26, 2020

About tracking issues

Steps

Unresolved Questions

Implementation history

dtolnay commented May 28, 2020

the8472 commented May 11, 2021

est31 commented Aug 5, 2021

cuviper commented Aug 5, 2021

est31 commented Aug 5, 2021

cuviper commented Aug 5, 2021

est31 commented Aug 5, 2021

joshtriplett commented Sep 8, 2021

cuviper commented Sep 8, 2021 • edited

the8472 commented Sep 8, 2021

qm3ster commented Sep 20, 2021 • edited

Stargateur commented Nov 15, 2021

qm3ster commented Nov 16, 2021

Stargateur commented Nov 16, 2021 • edited

KamilaBorowska commented Dec 13, 2021

Kixunil commented Jan 25, 2022 • edited

qm3ster commented Jan 26, 2022

Kixunil commented Jan 26, 2022

qm3ster commented Jan 26, 2022 • edited

Kixunil commented Jan 26, 2022

coolreader18 commented Nov 3, 2022

conradludgate commented Nov 21, 2022

cuviper commented Sep 8, 2021 •

edited

qm3ster commented Sep 20, 2021 •

edited

Stargateur commented Nov 16, 2021 •

edited

Kixunil commented Jan 25, 2022 •

edited

qm3ster commented Jan 26, 2022 •

edited