Tracking issue for Vec::drain_filter and LinkedList::drain_filter #43244
Comments
Add Vec::drain_filter This implements the API proposed in #43244. So I spent like half a day figuring out how to implement this in some awesome super-optimized unsafe way, which had me very confident this was worth putting into the stdlib. Then I looked at the impl for `retain`, and was like "oh dang". I compared the two and they basically ended up being the same speed. And the `retain` impl probably translates to DoubleEndedIter a lot more cleanly if we ever want that. So now I'm not totally confident this needs to go in the stdlib, but I've got two implementations and an amazingly robust test suite, so I figured I might as well toss it over the fence for discussion.
Maybe this doesn't need to include the kitchen sink, but it could have a range parameter, so that it's like a superset of drain. Any drawbacks to that? I guess adding bounds checking for the range is a drawback, it's another thing that can panic. But drain_filter(.., f) can not. |
Is there any chance this will stabilize in some form in the not too distant future? |
If the compiler is clever enough to eliminate the bounds checks ( And I'm pretty sure you can implement it in a way |
I know this is bikeshedding to some extent, but what was the reasoning behind naming this |
No idea, but |
|
I think
|
There is no precedent for using |
The "said equivalent" code in the comment is not correct... you have to subtract one from i at the "your code here" site, or bad things happen. |
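For reference, here is a minimal sketch of an index-based removal loop of the kind this comment is about, using only stable `Vec` APIs. The helper name `remove_matching` is hypothetical and this is one reading of the loop, not the exact code from the docs. The point it illustrates: `i` must only advance when nothing was removed, because `remove(i)` shifts the next element into slot `i`.

```rust
/// Removes every element matching `pred` and returns the removed elements in order.
/// Hypothetical helper: each `Vec::remove` shifts the tail, so the worst case is O(n^2).
fn remove_matching<T, F>(vec: &mut Vec<T>, mut pred: F) -> Vec<T>
where
    F: FnMut(&mut T) -> bool,
{
    let mut removed = Vec::new();
    let mut i = 0;
    while i < vec.len() {
        if pred(&mut vec[i]) {
            // The element that followed is now at index `i`, so do not advance.
            removed.push(vec.remove(i));
        } else {
            i += 1;
        }
    }
    removed
}
```

Incrementing `i` unconditionally would skip the element that slides into the freed slot, which is the off-by-one being described.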
IMO it's not Again, just from a newbie perspective, the things I would search for if trying to find something to do what this issue proposes would be I actually searched for It seems like a simple function named |
On a separate note, I don't feel as though this should mutate the vector it's called on. It prevents chaining. In an ideal scenario one would want to be able to do something like:
vec![
    "",
    "something",
    a_variable,
    function_call(),
    "etc",
]
.reject(|i| { i.is_empty() })
.join("/")
With the current implementation, what it would be joining on would be the rejected values. I'd like to see both an |
You can already do the chaining thing with |
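A sketch of one way to get that kind of chaining with stable iterator adapters. Which method the reply above had in mind is not visible here, so using `Iterator::filter` is an assumption, and `join_non_empty` is a hypothetical helper name.

```rust
/// Joins the non-empty parts with "/", consuming the input vector.
/// Hypothetical helper built from `into_iter`, `filter`, `collect`, and `join`.
fn join_non_empty(parts: Vec<&str>) -> String {
    parts
        .into_iter()
        .filter(|s| !s.is_empty()) // keep what a `reject` of empties would leave
        .collect::<Vec<_>>()
        .join("/")
}
```

This does not mutate an existing vector in place; it consumes the vector and builds a new one, which is what makes the chain possible.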
Yes, it's a member of |
Drain is novel terminology because it represented a fourth kind of ownership in Rust that only applies to containers, while also generally being a meaningless distinction in almost any other language (in the absence of move semantics, there is no need to combine iteration and removal into a single ""atomic"" operation). Although drain_filter moves the drain terminology into a space that other languages would care about (since avoiding backshifts is relevant in all languages). |
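To make the backshift point concrete, here is a minimal single-pass sketch in safe Rust. It is not how the standard library implements anything; `extract_without_backshift` is a hypothetical name. Kept elements are swapped toward the front so that each element moves at most once, instead of shifting the whole tail on every removal.

```rust
/// Removes elements matching `pred` in one pass and returns them.
/// Kept elements stay in their original relative order; the order of the
/// returned (removed) elements is NOT guaranteed. Hypothetical helper.
fn extract_without_backshift<T, F>(v: &mut Vec<T>, mut pred: F) -> Vec<T>
where
    F: FnMut(&mut T) -> bool,
{
    let mut write = 0;
    for read in 0..v.len() {
        if !pred(&mut v[read]) {
            // Move the kept element into the next free front slot.
            v.swap(write, read);
            write += 1;
        }
    }
    // Everything from `write` onward was rejected by the loop above.
    v.split_off(write)
}
```

The predicate runs exactly once per element, and the total work is linear in the length of the vector.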
I came across
|
I still feel as though |
Shouldn't |
Yes |
Add Drop impl for linked_list::DrainFilter This is part of #43244. See #43244 (comment)
Also tbh. I think we should lint by default when implementing Iterator on a type which is not More in scope is the behavior of what happens if you drop a "consume-on-drop" iterator which you wrapped inside of a map: Which is why all standard iterator combinators are marked as So I would argue that for the with-combinator use-case my proposal is still better than consume-on-drop; at the same time the scope has the issue of making it harder/more verbose to use such combinators. EDIT: Also in most cases where you combine |
I think in most cases where you want to use
In general I think it's preferable to provide two separate functions for this. Jumbling them together creates the risk of setting a bad API design precedent (like we have here) or stumbling |
And yet I have not seen anyone point out what is wrong with its design. |
You mean besides:
(that point can be super confusing for less experienced programmers)
And it generally behaves differently from more or less all other iterators in std. Sure, it's not too bad, but it's still a case where forcing two different functions into one produced a sub-par result.
It does drain-on-drop semantically, sure you can force the view that it instantly consumes the collection, but then And if you use |
The semantics of
Sure implementation wise with the current code you could argue that it is instead moving all elements in the collection |
Under any circumstance, I do not think the Especially for For consistency of APIs among variant collections, we may consider to restrict to the drain iterators not to depend on |
It's super confusing to me because I haven't seen any example. I know things can get hairy on drop (I've written a
Yes you can, you can give the end of the range. If you want the iteration to "interact" with the drain process, you end up with a different algorithm, basically a better version of
As far as iteration is concerned, It's pretty much like
Define drain-on-drop. To me, it means that the iterator needs a drop handler that alters the source collection. The contract of
Not sure what you mean. I take the view that
then you should rather use
You're describing the current implementation of |
To be clear, this is about the iterator returned by
I think that's a given in all official Rust. There are unit tests against (single) panic in a drop of the elements, panic in predicates, using mem::forget, and they don't allow UB, or leaving behind a poisoned collection that triggers UB later. Though I wouldn't be surprised one can cook up a combination that is still UB. |
mem::forgetting a collection-modifying iterator normally leaves behind a "poisoned" collection.
Though that is with a definition of poison in the sense of "it has forgotten about all its content due to poisoning" or similar unspecified, potentially non-stable but Rust-safe (i.e. not UB) behavior. I probably should have used a different term there, as poison is used e.g. in the context of LLVM in relation to UB.
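A small sketch of the "safe but unspecified" situation described here, using stable `Vec::drain` since that is what can be run today. `leak_drain` is a hypothetical name. The standard library documents that a forgotten `Drain` may leave the vector having lost elements arbitrarily, so the sketch only relies on the vector remaining valid, not on which elements survive.

```rust
use std::mem;

/// Leaks a `Drain` iterator and returns the vector afterwards.
/// Which elements remain is unspecified; the vector is still safe to use.
fn leak_drain() -> Vec<String> {
    let mut v: Vec<String> = ["a", "b", "c", "d"].iter().map(|s| s.to_string()).collect();
    // The drop handler never runs, so the drained range is never repaired.
    mem::forget(v.drain(1..3));
    v
}
```

Whatever ends up in the returned vector, reading it is not UB; at worst some elements were leaked.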
|
That is deciding what to drain beforehand, not stopping draining. With any normal Rust-ish iterator I can just decide to stop iterating and drop it. Not with drain, which will then magically continue iterating.
The point of
It drains the collection when dropped even if it is not iterated/consumed. If it would be lazy (like it IMHO should be) then calling drain and dropping the iterator without iterating on it at all would not drain anything as you never used the "draining" iterator.
That's the point we don't need to implement
yes |
note that it's just an ad-hoc example and you normally would write the algorithm shown there differently, |
Note that even for normal iterators this is about laziness, not about stopping exactly at the desired point, so it would still be confusing with some combinators ("wait, why does new |
But it wouldn't do it, at least not as long as it's not leaked. There is absolutely no reason, logically or performance-wise, why a lazy drain should remove elements for which the predicate was not applied and returned `true`. Even if you drop it midway.
Sure, if your iterator had lookahead it would look ahead, or if it's about an async/threaded stream it might also do more work. But for lazy implementations which have neither lookahead nor something like parallelization, I do consider it a bug if dropping it after 10 elements affects more than 10 elements.
|
That's apparently many people's expectation. My expectation is that it efficiently removes a range and somehow repurposes the elements. I would say the most explicit difference with But clearly that's not how it lands. Therefore I conclude that
I think everyone agrees with that. I'm just saying that |
Wow, the doc of Vec::drain says it "Creates a draining iterator"… only to describe what it actually accomplishes later. I appreciate the effort, but now I'm even more lost. The surprise of the iterator returned by |
How about add a marker trait To me, I expect
In addition, to be consistent with the contract that iterators are lazy, I think we should only allow the drain iterators (including iterators created by For example. If a new Rust programmer who has just learnt how to use
let mut foo = vec![0, 1, 2, 3, 4];
let mut iter = foo.drain(1..5);
assert_eq!(iter.next().unwrap(), 1);
drop(iter); // After `drop`, it stops draining and recovers the original collection.
assert_eq!(foo, [0, 2, 3, 4]); |
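For comparison, a small sketch of what stable `Vec::drain` does today when the iterator is dropped early: the whole range is removed no matter how far iteration got, which is the behavior the snippet above proposes to change. `drain_then_drop` is a hypothetical name.

```rust
/// Takes one element from a `drain(1..5)` and then drops the iterator.
/// Returns what is left in the vector afterwards.
fn drain_then_drop() -> Vec<i32> {
    let mut foo = vec![0, 1, 2, 3, 4];
    let mut iter = foo.drain(1..5);
    assert_eq!(iter.next(), Some(1));
    // Dropping the iterator removes the rest of the range (2, 3, 4) as well.
    drop(iter);
    foo
}
```

With the current semantics the result is `[0]`, not the `[0, 2, 3, 4]` that the proposed lazy behavior would give.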
Vec::retain has been stabilized since our original usage, and meets our original intent more concisely than `drain_filter`. For details on the current status of the `drain_filter` RFC, including hang-ups on API consistency and unintuitiveness of lazy evaluation, see its tracking issue at: rust-lang/rust#43244 Also improves error messages and backtraces for proxy_tcp, which was altered to remove drain_filter usage.
For what it's worth, I was searching using the word |
I came across this feature (would be great to see it stabilised!) but what I actually wanted was something like An illustration:
// loaded_images: Record<ImageHandle, Image>
// loaded_images.get(handle) -> Option<Image>
// loading_image_handles: Vec<ImageHandle>
// if the image is loaded, remove its handle from `loading_image_handles`, and do something with the result
// otherwise leave it in (not yet loaded)
// this
for handle in loading_image_handles
.drain_filter(|handle| loaded_images.get(handle).is_some())
{
let image = loaded_images.get(&handle).unwrap();
// to get `image`, we have to do `images.get` again, and unwrap it,
// after we'd already previously checked
}
// could become this
for image in loading_image_handles
.drain_filter_map(|handle| loaded_images.get(handle))
{
// `image` is available
}
As a side note, because of the analogy of |
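A rough sketch of how the `drain_filter_map` idea from the illustration could be approximated on stable Rust. This is a hypothetical free function, not a std API, and unlike the proposed iterator it is eager: it builds on `Vec::retain` and collects the mapped values into a new vector.

```rust
/// For each element, calls `f`; on `Some(u)` the element is removed and `u` is
/// collected, on `None` the element stays. Hypothetical, eager helper.
fn drain_filter_map<T, U, F>(v: &mut Vec<T>, mut f: F) -> Vec<U>
where
    F: FnMut(&T) -> Option<U>,
{
    let mut out = Vec::new();
    v.retain(|item| match f(item) {
        Some(u) => {
            out.push(u);
            false // remove the element that produced a value
        }
        None => true, // keep elements that are not ready yet
    });
    out
}
```

With a map of loaded images, the closure can be the lookup itself, so the second `get` and the `unwrap` from the first loop in the illustration are no longer needed.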
@tigregalis see also rust-lang/rfcs#3299 |
I noticed linked list's DrainFilter is missing Send and Sync impls. I believe these types should have identical autotrait impls because they do not differ in thread safety. std::vec::DrainFilter:
std::collections::linked_list::DrainFilter:
|
I've had more and more use cases where I wish (Alternatively, the item index could also be passed to the filter closure, but that would not be consistent with similar methods.) |
Feature gate:
#![feature(drain_filter)]
This is a tracking issue for `Vec::drain_filter` and `LinkedList::drain_filter`, which can be used for random deletes using iterators.
Public API
Steps / History
Unresolved Questions
Should `drain_filter` accept a `Range` argument?
Missing `Send` + `Sync` impls on linked list's DrainFilter, see comment
See #43244 (comment) for a more detailed summary of open issues.