Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: x/exp/slices: DeleteFunc #54768

Open
Deleplace opened this issue Aug 30, 2022 · 13 comments
Open

proposal: x/exp/slices: DeleteFunc #54768

Deleplace opened this issue Aug 30, 2022 · 13 comments
Labels
Milestone

Comments

@Deleplace
Copy link

Deleplace commented Aug 30, 2022

I propose we add this new func to golang.org/x/exp/slices :

// DeleteFunc removes any elements from s for which del returns true.
// DeleteFunc modifies the contents of the slice s; it does not create a new slice.
// When DeleteFunc removes m elements in total, it might not modify the elements s[len(s)-m:len(s)]. If 
// those elements contain pointers you might consider zeroing those elements so that objects they
// reference can be garbage collected.
func DeleteFunc[S ~[]E, E any](s S, del func(E) bool) S
@gopherbot gopherbot added this to the Proposal milestone Aug 30, 2022
@Deleplace
Copy link
Author

Deleplace commented Aug 30, 2022

The intent is similar to these existing funcs:

  • maps.DeleteFunc
  • maps.EqualFunc
  • slices.CompactFunc
  • slices.CompareFunc
  • slices.EqualFunc
  • slices.IndexFunc
  • slices.IsSortedFunc
  • slices.SortFunc

@Deleplace
Copy link
Author

Deleplace commented Aug 30, 2022

In some uncommon cases, it could be useful to know the original index of an element, to decide if it should be removed or not. For this we would need to pass the index as an argument to the del func: del func(int, E) bool.

In my opinion we should not pass such an index because it is ambiguous, as DeleteFunc is destructive and changes the indices. To keep things simple, DeleteFunc would work only when elements are independent from each other, and the indices don't matter.

@seankhliao seankhliao changed the title proposal: exp/slices: DeleteFunc proposal: x/exp/slices: DeleteFunc Aug 30, 2022
@carlmjohnson
Copy link
Contributor

carlmjohnson commented Aug 31, 2022

This is traditionally called Filter or FilterInPlace. The decision was made to deliberately omit map and filter from slices pending discovery of best practices in real experience.

With that said, I think this would be helpful, and maybe even under the name DeleteFunc. Filtering in place without generics is kind of tedious. The idiom I usually use looks like this:

filteredSlice := original[:0]
for _, e := range original {
    if cond(e) {
        filteredSlice = append(filteredSlice, e)
    }
}
original = filteredSlice

It's not so bad, but it's not obvious to people unfamiliar with the idiom how it works, and it's a lot of lines for a simple idea. I would prefer:

slices.DeleteFunc(&original, func(e E) {
    return !cond(e)
})

In terms of the signature, I think DeleteFunc(s *S, del func(E) bool) is better than DeleteFunc(s S, del func(E) bool) S because the original slice isn't very useful after having run DeleteFunc over it, so you have an extra unnecessary assignment. In the case where someone really wants the original slice, they can do filteredSlice := original before calling DeleteFunc. This also better matches maps.DeleteFunc, which does not return the original map.

@Deleplace
Copy link
Author

Deleplace commented Aug 31, 2022

I agree about the original slice not being useful for any common purpose, after having been changed by one of the destructive funcs.

  • slices.Compact
  • slices.CompactFunc
  • slices.Delete
  • slices.Insert

all accept S and return S, instead of accepting *S and returning nothing. Maybe they do it that way to mimick append, which is only "half-destructive" (depending on what we want to do with the original slice).

@carlmjohnson
Copy link
Contributor

carlmjohnson commented Aug 31, 2022

For append, I think it makes sense that it returns a slice instead of taking a pointer because you may want to append to nil or a subslice etc. I don't think it makes as much sense for Compact/Delete/Insert.

@flowchartsman
Copy link

flowchartsman commented Oct 28, 2022

Having played around with implementing this generically myself, one thing that occurred to me is that another potential benefit for a generic DeleteFunc might be to go ahead and automatically zero out "deleted" slice elements so that any pointed-to memory can be GCed, avoiding the need to remember the "trick to it".

The simplest way would just be to always zero from len(new) -> len(original), or you could reflect on the elem type and zero out the elements conditionally if that's worth it (not sure, poc: https://go.dev/play/p/foOKrgQ5gxN).

This brings up some more interesting questions. Because, if you have generics, and you want to put generic slice methods in the standard library, of course you would want to do the right thing for your users in one place if the cost isn't too high. For my part, I can definitely say that the documentation for slices.Delete is unsatisfying to me as a Go educator, since it's one more thing to have to explain upfront about a seemingly-simple process, and explaining append is already a heavy lift. It's not intuitive that you can "delete" something, but still have chunks of uncollected memory floating around. Of course this is nothing new, since slice expressions effectively do the same thing, but if you're going to the trouble of making a library for it, might as well have it do something extra. It would certainly add a little something back on the other side of the scale that we lose to the need to call functions like this without a lambda syntax.

Perhaps this is an argument for Delete(in-place, reflective) and Filter(copy), or perhaps it's an argument that copying is the more natural thing, and we should just always copy values and live with that. That's certainly the more conservative, Go-y thing to do, but we are also in a new generic world, and that brings a lot of opportunity for new conventions. It may even be that it will be perfectly fine to have something like DeleteFunc(s *S, del func(E) bool), though it does feel a little unnatural to me right now.

I think, I'd vote for either

  • give Delete and DeleteFunc the ability to zero out old capacity or
  • don't bother with them at all, and only copy.

@carlmjohnson
Copy link
Contributor

carlmjohnson commented Oct 29, 2022

You may be interested in #56351 which proposes to add a clear function which would remove all elements from a map and set all values in a slice to the zero value.

@flowchartsman
Copy link

flowchartsman commented Oct 29, 2022

Yes, I think DeleteFunc should probably call that behind the scenes, since there's no reasonable way to zero out the elements yourself unless you keep the original slice and also return the range of indices of the leftover len() after all of the shuffling has completed.

@Deleplace
Copy link
Author

Deleplace commented Oct 30, 2022

I agree with the sentiment that slices.Delete, slices.DeleteFunc, slices.Compact, slices.CompactFunc, slices.Clip would all be more useful and less cryptic if they did the extra work of zeroing out the discarded right tail of the original slice. I suggested such behavior but got strong pushback.

@flowchartsman the range of the leftover is always len(new):len(old) but yes leaving this work as an exercise to the developer at call site is not great in my opinion.

@carlmjohnson
Copy link
Contributor

carlmjohnson commented Oct 30, 2022

Maybe we should propose slice.ClearTail(s S) which runs clear(s[len(s):cap(s)]).

@Deleplace
Copy link
Author

Deleplace commented Oct 30, 2022

ClearTail is much better than nothing, in the world in currently live in.

@flowchartsman
Copy link

flowchartsman commented Oct 30, 2022

I agree with the sentiment that slices.Delete, slices.DeleteFunc, slices.Compact, slices.CompactFunc, slices.Clip would all be more useful and less cryptic if they did the extra work of zeroing out the discarded right tail of the original slice. I suggested such behavior but got strong pushback.

I think that pushback is misplaced. Just because slice expressions let us leave hanging references in a situation we might not want to isn't a strong argument in my opinion that a higher-level, generic function should not "do the right thing" for its expressed usecase. Slice expressions are, to my thinking, a basic feature that has no intrinsic intent beyond just "adjust the len and cap for this view into a backing array". They can be used in a variety of contexts, and it's up to the programmer to execute their intent by either zeroing out the elements or not, depending on what their plans are for that slice in the future. In addition, slice expressions do not change the positions of elements, so there's every expectation that everything in the backing array is just as you left it, and is the explicit reason we have three-argument slice expressions in the first place, to allow sharing a subset view without the worry that someone will accidentally append() and overwrite elements.

With something called Delete*, the intent is clear: we want to remove things, and there's no reason to keep around the remaining capacity, especially when the implementation deliberately overwrites values by shifting their indices, as it does in DeleteFunc There's no way for the user to know which elements are going to stay on the end of the slice and which will be overwritten by shifting a "good" element into their place. The only possible piece of extra information we could give them, as @carlmjohnson suggests, is a range of leftover elements, but even that's only useful for clearing them out, and what would be the point of requiring an extra step?

That's why my argument is that we either don't mess with the indices at all, in which case DeleteFunc (and arguably Delete) are non-starters, or we clear out the other items automatically. The middle ground is kind of pointless IMO

@flowchartsman
Copy link

flowchartsman commented Oct 30, 2022

As an aside, I think exp/slices and exp/maps are an interesting window into the conflict between the general conservative attitude towards language changes and side-effects versus the expanded functionality of type parameters and how useful they can actually be if the desire is to do as little as possible behind the scenes.

Delete and DeleteFunc are especially interesting because they are predicated on implementations that move things around, so you have to ask why you wouldn't clear out the remainder anyway. Then you have to ask why it's an assignment in the first place if there's no reason to keep around the (possibly changed) original. Then you have to ask why you shouldn't just make it take a pointer, and then you have to ask where the aversion to methods taking a slice pointer comes from in the first place. It's a lot of precedent to consider for some seemingly-simple generic functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

4 participants