-
Notifications
You must be signed in to change notification settings - Fork 17.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: slices: add CollectN with preallocated size. #68261
Comments
The name gives me more maxsize than minsize vibes. |
I guess we'd rather change the implementation of That said, the function signature is not particularly nice. I guess (sorry, no evidence) that the majority of use cases is around the functions of
Are there other wide-spread cases that need to be covered? |
ChatGPT suggests the name
I don't understand. Are you suggesting adding Sorted is just Collect+Sort, so I don't see the point of making a bunch of variations on it. If you need better Collecting, then use CollectWithCap, if not, use plain Sorted.
I think this would be a good general purpose tool. You could use it with a hypothetical container/set type, for example, or if we ever genericize container/list and container/heap. I have a generic deque type I would have used this with if it had existed. |
I still assume (sorry), that we'll have 90% maps keys/values. |
I am rehashing arguments that have already been made, but this whole API which is more mental load on the programmer and more places to make mistake wouldn't need to exists if the package iter
type Seq[V any] interface {
Iter(yield func(V) bool)
}
type Seq2[K, V any] interface {
Iter(yield func(K, V) bool)
}
// Len can be implemented by [Seq] and [Seq2] as optional features.
type Len interface {
// Len returns how many calls to yield Iter will do if ok == true.
// To be a correct implementation it MUST do n many calls, no more no less.
// When ok == false the length is not known.
Len() (n int, ok bool)
}
// Cap can be implemented by [Seq] and [Seq2] as optional features.
type Cap interface {
// Cap returns how many calls to yield Iter could do if ok == true.
// To be a correct implementation it MUST do at most n many calls, no more.
// When ok == false the capacity is not known.
Cap() (n int, ok bool)
} package slices
import "iter"
func Collect[V any](i iter.Seq[V]) []V {
var sizeHint int
var hasSizeHint bool
if l, ok := i.(iter.Len); ok {
sizeHint, hasSizeHint = l.Len()
}
if !hasSizeHint {
if l, ok := i.(iter.Cap); ok {
sizeHint, hasSizeHint = l.Cap()
}
}
if !hasSizeHint {
sizeHint = 0
}
r := make([]V, 0, sizeHint)
for v := range i {
r = append(r, v) // if you actually submitted this code to std you probably would want hardening against buggy iterators that return wrong .Len and .Cap.
}
return r
} Having it be optionally implemented would make this so much better, and more generally anything where you sometime can do some extra optimizations if you know the size ahead of time. |
But it isn't. Let's focus on building on top of what is now. |
@fzipp @earthboundkid I would suggest the name "hint" or "sizehint" for the parameter, as it takes the same role as the size hint parameter for make with maps: It has no impact on the semantics or correctness of the program, but makes it more efficient if given correctly. The name |
Like @fzipp above, to me the name And as others have observed we can already do this using I think this function might be premature at this point. |
If an iterator must be used, consider adding a |
@zigo101 That sounds like the existing |
I think it would be okay to put this proposal on hold for six months while we get used to using iterators in production. |
@earthboundkid You can use nstd.CollectMapKeys now. :D |
That's sort of what I mean. We can just use our personal iteration libraries for six months or so then regroup and see if we mostly want CollectWithCap (because we need to collect a lot of other containers into slices) or mostly want maps.KeySlice or maps.KeysSorted etc. based on real world usage. It's a little theoretical for now, although I have started doing some stuff with x/net/html iterators in production. |
It might be a bad idea to put "slices.Collect" in std. |
Imho it is very unfortunate that something as simple and often used as
Majority of use cases is probably better addressed adding dedicated functions for maps.Keys and slices.Values. Suggesting to revisit #61626 |
If the main concern is about efficiently iterating over sorted keys of a map (and possibly its associated values), wouldn't it be addressed by the addition of the following functions (perhaps with better names) to the func Sorted[M ~map[K]V, K cmp.Ordered, V any](m M) iter.Seq2[K, V]
func SortedFunc[M ~map[K]V, K comparable, V any](m M, cmp func(K, K) int) iter.Seq2[K, V] The need for Edit: I see that @earthboundkid had already suggested something similar (the addition of a |
Proposal Details
As discussed in #61900, it's a shame that
slices.Collect(maps.Keys(m))
andslices.Sorted(maps.Keys(m))
don't preallocate the size of m. You can work around it by writingslices.Sorted(slices.Append(make([]T, 0, len(m)), maps.Keys(m)))
, but that's kind of ugly.I propose adding
func CollectN[S ~[]T, T any](seq iter.Seq[T], minsize int) S
to slices that just returnsAppend(make([]T, 0, minsize), seq)
. It would help with map keys but also other container types where you know the length in advance or can at least make a good guess.Sorting map keys more efficiently would look like
keys := slices.CollectN(maps.Keys(m), len(m); slices.Sort(keys)
.The text was updated successfully, but these errors were encountered: