-
Notifications
You must be signed in to change notification settings - Fork 17.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: slices: function for count specific value on slice #62301
Comments
how common is this operation that it needs to be in the standard library? this would just be reduce in #61898 |
It's possible to define this in term of slices.Index: https://go.dev/play/p/H9HZqXlyI2e |
I think it's widespread. |
Also, It's possible to implement the |
it's simpler to just define it directly, too: https://go.dev/play/p/6kCIzk-yTgl how does this come up in your code bases? I've certainly needed to know the count of things before but I usually use a multiset as a histogram because when I need the count of one thing I usually need the count of several others too. |
Typically, I require the |
Here's what I usually do: https://go.dev/play/p/PZoRXcl8qc3 that lets me build the counts up for the whole thing and then I query the map. This is better than doing repeated O(n) queries on the slice if you need to run multiple queries. If you just want to run one query the search is faster and it needs to get rebuilt if the slice changes. I'm usually just doing this for one off analysis or a one time ETL so I usually have all the slices "done" before I need to check the counts and I tend to need to check the counts of multiple items. |
This implementation is better than mine, But what about struct slices? Typically, you would require the |
Have a func from your value to something comparable. When you build the map use h[key(v)] += 1 and check the map using Or use a different data structure. |
As noted earlier, the h := make(map[string]int)
Reduce(h, func(h map[string]int, s string) map[string]int { h[s]++; return h }, seq) Less generally: // could easily use `vs Seq[V]` instead
func Counts[V comparable](vs []V) map[V]int {
return CountsFunc(vs, func(v V) V { return v })
}
func CountsFunc[V any, K comparable](vs []V, key func(V) K) map[K]int {
h := make(map[K]int)
for _, v := range vs {
h[key(v)]++
}
return h
} Counting like this is a no-brainer in e.g. a data science library or language, but strategies for optimal counting diverge ... no opinion on whether or where this fits culturally in Go. |
C++ has |
Python has collections.Counter which consumes an iterator to make a histogram map: https://docs.python.org/3/library/collections.html#collections.Counter |
@ianlancetaylor So, will this function be added to the slices package or x/exp/xiter? |
@dozheiny This is a proposal for adding new API. No decision has been made. I'm saying that I think it would be preferable to add this to x/exp/iter rather than slices. If anybody wants to make an argument that slices, or perhaps both, is better, this is the place to do it. |
I think this definitely makes way more sense as a function for iterators, not slices. It's a function that inherently needs to be implemented via iteration over a slice anyways, so by implementing it for iterators it'll be exactly the same but not limited to just slices. |
For what it’s worth, I feel like the simple case of just counting how many times X appears in a slice or iterator is better of just being done in a loop. I have used languages that have a lot of fancy functions to do this stuff for you, and the end result is you spend more time looking up the documentation for the iterator library than it would take to just write the loop. It creates a second way to do things that only saves a trivial amount of typing but takes up a certain amount of cognitive load. Obviously these things are a spectrum. I’m happy to have slices.Clone even though it’s trivial because it’s shorter and more clear for a very common operation. As the operations become less common though, the value of having them in a library drops off pretty rapidly. |
A quirky argument for |
I don't get why there's a |
@PierreTurnbull That's a fair question. My 2¢ is that strings.Count is doing something different than slices.Count would do. From the strings.Count docs:
Counting non-overlapping instances is slightly tricky. Implementing count of cond(item) in items is not tricky. strings.Count also uses an assembly language version of count for one byte needles, which is also tricky. I think it would be better to give up on this proposal instead propose that a container/histogram package be with a new type similar to Python's collection.Counter class because efficiently doing a count of everything in a sequence is a little tricky. |
We can implement |
@earthboundkid I think trivial features like counting elements should be implemented in the simplest way. The developer that uses/learns the language should not be required to know the technical details of how it works under the hood in order to understand the language. It should be possible to count elements in any kind of iterable data structure in the simplest way, following a standard convention (eg : strings.Count, slices.Count, map.Count, anyiteratoryoucanimagine.Count). Things should be simple, especially when one the main motive of the language is simplicity. As a beginner in Go, I might be missing some intellectual keys to understanding the way things are done so please excuse me if I'm missing an obvious point. This thing and few others leave me quite confused and skeptic concerning the quality of the language and its standard libraries, especially in terms of DX. This first impression is in complete opposition with the positive opinions I heard about this language, which doubles my confusion. |
@PierreTurnbull As you point out, there is an existing Further, the desire to count the number of times an element occurs in a slice would, I think, be better satisfied by adding a function to the iter package or, more likely to start, the x/exp/iter package. That would correct to the C++ Iterators, and for that matter the slices package, are new to Go, and we are working through the details of how to best handle them. We move carefully because we are very strict about backward compatibility, so once we have implemented something we can never change it. |
The slices package has no functions for counting a specific value for now.
I think it's good the definition of function is like this:
An example of the Counts function is like this:
The text was updated successfully, but these errors were encountered: