most_common method for counters #198

ymer · 2016-04-10T06:42:57Z

Return a list of the most common elements in a counter and their counts, ordered from the most common to the least.

most_common(counter([1, 6, 2, 2, 5, 5, 5]))

4-element Array{Pair{Int64,Int64},1}:
5=>3
2=>2
6=>1
1=>1

rawls238 · 2016-04-11T03:47:41Z

doc/source/accumulators.rst

  merge(a, a2)     # return a new accumulator/counter that combines the
                   # values/counts in both a and a2
+
+most_common(a)     # Return a list of the most common elements and their counts from the most common to the least.


I think fixing the indentation here should get rid of the weirdness in the rich diff.

kmsquire · 2016-04-11T05:52:52Z

For parity with Julia Base, it might be nice to create a version of select which does this. See http://docs.julialang.org/en/release-0.4/stdlib/sort/#Base.select! (and http://docs.julialang.org/en/release-0.4/stdlib/sort/#Base.select). Or use select to define most_common, if that makes sense.

There are related functions, nsmallest and nlargest, which are implemented for heaps in this package as well.

ymer · 2016-04-11T06:50:02Z

Could the heap nlargest be extended to take a "by" keyword?

kmsquire · 2016-04-11T13:12:43Z

Well, it could. This might not be a good idea if performance is an issue (see http://docs.julialang.org/en/release-0.4/manual/performance-tips/#declare-types-of-keyword-arguments, although there isn't much explanation as to why keywords can be slow).

I'm curious what your use case would be, since nlargest already picks the largest elements by value, and that's what you're proposing here.

Another idea, which I like a little better, would be to implement a separate HeapCounter type, which would build a counter on heaps instead of dictionaries.

ymer · 2016-04-11T17:53:30Z

I think users would often want all the values in order, rather than just the n most common, and most_common could be somewhat more suitable for this than nlargest.

Here is another suggestion:

most_common(ct) = sort(collect(ct.map), by=p->p[2], rev=true)

most_common(ct, n) = select(collect(ct.map), 1:n, by=p->p[2], rev=true)
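The two definitions above use select, which was renamed partialsort in Julia 0.7. A minimal sketch of the same idea on current Julia follows. It assumes the counts are held in a plain Dict rather than in the package's counter type, and count_items and most_common_sketch are hypothetical names used only for illustration.

```julia
# Hypothetical helper: count occurrences with a plain Dict.
# It stands in for the package's counter type (an assumption of this sketch).
function count_items(xs)
    counts = Dict{eltype(xs),Int}()
    for x in xs
        counts[x] = get(counts, x, 0) + 1
    end
    return counts
end

# All entries, ordered from most common to least common.
most_common_sketch(counts::AbstractDict) =
    sort(collect(counts), by = last, rev = true)

# Only the n most common entries; partialsort does not need to order
# the rest of the collection. Throws if n exceeds the number of entries.
most_common_sketch(counts::AbstractDict, n::Integer) =
    partialsort(collect(counts), 1:n, by = last, rev = true)

counts = count_items([1, 6, 2, 2, 5, 5, 5])
everything = most_common_sketch(counts)
top2 = most_common_sketch(counts, 2)
```

The relative order of entries with equal counts (here 6 and 1) depends on the dictionary's iteration order, so it should not be relied on.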

oxinabox · 2016-05-17T03:32:38Z

It may also be worth special-casing most_common(ct, 1), which I think is another common use case.
select is really slow compared to other methods for finding the single most common element.

Also, good idea. I didn't even know this was missing. This is one of the main uses of the Counter type.
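One way to handle the single-element case is a plain linear scan over the entries, which avoids sorting or selecting altogether. This is only a sketch under the assumption that the counts are available as a Dict-like collection of key => count pairs; single_most_common is a hypothetical name, not part of the package.

```julia
# Hypothetical function: return the key => count pair with the highest count.
# One pass over the entries, no intermediate array and no sorting.
function single_most_common(counts::AbstractDict)
    isempty(counts) && throw(ArgumentError("counter is empty"))
    best = first(counts)            # a key => count pair
    for p in counts
        if last(p) > last(best)     # compare by count
            best = p
        end
    end
    return best
end

winner = single_most_common(Dict(1 => 1, 6 => 1, 2 => 2, 5 => 3))
```

When several keys share the highest count, which of them is returned depends on iteration order.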

rawls238

@ymer, any updates on this? It would be nice to merge this once some of the comments are resolved.

kmsquire · 2017-05-28T08:04:28Z

Bump.

kmsquire · 2018-09-11T06:37:44Z

This seems to have been satisfied by #375 (using nlargest and nsmallest).

ymer added 3 commits April 10, 2016 08:37

most_common method for counters

3a58928

newline at EOF

14283da

newline at EOF

1f25d6d

rawls238 reviewed Apr 11, 2016

rawls238 reviewed Sep 14, 2016


oxinabox mentioned this pull request Mar 23, 2018

add most_common for accumulator #375

Merged

kmsquire closed this Sep 11, 2018
