countmap should sort values #223
Comments
Then there's the question of type stability: returning a (FWIW, |
Thanks @nalimilan
so freqtable may be a solution but I was expecting something in |
You could return an |
Maybe we could have another function |
A more Julian approach would be to have |
According to http://stackoverflow.com/questions/29848734/is-it-possible-to-sort-a-dictionary-in-julia we can get an OrderedDict sorted by value (descending) using
This PR JuliaCollections/DataStructures.jl#43 should help |
Yes, but that's kind of wasteful since it creates an intermediate array. Maybe acceptable as a temporary solution until that PR is merged, though. |
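For reference, the collect-then-sort workaround discussed above can be sketched with Base only. This is a rough illustration, not the exact Stack Overflow snippet: the counting loop is a hand-rolled stand-in for countmap(a), and the names a, counts and sorted_pairs are made up for the example.

```julia
# Made-up input data.
a = ["b", "a", "b", "c", "b", "a"]

# Hand-rolled stand-in for countmap(a), so the sketch needs no packages.
counts = Dict{String,Int}()
for x in a
    counts[x] = get(counts, x, 0) + 1
end

# collect(counts) materializes the intermediate Vector of Pairs mentioned above.
# by = last sorts on the count; rev = true puts the largest count first.
sorted_pairs = sort(collect(counts), by = last, rev = true)

# Assumption, not run here: with DataStructures loaded,
# OrderedDict(sorted_pairs) would keep this order in a dictionary.
```

The result here is a sorted Vector of value => count pairs rather than a dictionary, which is already enough if you only need to iterate in order.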
It's never too late to add those methods. 😉 |
When merged, this PR JuliaCollections/DataStructures.jl#225 should help. Is it possible to add DataStructures as a dependency of StatsBase? |
I'd quite prefer we not add that dependency. If we have a method like countmap{T<:Associative}(::Type{T}, ...) that should allow |
@femtotrader Why not just load |
@andreasnoack I know this. With JuliaCollections/DataStructures.jl#225 we should be able to do
using DataStructures
d = sort(countmap(a), byvalue=true)
which can be quite good for a dict with few keys/values, but I like @ararslan's idea of defining countmap{T<:Associative}(::Type{T}, ...) because if a We just need to find a |
Are you sure that it wouldn't be as fast or faster to sort after the creation of the |
I'm rather skeptical that any kind of special support for this needs to be in StatsBase beyond what's currently there. For your use case, it sounds like the easiest approach would just be to use DataStructures.jl on your end and sort the dict after creation, especially since (AFAIK) no Regarding my suggestion for addcounts!{T}(cm::Dict{T}, x::AbstractArray{T}) If we widen that to addcounts!{T}(cm::Associative{T,Int}, x::AbstractArray{T}) then |
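To make the widening idea concrete, here is a minimal sketch in current Julia syntax. It is not StatsBase's actual addcounts!; addcounts_sketch! is a hypothetical name. Note that Associative was renamed AbstractDict in Julia 0.7/1.0, and the f{T}(...) method syntax became f(...) where {T}.

```julia
# Hypothetical helper, NOT StatsBase's addcounts!: it accepts any
# AbstractDict{T,Int} instead of only Dict{T}.
function addcounts_sketch!(cm::AbstractDict{T,Int}, x::AbstractArray{T}) where {T}
    for v in x
        # Start from 0 when the key has not been seen yet.
        cm[v] = get(cm, v, 0) + 1
    end
    return cm
end

# Works with a plain Dict ...
cm = addcounts_sketch!(Dict{Int,Int}(), [3, 1, 3, 2, 3])

# ... and with another AbstractDict subtype from Base, here IdDict.
cm2 = addcounts_sketch!(IdDict{Int,Int}(), [1, 1, 2])
```

With a signature like this, a caller could in principle pass an ordered or sorted dictionary type from DataStructures.jl without StatsBase having to depend on that package.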
I agree with Alex and the fact that no one has answered in the last 6.5 years means this can probably be closed? |
According to the documentation at http://statsbasejl.readthedocs.io/en/latest/counts.html#countmap, countmap returns a dictionary that maps distinct values in x to their counts (or total weights). It may be useful to sort these values (in ascending order).
So an OrderedDict may be a better data structure for this.
Use case:
When you define a time series, it may help to find the most prevalent period between two consecutive timestamps (the sampling period).
You will have to sort this OrderedDict by value in descending order.
The most prevalent period will then be the first key of this OrderedDict.
When timestamps are regularly spaced, you get a dict with only one key (the sampling period).
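A small Base-only sketch of this use case, assuming integer timestamps in seconds. The data and the names timestamps, period_counts and sampling_period are invented for illustration, and the counting loop stands in for countmap:

```julia
# Made-up timestamps in seconds: mostly 10 s apart, with one irregular gap.
timestamps = [0, 10, 20, 30, 45, 55, 65]

# Periods between consecutive timestamps.
periods = diff(timestamps)

# Count how often each period occurs (stand-in for countmap(periods)).
period_counts = Dict{Int,Int}()
for p in periods
    period_counts[p] = get(period_counts, p, 0) + 1
end

# Sort the (period => count) pairs by count, largest first,
# and take the period from the first pair.
by_count = sort(collect(period_counts), by = last, rev = true)
sampling_period = first(first(by_count))
```

For perfectly regular timestamps, period_counts would hold a single key and no sorting would be needed.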
Python's pandas has a value_counts method for Series, which is very useful: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.value_counts.html