
Commit

document :radixsort performance (#720)
LilithHafner committed Sep 3, 2023
1 parent 0e8e4aa commit 7ca72c7
Showing 1 changed file with 12 additions and 8 deletions.
20 changes: 12 additions & 8 deletions src/counts.jl
@@ -258,12 +258,14 @@ raw counts.
 - `:radixsort`: if `radixsort_safe(eltype(x)) == true` then use the
 [radix sort](https://en.wikipedia.org/wiki/Radix_sort)
 algorithm to sort the input vector which will generally lead to
-shorter running time. However the radix sort algorithm creates a
-copy of the input vector and hence uses more RAM. Choose `:dict`
-if the amount of available RAM is a limitation.
+shorter running time for large `x` with many duplicates. However
+the radix sort algorithm creates a copy of the input vector and
+hence uses more RAM. Choose `:dict` if the amount of available
+RAM is a limitation.
 - `:dict`: use `Dict`-based method which is generally slower but uses less
-RAM and is safe for any data type.
+RAM, is safe for any data type, is faster for small arrays, and
+is faster when there are not many duplicates.
 """
 addcounts!(cm::Dict, x; alg = :auto) = _addcounts!(eltype(x), cm, x, alg = alg)

@@ -430,12 +432,14 @@ raw counts.
 - `:radixsort`: if `radixsort_safe(eltype(x)) == true` then use the
 [radix sort](https://en.wikipedia.org/wiki/Radix_sort)
 algorithm to sort the input vector which will generally lead to
-shorter running time. However the radix sort algorithm creates a
-copy of the input vector and hence uses more RAM. Choose `:dict`
-if the amount of available RAM is a limitation.
+shorter running time for large `x` with many duplicates. However
+the radix sort algorithm creates a copy of the input vector and
+hence uses more RAM. Choose `:dict` if the amount of available
+RAM is a limitation.
 - `:dict`: use `Dict`-based method which is generally slower but uses less
-RAM and is safe for any data type.
+RAM, is safe for any data type, is faster for small arrays, and
+is faster when there are not many duplicates.
 """
 countmap(x; alg = :auto) = addcounts!(Dict{eltype(x),Int}(), x; alg = alg)
 countmap(x::AbstractArray{T}, wv::AbstractVector{W}) where {T,W<:Real} = addcounts!(Dict{T,W}(), x, wv)
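
The trade-off described in the docstring can be illustrated with a minimal, self-contained sketch. This is not the StatsBase implementation: `count_with_dict` and `count_with_sort` are hypothetical helper names, and Base's `sort` stands in for the radix sort. The sketch only shows why a sort-based approach needs a copy of the input, and why it plausibly pays off when there are many duplicates (one dictionary write per distinct value instead of one per element).

```julia
# Dict-based counting: one hash lookup and update per element, no copy of the input.
function count_with_dict(x)
    counts = Dict{eltype(x),Int}()
    for v in x
        counts[v] = get(counts, v, 0) + 1
    end
    return counts
end

# Sort-based counting: sort a copy, then count runs of equal values.
# Assumption: Base `sort` is used here in place of a radix sort.
function count_with_sort(x)
    counts = Dict{eltype(x),Int}()
    isempty(x) && return counts
    sorted = sort(x)            # allocates a sorted copy, hence the extra RAM
    current = first(sorted)
    runlen = 0
    for v in sorted
        if isequal(v, current)
            runlen += 1
        else
            counts[current] = runlen   # one Dict write per distinct value
            current = v
            runlen = 1
        end
    end
    counts[current] = runlen    # flush the final run
    return counts
end
```

Both helpers return the same counts for the same input. With StatsBase itself, the choice is made through the `alg` keyword shown in the diff, for example `countmap(x; alg = :dict)` or `countmap(x; alg = :radixsort)`.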