document :radixsort performance (#720)

JuliaStats · Sep 3, 2023 · 7ca72c7
1 parent 0e8e4aa
Showing 1 changed file with 12 additions and 8 deletions.
diff --git a/src/counts.jl b/src/counts.jl
@@ -258,12 +258,14 @@ raw counts.
 - `:radixsort`:      if `radixsort_safe(eltype(x)) == true` then use the
                      [radix sort](https://en.wikipedia.org/wiki/Radix_sort)
                      algorithm to sort the input vector which will generally lead to
-                     shorter running time. However the radix sort algorithm creates a
-                     copy of the input vector and hence uses more RAM. Choose `:dict`
-                     if the amount of available RAM is a limitation.
+                     shorter running time for large `x` with many duplicates. However
+                     the radix sort algorithm creates a copy of the input vector and
+                     hence uses more RAM. Choose `:dict` if the amount of available
+                     RAM is a limitation.
 
 - `:dict`:           use `Dict`-based method which is generally slower but uses less
-                     RAM and is safe for any data type.
+                     RAM, is safe for any data type, is faster for small arrays, and
+                     is faster when there are not many duplicates.
 """
 addcounts!(cm::Dict, x; alg = :auto) = _addcounts!(eltype(x), cm, x, alg = alg)
 
@@ -430,12 +432,14 @@ raw counts.
 - `:radixsort`:      if `radixsort_safe(eltype(x)) == true` then use the
                      [radix sort](https://en.wikipedia.org/wiki/Radix_sort)
                      algorithm to sort the input vector which will generally lead to
-                     shorter running time. However the radix sort algorithm creates a
-                     copy of the input vector and hence uses more RAM. Choose `:dict`
-                     if the amount of available RAM is a limitation.
+                     shorter running time for large `x` with many duplicates. However
+                     the radix sort algorithm creates a copy of the input vector and
+                     hence uses more RAM. Choose `:dict` if the amount of available
+                     RAM is a limitation.
 
 - `:dict`:           use `Dict`-based method which is generally slower but uses less
-                     RAM and is safe for any data type.
+                     RAM, is safe for any data type, is faster for small arrays, and
+                     is faster when there are not many duplicates.
 """
 countmap(x; alg = :auto) = addcounts!(Dict{eltype(x),Int}(), x; alg = alg)
 countmap(x::AbstractArray{T}, wv::AbstractVector{W}) where {T,W<:Real} = addcounts!(Dict{T,W}(), x, wv)