Skip to content

Median#12

Merged
brenhinkeller merged 23 commits intomainfrom
median
Jan 5, 2022
Merged

Median#12
brenhinkeller merged 23 commits intomainfrom
median

Conversation

@brenhinkeller
Copy link
Copy Markdown
Collaborator

@brenhinkeller brenhinkeller commented Jan 2, 2022

So this only manages to vectorize a relatively small part of the underlying sorting, but still seems to beat base pretty handily in most non-memory-bandwidth-limited cases. Good enough for now?

julia> A = rand(100); sort(A) == vsort!(A)
true

julia> @benchmark sort!(A) setup = A = rand(100)
BenchmarkTools.Trial: 10000 samples with 199 evaluations.
 Range (min … max):  413.638 ns …  1.454 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     442.970 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   474.331 ns ± 82.305 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▇▅█▆▃▄▅▂▁ ▂▂▃▄▁   ▂▂▃▃▃                                  ▁  ▂
  ▇███████████████████████▇▆▇▇▇▇▆▆▆▅▅▇▁▆▅▅▅▄▅▅▄▄▄▅▅▃▃▅▃▅▅██▆██ █
  414 ns        Histogram: log(frequency) by time       860 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vsort!(A) setup = A = rand(100)
BenchmarkTools.Trial: 10000 samples with 962 evaluations.
 Range (min … max):  85.747 ns … 246.201 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     86.377 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   92.525 ns ±  16.230 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █ ▁ ▄  ▁   ▄        ▄                                      ▂ ▁
  █████▇▇█▇▇▇██▇▇▆▆▆▆▆█▆▆▆▅▅▅▆▅▅▅▅▄▃▅▅▅▄▅▄▄▅▄▄▄▄▃▃▁▄▃▄▄▁▃▄▃▁▄█ █
  85.7 ns       Histogram: log(frequency) by time       168 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark sort!(A) setup = A = rand(10_000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  410.692 μs … 933.455 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     431.529 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   458.630 μs ±  67.393 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▆█▇▄▄▅▄▂▂▂▂▂▃▃▂▁▁ ▁▁▂▃▂▁                                     ▂
  ██████████████████████████▇▆▇▆▆▇▆▆▅▅▅▆▆▄▅▅▆▄▄▅▆▆▁▄▄▅▆▅▅▅▄▄▅▆▇ █
  411 μs        Histogram: log(frequency) by time        827 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vsort!(A) setup = A = rand(10_000)
BenchmarkTools.Trial: 6582 samples with 5 evaluations.
 Range (min … max):  130.921 μs … 274.166 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     137.940 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   144.805 μs ±  16.559 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▁█▇▁                                                         
  ▂▄████▆▅▄▄▄▄▄▃▃▃▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂ ▃
  131 μs           Histogram: frequency by time          214 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

The way I've implemented this currently, we also do especially well with arrays that are antisorted or have large sorted/antisorted runs (especially given that the reversal can be nicely vectorized)

julia> @benchmark sort!(A) setup = A = collect(1000:-1:1)
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.814 μs … 15.853 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.085 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.524 μs ±  1.012 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅█▅▆▄▁▇▁ ▅▁    ▄▄ ▁▁   ▅▂                                ▃ ▂
  ████████▇███▇▆▅██████▇▇██▆▅▅▆▅▆▅▅▅▇▆▅▄▅▅▅▅▄▄▅▄▅▅▄▂▄▃▂▄▂▃▃█ █
  3.81 μs      Histogram: log(frequency) by time     8.36 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vsort!(A) setup = A = collect(1000:-1:1)
BenchmarkTools.Trial: 10000 samples with 200 evaluations.
 Range (min … max):  417.085 ns …  1.540 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     441.080 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   480.585 ns ± 96.013 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅█▅▃▄ ▆▃▂▄▂▁ ▁▁ ▄       ▅                                  ▃ ▂
  █████▇███████████▇███▇▇▅███▅▅▅▆▇▆▆▅▅▅▆▄▄▄▃▃▅▅▄▄▅▄▅▁▃▁▃▃▄▄▁▃█ █
  417 ns        Histogram: log(frequency) by time       909 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

@brenhinkeller
Copy link
Copy Markdown
Collaborator Author

brenhinkeller commented Jan 2, 2022

As for the actual median,
Small arrays, where quicksort! is used by default:

julia> A = rand(100); Statistics.median(A) == vmedian!(A)
true

julia> @benchmark Statistics.median!(A) setup = A = rand(100)
BenchmarkTools.Trial: 10000 samples with 250 evaluations.
 Range (min … max):  306.088 ns … 988.924 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     316.208 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   342.815 ns ±  64.051 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅█▆▃▂▄▃ ▁▁ ▂▄▃▁     ▃▄▃▁                                  ▁▁▂ ▂
  █████████████████▇▇█████▇▆▆▅▇▇▅▆▆▅▆▃▅▅▁▄▄▅▃▅▄▄▄▄▁▄▁▄▁▄▃▃▄▆███ █
  306 ns        Histogram: log(frequency) by time        617 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vmedian!(A) setup = A = rand(100)
BenchmarkTools.Trial: 10000 samples with 963 evaluations.
 Range (min … max):  83.958 ns … 278.579 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     84.805 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   91.658 ns ±  16.531 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▄▃ ▅▁ ▁   ▄▁       ▄                                      ▂ ▁
  ██████████▇████▆▇▇█▇███▆▅▆▆▇▇▆▆▆▆▆▆▆▅▅▆▅▃▅▄▅▄▅▅▆▄▁▄▅▃▅▅▁▄▄▆█ █
  84 ns         Histogram: log(frequency) by time       165 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Larger arrays, where quickselect! is used by default:

julia> A = rand(10_000); Statistics.median(A) == vmedian!(A)
true

julia> @benchmark Statistics.median!(A) setup = A = rand(10_000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   61.734 μs … 278.162 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     114.070 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   115.872 μs ±  23.300 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▁▂▄▄▆▇██▇▇▇▄▂                                     
  ▁▁▁▂▂▂▃▄▄▅▆▇▇██████████████▇▅▄▃▃▃▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  61.7 μs          Histogram: frequency by time          221 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vmedian!(A) setup = A = rand(10_000)
BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min … max):  15.664 μs … 101.265 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     27.004 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   27.720 μs ±   5.672 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▁▃▄▅█▇▇█▆▅▃▁                                     
  ▁▁▁▂▂▃▃▄▅▆▇██████████████▇▆▅▅▄▃▃▃▃▂▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  15.7 μs         Histogram: frequency by time           51 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

@brenhinkeller
Copy link
Copy Markdown
Collaborator Author

Quantile/percentile, small arrays (quicksort!)

julia> A = rand(100); StatsBase.quantile(A, 0.33) == vquantile!(A, 0.33)
true

julia> @benchmark StatsBase.quantile!(A, 0.33) setup = A = rand(100)
BenchmarkTools.Trial: 10000 samples with 230 evaluations.
 Range (min … max):  325.417 ns …  1.102 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     332.954 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   364.569 ns ± 68.025 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▂▁▄▅  ▁▁ ▃▅       ▄▃                                     ▂ ▁
  ██████▇▇███████▇▇▇▇▇██▇▆▆▆▆▆▆▇▆▅▆▅▅▅▅▄▆▅▄▅▄▄▄▄▄▄▅▄▄▂▃▅▃▅▅▃██ █
  325 ns        Histogram: log(frequency) by time       646 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vquantile!(A, 0.33) setup = A = rand(100)
BenchmarkTools.Trial: 10000 samples with 960 evaluations.
 Range (min … max):  87.403 ns … 309.148 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     88.156 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   95.835 ns ±  17.409 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▄▁ ▅▁ ▁   ▄▁       ▄▁                                     ▂ ▁
  ███▇████▆▇▆██▇▇▆▆▅▇▆██▆▅▅▆▅▆▅▅▅▅▅▅▄▃▅▄▄▄▄▅▅▃▃▃▅▃▃▄▄▄▄▄▃▄▃▃▄█ █
  87.4 ns       Histogram: log(frequency) by time       171 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

Quantile/percentile, larger arrays (quickselect!)

julia> A = rand(10_000); StatsBase.quantile(A, 0.33) == vquantile!(A, 0.33)
true

julia> @benchmark StatsBase.quantile!(A, 0.33) setup = A = rand(10_000)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):   52.767 μs … 410.060 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     111.597 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   112.350 μs ±  24.431 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                ▁▁▂▂▅▆▆▆▇█▇█▅▅▃                                  
  ▁▁▁▂▂▂▃▄▄▅▅▅▆▇████████████████▆▆▄▃▃▂▂▂▂▂▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  52.8 μs          Histogram: frequency by time          213 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark vquantile!(A, 0.33) setup = A = rand(10_000)
BenchmarkTools.Trial: 10000 samples with 6 evaluations.
 Range (min … max):  12.238 μs … 58.217 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     22.722 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   23.199 μs ±  5.066 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▂▃▄▄▆▇▇█▆▆▄▁                                    
  ▁▁▂▂▂▃▄▅▅▆▇▇██████████████▇▅▅▄▄▃▃▃▃▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  12.2 μs         Histogram: frequency by time        43.9 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

@codecov
Copy link
Copy Markdown

codecov bot commented Jan 2, 2022

Codecov Report

Merging #12 (e568c52) into main (15fda55) will decrease coverage by 1.77%.
The diff coverage is 93.48%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #12      +/-   ##
==========================================
- Coverage   98.71%   96.93%   -1.78%     
==========================================
  Files           7       11       +4     
  Lines         934     1404     +470     
==========================================
+ Hits          922     1361     +439     
- Misses         12       43      +31     
Impacted Files Coverage Δ
src/VectorizedStatistics.jl 100.00% <ø> (ø)
src/vsort.jl 91.48% <91.48%> (ø)
src/quicksort.jl 91.51% <91.51%> (ø)
src/vmedian.jl 95.28% <95.28%> (ø)
src/vquantile.jl 96.19% <96.19%> (ø)
src/vcov.jl 97.63% <100.00%> (ø)
src/vmean.jl 99.57% <100.00%> (ø)
src/vstd.jl 100.00% <100.00%> (ø)
src/vsum.jl 99.55% <100.00%> (ø)
src/vvar.jl 98.13% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 15fda55...e568c52. Read the comment docs.

Comment thread src/vsort.jl Outdated
@brenhinkeller brenhinkeller merged commit 4861579 into main Jan 5, 2022
@brenhinkeller brenhinkeller deleted the median branch January 5, 2022 00:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants