Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

faster bin index for uniform edge #613

Closed
wants to merge 1 commit into from

Conversation

Moelf
Copy link
Contributor

@Moelf Moelf commented Oct 29, 2020

see #563

before:

julia> @benchmark fit(Histogram, $(rand(10^6)), 0:0.05:1; closed=:right)
BenchmarkTools.Trial:
  memory estimate:  320 bytes
  allocs estimate:  2
  --------------
  minimum time:     25.080 ms (0.00% GC)
  median time:      25.262 ms (0.00% GC)
  mean time:        25.308 ms (0.00% GC)
  maximum time:     27.498 ms (0.00% GC)
  --------------
  samples:          198
  evals/sample:     1

julia> @benchmark fit(Histogram, $(rand(10^6)), 0:0.05:1; closed=:left)
BenchmarkTools.Trial:
  memory estimate:  320 bytes
  allocs estimate:  2
  --------------
  minimum time:     25.962 ms (0.00% GC)
  median time:      26.176 ms (0.00% GC)
  mean time:        26.764 ms (0.00% GC)
  maximum time:     28.579 ms (0.00% GC)
  --------------
  samples:          187
  evals/sample:     1

after:

julia> @benchmark fit(Histogram, $(rand(10^6)), 0:0.05:1; closed=:right)
BenchmarkTools.Trial:
  memory estimate:  320 bytes
  allocs estimate:  2
  --------------
  minimum time:     9.181 ms (0.00% GC)
  median time:      9.534 ms (0.00% GC)
  mean time:        9.539 ms (0.00% GC)
  maximum time:     10.606 ms (0.00% GC)
  --------------
  samples:          525
  evals/sample:     1

julia> @benchmark fit(Histogram, $(rand(10^6)), 0:0.05:1; closed=:left)
BenchmarkTools.Trial:
  memory estimate:  320 bytes
  allocs estimate:  2
  --------------
  minimum time:     15.343 ms (0.00% GC)
  median time:      18.779 ms (0.00% GC)
  mean time:        18.766 ms (0.00% GC)
  maximum time:     18.947 ms (0.00% GC)
  --------------
  samples:          267
  evals/sample:     1

@Moelf
Copy link
Contributor Author

Moelf commented Oct 29, 2020

notice this also makes multi-dimensional faster automatically. I hope @ianshmean is as happy as I am rn ;)

julia> d = rand(3,10^6);

julia> @btime fit(Histogram, (d[1,:], d[2,:], d[3,:]), (0:0.05:1, 0:0.05:1, 0:0.05:1));
  88.310 ms (15 allocations: 22.95 MiB)

julia> @btime fit(Histogram, (d[1,:], d[2,:], d[3,:]), (0:0.05:1, 0:0.05:1, 0:0.05:1));
  48.421 ms (15 allocations: 22.95 MiB)

@Octogonapus
Copy link

I gave this a try locally and it is a performance regression in our use case. Our bins are a StepRange{UInt8,UInt8} 0x00:0x10:0xf0 so maybe something about the non-unit step causes this? I also had to clamp n between the bounds of the edge range to avoid an OOB exception (caused by the non-unit step in the range): n = clamp(round(Int, (x - first(edge)) / step(edge) + 1), 1, length(edge)).

@Moelf
Copy link
Contributor Author

Moelf commented Oct 30, 2020

@Octogonapus interesting, any chance to paste a MWE here?

@Octogonapus
Copy link

using Images, TestImages, FileIO, Rotations, CoordinateTransformations, StatsBase, BenchmarkTools

img = Gray.(testimage("lighthouse"));
tfm = Translation(125,250)  LinearMap(RotMatrix(pi/50))  Translation(-125,-250);
img_rotated = warp(img, tfm);

fixed = collect(reinterpret(UInt8, img[50:300, 50:300]));
moving = collect(reinterpret(UInt8, img_rotated[50:300, 50:300]));

step = UInt8(16);
h = fit(Histogram, (vec(fixed), vec(moving)), (0x00:step:0xff, 0x00:step:0xff));
@benchmark append!(h, (vec(fixed), vec(moving)))

Baseline:

BenchmarkTools.Trial: 
  memory estimate:  192 bytes
  allocs estimate:  5
  --------------
  minimum time:     859.135 μs (0.00% GC)
  median time:      924.734 μs (0.00% GC)
  mean time:        959.305 μs (0.00% GC)
  maximum time:     1.570 ms (0.00% GC)
  --------------
  samples:          5208
  evals/sample:     1

This patch without my fix:

ERROR: InexactError: trunc(UInt8, 256)
Stacktrace:
 [1] throw_inexacterror(::Symbol, ::Type{UInt8}, ::Int64) at ./boot.jl:558
 [2] checked_trunc_uint at ./boot.jl:588 [inlined]
 [3] toUInt8 at ./boot.jl:650 [inlined]
 [4] UInt8 at ./boot.jl:710 [inlined]
 [5] convert at ./number.jl:7 [inlined]
 [6] getindex at ./range.jl:658 [inlined]
 [7] _edge_binindex at /home/salmon/Documents/StatsBase.jl/src/hist.jl:255 [inlined]
 [8] (::StatsBase.var"#131#132"{Histogram{Int64,2,Tuple{StepRange{UInt8,UInt8},StepRange{UInt8,UInt8}}}})(::StepRange{UInt8,UInt8}, ::UInt8) at /home/salmon/Documents/StatsBase.jl/src/hist.jl:227
 [9] map at ./tuple.jl:177 [inlined]
 [10] binindex at /home/salmon/Documents/StatsBase.jl/src/hist.jl:226 [inlined]
 [11] push! at /home/salmon/Documents/StatsBase.jl/src/hist.jl:311 [inlined]
 [12] append!(::Histogram{Int64,2,Tuple{StepRange{UInt8,UInt8},StepRange{UInt8,UInt8}}}, ::Tuple{Array{UInt8,1},Array{UInt8,1}}) at /home/salmon/Documents/StatsBase.jl/src/hist.jl:332
 [13] ##core#853() at /home/salmon/.julia/packages/BenchmarkTools/eCEpo/src/execution.jl:371
 [14] ##sample#854(::BenchmarkTools.Parameters) at /home/salmon/.julia/packages/BenchmarkTools/eCEpo/src/execution.jl:377
 [15] _run(::BenchmarkTools.Benchmark{Symbol("##benchmark#852")}, ::BenchmarkTools.Parameters; verbose::Bool, pad::String, kwargs::Base.Iterators.Pairs{Symbol,Integer,NTuple{4,Symbol},NamedTuple{(:samples, :evals, :gctrial, :gcsample),Tuple{Int64,Int64,Bool,Bool}}}) at /home/salmon/.julia/packages/BenchmarkTools/eCEpo/src/execution.jl:405
 [16] (::Base.var"#inner#2"{Base.Iterators.Pairs{Symbol,Integer,NTuple{5,Symbol},NamedTuple{(:verbose, :samples, :evals, :gctrial, :gcsample),Tuple{Bool,Int64,Int64,Bool,Bool}}},typeof(BenchmarkTools._run),Tuple{BenchmarkTools.Benchmark{Symbol("##benchmark#852")},BenchmarkTools.Parameters}})() at ./essentials.jl:713
 [17] #invokelatest#1 at ./essentials.jl:714 [inlined]
 [18] #run_result#37 at /home/salmon/.julia/packages/BenchmarkTools/eCEpo/src/execution.jl:32 [inlined]
 [19] run(::BenchmarkTools.Benchmark{Symbol("##benchmark#852")}, ::BenchmarkTools.Parameters; progressid::Nothing, nleaves::Float64, ndone::Float64, kwargs::Base.Iterators.Pairs{Symbol,Integer,NTuple{5,Symbol},NamedTuple{(:verbose, :samples, :evals, :gctrial, :gcsample),Tuple{Bool,Int64,Int64,Bool,Bool}}}) at /home/salmon/.julia/packages/BenchmarkTools/eCEpo/src/execution.jl:94
 [20] #warmup#45 at /home/salmon/.julia/packages/BenchmarkTools/eCEpo/src/execution.jl:141 [inlined]
 [21] warmup(::BenchmarkTools.Benchmark{Symbol("##benchmark#852")}) at /home/salmon/.julia/packages/BenchmarkTools/eCEpo/src/execution.jl:141
 [22] top-level scope at /home/salmon/.julia/packages/BenchmarkTools/eCEpo/src/execution.jl:287

This patch with my fix:

BenchmarkTools.Trial: 
  memory estimate:  192 bytes
  allocs estimate:  5
  --------------
  minimum time:     1.565 ms (0.00% GC)
  median time:      1.671 ms (0.00% GC)
  mean time:        1.679 ms (0.00% GC)
  maximum time:     2.385 ms (0.00% GC)
  --------------
  samples:          2977
  evals/sample:     1

@Moelf
Copy link
Contributor Author

Moelf commented Aug 28, 2021

implemented in https://github.com/Moelf/FHist.jl

julia> a = rand(10^5);

julia> using BenchmarkTools

julia> using FHist, StatsBase

julia> @benchmark Hist1D($a, -3:0.01:3) 
BenchmarkTools.Trial: 9057 samples with 1 evaluation.
 Range (min  max):  117.353 μs   1.412 ms  ┊ GC (min  max): 0.00%  82.64%
 Time  (median):     547.890 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   550.589 μs ± 32.415 μs  ┊ GC (mean ± σ):  0.07% ±  1.59%

                                                          █     
  ▂▂▂▂▁▂▂▁▁▁▂▁▁▁▂▁▁▂▂▁▁▁▂▁▁▁▁▁▁▂▁▂▁▁▁▂▁▁▂▁▁▁▂▁▂▁▁▁▂▂▂▁▁▂▁▃█▆█▅ ▂
  117 μs          Histogram: frequency by time          572 μs <

 Memory estimate: 14.58 KiB, allocs estimate: 6.

julia> @benchmark fit(Histogram, $a, -3:0.01:3) 
BenchmarkTools.Trial: 1995 samples with 1 evaluation.
 Range (min  max):  2.457 ms   2.635 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     2.482 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.505 ms ± 40.487 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

        ▄█▅▁                                                  
  ▂▂▃▃▃▅████▅▅▅▅▄▄▃▃▃▃▃▃▂▂▂▂▂▂▂▁▂▁▂▂▂▂▂▃▃▄▅▆▆▄▃▃▃▃▃▂▃▂▂▂▂▂▂▂ ▃
  2.46 ms        Histogram: frequency by time        2.61 ms <

 Memory estimate: 4.89 KiB, allocs estimate: 2.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants