
Performance improvement #185

Open: wants to merge 4 commits into master

Conversation

@ParadaCarleton

Only sort the tail values of the array, rather than the whole thing.

@avehtari (Collaborator)

Does this keep the largest value smaller than the tail values in the correct position?

cutoff <- ord$x[min(tail_ids) - 1] # largest value smaller than tail values

@ParadaCarleton (Author)

Hopefully someday my brain will stop making off-by-one errors, but today is not that day.

@avehtari (Collaborator)

Maybe I need to rephrase. After your change, how do you determine the largest value that is smaller than the tail values? If you don't sort them, I think you should at least find the largest value among the non-tail values, e.g., with the max() function.
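
For concreteness, a minimal sketch of that alternative, reusing ord$x and tail_ids from the snippet above and assuming the tail still occupies the last positions of ord$x:

# Hypothetical sketch, not the PR's code: if ord$x is no longer fully
# sorted below the tail, recover the cutoff with a single max() scan.
cutoff <- max(ord$x[seq_len(min(tail_ids) - 1)]) # largest non-tail value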

@ParadaCarleton (Author)

> Maybe I need to rephrase. After your change, how do you determine the largest value that is smaller than the tail values? If you don't sort them, I think you should at least find the largest value among the non-tail values, e.g., with the max() function.

Doesn't the code determine that by sorting the n+1 largest values into position, where n is the tail length? I might be misunderstanding what R's partial sort does.
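
For reference, a small sketch (hypothetical names, not this PR's code) of what R's partial sort guarantees: requesting positions (S - M):S places the M largest values, plus the cutoff just below them, into their fully-sorted positions in one call.

# Sketch of sort(partial = ): the values at the requested positions end
# up exactly where a full sort would put them.
S <- 1000
M <- ceiling(3 * sqrt(S))     # tail length with r_eff = 1, as in PSIS
x <- rnorm(S)                 # stand-in for the log importance ratios
y <- sort(x, partial = (S - M):S)
tail_vals <- y[(S - M + 1):S] # the M largest values, sorted
cutoff <- y[S - M]            # largest value smaller than the tail
stopifnot(identical(tail_vals, sort(x)[(S - M + 1):S]))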

@avehtari (Collaborator)

Ah, I was too tired when checking this and got confused. I'm also confused that the off-by-one commit didn't make any change to the code; the only change is in the JSON. So how did you fix the off-by-one error?

@ParadaCarleton (Author)

> Ah, I was too tired when checking this and got confused. I'm also confused that the off-by-one commit didn't make any change to the code; the only change is in the JSON. So how did you fix the off-by-one error?

Whoops, pushed the wrong change -- this should fix the off-by-one error. (I was accidentally only sorting the tail, rather than sorting from the last element not in the tail.)

@jgabry (Member) commented Sep 20, 2021

Thanks @ParadaCarleton and sorry for not getting to this sooner! I just triggered all the continuous integration tests to run again so let's see if they pass.

And yeah, @avehtari, I think the latest commit from Carlos fixes the off-by-one issue, although if you want to double-check, that would be great. The tests should also catch it, since I think we have regression tests that detect any changes to the loo results (and presumably an off-by-one error would lead to some change in the results).

@jgabry (Member) commented Sep 20, 2021

Actually, some tests seem to be failing, so we'll need to look into that: https://github.com/stan-dev/loo/pull/185/checks?check_run_id=3656664234#step:11:150

@sethaxen

Are there benchmarks showing that this is indeed a performance improvement in R? I tried something similar for PSIS.jl and found that in Julia at least it was always slower. e.g.

julia> using BenchmarkTools

julia> tail_length(reff, S) = min(cld(S, 5), ceil(Int, 3 * sqrt(S / reff)));

julia> function foo(logw)
           S = length(logw)
           M = tail_length(1, S)
           return sort(logw)[M:S]
       end;

julia> function bar(logw)
           S = length(logw)
           M = tail_length(1, S)
           return partialsort(logw, M:S)
       end;

julia> logw = randn(1000);

julia> foo(logw) == bar(logw)
true

julia> @btime $foo($logw);
  18.822 μs (2 allocations: 15.19 KiB)

julia> @btime $bar($logw);
  21.504 μs (1 allocation: 7.94 KiB)

@ParadaCarleton (Author) commented Dec 11, 2021

> Are there benchmarks showing that this is indeed a performance improvement in R? I tried something similar for PSIS.jl and found that in Julia at least it was always slower. […]

I benchmarked this and got a major speedup in Julia when I switched to only sorting partially; I assume the difference has to do with Julia's sorting algorithms, which have some large problems with cache misses. Feel free to discuss this with me on Slack.

@ParadaCarleton (Author)

> Are there benchmarks showing that this is indeed a performance improvement in R? I tried something similar for PSIS.jl and found that in Julia at least it was always slower. […]

You're sorting from the tail length to the end of the array, accidentally sorting nearly everything rather than just the tail. The result is that you're asking for almost the whole array to be sorted (~900 instead of ~100 elements), so the constant overhead of partialsort exceeds its better asymptotic performance. The cost of sorting k elements out of a list of n is O(n + k log(k)); because k = O(sqrt(n)), the cost is linear in the number of samples instead of O(n log(n)). (The performance is still better for small samples, though, because k is at most 20% of the sample.)

julia> @benchmark partialsort($x, (1000-len):1000)
BenchmarkTools.Trial: 
  memory estimate:  8.03 KiB
  allocs estimate:  4
  --------------
  minimum time:     2.609 μs (0.00% GC)
  median time:      2.741 μs (0.00% GC)
  mean time:        3.107 μs (6.50% GC)
  maximum time:     177.498 μs (89.91% GC)
  --------------
  samples:          10000
  evals/sample:     9

julia> @benchmark sort($x)
BenchmarkTools.Trial: 
  memory estimate:  7.94 KiB
  allocs estimate:  1
  --------------
  minimum time:     12.518 μs (0.00% GC)
  median time:      13.624 μs (0.00% GC)
  mean time:        14.270 μs (0.00% GC)
  maximum time:     213.956 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
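
For the R side of the question (the language of the loo package), here is a hedged sketch of the same comparison, not taken from this PR. Exact numbers will vary by machine, but only about 3*sqrt(S) tail positions need to be fully ordered:

# Sketch: full sort vs. partially sorting only the tail region,
# mirroring the Julia benchmark above.
S <- 1e5
M <- ceiling(3 * sqrt(S)) # tail length with r_eff = 1
x <- rnorm(S)
t_full    <- system.time(for (i in 1:100) sort(x))
t_partial <- system.time(for (i in 1:100) sort(x, partial = (S - M):S))
rbind(t_full, t_partial)  # the partial sort should typically be faster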

@sethaxen

> You're sorting from the tail length to the end of the array, accidentally sorting nearly everything rather than just the tail.

Good catch! Yes, I'm seeing the same performance improvement you're seeing now.
