
Performance degradation for very small arguments for besselj #58

Open
jwscook opened this issue Oct 17, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

jwscook commented Oct 17, 2022

I have augmented my gist to compare the speed of besselj in Bessels vs SpecialFunctions. Bessels is overall faster except for some orders for very small arguments.

BesseljTime.pdf


heltonmc commented Oct 17, 2022

Just to clarify, this isn't a performance degradation in Bessels.jl from version to version? It is simply that SpecialFunctions outperforms it in some domains?

My attempt to reproduce this is:

julia> @benchmark SpecialFunctions.besselj(10, x) setup=(x=rand()*0.0000000001)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  8.333 ns … 20.084 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     8.458 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.482 ns ±  0.373 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▃  ▇  █  ▄  ▄  ▁                                         ▂
  ▆▁█▁▁█▁▁█▁▁█▁▁█▁▁█▁▁▅▁▁▅▁▁▇▁█▁▁▆▁▁▆▁▁▆▁▁▅▁▁▅▁▁▄▁▁▆▁▁▆▁▁▆▁▅ █
  8.33 ns      Histogram: log(frequency) by time     9.17 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark Bessels.besselj(10, x) setup=(x=rand()*0.0000000001)
BenchmarkTools.Trial: 10000 samples with 995 evaluations.
 Range (min … max):  30.653 ns … 50.335 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     30.946 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   31.106 ns ±  0.820 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▄▆█▇▇▇▇▄▂ ▁          ▁▁▂▄▂▂▂▂                              ▂
  ████████████▇▆▇▆▇▇█▇████████████▇█▅▆▅▄▁▄▃▃▃▃▅▁▃▁▃▄▁▅▄▅▅▆▅▁▆ █
  30.7 ns      Histogram: log(frequency) by time      33.8 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

One thing to note is that Bessels.jl doesn't differentiate between nu being an integer and some abstract float. On some domains it would be faster to have dedicated routines for integer orders, using forward recurrence for besselj, but that doesn't seem to be what is happening here.

A benchmark for a larger order:

julia> @benchmark SpecialFunctions.besselj(30, x) setup=(x=rand()*0.0000000001)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min … max):  17.743 ns … 32.983 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     17.869 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   17.921 ns ±  0.566 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▃ ▆  █ ▆  ▄ ▁                                             ▂
  ▅▁█▁█▁▁█▁█▁▁█▁█▁▁▇▁▆▁▁▅▁▆▁▁▄▁▁▁▃▁▁▃▁▃▁▁▄▁▇▁▁█▁▆▁▁▅▁▅▁▁▄▁▃▁▃ █
  17.7 ns      Histogram: log(frequency) by time      18.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

This is slower because it has to iterate up to higher orders. Now the main problem is that for large nu and tiny x the SpecialFunctions routine also just switches over to the power series directly instead of calling besselj0 and besselj1:

julia> @benchmark SpecialFunctions.besselj(100, x) setup=(x=rand()*0.0000000001)

BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.583 ns … 17.625 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.708 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.710 ns ±  0.290 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▁           ▆          █           ▄          ▂ ▁
  ▅▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█ █
  4.58 ns      Histogram: log(frequency) by time     4.79 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

The difficulty is that once the order gets high enough, the Bessels.jl routine immediately goes to the large-order asymptotic expansion, so you are hitting that branch instead of the power series.
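For reference, the small-x power series being discussed is J_n(x) = Σ_{k≥0} (−1)^k (x/2)^(2k+n) / (k! (k+n)!). Here is a minimal sketch of it for integer order only; this is my illustration of the branch, not the actual SpecialFunctions.jl or Bessels.jl implementation:

```julia
# Minimal small-x power series for integer-order J_n(x).
# NOT library code; just illustrates the branch under discussion.
function besselj_series(n::Integer, x::Real; terms = 20)
    halfx = x / 2
    term = halfx^n / prod(1.0:n)          # k = 0 term: (x/2)^n / n!
    s = term
    for k in 1:terms-1
        term *= -halfx^2 / (k * (k + n))  # ratio of consecutive terms
        s += term
    end
    return s
end

besselj_series(10, 1e-10)   # for tiny x, essentially just the leading term
```

For x on the order of 1e-10 the series converges after the first term, so the cost is dominated by computing the leading coefficient, which is the point made below about the gamma call.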


heltonmc commented Oct 17, 2022

Hmmm, on a different inspection it looks like for x < 0.1 a branch is made for just the power series. For very large orders it will return 0 with no computation, explaining the very fast performance there; it looks like that is what it is hitting for largish orders. Now the question is why our power series is so much slower. We don't assume anything about nu, which requires us to call gamma instead of factorial, so the lower bound of the power series will always be dominated by the gamma call (which benchmarks at around the same as the timings above).
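Returning 0 with no computation is consistent with Float64 arithmetic here: the leading term of the power series already underflows. A quick check (mine, not library code):

```julia
# Leading term of the power series, (x/2)^nu / nu!, in Float64.
# For nu = 100 and x = 1e-10 the numerator (5e-11)^100 ~ 1e-1030
# underflows to 0.0, so the whole series is 0.0 in Float64.
leading_term(nu, x) = (x / 2)^nu / prod(1.0:nu)

leading_term(100, 1e-10)   # 0.0 — hence "return 0 with no computation"
```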

So perhaps we could fix the power series version to have a check for integer orders in the gamma call.
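One hedged sketch of such a check (the name and structure are my assumptions, not Bessels.jl internals): branch on isinteger(nu) so integer orders compute 1/Γ(nu+1) = 1/nu! with a short product, and only non-integer orders pay for a full gamma evaluation. The gammafun argument stands in for a real gamma such as SpecialFunctions.gamma:

```julia
# Hypothetical sketch: compute 1/Γ(nu + 1) with an integer fast path.
# `gammafun` stands in for a real gamma (e.g. SpecialFunctions.gamma)
# and is only called on the non-integer branch.
function inv_gamma_nuplus1(nu::Real; gammafun = nothing)
    if isinteger(nu) && 0 <= nu <= 170     # 171! overflows Float64
        return inv(prod(1.0:Float64(nu)))  # 1/nu! via a running product
    else
        return inv(gammafun(nu + 1))       # general (non-integer) order
    end
end
```

Whether this wins in practice would depend on how often the branch predictor sees integer orders, but it removes the gamma call from the common integer-order case the benchmarks above exercise.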

@heltonmc heltonmc added the enhancement New feature or request label Oct 22, 2022