
Performance degradation for very small arguments for besselj #58

Open
jwscook opened this issue Oct 17, 2022 · 2 comments
Labels
enhancement New feature or request

Comments

jwscook commented Oct 17, 2022

I have augmented my gist to compare the speed of besselj in Bessels vs SpecialFunctions. Bessels is overall faster except for some orders for very small arguments.

BesseljTime.pdf


heltonmc commented Oct 17, 2022

Just to clarify, this isn't a performance degradation in Bessels.jl from version to version? It is simply that SpecialFunctions outperforms it in some domains?

My attempt to reproduce this is:

julia> @benchmark SpecialFunctions.besselj(10, x) setup=(x=rand()*0.0000000001)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  8.333 ns … 20.084 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     8.458 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   8.482 ns ±  0.373 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▃  ▇  █  ▄  ▄  ▁                                         ▂
  ▆▁█▁▁█▁▁█▁▁█▁▁█▁▁█▁▁▅▁▁▅▁▁▇▁█▁▁▆▁▁▆▁▁▆▁▁▅▁▁▅▁▁▄▁▁▆▁▁▆▁▁▆▁▅ █
  8.33 ns      Histogram: log(frequency) by time     9.17 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark Bessels.besselj(10, x) setup=(x=rand()*0.0000000001)
BenchmarkTools.Trial: 10000 samples with 995 evaluations.
 Range (min … max):  30.653 ns … 50.335 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     30.946 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   31.106 ns ±  0.820 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▄▆█▇▇▇▇▄▂ ▁          ▁▁▂▄▂▂▂▂                              ▂
  ████████████▇▆▇▆▇▇█▇████████████▇█▅▆▅▄▁▄▃▃▃▃▅▁▃▁▃▄▁▅▄▅▅▆▅▁▆ █
  30.7 ns      Histogram: log(frequency) by time      33.8 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

One thing to note is that Bessels.jl doesn't differentiate between nu being an integer and some abstract float. On some domains it would be faster to have dedicated routines for integer orders, using forward recurrence for besselj, but that doesn't seem to be what is happening here.

A benchmark for a larger order:

julia> @benchmark SpecialFunctions.besselj(30, x) setup=(x=rand()*0.0000000001)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min … max):  17.743 ns … 32.983 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     17.869 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   17.921 ns ±  0.566 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▃ ▆  █ ▆  ▄ ▁                                             ▂
  ▅▁█▁█▁▁█▁█▁▁█▁█▁▁▇▁▆▁▁▅▁▆▁▁▄▁▁▁▃▁▁▃▁▃▁▁▄▁▇▁▁█▁▆▁▁▅▁▅▁▁▄▁▃▁▃ █
  17.7 ns      Histogram: log(frequency) by time      18.7 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

This is slower because it has to iterate up to higher orders. Now the main problem is that for large nu and tiny x the SpecialFunctions routine also just switches over to the power series directly instead of calling besselj0 and besselj1:

julia> @benchmark SpecialFunctions.besselj(100, x) setup=(x=rand()*0.0000000001)

BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  4.583 ns … 17.625 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.708 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.710 ns ±  0.290 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▁           ▆          █           ▄          ▂ ▁
  ▅▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█ █
  4.58 ns      Histogram: log(frequency) by time     4.79 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

The difficulty is that once the order gets high enough, the Bessels.jl routine immediately goes to the large-order asymptotic expansion, so you are hitting that branch instead of the power series.
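For reference, the small-x power series being discussed is J_n(x) = Σ_{k≥0} (−1)^k (x/2)^(2k+n) / (k! (k+n)!). Here is a minimal sketch of it for integer order only; this is my illustration of the branch, not the actual SpecialFunctions.jl or Bessels.jl implementation:

```julia
# Minimal small-x power series for integer-order J_n(x).
# NOT library code; just illustrates the branch under discussion.
function besselj_series(n::Integer, x::Real; terms = 20)
    halfx = x / 2
    term = halfx^n / prod(1.0:n)          # k = 0 term: (x/2)^n / n!
    s = term
    for k in 1:terms-1
        term *= -halfx^2 / (k * (k + n))  # ratio of consecutive terms
        s += term
    end
    return s
end

besselj_series(10, 1e-10)   # for tiny x, essentially just the leading term
```

For x on the order of 1e-10 the series converges after the first term, so the cost is dominated by computing the leading coefficient, which is the point made below about the gamma call.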


heltonmc commented Oct 17, 2022

Hmmm, on a different inspection it looks like for x < 0.1 a branch is made for just the power series. For very large orders it will return 0 with no computation, explaining the very fast performance there; it looks like that is what it is hitting for largish orders. Now the question is why our power series is so much slower. We don't assume anything about nu, which requires us to call gamma instead of factorial, so the lower bound of the power series will always be dominated by the gamma call (which benchmarks at around the same as the timings above).
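Returning 0 with no computation is consistent with Float64 arithmetic here: the leading term of the power series already underflows. A quick check (mine, not library code):

```julia
# Leading term of the power series, (x/2)^nu / nu!, in Float64.
# For nu = 100 and x = 1e-10 the numerator (5e-11)^100 ~ 1e-1030
# underflows to 0.0, so the whole series is 0.0 in Float64.
leading_term(nu, x) = (x / 2)^nu / prod(1.0:nu)

leading_term(100, 1e-10)   # 0.0 — hence "return 0 with no computation"
```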

So perhaps we could fix the power series version to have a check for integer orders in the gamma call.
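One hedged sketch of such a check (the name and structure are my assumptions, not Bessels.jl internals): branch on isinteger(nu) so integer orders compute 1/Γ(nu+1) = 1/nu! with a short product, and only non-integer orders pay for a full gamma evaluation. The gammafun argument stands in for a real gamma such as SpecialFunctions.gamma:

```julia
# Hypothetical sketch: compute 1/Γ(nu + 1) with an integer fast path.
# `gammafun` stands in for a real gamma (e.g. SpecialFunctions.gamma)
# and is only called on the non-integer branch.
function inv_gamma_nuplus1(nu::Real; gammafun = nothing)
    if isinteger(nu) && 0 <= nu <= 170     # 171! overflows Float64
        return inv(prod(1.0:Float64(nu)))  # 1/nu! via a running product
    else
        return inv(gammafun(nu + 1))       # general (non-integer) order
    end
end
```

Whether this wins in practice would depend on how often the branch predictor sees integer orders, but it removes the gamma call from the common integer-order case the benchmarks above exercise.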

@heltonmc heltonmc added the enhancement New feature or request label Oct 22, 2022