AD.jacobian much slower than Zygote.jacobian #54

Closed
gdalle opened this issue Mar 26, 2022 · 3 comments · Fixed by #128
Labels
performance Performance analysis or optimization

Comments

@gdalle
Member

gdalle commented Mar 26, 2022

Hi, and thanks for this amazing interface!

When computing jacobians, I recently noted a significant speed difference between standalone Zygote and Zygote used as an AD backend. The allocations also differ wildly. Do you happen to know where that comes from?

Here's a minimal working example:

julia> using AbstractDifferentiation, BenchmarkTools, Zygote

julia> f(x) = x .^ 2
f (generic function with 1 method)

julia> x = rand(100);

julia> ab = AD.ZygoteBackend()
AbstractDifferentiation.ReverseRuleConfigBackend{Zygote.ZygoteRuleConfig{Zygote.Context}}(Zygote.ZygoteRuleConfig{Zygote.Context}(Zygote.Context(nothing)))

julia> Zygote.jacobian(f, x)
([1.3044988022198039 0.0 … 0.0 0.0; 0.0 1.9710054432976252 … 0.0 0.0; … ; 0.0 0.0 … 1.9276513864536091 0.0; 0.0 0.0 … 0.0 1.6930015760093526],)

julia> AD.jacobian(ab, f, x)
([1.3044988022198039 0.0 … 0.0 0.0; 0.0 1.9710054432976252 … 0.0 0.0; … ; 0.0 0.0 … 1.9276513864536091 0.0; 0.0 0.0 … 0.0 1.6930015760093526],)

julia> @benchmark Zygote.jacobian(f, x)
BenchmarkTools.Trial: 7001 samples with 1 evaluation.
 Range (min … max):  629.524 μs …   5.403 ms  ┊ GC (min … max): 0.00% … 87.03%
 Time  (median):     681.270 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   712.290 μs ± 295.183 μs  ┊ GC (mean ± σ):  3.42% ±  6.90%

                    ▃█▄                                          
  ▁▁▁▁▁▁▂▁▁▁▁▂▂▂▂▁▁▃███▇▄▄▃▅▅▅▄▄▃▃▃▃▂▂▂▂▂▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  630 μs           Histogram: frequency by time          778 μs <

 Memory estimate: 427.06 KiB, allocs estimate: 4141.

julia> @benchmark AD.jacobian(ab, f, x)
BenchmarkTools.Trial: 1393 samples with 1 evaluation.
 Range (min … max):  2.673 ms … 64.564 ms  ┊ GC (min … max):  0.00% … 94.96%
 Time  (median):     2.999 ms              ┊ GC (median):     0.00%
 Time  (mean ± σ):   3.583 ms ±  4.786 ms  ┊ GC (mean ± σ):  11.42% ±  8.30%

  ▂▇██▆▆▅▄▃▂                                                  
  █████████████▆▇█▆▅▅▅▅▅▅▅▅▄▅▁▆▁▁▁▁▄▄▁▄▁▄▁▅▄▄▁▁▁▄▁▅▄▄▁▁▁▁▄▄▄ █
  2.67 ms      Histogram: log(frequency) by time      8.1 ms <

 Memory estimate: 1.26 MiB, allocs estimate: 28023.

gdalle added the performance label on Oct 5, 2023

@devmotion
Member

Recent changes may already have reduced the performance gap. But I suspect the original benchmarks in this issue were not completely accurate: note that ab, f, and x are non-constant globals and were not interpolated into the @benchmark calls. Typically, this leads to an incorrectly high number of allocations and incorrectly slow benchmark results (https://juliaci.github.io/BenchmarkTools.jl/stable/manual/#Interpolating-values-into-benchmark-expressions). On the master branch I get:

julia> import AbstractDifferentiation as AD

julia> using BenchmarkTools, Zygote

julia> f(x) = x .^ 2
f (generic function with 1 method)

julia> x = rand(100);

julia> ab = AD.ZygoteBackend()
AbstractDifferentiation.ReverseRuleConfigBackend{Zygote.ZygoteRuleConfig{Zygote.Context{false}}}(Zygote.ZygoteRuleConfig{Zygote.Context{false}}(Zygote.Context{false}(nothing)))

julia> Zygote.jacobian(f, x)
([0.09722052175255036 0.0 … 0.0 0.0; 0.0 1.4127603956877295 … 0.0 0.0; … ; 0.0 0.0 … 1.4095040963600012 0.0; 0.0 0.0 … 0.0 0.864262095462434],)

julia> AD.jacobian(ab, f, x)
([0.09722052175255036 0.0 … 0.0 0.0; 0.0 1.4127603956877295 … 0.0 0.0; … ; 0.0 0.0 … 1.4095040963600012 0.0; 0.0 0.0 … 0.0 0.864262095462434],)

julia> @benchmark Zygote.jacobian($f, $x)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  365.542 μs …  2.172 ms  ┊ GC (min … max): 0.00% … 79.38%
 Time  (median):     375.958 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   384.061 μs ± 67.837 μs  ┊ GC (mean ± σ):  1.61% ±  5.99%

        █▆▁▃▂
  ▂▄▄▃▄▇█████▇▅▅▄▄▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▁▂▂▂▂▂ ▃
  366 μs          Histogram: frequency by time          437 μs <

 Memory estimate: 416.30 KiB, allocs estimate: 3639.

julia> @benchmark AD.jacobian($ab, $f, $x)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  463.333 μs …  1.184 ms  ┊ GC (min … max): 0.00% … 55.33%
 Time  (median):     474.750 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   482.724 μs ± 58.422 μs  ┊ GC (mean ± σ):  1.46% ±  5.94%

  ▆█▅▃▂                                                        ▁
  ███████▇▆▅▅▁▃▁▃▁▁▁▁▃▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▄▁▃▄▄▄▃▅▅▅▆▆ █
  463 μs        Histogram: log(frequency) by time       929 μs <

 Memory estimate: 701.36 KiB, allocs estimate: 15919.
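
To check the interpolation effect in isolation, a minimal sketch along these lines might help. It uses only BenchmarkTools plus the f and x from the example above; the names t_global and t_interp are illustrative and not part of any package. The size of the difference depends on the machine and the Julia version, so no timings are quoted here.

```julia
using BenchmarkTools

f(x) = x .^ 2
x = rand(100)   # non-constant global, as in the original report

# Without interpolation, the benchmark reads the untyped global `x` on every
# call, so dynamic dispatch overhead can end up in the measurement.
t_global = @belapsed f(x)

# With `$x`, the value is spliced into the benchmark expression, so the
# measurement covers only the call itself.
t_interp = @belapsed f($x)

println("global: ", t_global, " s, interpolated: ", t_interp, " s")
```

@belapsed returns the minimum elapsed time in seconds, which makes the two measurements easy to compare directly.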

@gdalle
Member Author

gdalle commented Feb 4, 2024

That's reassuring! Indeed, I was still a newbie when I ran these non-interpolated benchmarks.

@devmotion
Member

IMO it's a bug nevertheless - I'd expect AD.jacobian to just call Zygote.jacobian.
