
@benchmark on ReverseDiff.gradient! gives a shorter time than the original function #102

Closed
xukai92 opened this issue Feb 22, 2018 · 3 comments

Comments


xukai92 commented Feb 22, 2018

using Distributions, ReverseDiff, BenchmarkTools
using ReverseDiff: GradientTape, compile

mutable struct LP
    lp::Union{Float64,ReverseDiff.TrackedReal}
end

x = rand(1000)
lp = LP(0.0)

f2(x) = begin
    lp.lp = 0.0
    lp.lp = logpdf(Normal(0, 1), x[1])
    for i = 2:length(x)
        lp.lp += logpdf(Normal(x[i-1], 1), x[i])
    end
    lp.lp
end

f_tape2 = GradientTape(f2, x)
compiled_f_tape2 = compile(f_tape2)
inputs = x
results = similar(x)

@benchmark f2(x) # => mean = 1.336 ms
@benchmark ReverseDiff.gradient!(results, compiled_f_tape2, x) # => mean = 208.294 μs
jrevels (Member) commented Feb 22, 2018

Are the output results correct, or is ReverseDiff giving the wrong answer in your case?

xukai92 (Author) commented Feb 24, 2018

Yes, I just checked: the answers are the same. I also improved the type stability as suggested by Chris, and the first version now runs faster than before, though still slower than the ReverseDiff.jl one. See below.

using Distributions, ReverseDiff, BenchmarkTools, DiffResults
using ReverseDiff: GradientTape, GradientConfig, compile

mutable struct LP2{T<:Union{Float64,ReverseDiff.TrackedReal}}
    lp::T
end

x = rand(1000)
lp = LP2(0.0)

f2(x) = begin
    lp.lp = logpdf(Normal(0, 1), x[1])
    for i = 2:length(x)
        lp.lp += logpdf(Normal(x[i-1], 1), x[i])
    end
    lp.lp
end

f_tape2 = GradientTape(f2, x)
compiled_f_tape2 = compile(f_tape2)
inputs = x
results = similar(x)
all_results = DiffResults.GradientResult(results)
cfg = GradientConfig(inputs)

f2(x) # => -1004.707726395706

@benchmark f2(x)

# BenchmarkTools.Trial: 
#   memory estimate:  62.48 KiB
#   allocs estimate:  3999
#   --------------
#   minimum time:     287.937 μs (0.00% GC)
#   median time:      295.351 μs (0.00% GC)
#   mean time:        367.426 μs (1.16% GC)
#   maximum time:     2.283 ms (76.92% GC)
#   --------------
#   samples:          10000
#   evals/sample:     1

@benchmark ReverseDiff.gradient!(all_results, compiled_f_tape2, x)

# BenchmarkTools.Trial: 
#   memory estimate:  0 bytes
#   allocs estimate:  0
#   --------------
#   minimum time:     184.500 μs (0.00% GC)
#   median time:      187.636 μs (0.00% GC)
#   mean time:        188.634 μs (0.00% GC)
#   maximum time:     545.807 μs (0.00% GC)
#   --------------
#   samples:          10000
#   evals/sample:     1

all_results.value # => -1004.707726395706

Do you think it might be the case that passing variables as ReverseDiff.TrackedReal somehow resolves the type instability in the original program?
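As an aside, the effect of a Union-typed field can be reproduced in isolation. This is a minimal sketch in plain Julia with hypothetical `Loose`/`Tight` types (unrelated to ReverseDiff): a field declared as a `Union` forces dynamic dispatch on every access, while a parametric field is concrete once the type parameter is fixed.

```julia
mutable struct Loose           # analogous to the original LP
    lp::Union{Float64,Int}     # Union-typed field: accesses are type-unstable
end

mutable struct Tight{T<:Real}  # analogous to LP2{T}
    lp::T                      # concrete once T is fixed at construction
end

# Same accumulation loop for both; only the field type differs.
function accum!(s, xs)
    for v in xs
        s.lp += v
    end
    return s.lp
end

xs = collect(1.0:1000.0)
accum!(Loose(0.0), xs)  # => 500500.0, but dispatches dynamically in the loop
accum!(Tight(0.0), xs)  # => 500500.0, fully type-stable
```

Running `@code_warntype accum!(Loose(0.0), xs)` highlights the `Union` in red, while the `Tight` version infers cleanly, which is one way to check whether the instability is really gone.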

jrevels (Member) commented Feb 24, 2018

I've only skimmed your results rather than trying it out myself, but compiling a ReverseDiff tape can (a) precompute dispatch and (b) preallocate and reuse buffers for the execution trace, such that executing the tape (which calculates both the original value and the gradient) can be faster than just executing the target code on its own.
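That taping idea can be sketched in miniature. The `Instruction`/`Tape`/`replay!` names below are hypothetical, not ReverseDiff's actual internals: each primitive operation is recorded once, with preallocated slots for its inputs and output, so replaying the tape just re-executes the recorded ops in order with no dispatch decisions or allocation in the hot loop.

```julia
struct Instruction
    op::Function   # the recorded primitive, e.g. + or *
    in1::Int       # index of the first input slot
    in2::Int       # index of the second input slot
    out::Int       # index of the output slot
end

struct Tape
    slots::Vector{Float64}        # preallocated value buffer, reused every replay
    instrs::Vector{Instruction}   # the recorded execution trace
end

# Re-execute the recorded trace in order; no allocation, no control flow
# beyond the loop itself.
function replay!(t::Tape)
    for ins in t.instrs
        t.slots[ins.out] = ins.op(t.slots[ins.in1], t.slots[ins.in2])
    end
    return t.slots[end]
end

# Record (a + b) * b with a = 2.0, b = 3.0; slots are [a, b, a+b, (a+b)*b].
tape = Tape([2.0, 3.0, 0.0, 0.0],
            [Instruction(+, 1, 2, 3), Instruction(*, 3, 2, 4)])
replay!(tape)  # => 15.0
```

In the real package the trace additionally stores adjoints so the reverse pass can run over the same buffers, which is why one tape execution yields both the value and the gradient.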

I've thought in the past about just making a generic "accelerator" package using this taping mechanism that doesn't do AD, but never got around to it.

Closing this since there doesn't seem to be a problem here, but thanks for sharing!
