Computing specific columns of the Hessian matrix without computing the entire Hessian? #44
-
Hello, this is a question extended from this discourse post. I want to be able to compute the last column of the Hessian matrix without needing to compute the rest of the Hessian. Is it possible to do so via FastDifferentiation? Example, as in the discourse post:

using FastDifferentiation
function f(y::AbstractVector, X::AbstractMatrix, β::AbstractVector)
    0.5 * sum(abs2, y - X * β)
end
f(β::AbstractVector) = f(y, X, β)
# simulate data
n = 3
p = 50
X = randn(n, p)
y = randn(n)
β = randn(p)
# autodiff Hessian
βsym = make_variables(:βsym, p)
hess = hessian(f(βsym), βsym)
hess_func! = make_function(hess, βsym, in_place=true)
resmat = similar(hess, Float64)
hess_func!(resmat,β)
all(resmat .≈ X'*X) # check autodiff Hessian agrees with analytical Hessian
resmat
50×50 Matrix{Float64}:
5.00732 0.0634119 -0.354297 … -3.11295 -0.253543 0.066441
0.0634119 1.34182 1.08908 1.49119 -0.978242 -2.30481
-0.354297 1.08908 2.41709 3.07986 -1.65422 -0.451659
-2.88024 1.6851 1.46463 3.60189 -1.02225 -3.13484
1.56025 -0.951461 -0.92282 -2.10044 0.639089 1.67104
-2.60355 1.81745 1.27099 … 3.27712 -0.966758 -3.61936
0.841776 1.359 0.223236 0.138382 -0.545505 -3.08734
-4.48811 0.740396 -0.367389 2.26609 0.428091 -2.70571
2.38963 -0.706139 -1.79286 -3.42519 1.01263 0.320257
2.76193 -0.948066 -1.18378 -3.03961 0.684057 1.54844
⋮ ⋱
2.75955 0.336452 0.875597 … -0.485302 -0.841237 0.30637
-0.119848 0.734173 0.362663 0.650242 -0.38517 -1.50126
-2.54165 -0.259515 -0.221778 1.08836 0.420392 0.150548
-1.13437 1.33122 0.554195 1.57148 -0.556493 -2.92407
-1.5965 -0.84197 -0.573529 0.0369948 0.687891 1.37603
-3.1482 0.282195 -1.66096 … 0.0193454 1.18001 -2.646
-0.376896 -0.638432 0.654172 0.740102 -0.189122 2.1776
-3.11295 1.49119 3.07986 5.41312 -1.89731 -1.13346
-0.253543 -0.978242 -1.65422 -1.89731 1.23449 0.835149
0.066441 -2.30481 -0.451659 -1.13346 0.835149 5.33428

How can I get just the last column, resmat[:, end], without computing the whole Hessian?
50-element Vector{Float64}:
0.06644102872726781
-2.3048059531664657
-0.4516588615405185
-3.134842825476085
1.6710405587611024
-3.6193631855066832
-3.0873351124534816
-2.7057075689676293
0.32025734132895106
1.548444563321187
⋮
0.30636962324480765
-1.501261326221507
0.15054797360218636
-2.9240723405220286
1.3760347732562257
-2.6460043865926
2.177604749465641
-1.133463305431753
0.8351493597528705
5.334276680715078

I found this package to be much faster than ForwardDiff.
-
You can use FastDifferentiation to compute any subset of the entries of the Jacobian or Hessian. There isn't a built-in function to do this (maybe I'll add one if this comes up frequently), but you can easily do it yourself.
I think this does what you want.
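A minimal sketch of the idea, using the same nested-jacobian construction quoted in the follow-up reply below: differentiate f with respect to βsym[p] to get the last entry of the gradient, then differentiate that entry with respect to all of βsym.

using FastDifferentiation

n, p = 3, 50
X = randn(n, p)
y = randn(n)
β = randn(p)

βsym = make_variables(:βsym, p)
fsym = 0.5 * sum(abs2, y - X * βsym)

# last entry of the gradient: ∂f/∂βₚ
grad_p = jacobian([fsym], [βsym[p]])
# differentiate that entry w.r.t. all of β: this is the last row of the
# Hessian, which equals the last column by symmetry
last_col = jacobian(vec(grad_p), βsym)

col_func! = make_function(last_col, βsym, in_place=true)
res = similar(last_col, Float64)
col_func!(res, β)
vec(res) ≈ (X' * X)[:, p]   # true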
-
Thank you very much, @brianguenter. Your suggestion is a bit mysterious to me, and it doesn't fully solve my real problem. In my real problem the data matrix X changes often, and I would like to update the last column of the resulting Hessian every time X changes. It seems your Hessian function returns the same answer even if I change X?

using FastDifferentiation
n = 3
p = 5
X = randn(n, p)
y = randn(n)
β = randn(p)
function f(y::AbstractVector, X::AbstractMatrix, β::AbstractVector)
    0.5 * sum(abs2, y - X * β)
end
f(β::AbstractVector) = f(y, X, β)
# compute autodiff Hessian
βsym = make_variables(:βsym, p)
jac = jacobian([f(βsym)], [βsym[p]]) # compute last element of Jacobian
last_hess_col = jacobian(vec(jac), βsym) # compute last column of Hessian
hess_func! = make_function(last_hess_col, βsym, in_place=true)
resmat = similar(last_hess_col, Float64)
@time hess_func!(resmat, β)
# change data and recompute Hessian
X[:, end] .= randn(n)
resmat2 = similar(last_hess_col, Float64)
@time hess_func!(resmat2, β)
# check new autodiff Hessian agrees with new analytical Hessian
@show all(vec(resmat2) .≈ (X'*X)[:, p])
# new Hessian is still the same as old Hessian??
@show all(resmat2 .≈ resmat);

The output is:
0.031271 seconds (61.13 k allocations: 4.304 MiB, 99.52% compilation time)
0.000009 seconds (2 allocations: 32 bytes)
all(vec(resmat2) .≈ (X' * X)[:, p]) = false
all(resmat2 .≈ resmat) = true

The first Hessian evaluation is "slow", while the second evaluation is "fast", but it does not use the newly updated X. Is there a way to adjust for this? I'm sorry if my question is basic; I'm quite new to your package and autodiff in general.
-
If I re-run the jacobian and make_function steps after changing X, the result does use the new X:

using FastDifferentiation
n = 3
p = 5
X = randn(n, p)
y = randn(n)
β = randn(p)
function f(y::AbstractVector, X::AbstractMatrix, β::AbstractVector)
    0.5 * sum(abs2, y - X * β)
end
f(β::AbstractVector) = f(y, X, β)
# compute autodiff Hessian
βsym = make_variables(:βsym, p)
jac = jacobian([f(βsym)], [βsym[p]]) # compute last element of Jacobian
last_hess_col = jacobian(vec(jac), βsym) # compute last column of Hessian
hess_func! = make_function(last_hess_col, βsym, in_place=true)
resmat = similar(last_hess_col, Float64)
@time hess_func!(resmat, β)
@show all(vec(resmat) .≈ (X'*X)[:, p])
# change data
X[:, end] .= randn(n)
# re-compute autodiff Hessian
jac = jacobian([f(βsym)], [βsym[p]]) # compute last element of Jacobian
last_hess_col = jacobian(vec(jac), βsym) # compute last column of Hessian
hess_func! = make_function(last_hess_col, βsym, in_place=true)
resmat2 = similar(last_hess_col, Float64)
@time hess_func!(resmat2, β)
@show all(vec(resmat2) .≈ (X'*X)[:, p]);

Output:
0.031177 seconds (61.13 k allocations: 4.304 MiB, 99.46% compilation time)
all(vec(resmat) .≈ (X' * X)[:, p]) = true
0.026013 seconds (61.13 k allocations: 4.304 MiB, 99.04% compilation time)
all(vec(resmat2) .≈ (X' * X)[:, p]) = true
-
The mental model you should be using for FastDifferentiation is more like the symbolic differentiation done in Symbolics.jl. First you create a symbolic derivative with jacobian or hessian, then compile it into an executable function with make_function. If you want something to vary at runtime, then you need to have a symbolic variable for it when you create the function.
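A tiny sketch of that behavior, with a hypothetical scalar c standing in for your data matrix X:

using FastDifferentiation

c = 2.0                    # plain number: folded into the symbolic expression
x = make_variables(:x, 1)
g = make_function(jacobian([c * x[1]^2], x), x)

g([3.0])   # 1×1 result containing 12.0, i.e. 2c*x with c = 2.0
c = 5.0    # rebinding c has no effect on the already-compiled g
g([3.0])   # still 12.0

Here c was a compile-time constant, exactly like X in the code above, so the generated function never sees the new value.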
Some of your confusion may arise from the fundamentally different mechanisms ForwardDiff and FastDifferentiation use to evaluate derivatives. ForwardDiff is interpreting your Julia expression at run time. FastDifferentiation is compiling your symbolic derivative expression into an executable. Anything that is fixed at compile time will be a constant in the executable.
I didn't realize you wanted X to be a variable. You need to define a symbolic matrix

Xsym = make_variables(:Xsym, n, p)

and then create a runtime-generated function that takes both β and X as arguments. The function X_as_variable below should do what you want. You'll notice that the first time hess_func! is called it is relatively slow, since that first call includes compilation.