
Hessian of a vector function? #61

Closed
GravityAssisted opened this issue Sep 29, 2015 · 13 comments

@GravityAssisted

Hi,

Is it possible to calculate the Hessian of a vector-valued function?

E.g., let's say I have a function:

f(x::Vector) = x./(sqrt(x[1]^2+x[2]^2+x[3]^3))

I can easily calculate its Jacobian with:

ForwardDiff.jacobian(f, rand(3))

but I get a convert error when I try to calculate the Hessian. I assume this is because hessian expects a scalar function. Is there a trick to get around this?

The Hessian for the above example would be a 3-D tensor, with each matrix slice corresponding to the Hessian of one output component of the function.
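One workaround, sketched here using only ForwardDiff's existing scalar hessian (the stacking into a 3-D array is an illustration, not part of the library):

```julia
# Sketch: compute the scalar Hessian of each output component separately,
# then stack the slices into a 3-D array.
using ForwardDiff

f(x) = x ./ sqrt(x[1]^2 + x[2]^2 + x[3]^3)  # the example above

x0 = [1.0, 2.0, 3.0]
n = length(x0)

# One n×n Hessian per output component; H[:, :, i] is the Hessian of f(x)[i].
H = cat([ForwardDiff.hessian(x -> f(x)[i], x0) for i in 1:n]...; dims=3)
```

This takes n separate Hessian passes, so it is simpler but less efficient than differentiating all components at once.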

@mlubin
Contributor

mlubin commented Sep 29, 2015

No reason this shouldn't be possible. I guess you're the first one to request it.

@mlubin
Contributor

mlubin commented Sep 29, 2015

I'm not too familiar with the terminology in this case. Is it still typically called the Hessian?

@GravityAssisted
Author

It's commonly called a vector Hessian, I think. In our field we just call it the Hessian of a vector function. I'm not sure there is an official name for it. Wikipedia's article on the Hessian matrix has a small section on this called "Vector-valued functions":

https://en.wikipedia.org/wiki/Hessian_matrix

@jrevels
Member

jrevels commented Sep 29, 2015

This isn't currently implemented, but like @mlubin said, it's technically possible. If we do decide to provide a built-in function for this, though, we'd have to do some thinking about how it'd fit in the API. @johnmyleswhite any API opinions on this?

My initial reaction would be to provide a vechessian function, or something like that, rather than putting more burden on hessian.

@mlubin
Contributor

mlubin commented Sep 29, 2015

Agreed on a separate name for this.

@GravityAssisted
Author

This type of Hessian is common in our field of astrodynamics, since gravity is a vector-valued function.

Thanks for looking into this.

@johnmyleswhite
Contributor

+1 to a separate name. No idea what to call these things. For extensibility maybe we should go with an API that passes in types as arguments so we can reuse names more often?

@jrevels
Member

jrevels commented Sep 29, 2015

You mean something like:

hessian(f, x, ::Type{Number}) # expect f to be scalar valued
hessian(f, x, ::Type{Vector}) # expect f to be vector valued

...or did you mean passing in types so that we can better distinguish method signatures with name conflicts (a la ForwardDiff.gradient vs. Calculus.gradient)?

Somewhat off-topic: Funny story, my initial API redesign for ForwardDiff passed all the config info via abstract types (chunk_size ---> ChunkSize{n} etc). Once a couple of options had been added, I tried to switch to keyword args so that users wouldn't have to remember complicated default arg ordering. That's when I found out that keyword arg type inference is wonky and settled on the current API + assertions in problem areas to help out inferencing.

@johnmyleswhite
Contributor

Yeah, that is what I had in mind. Stefan was also talking about the idea of passing the type of differentiation as a type to make things even more generic, but that does start to be a whole lot of types.

Keyword args do need another pass at their design from what I can see.

@cortner

cortner commented Dec 1, 2015

In our case, we would like to take 1st, 2nd, and 3rd derivatives of a Hamiltonian matrix, i.e.,

f(x::Vector) -> Matrix

More generally, all of the differentiation implemented so far covers special cases of

f : d-tensor -> m-tensor

where each derivative increases the dimension of the output tensor. Bearing in mind that I have zero experience with AD, I wonder whether a general abstract setting like this could be implemented, with gradient, hessian, jacobian, etc. just becoming special cases?

Christoph
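One hedged sketch of this general pattern (not a proposed API; tensor_jacobian is a name invented here): flatten the tensor-valued function to a vector-valued one with vec, take an ordinary Jacobian, and reshape, so each derivative appends the input dimensions to the output tensor:

```julia
using ForwardDiff

g(X) = X * X'  # matrix -> matrix, a stand-in for a Hamiltonian

# Flatten g to a vector-valued function of a vector, differentiate,
# then restore the tensor shape: output dims first, then input dims.
function tensor_jacobian(g, X)
    gvec = x -> vec(g(reshape(x, size(X))))
    J = ForwardDiff.jacobian(gvec, vec(X))
    return reshape(J, size(g(X))..., size(X)...)
end

D = tensor_jacobian(g, [1.0 2.0; 3.0 4.0])  # a 2×2×2×2 tensor
```

Iterating the same trick on tensor_jacobian itself would give the higher derivatives, each one adding the input dimensions again.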

@shabbychef

One way to implement the multivariate case is to represent the derivative simply as a matrix, with rows equal to the total number of elements in the dependent variable and columns equal to the number of elements in the independent variable (so-called "numerator layout"). The user (or intermediary code) has to interpret this when needed in terms of the original dimensionality of the independent and dependent variables, but it is not too much of a pain. This does not, I think, get you much closer to computing higher-order derivatives. In this scheme the second derivative would be represented as a 3-dimensional array, without taking advantage of the symmetry in the second and third dimensions.

Full disclosure: I've implemented a multivariate AD system in R (and am lurking here looking for good ideas to steal), and I considered supporting higher-order derivatives. I concluded this would require computing arbitrary-order derivatives for every method I supported on my objects, which was unpleasant at best. For element-wise operations I could rely on elementary calculus, but computing, say, the second derivative of the matrix inverse operation was too hairy for my tastes. I abandoned higher-order derivatives in favor of supporting multivariate operations. I am hoping the ForwardDiff devs find a clever solution to this problem.
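The numerator-layout convention described above can be sketched with plain reshapes (the numbers here are placeholders, not real derivatives):

```julia
# A 2×3 dependent variable differentiated w.r.t. a 4-vector is stored flat
# as a (2*3) × 4 matrix; the caller reshapes to recover tensor indexing.
J_flat = reshape(collect(1.0:24.0), 6, 4)
J_tensor = reshape(J_flat, 2, 3, 4)

# Element (1,2) of the output, input coordinate 4: the column-major
# linear index of (1,2) in a 2×3 array is 3, so these entries agree.
J_tensor[1, 2, 4] == J_flat[3, 4]  # true
```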

@jrevels
Member

jrevels commented Jun 27, 2016

So I revisited this with the latest version of ForwardDiff and, lo and behold:

using ForwardDiff

f(x) = x./(sqrt(x[1]^2+x[2]^2+x[3]^3))

function vector_hessian(f, x)
    n = length(x)
    out = ForwardDiff.jacobian(x -> ForwardDiff.jacobian(f, x), x)
    return reshape(out, n, n, n)
end

julia> vector_hessian(f, [1,2,3])
3×3×3 Array{Float64,3}:
[:, :, 1] =
 -0.0160549  -0.0100127   -0.067586
 -0.0100127  -0.00345267   0.0139833
 -0.0150191   0.0031074    0.0154507

[:, :, 2] =
 -0.0100127   -0.00345267   0.0139833
 -0.00345267  -0.0290024   -0.046611
  0.0031074   -0.010358     0.0309014

[:, :, 3] =
 -0.067586    0.0139833   0.0446689
  0.0139833  -0.046611    0.0893378
  0.0154507   0.0309014  -0.0151486

@GravityAssisted Is this the kind of output you expected here? If this delivers the correct result, you can tweak vector_hessian to your liking using ForwardDiff.jacobian/ForwardDiff.jacobian! (e.g., to support f!(y, x) as well as f(x)).

@timholy
Contributor

timholy commented Apr 17, 2024

Note that a better way to write that reshape of the Hessian of a vector-valued function is

reshape(out, (eachindex(fx), eachindex(x), eachindex(x)))

where fx = f(x). This makes it clearer that the first index is for the output of f, and the last two are for the coordinates of x.
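A minimal illustration (plain Base Julia with placeholder data, not the real derivative output) that reshape accepts index ranges as dimensions, which is what the axes-based reshape above relies on:

```julia
fx = zeros(3)            # stands in for f(x)
x = zeros(3)
out = collect(1.0:27.0)  # stands in for the flat jacobian-of-jacobian output

# eachindex of a Vector is Base.OneTo(n), which reshape accepts as a dimension.
H = reshape(out, (eachindex(fx), eachindex(x), eachindex(x)))
size(H)  # (3, 3, 3): first index for the output of f, last two for x
```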
