
Add sparse Hessian decompression #190

Merged Jul 30, 2022 (21 commits)

Conversation

@ElOceanografo (Contributor) commented Jun 15, 2022

Following up on #142, this is a basic implementation of sparse Hessian decompression using matrix coloring. (See also SciML/Optimization.jl#269.) The basic interface is modeled on the existing one for Jacobians: it includes forwarddiff_color_hessian and forwarddiff_color_hessian! methods for out-of-place and in-place computation, respectively, and a ForwardColorHesCache type containing buffers for the computation. Feedback welcome!
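
For context, a minimal usage sketch of this interface (hedged: the colorvec/sparsity argument order follows the signatures discussed below, and the test function, sparsity pattern, and coloring call are illustrative, not taken from this PR):

using SparseDiffTools, SparseArrays

f(x) = x[1]*x[2] + x[3]^2 + x[4]^2                      # scalar function with a sparse Hessian
x = rand(4)
sparsity = sparse([1, 2, 3, 4], [2, 1, 3, 4], ones(4))  # nonzeros at (1,2), (2,1), (3,3), (4,4)
colorvec = matrix_colors(sparsity)  # greedy coloring of the pattern; a symmetry-aware
                                    # (star) coloring may be preferable for Hessians
H = forwarddiff_color_hessian(f, x, colorvec, sparsity)  # out-of-place
forwarddiff_color_hessian!(H, f, x, colorvec, sparsity)  # in-place, reusing H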

To do:

  • Make sure there are method signatures corresponding to all the ones for Jacobians
  • More tests
  • Make sure performance is ok

@codecov-commenter commented Jun 15, 2022

Codecov Report

Merging #190 (e1c8eaf) into master (74d9fc6) will increase coverage by 1.20%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #190      +/-   ##
==========================================
+ Coverage   79.94%   81.14%   +1.20%     
==========================================
  Files          14       15       +1     
  Lines         753      801      +48     
==========================================
+ Hits          602      650      +48     
  Misses        151      151              
Impacted Files                              Coverage Δ
src/SparseDiffTools.jl                      100.00% <ø> (ø)
src/differentiation/compute_hessian_ad.jl   100.00% <100.00%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

- Include a ForwardDiff.GradientConfig in the ForwardColorHesCache to
  eliminate allocations
- Reduce the number of signatures for forwarddiff_color_hessian to the
  minimum needed to match forwarddiff_color_jacobian
- Get rid of default arguments for colorvec and sparsity for now
@ElOceanografo (Contributor Author)

forwarddiff_color_jacobian! has default arguments for colorvec and sparsity, which I currently don't supply for forwarddiff_color_hessian!. I'm not sure what makes the most sense:

  1. defaulting to an assumption that the Hessian is dense (i.e., colorvec = eachindex(x), sparsity = ones(length(x), length(x)); see the sketch after this list),
  2. doing automatic sparsity detection when they aren't specified, or
  3. leaving it as-is and making the user explicitly supply the coloring and sparsity pattern.
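
A rough sketch of what option 1 could look like (illustrative only, not the merged code; the dense-default method would just forward to the explicit-argument one):

function forwarddiff_color_hessian!(H, f, x::AbstractVector)
    sparsity = ones(length(x), length(x))  # assume a dense Hessian pattern
    colorvec = collect(eachindex(x))       # all-distinct colors => length(x) gradient evaluations
    return forwarddiff_color_hessian!(H, f, x, colorvec, sparsity)
end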

@ChrisRackauckas (Member)

1

@ElOceanografo (Contributor Author)

I currently have two signatures for forwarddiff_color_hessian!:

  • forwarddiff_color_hessian!(H, f, x, colorvec, sparsity) (creates a ForwardColorHesCache)
  • forwarddiff_color_hessian!(H, f, x, hes_cache) (totally non-allocating)

There are also two corresponding signatures for forwarddiff_color_hessian, which create the sparse Hessian H instead of modifying it in place. I think this matches the interface for forwarddiff_color_jacobian!. Are any others needed?
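
In other words, the intended call pattern is (hedged: the ForwardColorHesCache constructor arguments shown here are assumed for illustration):

hes_cache = ForwardColorHesCache(f, x, colorvec, sparsity)  # build buffers once (assumed constructor form)
forwarddiff_color_hessian!(H, f, x, hes_cache)              # reuses the cache; fully non-allocating
forwarddiff_color_hessian!(H, f, x, colorvec, sparsity)     # convenience form; allocates a cache internally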

started implementing an "All-One" array, decided not worth the
complexity at this point
Make sure it has the right signature, modify to accept GradientConfig if
needed. Also some tests for it.
@ElOceanografo (Contributor Author)

OK, I think this is getting pretty close. @ChrisRackauckas, when you've got a second, let me know what else you think it needs and I'll try to take care of it!

Comment on lines 57 to 59
hes_cache.grad!(hes_cache.dG, x, hes_cache.grad_config)
x .-= ϵ .* @view hes_cache.D[:, j]
hes_cache.buffer[:, j] .= (hes_cache.dG .- hes_cache.G) ./ ϵ
Member
Suggested change
- hes_cache.grad!(hes_cache.dG, x, hes_cache.grad_config)
- x .-= ϵ .* @view hes_cache.D[:, j]
- hes_cache.buffer[:, j] .= (hes_cache.dG .- hes_cache.G) ./ ϵ
+ hes_cache.grad!(hes_cache.dG, x, hes_cache.grad_config)
+ x .-= 2ϵ .* @view hes_cache.D[:, j]
+ hes_cache.buffer[:, j] .= (hes_cache.dG .- hes_cache.G) ./ 2ϵ

If you're not going to use a cached f here, is there a reason to not do a central difference?

Contributor Author

The only reason not to is that it modifies x three times instead of two, saving a little time. However, profiling shows the time saved is only ~2% of the total, and the centered difference reduces the error by more than 10x, so it's a worthwhile tradeoff. Changed as suggested.
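
To illustrate the tradeoff (the test function and step size below are arbitrary, not from this PR): comparing forward vs. central differences of an AD gradient, which is the quantity the code above approximates:

using ForwardDiff, LinearAlgebra

f(x) = sum(sin, x)
g(x) = ForwardDiff.gradient(f, x)
x = rand(3); d = normalize(ones(3)); ϵ = 1e-5
hvp_fwd = (g(x .+ ϵ .* d) .- g(x)) ./ ϵ             # forward difference: O(ϵ) error
hvp_ctr = (g(x .+ ϵ .* d) .- g(x .- ϵ .* d)) ./ 2ϵ  # central difference: O(ϵ²) error
hvp_ref = ForwardDiff.hessian(f, x) * d             # reference Hessian-vector product
norm(hvp_fwd - hvp_ref), norm(hvp_ctr - hvp_ref)    # central error should be markedly smaller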

Can you clarify what you mean by "use a cached f?"

@ChrisRackauckas (Member)

Is this still WIP? Looks close to done.

@ElOceanografo changed the title from [WIP] Add sparse Hessian decompression to Add sparse Hessian decompression on Jul 29, 2022