
Add ChainRules support #71

Merged · 6 commits merged into master from chainrules-core on Apr 28, 2024
Conversation

@MilesCranmer (Member) commented Apr 28, 2024

Implements ChainRulesCore.rrule for eval_tree_array, for both the tree argument and the X argument.

For the tree argument I had to implement something custom, because ChainRulesCore.Tangent doesn't support recursive types. To get around this, I implement:

```julia
struct NodeTangent{T,N<:AbstractExpressionNode{T},A<:AbstractArray{T}} <: AbstractTangent
    tree::N
    gradient::A
end
```

where gradient is the vector of gradients with respect to the constants in the tree, in the usual depth-first order. It has as much of the AbstractTangent interface implemented as makes sense.

However, this probably requires some care in downstream uses, because it's not an array.
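For orientation, here is a rough sketch of how such a rule can be assembled from the existing forward-mode utility eval_grad_tree_array. This is illustrative only, not the exact code merged in this PR: the cotangent handling, the gradient shape conventions, and NodeTangent's import location are assumptions.

```julia
using ChainRulesCore: ChainRulesCore, NoTangent
using DynamicExpressions: eval_tree_array, eval_grad_tree_array
using DynamicExpressions: NodeTangent  # assumed import location for the type above

function ChainRulesCore.rrule(::typeof(eval_tree_array), tree, X, operators)
    primal = eval_tree_array(tree, X, operators)  # (output_vector, completed_flag)
    function eval_tree_array_pullback(dprimal)
        dy = first(dprimal)  # cotangent of the output vector; the Bool flag gets none
        # Forward-mode Jacobians w.r.t. the tree's constants and w.r.t. X:
        _, dconsts, _ = eval_grad_tree_array(tree, X, operators; variable=false)
        _, dXrows, _ = eval_grad_tree_array(tree, X, operators; variable=true)
        dconstants = dconsts * dy  # (n_constants × n_rows) * (n_rows,) -> n_constants
        dX = dXrows .* dy'         # scale each data column by its cotangent
        return (NoTangent(), NodeTangent(tree, dconstants), dX, NoTangent())
    end
    return primal, eval_tree_array_pullback
end
```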

@avik-pal perhaps this is useful for the Lux.jl extension? (Would love to hear what you think of this PR, btw, given your experience in this area)


TODO:

  • (Maybe for later) Rewrite the Optim.optimize extension to use Zygote AD with this interface, or at least make it compatible with user-passed gradients that return a NodeTangent (see the sketch below).
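As a rough sketch of what that Zygote-based path could look like (illustrative only: tree, X, y, and operators are assumed to be in scope, and whether Zygote hands back the NodeTangent directly is an assumption here):

```julia
using Zygote
using DynamicExpressions: eval_tree_array

# Mean-squared-error loss over the expression's output.
loss(t) = sum(abs2, first(eval_tree_array(t, X, operators)) .- y) / length(y)

# With the rrule in place, Zygote should route through it and return a
# NodeTangent whose `gradient` field is the depth-first constants gradient.
(dtree,) = Zygote.gradient(loss, tree)
dtree.gradient  # ∂loss/∂constants, usable as a user-supplied gradient for Optim
```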

@coveralls commented Apr 28, 2024

Pull Request Test Coverage Report for Build 8870522033

Details

  • 27 of 27 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.08%) to 94.789%

Totals (change from base Build 8870212630: +0.08%)

  • Covered lines: 1637
  • Relevant lines: 1727

💛 - Coveralls


@avik-pal commented:

> @avik-pal perhaps this is useful for the Lux.jl extension? (Would love to hear what you think of this PR, btw, given your experience in this area)

This looks great; I think I will be able to remove some of the custom handling I had in Lux for this.

@MilesCranmer (Member, Author) commented:

Fantastic. Thanks for looking!

@avik-pal commented:

Do you plan to capture ForwardDiff calls as well? I was unsure how to capture them at the Node constants level; for Lux, I handled them at the parameters level: https://github.com/LuxDL/Lux.jl/blob/main/ext/LuxDynamicExpressionsForwardDiffExt.jl#L8-L52

@MilesCranmer (Member, Author) commented Apr 28, 2024:

> Do you plan to capture ForwardDiff calls as well?

I would be very happy to have ForwardDiff support for tree constants. For my own use-cases it's lower on the priority list, so I'm not sure when I'll get to it. The rrule is the priority for me so far, as I want to have some Zygote-based AD optimization in SymbolicRegression.jl (right now it's still finite-difference-based, which surprisingly hasn't been so bad given it's low-dimensional, but it can get a bit slow for very complex expressions).

> I was unsure how to capture them at the Node constants level; for Lux, I handled them at the parameters level: https://github.com/LuxDL/Lux.jl/blob/main/ext/LuxDynamicExpressionsForwardDiffExt.jl#L8-L52

Nice! I'm not sure how to translate this, but let me know if you'd be open to moving it over here. I'm not sure how much work it would be, though.

> capture them at the Node constants level

In the Optim.optimize extension, what I will do is store a vector of Refs to the constant nodes, and just update them via dereferencing. (Not sure if this is what you were asking.)

```julia
# Collect one Ref per constant node (depth-first), and read out the initial values:
constant_refs = filter_map(
    t -> t.degree == 0 && t.constant, t -> Ref(t), tree, Ref{typeof(tree)}
)
x0 = T[copy(t[].val) for t in constant_refs]
```

Then I can update all the parameters with:

```julia
minimizer = Optim.minimizer(base_res)
@inbounds for i in eachindex(constant_refs, minimizer)
    constant_refs[i][].val = minimizer[i]  # write each optimized value back into its node
end
```

The nice part about this is that it also works for GraphNode, where you have multiple parents pointing to the same child: filter_map will only return a single Ref to the child node, so you don't end up optimizing the same parameter twice (see the sketch below).
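A toy sketch of that deduplication property (constructor and overload details are assumed to match DynamicExpressions' GraphNode docs, so treat this as approximate):

```julia
using DynamicExpressions

operators = OperatorEnum(; binary_operators=[+, *])  # also defines node overloads
x1 = GraphNode{Float64}(; feature=1)
c = GraphNode{Float64}(; val=3.0)
shared = x1 * c         # subtree holding the constant
tree = shared + shared  # `shared` (and its constant) now has two parents

constant_refs = filter_map(
    t -> t.degree == 0 && t.constant, t -> Ref(t), tree, Ref{typeof(tree)}
)
length(constant_refs) == 1  # true: the shared constant is collected only once
```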

@avik-pal commented:

Yes, I am definitely open to moving them here. What I meant by capturing them is how to define the dispatch. For example, in Lux, since I keep the parameters extracted in a vector, it is simple enough to write ::AbstractVector{<:Dual}. I am not sure how to detect ForwardDiff Duals "nicely" when they are part of the Nodes (a toy sketch of the contrast follows below).
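To make the dispatch question concrete, here is a toy sketch (all names hypothetical) of why the parameters level is easy to capture while the node level is not:

```julia
using ForwardDiff: Dual

# Parameters level: the flat vector's element type is visible in the signature,
# so the ForwardDiff call path is one extra method away.
eval_flat(ps::AbstractVector{<:Real}) = sum(abs2, ps)  # plain path
eval_flat(ps::AbstractVector{<:Dual}) = sum(abs2, ps)  # Dual path; custom handling
                                                       # would go here

# Node-constants level: the Duals live inside each node's `val` field, so a call
# sees only e.g. `tree::Node{T}`; detecting them means inspecting the type
# parameter (or walking the tree) rather than ordinary argument dispatch.
is_dual_eltype(::Type{T}) where {T} = T <: Dual
```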

It is possible to do it here, I think, because this code won't natively work with ForwardDiff:

```julia
function wrapped_f(args::Vararg{Any,M}) where {M}
    first_args = args[1:(end - 1)]
    x = last(args)
    @inbounds for i in eachindex(constant_refs, x)
        constant_refs[i][].val = x[i]  # write-back fails when x has Dual eltype
    end
    return @inline(f(first_args..., tree))
end
```
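To make the incompatibility concrete, a toy sketch (ToyNode stands in for a node with a concretely typed val field):

```julia
using ForwardDiff: Dual

mutable struct ToyNode{T<:Real}
    val::T
end

node = ToyNode(1.5)   # val field is concretely Float64
x = [Dual(2.0, 1.0)]  # the Dual-valued vector ForwardDiff would pass to wrapped_f
# node.val = x[1]     # errors: a Dual carrying partials cannot be stored in a
#                     # Float64 field, so the write-back loop cannot run natively
#                     # under ForwardDiff
```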

It is also possible that ForwardDiff might be efficient enough without this special handling, given that you mention FiniteDifferences is already fast.

@MilesCranmer (Member, Author) commented:

I see, thanks. Seems a bit trickier. Will think more...

@MilesCranmer merged commit 6211067 into master on Apr 28, 2024 (16 checks passed).
@MilesCranmer deleted the chainrules-core branch on April 28, 2024 at 22:18.