Get rid of ugly macro? #123

Open
gdalle opened this issue Dec 21, 2023 · 6 comments
Labels
question Inquiries and discussions

Comments

@gdalle
Member

gdalle commented Dec 21, 2023

I was chatting with @adrhill and he suggested that the macro @primitive could be discarded if each backend simply implemented some methods from AbstractDifferentiation, mostly jacobian and a pushforward or pullback. Thoughts?

@gdalle gdalle added the question Inquiries and discussions label Dec 21, 2023
@adrhill
Contributor

adrhill commented Dec 22, 2023

To elaborate on this:

What the macro does

As far as I understand, the @primitive macro is used on pullbacks/pushforwards from individual backends to generate the following AD.jacobian functions:

Forward-mode AD:

function define_pushforward_function_and_friends(fdef)
    fdef[:name] = :($(AbstractDifferentiation).pushforward_function)
    args = fdef[:args]
    funcs = quote
        $(ExprTools.combinedef(fdef))
        function $(AbstractDifferentiation).jacobian($(args...))
            identity_like = $(identity_matrix_like)($(args[3:end]...))
            pff = $(pushforward_function)($(args...))
            if eltype(identity_like) <: Tuple{Vararg{Union{AbstractMatrix,Number}}}
                return map(identity_like) do identity_like_i
                    return mapreduce(hcat, $(_eachcol).(identity_like_i)...) do (cols...)
                        pff(cols)
                    end
                end
            elseif eltype(identity_like) <: AbstractMatrix
                # needed for the computation of the Hessian and Jacobian
                ret = hcat.(mapslices(identity_like[1]; dims=1) do cols
                    # cols loop over basis states
                    pf = pff((cols,))
                    if typeof(pf) <: AbstractVector
                        # to make the hcat. work / get correct matrix-like, non-flat output dimension
                        return (pf,)
                    else
                        return pf
                    end
                end...)
                return ret isa Tuple ? ret : (ret,)
            else
                return pff(identity_like)
            end
        end
    end
    return funcs
end

Reverse-mode AD:

function define_value_and_pullback_function_and_friends(fdef)
    fdef[:name] = :($(AbstractDifferentiation).value_and_pullback_function)
    args = fdef[:args]
    funcs = quote
        $(ExprTools.combinedef(fdef))
        function $(AbstractDifferentiation).jacobian($(args...))
            value, pbf = $(value_and_pullback_function)($(args...))
            identity_like = $(identity_matrix_like)(value)
            if eltype(identity_like) <: Tuple{Vararg{AbstractMatrix}}
                return map(identity_like) do identity_like_i
                    return mapreduce(vcat, $(_eachcol).(identity_like_i)...) do (cols...)
                        pbf(cols)'
                    end
                end
            elseif eltype(identity_like) <: AbstractMatrix
                # needed for Hessian computation:
                # value is a (grad,). Then, identity_like is a (matrix,).
                # cols loops over columns of the matrix
                return vcat.(mapslices(identity_like[1]; dims=1) do cols
                    adjoint.(pbf((cols,)))
                end...)
            else
                return adjoint.(pbf(identity_like))
            end
        end
    end
    return funcs
end

These functions compute full Jacobians by evaluating the pullbacks/pushforwards on the standard basis (identity_like).
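To make the standard-basis idea concrete, here is a minimal, self-contained sketch. It is not the code the macro generates: the test function, the hand-written `J_exact`, and the helper names `pushforward`, `pullback`, `jacobian_from_pushforward` and `jacobian_from_pullback` are all hypothetical stand-ins for what a backend would provide. It only illustrates that JVPs on the input basis give Jacobian columns, while VJPs on the output basis give Jacobian rows.

```julia
using LinearAlgebra

# Hypothetical test function f: R^3 -> R^2 with a hand-written Jacobian.
f(x) = [x[1] * x[2], sin(x[3])]
J_exact(x) = [x[2] x[1] 0.0; 0.0 0.0 cos(x[3])]

# Stand-ins for backend primitives (assumptions, not the AbstractDifferentiation API).
pushforward(x) = v -> J_exact(x) * v   # JVP: v is a tangent in input space
pullback(x) = w -> J_exact(x)' * w     # VJP: w is a cotangent in output space

# Forward mode: one JVP per input basis vector yields one Jacobian column.
function jacobian_from_pushforward(pf, n)
    I_n = Matrix{Float64}(I, n, n)
    return reduce(hcat, [pf(I_n[:, j]) for j in 1:n])
end

# Reverse mode: one VJP per output basis vector yields one Jacobian row.
function jacobian_from_pullback(pb, m)
    I_m = Matrix{Float64}(I, m, m)
    return reduce(vcat, [pb(I_m[:, i])' for i in 1:m])
end

x = [1.0, 2.0, 3.0]
J_fwd = jacobian_from_pushforward(pushforward(x), length(x))
J_rev = jacobian_from_pullback(pullback(x), length(f(x)))
```

Both constructions recover the same matrix; they differ in cost, since the forward variant needs one evaluation per input dimension and the reverse variant one per output dimension.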

Fallback behavior

By default, the fallback jacobian function is empty (maybe this should be replaced by a NotImplementedError):

function jacobian(ab::AbstractBackend, f, xs...) end

As shown in the implementer guide, this jacobian function is the fallback at the core of most functions exported by AbstractDifferentiation:
(image: function dependency diagram from the implementer guide)

Taking reverse-mode AD as an example, the function dependency graph of value_and_pullback_function would look as follows:

  • value_and_pullback_function calls jacobian
  • jacobian is an empty function

Now, when a reverse-mode AD backend is loaded, value_and_pullback_function is defined for that backend, and @primitive is applied to it, the function dependency graph is inverted:

  • value_and_pullback_function calls the backend
  • a new generated jacobian calls value_and_pullback_function

The second behaviour is desired, as we wouldn't want to compute a full Jacobian just to compute a VJP when we can instead evaluate the pullback directly.
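The inversion can be reproduced with plain dispatch in a toy setting. The sketch below is an illustration under assumptions, not the package's implementation: `ToyReverseBackend`, `UnimplementedBackend` and the `Affine` callable are hypothetical, and the fallback throws with a plain `error` call (Julia has no built-in `NotImplementedError`) instead of being an empty method.

```julia
abstract type AbstractBackend end

struct ToyReverseBackend <: AbstractBackend end     # hypothetical backend
struct UnimplementedBackend <: AbstractBackend end  # hypothetical backend with no methods

# Hypothetical differentiable "function": x -> A * x + b.
struct Affine{M,V}
    A::M
    b::V
end
(g::Affine)(x) = g.A * x + g.b

# Fallback 1: an erroring jacobian instead of an empty one.
jacobian(ab::AbstractBackend, f, x) =
    error("jacobian is not implemented for $(typeof(ab))")

# Fallback 2: the generic pullback goes through the full Jacobian.
function value_and_pullback_function(ab::AbstractBackend, f, x)
    J = jacobian(ab, f, x)
    return f(x), w -> J' * w
end

# Backend side: the pullback is implemented directly ...
function value_and_pullback_function(::ToyReverseBackend, g::Affine, x)
    return g(x), w -> g.A' * w
end

# ... and jacobian now calls the pullback, which flips the dependency.
function jacobian(ab::ToyReverseBackend, g::Affine, x)
    y, pb = value_and_pullback_function(ab, g, x)
    m = length(y)
    rows = [pb([k == i ? 1.0 : 0.0 for k in 1:m])' for i in 1:m]
    return reduce(vcat, rows)
end

g = Affine([1.0 2.0; 3.0 4.0; 5.0 6.0], zeros(3))
x = [1.0, 1.0]
J = jacobian(ToyReverseBackend(), g, x)
```

With `ToyReverseBackend`, a VJP never materializes the Jacobian; with `UnimplementedBackend`, calling `value_and_pullback_function` reaches the erroring fallback instead of silently returning `nothing`.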

The fact that the function dependency graph is flipped was very confusing to me at first. A lot of hidden control flow is added via package extensions and the @primitive macro, which currently isn't documented in the implementer guide.

Back to the question

Why is AD.jacobian so central to AbstractDifferentiation.jl, and why does it have to be generated via a macro? Can't it be implemented in a more generic way by making sure pullback and pushforward wrappers have consistent output types?

The only advantage I currently see is to allow users to

  • compute VJPs by constructing a full Jacobian using JVPs
  • compute JVPs by constructing a full Jacobian using VJPs

but those sound like things that should usually be avoided.

Why isn't AbstractDifferentiation.jl built around two primitives, value_and_pullback_function and value_and_pushforward [1], with more liberal use of dispatch on the AbstractReverseMode and AbstractForwardMode types?
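A rough sketch of what such a design could look like, with no macro involved. Everything here is an assumption made for illustration: the type hierarchy, the `value_and_pushforward(backend, f, x, v)` signature, the toy backends and the `Affine` callable are hypothetical and not taken from the package.

```julia
abstract type AbstractBackend end
abstract type AbstractForwardMode <: AbstractBackend end
abstract type AbstractReverseMode <: AbstractBackend end

# Hypothetical backends; each implements exactly one primitive.
struct ToyForward <: AbstractForwardMode end
struct ToyReverse <: AbstractReverseMode end

# Hypothetical differentiable "function": x -> A * x + b.
struct Affine{M,V}
    A::M
    b::V
end
(g::Affine)(x) = g.A * x + g.b

# Primitive 1 (forward mode): value and JVP for a tangent v.
value_and_pushforward(::ToyForward, g::Affine, x, v) = (g(x), g.A * v)

# Primitive 2 (reverse mode): value and a closure computing VJPs.
value_and_pullback_function(::ToyReverse, g::Affine, x) = (g(x), w -> g.A' * w)

# i-th standard basis vector of length n.
basis(n, i) = [k == i ? 1.0 : 0.0 for k in 1:n]

# Generic jacobian for any forward-mode backend: one JVP per column.
function jacobian(ab::AbstractForwardMode, f, x)
    n = length(x)
    cols = [value_and_pushforward(ab, f, x, basis(n, j))[2] for j in 1:n]
    return reduce(hcat, cols)
end

# Generic jacobian for any reverse-mode backend: one VJP per row.
function jacobian(ab::AbstractReverseMode, f, x)
    y, pb = value_and_pullback_function(ab, f, x)
    m = length(y)
    rows = [pb(basis(m, i))' for i in 1:m]
    return reduce(vcat, rows)
end

A = [1.0 2.0; 3.0 4.0; 5.0 6.0]
g = Affine(A, zeros(3))
J_fwd = jacobian(ToyForward(), g, [1.0, 1.0])
J_rev = jacobian(ToyReverse(), g, [1.0, 1.0])
```

In this layout the two generic `jacobian` methods are written once against the abstract mode types, so a backend only has to supply its natural primitive.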

Footnotes

  1. Ideally with in-place mutating variants.

@devmotion
Member

Duplicate of #13, or at least #13 (comment) and the following discussion?

@oxinabox
Member

oxinabox commented Jan 8, 2024

> Why is AD.jacobian so central to AbstractDifferentiation.jl
> Why isn't AbstractDifferentiation.jl built around two primitives value_and_pullback_function and value_and_pushforward

Historical reasons, IIRC: the original author had a strong understanding of the calculus involved, but not as strong an understanding of autodiff or Julia abstractions, and the priority was getting something out that worked and was usable. It should be built around those two primitives.

@mohamed82008
Member

This issue is my fault. Feel free to remove the macro if it makes things simpler.

@devmotion
Member

BTW, regarding

> Why is AD.jacobian so central to AbstractDifferentiation.jl and why does it have to be generated via a macro? Can't it be implemented in a more generic way by making sure pullbacks and pushforward wrappers have consistent output types?

#95 trimmed down the macro; it can now only be used to implement jacobian based on a pushforward_function or a value_and_pullback_function. Support for ReverseDiff and FiniteDifferences is already implemented without the macro, and ForwardDiff, for example, uses the automatically constructed jacobian function only for functions with multiple arguments (the single-argument version just calls ForwardDiff.jacobian).

@mohamed82008
Member

As I mentioned in #13 (comment) and #123 (comment), I am ok with removing the macro. It is currently a thin wrapper over a pushforward or pullback definition. Feel free to open a PR.


5 participants