
gradient (and gradient!) fails on basic example #1340

Closed
biona001 opened this issue Mar 10, 2024 · 3 comments

biona001 commented Mar 10, 2024

In the following least squares objective, it does not seem like calling gradient! (or gradient) does anything.

MWE:

using Enzyme
using LinearAlgebra

# objective = 0.5 || y - X*beta ||^2
function ols(y, X, beta, storage=zeros(size(X, 1)))
    mul!(storage, X, beta)
    storage .= y .- storage
    return 0.5 * sum(abs2, storage)
end
ols(beta::AbstractVector) = ols(y, X, beta, storage)

# simulate data
n = 100
p = 50
X = randn(n, p)
y = randn(n)
beta = randn(p)
storage = zeros(n)
ols(y, X, beta, storage)

# autodiff grad
grad_storage = zeros(length(beta))
Enzyme.gradient!(Reverse, grad_storage, ols, beta)

# analytical grad
true_grad = X' * (y - X*beta);

Compare answers:

julia> [true_grad grad_storage]
50×2 Matrix{Float64}:
  -14.636    0.0
  -15.0858   0.0
 -184.264    0.0
    8.70142  0.0
  -71.4575   0.0
 -165.231    0.0
  -93.2602   0.0
  -49.404    0.0
  -62.5627   0.0
    ⋮        
  136.334    0.0
   81.2353   0.0
  152.26     0.0
 -198.541    0.0
   41.5824   0.0
  -13.3246   0.0
  319.853    0.0
 -180.376    0.0

This is on Julia v1.9.1 with Enzyme v0.11.19

biona001 (Author) commented

After reading the Zygote.jl documentation, I realized that array mutation is hard for AD packages. So I changed the objective slightly to avoid mutation, and Enzyme.jl worked.

using Enzyme
using LinearAlgebra

# objective = 0.5 || y - X*beta ||^2
function ols(y, X, beta)
    storage = X * beta # avoid mul!
    obj = zero(eltype(beta))
    for i in eachindex(y)
        obj += abs2(y[i] - storage[i])
    end
    return 0.5obj
end
ols(beta::AbstractVector) = ols(y, X, beta)

# simulate data
n = 100
p = 50
X = randn(n, p)
y = randn(n)
beta = randn(p)
ols(y, X, beta)

# autodiff grad
grad1 = zeros(length(beta))
Enzyme.gradient!(Reverse, grad1, ols, beta) # method 1
grad2 = Enzyme.gradient(Reverse, ols, beta) # method 2
grad3 = zeros(length(beta))
Enzyme.autodiff(Reverse, ols, Active, Duplicated(beta, grad3)) # method 3

# analytical grad
true_grad = -X' * (y - X*beta)

# compare answers
[true_grad grad1 grad2 grad3]
50×4 Matrix{Float64}:
  223.421      223.421      223.421      223.421
   56.115       56.115       56.115       56.115
  162.285      162.285      162.285      162.285
  -58.6251     -58.6251     -58.6251     -58.6251
  109.732      109.732      109.732      109.732
 -152.867     -152.867     -152.867     -152.867
   51.7608      51.7608      51.7608      51.7608
  -64.8512     -64.8512     -64.8512     -64.8512
   38.8877      38.8877      38.8877      38.8877
  -62.3895     -62.3895     -62.3895     -62.3895
   92.2303      92.2303      92.2303      92.2303
  208.785      208.785      208.785      208.785
   47.6175      47.6175      47.6175      47.6175
    ⋮
  -26.5664     -26.5664     -26.5664     -26.5664
   65.8905      65.8905      65.8905      65.8905
  279.925      279.925      279.925      279.925
  -18.8817     -18.8817     -18.8817     -18.8817
 -169.649     -169.649     -169.649     -169.649
 -134.628     -134.628     -134.628     -134.628
  273.975      273.975      273.975      273.975
    0.220214     0.220214     0.220214     0.220214
 -130.344     -130.344     -130.344     -130.344
   -3.59516     -3.59516     -3.59516     -3.59516
  204.023      204.023      204.023      204.023
 -268.678     -268.678     -268.678     -268.678

But of course I would prefer mul!(storage, X, beta) over storage = X * beta. I'm not sure whether Enzyme.jl supports this operation.

bgroenks96 commented
I'm not an Enzyme expert or anything, but this case is discussed pretty clearly in the documentation.

You need to explicitly tell Enzyme about your storage array, e.g.:

Enzyme.autodiff(Reverse, ols, Active, Const(y), Const(X), Duplicated(beta, grad_storage), Duplicated(storage, zero(storage)))

will produce the correct gradient:

maximum(abs.(true_grad .- grad_storage))
# output: 0.0

It might also be possible to pre-wrap the variables with Duplicated in the closure... but I'm not sure.
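
For reference, here is an untested sketch of that idea, reusing y, X, beta, and storage from the MWE above. Instead of wrapping captured variables in Duplicated, it captures only the read-only data y and X in the closure and keeps the mutable buffers as explicit Duplicated arguments:

using Enzyme
using LinearAlgebra

# closure captures only the read-only data y and X; beta and the mutable
# storage buffer stay explicit so Enzyme can shadow them
ols_yx = (beta, storage) -> begin
    mul!(storage, X, beta)
    storage .= y .- storage
    return 0.5 * sum(abs2, storage)
end

grad_storage = zeros(length(beta))
Enzyme.autodiff(Reverse, ols_yx, Active,
                Duplicated(beta, grad_storage),
                Duplicated(storage, zero(storage)))

If this works, grad_storage should match -X' * (y - X*beta), the same result as passing Const(y) and Const(X) explicitly.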

biona001 (Author) commented

@bgroenks96 thank you very much. I actually saw that example, but its meaning somehow did not register with me when I first saw it. Closing this now.

using Enzyme
using LinearAlgebra

# objective = 0.5 || y - X*beta ||^2
function ols(y, X, beta, storage=zeros(size(X, 1)))
    mul!(storage, X, beta)
    storage .= y .- storage
    return 0.5 * sum(abs2, storage)
end
ols(beta::AbstractVector) = ols(y, X, beta, storage)

# simulate data
n = 100
p = 50
X = randn(n, p)
y = randn(n)
beta = randn(p)
storage = zeros(n)
ols(y, X, beta, storage)

# analytical grad
true_grad = -X' * (y - X*beta)

# Enzyme grad with inplace mul!
grad_storage = zeros(length(beta))
Enzyme.autodiff(Reverse, ols, Active, Const(y), Const(X), Duplicated(beta, grad_storage), Duplicated(storage, zero(storage)))

julia> [true_grad grad_storage]
50×2 Matrix{Float64}:
  191.052      191.052
  117.039      117.039
  -46.2872     -46.2872
  -59.5048     -59.5048
 -162.259     -162.259
 -241.862     -241.862
 -118.645     -118.645
    8.81539      8.81539
   32.7351      32.7351
   -4.78171     -4.78171
    ⋮         
  219.782      219.782
  137.305      137.305
    0.131725     0.131725
  -38.0468     -38.0468
 -107.653     -107.653
  141.702      141.702
  138.302      138.302
  107.221      107.221
   59.7225      59.7225
