Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect gradients when using views #979

Closed
wisp3rwind opened this issue Aug 4, 2023 · 3 comments · Fixed by #985
Closed

Incorrect gradients when using views #979

wisp3rwind opened this issue Aug 4, 2023 · 3 comments · Fixed by #985

Comments

@wisp3rwind
Copy link

wisp3rwind commented Aug 4, 2023

The following code shows an example of incorrect gradients when using views (see output and comments for details). It would be great if you could have a look; thanks!

EDIT: This is Enzyme#main on

Julia Version 1.9.2
Commit e4ee485e909 (2023-07-05 09:39 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, tigerlake)
  Threads: 1 on 8 virtual cores

EDIT#2: De-cluttered MWE.

MWE

using Enzyme

function V(r::AbstractMatrix)
    res = 0.0
    for ij in CartesianIndices(r)
        i, j = Tuple(ij)
        res += i * j * r[ij]^2
    end
    res
end

function test!(R, dVdR)
    autodiff(Enzyme.Reverse, V, Active, Duplicated(R, dVdR))
end

function test_views!(dy, y)
    R = reshape((@view y[7:12]), (3, 2))
    dVdR = reshape((@view dy[1:6]), (3, 2))
    
    dVdR .= 0.0
    autodiff(Enzyme.Reverse, V, Active, Duplicated(R, dVdR))
end

# Expected result, without using any view/reshaping
R1 = [
    0.0 0.0 ;
    0.0 0.0 ;
    0.0 1.7 ;
]
dVdR = zero(R1)

test!(R1, dVdR)
display(dVdR)

# Same calculation, but the input/output data is now flattened
# into a larger vector and accessed using views.
y = zeros(20)
dy = zeros(20)
R2 = reshape((@view y[7:12]), (3, 2))
dy712 = reshape((@view dy[7:12]), (3, 2))
dVdR2 = reshape((@view dy[1:6]), (3, 2))
R2 .= R1

test_views!(dy, y)
display(dVdR2)
display(dy712)

Output

julia> include("bug-reports/enzyme-views.jl")
3×2 Matrix{Float64}:
 0.0   0.0
 0.0   0.0
 0.0  20.4
3×2 reshape(view(::Vector{Float64}, 1:6), 3, 2) with eltype Float64:
 0.0  0.0
 0.0  0.0
 0.0  0.0
3×2 reshape(view(::Vector{Float64}, 7:12), 3, 2) with eltype Float64:
 0.0   0.0
 0.0   0.0
 0.0  20.4

The two 2d arrays printed on top should be the same; but as can be seen in the array displayed at the end, the gradient is stored to indices 7:12 instead of the correct 1:6

@wsmoses
Copy link
Member

wsmoses commented Aug 6, 2023

Yeah this is a known sharp edge. The primal and shadow must be structurally equivalent -- which in this case means having the same offset. We should probably issue a warning on construction with views so folks are more aware.

@wisp3rwind
Copy link
Author

Indeed, a warning (or even an error, if at all feasible) would be great, this is not at all clear from the documentation.

For context, what I was trying to do here is integrate a Hamiltonian system and use autodifferentiation to build the derivative (i.e. the forces) for use with DifferentialEquations.jl. Specifically

$$\frac{\mathrm{d}r}{\mathrm{d}t} = p / m$$

and

$$\frac{\mathrm{d}p}{\mathrm{d}t} = -\frac{\mathrm{d} V}{\mathrm{d} r}$$

with autodiff for the right-hand side. In an attempt to avoid superfluous allocations, I used the views in the manner shown in the MWE. Then, the offset problem occurs quite naturally due to the Hamiltonian structure. Thus, I'd argue that this is not a super-obscure use case.

@wisp3rwind
Copy link
Author

For reference, a better solution to my problem seems to be to use ArrayPartition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants