Incorrect derivatives when new arrays are constructed in computation #127

Open
jeffreyesun opened this issue Apr 5, 2022 · 0 comments

After much head-scratching about why my derivatives weren't coming out right, I discovered that gradients are computed incorrectly when I build a new array out of elements indexed from an existing array. Sometimes the derivative is dropped entirely:

using Knet  # assumed source of param, @diff and grad; the original snippet omits its imports
gb = param(1,1; atype=Array{Float32})
function loss_test(b)
    loss = [b[1]]
    return sum(loss)
end

∇b = @diff loss_test(gb)
grad(∇b, gb)  # returns Sparse(Matrix{Float32}(1,1)()); should be [1.0;;]

In other cases, the gradient flowing through the newly built array overwrites, instead of adding to, the gradient of the array it is added to:

gb = param(2,2; atype=Array{Float32})
function loss_test(bb)
    loss = sum(bb, dims=2)
    b = bb.*1
    loss += [b[i] for i=1:2]
    return sum(loss)
end

∇b = @diff loss_test(gb)
grad(∇b, gb)  # returns [1.0 0.0; 1.0 0.0]; should be [2.0 1.0; 2.0 1.0]
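
For reference, the expected value can be worked out by hand. Writing B for bb (my notation, not a name in the code) and using Julia's column-major linear indexing, b[1] is B_11 and b[2] is B_21, so the row sums contribute 1 to every entry and the comprehension contributes an extra 1 to the first column:

```latex
\begin{aligned}
L(B) &= \sum_{i=1}^{2}\sum_{j=1}^{2} B_{ij} + B_{11} + B_{21},\\
\frac{\partial L}{\partial B_{ij}} &= 1 + \delta_{j1},\\
\nabla_B L &= \begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}.
\end{aligned}
```

The returned value [1.0 0.0; 1.0 0.0] matches only the comprehension's contribution, which is consistent with the sum(bb, dims=2) term being overwritten.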

If this kind of construction is known to be unsupported, then attempting it should raise an error. If it is not known to be unsupported, then this is a bug.
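
As an independent check on the "should be" values above, here is a minimal sketch that estimates both gradients by central finite differences in plain Julia, without AutoGrad. The names fd_grad, loss_first and loss_second are hypothetical helpers introduced only for this check; the two loss functions copy the bodies of the examples above.

```julia
# Central finite-difference gradient of a scalar-valued function f at x.
# Plain Julia only; nothing here touches AutoGrad or Knet.
function fd_grad(f, x::AbstractArray{<:Real}; h=1e-3)
    x = Float64.(x)                  # work in Float64 to limit rounding error
    g = zeros(size(x))
    for i in eachindex(x)
        xp = copy(x); xp[i] += h     # perturb entry i upward
        xm = copy(x); xm[i] -= h     # perturb entry i downward
        g[i] = (f(xp) - f(xm)) / (2h)
    end
    return g
end

# Body of the first example: a new array built from one indexed element.
loss_first(b) = sum([b[1]])

# Body of the second example: row sums plus a comprehension over b.
function loss_second(bb)
    loss = sum(bb, dims=2)
    b = bb .* 1
    loss += [b[i] for i = 1:2]
    return sum(loss)
end

g1 = fd_grad(loss_first, rand(1, 1))
g2 = fd_grad(loss_second, rand(2, 2))
```

Both losses are linear in their input, so the finite-difference estimates should agree with the hand-derived values up to floating-point rounding: g1 close to a 1×1 matrix holding 1.0, and g2 close to [2.0 1.0; 2.0 1.0].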
