For sparse Gaussian processes, one often wants to parametrize the variational distribution using its Cholesky factors, which form an upper-triangular matrix and are trainable.
However, when taking gradients, Zygote computes the gradient with respect to all entries of the UpperTriangular matrix, including the entries below the diagonal that are structurally zero because the matrix is UpperTriangular. This is related to #163.
For the moment, a workaround suitable for training models is to wrap the upper-triangular matrix x in UpperTriangular(x) before using it.
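One possible reading of this workaround is sketched below. The `loss` function, the vector `v`, and the parameter values are hypothetical and chosen only for illustration; the sketch assumes the parameters are stored as a plain `Matrix` and wrapped in `UpperTriangular` inside the function being differentiated. Whether the entries below the diagonal of the returned gradient come back as structural zeros may depend on the Zygote version, so that is not asserted here.

```julia
using LinearAlgebra
using Zygote

# Hypothetical parameters: a dense matrix whose upper triangle holds
# the trainable Cholesky factor.
x = [1.0 2.0 3.0;
     0.0 4.0 5.0;
     0.0 0.0 6.0]
v = ones(3)

# Hypothetical loss: wrap x in UpperTriangular before using it, so that
# only the upper triangle takes part in the computation.
loss(x) = sum(abs2, UpperTriangular(x) * v)

# Gradient with respect to the underlying matrix.
g = Zygote.gradient(loss, x)[1]

# One plain gradient step; re-wrapping keeps the parameters upper-triangular
# even if g turns out to have nonzero entries below the diagonal.
x_new = Matrix(UpperTriangular(x .- 0.01 .* g))
```

Re-wrapping after the update is a defensive choice in this sketch, not something the workaround itself requires.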
Perhaps this is more closely related to #402. As an alternative, you could insert the projection function ℙ into your code; see #402 (comment). The adjoint will then be an UpperTriangular. You would also have to define the projection for UpperTriangular:
ℙ(::Type{T}, X) where {T<:UpperTriangular} = UpperTriangular(X)
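To illustrate what this method does, here is a minimal, self-contained sketch that applies it to a dense matrix. It only assumes that ℙ takes a target type and a value, as in the definition above; the generic ℙ from the #402 comment is not reproduced here, and `dense_grad` is a made-up stand-in for a dense adjoint.

```julia
using LinearAlgebra

# Projection onto UpperTriangular, as in the definition above.
ℙ(::Type{T}, X) where {T<:UpperTriangular} = UpperTriangular(X)

# Made-up dense adjoint with nonzero entries below the diagonal.
dense_grad = [1.0 2.0 3.0;
              4.0 5.0 6.0;
              7.0 8.0 9.0]

# The projection keeps the upper triangle and treats the rest as zero.
projected = ℙ(UpperTriangular, dense_grad)

Matrix(projected)
# 3×3 Matrix{Float64}:
#  1.0  2.0  3.0
#  0.0  5.0  6.0
#  0.0  0.0  9.0
```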