Make qml.math.block_diag fully differentiable with autograd #1816
Conversation
Codecov Report
@@           Coverage Diff            @@
##           master     #1816   +/-  ##
========================================
  Coverage   98.92%    98.92%
========================================
  Files         209       209
  Lines       15727     15737   +10
========================================
+ Hits        15558     15568   +10
  Misses        169       169
Continue to review full report at Codecov.
Thanks @dwierichs this is great!
pennylane/math/single_dispatch.py
rsizes = _np.array([t.shape[0] for t in tensors])
csizes = _np.array([t.shape[1] for t in tensors])
Will rsizes and csizes always be identical for block_diag to work?
Nevermind, I just checked the scipy docs, and it looks like arbitrary shapes are allowed 👍
I suppose, if you assume that all tensors are 2D, you could do
rsizes, csizes = _np.array([t.shape for t in tensors]).T
to save on one for loop 😆
(unless this blocks constructing ND block diagonal matrices, which it might?)
Ah, no, we only allow for 2d block diagonals :)
So yes, I'll incorporate the trick above, thanks!
Regarding non-square shapes, we are a bit inconsistent: I made this compatible with non-square blocks because the scipy version is, but TF, for example, does not allow them anyway, so anyone who wants to use all frameworks is effectively restricted to all-square tensors.
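For reference, a quick sketch of that one-liner under the all-2D assumption (the example blocks are made up, and I am assuming _np stands for autograd's NumPy wrapper):
import autograd.numpy as _np
# Two made-up 2D blocks of different sizes
tensors = [_np.eye(2), 3.0 * _np.ones((3, 3))]
# Collect row and column sizes of all blocks in a single pass
rsizes, csizes = _np.array([t.shape for t in tensors]).T
# rsizes -> array([2, 3]), csizes -> array([2, 3])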
res = _np.hstack([tensors[0], *all_zeros[0][1:]])
for i, t in enumerate(tensors[1:], start=1):
    row = _np.hstack([*all_zeros[i][:i], t, *all_zeros[i][i + 1 :]])
    res = _np.vstack([res, row])

return res
Nice solution! I realize now that the torch approach below wouldn't work here, since res[row, col] = t uses tensor assignment, which is not allowed by autograd.
Exactly, that is what I tried first :D
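To put the snippets above in context, here is a self-contained sketch of the hstack/vstack construction; the all_zeros helper and the _np import are my reconstruction of the surrounding code, not necessarily the exact PR implementation:
import autograd.numpy as _np

def _block_diag_autograd(tensors):
    """Build a block-diagonal matrix using only hstack/vstack,
    so that autograd can differentiate through the construction
    without any in-place tensor assignment."""
    rsizes, csizes = _np.array([t.shape for t in tensors]).T
    # One zero block for every (row block, column block) pair
    all_zeros = [[_np.zeros((rsize, csize)) for csize in csizes] for rsize in rsizes]

    # First block row: the first tensor followed by zero blocks
    res = _np.hstack([tensors[0], *all_zeros[0][1:]])
    # Remaining block rows: zeros left of the block, the block, zeros right of it
    for i, t in enumerate(tensors[1:], start=1):
        row = _np.hstack([*all_zeros[i][:i], t, *all_zeros[i][i + 1 :]])
        res = _np.vstack([res, row])
    return res

>>> _block_diag_autograd([_np.eye(2), 3.0 * _np.ones((3, 3))]).shape
(5, 5)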
# Transposes in the following because autograd behaves strangely
assert fn.allclose(res[:, :, 0].T, exp[:, :, 0])
assert fn.allclose(res[:, :, 1].T, exp[:, :, 1])
Wait, how come?
Is it fixed if you swap to autograd.jacobian?
Yes, because then one has to request the jacobian for one argnum anyway, and an output-shaped tensor is returned. qml.jacobian does the following:
>>> qml.jacobian(lambda x, y: np.array([[1*x, 2*x, 3*x],[4*y,5*y, 6*y]]))(0.3, 0.2)
[[[1. 0.]
[0. 4.]]
[[2. 0.]
[0. 5.]]
[[3. 0.]
[0. 6.]]]
That is, for an m x n-shaped output function with k (scalar) arguments, the jacobian has the shape (n, m, k).
If instead there is a single argument with k entries, we get the shape (m, n, k):
>>> qml.jacobian(lambda x: np.array([[1*x[0], 2*x[0], 3*x[0]],[4*x[1],5*x[1], 6*x[1]]]))(np.array([0.3, 0.2]))
[[[1. 0.]
[2. 0.]
[3. 0.]]
[[0. 4.]
[0. 5.]
[0. 6.]]]
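For comparison, a minimal sketch of the autograd.jacobian variant as I understand it: fixing one argnum returns a tensor with the same shape as the function output (the numbers here are just for illustration):
>>> import autograd
>>> import autograd.numpy as np
>>> f = lambda x, y: np.array([[1*x, 2*x, 3*x], [4*y, 5*y, 6*y]])
>>> autograd.jacobian(f, argnum=0)(0.3, 0.2)  # derivative w.r.t. the scalar x, output-shaped (2, 3)
array([[1., 2., 3.],
       [0., 0., 0.]])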
Context:
Previously, results of qml.math.block_diag were only differentiable by indexing and differentiating single entries of the resulting matrix. This was due to the Autograd single dispatch using the vanilla scipy function block_diag.

Description of the Change:
There now is a custom method that implements block_diag for Autograd, supporting differentiation of the full output matrix.
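As an illustration of the new behaviour, a hedged sketch of differentiating the full output (the block shapes and values are made up; the expected jacobian shape follows the qml.jacobian convention discussed above):
>>> import pennylane as qml
>>> from pennylane import numpy as np
>>> def fn(x):
...     # 4x4 block-diagonal matrix from two parameter-dependent 2x2 blocks
...     return qml.math.block_diag([x[0] * np.eye(2), x[1] * np.ones((2, 2))])
>>> x = np.array([0.3, 0.2], requires_grad=True)
>>> qml.jacobian(fn)(x).shape  # previously this required differentiating entries one by one
(4, 4, 2)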
Benefits:
Improved differentiability of qml.math.block_diag (as far as I can tell, it is only used in metric_tensor so far).

Possible Drawbacks:
(More custom code.)
Related GitHub Issues:
#1814