Add a parameter-shift Hessian transformation #1884
Conversation
- Includes some basic test cases. - Imports `param_shift_hessian` into the `pennylane/gradients` module. - Currently only computes the diagonal elements of the Hessian.
- all elements of the Hessian are now computed - shape of the returned array is wrong
Previously, all the elements of the Hessian would be returned in a 1D array (assuming a scalar-valued function, +1D for vector-valued functions), a hold-over from computing gradients. Now the proper dimensions are returned, with extraneous ones being stripped (for example, for single-variable, scalar-valued functions).
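As a toy illustration of this shape convention (not the PennyLane implementation — a finite-difference stand-in for the parameter-shift Hessian), a scalar-valued function of n parameters yields an (n, n) Hessian, while a single-variable function collapses to a scalar:

```python
import numpy as np

def num_hessian(f, x, h=1e-4):
    """Finite-difference Hessian of a scalar-valued f at point x (1D array)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = x.size
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.eye(n)[i] * h
            e_j = np.eye(n)[j] * h
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * h**2)
    # Strip extraneous dimensions: a single-variable function returns a scalar
    return H.item() if n == 1 else H

f2 = lambda x: np.sin(x[0]) * np.cos(x[1])
H = num_hessian(f2, [0.5, 0.1])
assert H.shape == (2, 2)

f1 = lambda x: np.cos(x[0])
assert np.isscalar(num_hessian(f1, [0.5]))
```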
Codecov Report
@@            Coverage Diff            @@
##            master    #1884    +/-  ##
========================================
+ Coverage    99.18%   99.19%   +0.01%
========================================
  Files          226      228       +2
  Lines        17396    17532     +136
========================================
+ Hits         17255    17391     +136
  Misses         141      141
Continue to review full report at Codecov.
This test fails. The added classical processing between the QNode and the gate parameters reveals that computing the "hybrid" Hessian requires steps beyond those provided by `gradient_transform`.
When generating derivatives of QNodes with both classical and quantum processing of the QNode inputs, the `gradient_transform` decorator automatically computes a "hybrid" derivative from the classical Jacobian and the quantum Jacobian. This only works for first derivatives, however. The new decorator behaves similarly while accounting for the second-derivative nature of the Hessian.
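The extra steps come from the second-order chain rule: contracting the quantum Hessian with the classical Jacobian once on each side is not enough, since the curvature of the classical processing contributes its own term. A toy numerical check (hypothetical functions, not PennyLane internals):

```python
import numpy as np

# Toy "quantum" function f(y) and classical preprocessing g(x) (both hypothetical);
# the hybrid function is h(x) = f(g(x)).
f = lambda y: y[0] * y[1]                   # grad_f = [y1, y0], H_f = [[0,1],[1,0]]
g = lambda x: np.array([x**2, 3.0 * x])     # J_g = [2x, 3], g'' = [2, 0]

x = 0.7
y = g(x)
J = np.array([2 * x, 3.0])                  # classical Jacobian dg/dx
Hf = np.array([[0.0, 1.0], [1.0, 0.0]])     # quantum Hessian d2f/dy2
grad_f = np.array([y[1], y[0]])             # quantum gradient df/dy
g2 = np.array([2.0, 0.0])                   # classical second derivative d2g/dx2

# Second-order chain rule: h'' = J^T Hf J + grad_f . g''
hybrid_hessian = J @ Hf @ J + grad_f @ g2

# Analytically h(x) = 3 x^3, so h''(x) = 18 x
assert np.isclose(hybrid_hessian, 18 * x)
```

A first-order `gradient_transform` would only produce the `J^T Hf J`-style contraction; the `grad_f . g''` term is what requires handling the Hessian case separately.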
- Differentiate between diagonal (2 tapes) and off-diagonal (4 tapes) elements of the Hessian with different recipes. - Formalize a recipe structure that allows for multiple shifts per tape. - Adapt `_process_gradient_recipe` to the new recipe structure.
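A minimal sketch of where those tape counts come from, using a toy trigonometric "expectation value" as a stand-in for tape execution (the pi and pi/2 shifts assume two-term shift rules; this is not the PennyLane code itself):

```python
import numpy as np

# Toy "circuit" expectation with the standard trigonometric dependence on each
# gate parameter; each call stands in for executing one tape.
f = lambda t0, t1: np.cos(t0) * np.cos(t1)

t0, t1, s = 0.4, -0.9, np.pi / 2

# Diagonal entry: 2 tapes (one shifted by pi, plus the unshifted result)
d2_00 = (f(t0 + np.pi, t1) - f(t0, t1)) / 2

# Off-diagonal entry: 4 tapes, shifting both parameters by +-pi/2
d2_01 = (f(t0 + s, t1 + s) - f(t0 + s, t1 - s)
         - f(t0 - s, t1 + s) + f(t0 - s, t1 - s)) / 4

assert np.isclose(d2_00, -np.cos(t0) * np.cos(t1))   # analytic d2f/dt0^2
assert np.isclose(d2_01, np.sin(t0) * np.sin(t1))    # analytic d2f/dt0 dt1
```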
Previously, only QNode output dimensions of up to 1 were considered (scalars or vectors).
Previously, the Hessian transform could only support 2D classical Jacobians (corresponding to classical processing of QNode parameters).
Adds new tests to check that the number of quantum device invocations stays below a certain target. Three different targets for two-term parameter-shift rules are provided.
Exploits the symmetry of the derivative tensor under index permutation to generate half the number of tapes for the off-diagonal elements of the Hessian.
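To make the saving concrete (a back-of-envelope sketch, assuming 4 tapes per off-diagonal entry as above, not exact PennyLane internals):

```python
import numpy as np

n = 4  # number of trainable parameters (example value)

# Without symmetry: 4 tapes for every ordered off-diagonal pair (i, j), i != j
tapes_full = 4 * n * (n - 1)

# With symmetry H[i, j] == H[j, i]: only unordered pairs i < j are evaluated,
# and the result is mirrored into the lower triangle afterwards
tapes_sym = 4 * (n * (n - 1) // 2)

assert tapes_sym == tapes_full // 2

# Mirroring a half-filled Hessian (upper triangle computed, diagonal handled
# separately by its own 2-tape recipe)
H = np.triu(np.arange(1.0, n * n + 1).reshape(n, n), k=1)
H = H + H.T  # fill in the symmetric lower triangle
assert np.allclose(H, H.T)
```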
Also formatting and linting. Caved to black reformatting my beautiful matrices.
The new QNode requires a new argument `max_diff` to set the desired differentiation order. Also includes a few additional assertions that (incidentally) uncovered a bug in the new QNode.
- Reduce the number of gradient tapes generated by only computing the unshifted tape (all shifts = 0) once and reusing the result. - Reduce the number of gradient tapes generated by skipping partial derivatives which are known to be zero.
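The unshifted-result reuse can be sketched with simple memoization (a hypothetical stand-in for tape deduplication, not the PennyLane implementation): every diagonal entry needs the all-shifts-zero evaluation, but it only has to be executed once.

```python
import numpy as np
from functools import lru_cache

calls = {"n": 0}

@lru_cache(maxsize=None)
def evaluate(shifts):
    """Stand-in for executing one tape; `shifts` is a tuple of per-parameter shifts."""
    calls["n"] += 1
    t = np.array([0.4, -0.9]) + np.array(shifts)
    return np.cos(t[0]) * np.cos(t[1])

# Both diagonal entries need the unshifted result (all shifts = 0); memoizing
# means that tape is executed once instead of once per entry.
d2_00 = (evaluate((np.pi, 0.0)) - evaluate((0.0, 0.0))) / 2
d2_11 = (evaluate((0.0, np.pi)) - evaluate((0.0, 0.0))) / 2

assert calls["n"] == 3  # 2 shifted tapes + 1 shared unshifted tape
```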
`math.comb` is only supported on Python 3.8 and up, so it was removed from the tests. Two pylint warnings that were failing CodeFactor are fixed as well.
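For reference, a drop-in replacement compatible with older Python versions is straightforward via factorials (a hypothetical helper, not what the PR uses — the PR simply removed the call):

```python
from math import factorial

def comb(n, k):
    """Binomial coefficient; works on Python < 3.8, where math.comb is unavailable."""
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))

# e.g. the number of unordered off-diagonal Hessian entries for 4 parameters
assert comb(4, 2) == 6
```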
Array assignments are unsupported in Autograd, so in order to support computing derivatives of the Hessian transform, they need to be removed from its implementation. This change also adds a new test to check the correctness of the third derivative obtained via the Hessian transform.
Instead of creating Numpy arrays, use `qml.math.stack` to convert accumulated gradients into (flat) arrays/tensors, and reshape to the proper dimensions at the end.
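The accumulate-then-stack pattern looks roughly like this (a NumPy sketch with hypothetical per-entry values; `np.stack` stands in for the framework-agnostic `qml.math.stack`):

```python
import numpy as np

# Hypothetical per-entry Hessian values, e.g. as produced by post-processing
# individual parameter-shift evaluations
entries = {(0, 0): -0.5, (0, 1): 0.2, (1, 0): 0.2, (1, 1): -0.8}
n = 2

# Instead of allocating an array and assigning H[i, j] = ... (which Autograd
# cannot differentiate through), accumulate into a flat list, stack, and
# reshape to the proper dimensions at the end.
flat = [entries[(i, j)] for i in range(n) for j in range(n)]
H = np.stack(flat).reshape(n, n)

assert H.shape == (2, 2)
assert np.allclose(H, H.T)
```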
- new input checks and computation shortcuts - new f0 parameter to unshifted circuit results - split off the core functionality from preprocessing tasks - proper docstrings - new tests for increased code coverage
Thanks @dime10, great work so far on a complex problem! Especially working out the tensor dot product to get the correct output shape.
I've left some comments and suggestions, but also have a few more major suggestions:
- Don't forget to add a changelog entry! - addressed
- Good job on getting the transform differentiable! This is definitely non-trivial. I recommend also adding tests for Torch, TF, and JAX, to ensure that the differentiability still works. - addressed
- `qml.gradients.param_shift_hessian` does not appear in the docs (see https://pennylane--1884.org.readthedocs.build/en/1884/code/qml_gradients.html). It will need to be added to the module docstring of `gradients/__init__.py`. - addressed
- I think it makes sense to move `hessian_transform` to its own file, and add a properly documented docstring, so that other developers can re-use it as needed. - addressed
- Finally, I noticed a few places where the incorrect matrix product is being computed. - addressed
... when computing the trainable indices across frameworks.
Register the transpose function with autoray to convert the `axes` keyword argument to `perm`. Amend a test to account for the return type being a tuple with TensorFlow, even for single-argument QNodes.
Amend a test to use backprop, as derivatives of order > 1 are unsupported by the JAX interface with parameter-shift. Additionally, change the QNode return value to an expectation value, as probabilities are unsupported by the JAX interface. Stack the tape results before determining their shape: the NumPy `shape` function first converts its argument to an `ndarray` if it is not already a single array (here, the tape results are a list of arrays), and with JAX this conversion fails because the list of traced arrays cannot be converted to a regular `ndarray`.
Similar to JAX, a list of tensors needs to be stacked before `qml.math.shape` can be used (fixed in 35a724b). Amend a test to account for the return type being a tuple with Torch, even for single-argument QNodes. Remove `requires_grad` from the tensor passed to the QNode.
Thank you for your help @josh146 and @dwierichs!
Looking great, @dime10 :) 😍
Just had a last tiny comment regarding the changelog.
Great work @dime10! Looks good for merging in on my end once my comments are addressed :)
New changes in #2059 removed the ability to perform higher-order differentiation in Autograd on QNodes with multiple trainable arguments. Expected results for these tests are now calculated using JAX instead.
Context: Currently, it is possible to compute the parameter-shift Hessian of a QNode by using an autodifferentiation framework and asking for the Jacobian of the gradient (or the Jacobian of the Jacobian, for a vector-valued QNode). However, this leads to redundant evaluations from computing both the QNode and the first derivative of the QNode.
Description of the Change: Adds a new transformation that directly computes the second derivative (Hessian) of QNodes via parameter-shift rules.
Benefits: Fewer quantum evaluations, can directly inspect the "Hessian tapes".
Possible Drawbacks:
Related GitHub Issues: Closes #1752
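A rough sense of the saving in quantum evaluations (back-of-envelope counts assuming two-term shift rules and the unshifted-tape reuse and symmetry optimizations described above; not exact PennyLane internals):

```python
def nested_tapes(n):
    """Tapes for the Jacobian of the gradient: ~2n tapes for the gradient,
    each differentiated again by parameter shift."""
    return (2 * n) ** 2

def direct_tapes(n):
    """Tapes for the direct parameter-shift Hessian: 1 shared unshifted tape,
    n diagonal tapes, and 4 tapes per unordered off-diagonal pair."""
    return 1 + n + 4 * (n * (n - 1) // 2)

assert nested_tapes(4) == 64
assert direct_tapes(4) == 29
assert all(direct_tapes(n) < nested_tapes(n) for n in range(1, 20))
```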