Remove the stacking (and transposing) behaviour of qml.jacobian
#2059
Conversation
Hello. You may have forgotten to update the changelog!
Codecov Report

@@            Coverage Diff             @@
##           master    #2059      +/-   ##
==========================================
- Coverage   99.18%   99.18%   -0.01%
==========================================
  Files         226      226
  Lines       17393    17373      -20
==========================================
- Hits        17252    17232      -20
  Misses        141      141
==========================================

Continue to review full report at Codecov.
Thanks @dwierichs! Amazing how much cleaner this makes it --- almost as much code deleted as added!
I left questions throughout for my own understanding. One thing I want to understand more is the metric tensor regression; do you have code examples for what used to work on master, and what no longer works? We should definitely add this to the changelog.
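A minimal sketch of the kind of example being asked about, based on the tuple-return behaviour described in this PR (the circuit, argument names, and values are hypothetical, not taken from the PR):

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(a, b):
    qml.RX(a[0], wires=0)
    qml.RY(a[1], wires=1)
    qml.RZ(b[0], wires=0)
    qml.RX(b[1], wires=1)
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

a = np.array([0.1, 0.2], requires_grad=True)
b = np.array([0.3, 0.4], requires_grad=True)

# On master, the metric tensors of the two equal-shaped arguments were
# stacked into a single array, which e.g. QNGOptimizer consumed directly.
# After this PR, one metric tensor per trainable QNode argument is
# returned as a tuple, which the current optimizers cannot process yet.
mt = qml.metric_tensor(circuit)(a, b)
```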
Co-authored-by: Josh Izaac <josh146@gmail.com>
Hi @dwierichs, thank you so much for working on this, this change is really helpful 🥳 One question I have is, is it worth considering removing only the transpose and leaving the stacking? This would still make the behaviour more consistent (the Jacobian is always of shape (num_args, *qnode_output_shape, *arg_shape), where the only difference is whether the first dimension is a tuple or array/tensor), while maybe negating some of the drawbacks you listed (working with a single tensor where possible).
Hey @dime10, that's a good question! On one hand it might come in handy because of the listed possible drawbacks that indeed would be reduced by keeping the stacking…
@dime10 @dwierichs I fear by this point you are both more informed than I am on the nuances 😄 From a higher level, I am in favour of the approach which brings us more in line with the existing conventions of the autodiff frameworks. Further, an issue I see with the stacking is that it is only well-defined if the QNode itself returns a single tensor. This is not always the case (and in the future might be even more of a minority --- I can imagine cases where QNodes return tuples of tensors). And, because of the existing…
True, these are good points 😃 In this case I do think it's sensible to remove both the transpose and the stacking.
@josh146 This would be ready for another review, and the mentioned override of CodeCov would be required :)
I can review this next week :)
[sc-13047]
Great work @dwierichs (especially fixing all those tests 😅).
A few minor comments below:
pennylane/optimize/qng.py
Outdated
# TODO: The following is essentially not implemented yet, as only
# a single metric tensor can be processed. An optimizer refactor
# is needed to accommodate this.
#
# For multiple QNode arguments, `qml.jacobian` and `qml.metric_tensor`
# return a tuple of arrays. Each of the gradient arrays has to be processed
# together with the corresponding array in the metric tensor tuple.
# This requires modifications of the `GradientDescentOptimizer` base class
# as none of the optimizers accommodate this use case.
return new_args, forward
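As a rough illustration of the per-argument processing this TODO describes, here is a sketch assuming `grads` and `metric_tensors` are the equal-length tuples returned by `qml.jacobian` and `qml.metric_tensor`; the helper name and signature are hypothetical, not part of the PR:

```python
from pennylane import numpy as np

def natural_gradient_updates(grads, metric_tensors, stepsize):
    """Hypothetical helper: one natural-gradient update per QNode argument."""
    updates = []
    for g, mt in zip(grads, metric_tensors):
        # Flatten the per-argument gradient, solve mt @ v = g via the
        # pseudo-inverse, and restore the original parameter shape.
        flat = np.linalg.pinv(mt) @ np.reshape(g, (-1,))
        updates.append(stepsize * np.reshape(flat, np.shape(g)))
    return updates
```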
I'm not quite sure what this means: what happens when multiple QNode arguments are present? Is there a check with a warning/error message, or is this branch unreachable?
There will be an error raised before this, at latest in line 218 above: `new_args = np.array(self.apply_grad(g, args), requires_grad=True)`. `apply_grad` is not prepared to work with a tuple `g` and a tuple `self.metric_tensor` (this is the status for all optimizers as far as I know, and thus should be changed across all of them in a separate PR, I'd think).
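A minimal, PennyLane-independent illustration of the failure mode: NumPy cannot wrap a tuple of differently shaped per-argument gradients into one array, which is roughly why the optimizer step fails before the multi-argument branch is reached (the gradient shapes here are made up):

```python
import numpy as np

# Per-argument gradients for two differently shaped QNode arguments:
g = (np.ones(2), np.ones((3, 3)))

# Mirrors `np.array(self.apply_grad(g, args), requires_grad=True)`:
# NumPy cannot build one rectangular array from ragged entries, so this
# raises an error (older NumPy versions instead emit a deprecation warning
# and silently create an object array, which breaks later tensor operations).
np.array(g)
```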
Hi @dwierichs, did this line become unreachable due to the PR? It is reported as uncovered by Codecov. If we know that this line is not being used, then maybe it's best to change the logic here to have a single case (without an `if` statement) and potentially reference the other case in a comment, so that the approach in mind can be used in the future.
Yes exactly.
I'm fine with excluding the second case and just leaving the comment. @josh146 is this fine by you as well?
Yep, that makes a lot of sense to me :)
Co-authored-by: David Ittah <dime10@users.noreply.github.com>
Thanks @dime10 for the thorough review and catching the old comments as well as the unfortunate docstring line break :)
…`gradient_transform` (#2080)
* safe_squeeze+tests
* gradient trafo
* include previously skipped tests
* black
* changelog
* Apply suggestions from code review
Co-authored-by: Josh Izaac <josh146@gmail.com>
…to unstack-qml-jac
* Initial attempt at a parameter-shift hessian. Includes some basic test cases; imports 'param_shift_hessian' into the pennylane/gradients module; currently only computes the diagonal elements of the hessian.
* Compute off-diagonal elements of the Hessian. All elements of the Hessian are now computed; shape of the returned array is wrong.
* Fix shape of returned Hessian. Previously all the elements of the Hessian would be returned in a 1D array (assuming a scalar-valued function, +1D for vector-valued functions), a hold-over from computing gradients. Now the proper dimensions are returned, with extraneous ones being stripped (for example for single-variable, scalar-valued functions).
* Add test that reuses qnode parameters. This test fails. The added classical processing between the QNode and the gate parameters reveals that computing the "hybrid" hessian requires steps beyond those provided by `gradient_transform`.
* Add new gradient transform decorator for Hessian. When generating derivatives of QNodes with both classical and quantum processing of the QNode inputs, the `gradient_transform` decorator automatically computes a "hybrid" derivative from the classical Jacobian and quantum Jacobian. This only works with first derivatives, however. The new decorator behaves similarly while accounting for the second-derivative nature of the Hessian.
* Use different recipes for diag/off-diag elements. Differentiate between diagonal (2 tapes) and off-diagonal (4 tapes) elements of the Hessian with different recipes; formalize a recipe structure that allows for multiple shifts per tape; adapt `_process_gradient_recipe` for the new recipe structure.
* Fix hessian for arbitrary QNode output dimensions. Previously, only QNode output dimensions of up to 1 were considered (scalars or vectors).
* Support higher-dimensional classical Jacobians in hybrid QNodes. Previously, the Hessian transform could only support 2D classical Jacobians (corresponding to classical processing of QNode parameters).
* Formatting.
* Set performance goals for Hessian implementation. Adds new tests to check that the number of quantum device invocations is below a certain target. 3 different targets for 2-term parameter-shift rules are provided.
* Reduce number of gradient tapes generated. Exploits the symmetry of the derivative tensor under index permutation to generate half the number of tapes for the off-diagonal elements of the Hessian.
* Fix typo in performance test. Also formatting and linting. Caved to black reformatting my beautiful matrices.
* Modify Hessian tests for the new QNode. The new QNode requires a new argument `max_diff` to set the desired differentiation order. Also includes a few additional assertions that (incidentally) uncovered a bug in the new QNode.
* Add 2 optimizations to reduce the number of evaluations. Only compute the unshifted tape (all shifts = 0) once and reuse the result; skip partial derivatives which are known to be zero.
* Fix pylint and std math library dependence. `math.comb` is only supported on Python 3.8 and up, so it was removed from the tests. 2 pylint warnings that are failing codefactor are fixed as well.
* Rewrite transformation to avoid array assignments. Array assignments are unsupported in Autograd, so in order to support computing derivatives of the Hessian transform, they need to be removed from its implementation. This change also adds a new test to check the correctness of the third derivative obtained via the Hessian transform.
* Add checks for unsupported operations + tests.
* Make Hessian transform framework-agnostic. Instead of creating NumPy arrays, use `qml.math.stack` to convert accumulated gradients into (flat) arrays/tensors, and reshape to the proper dimensions at the end.
* Add test for circuit without trainable parameters.
* Refactor Hessian transform, increase test coverage. New input checks and computation shortcuts; new f0 parameter for unshifted circuit results; split off the core functionality from preprocessing tasks; proper docstrings; new tests for increased code coverage.
* More descriptive names, remove print statements.
* Review: change Jacobian for Hessian.
* Add proper docstring to hessian_transform.
* Review: add new functions to docs.
* Implement fix for multiple QNode arguments. #1989 introduced a fix to the gradient transform for QNodes with multiple arguments using autograd. When arguments are of the same shape, the classical Jacobians are stacked and transposed, leading to a different tensor contraction with the quantum Jacobian/Hessian than in the single-argument case.
* Add support for tuples of classical Jacobians. The computation is the same as for the single QNode argument case but repeated with each Jacobian per QNode argument.
* Move hessian_transform to its own file.
* Add preliminary tests for all interfaces (failing).
* Review: rename gradient->Hessian. Co-authored-by: David Wierichs <davidwierichs@gmail.com>
* More renaming.
* Review: remove redundant dim swaps.
* Review: optimize upper-triangular optimization.
* Review: optimize 0-derivative optimization.
* Review: simplify axes reordering. Co-authored-by: David Wierichs <davidwierichs@gmail.com>
* Review: small fixes to the dimension reordering.
* Review: clarify wording for 'f0' arg.
* Review: prevent silent failure when computing the trainable indices across frameworks.
* Fix Hessian transform for TensorFlow. Register the transpose function with autoray to convert the `axes` keyword argument to `perm`. Amend test to account for the return type being a tuple with TensorFlow, even for single-argument QNodes.
* Fix Hessian transform for JAX. Amend test to use backprop, as derivatives of order > 1 are unsupported by the JAX interface with parameter-shift. Additionally, change the QNode return value to expval, as probabilities are unsupported by the JAX interface. Stack the tape results before determining their shape: the numpy shape function converts its argument to ndarray first if it is not already a single array (in this case, the tape results are a list of arrays), and with JAX this conversion fails, as the list of traced arrays cannot be converted to a regular ndarray.
* Fix Hessian transform for PyTorch. Similar to JAX, a list of tensors needs to be stacked before qml.math.shape can be used (fixed in 35a724b). Amend test to account for the return type being a tuple with Torch, even for single-argument QNodes. Remove requires_grad from the tensor passed to the QNode.
* Mark slow tests.
* Review: fix hardcoded expansion function.
* Formatting.
* Changelog + docstring examples.
* Fix coverage with more interface tests.
* Formatting.
* Changelog improvements.
* Revert unrelated file.
* Fix docstring examples + add tape drawing.
* Add warning and use case to docstring.
* Remove stack&transpose workaround.
* Adapt multiple-argument tests. New changes in #2059 removed the ability to perform higher-order differentiation in autograd on QNodes with multiple trainable arguments. Expected results for these tests are now calculated using JAX instead.
* Improve docstring. Co-authored-by: David Wierichs <davidwierichs@gmail.com> Co-authored-by: antalszava <antalszava@gmail.com>
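Based on the commit notes above, a minimal usage sketch of `param_shift_hessian` (the circuit and values are illustrative, not taken from the PR):

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

# `max_diff=2` requests second-order derivatives, as noted in the commits.
@qml.qnode(dev, max_diff=2)
def circuit(x):
    qml.RX(x[0], wires=0)
    qml.RY(x[1], wires=0)
    return qml.expval(qml.PauliZ(0))

x = np.array([0.1, 0.2], requires_grad=True)

# Parameter-shift Hessian of the QNode, here a (2, 2) array.
hessian = qml.gradients.param_shift_hessian(circuit)(x)
```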
Context:

This PR implements the proposal #2056. This is a breaking change for multiple equal-shaped QNode arguments in `qml.jacobian`, and in `metric_tensor`, `adjoint_metric_tensor`, and `classical_jacobian`, when using the Autograd interface.

Description of the Change:
Stacking and transposition is removed from `qml.jacobian`. A lot of tests have to be adapted to this, including those for `metric_tensor`, `adjoint_metric_tensor`, and `classical_jacobian`. Furthermore, `qml.transforms.metric_tensor._contract_metric_tensor_with_cjac` is changed to not replicate the old Jacobian behaviour.
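To make the breaking change concrete, a minimal sketch (circuit and values are illustrative only):

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def circuit(x, y):
    qml.RX(x[0], wires=0)
    qml.RY(y[0], wires=0)
    return qml.probs(wires=0)

x = np.array([0.5], requires_grad=True)
y = np.array([0.9], requires_grad=True)

jac = qml.jacobian(circuit)(x, y)
# Before this PR: the two equal-shaped argument Jacobians were stacked
# (and transposed) into a single array.
# After this PR: `jac` is a tuple (jac_x, jac_y), with each entry of shape
# (*qnode_output_shape, *arg_shape) == (2, 1).
```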
Benefits:

* More consistent output of `qml.jacobian`, and even more so for second-order derivatives like metric tensors and Hessians.
* Makes `qml.jacobian` more similar to `jax.jacobian`, `tf.GradientTape().jacobian` and `torch.autograd.functional.jacobian` (see the sketch below).
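For comparison, the corresponding convention in JAX, which `qml.jacobian` now mirrors (an illustrative sketch, not from the PR):

```python
import jax
import jax.numpy as jnp

def f(x, y):
    return jnp.sin(x) * jnp.cos(y)

x = jnp.array([0.1, 0.2])
y = jnp.array([0.3, 0.4])

# Differentiating with respect to several arguments yields a tuple of
# Jacobians, one per argument.
jac_x, jac_y = jax.jacobian(f, argnums=(0, 1))(x, y)
```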
Possible Drawbacks:

* In particular, `QNGOptimizer` does not work for two of its tests any more, because it is written to use a single metric tensor, not one per QNode argument.

Related GitHub Issues:
#1884 #1725 #1989 #1992 #2056