
Remove the stacking (and transposing) behaviour of qml.jacobian #2059

Merged
antalszava merged 39 commits into master from unstack-qml-jac on Jan 14, 2022

Conversation

dwierichs
Contributor

Context:
This PR implements the proposal #2056. This is a breaking change for multiple equal-shaped QNode arguments in qml.jacobian and in metric_tensor, adjoint_metric_tensor, and classical_jacobian when using the Autograd interface.

Description of the Change:
Stacking and transposition are removed from qml.jacobian.
Many tests are adapted accordingly, including those for metric_tensor, adjoint_metric_tensor, and classical_jacobian.
Furthermore, qml.transforms.metric_tensor._contract_metric_tensor_with_cjac is changed so that it no longer replicates the old Jacobian behaviour.
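A minimal sketch of the new return format (the device and circuit below are illustrative only, not taken from this PR):

import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=2)

@qml.qnode(dev)
def circuit(weights_a, weights_b):
    qml.RX(weights_a[0], wires=0)
    qml.RY(weights_a[1], wires=1)
    qml.RZ(weights_b[0], wires=0)
    qml.CRY(weights_b[1], wires=[0, 1])
    return qml.probs(wires=[0, 1])

a = np.array([0.1, 0.2], requires_grad=True)
b = np.array([0.3, 0.4], requires_grad=True)

# With this change, qml.jacobian returns one entry per trainable argument,
# each of shape (*qnode_output_shape, *arg_shape) -- here (4, 2) -- instead of
# stacking (and transposing) the equal-shaped Jacobians into a single array.
jac_a, jac_b = qml.jacobian(circuit)(a, b)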

Benefits:

  • Removes arguably strange behaviour regarding the output shape of qml.jacobian, and even more so for second order derivatives like metric tensors and Hessians.
  • Makes the output shape more predictable.
  • Reduces code complexity.
  • Makes the behaviour of qml.jacobian more similar to jax.jacobian, tf.GradientTape().jacobian, and torch.autograd.functional.jacobian.

Possible Drawbacks:

  • It is a breaking change.
  • For multiple equal-shaped QNode arguments, higher-order derivatives are not computed properly yet.
    In particular, QNGOptimizer no longer works for two of its tests, because it is written to use a single metric tensor rather than one per QNode argument.
  • Using a computed Jacobian is easier if the Jacobians w.r.t. different QNode arguments are stacked into a single array. On the other hand, the way they were stacked previously is not necessarily what one would expect; in particular, for array-valued functions, the output dimensions were transposed.

Related GitHub Issues:
#1884 #1725 #1989 #1992 #2056

@github-actions
Contributor

Hello. You may have forgotten to update the changelog!
Please edit doc/releases/changelog-dev.md with:

  • A one-to-two sentence description of the change. You may include a small working example for new features.
  • A link back to this PR.
  • Your name (or GitHub username) in the contributors section.

@codecov

codecov bot commented Dec 22, 2021

Codecov Report

Merging #2059 (db90956) into master (34d77f8) will decrease coverage by 0.00%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #2059      +/-   ##
==========================================
- Coverage   99.18%   99.18%   -0.01%     
==========================================
  Files         226      226              
  Lines       17393    17373      -20     
==========================================
- Hits        17252    17232      -20     
  Misses        141      141              
Impacted Files Coverage Δ
pennylane/math/__init__.py 100.00% <ø> (ø)
pennylane/_grad.py 100.00% <100.00%> (ø)
pennylane/gradients/gradient_transform.py 100.00% <100.00%> (ø)
pennylane/math/multi_dispatch.py 100.00% <100.00%> (ø)
pennylane/optimize/qng.py 100.00% <100.00%> (ø)
pennylane/optimize/rotosolve.py 100.00% <100.00%> (ø)
pennylane/transforms/adjoint_metric_tensor.py 100.00% <100.00%> (ø)
pennylane/transforms/classical_jacobian.py 100.00% <100.00%> (ø)
pennylane/transforms/metric_tensor.py 99.47% <100.00%> (-0.03%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 34d77f8...db90956.

@josh146 josh146 requested a review from dime10 December 23, 2021 13:44
@josh146 josh146 added the enhancement ✨ New feature or request label Dec 23, 2021
Member

@josh146 left a comment

Thanks @dwierichs! Amazing how much cleaner this makes it --- almost as much code deleted as added!

I left questions throughout for my own understanding. One thing I want to understand more is the metric tensor regression; do you have code examples for what used to work on master, and what no longer works? We should definitely add this to the changelog.

@dime10
Contributor

dime10 commented Jan 6, 2022

Hi @dwierichs, thank you so much for working on this, this change is really helpful 🥳

One question I have: is it worth considering removing only the transpose and leaving the stacking?

This would still make the behaviour more consistent (the Jacobian would always be of shape (num_args, *qnode_output_shape, *arg_shape), where the only difference is whether the first dimension is a tuple or an array/tensor), while possibly mitigating some of the drawbacks you listed (working with a single tensor where possible).
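As a toy illustration of the shape being discussed (the numbers are made up):

import numpy as np

# Two equal-shaped QNode arguments of shape (3,), QNode output of shape (4,):
jac_per_arg = (np.ones((4, 3)), np.ones((4, 3)))  # tuple, as returned after this PR
stacked = np.stack(jac_per_arg)                   # shape (2, 4, 3), i.e.
                                                  # (num_args, *qnode_output_shape, *arg_shape)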

@dwierichs
Contributor Author

Hey @dime10, that's a good question! On one hand, keeping the stack might come in handy, since it would indeed reduce the listed possible drawbacks. Personally, I am a bit hesitant, because a) we have the opportunity to match the behaviour of the other interfaces (even though it might be a bit less convenient to work with the output in cases where the stack succeeds), and b) the output shape of qml.jacobian would become dependent on the argument shapes, so building more advanced functionality on top of it might become cumbersome.
I find it hard to weigh consistency between interfaces and a stable output format against the simpler output format that stacking gives where it is possible. What do you think? Maybe @josh146 also has an opinion on how this should be handled?

@josh146
Member

josh146 commented Jan 6, 2022

@dime10 @dwierichs I fear by this point you are both more informed than I am on the nuances 😄

From a higher level, I am in favour of the approach that brings us more in line with the existing conventions of the autodiff frameworks.

Further, an issue I see with the stacking is that it is only well-defined if the QNode itself returns a single tensor. This is not always the case (and in the future might be even more of a minority --- I can imagine cases where QNodes return tuples of tensors). And, because of the existing try-except around the stacking, the qml.jacobian return type will fluctuate!
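A toy illustration of the fluctuating return type (plain NumPy, not the actual qml.jacobian code):

import numpy as np

# Jacobians w.r.t. two differently shaped arguments (lengths 2 and 3)
# for a QNode output of shape (4,):
jac_a = np.ones((4, 2))
jac_b = np.ones((4, 3))

try:
    result = np.stack([jac_a, jac_b])  # only succeeds if all Jacobians share a shape
except ValueError:
    result = (jac_a, jac_b)           # otherwise fall back to a tuple
# The caller receives either a single array or a tuple, depending on the argument shapes.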

@dime10
Contributor

dime10 commented Jan 6, 2022

True, these are good points 😃 In this case I do think it's sensible to remove both the transpose and the stacking.

@dwierichs
Contributor Author

@josh146 This is ready for another review, and the mentioned CodeCov override would be required :)

@dime10
Contributor

dime10 commented Jan 7, 2022

I can review this next week :)

@josh146
Member

josh146 commented Jan 11, 2022

[sc-13047]

Contributor

@dime10 left a comment

Great work @dwierichs (especially fixing all those tests 😅).

A few minor comments below:

Comment on lines 226 to 235
# TODO: The following is essentially not implemented yet, as only
# a single metric tensor can be processed. An optimizer refactor
# is needed to accommodate this.
#
# For multiple QNode arguments, `qml.jacobian` and `qml.metric_tensor`
# return a tuple of arrays. Each of the gradient arrays has to be processed
# together with the corresponding array in the metric tensor tuple.
# This requires modifications of the `GradientDescentOptimizer` base class
# as none of the optimizers accommodate this use case.
return new_args, forward
Contributor

I'm not quite sure what this means, what happens when multiple QNode arguments are present? Is there a check with a warning/error message, or is this branch unreachable?

Contributor Author

An error will be raised before this point, at the latest in line 218 above:

new_args = np.array(self.apply_grad(g, args), requires_grad=True)

apply_grad is not prepared to work with a tuple g and a tuple self.metric_tensor (this is the status for all optimizers as far as I know, so it should be changed across all of them in a separate PR, I'd think).
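A hypothetical sketch (not the current optimizer API; the function name and signature are made up) of what handling one metric tensor and one gradient per QNode argument could look like:

from pennylane import numpy as np

def apply_grad_per_argument(metric_tensors, grads, args, stepsize):
    # Apply a natural-gradient step separately for each trainable argument.
    new_args = []
    for mt, g, arg in zip(metric_tensors, grads, args):
        # Solve mt @ update = g for this argument and take a step.
        update = np.linalg.solve(mt, g.flatten()).reshape(np.shape(arg))
        new_args.append(arg - stepsize * update)
    return tuple(new_args)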

Contributor

Hi @dwierichs, did this line become unreachable due to the PR? It is reported to be uncovered by Codecov. If we know that this line is not being used, then maybe it's best to change the logic here to a single case (without an if statement) and reference the other case in a comment, so that the intended approach can be applied in the future.

Contributor Author

Yes exactly.
I'm fine with excluding the second case and just leaving the comment. @josh146 is this fine by you as well?

Member

Yep, that makes a lot of sense to me :)

Co-authored-by: David Ittah <dime10@users.noreply.github.com>
Contributor Author

@dwierichs left a comment

Thanks @dime10 for the thorough review and catching the old comments as well as the unfortunate doc string line break :)


@josh146 josh146 linked an issue Jan 13, 2022 that may be closed by this pull request
@antalszava antalszava merged commit 1275736 into master Jan 14, 2022
@antalszava antalszava deleted the unstack-qml-jac branch January 14, 2022 16:19
dime10 added a commit that referenced this pull request Jan 17, 2022
* Initial attempt at a parameter-shift hessian

- includes some basic test cases
- imports 'param_shift_hessian' into the pennylane/gradients module
- currently only computes the diagonal elements of the hessian

* Compute offdiagonal elements of the Hessian

- all elements of the Hessian are now computed
- shape of the returned array is wrong

* Fix shape of returned Hessian

Previously, all the elements of the Hessian would be returned in a 1D array
(assuming a scalar-valued function, +1D for vector-valued functions), a
hold-over from computing gradients.
Now the proper dimensions are returned, with extraneous ones being
stripped (for example, for single-variable, scalar-valued functions).

* Add test that reuses qnode parameters

This test fails. The added classical processing between the QNode and
the gate parameters reveals that computing the "hybrid" Hessian requires
steps beyond those provided by `gradient_transform`.

* Add new gradient transform decorator for Hessian

When generating derivatives of QNodes with both classical and quantum
processing of the QNode inputs, the `gradient_transform` decorator
automatically computes a "hybrid" derivative from the classical Jacobian
and quantum Jacobian. This only works for first derivatives, however.
The new decorator behaves similarly while accounting for the
second-derivative nature of the Hessian.

* Use different recipes for diag/off-diag elements

- Differentiate between diagonal (2 tapes) and off-diagonal (4 tapes)
  elements of the Hessian with different recipes.
- Formalize recipe structure that allows for multiple shifts per tape.
- Adapt `_process_gradient_recipe` for new recipe structure.

* Fix hessian for arbitrary QNode output dimensions

Previously, only QNode output dimensions of up to 1 were considered
(scalars or vectors).

* Support higher dim. cl. Jacobians in hybrid QNodes

Previously, the Hessian transform could only support 2D classical Jacobians
(corresponding to classical processing of QNode parameters).

* Formatting

* Set performance goals for Hessian implementation

Adds new tests to check that the number of quantum device invocations is
below a certain target. Three different targets for 2-term parameter-shift
rules are provided.

* Reduce number of gradient tapes generated

Exploits the symmetry of the derivative tensor under index permutation
to generate half the number of tapes for the off-diagonal elements of
the Hessian.
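As a rough back-of-the-envelope count of the savings described here (assuming the 2-tapes-per-diagonal and 4-tapes-per-off-diagonal recipes mentioned above, and ignoring the further optimizations added later):

def hessian_tape_count(n_params):
    # Diagonal entries: 2 tapes each.
    diagonal = 2 * n_params
    # Off-diagonal entries: 4 tapes each, but only the upper triangle is
    # evaluated thanks to the symmetry under index permutation.
    off_diagonal = 4 * (n_params * (n_params - 1) // 2)
    return diagonal + off_diagonal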

* Fix typo in performance test

Also formatting and linting. Caved to black reformatting my beautiful
matrices.

* Modify Hessian tests for the new QNode

The new QNode requires a new argument `max_diff` to set the desired
differentiation order.
Also includes a few additional assertions that (incidentally) uncovered
a bug in the new QNode.

* Add 2 optimizations to reduce the # of evaluations

- Reduce the number of gradient tapes generated by only computing the
  unshifted tape (all shifts = 0) once and reusing the result.
- Reduce the number of gradient tapes generated by skipping partial
  derivatives which are known to be zero.

* Fix pylint and std math library dependance

`math.comb` is only supported on Python 3.8 and up, so it was removed from
the tests.
Two pylint warnings that were failing CodeFactor are fixed as well.

* Rewrite transformation to avoid array assignments

Array assignments are unsupported in Autograd, so in order to support
computing derivatives of the Hessian transform, they need to be removed
from its implementation.
This change also adds a new test to check the correctness of the third
derivative obtained via the Hessian transform.
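A generic sketch of the pattern described in this commit (names are illustrative, not the actual implementation): collect the differentiable entries in a Python list and stack/reshape them at the end, instead of assigning into a preallocated array, which Autograd cannot differentiate through.

import pennylane as qml

def assemble_hessian(entries, shape):
    # entries: a flat list of differentiable scalars, one per Hessian element.
    flat = qml.math.stack(entries)        # no in-place item assignment involved
    return qml.math.reshape(flat, shape)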

* Add checks for unsupported operations + tests

* Make Hessian transform framework agnostic

Instead of creating NumPy arrays, use `qml.math.stack` to convert
accumulated gradients into (flat) arrays/tensors, and reshape to the
proper dimensions at the end.

* Add test for circuit without trainable parameters

* Refactor Hessian transform, increase test coverage

- new input checks and computation shortcuts
- new f0 parameter to unshifted circuit results
- split off the core functionality from preprocessing tasks
- proper docstrings
- new tests for increased code coverage

* More descriptive names, remove print statements

* Review: change Jacobian for Hessian

* Add proper docstring to hessian_transform

* Review: add new functions to docs

* Implement fix for multiple QNode arguments

#1989 introduced a fix to the gradient transform for QNodes with
multiple arguments using autograd. When arguments are of the same shape,
the classical Jacobians are stacked and transposed, leading to a
different tensor contraction with the quantum Jacobian/Hessian than in
the single argument case.

* Add support for tuples of classical Jacobians

The computation is the same as for the single QNode argument case but
repeated with each Jacobian per QNode argument.

* Move hessian_transform to its own file

* Add preliminary tests for all interfaces (failing)

* Review: rename gradient->Hessian

Co-authored-by: David Wierichs <davidwierichs@gmail.com>

* More renaming

* Review: remove redundant dim swaps

* Review: optimize upper triangular optimization

* Review: optimize 0 derivative optimization

* Review: simplify axes reordering

Co-authored-by: David Wierichs <davidwierichs@gmail.com>

* Review: small fixes to the dimension reordering

* Review: clarify wording for 'f0' arg

* Review: prevent silent failure

... when computing the trainable indices across frameworks.

* Fix Hessian transform for Tensorflow

Register the transpose function with autoray to convert the `axes`
keyword argument to `perm`.
Amend test to account for return type being a tuple with tensorflow,
even for single argument QNodes.

* Fix Hessian transform for JAX

Amend test to use backprop, as derivatives of order  > 1 are unsupported
by the JAX interface with parameter-shift. Additionally, change the
QNode return value to expval, as probabilities are unsupported by the
JAX interface.
Stack the tape results before determining their shape. The numpy shape
function converts its argument to ndarray first if it is not already a
single array (in this case, the tape results are a list of arrays). With
JAX, this conversion fails as the list of traced arrays cannot be
converted to a regular ndarray.

* Fix Hessian transform for PyTorch

Similar to JAX, a list of tensors needs to be stacked before
qml.math.shape can be used (fixed in 35a724b).
Amend test to account for return type being a tuple with Torch, even for
single argument QNodes. Remove requires_grad from the tensor passed to
the QNode.

* Mark slow tests

* Review: fix hardcoded expansion function

* Formatting

* Changelog + docstring examples

* Fix coverage with more interface tests

* Formatting

* Changelog improvements

* Revert unrelated file

* Fix docstring examples + add tape drawing

* Add warning and use case to docstring

* Remove stack&transpose workaround

* Adapt multiple argument tests

New changes in #2059 removed the ability to perform higher-order
differentiation in Autograd on QNodes with multiple trainable arguments.
Expected results for these tests are now calculated using JAX instead.

* Improve docstring

Co-authored-by: David Wierichs <davidwierichs@gmail.com>
Co-authored-by: antalszava <antalszava@gmail.com>
@josh146 josh146 mentioned this pull request Feb 4, 2022
Labels: enhancement ✨ New feature or request

Successfully merging this pull request may close these issues: Remove the stacking behaviour of qml.jacobian

4 participants