Add a simple API for gradient transforms #1589

Merged
63 commits merged on Aug 31, 2021
ed38d78
Add a tape unwrapping context manager
josh146 Aug 3, 2021
2213575
tests
josh146 Aug 3, 2021
424d19e
more
josh146 Aug 3, 2021
c45256f
changelog
josh146 Aug 3, 2021
80faa2b
Merge branch 'master' into unwrap2
josh146 Aug 3, 2021
2002f3d
suggested changes
josh146 Aug 4, 2021
446b640
suggested changes
josh146 Aug 4, 2021
46ab93f
Merge branch 'unwrap2' of github.com:PennyLaneAI/pennylane into unwrap2
josh146 Aug 4, 2021
b588ee8
suggested changes
josh146 Aug 4, 2021
b99fb3a
add comment
josh146 Aug 4, 2021
a8b031f
linting
josh146 Aug 4, 2021
1192b7a
linting
josh146 Aug 4, 2021
8c3e267
linting
josh146 Aug 4, 2021
4510ed3
Add support for gradient decompositions
josh146 Aug 4, 2021
1ed0dac
update changelog
josh146 Aug 4, 2021
34bc66c
Merge branch 'unwrap2' into custom_gradient
josh146 Aug 4, 2021
eb0c423
update changelog
josh146 Aug 4, 2021
cce9c81
Merge branch 'master' into custom_gradient
josh146 Aug 4, 2021
c40679e
Merge branch 'master' into custom_gradient
josh146 Aug 19, 2021
f330aa9
more
josh146 Aug 19, 2021
1e4ef3c
more
josh146 Aug 19, 2021
4e21821
Merge branch 'master' into custom_gradient
josh146 Aug 23, 2021
10e91c5
more
josh146 Aug 23, 2021
a5045e3
more
josh146 Aug 23, 2021
cb0cc93
more
josh146 Aug 23, 2021
e69c29c
more
josh146 Aug 23, 2021
26acd1f
Merge branch 'master' into custom_gradient
josh146 Aug 23, 2021
05a9201
black
josh146 Aug 23, 2021
50b4a67
fix
josh146 Aug 23, 2021
5fa87d8
tests
josh146 Aug 23, 2021
c1b9a5a
more tests
josh146 Aug 23, 2021
3766c45
more tests
josh146 Aug 24, 2021
54697af
suggested changes
josh146 Aug 24, 2021
903bc93
Merge branch 'master' into custom_gradient
josh146 Aug 24, 2021
5f3aadd
suggested changes
josh146 Aug 24, 2021
d7c1d9d
suggested changes
josh146 Aug 24, 2021
bddc1c6
suggested changes
josh146 Aug 24, 2021
4a283d9
Apply suggestions from code review
josh146 Aug 25, 2021
3a904aa
suggested changes
josh146 Aug 25, 2021
403ae4d
Update pennylane/transforms/batch_transform.py
josh146 Aug 25, 2021
f4028ea
update changelog
josh146 Aug 25, 2021
0ccb881
Merge branch 'custom_gradient' of github.com:PennyLaneAI/pennylane in…
josh146 Aug 25, 2021
f168130
Added custom gradient transform decorator
josh146 Aug 25, 2021
23a3c63
changelog
josh146 Aug 25, 2021
c73a67f
typo
josh146 Aug 25, 2021
355af2e
adding test file to repo
josh146 Aug 25, 2021
92e29be
linting
josh146 Aug 25, 2021
f47a7c7
merge master
josh146 Aug 26, 2021
361a8eb
fix
josh146 Aug 26, 2021
1548b5d
update
josh146 Aug 26, 2021
b4e49ac
update
josh146 Aug 26, 2021
73cb746
Apply suggestions from code review
josh146 Aug 26, 2021
e0282d3
merge master
josh146 Aug 27, 2021
9199679
Merge branch 'master' into gradient-transform
josh146 Aug 29, 2021
ad433de
fixes
josh146 Aug 29, 2021
a18a978
more tests
josh146 Aug 29, 2021
011bf78
more tests
josh146 Aug 29, 2021
1a4b48d
Merge branch 'master' into gradient-transform
josh146 Aug 30, 2021
6ba19e1
Merge branch 'master' into gradient-transform
anthayes92 Aug 30, 2021
1ebe252
Apply suggestions from code review
josh146 Aug 31, 2021
eeeefca
Update pennylane/gradients/gradient_transform.py
josh146 Aug 31, 2021
a098654
suggested changes
josh146 Aug 31, 2021
a47d582
Merge branch 'gradient-transform' of github.com:PennyLaneAI/pennylane…
josh146 Aug 31, 2021
46 changes: 46 additions & 0 deletions .github/CHANGELOG.md
@@ -2,6 +2,52 @@

<h3>New features since last release</h3>

* Custom gradient transforms can now be created using the new
`@qml.gradients.gradient_transform` decorator on a batch-tape transform.
[(#1589)](https://github.com/PennyLaneAI/pennylane/pull/1589)

Quantum gradient transforms are a specific case of `qml.batch_transform`.
To create a quantum gradient transform, simply write a function that accepts a tape,
and returns a batch of tapes to be independently executed on a quantum device, alongside
a post-processing function that processes the tape results into the gradient.

Furthermore, a smart default expansion function is provided, which automatically expands tape
operations that are not differentiable before the quantum gradient is applied.
All gradient transforms in `qml.gradients` are now decorated with this decorator.

Supported gradient transforms must be of the following form:

```python
@qml.gradients.gradient_transform
def my_custom_gradient(tape, argnum=None, **kwargs):
...
return gradient_tapes, processing_fn
```

Various built-in quantum gradient transforms are provided within the
`qml.gradients` module, including `qml.gradients.param_shift`.
Once defined, quantum gradient transforms can be applied directly
to QNodes:

```pycon
>>> @qml.qnode(dev)
... def circuit(x):
... qml.RX(x, wires=0)
... qml.CNOT(wires=[0, 1])
... return qml.expval(qml.PauliZ(0))
>>> circuit(0.3)
tensor(0.95533649, requires_grad=True)
>>> qml.gradients.param_shift(circuit)(0.5)
array([[-0.47942554]])
```

Quantum gradient transforms are fully differentiable, allowing higher-order derivatives to be
accessed:

```pycon
>>> qml.grad(qml.gradients.param_shift(circuit))(0.5)
tensor(-0.87758256, requires_grad=True)
```
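
As a concrete illustration of the pattern above, a minimal custom two-term parameter-shift
transform might look as follows. This is a sketch only: it re-uses the
`qml.gradients.generate_shifted_tapes` helper, assumes every trainable tape parameter obeys
the standard two-term shift rule, and simplifies the shape convention of the returned Jacobian.

```python
import pennylane as qml
from pennylane import numpy as np

@qml.gradients.gradient_transform
def my_param_shift(tape, argnum=None, shift=np.pi / 2):
    """Sketch of a two-term parameter-shift gradient transform."""
    # argnum is accepted for API compatibility but not used in this sketch
    gradient_tapes = []

    for idx, _ in enumerate(tape.trainable_params):
        # one forward-shifted and one backward-shifted tape per trainable parameter
        gradient_tapes.extend(
            qml.gradients.generate_shifted_tapes(tape, idx, [shift, -shift])
        )

    def processing_fn(results):
        grads = []
        for i, _ in enumerate(tape.trainable_params):
            # two-term rule: df/dx = [f(x + s) - f(x - s)] / (2 sin s)
            grads.append((results[2 * i] - results[2 * i + 1]) / (2 * np.sin(shift)))
        # parameters are placed along the final axis of the returned Jacobian
        return qml.math.T(qml.math.stack(grads))

    return gradient_tapes, processing_fn
```

Once decorated, `my_param_shift` can be applied to tapes or QNodes in the same way as the
`qml.gradients.param_shift` example above.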

* A new PyTorch device, `qml.device('default.qubit.torch', wires=wires)`, supports
backpropagation with the torch interface.
9 changes: 8 additions & 1 deletion pennylane/_grad.py
@@ -180,7 +180,14 @@ def _jacobian_function(*args, **kwargs):
if len(argnum) == 1:
return _jacobian(func, argnum[0])(*args, **kwargs)

return np.stack([_jacobian(func, arg)(*args, **kwargs) for arg in argnum]).T
jacobians = [_jacobian(func, arg)(*args, **kwargs) for arg in argnum]

try:
return np.stack(jacobians).T
except ValueError:
# The Jacobian of each argument is a different shape and cannot
# be stacked; simply return the tuple of argument Jacobians.
return tuple(jacobians)

return _jacobian_function

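The tuple fallback above covers the case where the QNode arguments have different sizes, so
their individual Jacobians cannot be stacked into a single array. A minimal NumPy sketch of
that behaviour (the array shapes here are hypothetical):

```python
import numpy as np

# Hypothetical per-argument Jacobians: the two arguments have different sizes,
# so their Jacobians have different shapes.
jacobians = [np.zeros((2, 3)), np.zeros((2, 5))]

try:
    result = np.stack(jacobians).T
except ValueError:
    # np.stack requires identical shapes; fall back to a tuple of Jacobians
    result = tuple(jacobians)
```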
1 change: 1 addition & 0 deletions pennylane/gradients/__init__.py
@@ -18,6 +18,7 @@
from . import parameter_shift
from . import parameter_shift_cv

from .gradient_transform import gradient_transform
from .finite_difference import finite_diff, finite_diff_coeffs, generate_shifted_tapes
from .parameter_shift import param_shift
from .parameter_shift_cv import param_shift_cv
3 changes: 3 additions & 0 deletions pennylane/gradients/finite_difference.py
@@ -23,6 +23,8 @@

import pennylane as qml

from .gradient_transform import gradient_transform


@functools.lru_cache(maxsize=None)
def finite_diff_coeffs(n, approx_order, strategy):
@@ -179,6 +181,7 @@ def generate_shifted_tapes(tape, idx, shifts, multipliers=None):
return tapes


@gradient_transform
def finite_diff(tape, argnum=None, h=1e-7, approx_order=1, n=1, strategy="forward", f0=None):
r"""Generate the finite-difference tapes and postprocessing methods required
to compute the gradient of the tape output with respect to a gate parameter.
179 changes: 179 additions & 0 deletions pennylane/gradients/gradient_transform.py
@@ -0,0 +1,179 @@
# Copyright 2018-2021 Xanadu Quantum Technologies Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

# http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""This module contains utilities for defining custom gradient transforms,
including a decorator for specifying gradient expansions."""
# pylint: disable=too-few-public-methods
import pennylane as qml


unsupported_op = lambda op: op.grad_method is None
supported_op = lambda op: op.grad_method is not None
trainable_op = lambda op: any(qml.math.requires_grad(p) for p in op.parameters)


def gradient_expand(tape, depth=10):
"""Expand out a tape so that it supports differentiation
of requested operations.

This is achieved by decomposing all trainable operations that have
``Operation.grad_method=None`` until all resulting operations
have a defined gradient method, up to maximum depth ``depth``. Note that this
might not be possible, in which case the gradient rule will fail to apply.

Args:
tape (.QuantumTape): the input tape to expand
depth (int) : the maximum expansion depth

Returns:
.QuantumTape: the expanded tape
"""

# check if the tape contains unsupported trainable operations
if any(unsupported_op(op) and trainable_op(op) for op in tape.operations):

# Define the stopping condition for the expansion
stop_cond = lambda obj: (
not isinstance(obj, qml.measure.MeasurementProcess)
and ((supported_op(obj) and trainable_op(obj)) or not trainable_op(obj))
)

return tape.expand(depth=depth, stop_at=stop_cond)

return tape


class gradient_transform(qml.batch_transform):
"""Decorator for defining quantum gradient transforms.

Quantum gradient transforms are a specific case of :class:`~.batch_transform`.
All quantum gradient transforms accept a tape, and output
a batch of tapes to be independently executed on a quantum device, alongside
a post-processing function that processes the evaluated tape results into the gradient.

Args:
expand_fn (function): An expansion function (if required) to be applied to the
input tape before the gradient computation takes place. If not provided,
the default expansion function simply expands all operations that
have ``Operation.grad_method=None`` until all resulting operations
have a defined gradient method.
differentiable (bool): Specifies whether the gradient transform is differentiable or
not. A transform may be non-differentiable if it does not use an
autodiff framework for its tensor manipulations. In such a case, setting
``differentiable=False`` instructs the decorator
to mark the output as 'constant', reducing potential overhead.
hybrid (bool): Specifies whether classical processing inside a QNode
should be taken into account when transforming a QNode.

- If ``True`` and classical processing is detected, the Jacobian of the classical
processing will be computed and included. When evaluated, the returned Jacobian will
be with respect to the QNode arguments.
Contributor: What is the classical processing used for in the case where it is present?

Member Author (josh146): This is a good question :)

Recall that gradient transforms are of the form `tape -> processing_fn(execute(gradient_tapes))`. That is, they return the Jacobian of the quantum circuit output with respect to the gate arguments.

This is all fine and well-defined when applying a gradient transform to a tape.

However, when applying a gradient transform to a QNode, there is a subtlety: the arguments you pass to evaluate the transform are QNode arguments, which are not necessarily gate arguments! In fact, it is very easy to introduce classical processing in between the QNode arguments and the gate arguments without realizing it.

Consider the following:

```python
@qml.qnode(dev)
def circuit(weights):
    qml.RX(weights[1], wires=0)
    qml.RY(2 * weights[0] + weights[2], wires=0)
    return qml.probs(wires=0)
```

Here, we have added classical processing between the QNode arguments and the gate arguments by permuting the arguments and multiplying by a scalar. The processing function mapping QNode args -> gate args is:

C: (w0, w1, w2) -> (w1, 2*w0 + w2)

If we were to evaluate the gradient transform ignoring this classical processing, we would extract only the quantum Jacobian. This would be a (2, 2) matrix (2 gate arguments, 2 output dimensions), and would not be what the user expects at all.

Instead, the user would expect a (3, 2) output (3 QNode arguments, 2 output dimensions). So what we need to do is compute the ("classical") Jacobian of the function C above (which will be (3, 2)) and multiply it by the quantum Jacobian:

Jac = CJac @ QJac = (3, 2) @ (2, 2) = (3, 2)

Contributor: @josh146 thanks for the explanation, this helped me understand what is happening in the code below 💯

Member Author (josh146): No worries! Even just permuting the gate arguments with respect to the QNode arguments is non-trivial classical processing: the classical Jacobian will be a permutation matrix, not an identity matrix!

Member Author (josh146): This is the part of the code I was most worried about, hence all the different classical processing tests 😆

Member Author (josh146): @glassnotes this is also similar to the issue you had with the Fourier module, I believe? Except it is a bit harder to solve; simply knowing the Jacobian of the classical processing was insufficient in that case, since you needed the actual gate mapping function C.

Something maybe we could do with tracing 😄

Contributor: Thanks for elaborating on this @josh146, I've definitely used classical processing between the QNode arguments and gate arguments without realizing! I'll be keeping an eye out for this now.


- If ``False``, any internal QNode classical processing will be
**ignored**. When evaluated, the returned Jacobian will be with
respect to the **gate** arguments, and not the QNode arguments.

Supported gradient transforms must be of the following form:

.. code-block:: python

@gradient_transform
def my_custom_gradient(tape, argnum=None, **kwargs):
...
return gradient_tapes, processing_fn

where:

- ``tape`` (*QuantumTape*): the input quantum tape to compute the gradient of

- ``argnum`` (*int* or *list[int]* or *None*): Which trainable parameters of the tape
to differentiate with respect to. If not provided, the derivatives with respect to all
trainable inputs of the tape should be returned (``tape.trainable_params``).

- ``gradient_tapes`` (*List[QuantumTape]*): a list of output tapes to be evaluated.
If this list is empty, no quantum evaluations will be made.

- ``processing_fn`` is a processing function to be applied to the output of the evaluated
``gradient_tapes``. It should accept a list of numeric results with length ``len(gradient_tapes)``,
and return the Jacobian matrix.

Once defined, the quantum gradient transform can be used as follows:

>>> gradient_tapes, processing_fn = my_custom_gradient(tape, *gradient_kwargs)
>>> res = execute(gradient_tapes, dev, interface="autograd", gradient_fn=qml.gradients.param_shift)
>>> jacobian = processing_fn(res)

Alternatively, gradient transforms can be applied directly to QNodes,
in which case the execution is implicit:

>>> fn = my_custom_gradient(qnode, *gradient_kwargs)
>>> fn(weights) # transformed function takes the same arguments as the QNode
1.2629730888100839

.. note::

The input tape might have parameters of various types, including
NumPy arrays, JAX DeviceArrays, and TensorFlow and PyTorch tensors.

If the gradient transform is written in an autodiff-compatible manner, either by
using a framework such as Autograd or TensorFlow, or by using ``qml.math`` for
tensor manipulation, then higher-order derivatives will also be supported.

Alternatively, you may use the ``tape.unwrap()`` context manager to temporarily
convert all tape parameters to NumPy arrays and floats:

>>> with tape.unwrap():
... params = tape.get_parameters() # list of floats
"""

def __init__(self, transform_fn, expand_fn=gradient_expand, differentiable=True, hybrid=True):
self.hybrid = hybrid
super().__init__(transform_fn, expand_fn=expand_fn, differentiable=differentiable)

def qnode_execution_wrapper(self, qnode, targs, tkwargs):
# Here, we overwrite the QNode execution wrapper in order
# to take into account that classical processing may be present
# inside the QNode.
hybrid = tkwargs.pop("hybrid", self.hybrid)
_wrapper = super().qnode_execution_wrapper(qnode, targs, tkwargs)
cjac_fn = qml.transforms.classical_jacobian(qnode)

def jacobian_wrapper(*args, **kwargs):
qjac = _wrapper(*args, **kwargs)
cjac = cjac_fn(*args, **kwargs)

if any(m.return_type is qml.operation.Probability for m in qnode.qtape.measurements):
qjac = qml.math.squeeze(qjac)

if isinstance(cjac, tuple):
# Classical processing of multiple arguments is present. Return qjac @ cjac.
jacs = [
qml.math.squeeze(qml.math.tensordot(c, qjac, [[0], [-1]]))
for c in cjac
if c is not None
]
return jacs

is_square = cjac.shape == (1,) or (cjac.ndim == 2 and cjac.shape[0] == cjac.shape[1])

if not hybrid or (is_square and qml.math.allclose(cjac, qml.numpy.eye(cjac.shape[0]))):
# Classical Jacobian is the identity. No classical processing
# is present inside the QNode.
return qjac

# Classical processing of a single argument is present. Return qjac @ cjac.
jac = qml.math.squeeze(qml.math.tensordot(qml.math.T(cjac), qjac, [[-1], [-1]]))
return qml.math.T(jac)

return jacobian_wrapper
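
To make the `hybrid` logic above concrete, the classical Jacobian consumed by `jacobian_wrapper` can be inspected directly with `qml.transforms.classical_jacobian`. A small sketch (the device, wire count, and weights are illustrative), using the permuted-argument circuit from the review discussion:

```python
import pennylane as qml
from pennylane import numpy as np

dev = qml.device("default.qubit", wires=1)

@qml.qnode(dev)
def circuit(weights):
    # gate arguments are a permuted and scaled function of the QNode argument:
    # C: (w0, w1, w2) -> (w1, 2*w0 + w2)
    qml.RX(weights[1], wires=0)
    qml.RY(2 * weights[0] + weights[2], wires=0)
    return qml.probs(wires=0)

weights = np.array([0.1, 0.2, 0.3], requires_grad=True)

# Jacobian of C; its nonzero entries encode the permutation and the factor of 2
cjac = qml.transforms.classical_jacobian(circuit)(weights)
print(cjac)
```

Contracting this classical Jacobian with the quantum Jacobian, as `jacobian_wrapper` does, yields derivatives with respect to the QNode arguments rather than the gate arguments.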
2 changes: 2 additions & 0 deletions pennylane/gradients/parameter_shift.py
@@ -20,6 +20,7 @@

import pennylane as qml

from .gradient_transform import gradient_transform
from .finite_difference import finite_diff, generate_shifted_tapes


@@ -349,6 +350,7 @@ def processing_fn(results):
return gradient_tapes, processing_fn


@gradient_transform
def param_shift(
tape, argnum=None, shift=np.pi / 2, gradient_recipes=None, fallback_fn=finite_diff, f0=None
):
2 changes: 2 additions & 0 deletions pennylane/gradients/parameter_shift_cv.py
@@ -23,6 +23,7 @@

import pennylane as qml

from .gradient_transform import gradient_transform
from .finite_difference import finite_diff, generate_shifted_tapes
from .parameter_shift import expval_param_shift, _get_operation_recipe, _process_gradient_recipe

@@ -460,6 +461,7 @@ def processing_fn(results):
return gradient_tapes, processing_fn


@gradient_transform
def param_shift_cv(
tape,
dev,
2 changes: 1 addition & 1 deletion pennylane/interfaces/batch/autograd.py
@@ -109,7 +109,7 @@ def _execute(
for i, r in enumerate(res):
res[i] = np.tensor(r)

if r.dtype == np.dtype("object"):
if res[i].dtype == np.dtype("object"):
# For backwards compatibility, we flatten ragged tape outputs
res[i] = np.hstack(r)

8 changes: 3 additions & 5 deletions pennylane/interfaces/batch/torch.py
@@ -95,16 +95,14 @@ def forward(ctx, kwargs, *parameters):  # pylint: disable=arguments-differ
break

for i, r in enumerate(res):
if r.dtype == np.dtype("object"):
if r.dtype is np.dtype("object"):
# For backwards compatibility, we flatten ragged tape outputs
r = np.hstack(r)

res[i] = torch.as_tensor(torch.from_numpy(r), device=ctx.torch_device)
res[i] = torch.as_tensor(r, device=ctx.torch_device)

if ctx.jacs:
ctx.jacs[i] = torch.as_tensor(
torch.from_numpy(ctx.jacs[i]), device=ctx.torch_device
)
ctx.jacs[i] = torch.as_tensor(ctx.jacs[i], device=ctx.torch_device)

return tuple(res)
