Add device and gradient expansions to the new batch-execution pipeline #1651
Changes from 63 commits
```diff
@@ -81,6 +81,7 @@
   `qml.beta.QNode`, and `@qml.beta.qnode`.
   [(#1642)](https://github.com/PennyLaneAI/pennylane/pull/1642)
   [(#1646)](https://github.com/PennyLaneAI/pennylane/pull/1646)
+  [(#1651)](https://github.com/PennyLaneAI/pennylane/pull/1651)

   It differs from the standard QNode in several ways:
```
```
@@ -106,21 +107,52 @@
```

  significant performance improvement when executing the QNode on remote
  quantum hardware.

  - When decomposing the circuit, the default decomposition strategy will prioritize
    decompositions that result in the smallest number of parametrized operations
    required to satisfy the differentiation method. Additional decompositions required
    to satisfy the native gate set of the quantum device will be performed later, by the
    device at execution time. While this may lead to a slight increase in classical processing,
    it significantly reduces the number of circuit evaluations needed to compute
    gradients of complex unitaries.

  In an upcoming release, this QNode will replace the existing one. If you come across any bugs
  while using this QNode, please let us know via a [bug
  report](https://github.com/PennyLaneAI/pennylane/issues/new?assignees=&labels=bug+%3Abug%3A&template=bug_report.yml&title=%5BBUG%5D)
  on our GitHub bug tracker.

  Currently, this beta QNode does not support the following features:

  - Circuit decompositions
  - Non-mutability via the `mutable` keyword argument
  - Viewing specifications with `qml.specs`
  - The `reversible` QNode differentiation method
  - The ability to specify a `dtype` when using PyTorch and TensorFlow

  It is also not tested with the `qml.qnn` module.
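The benefit of the gradient-first decomposition strategy can be made concrete with a back-of-the-envelope sketch. This is not PennyLane code; `shift_rule_evaluations` is a hypothetical helper, and the two-evaluations-per-parameter figure assumes the standard two-term parameter-shift rule:

```python
def shift_rule_evaluations(num_trainable_params, shifts_per_param=2):
    """Circuit evaluations needed for a two-term parameter-shift gradient."""
    return num_trainable_params * shifts_per_param

# A complex unitary with its own gradient rule may count as a single
# parametrized operation, while its device-level decomposition could
# contain many parametrized native gates.
gradient_first = shift_rule_evaluations(3)   # decompose only as far as the gradient needs
device_first = shift_rule_evaluations(30)    # fully decomposed to native gates first

assert gradient_first < device_first
```

Under these (illustrative) numbers, prioritizing the differentiation method's decomposition needs 6 gradient circuits instead of 60, at the cost of the device repeating its own expansion at execution time.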
* Two new methods were added to the Device API, allowing PennyLane devices
  increased control over circuit decompositions.
  [(#1651)](https://github.com/PennyLaneAI/pennylane/pull/1651)

  - `Device.expand_fn(tape) -> tape`: expands a tape such that it is supported by the device. By
    default, performs the standard device-specific gate set decomposition done in the default
    QNode. Devices may overwrite this method in order to define their own decomposition logic.

Review discussion:

> can a user overwrite this logic?

> Oh, this is a good point 🤔 At the moment yes, but it looks a bit hacky. You could do
> something like this:
>
> ```python
> >>> def my_custom_expand_fn(tape, **kwargs):
> ...     print("hello")
> ...     return tape
> >>> qnode.device.expand_fn = my_custom_expand_fn
> >>> qnode(0.5)
> hello
> tensor(0.87758256, requires_grad=True)
> ```
>
> Hmmm 🤔 Do you think this will be useful?

> @glassnotes I think @anthayes92 might be on to something re: custom decompositions....

> I mean, we could even support something like how you currently 'register' QNode execution
> wrappers while writing a batch transform:
>
> ```python
> dev = qml.device("default.qubit", wires=2)
>
> @dev.custom_expand
> def my_expansion_function(tape, **kwargs):
>     ...
>     return tape
>
> # from now on, the custom expansion is called whenever
> # the device is executed.
> ```
>
> This is more powerful (too powerful?) compared to a dictionary of gates to custom
> decompositions. But I still have some question marks:

> How would custom decompositions be specified in these cases? What if we did something like
> this, which combines a few of the ideas floating around:
>
> ```python
> custom_decomps = {qml.Hadamard: h_func, qml.CNOT: cnot_func}
>
> def custom_expand(tape, custom_decomps):
>     # applies custom decompositions
>     ...
>
> qnode.device.set_expand_fn(custom_expand)
> ```
>
> but where the
>
> ```python
> def set_expand_fn(custom_decomps):
>     for op, decomp in custom_decomps.items():
>         # Ensure all the operations in the decomposition are valid for the device
>         ...
>         # Register the new decompositions to the operators
>         if decomp_is_valid:
>             op.register_new_decomposition(decomp)
> ```
>
> If we do this kind of validation, it ensures that we can apply the expansion after the
> gradient tapes have already been constructed, but with the guarantee that they'll still run
> on the device.

> I really like @glassnotes' suggestion of having expansion after the gradient tapes have
> already been constructed! So would the logic here look like: if custom gates are unsupported
> then decompose to device native gates, so that this is where the guarantee they'll still run
> on the device comes from?
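The "overwrite `expand_fn`" pattern from the thread above can be sketched with a toy stand-in class. `ToyDevice` and its list-based "tapes" are hypothetical, not the real PennyLane `Device` API; the point is only that an instance attribute shadows the default method:

```python
class ToyDevice:
    """Stand-in for a device that exposes an overridable expand_fn."""

    def expand_fn(self, tape, **kwargs):
        # Default behaviour: return the tape unchanged.
        return tape

    def execute(self, tape):
        # The device always routes execution through expand_fn first.
        tape = self.expand_fn(tape)
        return sum(tape)  # toy "execution": just sum the entries


def my_custom_expand_fn(tape, **kwargs):
    # Custom expansion: drop zero entries (standing in for trivial gates).
    return [op for op in tape if op != 0]


dev = ToyDevice()
# Monkey-patch the instance, as in the review thread. Because this is an
# instance attribute, Python calls the plain function (no bound `self`).
dev.expand_fn = my_custom_expand_fn

result = dev.execute([1, 0, 2])  # expands to [1, 2], then sums to 3
```

A decorator-based `@dev.custom_expand` registration, as proposed later in the thread, would be a thin wrapper around exactly this attribute assignment.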
    Note that the numerical result after applying this method should remain unchanged; PennyLane
    will assume that the expanded tape returns exactly the same value as the original tape when
    executed.
  - `Device.batch_transform(tape) -> (tapes, processing_fn)`: pre-processes the tape in the case
    where the device needs to generate multiple circuits to execute from the input circuit. The
    requirement of a post-processing function makes this distinct from the `expand_fn` method
    above.

    By default, this method applies the transform

    .. math:: \left\langle \sum_i c_i h_i \right\rangle \rightarrow \sum_i c_i \left\langle h_i \right\rangle

    if `expval(H)` is present on devices that do not natively support Hamiltonians with
    non-commuting terms.
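The linearity this default transform relies on can be checked numerically. A small sketch with NumPy (an assumption of this example, not part of the changelog), using Pauli matrices as the Hamiltonian terms:

```python
import numpy as np

# Two non-commuting terms and their coefficients: H = 0.5 X - 0.3 Z
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
coeffs = [0.5, -0.3]
terms = [X, Z]

# A normalized test state |psi> = (|0> + |1>) / sqrt(2)
psi = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2)


def expval(op, state):
    """Expectation value <state|op|state> (real part)."""
    return float(np.real(state.conj() @ op @ state))


# <sum_i c_i h_i>, evaluated as a single observable ...
H = sum(c * h for c, h in zip(coeffs, terms))
lhs = expval(H, psi)

# ... equals sum_i c_i <h_i>: one expectation per term, one circuit each
rhs = sum(c * expval(h, psi) for c, h in zip(coeffs, terms))

assert np.isclose(lhs, rhs)
```

Each term on the right-hand side can be measured in its own circuit, which is why the device returns multiple tapes together with a `processing_fn` that recombines the results with the coefficients.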
<h3>Improvements</h3>

* The tests for qubit operations are split into multiple files.
```diff
@@ -48,7 +48,10 @@ def gradient_expand(tape, depth=10):
         and ((supported_op(obj) and trainable_op(obj)) or not trainable_op(obj))
     )

-        return tape.expand(depth=depth, stop_at=stop_cond)
+        new_tape = tape.expand(depth=depth, stop_at=stop_cond)
+        params = new_tape.get_parameters(trainable_only=False)
+        new_tape.trainable_params = qml.math.get_trainable_indices(params)
+        return new_tape

     return tape
```

Review discussion on lines +52 to +53:

> Turns out these two lines are required to solve a bug I discovered while writing the tests

> is this related to the …

> I really hesitate to say this, but I often don't fully understand the trainable_params
> setting 😬 Each autodiff framework is different, it is affected by expansion, by higher order
> derivatives, etc. So a lot of the current logic is simply guided by 'this causes the tests to
> pass, for all interfaces, for all QNode variations, for all order derivatives, for all
> differentiation methods'.

> This is how I feel any time I have to write interface tests 😓

> Oh ok, could this be something to think about addressing for the upcoming planning? Or is it
> the case that since it only affects devs and we can generally muddle through that it's lower
> priority

> we all feel it 😿
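A rough sketch of what "recompute the trainable indices after expansion" amounts to. This uses a toy `requires_grad` flag and a hypothetical `get_trainable_indices` analogue, standing in for `qml.math.get_trainable_indices`, which inspects framework-specific tensors:

```python
class Param:
    """Toy parameter carrying an autodiff-style trainability flag."""

    def __init__(self, value, requires_grad=False):
        self.value = value
        self.requires_grad = requires_grad


def get_trainable_indices(params):
    # Analogue of qml.math.get_trainable_indices for this toy type:
    # the set of positions whose parameters are marked trainable.
    return {i for i, p in enumerate(params) if p.requires_grad}


# Expansion can change the parameter list (e.g. one unitary's parameters
# are replaced by those of its decomposition), so the trainable indices
# must be recomputed from the *new* tape's parameters.
expanded_params = [Param(0.1, True), Param(0.2, False), Param(0.3, True)]
trainable = get_trainable_indices(expanded_params)  # {0, 2}
```

Without this step, the expanded tape would inherit trainable indices that refer to positions in the old parameter list, which is consistent with the bug described in the thread.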
```diff
@@ -174,7 +174,7 @@ def grad_fn(dy):

     # Generate and execute the required gradient tapes
     if _n == max_diff:
-        with qml.tape.Unwrap(*tapes):
+        with qml.tape.Unwrap(*tapes, set_trainable=False):

             vjp_tapes, processing_fn = qml.gradients.batch_vjp(
                 tapes,
                 dy,
```

Review discussion:

> A bug I discovered. I think by now, almost all cases of … In a new PR, I will remove the …
Review discussion:

> when do decompositions generally occur? e.g. if I was running a simple optimisation with a PL
> circuit, at what level does this happen on hardware/simulator?

> - `expansion_strategy="device"`: decomposition happens in the QNode, during construction, by
>   querying the device for its supported gate set. This is beneficial in terms of overhead
>   (since the decomposition only happens once), but results in future quantum
>   transforms/compilations working with a potentially very big/deep circuit.
>
> - `expansion_strategy="gradient"`: decomposition happens in the QNode, during construction, by
>   querying the gradient transform. Typically, the decomposed circuit will not be as deep as
>   the device-decomposed one, since a lot of complex unitaries have gradient rules defined.
>   Later on, further decompositions may be required on the device to get the circuit down to
>   native gate sets. This is beneficial in terms of a reduction in quantum resources, at the
>   expense of moving the device decomposition down to every evaluation of the device (so
>   additional classical overhead).

> Seems to be yet another benefit for caching parametric circuits to reuse device translations.

> Yep, precisely 💯 I would even argue, this is only fully solved by parametric compilation.

> Followup question: suppose I do something like … (or alternatively the gradient strategy).
> When does decomposition happen currently vs. in this new PR w.r.t. the compilation transform?
> As you suggest @josh146, we would want compilation to happen before either the `device` or
> `gradient` strategy, so that compilation is acting on a smaller circuit rather than the
> full-depth expanded one, consequently leading to a smaller circuit that gets expanded /
> gradient transformed. (That said, it's possible that a decomposition leads to optimizations in
> the compilation pipeline that might not otherwise be recognized...)

> Thanks for the detailed explanation @josh146, that makes things very clear! The gradient
> transform continues to impress me!

> @glassnotes, correct me if wrong, but `compile()` is a qfunc transform, not a QNode transform?
> So the following order is needed: … and just based on the ordering, the `compile` transform
> would always occur prior to the QNode's expansions.

> Yes, you're 100% correct, my bad 🤦‍♀️
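The ordering conclusion above can be sketched with toy list-based "transforms". Both helpers are purely illustrative stand-ins (not PennyLane's `compile` or device expansion), with circuits modelled as lists of gate names:

```python
def compile_transform(ops):
    """Toy qfunc transform: cancel adjacent gate/inverse pairs."""
    out = []
    for op in ops:
        if out and (out[-1] == op + "^-1" or op == out[-1] + "^-1"):
            out.pop()  # adjacent inverses cancel
        else:
            out.append(op)
    return out


def device_expand(ops, decomps):
    """Toy device expansion: replace each op by its native decomposition."""
    return [g for op in ops for g in decomps.get(op, [op])]


# Hypothetical native decomposition of a big unitary "U"
decomps = {"U": ["RZ", "RY", "RZ"]}
circuit = ["U", "U^-1", "CNOT"]

# The qfunc transform runs first, on the small pre-expansion circuit ...
compiled = compile_transform(circuit)        # "U" and "U^-1" cancel
# ... and only then does the device expansion act on the compiled result.
expanded = device_expand(compiled, decomps)
```

Running expansion first would instead hand the compiler six `RZ`/`RY` gates whose cancellation is much harder to recognize, which is the cost the thread discusses.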