Add device and gradient expansions to the new batch-execution pipeline #1651

Merged (72 commits) on Sep 23, 2021

@josh146 josh146 commented Sep 14, 2021

Context: The beta QNode, introduced in #1642, uses the new batch-execution pipeline internally, but does not yet support decompositions of circuits.

This PR adds circuit decomposition support, but takes a different approach from the existing QNode.

  • The existing QNode, during construction, queries the device to see what operations it supports. It then expands the tape so as to only use operations native to the device. However, this leads to significant drawbacks, since this occurs prior to gradient rules being applied. In many cases, it is more efficient to apply the gradient rules first, and decompose down to device-supported gates at execution time.

For example, consider the DoubleExcitation operation. This operation decomposes down into eight parametrized RY gates, so the existing pipeline would require 16 evaluations to compute the gradient:

DoubleExcitation(theta) -> 8 RY(±theta) + 13 CNOT + 6 H ->[parameter-shift]--> 16 parameter-shift circuits

However, the DoubleExcitation operation supports a 4-term parameter-shift rule natively. Performing the device expansion later is therefore preferable:

DoubleExcitation(theta) -> [parameter-shift]--> 4 parameter-shift circuits -> decompose each down to RY, CNOT, H

Thus, in this PR, we add gradient-specific decomposition to the QNode construction step, and move device-specific expansions to the device.
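The evaluation counts quoted above follow from simple arithmetic. A minimal sketch, assuming every parametrized gate contributes one shifted circuit per term of its shift rule; the helper name `num_shifted_circuits` is illustrative and not part of PennyLane:

```python
def num_shifted_circuits(num_param_gates: int, terms_per_rule: int) -> int:
    """Shifted circuits needed when each gate uses a rule with this many terms."""
    return num_param_gates * terms_per_rule


# Decompose first: 8 RY gates, each differentiated with the two-term rule.
decompose_first = num_shifted_circuits(num_param_gates=8, terms_per_rule=2)

# Gradient first: a single DoubleExcitation gate with its four-term rule.
gradient_first = num_shifted_circuits(num_param_gates=1, terms_per_rule=4)

print(decompose_first, gradient_first)  # 16 4
```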

Description of the Change:

  • Two new methods were added to the Device API:

    • Device.expand_fn(tape) -> tape: expands a tape such that it is supported by the device. By default, performs the standard device-specific gate set decomposition done in the default QNode. Can be overwritten by the device. Note that the output is 1-1; the expanded tape returns exactly the same value as the original tape, no post-processing required.

    • Device.batch_transform(tape) -> tapes, processing_fn: pre-processes the tape in the case where the device needs to generate multiple circuits to execute from the input circuit. The requirement of a post-processing function makes this distinct from the expand_fn method above. By default, applies the transform

      expval(\sum_i c_i h_i) -> \sum_i c_i expval(h_i)
      

      for devices that do not natively support Hamiltonians with non-commuting terms.

  • At the end of QNode.construct(), we call gradient_fn.expand_fn(tape) to expand out the circuit so that all operations present have gradient rules defined. This only applies if a gradient transform is being used, and the gradient transform specifies the expansion logic. E.g., DoubleExcitation defines a gradient recipe for parameter-shift and so will be left as is, but StronglyEntanglingLayers doesn't, and so will be expanded.

  • Within QNode.__call__, prior to execute(tapes) being called, we apply the device's batch_transform.

  • qml.execute() is modified, to ensure that device.expand_fn(tape) is called before passing a tape onto a device for execution.

  • All templates have grad_method=None added, to specify that they do not have a gradient method. This will trigger a decomposition into operations that do have a gradient method. This change is required because grad_method="F" by default!! Which is a silly default :(

  • Unit tests have been added to tests/beta/test_beta_qnode.py, and integration tests added to tests/interfaces/batch/test_batch_interface_qnode.py.
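The ordering of the steps listed above can be illustrated with a toy model in plain Python. This is only a sketch under stated assumptions: operations are plain strings, the decomposition table is hypothetical and heavily abbreviated, and the names `expand`, `batch_transform`, `HAS_GRADIENT_RULE` and `DEVICE_NATIVE` are stand-ins rather than PennyLane's actual API.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Toy decomposition table (hypothetical and heavily abbreviated).
DECOMPOSITIONS: Dict[str, List[str]] = {
    "StronglyEntanglingLayers": ["Rot", "CNOT"],
    "Rot": ["RZ", "RY", "RZ"],
    "DoubleExcitation": ["RY", "CNOT", "H"],
}


def expand(ops: Sequence[str], stop_at: Set[str]) -> List[str]:
    """Recursively decompose every operation that is not in `stop_at`."""
    out: List[str] = []
    for op in ops:
        if op in stop_at:
            out.append(op)
        else:
            out.extend(expand(DECOMPOSITIONS[op], stop_at))
    return out


def batch_transform(
    coeffs: Sequence[float], observables: Sequence[str]
) -> Tuple[List[str], Callable[[Sequence[float]], float]]:
    """Split expval(sum_i c_i h_i) into one measurement per term."""
    measurements = list(observables)

    def processing_fn(results: Sequence[float]) -> float:
        # Recombine the per-term expectation values: sum_i c_i expval(h_i).
        return sum(c * r for c, r in zip(coeffs, results))

    return measurements, processing_fn


# Gates with a gradient rule (non-parametrized gates need none, so they stay too).
HAS_GRADIENT_RULE = {"DoubleExcitation", "Rot", "RY", "RZ", "CNOT", "H"}
DEVICE_NATIVE = {"RY", "RZ", "CNOT", "H"}

circuit = ["StronglyEntanglingLayers", "DoubleExcitation"]

# 1. End of QNode.construct(): expand only what lacks a gradient rule.
after_construct = expand(circuit, HAS_GRADIENT_RULE)

# 2. QNode.__call__: the device's batch transform splits the Hamiltonian.
measurements, processing_fn = batch_transform([0.5, 0.25], ["Z0", "Z0 Z1"])

# 3. execute(): device-specific expansion right before execution.
on_device = expand(after_construct, DEVICE_NATIVE)

# Stand-in for real device results, one expectation value per term.
fake_expvals = {"Z0": 1.0, "Z0 Z1": -1.0}
energy = processing_fn([fake_expvals[m] for m in measurements])

print(after_construct)
print(on_device)
print(energy)
```

In this toy run, DoubleExcitation survives construction untouched and is only broken down at the device stage, which is the behaviour the description argues for.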

Benefits:

  • The new QNode will now automatically decompose templates/operations not supported by the device.

  • If a template/operation has a gradient rule, but is not supported by the device, the gradient logic will be applied prior to decomposition, leading to a significant reduction in circuit evals. In particular, AllSingleDoubles will result in far fewer circuit executions.

  • Devices are now in control of tape expansion, and device developers can overwrite Device.expand_fn and Device.batch_transform for full control of circuit manipulation.

Possible Drawbacks:

  • I originally wanted to call the new method Device.expand(), but this is already taken by an existing device 🤦

  • Previously, circuit decomposition for the device gate set was done only once in QNode.construct(). However, now that expansion is moved to the device, it happens with every execution. While this has quantum savings with respect to gradient executions, it results in a classical overhead: the decompositions now happen every time a QNode is called.

  • The Operation.grad_method attribute is really old, and predates a lot of quantum gradient research. As a result, it is not flexible enough for our needs. The following options are allowed:

    • grad_method=None: this operation does not support gradients, attempt to decompose it
    • grad_method="F": this operation only supports finite-diff, using parameter-shift will raise an error
    • grad_method="A": this operation supports both parameter-shift and finite-diff

    However, we are missing an option for:

    • This operation supports finite-diff but not parameter-shift. Decompose it for parameter-shift support.

    This latter behaviour is needed for operations such as ApproxTimeEvolution(H, t, L). This operation can be differentiated using finite-differences to get a gradient with only 2 evaluations. For the parameter-shift, it can be decomposed, requiring O(2NL) (I think?) evaluations to get the gradient. However, there is no value of grad_method we can set that will 'unlock' this behaviour currently.
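The three existing options, and the gap described above, can be summarised as a small decision function. This is a hedged sketch: `handle_operation` and its return strings are hypothetical names chosen for illustration, not PennyLane code.

```python
from typing import Optional


def handle_operation(grad_method: Optional[str], diff_method: str) -> str:
    """Return what the pipeline does with an operation, per the options above."""
    if grad_method is None:
        # No gradient support: attempt to decompose the operation.
        return "decompose"
    if grad_method == "A":
        # Supports both parameter-shift and finite-diff.
        return "differentiate"
    if grad_method == "F":
        if diff_method == "finite-diff":
            return "differentiate"
        # The missing option would return "decompose" here instead.
        return "error"
    raise ValueError(f"unknown grad_method: {grad_method!r}")


for method in (None, "F", "A"):
    for diff in ("finite-diff", "parameter-shift"):
        print(method, diff, "->", handle_operation(method, diff))
```

An operation such as ApproxTimeEvolution would want the ("F", "parameter-shift") case to fall through to decomposition, which none of the current values expresses.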

Related GitHub Issues: n/a

Base automatically changed from batch-qnode-interfaces to master September 21, 2021 14:59
Contributor

@anthayes92 anthayes92 left a comment

I still need to go through the tests for interfaces from autograd onward.

@@ -106,21 +107,52 @@
significant performance improvement when executing the QNode on remote
quantum hardware.

- When decomposing the circuit, the default decomposition strategy will prioritize
Contributor

When do decompositions generally occur? E.g., if I was running a simple optimisation with a PL circuit, at what level does this happen on hardware/simulator?

Member Author

  • expansion_strategy="device": decomposition happens in the QNode, during construction, by querying the device for its supported gate set. This is beneficial in terms of overhead (since the decomposition only happens once), but results in future quantum transforms/compilations working with a potentially very big/deep circuit.

  • expansion_strategy="gradient": decomposition happens in the QNode, during construction, by querying the gradient transform. Typically, the decomposed circuit will not be as deep as the device-decomposed one, since a lot of complex unitaries have gradient rules defined. Later on, further decompositions may be required on the device to get the circuit down to native gate sets.

    This is beneficial in terms of a reduction in quantum resources, at the expense of moving the device decomposition down to every evaluation of the device (so additional classical overhead).
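The classical-overhead trade-off between the two strategies can be made concrete with a deliberately simplified counter. This is a toy sketch: `CountingDevice` and `count_device_expansions` are hypothetical, and the model only tracks when the device decomposition does real work under each strategy.

```python
class CountingDevice:
    """Toy device that records how often its expansion hook runs."""

    def __init__(self):
        self.expand_calls = 0

    def expand_fn(self, tape):
        self.expand_calls += 1
        return tape


def count_device_expansions(strategy: str, num_calls: int) -> int:
    """Count device decompositions over `num_calls` QNode evaluations."""
    dev = CountingDevice()
    tape = ["some_op"]
    if strategy == "device":
        # Decomposed once, while the QNode is constructed.
        tape = dev.expand_fn(tape)
    for _ in range(num_calls):
        if strategy == "gradient":
            # Device decomposition is deferred to every execution.
            dev.expand_fn(tape)
    return dev.expand_calls


print(count_device_expansions("device", 100))    # 1
print(count_device_expansions("gradient", 100))  # 100
```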

> This is beneficial in terms of a reduction in quantum resources, at the expense of moving the device decomposition down to every evaluation of the device (so additional classical overhead).

Seems to be yet another benefit for caching parametric circuits to reuse device translations.

Member Author

Yep, precisely 💯 I would even argue, this is only fully solved by parametric compilation.

Contributor

Followup question: suppose I do something like

@qml.compile()
@qml.qnode(dev, expansion_strategy="device")
def some_qnode():
    # stuff

(or alternatively the gradient strategy). When does decomposition happen currently vs. in this new PR w.r.t. the compilation transform? As you suggest @josh146 we would want compilation to happen before either the device or gradient strategy, so that compilation is acting on a smaller circuit rather than the full-depth expanded one, consequently leading to a smaller circuit that gets expanded / gradient transformed. (That said, it's possible that a decomposition leads to optimizations in the compilation pipeline that might not otherwise be recognized...)

Contributor

Thanks for the detailed explanation @josh146, that makes things very clear! The gradient transform continues to impress me!

Member Author

@josh146 josh146 Sep 23, 2021

@glassnotes, correct me if wrong, but compile() is a qfunc transform, not a QNode transform? So the following order is needed:

@qml.qnode(dev, expansion_strategy="device")
@qml.compile()
def some_qnode():
    # stuff

and just based on the ordering, the compile transform would always occur prior to the QNode's expansions.
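The ordering argument relies on how Python stacks decorators: the one closest to the function is applied first. A minimal sketch with stand-in decorators (`toy_compile` and `toy_qnode` are hypothetical and do nothing except record when they run):

```python
applied = []


def toy_compile(fn):
    """Stand-in for a qfunc transform such as compile()."""
    applied.append("compile")
    return fn


def toy_qnode(fn):
    """Stand-in for the QNode decorator."""
    applied.append("qnode")
    return fn


@toy_qnode
@toy_compile
def some_qnode():
    pass


print(applied)  # ['compile', 'qnode']
```

The QNode stand-in therefore receives a function that has already been through the compile step.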

Contributor

Yes you're 100% correct, my bad 🤦‍♀️


- `Device.expand_fn(tape) -> tape`: expands a tape such that it is supported by the device. By
default, performs the standard device-specific gate set decomposition done in the default
QNode. Devices may overwrite this method in order to define their own decomposition logic.
Contributor

can a user overwrite this logic?

Member Author

Oh, this is a good point 🤔

At the moment yes, but it looks a bit hacky. You could do something like this:

>>> def my_custom_expand_fn(tape, **kwargs):
...     print("hello")
...     return tape
>>> qnode.device.expand_fn = my_custom_expand_fn
>>> qnode(0.5)
hello
tensor(0.87758256, requires_grad=True)

Hmmm 🤔 Do you think this will be useful?

Member Author

@glassnotes I think @anthayes92 might be on to something re: custom decompositions....

Member Author

I mean, we could even support something like how you currently 'register' QNode execution wrappers while writing a batch transform

dev = qml.device("default.qubit", wires=2)

@dev.custom_expand
def my_expansion_function(tape, **kwargs):
    ...
    return tape

# from now on, the custom expansion is called whenever
# the device is executed.

This is more powerful (too powerful?) compared to a dictionary of gates to custom decompositions. But I still have some question marks:

  1. Should this replace the device expansion?

  2. If it doesn't, does it come before the device expansion? This way unsupported gates are finally decomposed down to device native gates. Or should it come after the device expansion? Execution would then fail if the custom decomp results in an unsupported gate.

  3. Rather than changing the device, does it make more sense to pass a custom decomposition to the QNode? E.g.,

    @qml.qnode(dev, expansion_strategy=my_custom_expansion)
    
    # or
    
    @existing_qnode.register_expansion
    def my_custom_expansion(...):

Contributor

> This is more powerful (too powerful?) compared to a dictionary of gates to custom decompositions.

How would custom decompositions be specified in these cases?

What if we did something like this, which combines a few of the ideas floating around:

custom_decomps = {qml.Hadamard : h_func, qml.CNOT : cnot_func}

def custom_expand(tape, custom_decomps):
    # applies custom decompositions
    ...

qnode.device.set_expand_fn(custom_expand)

but where the device itself does some sort of internal validation of the decompositions?

def set_expand_fn(custom_decomps):
    for op, decomp in custom_decomps.items():
        # Ensure all the operations in the decomposition are valid for the device
        ...
        # Register the new decompositions to the operators
        if decomp_is_valid:
            op.register_new_decomposition(decomp)

If we do this kind of validation, it ensures that we can apply the expansion after the gradient tapes have already been constructed, but with the guarantee that they'll still run on the device.
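The validation step proposed above could look roughly like the following. This is a sketch under loud assumptions: gates are plain strings, each custom decomposition is a list of gate names rather than a function, and `validate_custom_decomps` is a hypothetical helper, not an existing device method.

```python
from typing import Dict, List, Set, Tuple


def validate_custom_decomps(
    custom_decomps: Dict[str, List[str]], supported_ops: Set[str]
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Split custom decompositions into accepted and rejected ones."""
    valid: Dict[str, List[str]] = {}
    rejected: Dict[str, List[str]] = {}
    for op_name, decomp in custom_decomps.items():
        # Gates in this decomposition that the device cannot run.
        unsupported = [gate for gate in decomp if gate not in supported_ops]
        if unsupported:
            rejected[op_name] = unsupported
        else:
            valid[op_name] = decomp
    return valid, rejected


supported = {"RZ", "RY", "CNOT"}
custom = {"Hadamard": ["RZ", "RY"], "CNOT": ["H", "CZ", "H"]}

valid, rejected = validate_custom_decomps(custom, supported)
print(valid)
print(rejected)
```

Only the accepted entries would then be registered, so tapes expanded after gradient construction are still guaranteed to use device-supported gates.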

Contributor

I really like @glassnotes's suggestion of having expansion after the gradient tapes have already been constructed! So would the logic here look like: if custom gates are unsupported then decompose to device native gates, so that this is where the guarantee they'll still run on the device comes from?

Comment on lines +52 to +53
params = new_tape.get_parameters(trainable_only=False)
new_tape.trainable_params = qml.math.get_trainable_indices(params)
Contributor

is this related to the Unwrap issue?

Contributor

@glassnotes glassnotes left a comment

Left some initial questions, will give things some time to sink in and come back to it later 🙂


Comment on lines +52 to +53
params = new_tape.get_parameters(trainable_only=False)
new_tape.trainable_params = qml.math.get_trainable_indices(params)
Contributor

> So a lot of the current logic is simply guided by 'this causes the tests to pass, for all interfaces, for all QNode variations, for all order derivatives, for all differentiation methods'.

This is how I feel any time I have to write interface tests 😓

@@ -67,6 +67,7 @@ class StronglyEntanglingLayers(Operation):
num_params = 1
num_wires = AnyWires
par_domain = "A"
grad_method = None
Contributor

Re. your comment in the PR description about grad methods,

> This operation supports finite-diff but not parameter-shift. Decompose it for parameter-shift support.

Could we make this parameter an ordered list of grad methods, or tuples with grad methods and additional info? For example, (grad_method, requires_decomposition_to_do_grad_method)?

Member Author

@josh146 josh146 Sep 22, 2021

Yes definitely! The grad_method and grad_recipe attributes are long overdue for an overhaul. I believe they're on the agenda as part of the operator refactor.

num_params = 1
par_domain = "R"

def expand(self):
Contributor

Unrelated to this PR, but should all operations eventually have their decomposition method replaced by an expand like this?

Member Author

To be decided 🤔 Another task for the Operator refactor story 😆

Contributor

@anthayes92 anthayes92 left a comment

A few small suggestions, otherwise looking great!

josh146 and others added 5 commits September 22, 2021 22:41
Co-authored-by: Olivia Di Matteo <2068515+glassnotes@users.noreply.github.com>
Co-authored-by: anthayes92 <34694788+anthayes92@users.noreply.github.com>
Co-authored-by: anthayes92 <34694788+anthayes92@users.noreply.github.com>

@josh146 josh146 merged commit 324192d into master Sep 23, 2021
@josh146 josh146 deleted the batch-qnode-expand branch September 23, 2021 17:08
Labels
review-ready 👌 PRs which are ready for review by someone from the core team.