
Differentiable batch execute using TensorFlow #1542

Merged: 86 commits merged from batch-tensorflow into master on Aug 25, 2021

Conversation

@josh146 (Member) commented Aug 17, 2021

Context: This PR adds support for differentiable batch execution of circuits using TensorFlow following #1501 and #1508.

Description of the change:

This PR adds the following:

  • TensorFlow dispatch to the top-level qml.interfaces.batch.execute() function.

  • qml.interfaces.batch.tensorflow - a module containing a TensorFlow custom gradient function for dev.batch_execute (a rough sketch of the pattern is shown below).
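
As a rough, self-contained sketch of the pattern (an illustration, not the PR's actual implementation): numpy_execute below stands in for dev.batch_execute, and the vector-Jacobian product is supplied via the parameter-shift rule. tf.custom_gradient keeps TensorFlow's autodiff connected across the Tensor -> NumPy boundary:

import numpy as np
import tensorflow as tf

def numpy_execute(x):
    # stands in for dev.batch_execute: TF cannot differentiate through this
    return np.cos(x)

@tf.custom_gradient
def execute(params):
    res = tf.convert_to_tensor(numpy_execute(params.numpy()))

    def grad_fn(dy):
        # manually supply the VJP; in the PR, this is where the
        # parameter-shift rule or a device gradient is evaluated
        shift = np.pi / 2
        jac = (numpy_execute(params.numpy() + shift)
               - numpy_execute(params.numpy() - shift)) / 2
        return dy * tf.convert_to_tensor(jac)

    return res, grad_fn

x = tf.constant([0.1, 0.2], dtype=tf.float64)

with tf.GradientTape() as t:
    t.watch(x)
    loss = tf.reduce_sum(execute(x))

print(t.gradient(loss, x))  # matches -sin(x)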

Benefits:

  • Execute tapes in a batch, with the output remaining differentiable.

  • Supports both gradient transforms and device execution methods.

  • Since gradient transforms are themselves differentiable, nth-order derivatives are supported. Compared to PL master, this allows nth-order derivatives of everything, including expval, var, probs, tensor products, non-two-term shifts, etc.

Example:

import tensorflow as tf

import pennylane as qml
from pennylane.interfaces.batch import execute

params = tf.Variable([0.1, 0.2, 0.3], dtype=tf.float64)
x = tf.Variable([0.5], dtype=tf.float64)

with tf.GradientTape() as t1:
    with tf.GradientTape() as t2:

        with qml.tape.JacobianTape() as tape1:
            qml.RX(params[0], wires=0)
            qml.RY(params[1], wires=0)
            qml.expval(qml.PauliZ(0))

        with qml.tape.JacobianTape() as tape2:
            qml.RX(params[2], wires=0)
            qml.CRY(x[0], wires=[0, 1])
            qml.CNOT(wires=[0, 1])
            qml.probs(wires=[1])

        tapes = [tape1, tape2]

        # execute both tapes in a batch on the given device
        dev = qml.device("lightning.qubit", wires=2)
        res = execute(tapes, dev, gradient_fn=qml.gradients.param_shift, interface="tf")

        loss = res[0][0] + res[1][0, 0] - res[1][0, 1]

    # first derivatives via the inner tape
    grad = t2.gradient(loss, [params, x])

# second derivatives (Hessian entries) via the outer tape
hess = t1.jacobian(grad[0], params)

print("Loss:", loss)
print("Gradient:", grad)
print("Hessian:", hess)

gives

Loss: tf.Tensor(1.9332406126165342, shape=(), dtype=float64)

Gradient: [<tf.Tensor: shape=(3,), dtype=float64, numpy=array([-0.0978434 , -0.19767681, -0.27743179])>, <tf.Tensor: shape=(1,), dtype=float64, numpy=array([0.01070641])>]

Hessian: tf.Tensor(
[[-0.97517033  0.01983384  0.        ]
 [ 0.01983384 -0.97517033  0.        ]
 [ 0.          0.         -0.89686157]], shape=(3, 3), dtype=float64)

Potential drawbacks:

  • In #1493 (Add a simple API for transforms that generate multiple tapes), work is underway to create a standardized API for gradient transforms. Until then, we simply assume that any gradient_fn within the pennylane.gradients module is a transform.

  • All gradient transforms and all device gradients (e.g., adjoint) are supported. The reversible method, however, is not currently supported, since it is neither a transform nor a device method.

  • 'Jacobians of Jacobians' (i.e., Hessians of vector-valued cost functions) are not supported out of the box, because TensorFlow attempts to autograph/JIT the Jacobian computation in order to parallelize it. However, you can't JIT a function that converts Tensors -> NumPy! E.g.,

    import tensorflow as tf
    
    import pennylane as qml
    from pennylane.interfaces.batch import execute
    
    params = tf.Variable([0.1, 0.2], dtype=tf.float64)
    
    with tf.GradientTape() as t1:
        with tf.GradientTape() as t2:
    
            with qml.tape.JacobianTape() as tape:
                qml.RX(params[0], wires=0)
                qml.CRY(params[1], wires=[0, 1])
                qml.CNOT(wires=[0, 1])
                qml.probs(wires=[1])
    
            # execute the tape on the given device
            dev = qml.device("lightning.qubit", wires=2)
            res = execute([tape], dev, gradient_fn=qml.gradients.param_shift, interface="tf")
            res = tf.stack(res)
    
        grad = t2.jacobian(res, params)
    
    hess = t1.jacobian(grad, params)
    
    Running this fails while TF traces the Jacobian computation; the tail of the traceback is:

    /home/josh/xanadu/pennylane/pennylane/interfaces/batch/tensorflow.py:62 execute
        [i.numpy() if isinstance(i, (tf.Variable, tf.Tensor)) else i for i in params]
    /home/josh/xanadu/pennylane/pennylane/interfaces/batch/tensorflow.py:62 <listcomp>
        [i.numpy() if isinstance(i, (tf.Variable, tf.Tensor)) else i for i in params]
    /home/josh/miniconda3/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:401 __getattr__
        self.__getattribute__(name)

    This is resolved by specifying experimental_use_pfor=False when computing the Jacobian.
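
    For example, in the snippet above:

    # opt out of TF's vectorized (pfor) Jacobian computation
    grad = t2.jacobian(res, params, experimental_use_pfor=False)

    hess = t1.jacobian(grad, params, experimental_use_pfor=False)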

Issues: n/a

@glassnotes (Contributor) left a comment

Just a very quick review with a couple of comments for now. I tried executing one of the tests using the GPU and everything looks okay; the logger showed that all the operations were being run on it 🎉

(Resolved review threads on pennylane/tape/unwrap.py and tests/interfaces/test_batch_tensorflow.py)
josh146 and others added 2 commits August 20, 2021
Co-authored-by: Olivia Di Matteo <2068515+glassnotes@users.noreply.github.com>
Base automatically changed from autograd-caching to master August 20, 2021
@glassnotes (Contributor) left a comment

@josh146 good to go, just caught a couple of copy-paste errors in the docstrings.

@glassnotes (Contributor) commented on the following snippet:

# corresponding element of the VJP will be zero,
# and we can avoid a quantum computation.
return [], lambda _: math.convert_like(np.zeros([num_params]), dy)
except AttributeError:

In what situations would allclose cause an attribute error?

@josh146 (Member, Author) replied:

Oh, so this is really annoying. Newer versions of TF will attempt to vectorize Jacobian computations by default, and as part of this vectorization process, they trace the cost function. The issue is:

  • You can't vectorize a quantum execution, even though TF tries 😆 So it's a very pointless step that does nothing but add overhead.

  • During tracing, TF sends proxy variables that have no value; e.g., you can't call dy.numpy() (since the value doesn't exist yet).

The second bullet point is the cause of the AttributeError: math.allclose calls dy.numpy(), which doesn't exist in vectorized mode.

I attempted to rewrite math.allclose() to directly implement tf.abs(a - b) <= atol + b * rtol as per the definition, but then ran into another error: TF was complaining that a proxy variable cannot be used in a Python conditional 🙁
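
As a minimal demonstration of that second failure (an illustrative assumption: this uses plain tf.function tracing rather than the pfor tracer, but it trips over the same restriction):

import tensorflow as tf

@tf.function(autograph=False)
def naive_allclose(a, b, rtol=1e-5, atol=1e-8):
    close = tf.reduce_all(tf.abs(a - b) <= atol + tf.abs(b) * rtol)
    if close:  # a traced Tensor has no concrete boolean value here
        return tf.constant(True)
    return tf.constant(False)

# raises OperatorNotAllowedInGraphError: a symbolic tf.Tensor cannot be
# used as a Python bool during tracing
naive_allclose(tf.constant([1.0]), tf.constant([1.0]))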

@josh146 (Member, Author) added:

Basically: we need the vectorization to 'work' from TF's perspective, even though it has no effect.

(Resolved review threads on tests/interfaces/test_batch_tensorflow.py)
josh146 and others added 3 commits August 20, 2021 23:58
Co-authored-by: Olivia Di Matteo <2068515+glassnotes@users.noreply.github.com>
Review thread on pennylane/interfaces/batch/tensorflow.py, quoting the docstring:
tapes (Sequence[.QuantumTape]): batch of tapes to execute
device (.Device): Device to use to execute the batch of tapes.
If the device does not provide a ``batch_execute`` method,
by default the tapes will be executed in serial.
@glassnotes (Contributor) commented:

Is the user somehow notified when a device executes in series?

@josh146 (Member, Author) replied:

It depends on the device 🙂 Currently, no: serial execution is always the fallback when batch (parallel) execution isn't available.
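
In other words, a schematic sketch of the fallback described in the quoted docstring (run_batch is a hypothetical helper for illustration, not the PR's actual code):

def run_batch(dev, tapes):
    # devices that implement batch_execute may parallelize internally
    if hasattr(dev, "batch_execute"):
        return dev.batch_execute(tapes)
    # otherwise, fall back to executing the tapes one at a time,
    # without notifying the user
    return [dev.execute(tape) for tape in tapes]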

(Resolved review threads on pennylane/interfaces/batch/tensorflow.py and pennylane/tape/unwrap.py)
@anthayes92 (Contributor) commented:

This looks really cool @josh146. I have an objective to benchmark and provide user feedback on new features, and this looks like a great candidate for that. It would be good to discuss some key test cases.

@josh146 josh146 merged commit 97a4b97 into master Aug 25, 2021
@josh146 josh146 deleted the batch-tensorflow branch August 25, 2021 17:05
Labels: review-ready 👌 (PRs which are ready for review by someone from the core team)

5 participants