Differentiable batch execute using TensorFlow #1542
Conversation
Co-authored-by: Nathan Killoran <co9olguy@users.noreply.github.com>
Co-authored-by: Tom Bromley <49409390+trbromley@users.noreply.github.com>
Just a very quick review with a couple of comments for now, but I tried executing one of the tests on the GPU and everything looks okay: the logger showed that all the operations were being run on it 🎉
Co-authored-by: Olivia Di Matteo <2068515+glassnotes@users.noreply.github.com>
Co-authored-by: Olivia Di Matteo <2068515+glassnotes@users.noreply.github.com>
@josh146 good to go, just caught a couple copy-paste errors in the docstrings.
```python
# corresponding element of the VJP will be zero,
# and we can avoid a quantum computation.
return [], lambda _: math.convert_like(np.zeros([num_params]), dy)
except AttributeError:
```
In what situations would `allclose` cause an attribute error?
Oh, so this is really annoying. Newer versions of TF will attempt to vectorize Jacobian computations by default, and as part of this vectorization process, they trace the cost function. The issue is:

- You can't vectorize a quantum execution, even though TF tries 😆 So it's a very pointless step that does nothing but add overhead.
- During tracing, TF sends proxy variables that have no value; e.g., you can't call `dy.numpy()` (since the value doesn't exist yet).

The second bullet point is the cause of the attribute error: `math.allclose` calls `dy.numpy()`, which doesn't exist in vectorized mode.

I attempted to rewrite `math.allclose()` to directly implement `tf.abs(a - b) <= atol + b * rtol` as per the definition, but then ran into another error: TF complained that a proxy variable cannot be used in a Python conditional 🙁
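This try/except fallback pattern can be sketched with plain Python/NumPy mocks in place of TensorFlow tensors (`ProxyTensor`, `EagerTensor`, and `vjp_shortcut` are hypothetical names used only for illustration, not PennyLane API):

```python
import numpy as np

class ProxyTensor:
    """Stand-in for a TF tracing-time tensor: it has no concrete
    value yet, so it deliberately lacks a .numpy() method."""

class EagerTensor:
    """Stand-in for an eager TF tensor, which does have .numpy()."""
    def __init__(self, value):
        self._value = np.asarray(value)

    def numpy(self):
        return self._value

def vjp_shortcut(dy, num_params):
    # Try the value-based shortcut: if dy is all zeros, the VJP is
    # zero and the quantum execution can be skipped entirely.
    try:
        if np.allclose(dy.numpy(), 0.0):
            return "shortcut: zero VJP, skip quantum execution"
    except AttributeError:
        # dy is a valueless proxy (TF is tracing); fall through.
        pass
    return "fallback: perform the full computation"

print(vjp_shortcut(ProxyTensor(), 3))             # fallback path
print(vjp_shortcut(EagerTensor(np.zeros(3)), 3))  # shortcut path
```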
Basically: we need the vectorization to 'work' from TF's perspective, even though it has no effect.
Co-authored-by: Olivia Di Matteo <2068515+glassnotes@users.noreply.github.com>
```python
tapes (Sequence[.QuantumTape]): batch of tapes to execute
device (.Device): Device to use to execute the batch of tapes.
    If the device does not provide a ``batch_execute`` method,
    by default the tapes will be executed in serial.
```
Is the user somehow notified when a device executes in series?
It depends on the device 🙂 Currently, no: serial execution is simply the fallback when parallel (batch) execution is unavailable.
This looks really cool @josh146. I have an objective to benchmark and provide user feedback on new features, and this looks like a great candidate for that. Would be good to discuss some key test cases.
Context: This PR adds support for differentiable batch execution of circuits using TensorFlow following #1501 and #1508.
Description of the change:
This PR adds the following:

- TensorFlow dispatch to the top-level `qml.interfaces.batch.execute()` function.
- `qml.interfaces.batch.tensorflow`: a module containing a TensorFlow custom gradient function for `dev.batch_execute`.

Benefits:

- Execute tapes in a batch, with the output remaining differentiable.
- Supports gradient transforms and device execution methods.
- Since gradient transforms are themselves differentiable, nth-order derivatives are supported. Compared to PL master, this allows nth-order derivatives of everything, including expval, var, probs, tensor products, non-two-term shifts, etc.
Example:
gives
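Since the original code example and its output were not captured in this excerpt, here is a minimal stand-alone sketch of the forward/VJP pattern such a differentiable batch execution relies on, with plain NumPy in place of TensorFlow and a real device (`batch_execute` and `batch_execute_vjp` are illustrative names, not the actual PennyLane API):

```python
import numpy as np

def batch_execute(params_batch):
    # Stand-in for dev.batch_execute: each "tape" here just
    # evaluates cos(p) for its parameter.
    return [np.cos(p) for p in params_batch]

def batch_execute_vjp(params_batch, dys):
    # The custom gradient registered with TF would compute a
    # vector-Jacobian product; here d/dp cos(p) = -sin(p), scaled by dy.
    return [dy * (-np.sin(p)) for p, dy in zip(params_batch, dys)]

params = [0.0, np.pi / 2]
results = batch_execute(params)
grads = batch_execute_vjp(params, dys=[1.0, 1.0])
print(results)  # [1.0, ~6.1e-17]
print(grads)    # [-0.0, -1.0]
```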
Potential drawbacks:
- In "Add a simple API for transforms that generate multiple tapes" (#1493), work is being done to create a standardized API for gradient transforms. Until then, we simply assume that any `gradient_fn` within the `pennylane.gradients` module is a transform.
- All gradient transforms and all device gradients (e.g., adjoint) are supported. The reversible method, however, is not currently supported, since it is neither a transform nor a device method.
- 'Jacobians of Jacobians' (or Hessians of vector-valued cost functions) are not supported out of the box, because TensorFlow attempts to autograph/JIT the Jacobian computation in order to parallelize it. However, you can't JIT a function that converts Tensors -> NumPy! This is resolved by specifying `experimental_use_pfor=False` when computing the Jacobian.

Issues: n/a