-
Notifications
You must be signed in to change notification settings - Fork 575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add adjoint differentiation method #1032
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @josh146! This device-based method is really nice! (sorry for this long train-of-thought comment)
What are the pros and cons of dev.rewind_jacobian()
vs. RewindTape
? Here it looks like we might be able to save a bit on QNode-level testing 🤔 Otherwise it seems to be a more philosophical question of when a Jacobian method should be associated to a device vs. a tape. For rewind, we require the device to be statevector-based and have some methods (e.g., _apply_unitary
) so it is somewhat device based, but general enough to be in QubitDevice
🤔 Maybe the only true device-independent methods are finite-diff and param-shift, which only require access to expectation values.
On the other hand, one downside of having it as a device-based method is that one cannot access a RewindTape
. Although, maybe this doesn't matter, since one could always do dev.rewind_jacobian(tape)
for any choice of tape. I find the border between tape and device can be a bit unclear at times 🤔
Overall, maybe we should just see which one is fastest! I don't see any reason for one to be faster than the other though. In which case let's just choose one and see if it causes any problems down the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise it seems to be a more philosophical question of when a Jacobian method should be associated to a device vs. a tape.
I definitely see this as more of a 'philosophical' design decision.
For rewind, we require the device to be statevector-based and have some methods (e.g., _apply_unitary) so it is somewhat device based, but general enough to be in QubitDevice 🤔 Maybe the only true device-independent methods are finite-diff and param-shift, which only require access to expectation values.
I definitely agree here. The way I see it:
-
Parameter-shift and finite-diff inspect the provided circuit, and create new circuits for gradient evaluations (that can be batch evaluated).
- There is no dependence on device-level control, it is fully device independent.
- There is also value to end users for being able to access/draw/manipulate the generated tapes.
-
Rewind could be written in the above form as well, but there are subtleties.
- It accesses the low-level device API (including current private methods of
default.qubit
). - It will likely be easier to 'cache' the forward pass evaluation than if we did it at the tape level
- It accesses the low-level device API (including current private methods of
The big one for me though: since we are modifying the state of a device, rather than generating device independent tapes for later evaluation, this feels much more like it should live on the device. Is there a use case for being able to generate rewind tapes independent of devices?
Note: regardless which approach we go with, we will need to make device._apply_operation
and device._apply_unitary
proper abstract methods of our device API in order to support external devices and not just default.qubit
.
On the other hand, one downside of having it as a device-based method is that one cannot access a
RewindTape
. Although, maybe this doesn't matter, since one could always dodev.rewind_jacobian(tape)
for any choice of tape. I find the border between tape and device can be a bit unclear at times 🤔
Think of it like this:
- The tape is simply a data-structure for representing a quantum circuit.
- A device accepts a tape, and executes it, returning numeric results.
If we need a serializable representation of the circuit, than it should be a tape. Otherwise, it doesn't need to be.
Overall, maybe we should just see which one is fastest! I don't see any reason for one to be faster than the other though. In which case let's just choose one and see if it causes any problems down the line.
Let's run ASV on it!
Comparison between the last commit on this branch (a426cca - left column) and the last commit for
Comparison between the last commit on this branch (a426cca - left column) and the last commit on
|
Co-authored-by: antalszava <antalszava@gmail.com>
That's a good point! However, are we sure that these are the methods we want to "make official" going forward? Would it be worth waiting in case we decide to do a more major refactor of the inner workings of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really great work! 🚀 Only a small number of comments/thoughts, but it seems to be working very well. Haven't looked through the tests yet, but I'll do that right away.
QNode._get_parameter_shift_tape(device), | ||
interface, | ||
device, | ||
{"method": "analytic"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between method
and jacobian_method
and why would both be needed?
Co-authored-by: Theodor <theodor@xanadu.ai>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @thisac!
Didn't really have any comments on the tests. I think they look great (although I'm a bit tired right now, so I might have been a bit quick on checking the final tests). 😆 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @trbromley, @albi3ro, et. al! I left comments and suggestions, but nothing that would block merging.
One thought that crossed my mind - if this is very specific to default.qubit, should this just be a method on default.qubit instead?
|
||
phi = self._reshape(self.state, [2] * self.num_wires) | ||
|
||
lambdas = [self._apply_operation(phi, obs) for obs in tape.observables] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just double checking, but we can't do the following:
new_tape = some_func(old_tape)
self.execute(new_tape)
because here we are applying hermitian matrices to the statevector, which the tape/device won't understand how to do without going low level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new_tape
for a specific obs
would look like:
with qml.QuantumTape() as new_tape:
qml.QubitStateVector(phi, wires=range(wires))
obs # actually not sure how this line would look
qml.state()
I don't see a problem if it's PauliX
but for an arbitrary Hermitian
this wouldn't (?) work. Maybe we could hack it in to work though, but we'd lose the state being normalized.
I also feel like the "many-tapes" approach might be inefficient, e.g., to make lots of tapes and then do device execution (which includes lots of postprocessing), when we just need to evolve the state by one gate.
tests/test_qubit_device.py
Outdated
grad_F = tape.jacobian(dev, method="numeric") | ||
grad_D = dev.adjoint_jacobian(tape) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the functional refactor, this will be a lot more consistent.
Co-authored-by: Josh Izaac <josh146@gmail.com>
… rewind-on-device
🚀 |
This method adds the
diff_method="adjoint"
method based on the paper here.