-
Notifications
You must be signed in to change notification settings - Fork 575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Hamiltonian observables on default.qubit #1551
Conversation
Hello. You may have forgotten to update the changelog!
|
@@ -468,8 +470,6 @@ def expval(self, observable, shot_range=None, bin_size=None): | |||
if self.shots is not None: | |||
raise DeviceError("SparseHamiltonian must be used with shots=None") | |||
|
|||
if observable.name == "SparseHamiltonian" and self.shots is None: | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cosmetic change - this line was obsolete.
@@ -232,7 +232,7 @@ def circuit(x): | |||
""" | |||
if not isinstance(op, (Observable, qml.Hamiltonian)): | |||
raise qml.QuantumFunctionError( | |||
"{} is not an observable or Hamiltonian: cannot be used with expval".format(op.name) | |||
"{} is not an observable: cannot be used with expval".format(op.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cosmetic: overlooked this when Hamiltonians became observables
pennylane/qnode.py
Outdated
hamiltonian_in_obs = "Hamiltonian" in [obs.name for obs in self.qtape.observables] | ||
if hamiltonian_in_obs and not supports_hamiltonian: | ||
if hamiltonian_in_obs and (does_not_support_hamiltonian or finite_shots): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until the device can preprocess tapes before entering the blackbox we always split if finite shots are asked for - else we have a situation that the device gets expval(H)
and shots=1000
, and it can only raise an error because splitting the tape would not be tracked by autodiff.
pennylane/qnode.py
Outdated
try: | ||
tapes, fn = qml.transforms.hamiltonian_expand(self.qtape, group=False) | ||
except ValueError as e: | ||
raise ValueError( | ||
"Only a single expectation of a Hamiltonian observable can be returned" | ||
"Only a (single) expectation of a Hamiltonian observable can be returned " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error message sounds strange if one has a single variance measurement.
Codecov Report
@@ Coverage Diff @@
## master #1551 +/- ##
=======================================
Coverage 99.17% 99.17%
=======================================
Files 192 193 +1
Lines 13844 13889 +45
=======================================
+ Hits 13730 13775 +45
Misses 114 114
Continue to review full report at Codecov.
|
…e into support_hamiltonian
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mariaschuld I am happy to approve this subject to minor changes in the tests, to avoid them being skipped!
I think it is a amazing improvement, and the code changes to allow it to work are all very clean 💯
@@ -163,7 +163,7 @@ def expand_tape(tape, depth=1, stop_at=None, expand_measurements=False): | |||
if tape._obs_sharing_wires: | |||
try: | |||
rotations, diag_obs = qml.grouping.diagonalize_qwc_pauli_words(tape._obs_sharing_wires) | |||
except ValueError as e: | |||
except (TypeError, ValueError) as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
however I imagine this is something we should be able to support on default.qubit? Simply reuse the same state for the different inner products?
I think this is doable once the device is in charge of decomps.
tests/ops/test_hamiltonian.py
Outdated
@@ -20,6 +20,13 @@ | |||
import pennylane as qml | |||
from pennylane import numpy as pnp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately I think the original design was intentional, to allow contributors to run the tests locally without requiring all of JAX/TF/Torch be installed!
@@ -1207,12 +1207,12 @@ def circuit(): | |||
assert pars == [0.1, 3.0] | |||
|
|||
|
|||
@pytest.mark.parametrize("simplify", [True, False]) | |||
@pytest.mark.parametrize("group", [None, "qwc"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved these fixtures to the individual tests, since now there are tests that don't rely on the fixtures in this class
@@ -1286,7 +1322,37 @@ def combine(coeffs, param): | |||
assert np.allclose(grad[0], grad_expected[0]) | |||
assert np.allclose(grad[1], grad_expected[1]) | |||
|
|||
def test_vqe_differentiation_jax(self, simplify, group): | |||
def test_nontrainable_coeffs_autograd(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously we found an error for this test case, so I am adding them all here!
# note: it is important that we use the Hamiltonian's data and not the coeffs attribute | ||
for op, coeff in zip(observable.ops, observable.data): | ||
# extract a scipy.sparse.coo_matrix representation of this Pauli word | ||
coo = qml.operation.Tensor(op).sparse_matrix(wires=self.wires) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe have a better name instead of coo
?
@@ -2357,3 +2353,96 @@ def _matrix(cls, *params): | |||
assert np.allclose(res_state, test_state) | |||
assert np.allclose(res_mat, op.matrix) | |||
assert np.allclose(res_wires, wires) | |||
|
|||
|
|||
class TestHamiltonianSupport: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any test to check if the expectation value of a sample Hamiltonian, such as the one for H2, is computed correctly? (probably covered by other tests but wanted to double check)
for idx_row, idx_col, entry in zip(coo.row, coo.col, coo.data): | ||
# while "entry" is not differentiable, it will be parsed during multiplication | ||
product = self._conj(self.state)[idx_row] * entry * self.state[idx_col] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this loop replaces an operation like coo_matrix.dot
to avoid breaking differentiability? Is there a way to make it more efficient, if it is not efficient already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mariaschuld here is a way of removing the loop that appears to pass all tests:
for op, coeff in zip(observable.ops, observable.data):
# extract a scipy.sparse.coo_matrix representation of this Pauli word
coo = qml.operation.Tensor(op).sparse_matrix(wires=self.wires)
# todo: remove this hack that avoids errors when attempting to multiply
# a nontrainable qml.tensor to a trainable Arraybox
if isinstance(coeff, qml.numpy.tensor) and not coeff.requires_grad:
coeff = qml.math.toarray(coeff)
product = (
qml.math.gather(self._conj(self.state), coo.row)
* coo.data
* qml.math.gather(self.state, coo.col)
)
c = qml.math.cast(qml.math.convert_like(coeff, product), "complex128")
res = res + qml.math.sum(c * product)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small benchmark:
Original VQE: 1.0342366695404053 (1x)
New VQE this PR: 0.28333497047424316 (3.65x)
New VQE, above comment: 0.09953856468200684 (10.39x)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me Maria, just left few questions. Will be good to double check if the new addition works for getting the correct expectation value in a VQE qchem problem, if not covered by other tests.
Thanks for the approval! I wonder if someone could just run a few old qchem scripts once this is merged, so we can pressure test it before the next release? I hope that one should only have to change a few lines! |
I will do some tests for few small molecules. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a quick look, without a chance to go deep into the tests. The changes are great! 🤩
I won't request this as as a change, but I worry that the method to compute expectation values can be optimized further with just a bit of extra effort.
To the best of my understanding, this PR is implementing a method that computes expectation values by converting PauliWords into sparse representations and computing expvals for each term. My bet is that if we instead add these matrices together weighted by their coefficients, and compute a single expval for the entire matrix, the simulation will be faster.
But maybe this breaks differentiabilty with respect to the coefficients?
@@ -250,9 +250,18 @@ | |||
|
|||
<h3>Improvements</h3> | |||
|
|||
* Hamiltonians are now natively supported on the `default.qubit` device if `shots=None`. | |||
This makes VQE workflows a lot faster in some cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional:
A simple code snippet would be nice to illustrate what we mean by native support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is any user-facing change here though, apart from speedup! No UI changes to enable this, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The UI does not change, so I am struggling to find a good example!
if observable.name == "Hamiltonian": | ||
assert self.shots is None, "Hamiltonian must be used with shots=None" | ||
|
||
# compute <psi| H |psi> via sum_i coeff_i * <psi| PauliWord |psi> using a sparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we decompose the Hamiltonian into Pauli terms and compute sparse representations of them, instead of calculating a sparse representation of the entire matrix? This current implementation seems suboptimal in that regard
Note that this can be done using https://pennylane.readthedocs.io/en/stable/code/api/pennylane.utils.sparse_hamiltonian.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Differentiability is hard! We tired that before!
import pennylane as qml | ||
|
||
|
||
def hamiltonian_grad(tape, idx, params=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So cool! 🤩
for o in self.obs: | ||
if len(o.wires) > 1: | ||
# todo: deal with multi-qubit operations that do not act on consecutive qubits | ||
raise ValueError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can decompose any observable into the Pauli basis, so restricting to PauliWords doesn't lose generality, it may just be inconvenient in niche cases.
Really, a list of coeffs and PauliWords is a sparse representation, where instead of using the canonical basis with of matrices with zeros everywhere except for a one in entry (i, j), we use the Pauli basis
Context:
Circuits with
expval(Hamiltonian)
currently get split up into other circuits unless the device supportsHamiltonian
observables and implements something smarter. This PR adds Hamiltonian support for default qubit. Note that nothing will change for the user, but circuits withexpval(H)
should run and differentiate a lot faster for default.qubit withdiff_method=best
(-> backpropagating through sparse simulation), and a little faster when usingdiff_method="parameter-shift"
(-> using the new gradient rule)!Note: Initially we added a new differentiable
SparseMatrix
class, but this turned out to be much harder than anticipated. So now we are using essentially the same recipe as theSparseHamiltonian
class, but with a bit more care to not break differentiability:scipy.sparse
is used for all sparse support, and the elements of the sparse matrix (which are floats) are multiplied to differentiable tensors, and hence do not disturb the differentiation pipeline.Description of the Change:
sparse_matrix
method to the Tensor class, which returns a sparse matrix representation of a Tensor.The VQE tests are quite distributed:
test_vqe
andtest_hamiltonian
already contain tests for differentiating Hamiltonian and other params,device/test_measurements
now has a sanity check that Hamiltonian observables are supported, andtest_qubit_device
checks whether the Hamiltonian is split or not.Benefits:
Avoiding tape splitting should be a lot faster on the forward pass. Previously with our own SparseMatrix implementation I got a 15x speedup, which now reduced to a 4x speedup for optimising VQE using a Hamiltonian with 15 terms. I believe that this slowdown stems from the fact that it is faster to do
\sum_ij psi[i] H_sparse[ij] psi[j]
than\sum_k c_k * \sum_ij psi[i] PauliWord_sparse[ij] psi[j]
, since we do not have to repeatedly index into the state. But we don't get around this without a differentiable representation of H_sparse!Possible Drawbacks:
There may be some places to still optimise the pipeline, and in future we could revert to the plan of having a differentiable sparse class!
The only feature I can think of that is not supported is when Tensors are made up of operations acting on non-consecutive wires like
Hermitian(..., wires=[0,2]) @ PauliX(1)
. This case is implicitly nowhere supported in PennyLane (i.e. the Tensor class is not foolproof against it, and theSparseHamiltonian
class fails such constructions). It is non-trivial to compute a matrix from tensor products of constituent matrices, and suspect the only way to do it is to shuffle wires around and afterwards reshuffle the matrix.