Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] interface and implementation for low-level expectation value #8532

Closed
wants to merge 14 commits into from

Conversation

jlapeyre
Copy link
Contributor

@jlapeyre jlapeyre commented Aug 13, 2022

This is a WIP, RFC PR.
This PR implements a low-level function expectation_value with 17 methods for various combinations of input using a multiple-dispatch library. The function is not a method of any class. It's (more or less) an ordinary function.

What's in the code

  • Implementations for Counts, QuasiDistribution, SamplerResult. Till now there are none for these classes. The need for these comes from specific use cases reported by @nonhermitian. I did not understand the details of the use cases, so these implementations may not yet fit the bill exactly.
  • Reimplementations of some class methods named expectation_value in quantum_info.
  • A few other methods including some in quantum_info, but that were not previously covered.

Interface

The methods have signatures similar to the following

expectation_value([operator], state, [qargs])  # First and last arguments may be absent.

# More explicitly
expectation_value(operator, state) # does what it says
expectation_value(operator, state, qargs) # restricted to a subset of qubits
expectation_value([operator], probs) # The operator is implied if omitted.

The last signature is not for states, but rather associated probabilities. If operator is omitted it defaults to the Pauli string that is diagonal in the measurement basis and that contains no one-qubit identity operators. Usually, this will be a string of Zs.

If a state is given by $(c_1, c_2,\ldots)$, then the probabilities are $(|c_1|^2, |c_2|^2, \ldots)$. Since the phase information is lost, the only possible operators for expectation are those corresponding to the measurement basis. There has been concern that we don't want to narrow the semantics of Counts, etc. by assuming the computational basis. But, I think this is misplaced. Whatever the the Pauli string specifying the measurement basis was, the answer is correct, and there is no room for ambiguity.

Why

Class methods are essentially a kind of single dispatch, if different classes have methods of the same name. Here, we dispatch on the number of arguments and the types of each.

  • Multiple dispatch (MD) allows better factoring (less boilerplate) and more flexibility. This goes together with better encapsulation.
    • For example here is a single method with the signature

      expectation_value(pauli: Pauli, state: Union[Statevector, DensityMatrix], qargs: QargsT)

      with code that leaks no details of the state. In contrast, the current implementation is two separate methods that depend on details of state.

    • To handle the case where qargs is absent, there is a single, one-line method for two operator types and all types of state. This is done via dispatch. In contrast, the current implementation uses a conditional checking for None and constructing the missing data. This boilerplate is repeated at the top of each method. Here, here, and here. quantum_info is not unique in this respect. Rather, this pattern is common in qiskit.

    • There is one very short method to handle SparsePauliOp for all subclasses of QuantumState. In contrast, currently this is handled with typechecking and conditionals in a method for each subclass. And one of them is not yet implemented. Using MD, we get it for free.

  • One has to be conservative with changes to methods in a widely used class. With MD, you can have methods that are both more powerful, and local (unexported) to a file. Like these. This means that even if your proposal to include the methods you need in a class are rejected, you can still define and use MD methods locally.
  • The type annotations used for dispatch also serve as 1) documentation and 2) run-time type checking. If you pass values of the wrong type a no-method-found error is thrown. Alternatively, you can also write a method to give an informative error message. If the annotations are out of sync with the implementation, that is incorrect, tests are likely to fail. You will no longer write occasional, ad hoc, type checks in conditionals.
  • To add a method to an existing function in another file, you must explicitly import it and think about the interface and dispatch. This provides social and technical pressure to follow a uniform interface for a function. With class methods, it's easier to order arguments differently than do semantically similar methods in other classes.

How is it organized ?

  • For simplicity, I included most of this in a single file qiskit/operations.py. But, it could be organized differently. For example methods for a function can be spread across the code base in order to include them near code involving similar types. There is one example of this in this PR.

Python projects using Multiple Dispatch

This is important. How is MD used with Python. Is it successful ? Not much here yet...

Disadvantages

  • Multiple dispatch is different from and less familiar than typical class based OO.
  • It would be best to introduce multiple dispatch in a way that is hidden from the user, that is, does not affect the API. I think this current PR does not fit the bill here. Qiskit is in part just a Python library, so the classes and objects are meant to be exposed. It may be hard to introduce MD in way that is not hard to revert.
  • We need to find a way to organize doc strings.
  • Performance. Because Python is interpreted, abstractions come at a cost. In this PR I already avoided some small MD functions with the result that I duplicated code. In contrast, with a compiled language, you could use these functions. I would not be surprised if some of the code duplication I observed in quantum_info and elsewhere is there precisely to avoid performance hits. In any case, here is an example benchmark of overhead. Overhead is rather high here because: 1) There are multiple levels of MD dispatch. 2) There are only two qubits, so in general function call overhead is more important.
       In [1]: diagop = Pauli('ZZ')
       In [2]: state = Statevector([1, 2, 2, 1])
       In [3]: state = state / np.linalg.norm(state.data)
    
       In [4]: %timeit state.expectation_value(diagop)
       34.2 µs ± 461 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    
       In [5]: %timeit expectation_value(diagop, state)
       42.5 µs ± 324 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    Here is an example that uses the density matrix constructed from the same state above, rather than the state vector. It also avoids one level of MD by passing qargs directly.
     In [32]: %timeit expectation_value(diagop, rho_state, [0,1])
     41.3 µs ± 296 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    
     In [33]: %timeit rho_state.expectation_value(diagop, [0,1])
     38.3 µs ± 189 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
    Now the performance gap is smaller, about 8%, because 1) the calculation is a bit longer and 2) one level of MD is avoided.
    To improve performance one can
    • Try different MD libraries. The author of the one used in this PR shows a favorable comparison with the most popular library. However, a year ago, we found the opposite.
    • Lobby for cython in the library. The MD library used in this PR used cython until very recently. It was removed because a couple of users did not find the performance increase worth the opacity of the pure-python to compiled-code interface. We could benchmark the previous (cythonized) version.
    • Use MD only at higher levels. This would increase performance, but would decrease the utility of MD. The biggest reductions of code and complexity often come from two or more layers of MD calls.

@jlapeyre jlapeyre requested review from a team, ikkoham and t-imamichi as code owners August 13, 2022 05:47
@qiskit-bot

This comment was marked as duplicate.

@coveralls
Copy link

coveralls commented Aug 13, 2022

Pull Request Test Coverage Report for Build 2880017667

  • 0 of 178 (0.0%) changed or added relevant lines in 1 file are covered.
  • 215 unchanged lines in 6 files lost coverage.
  • Overall coverage decreased (-0.2%) to 83.832%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/operations.py 0 178 0.0%
Files with Coverage Reduction New Missed Lines %
qiskit/extensions/quantum_initializer/initializer.py 1 97.22%
qiskit/extensions/unitary.py 1 96.59%
qiskit/extensions/quantum_initializer/uc.py 10 88.19%
qiskit/algorithms/optimizers/optimizer.py 18 87.43%
qiskit/circuit/quantumcircuit.py 36 93.5%
qiskit/visualization/state_visualization.py 149 58.68%
Totals Coverage Status
Change from base Build 2845105967: -0.2%
Covered Lines: 56313
Relevant Lines: 67174

💛 - Coveralls

@nonhermitian
Copy link
Contributor

Why doesn't expectation_value on distributions take operators here? E.g. suppose I want to make my own VQE routine that need to compute operators like 'IIZZIZ , or rewrite Quantum Volume to be the expectation value over the projection operators that define the heavy set per circuit.

@jlapeyre
Copy link
Contributor Author

Why doesn't expectation_value on distributions take operators here?

That makes sense. I just added it, but the interface needs to be adjusted a bit before I update the PR.

@jlapeyre
Copy link
Contributor Author

jlapeyre commented Aug 16, 2022

I pushed some changes to the code. I added a notebook , although I don't think it is allowed to live permanently where I put it. The notebook shows examples of the latest version of the API.

@nonhermitian I'd be interested to know if you think the implementation for Counts and QuasiDistribution is convenient and flexible enough (or too much so). It is demo'ed in the notebook.

@jlapeyre

This comment was marked as outdated.

@nonhermitian
Copy link
Contributor

Can't one just check diagonality and raise if not the case? I really think this whole multi-basis stuff is a solution looking for a problem.

@jlapeyre
Copy link
Contributor Author

jlapeyre commented Aug 17, 2022

An out-of-band discussion, and more thought, brought some issues to light. (Many points below have come up elsewhere and people have already argued for some ideas below.) Something we already knew is that there are two concerns addressed in this PR, and the hope was that this is a neat way to do both at once:

  1. We want a straightforward function or method to compute some expectation values from things like Counts and QuasiDistribution. There seems to be no dissent here (but correct me if I missed it). What does this mean and how should we do it?
  2. Is multiple-dispatch useful in qiskit? If so, how should it be used?

I'll discuss 1 and 2 separately. My conclusion is that this PR does not adequately answer these questions, at least without significant revision. I discuss 1 here and leave 2 for later.

Expectation value of Counts

It would be useful to extend the idea in quantum_info of the method QuantumState.expectation_value(operator) to Counts.expecation_value(operator). Or to design an API with MD to do both of these and more. The interface should be as uniform as possible with respect to operators and states (or counts) of different mathematical types as well as representations in data. A requirement is that the solution should allow the most common and pressing situation of counts measured in the computational basis to be handled easily without being burdened by preserving generality or uniformity of interface for use cases that have not yet arisen.

A first attempt might be

  • expectation_value(operator: Union[Pauli, etc.], state: Counts). The implementation will validate operator: It must be constructed of Pauli strings containing only Z and I.

But Counts is meant to support more than just measurements of qubits in the computational basis1. Perhaps the next most common scenario is measurement in another Pauli basis (XYYZIZ, etc.)2. So our next attempt (the one in this PR) is

  • Allow any Pauli string (or weighted sum) as operator. In the documentation we write: "Any Pauli operator is supported with the restriction that it must be diagonal in the Pauli basis in which counts were collected." Now, Counts (and QuasiDistribution) do not carry information on the measurement basis. But the algorithm does not need this information. It only needs to know at which indices an I occurs. In sum, the implementation is exactly the same as the previous attempt, except that all Pauli operators are allowed. It is up to the user to only pass valid operators based on information not available to our implementation. However, it would make sense to check that each term in a Pauli sum is consistent with a single measurement basis.

This is not the end of the story. There are other qubit-measurement bases that cannot be handled by the scheme above. For example, the Bell-basis encoded in a single-qubit computational basis (see below for a footgun that the scheme allows). Furthermore Counts allows dits as keys. For example Bell measurements could be encoded as 4-dits. So we need something else. Some possible solutions are

  1. Implement expectation_value_pauli_basis(operator: Union[Pauli, SparsePauliOp, etc.], state: Union[Counts, etc,]) (except with a better name, and could be MD or class methods or both). Validate the input only to verify that each string in a sum is compatible with a single measurement basis. This would be fairly easy, not disruptive, would use exactly the same computation already in this PR, and should satisfy the need expressed many times by users. (via @nonhermitian). This solution does not preclude a more general solution when the need arises.

  2. Include metadata in Counts specifying the measurement basis. Counts is already carrying some meta data, although it could be cleaned up. Adding the measurement basis seems to fit with the spirit of Counts. If the consumer of Counts omits the basis (as they do now), then the computational basis is assumed. In this case, we have a more generic expectation_value that can raise an error if a particular basis is not supported.

  3. (EDIT. I forgot to add this option) Instead of including metadata in Counts, pass this information to expected_value. For the immediate case at hand, i.e. computational basis, or more generally Pauli basis, we don't need to pass anything, just (optionally) validate the input. (We often don't validate input, but probably should here because of potential confusion.) So, for the moment, this is the same as option 1 above. But, if we want to treat expectation_value more generally at a later time, we could change expectation_value_pauli_basis to use this general facility without changing the API. In general, the combination of the measurement basis and the operator passed to expected_value determine a weight associated with each measurement result. The expectation value is the (normalized) dot product of the counts and the weights. You might need to allow supplying a vector of weights. But, typically they are computed. For example, for a diagonal Pauli operator, the weight is determined by the parity of the number of qubit indices where the operator is not I.

Bell-basis measurement

Imagine this only slightly futuristic scenario. Suppose the user runs a simulator or machine that supports measuring in the Bell basis using the standard encoding3. The user receives an instance of Counts. The user is unfamiliar with Counts, but it looks useful; there is even an expectation_value method (or a function or whatever). They look at examples and see the operator passed to expectation_value is a Pauli string or weighted sum thereof. The user doesn't know how to specify an operator in the Bell basis, so they decide to proceed with Paulis. Maybe the documentation said only diagonal operators are supported. Or maybe the user manages not to forget that only expectations of diagonal operators are possible from counts. The user knows that ZZ is also diagonal in the Bell basis, so they try it. It works every time! The user then tries XX and YY, which are also diagonal in the Bell basis. But they get the wrong answer! (It is easy to compute that this is indeed what would happen if we only verify input consistent with one Pauli basis. But, I don't include the computation here.)

Footnotes

  1. The documentation, interface, and implementation of Counts could be improved. But the documentation seems to imply that it is meant to handle more general measurements than qubits in the computational basis. In fact, output of dits is explicitly supported. I have draft issue (not yet opened) on improving Counts.

  2. It may be that till now no one has used Counts to represent measurements in other Pauli bases.

  3. By, standard, I mean the one that I have seen most often. $|\Phi(a, b)\rangle = (1/\sqrt(2))[|0b\rangle + (-1)^a |1\bar{b}\rangle]$. The user could be fooled into thinking that Counts knows the basis (it does not) because it turns out that computing the expectation value of $ZZ$ is the same whether bits $(a, b)$ are interpreted as in the computational basis or the Bell basis.

@jlapeyre
Copy link
Contributor Author

jlapeyre commented Aug 17, 2022

Can't one just check diagonality and raise if not the case? I really think this whole multi-basis stuff is a solution looking for a problem.

@nonhermitian, I'm inclined to agree. Counts was designed be be more general. But I can't find anywhere that it's used for anything other than qubits measured in the computational basis. Is there anything that constructs Counts with dits for keys? And several places, for example in the mitigation code in terra, qubits and Z basis are assumed when using Counts.

I think expectation_value(operator, data: Counts) or Counts.expectation_value(operator) with the appropriate doc string is ok. More general things could be added when needed in the future, probably without changing this interface. As I said above: via metadata in Counts or passed as a kw argument, etc.

We could enforce only Is and Zs, in which case the doc would have:

It is assumed here, as in almost all of qiskit, that counts were collected by measuring in the computational basis. It follows that operator must be diagonal. That is, in Pauli strings are required to contain only Z and I.

Or allow any operator in a Pauli basis, in which case the doc would have:

It is assumed here that the counts were collected by measuring in a Pauli basis. This means that operator must be diagonal in that basis. Since all backends and simulators in Qiskit measure in the computational basis, this means that Pauli strings in operator must contain only Z and I.

However you or a third party may write code that returns counts measured in a different Pauli basis. In this case operator should be diagonal in that basis. It is important to be aware that this method has no way of knowing which basis you measured in. In fact, computation of the expectation value depends only on how many Is are in each Pauli string, and not on whether the other single-qubit operators are X, Y, or Z. Regardless of the operator you pass, the expectation will be computed in the actual measurement basis. The only requirement is that all strings in operator are diagonal in the same basis.

A couple of comments:

a. The documentation of the second option is more complex than the first. And we don't have any evidence that anyone wants this yet.

b. I would bet that fairly frequently users will neglect to read the doc string and/or due to a thinko try to compute the expectation value for non-diagonal operators. Then they'll be frustrated and we'll see bug reports.

So maybe one of these options:

  1. Leave it as described in the second docstring above. Let people read the docstring to find out why their code doesn't work.

  2. Choose the first option above and enforce only Zs and Is. If someone wants to use counts collected in another Pauli basis, I suppose they might remain quietly sad that expectation_value does not support this. But, probably they would ask somewhere how it can be done, in which case we can broaden the allowed input to something like the second option.

  3. Choose number 2 above, but to prevent the footgun, the doc string would have:

    However you or a third party may write code that returns counts measured in a different Pauli basis. In this case, you must pass the keyword argument basis = "pauli". etc. etc.

    Then we require all Zs and Is unless the correct keyword argument is passed. One issue is that if the possibilities are expanded in the future, we may have to deprecate the keyword and use a different option. At least trying to avoid this would be really nice.

EDIT: One more thing. It should be pretty easy to get something like this merged as class methods on Counts, etc. If the multiple dispatch works out, that will take longer to merge. Since this is needed, I'd support using a method first, and perhaps change the interface later.

@nonhermitian
Copy link
Contributor

I would advocate for more than just Pauli's and include the zero and one state projectors. They are diagonal in the computational basis, and are useful to use for projections onto subsets. eg. one could write strings like IIZI0Z1I

@jlapeyre
Copy link
Contributor Author

jlapeyre commented Aug 17, 2022

include the zero and one state projectors

Oh, that makes sense. ... I think that would mean throw out counts if any of the 0s and 1s don't match , right? (yeah thats what it means)

The more I think about it, I think supporting only what people need now is best. If something else is needed, we probably won't do it correctly now anyway and would have to deprecate/change it for some reason when it is needed.

@jlapeyre
Copy link
Contributor Author

jlapeyre commented Aug 18, 2022

Regarding item number 2 at the top if this comment #8532 (comment), that what do we think of multiple dispatch in this PR?

  1. Some people who are usually quite good at understanding complex software found reasoning through the dispatch confusing. This suggests that most people would find it confusing.
  2. Reducing the number of nested calls to methods of the same generic function would make the code more performant, and possible easier to understand.

To address the first point, it would be a good idea to review this with an eye toward reducing the complexity. Furthermore it could be organized better, commented better, perhaps explained better in a notebook.

Regarding the second point: I wanted to see how much I could get out of multiple dispatch here without regard to efficiency. If this is done well, it's instructive for understanding how it MD is used. Then, start removing some abstractions or clarity in order to improve performance. Still, with some effort, you might be able to reorganize the code such that you don't have to sacrifice too much to get more performant code.

@jakelishman and @garrison made a few suggestions. For example suppose you use MD inside a loop. You can instead ask the MD system outside the loop to find the correct method without actually calling it. There is a function, invoke for this. Then use this precomputed method inside the loop. That's somewhat clumsy, and slightly harder to read. But, I suppose not too bad.

It was also pointed out that allowing "implicit arguments" and so forth reduces the uniformity of the interface. Maybe to the point where you don't want to try to claim it's one function. In any case, I now no longer think omitting the first operator function and using a default is a good idea.

Before we considered that an operator is all Zs if it is omitted.
Now we require an explicit operator.

We disallow strings like '1100' to represent 'ZZII'.

For `Counts` and `QuasiDistribution`, we allow characters 0 and 1
in a string. These represent projectors onto 0 and 1 subspaces.
@jlapeyre
Copy link
Contributor Author

They are diagonal in the computational basis, and are useful to use for projections onto subsets. eg. one could write strings like IIZI0Z1I

@nonhermitian , I pushed an implementation of this to this PR. Seems to work ok. Needs a bit of refactoring. Also the overall interface still needs some work.

@ikkoham
Copy link
Contributor

ikkoham commented Aug 22, 2022

Great PoC. Thank you. I left some comments:

  1. Basis.

IIZI0Z1I is already implemented in https://github.com/Qiskit/qiskit-terra/blob/main/qiskit/result/mitigation/utils.py#L71.
I know this is not so efficient because it stores diagonal that grows exponentially.

  1. Error handling.
  • What happens in the case of positional arguments? Does MD check the types in order from the top and match them? I am wondering what the behavior will be since the argument names are not aligned.
  • If none of these matches are found, does it result in an error?
  • How do we verify that Counts (,QuasiDistribution or ProbDistribution) are measured on the appropriate basis? or only diagonal matrices should be allowed?
  1. Performance.

When QuantumCircuit is Clifford circuit, we can use StabilizerState to improve the performance.

  1. Packaging.

Where would be the appropriate place to put this? If it depends on result, quantum_info, primitives, then it is even higher level. Maybe algorithms? @jakelishman may have good suggestion. (We discussed the hierarchy of qiskit.)



@dispatch(precedence=1)
def expectation_value(oper: Pauli, state: StabilizerState, qargs: QargsT):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't these sort of specializations implemented in the StabilizerState/Statevector/DensityMatrix expectation_value methods? It seems like you should put the dispatching on the state class methods that compute these for various different operator types, and then if you want a separate function interface to them, its dispatch registration just needs to call the states method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. In particular in the case of StabilizerState (as far as this PR is developed). But Statevector/DensityMatrix share a single method here. Which class should this go in? Or maybe I'm misunderstanding something.

How to integrate MD with existing language features and styles is not clear to me. There is an argument for doing it the other way around. Have the class methods call the generic functions. Or maybe its better to avoid being able call class methods and generic functions to do the same thing.

@jlapeyre jlapeyre marked this pull request as draft August 25, 2022 02:49
@jlapeyre
Copy link
Contributor Author

@ikkoham thanks for the comments

  1. Error handling.

If I understand correctly, standard Python supports keyword-only, positional-only, and positional-or-keyword arguments. I'm fairly (not quite 100%) certain that plum-dispatch works like this: Only positional-only and keyword-only arguments make sense. Only positional arguments are used for dispatch. In fact, the docs say

Keyword-only arguments, separated by an asterisk from the other arguments, can also be used, but are not dispatched on.

When searching for a matching method, the number of positional arguments must match. The type of each positional argument must match. The order of arguments must match. If no matching method is found, an error is raised.

If more than one method matches, then the "most specific" method is chosen. For example calling with an int will match Number, but an explicit int annotation is more specific. Sometimes two different methods match, but it is not clear which is more specific. There are various ways to resolve this.

what the behavior will be since the argument names are not aligned

I don't understand.

When QuantumCircuit is Clifford circuit, we can use StabilizerState to improve the performance.

I think you are saying that a method could be added for this.

Where would be the appropriate place to put this?

That's a good question. I put as much in one file as possible for convenience. But, it's probably better to put the methods in different places. How fine grained to make that is then a question.

It would be nice to keep MD away from the API, so it can be changed easily. I didn't yet try to do that. It may be difficult.

@jlapeyre
Copy link
Contributor Author

Apparently the NetKet project uses plum-dispatch mainly for expectation values

We use plum mainly for the expect(state, operator) and expect_and_grad(state, operator, options) functions in order to dispatch to the best implementation possible. We do have generic ones that work for all types that implement our API, but this allows us to have specialised implementations.

@jlapeyre
Copy link
Contributor Author

This is out of date. I think most things here have been or will be moved to Rust.

@jlapeyre jlapeyre closed this Sep 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants