Refactoring commutativity analysis and a new commutative inverse cancellation transpiler pass #8184

Merged on Aug 10, 2022 (30 commits)

@alexanderivrii alexanderivrii commented Jun 16, 2022

Summary of this PR:

  • Moving the functionality that checks whether two gates commute to a separate class, CommutationChecker.
  • Simplifying code for CommutationAnalysis and DagDependency. There is no change to the actual functionality.
  • A new transpiler pass CommutativeInverseCancellation that cancels pairs of inverse gates exploiting commutation relations.
  • Adding tests for CommutativeInverseCancellation. These include tests adapted from CommutativeCancellation and InverseCancellation, and the problematic tests from issue #8020 ("CommutationAnalysis needs to be refactored").

What's already in Qiskit

Let me try to summarize (based on my understanding) what we already have in Qiskit (related to this PR).

CommutationAnalysis:
*) Efficiently checks commutativity between two DAG nodes, cleverly hashing both positive and negative results (i.e. whether various pairs of gates commute or not).
*) Creates a dictionary describing commutation relations on a given wire, but does not properly handle transitivity (see #8020 for an in-depth discussion). As a result, transpiler passes based on this dictionary may produce incorrect results. This is definitely the case for CommutativeCancellation (see #8020) and probably for #8034. Personally, I cannot think of an application where the created dictionary structure would be useful.
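To illustrate why transitivity matters here, the following toy sketch shows three CX gates where the first commutes with the second and the second with the third, yet the first and third do not commute. It is not Qiskit code: the helpers `cx` and `commute` are hypothetical names written for this example, and the gates are modeled as permutations of classical bit tuples (valid for CX, whose matrix is a permutation matrix).

```python
from itertools import product


def cx(control, target):
    """Return a toy CX gate as a function on tuples of classical bits."""
    def apply(bits):
        bits = list(bits)
        bits[target] ^= bits[control]
        return tuple(bits)
    return apply


def commute(g1, g2, num_qubits=3):
    """Two permutation gates commute iff both orders agree on every basis state."""
    return all(g1(g2(b)) == g2(g1(b)) for b in product((0, 1), repeat=num_qubits))


a, b, c = cx(0, 1), cx(0, 2), cx(1, 2)
print(commute(a, b))  # True: the gates share a control
print(commute(b, c))  # True: the gates share a target
print(commute(a, c))  # False: the target of a is the control of c
```

A per-wire dictionary that groups gates by pairwise commutation can therefore not be read as "everything in this group commutes with everything else" without an extra check.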

CommutativeCancellation:
*) Transpiler pass that cancels self-inverse gates and rotations using commutativity analysis (as per CommutationAnalysis).
*) Additionally, combines multiple z-rotations and multiple x-rotations (but not y-rotations), also see #8178 for a related PR.
*) Additionally does something called "basis priority change" (I did not understand the purpose for that).
*) Only works with a specific predefined set of self-inverse and rotation gates.
*) Misses optimization opportunities and may lead to incorrect results (see #8020).

InverseCancellation
*) Transpiler pass that cancels consecutive pairs of inverse gates.
*) The pairs of inverse gates must be specified by the caller, e.g., InverseCancellation(gates_to_cancel = [CXGate(), HGate(), (RXGate(np.pi / 4), RXGate(-np.pi / 4))]).
*) Very efficient in terms of performance, based on collect_runs.
*) Does not use commutativity analysis.
*) Not used in preset pass managers.

DagDependency
*) A canonical representation of non-commutativity in a circuit. Currently only used for TemplateOptimization transpiler pass (which by itself is not used in any of the preset pass managers).
*) Uses similar (but not exactly the same) commutativity checking as CommutationAnalysis, but does not hash results.
*) For each node (in DagDependency), computes and stores the list of its transitive successors = all nodes forward reachable from this node, and the list of its transitive predecessors = all nodes backward reachable from this node. This makes constructing DagDependency slow and incurs a huge memory overhead (quadratic in the total number of nodes).
*) On the other hand, keeping the lists of successors and predecessors allows an efficient check whether a pair of nodes in DagDependency commutes.
*) Based on DagDepNode. Personally, I don't understand why we need separate classes DagDepNode instead of DagNode, and DagDependency instead of DagCircuit. After all, it's still a DAG.
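The quadratic overhead can be seen on a toy example. The sketch below is not the DagDependency implementation; `transitive_successors` and `chain` are hypothetical helpers, and the graph is a simple chain whose nodes are assumed to be numbered in topological order. It only counts how many entries end up stored when every node keeps its full set of forward-reachable nodes.

```python
def transitive_successors(direct_successors):
    """For every node, compute the set of all forward-reachable nodes.

    Assumes nodes are numbered in topological order (edges go low -> high).
    """
    n = len(direct_successors)
    reach = [set() for _ in range(n)]
    # Process nodes in reverse so each successor's set is already complete.
    for node in reversed(range(n)):
        for succ in direct_successors[node]:
            reach[node].add(succ)
            reach[node] |= reach[succ]
    return reach


def chain(n):
    """A chain 0 -> 1 -> ... -> n-1, given as lists of direct successors."""
    return [[i + 1] if i + 1 < n else [] for i in range(n)]


for n in (100, 1000):
    stored = sum(len(s) for s in transitive_successors(chain(n)))
    print(n, stored)  # stored == n * (n - 1) // 2
```

Going from 100 to 1000 nodes multiplies the stored entries by roughly 100, which is the same trend as in the memory measurements reported in this PR description.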

TemplateOptimization
*) A transpiler pass that optimizes circuits based on templates.
*) Uses commutativity analysis (as per DagDependency)
*) Very slow to run (at least with default options).
*) Not used in preset pass managers.

Some experimental results

  • All the experimental results are for random Clifford circuits with the specified number of qubits and gates from ["x", "y", "z", "h", "s", "sdg", "cx", "cz", "swap"]. Clifford circuits are obviously special, so please treat these results with a grain of salt. The default seed is 0 (and the results are quite consistent with respect to seed).

Constructing DagDependency

#qubits = 5, #gates = 100, seed = 0, #edges = 151, depth = 30, time = 0.0439
#qubits = 5, #gates = 1000, seed = 0, #edges = 1618, depth = 336, time = 1.0921
#qubits = 5, #gates = 5000, seed = 0, #edges = 8007, depth = 1650, time = 19.8801
#qubits = 10, #gates = 100, seed = 0, #edges = 162, depth = 22, time = 0.0469
#qubits = 10, #gates = 1000, seed = 0, #edges = 1672, depth = 200, time = 1.2437
#qubits = 10, #gates = 5000, seed = 0, #edges = 8333, depth = 943, time = 22.9258

Memory consumption for DagDependency (using memory_profiler, memory reports the memory increment for circuit_to_dagdependency)

#qubits = 5, #gates = 100, memory = 0.3 MB
#qubits = 5, #gates = 1000, memory = 9.4 MB
#qubits = 5, #gates = 5000, memory = 209.3 MB
#qubits = 10, #gates = 100, memory = 0.1 MB
#qubits = 10, #gates = 1000, memory = 7.8 MB
#qubits = 10, #gates = 5000, memory = 199.1 MB

Time to run TemplateOptimization

Here #gates is the original number of gates, and #gates_opt is the optimized number.

#qubits = 5, #gates = 100, seed = 0, #gates_opt = 93, time = 5.5520
#qubits = 5, #gates = 1000, seed = 0, #gates_opt = 963, time = 626.0542
#qubits = 10, #gates = 100, seed = 0, #gates_opt = 93, time = 8.6801
#qubits = 10, #gates = 1000, seed = 0, #gates_opt = 929, time = 1003.9944

Other optimization techniques

We are running

  • transpile with optimization_level=3
  • InverseCancellation with gates_to_cancel = [CXGate(), HGate(), SwapGate(), CZGate(), ZGate(), XGate(), YGate()]
  • CommutativeCancellation
  • The new pass CommutativeInverseCancellation

As before, #gates is the original number of gates, and #gates_opt is the optimized number.

#qubits = 10, #gates = 1000:
Transpile: #gates_opt = 942, time = 0.8979
InverseCancellation: #gates_opt = 952, time = 0.0284
CommutativeCancellation: #gates_opt = 960, time = 0.0738
CommutativeInverseCancellation: #gates_opt = 900, time = 0.1236

#qubits = 10, #gates = 5000:
Transpile: #gates_opt = 4747, time = 4.9246
InverseCancellation: #gates_opt = 4666, time = 0.1376
CommutativeCancellation: #gates_opt = 4760, time = 0.2224
CommutativeInverseCancellation: #gates_opt = 4368, time = 0.4219

#qubits = 50, #gates = 1000:
Transpile: #gates_opt = 961, time = 0.8138
InverseCancellation: #gates_opt = 948, time = 0.0299
CommutativeCancellation: #gates_opt = 962, time = 0.0678
CommutativeInverseCancellation: #gates_opt = 900, time = 0.1347

#qubits = 50, #gates = 5000:
Transpile: #gates_opt = 4756, time = 4.8730
InverseCancellation: #gates_opt = 4692, time = 0.1506
CommutativeCancellation: #gates_opt = 4762, time = 0.2533
CommutativeInverseCancellation: #gates_opt = 4390, time = 0.6961

Please note that the new pass CommutativeInverseCancellation finds the most optimization opportunities, but is a bit slower than InverseCancellation and CommutativeCancellation (while still being faster than constructing the DagDependency objects for these circuits).

In this PR:

CommutationChecker
*) Essentially copy-pasting the check of whether two gates commute from CommutationAnalysis into a separate class. This includes hashing of commutativity results.
*) Sufficient to implement simple transpiler passes, such as the new CommutativeInverseCancellation pass.
*) Note: CommutationChecker (with its hashing) can be reused even when the circuit changes!
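A minimal sketch of why such a cache can be reused across circuits, assuming the cache key depends only on the gate names and on the relative positions of the qubits involved. This is an assumption of the sketch, not a description of the actual CommutationChecker internals. `ToyCommutationChecker` and `apply_gate` are hypothetical names, and only the toy gates "x" and "cx" (modeled as permutations of classical bits) are supported.

```python
from itertools import product


def apply_gate(name, qargs, bits):
    """Apply a toy gate to a tuple of classical bits."""
    bits = list(bits)
    if name == "x":
        bits[qargs[0]] ^= 1
    elif name == "cx":
        bits[qargs[1]] ^= bits[qargs[0]]
    else:
        raise ValueError(f"unsupported toy gate: {name}")
    return tuple(bits)


class ToyCommutationChecker:
    """Caches commutation results keyed on gate names and relative qubit positions."""

    def __init__(self):
        self.cache = {}
        self.misses = 0

    def commute(self, name1, qargs1, name2, qargs2):
        # Relabel qubits by first appearance, so the key ignores absolute indices.
        order = {}
        for q in list(qargs1) + list(qargs2):
            order.setdefault(q, len(order))
        rel1 = tuple(order[q] for q in qargs1)
        rel2 = tuple(order[q] for q in qargs2)
        key = (name1, rel1, name2, rel2)
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = all(
                apply_gate(name1, rel1, apply_gate(name2, rel2, b))
                == apply_gate(name2, rel2, apply_gate(name1, rel1, b))
                for b in product((0, 1), repeat=len(order))
            )
        return self.cache[key]


checker = ToyCommutationChecker()
print(checker.commute("cx", (0, 1), "cx", (0, 2)))  # True
print(checker.commute("cx", (5, 7), "cx", (5, 9)))  # True, served from the cache
print(checker.commute("cx", (0, 1), "cx", (1, 2)))  # False
print(checker.misses)  # 2
```

Because the key does not mention the circuit, the same checker instance stays useful after gates are removed or when a different circuit with similar gates is processed.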

Simplifying code for CommutationAnalysis and DagDependency
*) Replacing code to check whether a pair of gates commutes using CommutationChecker
*) Need to see if the old code for DagDependency had some more optimizations that could be useful here (and also need to see @Sebastian-Brandhofer's improvements in #8081).

A new CommutativeInverseCancellation transpiler pass
*) Uses CommutationChecker to check whether a pair of nodes commutes.
*) Does not construct a full DagDependency (which would be much slower), though is inspired by how DagDependency is constructed.
*) Does not do the additional things done by CommutativeCancellation, such as combining rotations exploiting commutativity relations, or doing basis priority change.
*) Inverse gates are detected via nodes[idx2].op == nodes[idx1].op.inverse(). This is reasonably fast, but may miss some opportunities: for instance, it does not see that the inverse of Rx(np.pi) is Rx(np.pi) up to a global phase (since the inverse is technically Rx(-np.pi)).
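The idea behind the pass can be sketched on a toy gate list. This is an illustration in the spirit of the description above, not the code of CommutativeInverseCancellation: gates are (name, qargs) tuples, only the self-inverse toy gates "x" and "cx" are supported (so "is the inverse" reduces to equality), and `apply_gate`, `commute` and `cancel_inverse_pairs` are hypothetical helpers.

```python
from itertools import product


def apply_gate(gate, bits):
    """Apply a toy gate ("x", (q,)) or ("cx", (c, t)) to a tuple of classical bits."""
    name, qargs = gate
    bits = list(bits)
    if name == "x":
        bits[qargs[0]] ^= 1
    elif name == "cx":
        bits[qargs[1]] ^= bits[qargs[0]]
    else:
        raise ValueError(f"unsupported toy gate: {name}")
    return tuple(bits)


def commute(g1, g2, num_qubits):
    """Toy commutation check: both orders must agree on every basis state."""
    return all(
        apply_gate(g1, apply_gate(g2, b)) == apply_gate(g2, apply_gate(g1, b))
        for b in product((0, 1), repeat=num_qubits)
    )


def cancel_inverse_pairs(gates, num_qubits):
    """Cancel a gate with a later inverse if it commutes with everything in between."""
    removed = [False] * len(gates)
    for i, g1 in enumerate(gates):
        if removed[i]:
            continue
        for j in range(i + 1, len(gates)):
            if removed[j]:
                continue
            g2 = gates[j]
            if g2 == g1:  # X and CX are self-inverse, so equality means "inverse pair"
                removed[i] = removed[j] = True
                break
            if not commute(g1, g2, num_qubits):
                break  # g1 cannot be moved past g2, stop searching
    return [g for g, gone in zip(gates, removed) if not gone]


circuit = [("cx", (0, 1)), ("x", (1,)), ("cx", (0, 2)), ("cx", (0, 1)), ("x", (0,))]
print(cancel_inverse_pairs(circuit, 3))
# [('x', (1,)), ('cx', (0, 2)), ('x', (0,))]
```

The two CX(0, 1) gates are not adjacent, so a pass that only looks at consecutive pairs would keep them, while the commutation-aware scan removes them.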

Further discussion

Rethink what kind of commutativity analysis is needed for specific applications

  • For some applications, only CommutationChecker is needed (as in the new CommutativeInverseCancellation transpiler pass). There is no need to construct the canonical DagDependency data structure. Furthermore, I don't know how DagDependency would even help here.
  • Possibly for some applications DagDependency is needed, but there is no need to compute and store the lists of transitive successors and predecessors for every node. In this case, the algorithm to construct DagDependency can be significantly improved, both in terms of performance and memory requirement.
  • Possibly for some applications (such as TemplateOptimization) we do need the full lists of transitive successors and predecessors for every node. Then we probably have to pay the price.

Further discussion

  • Do we need the CommutationAnalysis transpiler pass at all? The dictionary it currently provides is definitely not very useful, as there is no transitivity. Should we fix this so that transitivity holds?

  • We should do something about the CommutativeCancellation transpiler pass (especially as it may produce wrong results). Should we try to fix it? Or maybe its functionality can be replaced by CommutativeInverseCancellation + transpiler passes that combine single-qubit rotations?

  • Can we reimplement DagDependency to be DagCircuit? There is no need to use DagDepNode (which stores information required specifically for TemplateOptimization pass).

@alexanderivrii alexanderivrii requested a review from a team as a code owner June 16, 2022 00:30

@coveralls

coveralls commented Jun 16, 2022

Pull Request Test Coverage Report for Build 2832797885

  • 109 of 111 (98.2%) changed or added relevant lines in 7 files are covered.
  • 3 unchanged lines in 1 file lost coverage.
  • Overall coverage increased (+0.02%) to 84.041%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
qiskit/circuit/commutation_checker.py | 58 | 59 | 98.31%
qiskit/transpiler/passes/optimization/commutative_inverse_cancellation.py | 38 | 39 | 97.44%

Files with Coverage Reduction | New Missed Lines | %
qiskit/pulse/library/waveform.py | 3 | 89.36%

Totals
Change from base Build 2832742649: 0.02%
Covered Lines: 56240
Relevant Lines: 66920


@alexanderivrii alexanderivrii changed the title [WIP] commutativity analysis and reimplementing commutative cancellation transpiler pass Refactoring commutativity analysis and a new commutative inverse cancellation transpiler pass Jul 5, 2022
@alexanderivrii
Contributor Author

Does it make sense to limit commutation checking to only 1-qubit and 2-qubit gates? @ShellyGarion suggested replacing the check nodes[idx2].op == nodes[idx1].op.inverse() with Operator(nodes[idx2].op) == Operator(nodes[idx1].op.inverse()), which still seems quite fast, and does cancel pairs of Rx(np.pi) gates?
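To make the Rx(np.pi) case concrete: applying Rx(np.pi) twice gives the identity only up to a global phase of -1, while the matrices of Rx(np.pi) and Rx(-np.pi) differ entrywise. The sketch below checks both facts with plain Python. The helpers `rx`, `matmul`, `is_identity_up_to_phase` and `same_matrix` are made up for this example, and the sketch makes no claim about how Operator equality is implemented.

```python
import math


def rx(theta):
    """Matrix of an Rx(theta) rotation, exp(-i * theta / 2 * X)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def is_identity_up_to_phase(m, tol=1e-9):
    """True if m equals phase * I for some complex phase of modulus 1."""
    phase = m[0][0]
    return (
        abs(abs(phase) - 1) <= tol
        and abs(m[1][1] - phase) <= tol
        and abs(m[0][1]) <= tol
        and abs(m[1][0]) <= tol
    )


def same_matrix(a, b, tol=1e-9):
    """Entrywise comparison with no allowance for a global phase."""
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(2) for j in range(2))


twice = matmul(rx(math.pi), rx(math.pi))
print(is_identity_up_to_phase(twice))             # True
print(abs(twice[0][0] + 1) < 1e-9)                # True: the global phase is -1
print(same_matrix(rx(math.pi), rx(-math.pi)))     # False: the matrices differ by a sign
```

Whether a given comparison treats such a pair as inverses therefore depends on whether it ignores global phase.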

@alexanderivrii alexanderivrii added this to the 0.22 milestone Jul 11, 2022
@alexanderivrii alexanderivrii self-assigned this Jul 11, 2022
@alexanderivrii
Contributor Author

This is ready for review/feedback.

@1ucian0 1ucian0 self-assigned this Jul 19, 2022
@kdk kdk assigned 1ucian0 and mtreinish and unassigned 1ucian0 and alexanderivrii Jul 19, 2022
@Sebastian-Brandhofer

I can take a look. :)

Member

@mtreinish mtreinish left a comment

Overall this LGTM; I left a few inline comments. I think the big things are expanding the unit tests to cover more edge cases and deciding where to put the commutation checker class. The other thing, which I didn't put inline, is that I think having some dedicated tests for the commutation checker class would be good. Other than that, this is also missing release notes.

)


class CommutationChecker:
Member

Considering all the circular import issues around this, I'm wondering if this makes more sense somewhere more central in the import tree, like quantum_info or dagcircuit, especially since it isn't used exclusively by transpiler passes. Although if we make this part of quantum_info, it might be better to remove the use of the DAG node classes and instead take in the args and op separately for node1 and node2.

Contributor Author

These are all good ideas: removing the use of the DAG node classes (handled in a1284b9) and moving the class to a more central place (still not sure where exactly to move it).

Contributor Author

As per the previous suggestion, the function commute in commutation_checker now looks like this:

def commute(self, op1: Operation, qargs1: List, cargs1: List, op2: Operation, qargs2: List, cargs2: List)

In particular, it does not depend on DAGNode or DAGDepNode and can also be called on gates/operations in QuantumCircuit.

After some discussion with @ShellyGarion and @eliarbel, I have moved commutation_checker to qiskit/circuit. This refactoring also made it possible to avoid runtime imports.

Member

I guess the only question for me is why qiskit/circuit instead of qiskit/quantum_info? I could see a case for either location and don't feel too strongly one way or the other. I'm fine with this decision, but having a comment in the PR review and/or as part of the commit message for future reference might be useful in case someone attempts to revisit its location again.

Contributor Author

I don't feel strongly one way or another either, but it seems that quantum_info should be agnostic to the encompassing circuit (i.e. it can reason about how to represent a Clifford using stabilizer/destabilizer tables or even about how to create a circuit for a Clifford, but it should not reason about where Clifford sits in the larger circuit), while commutation checker does look where the pair of gates sit in the larger circuit (looking at qargs and cargs). So personally circuit seems a better fit.


from .commutation_checker import CommutationChecker

cc = CommutationChecker()
Member

To keep the cache warm between executions of the pass I wonder if we want to scope this to the instance, or even put this in the pass manager itself so it can be shared between passes that use the checker?

Contributor Author

Yes, this is a very good point: the cache of which pairs of gates commute or do not commute can be reused between different passes pertaining to the same quantum circuit (for instance, during the optimization loop in transpile with optimization_level=3), or even for different quantum circuits (provided they have similar gates, or else the cache will not be useful). Though let's handle this extension in a follow-up PR.

Comment on lines 32 to 34
# ToDo: Even less sure about the next line
# if node.op.params:
# return True
Member

I think the is_parameterized() check should cover this too.

Contributor Author

Oh, I see the difference now: gate.is_parameterized() returns True for a parameterized gate with an unbound parameter (such as RZGate(Parameter("Theta"))), but returns False for a parameterized gate with a bound parameter (such as RZGate(np.pi/2)). The current analysis takes the over-conservative approach in the unbound case (skipping such gates or declaring that such gates do not commute with other gates), and works well in the bound case (for instance, cancelling the inverse pair of RZGate(np.pi/2) and RZGate(-np.pi/2)).

return True
if node.op.condition:
return True
# ToDo: Not sure about the next line
Member

I think this is fine, unless you know that, for a parameterized gate, the commutation and inverse checks evaluate correctly even when Parameter or ParameterExpression objects are used for any of the gate parameters.

Contributor Author

We probably want to have this extension in the future, but let's keep things simple for now. This comment is also relevant to many other transpiler passes, which usually also ignore conditional gates, parameterized gates, and directives (e.g. collecting 1q or 2q runs).

from qiskit.transpiler.passes import CommutativeInverseCancellation


class TestCommutativeInverseCancellation(QiskitTestCase):
Member

I think it'd be good to have some tests for some edge cases like with parameterized gates in the circuit, some directives like barrier. Also some of the operations explicitly skipped like reset. Just to make sure all those paths are exercised.

Contributor Author

I have added these corner case tests, and have also added dedicated tests for CommutationChecker.

@mtreinish mtreinish added the Changelog: New Feature Include in the "Added" section of the changelog label Aug 4, 2022
@alexanderivrii
Contributor Author

@mtreinish, thanks for the review! I have incorporated all of your comments: adding corner-case tests for the CommutativeInverseCancellation pass, adding dedicated tests for CommutationChecker, adding release notes, and moving CommutationChecker to qiskit/circuit, which seems to contain similar utility code.

The only thing that I haven't done is adding CommutativeInverseCancellation to preset pass managers (probably, with optimization_level=3) and the related comment of adding CommutationChecker to the pass manager itself, which I am hoping to treat in a small follow-up PR.

@mtreinish
Member

Yeah, I'm fine with updating the preset pass managers in a follow-up. It'll let us explore the performance in isolation, evaluate how it affects the runtime, and weigh that against any improvements in the quality of the output.

Member

@mtreinish mtreinish left a comment

This LGTM, thanks for making all the updates.


6 participants