Add VF2PostLayout pass #7862

Merged on Apr 26, 2022 (31 commits)

@mtreinish (Member) commented Apr 1, 2022

Summary

This commit adds a new transpiler pass, VF2PostLayout, and adds a new
phase to the preset transpiler pipeline: post-layout qubit
selection. The idea is based on the mapomatic project [1], which
took the code from the existing VF2Layout pass to find, after
transpilation, an isomorphic subgraph in the coupling graph with better
noise characteristics than the qubits initially selected during the
initial layout phase. Doing post-transpile qubit selection gives the pass
more information because we can assume that the circuit's operations are
in the target basis and that at least one isomorphic subgraph already
exists in the coupling graph, because we've gone through routing. This
enables us to look at the specific error rates for each instruction and
weigh the sum of error rates for the mapped circuit on each potential
qubit mapping to find the best performing set of qubits for a given
circuit. Initial layout doesn't have access to this information because
at the beginning of transpilation we aren't necessarily going to find a
perfect mapping and we're not guaranteed to be in the target basis. So
running post layout may yield quality improvements even if we found a
perfect initial layout using VF2Layout.
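
To make the scoring idea above concrete, here is a minimal pure-Python sketch. It is not the code from this PR: the gate tuples, the `error_rates` dictionary, and the function name `layout_error_sum` are all hypothetical, and a real backend would report error rates through its own data structures. It only illustrates summing per-instruction error rates for each candidate qubit mapping and keeping the lowest total.

```python
from typing import Dict, List, Tuple

# Hypothetical data model: a gate is (name, virtual qubits it acts on).
Gate = Tuple[str, Tuple[int, ...]]
# Hypothetical error table: (gate name, physical qubits) -> reported error rate.
ErrorTable = Dict[Tuple[str, Tuple[int, ...]], float]


def layout_error_sum(gates: List[Gate], layout: Dict[int, int], error_rates: ErrorTable) -> float:
    """Sum the error rate of every gate after mapping its qubits through `layout`."""
    total = 0.0
    for name, qubits in gates:
        physical = tuple(layout[q] for q in qubits)
        total += error_rates[(name, physical)]
    return total


# A tiny already-routed circuit: one sx and two cx gates on virtual qubits 0 and 1.
gates = [("sx", (0,)), ("cx", (0, 1)), ("cx", (0, 1))]
error_rates = {
    ("sx", (0,)): 0.001,
    ("sx", (2,)): 0.0005,
    ("cx", (0, 1)): 0.02,
    ("cx", (2, 3)): 0.008,
}

candidates = [{0: 0, 1: 1}, {0: 2, 1: 3}]
noisy = layout_error_sum(gates, candidates[0], error_rates)
quiet = layout_error_sum(gates, candidates[1], error_rates)
# Pick the candidate with the lowest summed error.
best = min(candidates, key=lambda layout: layout_error_sum(gates, layout, error_rates))
```

Because the circuit is assumed to be in the target basis already, every `(name, physical)` lookup is expected to succeed; before basis translation that assumption would not hold.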

This new pass is very similar to the VF2Layout pass: it builds an
interaction graph representing the 2q interactions in the circuit and
uses retworkx's vf2_mapping() function to find all isomorphic subgraphs
in the coupling graph. It is a separate pass because it performs the
search a bit differently. First, the interaction graph is annotated with
the gate counts on each qubit and edge, which are used to compute a
heuristic score for the whole circuit. Second, in the case of a target,
we verify that the nodes and edges are feasible in the subgraph
isomorphism check, since a target can have operations defined on only a
subset of qubits. Additionally, the scoring heuristic checks the
sum of the error rates for each gate on the mapped qubits.
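
The annotation step described above can be sketched without any graph library. This is an illustrative assumption about the data shape, not the pass's actual implementation: `build_interaction_counts` and the gate tuples are hypothetical names, and plain dictionaries of `Counter` objects stand in for the node and edge weights of the interaction graph.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Gate = Tuple[str, Tuple[int, ...]]


def build_interaction_counts(
    gates: Iterable[Gate],
) -> Tuple[Dict[int, Counter], Dict[Tuple[int, int], Counter]]:
    """Count gate names per qubit (nodes) and per ordered qubit pair (edges)."""
    node_ops: Dict[int, Counter] = {}
    edge_ops: Dict[Tuple[int, int], Counter] = {}
    for name, qubits in gates:
        if len(qubits) == 1:
            node_ops.setdefault(qubits[0], Counter())[name] += 1
        elif len(qubits) == 2:
            # Make sure both endpoints exist as nodes even without 1q gates.
            for qubit in qubits:
                node_ops.setdefault(qubit, Counter())
            edge_ops.setdefault((qubits[0], qubits[1]), Counter())[name] += 1
        else:
            raise ValueError("only 1q and 2q gates are supported in this sketch")
    return node_ops, edge_ops


gates = [("sx", (0,)), ("cx", (0, 1)), ("cx", (0, 1)), ("rz", (1,)), ("cx", (1, 2))]
node_ops, edge_ops = build_interaction_counts(gates)
```

The per-node and per-edge counters carry both pieces of information the description mentions: which operation names must be supported at a location, and how many times each one runs there.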

The preset pass managers are updated to use this new pass at the end of
the transpile and apply a layout if a solution is found.

Details and comments

[1] https://github.com/Qiskit-Partners/mapomatic

TODO:

  • Fix test failures
    • Target layout score checking key error (looks like mapping or layout generation is wrong, or the node match check is broken)
    • Layout not fully allocated error
  • Add tests for new pass
  • Add release note

@mtreinish mtreinish added the on hold Can not fix yet label Apr 1, 2022
@mtreinish mtreinish requested a review from a team as a code owner April 1, 2022 17:10
@mtreinish mtreinish added this to the 0.21 milestone Apr 1, 2022
The matching callback function had a typo, so it was always returning
True on edge comparisons even if the operations on the target coupling
graph edge were not a superset of the local gates in the interaction
graph. This commit fixes the oversight so such cases are correctly
rejected as candidates for the subgraph isomorphism.
If we find a better layout using post layout and there are ancillas in
the original layout, those would previously be lost. This commit fixes
this by detecting when we're missing qubits in the new layout and adding
the ancillas on unused qubits in the coupling graph.
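
A minimal sketch of that ancilla fix, assuming layouts are plain dictionaries from virtual-qubit labels to physical indices. The helper name `fill_ancillas` and the choice to hand out the lowest free indices first are assumptions for illustration only.

```python
from typing import Dict, Hashable, List


def fill_ancillas(
    new_layout: Dict[Hashable, int], num_physical: int, ancillas: List[Hashable]
) -> Dict[Hashable, int]:
    """Place each ancilla on a physical qubit that the new layout leaves unused."""
    used = set(new_layout.values())
    free = [phys for phys in range(num_physical) if phys not in used]
    if len(free) < len(ancillas):
        raise ValueError("not enough unused physical qubits for the ancillas")
    full = dict(new_layout)
    for ancilla, phys in zip(ancillas, free):
        full[ancilla] = phys
    return full


# The post layout moved the two active qubits to physical qubits 3 and 4 of a
# 5-qubit device; the three ancillas are re-added on the remaining qubits.
full = fill_ancillas({"q0": 3, "q1": 4}, 5, ["a0", "a1", "a2"])
```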
Applying a new layout after we schedule a circuit would invalidate
that scheduling. This commit moves the post layout pass to run prior
to scheduling in the preset passmanagers.
This commit modifies the ApplyLayout pass to enable slightly altered
behavior when applying a post layout on top of a circuit that has already
had a layout applied. Previously we overwrote the layout and ApplyLayout
just blindly applied it, which caused us to lose the original bit and
register context, as that only exists as metadata in the property set's
layout field. To preserve the mapping from the initial virtual bits
through our second round of layout, ApplyLayout is modified to handle
this mapping for us: the new layout is passed separately as a new field
in the property set.
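
The bookkeeping problem described here amounts to composing two mappings. The sketch below assumes both layouts are plain dictionaries and uses a hypothetical helper `compose_layouts`; the real pass works on Layout objects and the property set, which are not reproduced here.

```python
from typing import Dict, Hashable


def compose_layouts(initial: Dict[Hashable, int], post: Dict[int, int]) -> Dict[Hashable, int]:
    """Map original virtual bits to their final physical qubits.

    `initial` maps virtual bits to the physical qubits chosen by the first layout.
    `post` maps those physical qubits to the qubits chosen by the post layout.
    """
    return {virtual: post[physical] for virtual, physical in initial.items()}


initial = {"a": 0, "b": 1, "c": 2}
post = {0: 5, 1: 6, 2: 7}
combined = compose_layouts(initial, post)
```

Overwriting `initial` with `post` would leave only physical-to-physical information, which is why the original virtual bit context would be lost without a composition step of this kind.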
If initial_layout or layout_method is set in transpile(), do not run
post layout, as this would produce unexpected results for users. If you're
manually specifying a layout method, that should be the only one used and
we shouldn't do any other reordering to try and optimize beyond what the
user requested.
After introducing VF2PostLayout the output layout of the circuit is
potentially different depending on the noise characteristics of the
target backend. This was causing test failures on tests that were
explicitly checking for an exact layout output from transpile. This
commit updates the expected layouts in those tests to match the new
behavior of the transpiler. Most of these tests were actually already
using VF2Layout to find a perfect layout, but with VF2PostLayout the
transpiler is finding an alternative layout post optimization which has
better noise characteristics for the circuit being run.
@coveralls commented Apr 2, 2022

Pull Request Test Coverage Report for Build 2229815468

  • 261 of 267 (97.75%) changed or added relevant lines in 7 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.05%) to 84.316%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| qiskit/transpiler/passes/layout/vf2_post_layout.py | 209 | 215 | 97.21% |

Totals:
  • Change from base Build 2228279020: 0.05%
  • Covered Lines: 54426
  • Relevant Lines: 64550

💛 - Coveralls

This commit fixes a typo in the node/edge match function used in
matching a subgraph to a target over valid operation names. Previously
the reverse condition check was incorrectly checking that the target
operations on a 2q edge were a subset of the circuit operations when it
should have been checking the reverse condition. This commit fixes this
oversight.
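
The direction of the subset check is easy to get backwards, so here is a small sketch of the intended condition. The function name `ops_supported` and the argument order (target operations first, circuit operations second) are assumptions for illustration; only the set comparison itself reflects the commit message above.

```python
from typing import Set


def ops_supported(target_ops: Set[str], circuit_ops: Set[str]) -> bool:
    """A location is feasible when every operation the circuit uses there
    is available on the target, i.e. circuit_ops is a subset of target_ops."""
    return circuit_ops <= target_ops


def ops_supported_buggy(target_ops: Set[str], circuit_ops: Set[str]) -> bool:
    """The reversed condition described as the typo: target ops subset of circuit ops."""
    return target_ops <= circuit_ops


# The target edge offers more gates than the circuit needs: feasible.
feasible = ops_supported({"cx", "ecr"}, {"cx"})
# The target edge only offers ecr but the circuit uses cx: not feasible.
infeasible = ops_supported({"ecr"}, {"cx"})
# The reversed check wrongly rejects the first case.
buggy_result = ops_supported_buggy({"cx", "ecr"}, {"cx"})
```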
@mtreinish mtreinish added Changelog: New Feature Include in the "Added" section of the changelog and removed on hold Can not fix yet labels Apr 5, 2022
@mtreinish mtreinish changed the title from "[WIP] Add VF2PostLayout pass" to "Add VF2PostLayout pass" Apr 5, 2022
@ajavadia (Member) left a comment

this is nice.

I think we should actually remove the NoiseAwareLayout pass after this. That pass doesn't work as intended anyway because the swap minimization part from the associated paper is not implemented, so the results are often bad. I like this new approach since the swap minimization part is solved on a subgraph and then the subgraph is embedded. The NoiseAwareLayout might be faster (since finding graph isomorphism is hard), but we can optimize when we hit a problem in scaling.

Comment on lines +182 to +191
        mappings = vf2_mapping(
            cm_graph,
            im_graph,
            node_matcher=_target_match,
            edge_matcher=_target_match,
            subgraph=True,
            id_order=False,
            induced=False,
            call_limit=self.call_limit,
        )
Member

does this return ALL subgraphs and then we score them here? this should be pretty expensive right?

Also, say we have a line of 5 qubits and we have chosen the subset 10 to 14. There are two possibilities based on the orientation of the layout: it could be either 10-11-12-13-14 or 14-13-12-11-10. Does it return them both and score them separately, or just one? For, say, a ring of 12 on heavy-hex, there would be 12 orientations.

@mtreinish (Member Author), Apr 22, 2022

It does: mappings here is an iterator, and each step computes a new isomorphic mapping. It can be slow, which is why we set call_limit, the number of internal state visits the function will try, so that we limit how long we wait. In the preset passmanagers we set this to increasingly large numbers so that we spend at most ~100 ms for level 1, ~10 sec for level 2, and ~60 sec for level 3 (we also check on each iteration that we haven't gone over a timeout parameter and break if we have).

As for the orientation it will try both because it is a directed graph and we're using strict edges. This is actually necessary especially for the backendv2/target path because in that path we might not have all gates available in both directions (which is what the matcher functions here are checking). Also the scores can be different because we'd potentially end up with different gate counts on each qubit and 2q link which would change the score between each orientation.
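
The iterate, score, and bail-out pattern from this reply can be sketched as follows. This is a sketch under stated assumptions, not the pass's code: `pick_best_mapping`, `max_trials`, and the toy error numbers are hypothetical, and the generator stands in for the lazy iterator returned by the mapping search. The `call_limit` bound lives inside the search itself and is not modeled here; only the per-iteration timeout check is.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

Mapping = Dict[int, int]


def pick_best_mapping(
    mappings: Iterable[Mapping],
    score: Callable[[Mapping], float],
    time_limit: Optional[float] = None,
    max_trials: Optional[int] = None,
) -> Tuple[Optional[Mapping], float]:
    """Consume mappings lazily, keep the lowest score, stop when a bound is hit."""
    start = time.monotonic()
    best: Optional[Mapping] = None
    best_score = float("inf")
    for trial, mapping in enumerate(mappings, start=1):
        candidate = score(mapping)
        if candidate < best_score:
            best, best_score = mapping, candidate
        if max_trials is not None and trial >= max_trials:
            break
        if time_limit is not None and time.monotonic() - start >= time_limit:
            break
    return best, best_score


# Toy per-qubit error rates; virtual qubit 0 runs twice as many gates as qubit 1,
# so the two orientations of the same qubit pair score differently.
qubit_error = {10: 0.03, 11: 0.01, 12: 0.001, 13: 0.001}


def weighted_error(mapping: Mapping) -> float:
    return 2 * qubit_error[mapping[0]] + qubit_error[mapping[1]]


def lazy_mappings():
    yield {0: 10, 1: 11}
    yield {0: 11, 1: 10}
    yield {0: 12, 1: 13}


capped, capped_score = pick_best_mapping(lazy_mappings(), weighted_error, max_trials=2)
full, full_score = pick_best_mapping(lazy_mappings(), weighted_error)
```

With the cap of two trials the search settles for the better orientation of qubits 10 and 11; without it, the third candidate wins. That trade-off between search time and layout quality is what the increasing limits per optimization level control.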

Member

oh i see, it's looking for directed subgraphs. I think this can be problematic. Suppose I have a circuit that has been laid out on qubits 0 ---> 1 ---> 2 in this direction. So the circuit would be:

----*------
    |
----X--*---
       |
-------X---

Now if these qubits are very noisy but qubits 3 --> 4 <--- 5 are very good, they will not be chosen because they don't have the right direction. But actually fixing direction is trivial. I suggest that choosing layouts should be on undirected graphs (only find good subsets). Then apply a post-post-layout direction fixing pass.

For the case I brought up which is when there are multiple orientations (2 for a line, 12 for a ring of 12), I think this can be an interesting follow-up of choosing the best among those. But already choosing the best subset among many will go a long way.
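
The directed versus undirected question in this comment can be illustrated with a brute-force matcher. It deliberately does not use VF2 or retworkx: it tries every assignment with `itertools.permutations`, which is only viable for tiny graphs. The function name `find_mappings` is hypothetical, and the `strict_direction` parameter borrows the name of the flag added later in this PR purely for readability.

```python
from itertools import permutations
from typing import Dict, Iterable, List, Sequence, Tuple

Edge = Tuple[int, int]


def find_mappings(
    pattern_edges: Sequence[Edge],
    num_pattern_nodes: int,
    target_edges: Iterable[Edge],
    target_nodes: Sequence[int],
    strict_direction: bool = True,
) -> List[Dict[int, int]]:
    """Brute-force (non-induced) subgraph matching for small graphs."""
    allowed = set(target_edges)
    if not strict_direction:
        # Ignore directionality by also allowing every edge reversed.
        allowed |= {(b, a) for a, b in allowed}
    found = []
    for image in permutations(target_nodes, num_pattern_nodes):
        if all((image[a], image[b]) in allowed for a, b in pattern_edges):
            found.append(dict(enumerate(image)))
    return found


# Circuit interactions 0 -> 1 -> 2, on a device with 0 -> 1 -> 2 and 3 -> 4 <- 5.
pattern = [(0, 1), (1, 2)]
device_edges = [(0, 1), (1, 2), (3, 4), (5, 4)]
device_nodes = range(6)

strict = find_mappings(pattern, 3, device_edges, device_nodes, strict_direction=True)
relaxed = find_mappings(pattern, 3, device_edges, device_nodes, strict_direction=False)
```

With strict directions only the original qubits 0, 1, 2 match, so the 3, 4, 5 subset is never scored. Ignoring direction yields both subsets in both orientations, at the cost of needing a direction fix afterwards.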

@mtreinish (Member Author), Apr 22, 2022

Yeah, I considered using undirected edges and GateDirection when I wrote this. The problem is that to do an undirected subgraph search we'd basically need to rerun the transpiler after the routing phase. We can easily run GateDirection after applying the layout to fix this, but then we've potentially injected a bunch of Hadamards into the circuit, so we need the basis translator to convert them to the native basis set. That in turn requires the optimization passes to run, because the basis translator output can likely be simplified. We basically end up rerunning most of the transpiler at that point. Typing all of this out has made me think of a potentially interesting follow-on we can play with, though: we could add a strict direction flag to this pass and then add VF2PostLayout and ApplyLayout to the end of the optimization loop with that flag set to False.

The way I was viewing this pass was: given the hard constraints on the backend, can we find a better qubit selection with lower noise, and if not we don't do anything. So we do miss the opportunity for 3 -> 4 <- 5 if there are no compatible 2q gates in that direction, but it is just a heuristic and that seemed ok. Especially with BackendV2, where the gates are defined per qubit (like in your example, if 0 -> 1 -> 2 was all in cx but 3 -> 4 <- 5 was only ecr). This seemed the better path to start with, since in all my tests it was able to find better layouts. At least for all the current backends with connectivity constraints this won't come up, since they all currently define bidirectional edges with the same error rates.

Member

> We basically end up rerunning most of the transpiler at that point.

If we order the passes right I don't think there will be a duplication. In my mind the order should be something like this:

  1. high-level synthesis (toffolis, cliffords, unitaries, etc.) to reduce the circuit to 1 & 2 qubit gates
  2. layout + routing + post-layout
  3. other parts including 1-& 2-qubit synthesis, optimization, scheduling. Since these just make local changes on 1 or 2 qubits at a time, they don't alter the mapping.

So I was thinking that this PostLayout pass can be done in stage 2 (basically to improve the layout). But if the scoring mechanism relies on the gates exactly being in the Target, then it wouldn't work.
For it to work we would need a more relaxed scoring that can approximate. e.g. if there's a 2-qubit unitary it can assign a score to it based on looking at the 2-qubit error rates on that link. It wouldn't be exact, but I think it would be good enough. Since the whole scoring is approximate anyway. (I think soon the devices will report the native cx direction only btw)

Member Author

I'll give it a try, I'm curious to see the difference between running it at different spots in the pass manager. I'm also wondering if there is value in running post layout more than once in a pipeline, e.g. with looser constraints at the end of 2 and with the stricter constraints after 3.

FWIW, I did this after 3 because I thought it would be better because we have the complete circuit so we can see how many gates get run on each of the qubits and get the full error rates with each layout. Especially since DenseLayout is already noise aware so it should be picking similar qubits already. But it's definitely worth testing and checking to see what makes a bigger impact on result quality.

Member Author

I've been testing this locally running on hardware and most of the time the layouts are the same. But there has been 1 time so far where the layouts were significantly different and the undirected case doing it right after routing was significantly better. So I'm going to adjust the preset pass managers to do it this way in the PR.

Member Author

I added the undirected mode in 3b3c0a0 and started using that in the preset pass managers after routing in 5e90148

Co-authored-by: Ali Javadi-Abhari <ajavadia@users.noreply.github.com>
@nonhermitian (Contributor)

It would be nice if we could mirror the design decisions here in Mapomatic. What I think would be nice is to use the ability to select the best subgraph across multiple backends to pick a target system when a user does not specify the system name in the call to the primitive (which is possible on the cloud but not on IQX at present).

This commit adds a new flag, strict_direction, which can be used to do
an isomorphic match on undirected graphs and use an avg 1q and 2q error
rate for each qubit for scoring.
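
One way to read "an avg 1q and 2q error rate" is sketched below. The exact formula used by the pass is not given in this thread, so everything here is an assumption: the helper names `average_errors` and `layout_score`, the flat error table, and the choice to weight the averages by gate counts and add them up.

```python
from statistics import mean
from typing import Dict, FrozenSet, Tuple, Union

ErrorTable = Dict[Tuple[str, Tuple[int, ...]], float]
Location = Union[int, FrozenSet[int]]


def average_errors(error_rates: ErrorTable) -> Dict[Location, float]:
    """Average error over all gates per qubit (1q) and per unordered pair (2q)."""
    grouped: Dict[Location, list] = {}
    for (_name, qubits), error in error_rates.items():
        key: Location = qubits[0] if len(qubits) == 1 else frozenset(qubits)
        grouped.setdefault(key, []).append(error)
    return {key: mean(values) for key, values in grouped.items()}


def layout_score(
    node_counts: Dict[int, int],
    edge_counts: Dict[Tuple[int, int], int],
    layout: Dict[int, int],
    avg: Dict[Location, float],
) -> float:
    """Weight the averaged error of each mapped qubit and link by its gate count."""
    total = sum(count * avg[layout[q]] for q, count in node_counts.items())
    total += sum(
        count * avg[frozenset((layout[a], layout[b]))]
        for (a, b), count in edge_counts.items()
    )
    return total


error_rates = {
    ("sx", (0,)): 0.001,
    ("x", (0,)): 0.003,
    ("sx", (1,)): 0.002,
    ("cx", (0, 1)): 0.01,
    ("cx", (1, 0)): 0.02,
}
avg = average_errors(error_rates)
# Virtual qubit 0 has three 1q gates, qubit 1 has one, and they share two 2q gates.
total = layout_score({0: 3, 1: 1}, {(0, 1): 2}, {0: 1, 1: 0}, avg)
```

Because the 2q average is keyed by an unordered pair, the score no longer depends on which direction a gate is defined in, which is what allows scoring before the gates are translated to the target basis.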
This commit moves the VF2PostLayout run to right after the routing phase
in the preset passmanagers. This lets us set the strict_direction flag
to False which expands the search space to ignore 2q gate
directionality which will potentially find better mappings.
@mtreinish mtreinish requested a review from ajavadia April 25, 2022 19:40
@ajavadia (Member) left a comment

LGTM!

@mergify mergify bot merged commit 7e83de6 into Qiskit:main Apr 26, 2022
@mtreinish mtreinish deleted the post-layout-ala-mapomatic branch April 26, 2022 23:59
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Apr 27, 2022
In Qiskit#7862 we recently added a new vf2 post layout pass which is designed
to run after routing to improve the layout once we know there is at
least one isomorphic subgraph in the coupling graph for the interactions
in the circuit. In that PR we still ran vf2 post layout even if vf2
layout pass found a match. That's because the heuristic scoring of
layouts used for the vf2 layout and vf2 post layout passes were
different. Originally this difference was due to the vf2 post
layout pass being intended to run after the optimization loop where we
could guarantee the gates were in the target and exactly score the error
for each potential layout. But since the vf2 post layout was updated to
score a layout based on the gate counts for each qubit and the average
1q and 2q instruction error rates we can leverage this better heuristic
scoring in the vf2 layout pass. This commit updates the vf2 layout pass
to use the same heuristic and deduplicates some of the code between the
passes at the same time. Additionally, since the scoring heuristics are
the same the preset pass managers are updated to only run vf2 post
layout if vf2 layout didn't find a match. If vf2 layout finds a match
it's going to be the same as what vf2 post layout finds so there is no
need to run the vf2 post layout pass anymore.
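
Taken together with the earlier rule about user-specified layouts, the gating logic can be summarized in a few lines. This is only a sketch: `should_run_post_layout` and its boolean `vf2_layout_found_match` argument are hypothetical stand-ins for whatever conditions the preset pass managers actually evaluate.

```python
from typing import Optional


def should_run_post_layout(
    initial_layout: Optional[object],
    layout_method: Optional[str],
    vf2_layout_found_match: bool,
) -> bool:
    """Decide whether a post-layout search is worth running."""
    # Respect an explicit user choice of layout or layout method.
    if initial_layout is not None or layout_method is not None:
        return False
    # With a shared scoring heuristic, a match found before routing would be
    # found again after routing, so only search when no match was found.
    return not vf2_layout_found_match
```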
mergify bot added a commit that referenced this pull request May 4, 2022
* Deduplicate and unify VF2 layout passes

* Update apply post layout condition comments

* Remove old layout score function

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>