Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle custom operations with overlapping names in QPY #11646

Merged
merged 5 commits into from
Jan 31, 2024

Conversation

mtreinish
Copy link
Member

Summary

This commit fixes a longstanding bug and limitation in QPY around the use of custom operations. When QPY encounters a non-standard (meaning not in Qiskit itself, or a direct instance of Gate or Instruction) instruction it is not able to represent the exact instance perfectly in that payload. This is because it would require more code that represents those classes or data not contained inside Qiskit itself, which would be an opportunity for arbitrary code execution which is something QPY was designed to prevent (as it's a limitation with pickle serialization). So to serialize these objects in a circuit, QPY stores as much data as it can extract from the data model from the Instruction and Gate classes and builds a table of custom instructions that will be recreated as Instruction or Gate instances on deserialization. In all previous QPY format versions that table was keyed on the .name attribute of the custom instructions in the circuit, the thinking being the name is a unique op code during circuit compilation and if there are multiple circuit elements with the same name they should be the same object. With this assumption it enabled the QPY payloads generated with repeated custom instructions to be smaller and more efficient because we don't need to store potentially repeated data. While that assumption is true during most of compilation it was ignoring that for a bare quantum circuit (or during the initial stages of compilation) this isn't necessarily true and there are compiler passes that canonicalize custom operations prior to that being true. To make it worse this assumption was causing conflicts in the QPY payload and cause an inaccurate reproduction of the original circuit when deserializing a QPY payload in many cases when custom gates were used.

This commit fixes this limitation by introducing a new QPY format version 11, which changes the key used in the custom instruction table to instead of being just the name to {name}_{uuid} where each instance of a custom instruction has a different uuid value. This means that there are never any overlapping definitions in the table and we don't have to worry about the case where two custom instructions have the same name.

Details and comments

Fixes #8941

This commit fixes a longstanding bug and limitation in QPY around the
use of custom operations. When QPY encounters a non-standard (meaning not
in Qiskit itself, or a direct instance of Gate or Instruction)
instruction it is not able to represent the exact instance perfectly in
that payload. This is because it would require more code that represents
those classes or data not contained inside Qiskit itself, which would be
an opportunity for arbitrary code execution which is something QPY was
designed to prevent (as it's a limitation with pickle serialization). So
to serialize these objects in a circuit, QPY stores as much data as it
can extract from the data model from the Instruction and Gate classes
and builds a table of custom instructions that will be recreated as
Instruction or Gate instances on deserialization. In all previous QPY
format versions that table was keyed on the .name attribute of the
custom instructions in the circuit, the thinking being the name is a
unique op code during circuit compilation and if there are multiple
circuit elements with the same name they should be the same object. With
this assumption it enabled the QPY payloads generated with repeated
custom instructions to be smaller and more efficient because we don't
need to store potentially repeated data. While that assumption is true
during most of compilation it was ignoring that for a bare quantum
circuit (or during the initial stages of compilation) this isn't
necessarily true and there are compiler passes that canonicalize custom
operations prior to that being true. To make it worse this assumption
was causing conflicts in the QPY payload and cause an inaccurate
reproduction of the original circuit when deserializing a QPY payload in
many cases when custom gates were used.

This commit fixes this limitation by introducing a new QPY format
version 11, which changes the key used in the custom instruction table
to instead of being just the name to `{name}_{uuid}` where each instance
of a custom instruction has a different uuid value. This means that
there are never any overlapping definitions in the table and we don't
have to worry about the case where two custom instructions have the same
name.

Fixes Qiskit#8941
@mtreinish mtreinish added Changelog: API Change Include in the "Changed" section of the changelog Changelog: Bugfix Include in the "Fixed" section of the changelog mod: qpy Related to QPY serialization labels Jan 25, 2024
@mtreinish mtreinish added this to the 1.0.0 milestone Jan 25, 2024
@mtreinish mtreinish requested a review from a team as a code owner January 25, 2024 23:04
@qiskit-bot
Copy link
Collaborator

One or more of the the following people are requested to review this:

  • @Qiskit/terra-core
  • @mtreinish
  • @nkanazawa1989

@mtreinish
Copy link
Member Author

I wrote this with the intent to combine it with #11644 so there is a version kwarg added to some of the write functions which is not set anywhere externally. Depending on whether this or #11644 merges first we'll want to update the other to ensure that when qpy.dump(version=10) is called it connects to the version kwarg on qiskit.qpy.binary_io.circuits.write_circuit().

@coveralls
Copy link

coveralls commented Jan 25, 2024

Pull Request Test Coverage Report for Build 7726150026

Warning: This coverage report may be inaccurate.

We've detected an issue with your CI configuration that might affect the accuracy of this pull request's coverage report.
To ensure accuracy in future PRs, please see these guidelines.
A quick fix for this PR: rebase it; your next report should be accurate.

  • -1 of 13 (92.31%) changed or added relevant lines in 3 files are covered.
  • 38 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-0.06%) to 89.536%

Changes Missing Coverage Covered Lines Changed/Added Lines %
qiskit/qpy/binary_io/circuits.py 10 11 90.91%
Files with Coverage Reduction New Missed Lines %
crates/qasm2/src/expr.rs 1 93.81%
qiskit/test/base.py 4 78.57%
crates/qasm2/src/lex.rs 9 90.68%
crates/qasm2/src/parse.rs 24 96.23%
Totals Coverage Status
Change from base Build 7721773451: -0.06%
Covered Lines: 59483
Relevant Lines: 66435

💛 - Coveralls

Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, it's a bit of a shame that I didn't think of generalizing the fix in #10809 or at least stripping the trailing _{uuid} (version 10 outputs beautiful gate names like ucrz_dg_0d501712-5682-4a9c-99ea-07453a0a3fe1(-π/2)), but it's a lot better now. I only have two super minor comments. I can approve once the changes from #11644 are merged.

@@ -138,6 +138,45 @@ def test_empty_layout(self):
qc._layout = TranspileLayout(None, None, None)
self.assert_roundtrip_equal(qc)

def test_overlapping_definitions(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make more sense to have this test be part of TestLoadFromQPY(in test/python/circuit/test_circuit_load_from_qpy.py) rather than TestLayout? Could this be an opportunity to merge both test files under test/python/qpy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but for this PR I just left it as is. I'll push a follow up PR after this merges that unifies the tests into a single module under tests/python/qpy

@mtreinish mtreinish requested a review from ElePT January 31, 2024 12:38
Copy link
Contributor

@ElePT ElePT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ElePT ElePT enabled auto-merge January 31, 2024 13:36
@ElePT ElePT added this pull request to the merge queue Jan 31, 2024
@jakelishman jakelishman removed this pull request from the merge queue due to a manual request Jan 31, 2024
@jakelishman
Copy link
Member

I disembarked this so that #10998 could go first (because it'll have conflicts with other things), but that's also just joined the queue, so I can put this back in now.

@jakelishman jakelishman added this pull request to the merge queue Jan 31, 2024
Merged via the queue into Qiskit:main with commit 588f2df Jan 31, 2024
12 checks passed
@mtreinish mtreinish deleted the fix-qpy-custom-gates branch January 31, 2024 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: API Change Include in the "Changed" section of the changelog Changelog: Bugfix Include in the "Fixed" section of the changelog mod: qpy Related to QPY serialization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

qpy mishandles the .definition of custom gates with the same .name but different params
5 participants