Use singletons for standard library unparameterized, non-controlled gates #10314

Merged
merged 37 commits into from
Sep 19, 2023

mtreinish
Member

@mtreinish mtreinish commented Jun 20, 2023

Summary

This commit adds a new class, SingletonGate, a Gate subclass that by default reuses a single shared instance for all instances of a particular class. This greatly reduces the memory overhead and significantly improves the construction speed when making multiple instances of the same gate. The tradeoff is flexibility of use, because it precludes having any potentially mutable state in the shared instance. This is a large change to the data model of Qiskit: previously it could be assumed that any gate instance was unique and that modifying it in isolation had no unintended side effects (for example, doing XGate().label = 'foo' wouldn't potentially break other code). To limit the impact of this, instances of SingletonGate do not allow mutation of an existing instance. This can (and likely will) cause unexpected issues once the class is released. Specifically, code that used to be valid will now raise an exception because it is operating on a shared instance. This is evident from the code modifications needed across most of the Qiskit code base to enable working with instances of SingletonGate.

The knock-on effects downstream are likely significant, and managing how we roll this feature out is going to be equally, if not more, important than the feature itself. This is why I'm not personally convinced we want to do everything this commit includes in a single release. I've opened this as a pull request primarily to start the conversation on how to do the rollout so that we minimize the potential breakage and notify downstream users about it. The primary issue I have is that this doesn't really follow the Qiskit deprecation policy: there is no user-facing notification or documentation of this pending change, and code that worked in the previous release will not work in the release with this feature.

For some aspects of this change (primarily the setters on gate attributes) this can easily be handled by deprecating them in the planned singleton standard library gates and waiting the necessary amount of time. But the more fundamental data model changes are hard to announce ahead of time. We can have a release note saying it is coming in the future, but it will end up being very abstract, and users will not necessarily be able to act on it ahead of time without concrete examples to test with. This was an issue for me in developing this feature, as I couldn't anticipate where API breakages would occur until I switched over all the standard library gates, and there still might be others.
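
To make the data model change concrete, here is a minimal pure-Python sketch of the pattern described above: one shared instance per class, with mutation rejected. The class names (SingletonGateSketch, XGateSketch, HGateSketch) and the choice of TypeError are assumptions made for illustration; this is not the SingletonGate implementation from this PR.

```python
class SingletonGateSketch:
    """Hypothetical stand-in for a singleton gate (not Qiskit's code)."""

    def __new__(cls):
        # Look the instance up on this exact class rather than a parent,
        # so every subclass gets its own shared object.
        if cls.__dict__.get("_instance") is None:
            inst = super().__new__(cls)
            # Bypass our own __setattr__ once, to set the initial state.
            object.__setattr__(inst, "label", None)
            cls._instance = inst
        return cls._instance

    def __setattr__(self, name, value):
        # The shared instance is visible to every user of the class,
        # so any mutation is refused instead of silently leaking.
        raise TypeError(f"cannot set {name!r} on a shared singleton instance")


class XGateSketch(SingletonGateSketch):
    pass


class HGateSketch(SingletonGateSketch):
    pass


x1 = XGateSketch()
x2 = XGateSketch()
same_object = x1 is x2                         # both names refer to one object
separate_classes = XGateSketch() is not HGateSketch()

try:
    x1.label = "foo"                           # the old idiom now fails loudly
    mutation_rejected = False
except TypeError:
    mutation_rejected = True
```

In this sketch, code that used to write to a gate attribute gets an exception rather than changing every other user's gate, which is the kind of behavior change the rollout discussion is about.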

Due to the large performance gains this offers, and also in the interest of testing the API implications of using singleton gates, the unparameterized and non-controlled gates available in qiskit.circuit.library.standard_gates are all updated to be subclasses of SingletonGate. In aggregate this makes construction and copying roughly 6x faster, and circuits composed solely of these gates consume 1/4 of the memory they did before. But it also exposed a large number of internal changes we needed to make to the wider circuit, QPY, qasm2, dagcircuit, transpiler, converters, and test modules to support working with singleton gates.
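
The source of the memory saving can be illustrated without Qiskit. The sketch below uses two hypothetical classes (PlainGate and SharedGate, both invented for this example) and counts how many distinct objects exist after constructing the same gate many times. It does not reproduce the 6x or 1/4 figures above, which come from measurements on real circuits.

```python
class PlainGate:
    """Hypothetical gate that allocates a fresh object on every call."""

    def __init__(self):
        self.name = "x"
        self.label = None


class SharedGate:
    """Hypothetical gate that hands back one shared object."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            inst = super().__new__(cls)
            inst.name = "x"
            inst.label = None
            cls._instance = inst
        return cls._instance


def distinct_objects(gate_cls, n):
    """Build n gates, keep them alive, and count the distinct objects."""
    gates = [gate_cls() for _ in range(n)]
    return len({id(g) for g in gates})


plain_count = distinct_objects(PlainGate, 1000)    # one object per gate
shared_count = distinct_objects(SharedGate, 1000)  # one object in total
```

A circuit built from the shared variant only stores repeated references to a single object, and copying it never needs to copy the gate itself.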

Besides this, there are a couple of seemingly unrelated API changes in this PR; they are caused by inconsistencies in the Instruction/Gate API that were preventing this from working. The first is that the ECRGate class was missing the label kwarg that its parent class exposes. Similarly, all Gate classes and subclasses were missing duration and unit kwargs on their constructors. These are necessary to be able to use these fields on singletons, because we need an interface to construct an instance that already has the state set, so that we avoid using the global shared instance. In the release notes I labeled these as bugfixes, because in all these cases the parent classes were exposing these interfaces and it was primarily an oversight that they were missing in these places. But personally, this does seem more like something we'd normally document as a feature rather than a bugfix.
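
The reason the constructor kwargs matter can be sketched as follows: if state cannot be set after construction, it has to be passed in at construction time, and any non-default state has to produce a separate, non-shared object. GateSketch and its default values are assumptions made for this illustration, not the code in this PR.

```python
class GateSketch:
    """Hypothetical gate: shared when built with defaults, unique otherwise."""

    def __new__(cls, label=None, duration=None, unit="dt"):
        if label is None and duration is None and unit == "dt":
            # Default state: hand out the one shared instance for this class.
            if cls.__dict__.get("_shared") is None:
                cls._shared = super().__new__(cls)
            return cls._shared
        # Custom state: build a fresh object so the shared one is untouched.
        return super().__new__(cls)

    def __init__(self, label=None, duration=None, unit="dt"):
        # Runs on every call; for the shared instance it only ever
        # re-applies the same default values.
        self.label = label
        self.duration = duration
        self.unit = unit


default_a = GateSketch()
default_b = GateSketch()
labelled = GateSketch(label="foo")

shares_default = default_a is default_b        # defaults reuse one object
labelled_is_separate = labelled is not default_a
```

With an interface like this, a caller who needs a label or duration asks for it up front and never touches the instance that everyone else shares.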

A follow-up PR will add a SingletonControlledGate class, which will be similar to SingletonGate but will offer two singleton instances based on the value of ctrl_state (and also handle nested labels and other nested mutable state in the base gate). We can then update standard library gates like CXGate and CHGate to also be singletons. The ctrl_state attribute is primarily why these gates were not included in this commit.

Details and comments

Related to: #5895 and #6991
Fixes: #3800

@mtreinish mtreinish added the priority: high, performance, Changelog: New Feature, and Changelog: API Change labels Jun 20, 2023
@mtreinish mtreinish added this to the 0.25.0 milestone Jun 20, 2023
@mtreinish mtreinish requested a review from a team as a code owner June 20, 2023 16:27
@qiskit-bot
Collaborator

One or more of the following people are requested to review this:

  • @Cryoris
  • @Qiskit/terra-core
  • @ajavadia
  • @mtreinish
  • @nkanazawa1989

@mtreinish
Member Author

If the giant commit message isn't clear enough, I'm not entirely convinced we should do all of this in 0.25.0. I opened this PR primarily to show what the end state looks like and to start a discussion about exactly how we roll this out to users in a controlled manner. Specifically, which parts of this do we want to include in 0.25.0, and how do we plan to roll the rest out in subsequent releases?

@mtreinish
Member Author

mtreinish commented Jun 20, 2023

Sigh, it looks like there are a bunch of failures when running with Python 3.8. I was testing locally with 3.11 and everything passed. I'll dig into what's causing the differences on 3.8. Fixed in: 330beb8

@mtreinish
Member Author

I quickly ran a subset of the asv benchmarks to get a feeling for the performance. The benefits are more drastic the more repeated instances of singleton gates we have to create, especially when there are copies. (The ~6x number I came up with in the commit message was for a 100 qubit circuit with a depth of 100, composed solely of singleton gates and involving multiple copies.)

Benchmarks that have improved:

       before           after         ratio
     [9ef34b7a]       [70874622]
     <main>       <singleton-gates-poc>
-         218±1ms          197±1ms     0.90  queko.QUEKOTranspilerBench.time_transpile_bss(2, None)
-        25.2±4ms      22.2±0.07ms     0.88  queko.QUEKOTranspilerBench.time_transpile_bigd(2, None)
-        135±20ms        115±0.6ms     0.85  queko.QUEKOTranspilerBench.time_transpile_bigd(3, None)
-        137±50ms        116±0.4ms     0.85  queko.QUEKOTranspilerBench.time_transpile_bntf(2, None)
-        15.1±3ms       12.4±0.2ms     0.82  queko.QUEKOTranspilerBench.time_transpile_bigd(0, 'sabre')
-       87.5±20ms       70.6±0.3ms     0.81  queko.QUEKOTranspilerBench.time_transpile_bntf(1, None)
-        171±60ms        136±0.3ms     0.79  queko.QUEKOTranspilerBench.time_transpile_bntf(1, 'sabre')
-       1.10±0.4s          848±4ms     0.77  queko.QUEKOTranspilerBench.time_transpile_bntf(3, None)
-        124±70ms       92.0±0.8ms     0.74  queko.QUEKOTranspilerBench.time_transpile_bntf(0, 'sabre')
-       506±300ms          359±2ms     0.71  queko.QUEKOTranspilerBench.time_transpile_bntf(2, 'sabre')
-      37.0±0.2μs       24.9±0.3μs     0.67  circuit_construction.CircuitConstructionBench.time_circuit_copy(5, 8)
-      35.3±0.1μs       22.9±0.2μs     0.65  circuit_construction.CircuitConstructionBench.time_circuit_copy(2, 8)
-      56.1±0.5μs       35.7±0.3μs     0.64  circuit_construction.CircuitConstructionBench.time_circuit_copy(8, 8)
-      92.4±0.4μs       57.6±0.3μs     0.62  circuit_construction.CircuitConstructionBench.time_circuit_copy(14, 8)
-         428±3μs        245±0.9μs     0.57  circuit_construction.CircuitConstructionBench.time_circuit_copy(8, 128)
-     6.35±0.03ms         3.61±0ms     0.57  circuit_construction.CircuitConstructionBench.time_circuit_copy(14, 2048)
-         422±1μs        239±0.5μs     0.57  circuit_construction.CircuitConstructionBench.time_circuit_copy(5, 128)
-         436±5μs        246±0.3μs     0.56  circuit_construction.CircuitConstructionBench.time_circuit_copy(14, 128)
-     6.40±0.04ms      3.58±0.02ms     0.56  circuit_construction.CircuitConstructionBench.time_circuit_copy(8, 2048)
-        143±40μs         80.1±1μs     0.56  circuit_construction.CircuitConstructionBench.time_circuit_copy(20, 8)
-      25.7±0.2ms      14.3±0.09ms     0.56  circuit_construction.CircuitConstructionBench.time_circuit_copy(8, 8192)
-      25.9±0.3ms      14.4±0.09ms     0.56  circuit_construction.CircuitConstructionBench.time_circuit_copy(14, 8192)
-     6.34±0.03ms      3.52±0.01ms     0.55  circuit_construction.CircuitConstructionBench.time_circuit_copy(5, 2048)
-      26.3±0.4μs       14.4±0.1μs     0.55  circuit_construction.CircuitConstructionBench.time_circuit_copy(1, 8)
-      25.8±0.2ms      14.1±0.03ms     0.55  circuit_construction.CircuitConstructionBench.time_circuit_copy(5, 8192)
-         378±1μs        204±0.5μs     0.54  circuit_construction.CircuitConstructionBench.time_circuit_copy(2, 128)
-       529±200ms          282±5ms     0.53  circuit_construction.CircuitConstructionBench.time_circuit_copy(14, 131072)
-      23.5±0.3ms      12.4±0.06ms     0.53  circuit_construction.CircuitConstructionBench.time_circuit_copy(2, 8192)
-     5.86±0.04ms      3.08±0.01ms     0.53  circuit_construction.CircuitConstructionBench.time_circuit_copy(2, 2048)
-        530±10ms          277±5ms     0.52  circuit_construction.CircuitConstructionBench.time_circuit_copy(8, 131072)
-         519±9ms          268±1ms     0.52  circuit_construction.CircuitConstructionBench.time_circuit_copy(5, 131072)
-       547±200ms          279±3ms     0.51  circuit_construction.CircuitConstructionBench.time_circuit_copy(20, 131072)
-       123±0.8ms       62.3±0.4ms     0.51  circuit_construction.CircuitConstructionBench.time_circuit_copy(14, 32768)
-         123±2ms       62.2±0.6ms     0.50  circuit_construction.CircuitConstructionBench.time_circuit_copy(8, 32768)
-       122±0.9ms       59.9±0.3ms     0.49  circuit_construction.CircuitConstructionBench.time_circuit_copy(5, 32768)
-         484±9ms          235±4ms     0.48  circuit_construction.CircuitConstructionBench.time_circuit_copy(2, 131072)
-       611±100μs        285±0.6μs     0.47  circuit_construction.CircuitConstructionBench.time_circuit_copy(20, 128)
-        139±20ms       63.2±0.5ms     0.46  circuit_construction.CircuitConstructionBench.time_circuit_copy(20, 32768)
-       114±0.8ms       51.5±0.2ms     0.45  circuit_construction.CircuitConstructionBench.time_circuit_copy(2, 32768)
-        9.20±3ms      3.65±0.02ms     0.40  circuit_construction.CircuitConstructionBench.time_circuit_copy(20, 2048)
-       38.2±10ms       14.7±0.4ms     0.38  circuit_construction.CircuitConstructionBench.time_circuit_copy(20, 8192)
-         286±2μs        109±0.5μs     0.38  circuit_construction.CircuitConstructionBench.time_circuit_copy(1, 128)
-     4.43±0.02ms         1.60±0ms     0.36  circuit_construction.CircuitConstructionBench.time_circuit_copy(1, 2048)
-      17.9±0.1ms      6.39±0.02ms     0.36  circuit_construction.CircuitConstructionBench.time_circuit_copy(1, 8192)
-        83.7±2ms      25.5±0.07ms     0.30  circuit_construction.CircuitConstructionBench.time_circuit_copy(1, 32768)
-         357±6ms        104±0.8ms     0.29  circuit_construction.CircuitConstructionBench.time_circuit_copy(1, 131072)
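
As a reading aid for the table above: the ratio column is the "after" time divided by the "before" time, so values below 1 mean the branch is faster. A small sketch of the arithmetic using the last row (357 ms before, 104 ms after); the helper name is made up for this example.

```python
def asv_ratio(before, after):
    """Ratio as printed in the table: after / before (lower is better)."""
    return after / before


# Last row of the table: time_circuit_copy(1, 131072)
ratio = asv_ratio(before=357.0, after=104.0)
speedup = 1 / ratio                  # how many times faster the branch is

print(f"ratio={ratio:.2f}, speedup={speedup:.1f}x")
```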

The regressions on the 1 qubit case I think are an artifact of the first time gate creation being marginally slower for singletons.

Benchmarks that have stayed the same:

       before           after         ratio
     [9ef34b7a]       [70874622]
     <main>       <singleton-gates-poc>
       13.1±0.1ms         13.6±2ms     1.04  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(2, 'translator')
        103±0.6μs          106±1μs     1.03  circuit_construction.CircuitConstructionBench.time_circuit_construction(2, 8)
      2.55±0.01ms      2.59±0.04ms     1.01  qft.QftTranspileBench.time_ibmq_backend_transpile(1)
       53.7±0.4ms         54.3±1ms     1.01  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(5, 'translator')
        107±0.4ms          108±2ms     1.01  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'noise_adaptive')
         225±10ms          226±9ms     1.00  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(8, 'synthesis')
       23.8±0.2ms       23.9±0.2ms     1.00  qft.QftTranspileBench.time_ibmq_backend_transpile(8)
      7.96±0.08ms      7.98±0.09ms     1.00  qft.QftTranspileBench.time_ibmq_backend_transpile(3)
          123±5ms        123±0.8ms     1.00  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(1)
       74.6±0.6ms       74.7±0.2ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'noise_adaptive')
          131±1ms          131±1ms     1.00  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(0)
      1.19±0.02ms      1.19±0.01ms     1.00  circuit_construction.CircuitConstructionBench.time_circuit_construction(2, 128)
       72.3±0.6ms       72.2±0.4ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'dense')
          241±2ms        240±0.8ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'sabre')
        133±0.2ms        132±0.3ms     1.00  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'noise_adaptive')
       70.4±0.5ms       70.0±0.4ms     0.99  qft.QftTranspileBench.time_ibmq_backend_transpile(14)
      4.09±0.03ms      4.07±0.04ms     0.99  qft.QftTranspileBench.time_ibmq_backend_transpile(2)
        159±0.3ms        158±0.3ms     0.99  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(8, 'translator')
        175±0.8ms          174±1ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'sabre')
       83.8±0.3ms       83.1±0.1ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'noise_adaptive')
      8.58±0.07ms       8.51±0.2ms     0.99  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(2, 'synthesis')
      61.2±0.08ms       60.6±0.2ms     0.99  qft.QftTranspileBench.time_ibmq_backend_transpile(13)
       18.3±0.2ms      18.2±0.06ms     0.99  circuit_construction.CircuitConstructionBench.time_circuit_construction(2, 2048)
       60.2±0.4ms       59.6±0.2ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'dense')
       64.2±0.3ms       63.6±0.3ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'dense')
       82.1±0.4ms       81.1±0.4ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'noise_adaptive')
         17.0±3ms         16.8±2ms     0.99  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(3, 'synthesis')
       62.0±0.3ms       61.2±0.6ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'noise_adaptive')
          181±1ms          178±1ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'noise_adaptive')
         84.9±1ms       83.8±0.2ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'sabre')
        132±0.6ms        130±0.4ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'dense')
       66.2±0.6ms       65.4±0.3ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'noise_adaptive')
        102±0.3ms        101±0.5ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'sabre')
          198±1ms          195±2ms     0.99  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(2)
          827±1ms          816±6ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'sabre')
          679±3ms          669±3ms     0.99  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'sabre')
       2.15±0.03s       2.11±0.05s     0.99  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(20, 'synthesis')
       4.48±0.04s       4.41±0.01s     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(3)
       1.01±0.01s          990±6ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'sabre')
       80.5±0.4ms       79.1±0.2ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'dense')
       60.2±0.3ms       59.1±0.2ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'sabre', 'sabre')
         900±30ms         884±30ms     0.98  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(14, 'synthesis')
          127±2ms        125±0.6ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'sabre')
       1.69±0.01s       1.66±0.01s     0.98  queko.QUEKOTranspilerBench.time_transpile_bntf(3, 'sabre')
      2.48±0.02ms      2.44±0.02ms     0.98  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(1, 'translator')
          138±2ms        136±0.6ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'dense')
          394±4ms        387±0.8ms     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(2)
       78.7±0.3ms       77.3±0.3ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'noise_adaptive')
        232±0.9ms          228±1ms     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(1)
          390±4ms          382±2ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'dense')
         60.8±3ms         59.6±4ms     0.98  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(5, 'synthesis')
       89.5±0.9ms       87.7±0.3ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'sabre')
          649±4ms          636±5ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'dense')
          106±1ms        104±0.7ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'dense')
          878±9ms          860±4ms     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(1)
       1.09±0.01s          1.07±0s     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'noise_adaptive')
       1.74±0.01s       1.71±0.02s     0.98  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(27, 'translator')
          983±5ms          963±4ms     0.98  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(20, 'translator')
          519±5ms          508±4ms     0.98  transpiler_benchmarks.TranspilerBenchSuite.time_compile_from_large_qasm
         88.6±1ms       86.7±0.2ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'noise_adaptive')
       75.1±0.3ms       73.4±0.1ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'dense')
          298±3ms          292±2ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'dense')
          306±6ms          299±2ms     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_schedule_qv_14_x_14(0)
          236±2ms          230±2ms     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(1)
        262±0.9ms          256±1ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'dense')
       1.99±0.01s       1.94±0.01s     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(2)
       86.1±0.5ms       84.1±0.4ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'sabre', 'dense')
          674±4ms          658±5ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'noise_adaptive')
          125±1ms        122±0.5ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'dense')
       1.40±0.01s          1.37±0s     0.98  transpiler_levels.TranspilerLevelBenchmarks.time_quantum_volume_transpile_50_x_20(0)
       1.42±0.01s       1.39±0.01s     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'dense')
          390±2ms          381±3ms     0.98  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'sabre')
          483±2ms          471±2ms     0.97  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(14, 'translator')
       1.45±0.01s       1.41±0.01s     0.97  queko.QUEKOTranspilerBench.time_transpile_bss(0, None)
        201±0.5ms        196±0.8ms     0.97  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(2)
          300±3ms          293±3ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'sabre')
          250±7ms          243±1ms     0.97  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(3)
          147±1ms        143±0.3ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'sabre', 'sabre')
         73.7±3ms       71.8±0.3ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'sabre', 'sabre')
          260±3ms          253±2ms     0.97  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(0)
       61.2±0.4ms       59.6±0.3ms     0.97  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm(0)
          279±4ms          272±7ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(2, 'stochastic', 'noise_adaptive')
         82.6±2ms       80.4±0.3ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'sabre', 'dense')
          468±4ms          455±6ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'stochastic', 'noise_adaptive')
          228±5ms          221±2ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'dense')
          892±7ms          868±3ms     0.97  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_qv_14_x_14(3)
          671±7ms          652±8ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'sabre', 'noise_adaptive')
       1.23±0.01s       1.20±0.01s     0.97  circuit_construction.CircuitConstructionBench.time_circuit_construction(2, 131072)
        125±0.9ms        122±0.5ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'sabre')
          123±1ms        120±0.7ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'sabre', 'noise_adaptive')
          197±1ms          191±2ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'noise_adaptive')
         75.5±2ms       73.2±0.2ms     0.97  circuit_construction.CircuitConstructionBench.time_circuit_construction(5, 8192)
          127±1ms        123±0.3ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'stochastic', 'noise_adaptive')
          111±1ms        108±0.5ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'stochastic', 'sabre')
      1.30±0.01ms      1.25±0.01ms     0.97  circuit_construction.CircuitConstructionBench.time_circuit_construction(5, 128)
       82.0±0.6ms       79.3±0.3ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(2, 'sabre', 'sabre')
      3.58±0.08ms       3.46±0.1ms     0.97  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(1, 'synthesis')
       1.57±0.01s       1.52±0.01s     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'sabre')
          173±4ms          167±1ms     0.97  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(1, 'stochastic', 'dense')
       1.49±0.01s       1.43±0.01s     0.96  queko.QUEKOTranspilerBench.time_transpile_bss(3, None)
          228±4ms          220±3ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'noise_adaptive')
          172±1μs          166±2μs     0.96  circuit_construction.CircuitConstructionBench.time_circuit_construction(8, 8)
        169±0.9ms        163±0.3ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'sabre')
       66.8±0.2ms       64.3±0.5ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(1, 'sabre', 'sabre')
          233±4ms          224±1ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(1, 'stochastic', 'sabre')
       2.03±0.02s       1.95±0.01s     0.96  queko.QUEKOTranspilerBench.time_transpile_bss(3, 'sabre')
          258±5ms          248±2ms     0.96  transpiler_levels.TranspilerLevelBenchmarks.time_transpile_from_large_qasm_backend_with_prop(3)
       3.47±0.1ms      3.34±0.04ms     0.96  transpiler_benchmarks.TranspilerBenchSuite.time_single_gate_compile
          1.35±0s       1.29±0.01s     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'stochastic', 'dense')
          263±2ms          252±1ms     0.96  transpiler_levels.TranspilerLevelBenchmarks.time_schedule_qv_14_x_14(1)
          305±1ms          292±1ms     0.96  circuit_construction.CircuitConstructionBench.time_circuit_construction(2, 32768)
       1.34±0.01s       1.28±0.01s     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(3, 'stochastic', 'noise_adaptive')
          212±2ms          203±1ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'sabre')
       11.5±0.1ms       11.0±0.2ms     0.96  qft.QftTranspileBench.time_ibmq_backend_transpile(5)
         76.9±2ms       73.5±0.2ms     0.96  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'sabre', 'sabre')
       19.2±0.3ms       18.3±0.1ms     0.96  circuit_construction.CircuitConstructionBench.time_circuit_construction(5, 2048)
         75.1±1ms       71.8±0.5ms     0.96  circuit_construction.CircuitConstructionBench.time_circuit_construction(2, 8192)
       1.27±0.01s       1.21±0.02s     0.96  circuit_construction.CircuitConstructionBench.time_circuit_construction(5, 131072)
        111±0.3μs          106±2μs     0.96  circuit_construction.CircuitConstructionBench.time_circuit_construction(5, 8)
          217±2ms          207±2ms     0.96  queko.QUEKOTranspilerBench.time_transpile_bss(1, 'sabre')
          315±3ms          300±2ms     0.95  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'sabre')
          281±2ms          268±2ms     0.95  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'dense')
          289±3μs          276±4μs     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(14, 8)
       76.1±0.8ms         72.5±1ms     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(8, 8192)
          408±4μs          389±3μs     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(20, 8)
         856±20ms          815±6ms     0.95  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(3, 'sabre', 'dense')
          313±2ms          298±2ms     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(5, 32768)
       19.2±0.2ms      18.3±0.08ms     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(8, 2048)
       74.3±0.9ms       70.6±0.6ms     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(14, 8192)
          104±2ms       98.8±0.5ms     0.95  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'dense')
          219±3ms          208±2ms     0.95  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'dense')
      18.6±0.06ms       17.7±0.4ms     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(14, 2048)
       1.25±0.01s       1.18±0.01s     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(8, 131072)
      1.32±0.02ms      1.25±0.01ms     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(8, 128)
          312±3ms          295±3ms     0.95  circuit_construction.CircuitConstructionBench.time_circuit_construction(8, 32768)
          277±2ms          262±1ms     0.94  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(2, 'stochastic', 'noise_adaptive')
         416±10ms          393±1ms     0.94  queko.QUEKOTranspilerBench.time_transpile_bss(2, 'sabre')
          139±1ms          131±1ms     0.94  queko.QUEKOTranspilerBench.time_transpile_bss(0, 'sabre')
          305±7ms          288±4ms     0.94  circuit_construction.CircuitConstructionBench.time_circuit_construction(14, 32768)
       19.0±0.8ms       17.9±0.6ms     0.94  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(3, 'translator')
          320±6ms          302±1ms     0.94  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(3, 'sabre', 'noise_adaptive')
          164±4ms          154±1ms     0.94  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_qft_16(0, 'stochastic', 'dense')
       1.22±0.02s          1.15±0s     0.94  circuit_construction.CircuitConstructionBench.time_circuit_construction(20, 131072)
         640±10ms         602±20ms     0.94  randomized_benchmarking.RandomizedBenchmarkingBenchmark.time_ibmq_backend_transpile([0, 1])
       1.23±0.01s       1.16±0.01s     0.94  circuit_construction.CircuitConstructionBench.time_circuit_construction(14, 131072)
          106±1ms       99.9±0.6ms     0.94  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_179(0, 'stochastic', 'noise_adaptive')
          217±2ms          203±2ms     0.94  transpiler_qualitative.TranspilerQualitativeBench.time_transpile_time_cnt3_5_180(0, 'stochastic', 'noise_adaptive')
         75.9±1ms         71.2±1ms     0.94  circuit_construction.CircuitConstructionBench.time_circuit_construction(20, 8192)
       19.2±0.1ms       18.0±0.3ms     0.94  circuit_construction.CircuitConstructionBench.time_circuit_construction(20, 2048)
          307±5ms          287±5ms     0.94  circuit_construction.CircuitConstructionBench.time_circuit_construction(20, 32768)
       3.64±0.02s       3.40±0.01s     0.94  randomized_benchmarking.RandomizedBenchmarkingBenchmark.time_ibmq_backend_transpile_single_thread([0, 1])
      1.31±0.01ms      1.22±0.01ms     0.93  circuit_construction.CircuitConstructionBench.time_circuit_construction(14, 128)
          332±7ms          309±5ms     0.93  randomized_benchmarking.RandomizedBenchmarkingBenchmark.time_ibmq_backend_transpile([0])
          1.20±0s          1.11±0s     0.93  randomized_benchmarking.RandomizedBenchmarkingBenchmark.time_ibmq_backend_transpile_single_thread([0])
      1.51±0.02ms      1.41±0.01ms     0.93  circuit_construction.CircuitConstructionBench.time_circuit_construction(20, 128)
        4.64±0.3s       4.29±0.02s     0.92  quantum_volume.QuantumVolumeBenchmark.time_ibmq_backend_transpile(27, 'synthesis')
       3.92±0.4ms      3.61±0.03ms     0.92  transpiler_benchmarks.TranspilerBenchSuite.time_cx_compile
          127±3ms        116±0.5ms     0.91  queko.QUEKOTranspilerBench.time_transpile_bss(1, None)
         18.8±3ms       16.3±0.2ms    ~0.87  queko.QUEKOTranspilerBench.time_transpile_bigd(1, 'sabre')
         143±40ms          123±1ms    ~0.86  queko.QUEKOTranspilerBench.time_transpile_bigd(3, 'sabre')
         259±50ms          222±2ms    ~0.86  queko.QUEKOTranspilerBench.time_transpile_bntf(0, None)
         28.2±6ms       23.7±0.5ms    ~0.84  queko.QUEKOTranspilerBench.time_transpile_bigd(2, 'sabre')
         17.0±4ms       13.3±0.1ms    ~0.78  queko.QUEKOTranspilerBench.time_transpile_bigd(1, None)
        38.0±10ms       28.3±0.1ms    ~0.74  queko.QUEKOTranspilerBench.time_transpile_bigd(0, None)

Benchmarks that have got worse:

       before           after         ratio
     [9ef34b7a]       [70874622]
     <main>       <singleton-gates-poc>
+      41.8±0.2ms       47.7±0.6ms     1.14  circuit_construction.CircuitConstructionBench.time_circuit_construction(1, 8192)
+         697±5μs          791±6μs     1.13  circuit_construction.CircuitConstructionBench.time_circuit_construction(1, 128)
+         170±1ms          193±2ms     1.13  circuit_construction.CircuitConstructionBench.time_circuit_construction(1, 32768)
+      10.5±0.1ms       11.9±0.1ms     1.13  circuit_construction.CircuitConstructionBench.time_circuit_construction(1, 2048)
+         690±5ms          769±5ms     1.12  circuit_construction.CircuitConstructionBench.time_circuit_construction(1, 131072)
+        60.8±1μs       67.2±0.6μs     1.11  circuit_construction.CircuitConstructionBench.time_circuit_construction(1, 8)

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE DECREASED.

There are some differences in how the inspect stdlib module behaves
between Python 3.8 and newer versions of Python. This was causing
divergence in the test and qpy behavior where inspect was used to
determine different aspects of a gate (either whether label was
supported as an arg or the number of free parameters). This commit
fixes this by adjusting the code to handle both newer versions of
inspect and older ones.
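
A minimal sketch of the kind of introspection described in this commit message, assuming the goal is to ask whether a gate class accepts a `label` argument and how many required positional parameters it takes. `FakeGateWithLabel`, `FakeGateWithoutLabel`, `accepts_label`, and `count_free_params` are hypothetical names for illustration, not Qiskit code. The sketch inspects `__init__` directly rather than the class object; the source does not say which `inspect` behavior differed, so treating this as the relevant difference is an assumption.

```python
import inspect


class FakeGateWithLabel:
    """Hypothetical gate whose constructor accepts a label."""

    def __init__(self, theta, label=None):
        self.theta = theta
        self.label = label


class FakeGateWithoutLabel:
    """Hypothetical gate whose constructor has no label argument."""

    def __init__(self, theta, phi):
        self.theta = theta
        self.phi = phi


def accepts_label(cls):
    """Return True if cls.__init__ declares a parameter named 'label'."""
    params = inspect.signature(cls.__init__).parameters
    return "label" in params


def count_free_params(cls):
    """Count required positional parameters of cls.__init__, excluding self."""
    params = inspect.signature(cls.__init__).parameters
    return sum(
        1
        for name, p in params.items()
        if name != "self"
        and p.default is inspect.Parameter.empty
        and p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
    )
```

With these definitions, `accepts_label(FakeGateWithLabel)` is true while `count_free_params(FakeGateWithLabel)` counts only `theta`, because `label` has a default.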
@coveralls

coveralls commented Jun 20, 2023

Pull Request Test Coverage Report for Build 5751485044

  • 269 of 281 (95.73%) changed or added relevant lines in 29 files are covered.
  • 20 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.006%) to 85.91%

Changes Missing Coverage            Covered Lines    Changed/Added Lines    %
qiskit/qpy/binary_io/circuits.py    16               17                     94.12%
qiskit/dagcircuit/dagcircuit.py     14               16                     87.5%
qiskit/converters/ast_to_dag.py     4                8                      50.0%
qiskit/circuit/singleton_gate.py    77               82                     93.9%

Files with Coverage Reduction       New Missed Lines    %
qiskit/converters/ast_to_dag.py     1                   86.26%
crates/qasm2/src/lex.rs             7                   91.14%
crates/qasm2/src/parse.rs           12                  97.13%

Totals
Change from base Build 5750977952: -0.006%
Covered Lines: 73212
Relevant Lines: 85219

💛 - Coveralls

This commit adds two members to the SingletonGate class: mutable and
to_mutable(). mutable is a property that reports whether a particular
instance is mutable or the shared singleton instance. to_mutable()
returns a mutable copy of the gate.
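
A minimal sketch of how a `mutable` property and a `to_mutable()` method could fit together, assuming the shared instance is created lazily in `__new__` and that passing a `label` forces a separate object. `SketchSingleton` and `SketchX` are hypothetical stand-ins; this is not the SingletonGate implementation from the PR.

```python
class SketchSingleton:
    """Hypothetical base class: one shared instance per subclass by default."""

    _instance = None

    def __new__(cls, label=None):
        if label is None:
            # Default construction: hand back the per-class shared instance.
            if cls.__dict__.get("_instance") is None:
                inst = super().__new__(cls)
                inst._mutable = False
                inst.label = None
                cls._instance = inst
            return cls.__dict__["_instance"]
        # Custom state requested: build a separate, mutable object.
        inst = super().__new__(cls)
        inst._mutable = True
        inst.label = label
        return inst

    @property
    def mutable(self):
        """True for a private copy, False for the shared instance."""
        return self._mutable

    def to_mutable(self):
        """Return an independent copy that is safe to modify."""
        # Bypass __new__ so we never get the shared instance back.
        inst = object.__new__(type(self))
        inst.__dict__.update(self.__dict__)
        inst._mutable = True
        return inst


class SketchX(SketchSingleton):
    """Hypothetical parameterless gate."""
```

In this sketch, `SketchX()` always returns the same object, `SketchX(label="special")` returns a fresh mutable one that is still an instance of `SketchSingleton`, and changing the label on `SketchX().to_mutable()` leaves the shared instance untouched.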
@jlapeyre
Contributor

If the singletons are meant to be immutable, doesn't it make sense to define __setattr__ to disallow setting attributes?
If not, you can have this:

In [1]: from qiskit.circuit.library import XGate

In [2]: x1 = XGate._instance

In [3]: x2 = XGate()

In [4]: id(x1) == id(x2)
Out[4]: True

In [5]: x1.a = 3

In [6]: x2.a
Out[6]: 3

In [7]: x3 = XGate()

In [8]: x3.a
Out[8]: 3

I did some quick performance tests. __setattr__ may be a bit slower, but it's not clear how important this is.

@mtreinish
Member Author

If the singletons are meant to be immutable, doesn't it make sense to define __setattr__ to disallow setting attributes?

Yeah, we need to. We should add that to this PR before merging. The only open question is whether to do it via __setattr__ or via slots (which, if set, implicitly disallow the addition of extra attributes). I have a patch locally (which is independent of this PR) exploring slots, which is part of why I hadn't added __setattr__ yet.

@jlapeyre
Contributor

exploring slots, which is part of why I hadn't added __setattr__ yet

👍

I thought about trying to modify what you have here to use __setattr__. It shouldn't be too hard.

The _condition attribute has a particular behavior here. I didn't include it above just for brevity. You can set it by hand (you shouldn't but you can) in . Then if you call XGate() it is silently reset to None. But that will be fixed as part of using slots or __setattr__.

This commit adds a __setattr__ method to the singleton gate class to
ensure that custom attributes are not settable for a shared singleton
instance. It prevents adding a custom attribute if the instance is in
singleton mode and will raise a NotImplementedError to avoid silently
sharing state in the single shared instance.
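
A minimal sketch of the guard this commit message describes, assuming the instance tracks its own mutability in a private `_mutable` flag. `GuardedSingleton` is a hypothetical class, not the PR's code; the only detail taken from the message is that writes to the shared instance raise `NotImplementedError`.

```python
class GuardedSingleton:
    """Hypothetical class whose shared instance rejects attribute writes."""

    _instance = None

    def __new__(cls, label=None):
        if label is None:
            if cls.__dict__.get("_instance") is None:
                inst = super().__new__(cls)
                # Use object.__setattr__ to bypass the guard during setup.
                object.__setattr__(inst, "_mutable", False)
                object.__setattr__(inst, "label", None)
                cls._instance = inst
            return cls.__dict__["_instance"]
        inst = super().__new__(cls)
        object.__setattr__(inst, "_mutable", True)
        object.__setattr__(inst, "label", label)
        return inst

    def __setattr__(self, name, value):
        if not self._mutable:
            # Refuse rather than silently share state between all users.
            raise NotImplementedError(
                f"cannot set {name!r} on a shared singleton instance"
            )
        super().__setattr__(name, value)
```

With this guard, the `x1.a = 3` step from the session earlier in the thread would raise on the shared instance, while an object created with `label="special"` still accepts new attributes.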
@mtreinish
Member Author

I added __setattr__ in c29887a to disallow custom attributes for shared instances. I think longer term we'll want to enforce this via slots, but we can revisit that later as part of 0.45/0.26 as it's a larger change than what's needed here. If/when we slot the entire instruction hierarchy it will be easy to adapt SingletonGate at the same time.

@alexanderivrii
Contributor

@mtreinish, to clarify: if I now write x = XGate(label="special"), then x will be an instance both of XGate and of SingletonGate classes. So SingletonGate is more like PotentiallySingletonGate, in the sense that it will attempt to reuse the same class _instance when possible, but will provide new mutable instances of the class when required. Do I understand this correctly?

@jakelishman jakelishman added this pull request to the merge queue Sep 19, 2023
Merged via the queue into Qiskit:main with commit e62c86b Sep 19, 2023
14 checks passed
@mtreinish mtreinish deleted the singleton-gates-poc branch September 19, 2023 20:57
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Sep 20, 2023
This commit fixes an oversight in Qiskit#10314 where the handling of pickle
wasn't done correctly. Because the SingletonGate class defines __new__
and based on the parameters to the gate.__class__() call determines
whether we get a new mutable copy or a shared singleton immutable
instance we need special handling in pickle. By default pickle will call
__new__() without any arguments and then rely on __setstate__ to update
the state in the new object. This works fine if the original instance
was a singleton but in the case of mutable copies this will create a
singleton object instead of a mutable copy. To fix this a __reduce__
method is added to ensure arguments get passed to __new__ forcing a
mutable object to be created in deserialization. Then a __setstate__
method is defined to correctly update the mutable object post creation.
kevinhartman added a commit to kevinhartman/qiskit that referenced this pull request Sep 25, 2023
github-merge-queue bot pushed a commit that referenced this pull request Sep 28, 2023
* Fix pickle handling for SingletonGate class

This commit fixes an oversight in #10314 where the handling of pickle
wasn't done correctly. Because the SingletonGate class defines __new__
and based on the parameters to the gate.__class__() call determines
whether we get a new mutable copy or a shared singleton immutable
instance we need special handling in pickle. By default pickle will call
__new__() without any arguments and then rely on __setstate__ to update
the state in the new object. This works fine if the original instance
was a singleton but in the case of mutable copies this will create a
singleton object instead of a mutable copy. To fix this a __reduce__
method is added to ensure arguments get passed to __new__ forcing a
mutable object to be created in deserialization. Then a __setstate__
method is defined to correctly update the mutable object post creation.

* Use __getnewargs_ex__ instead of __reduce__ & __setstate__

This commit pivots the pickle interface methods used to implement
__getnewargs_ex__ instead of the combination of __reduce__ and
__setstate__. Realistically, all we need to do here is pass that
we have mutable arguments to new to trigger it to create a separate
object, the rest of pickle was working correctly. This makes the
interface being used for pickle a lot clearer.

* Improve assertion in immutable pickle test
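
A minimal sketch of the `__getnewargs_ex__` approach described in the commit message above, assuming `__new__` takes a keyword that forces a separate object. `PickleSketch`, its `_force_mutable` keyword, and `roundtrip` are hypothetical names; the PR's actual arguments are not shown in this thread.

```python
import pickle


class PickleSketch:
    """Hypothetical class: shared instance by default, mutable on request."""

    _instance = None

    def __new__(cls, label=None, _force_mutable=False):
        if label is None and not _force_mutable:
            if cls.__dict__.get("_instance") is None:
                inst = super().__new__(cls)
                inst.label = None
                inst.mutable = False
                cls._instance = inst
            return cls.__dict__["_instance"]
        inst = super().__new__(cls)
        inst.label = label
        inst.mutable = True
        return inst

    def __getnewargs_ex__(self):
        # Tell pickle which arguments to pass to __new__ on load.
        if not self.mutable:
            # No arguments: __new__ returns the shared instance again.
            return ((), {})
        # Force __new__ to build a separate object; the instance state
        # (including label) is then restored onto it from __dict__.
        return ((), {"_force_mutable": True})


def roundtrip(obj):
    """Serialize and deserialize obj with the default pickle protocol."""
    return pickle.loads(pickle.dumps(obj))
```

In this sketch a round trip of the shared instance yields the shared instance itself, while a round trip of `PickleSketch(label="special")` yields a new mutable object with the same label. Without `__getnewargs_ex__`, the second case would call `__new__` with no arguments and write the mutable object's state onto the shared instance.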
@eendebakpt
Contributor

@mtreinish Would it be possible to also make the Barrier instruction a singleton? It is not completely without parameters (the number of qubits), so perhaps we need multiple singletons. And similarly for the Clifford operations

@jakelishman
Member

This PR is an early part of a complete effort to move most parameter-like state from the Instruction instances into the circuit context (like qubits already are, say), with the goal of making most instructions singleton opcode-like objects.

Barrier being variadic will be a bit weird, but likely we will be able to cache its different sizes, while the standard-library Clifford gates (but not the higher-level Clifford itself) are all big targets for getting to be singletons for sure.

@mtreinish
Member Author

@eendebakpt That's a fair point. I opened an issue to track that here: #10953

mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request Oct 6, 2023
This commit is a small optimization inside the
Optimize1qGatesDecomposition pass. The last stage of the pass is taking
a circuit sequence and using it to construct an equivalent 1q dag. To do
this the pass iterates over the returned circuit sequence from the 1q
synthesis routine and looks up the gate name in a mapping to gate
classes, and creates a new object of that class with any angles
provided. However, for XGate and SXGate there are no angles, and
since Qiskit#10314 merged there is extra overhead with the repeated
construction of these gate classes (see Qiskit#10867 for more details). Because
these gates have been singletons since Qiskit#10314, it is safe to reuse the
same instance: calling XGate() will return that instance anyway.
This commit updates the DAGCircuit construction to just reuse the same
instance if the gate in circuit sequence is for x or sx.
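
A minimal sketch of the reuse pattern this commit message describes, assuming the synthesis routine returns `(name, angles)` pairs. `FakeX`, `FakeSX`, `FakeRZ`, `GATE_CLASSES`, `SHARED`, and `build_ops` are hypothetical names, and the real pass builds a DAGCircuit rather than a list.

```python
class FakeX:
    """Hypothetical parameterless gate."""

    def __init__(self):
        self.params = []


class FakeSX:
    """Hypothetical parameterless gate."""

    def __init__(self):
        self.params = []


class FakeRZ:
    """Hypothetical gate taking one angle."""

    def __init__(self, angle):
        self.params = [angle]


# Name-to-class lookup used when a gate needs its angles filled in.
GATE_CLASSES = {"x": FakeX, "sx": FakeSX, "rz": FakeRZ}

# Built once and reused. This is only safe when the instances are
# shared and never mutated, as with singleton gates.
SHARED = {"x": FakeX(), "sx": FakeSX()}


def build_ops(sequence):
    """Turn (name, angles) pairs into gate objects, reusing shared ones."""
    ops = []
    for name, angles in sequence:
        if name in SHARED:
            # Skip the constructor call entirely for angle-free gates.
            ops.append(SHARED[name])
        else:
            ops.append(GATE_CLASSES[name](*angles))
    return ops
```

For a sequence such as `[("x", []), ("rz", [0.5]), ("x", [])]`, both `x` entries resolve to the same object and only the `rz` entry pays for a constructor call.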
rupeshknn pushed a commit to rupeshknn/qiskit that referenced this pull request Oct 9, 2023
mtreinish added a commit to mtreinish/qiskit-core that referenced this pull request May 1, 2024
The previous definition of a swap gate using ECR rz and sx was incorrect
and also not as efficient as possible. This was missed because the tests
were accidentally broken since Qiskit#10314 which was fixed in the previous
commit. This commit updates the definition to use one that is actually
correct and also more efficient with fewer 1 qubit gates.

Co-authored-by: Alexander Ivrii <alexi@il.ibm.com>
github-merge-queue bot pushed a commit that referenced this pull request May 2, 2024
* Add equivalence library entry for swap to ECR or CZ

This commit adds two new equivalence library entries to cover the
conversion from a SWAP gate to either ecr or cz directly. These are
common 2q basis gates and without these entries in the equivalence
library the path found from a lookup ends up with a much less efficient
translation. This commit adds the two new entries so that the
BasisTranslator will use a more efficient decomposition from the start
when targeting these bases. This will hopefully result in less work for
the optimization stage as the output will already be optimal and not
require simplification.

Testing for this PR is handled automatically by the built-in testing
harness in test_gate_definitions.py that evaluates all the entries in
the standard equivalence library for unitary equivalence.

* Add name to annotated gate circuit in qpy backwards compat tests

* Fix equivalence library tests

As fallout from the addition of SingletonGate and
SingletonControlledGate we were accidentally not running large portions
of the unit tests which validate the default session equivalence
library. This test dynamically runs based on all members of the standard
gate library by looking at all defined subclasses of Gate and
ControlledGate. But with the introduction of SingletonGate and
SingletonControlledGate all the unparameterized gates in the library
were not being run through the tests. This commit fixes the test so
those gates are covered again, which catches that the swap definition
added in the previous commit on this PR branch used an incorrect
definition of SwapGate using ECRGate. The definition will be fixed in a
follow up PR.

* Use a more efficient and actually correct circuit for ECR target

The previous definition of a swap gate using ECR rz and sx was incorrect
and also not as efficient as possible. This was missed because the tests
were accidentally broken since #10314 which was fixed in the previous
commit. This commit updates the definition to use one that is actually
correct and also more efficient with fewer 1 qubit gates.

Co-authored-by: Alexander Ivrii <alexi@il.ibm.com>

* Update ECR circuit diagram in comment

* Simplify cz equivalent circuit

* Simplify cz circuit even more

Co-authored-by: Shelly Garion <46566946+ShellyGarion@users.noreply.github.com>
Co-authored-by: Alexander Ivrii <alexi@il.ibm.com>

---------

Co-authored-by: Alexander Ivrii <alexi@il.ibm.com>
Co-authored-by: Shelly Garion <46566946+ShellyGarion@users.noreply.github.com>
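
A minimal sketch of one way the test gap described under "Fix equivalence library tests" could arise. It assumes discovery relied on direct subclasses only, which the commit message does not state. `FakeGate`, `FakeSingletonGate`, `FakeHGate`, `FakeRXGate`, and `all_subclasses` are hypothetical names. Once an intermediate base class is inserted, `__subclasses__()` on the root no longer lists the gates that moved under it, so a recursive walk is needed.

```python
def all_subclasses(cls):
    """Collect every direct and indirect subclass of cls."""
    found = set()
    stack = [cls]
    while stack:
        parent = stack.pop()
        for child in parent.__subclasses__():
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


class FakeGate:
    """Hypothetical root gate class."""


class FakeSingletonGate(FakeGate):
    """Hypothetical intermediate base inserted between root and gates."""


class FakeHGate(FakeSingletonGate):
    """Hypothetical unparameterized gate, now an indirect subclass."""


class FakeRXGate(FakeGate):
    """Hypothetical parameterized gate, still a direct subclass."""
```

Here `FakeGate.__subclasses__()` contains `FakeSingletonGate` and `FakeRXGate` but not `FakeHGate`, while `all_subclasses(FakeGate)` contains all three.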
ElePT pushed a commit to ElePT/qiskit that referenced this pull request May 31, 2024
Labels
Changelog: API Change, Changelog: New Feature, performance, priority: high
Development

Successfully merging this pull request may close these issues.

circuit deepcopy too slow: singleton gates
7 participants