Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization level 3 in transpile gives the wrong circuit when increasing the depth #7341

Closed
anedumla opened this issue Dec 2, 2021 · 13 comments
Labels
bug Something isn't working

Comments

@anedumla
Copy link
Contributor

anedumla commented Dec 2, 2021

Environment

  • Qiskit Terra version: 0.19.0
  • Python version: 3.9.4
  • Operating system: Windows 10 Enterprise 64-bit

What is happening?

Using optimization_level=3 within the transpile method yields a circuit with different outcomes that the one produced either from setting optimization_level=2 or evaluating the non transpiled circuit with Statevector.

The wrong behaviour was observed while increasing the number of trotter steps in a Trotterization time evolution, and the circuits obtained from optimization_level=3 perform as expected as long as the original circuit remains relatively shallow (see plots below).

How can we reproduce the issue?

import numpy as np
from qiskit import QuantumCircuit, transpile, IBMQ, execute, Aer
from qiskit.quantum_info import Pauli, Operator, Statevector
from qiskit.opflow import PauliOp
from qiskit.circuit.library import PauliEvolutionGate
from qiskit.synthesis import SuzukiTrotter
import matplotlib.pyplot as plt
from scipy.linalg import expm

# input parameters
provider = IBMQ.load_account()
provider = IBMQ.get_provider(hub='ibm-q-internal', group='deployed')
backend = provider.get_backend('ibmq_mumbai')
simulator = Aer.get_backend('aer_simulator')
shots = 100000
layout =  [0,1,4]
num_qubits = len(layout)
evo_time = 1

def heisenberg_EO(n_qubits):
    """Construct Hamiltonian of 1-D Heisenberg chain with n_qubits sites.
    Return a list of matrices corresponding to the even-odd-splitting (154) in
    Childs2021.
    """
    # Three qubits because for 2 we get H_O = 0
    assert n_qubits >= 3
    def OPauli(s):
        return PauliOp(Pauli(s))
    def two_qubit_interaction(i, coeff):
        assert i < n_qubits - 1
        return (OPauli("I" * i + "XX" + "I" * (n_qubits - 2 - i)) +
                OPauli("I" * i + "YY" + "I" * (n_qubits - 2 - i)) +
                OPauli("I" * i + "ZZ" + "I" * (n_qubits - 2 - i)) +
                OPauli("I" * i + "Z" + "I" * (n_qubits - 1 - i)) *
                coeff)

    H_E = sum((two_qubit_interaction(i, 1)
               for i in range(0, n_qubits - 1, 2)))
    H_O = sum((two_qubit_interaction(i, 1)
               for i in range(1, n_qubits - 1, 2)))
    return [H_E, H_O]

def get_circuit(qubits, evo_time, trotter_steps):
    num_qubits = qubits

    # Obtain the matrices for Trotter 
    paulis = heisenberg_EO(num_qubits)

    evo_op = PauliEvolutionGate(paulis, evo_time, synthesis=SuzukiTrotter(order=2, reps=trotter_steps))
    return evo_op.definition

# Calculate the local magnetization observable of a qubit
def local_magnetization(qubit, counts, shots):
    zeros, ones = 0, 0
    for c, v in counts.items():
        if c[qubit] == '0':
            zeros += v
        else:
            ones += v
    return (zeros - ones) / shots

# Calculate the local magnetization observable from a statevector
def local_magnetization_vec(qubit, statevector, num_qubits):
    'calculate the prob of seing 0 and 1 in the given qubit'
    zero, one = 0, 0
    for i,amplitude in enumerate(statevector):
        prob = np.real(amplitude * np.conj(amplitude))
        if bin(i)[2:].zfill(num_qubits)[qubit] == '0':
            zero += prob
        else:
            one += prob
    return zero - one

def get_plot(ax, local_qubit):
    # arrays for the magnetization value for different number of Trotter steps
    magn_opt2, magn_opt3, magn_sv = [], [], []

    for i in range(len(trotter_steps)):
        magn_opt2.append(local_magnetization(local_qubit, counts_opt2[i], shots))
        magn_opt3.append(local_magnetization(local_qubit, counts_opt3[i], shots)) 
        magn_sv.append(local_magnetization_vec(local_qubit, statevectors[i], num_qubits))
    magn_exact = local_magnetization_vec(local_qubit, sol_vec, num_qubits)
    ax.plot(trotter_steps, magn_opt2, label='opt_lvl=2')
    ax.plot(trotter_steps, magn_opt3, label='opt_lvl=3')
    ax.plot(trotter_steps, magn_sv, label='statevector')
    ax.plot(trotter_steps, [magn_exact]*len(trotter_steps),'--', label='exact')
    ax.set(xlabel='trotter steps', ylabel=r'<$\sigma_{%s}^z$>'%(local_qubit))
    ax.legend()

# Aer simulator circuits
trotter_steps = range(1, 31)
jobs_sim_opt2, jobs_sim_opt3 = [], []

for m in trotter_steps:
    qc = QuantumCircuit(num_qubits, num_qubits)
    qc.h(list(range(num_qubits))) # dummy initial state
    qc.append(get_circuit(num_qubits, evo_time, m), list(range(num_qubits)))
    qc.measure(range(num_qubits), range(num_qubits))
    # transpiled circuit with optimization level 2 and 3
    qct2 = transpile(qc, backend, initial_layout=layout, optimization_level=2)
    qct3 = transpile(qc, backend, initial_layout=layout, optimization_level=3)
    # simulated job
    jobs_sim_opt2.append(execute(qct2, simulator, shots=shots))
    jobs_sim_opt3.append(execute(qct3, simulator, shots=shots))
    
# Get Aer simulator results
counts_opt2, counts_opt3 = [], []

for i in range(len(trotter_steps)):
    counts_opt2.append(jobs_sim_opt2[i].result().get_counts())
    counts_opt3.append(jobs_sim_opt3[i].result().get_counts())

# Statevector
statevectors = []
for m in trotter_steps:
    qc = QuantumCircuit(num_qubits)
    qc.h(list(range(num_qubits))) # dummy initial state
    qc.append(get_circuit(num_qubits, evo_time, m), list(range(num_qubits)))
    statevectors.append(Statevector(qc).data)
    
# Exact vector
H1, H2 = heisenberg_EO(num_qubits)
exactH = H1.to_matrix(massive=True) + H2.to_matrix(massive=True)
ham_H = expm(1j * exactH * evo_time)
sol_vec = ham_H @ np.array([1]*(2**num_qubits))*(1/np.sqrt(2)**num_qubits)

# Plot local measurements
fig, axes = plt.subplots(1, num_qubits, figsize=(17,5))
fig.suptitle(r'$<\sigma^z>$ local measurement')
for i in range(num_qubits):
    get_plot(axes[i], i)

Output:
issue_qiskit

What should happen?

In the output plots the line for optimization_level=3 should coincide with the statevector and optimization_level=2 lines.

Any suggestions?

No response

@anedumla anedumla added the bug Something isn't working label Dec 2, 2021
@nonhermitian
Copy link
Contributor

Yes. Optimization level 3 gives (for 15 steps):

global phase: 2π
                          
        q_0 -> 0 ─────────
                 ┌───────┐
        q_1 -> 1 ┤ Rz(2) ├
                 ├───────┤
        q_2 -> 2 ┤ Rz(2) ├
                 └───────┘
  ancilla_0 -> 3 ─────────
                          
  ancilla_1 -> 4 ─────────
                          
  ancilla_2 -> 5 ─────────

where as 2 gives something much longer, a small snippet of which is:

global phase: π/2
                  ┌─────────┐  ┌────┐┌─────────┐                               »
        q_0 -> 0 ─┤ Rz(π/2) ├──┤ √X ├┤ Rz(π/2) ├──■────────────────────────────»
                  ├─────────┤  ├────┤├─────────┤┌─┴─┐┌────────────────────────┐»
        q_1 -> 1 ─┤ Rz(π/2) ├──┤ √X ├┤ Rz(π/2) ├┤ X ├┤ Rz(0.0666666666666667) ├»
                 ┌┴─────────┴─┐├────┤├─────────┤└───┘└────────────────────────┘»
        q_2 -> 2 ┤ Rz(1.6375) ├┤ √X ├┤ Rz(π/2) ├───────────────────────────────»
                 └────────────┘└────┘└─────────┘                               »

@nonhermitian
Copy link
Contributor

It is independent of the routing method, so something else is breaking.

@kdk
Copy link
Member

kdk commented Dec 2, 2021

Do we know yet if this is due to changes in optimization_level=3 after 0.18.3? I wasn't able to directly replicate the example on 0.18.3 because it use e.g. PauliEvolutionGate which is new with 0.19.

@kdk kdk added this to the 0.19 milestone Dec 2, 2021
@nonhermitian
Copy link
Contributor

nonhermitian commented Dec 2, 2021

Here is the raw circuit used above:

circ.qpy.zip

@levbishop
Copy link
Member

levbishop commented Dec 2, 2021

Collect2qBlocks/ConsolidateBlocks/UnitarySynthesis is greatly simplifying these circuits here.

def cb(pass_, dag, **kwargs):
    print(type(pass_).__name__, dag.count_ops())

qct3 = transpile(qc, backend, initial_layout=layout, optimization_level=3, callback=cb)

Going from 268 cx to 31 unitary to 0 entangling gates....

UnitarySynthesis {'h': 3, 'circuit-2402': 1, 'measure': 3}
Unroll3qOrMore {'h': 3, 'measure': 3, 'rzz': 45, 'rxx': 45, 'rz': 45, 'ryy': 45}
RemoveResetInZeroState {'h': 3, 'measure': 3, 'rzz': 45, 'rxx': 45, 'rz': 45, 'ryy': 45}
OptimizeSwapBeforeMeasure {'h': 3, 'measure': 3, 'rzz': 45, 'rxx': 45, 'rz': 45, 'ryy': 45}
RemoveDiagonalGatesBeforeMeasure {'h': 3, 'measure': 3, 'rzz': 44, 'rxx': 45, 'rz': 44, 'ryy': 45}
SetLayout {'h': 3, 'measure': 3, 'rzz': 44, 'rxx': 45, 'rz': 44, 'ryy': 45}
FullAncillaAllocation {'h': 3, 'measure': 3, 'rzz': 44, 'rxx': 45, 'rz': 44, 'ryy': 45}
EnlargeWithAncilla {'h': 3, 'measure': 3, 'rzz': 44, 'rxx': 45, 'rz': 44, 'ryy': 45}
ApplyLayout {'h': 3, 'rxx': 45, 'ryy': 45, 'rzz': 44, 'rz': 44, 'measure': 3}
CheckMap {'h': 3, 'rxx': 45, 'ryy': 45, 'rzz': 44, 'rz': 44, 'measure': 3}
UnitarySynthesis {'h': 3, 'rxx': 45, 'ryy': 45, 'rzz': 44, 'rz': 44, 'measure': 3}
UnrollCustomDefinitions {'h': 3, 'rxx': 45, 'ryy': 45, 'rzz': 44, 'rz': 44, 'measure': 3}
BasisTranslator {'rz': 1084, 'measure': 3, 'sx': 543, 'cx': 268}
RemoveResetInZeroState {'rz': 1084, 'measure': 3, 'sx': 543, 'cx': 268}
Depth {'rz': 1084, 'measure': 3, 'sx': 543, 'cx': 268}
FixedPoint {'rz': 1084, 'measure': 3, 'sx': 543, 'cx': 268}
vvvv
Collect2qBlocks {'rz': 1084, 'measure': 3, 'sx': 543, 'cx': 268}
ConsolidateBlocks {'measure': 3, 'unitary': 31}
UnitarySynthesis {'measure': 3, 'rz': 93, 'sx': 61}
^^^^
Optimize1qGatesDecomposition {'measure': 3, 'rz': 6, 'sx': 3}
CommutationAnalysis {'measure': 3, 'rz': 6, 'sx': 3}
CommutativeCancellation {'measure': 3, 'rz': 6, 'sx': 3}
UnitarySynthesis {'measure': 3, 'rz': 6, 'sx': 3}
UnrollCustomDefinitions {'measure': 3, 'rz': 6, 'sx': 3}
BasisTranslator {'measure': 3, 'rz': 6, 'sx': 3}
Depth {'measure': 3, 'rz': 6, 'sx': 3}
FixedPoint {'measure': 3, 'rz': 6, 'sx': 3}
Collect2qBlocks {'measure': 3, 'rz': 6, 'sx': 3}
ConsolidateBlocks {'measure': 3, 'rz': 6, 'sx': 3}
UnitarySynthesis {'measure': 3, 'rz': 6, 'sx': 3}
Optimize1qGatesDecomposition {'measure': 3, 'rz': 6, 'sx': 3}
CommutationAnalysis {'measure': 3, 'rz': 6, 'sx': 3}
CommutativeCancellation {'measure': 3, 'rz': 6, 'sx': 3}
UnitarySynthesis {'measure': 3, 'rz': 6, 'sx': 3}
UnrollCustomDefinitions {'measure': 3, 'rz': 6, 'sx': 3}
BasisTranslator {'measure': 3, 'rz': 6, 'sx': 3}
Depth {'measure': 3, 'rz': 6, 'sx': 3}
FixedPoint {'measure': 3, 'rz': 6, 'sx': 3}
Collect2qBlocks {'measure': 3, 'rz': 6, 'sx': 3}
ConsolidateBlocks {'measure': 3, 'rz': 6, 'sx': 3}
UnitarySynthesis {'measure': 3, 'rz': 6, 'sx': 3}
Optimize1qGatesDecomposition {'measure': 3, 'rz': 6, 'sx': 3}
CommutationAnalysis {'measure': 3, 'rz': 6, 'sx': 3}
CommutativeCancellation {'measure': 3, 'rz': 6, 'sx': 3}
UnitarySynthesis {'measure': 3, 'rz': 6, 'sx': 3}
UnrollCustomDefinitions {'measure': 3, 'rz': 6, 'sx': 3}
BasisTranslator {'measure': 3, 'rz': 6, 'sx': 3}
ContainsInstruction {'measure': 3, 'rz': 6, 'sx': 3}

@levbishop
Copy link
Member

I think the problem here is the basic interaction of small-step Trotterization (which makes the interaction-per-step become small) with approximate unitary synthesis (which avoids emitting 2q gates when the interaction becomes small enough). The solution is to use fewer Trotter steps or to turn off approximate synthesis.

@nonhermitian
Copy link
Contributor

It appears that this happens only with the default approximation_degree=None. Setting 1 (min approx) or 0 (max approx) both yield the same circuit.

kdk added a commit to kdk/qiskit-terra that referenced this issue Dec 2, 2021
QiskitGH-6581 introduced a behavior of setting the basis_fidelity used
by decomposer2q from the reported device gate_error of the
corresponding 2q gate. QiskitGH-7341 reported a case where this
approximation removed small but significant interactions
resulting in an non-equivalent output circuit. As a workaround,
revert to the prior behavior of using decomposer2q's default
basis_fidelity (1, no approximation).
@kdk
Copy link
Member

kdk commented Dec 2, 2021

#7348 takes the approach of disabling approximate synthesis entirely (since this is more or less a rollback to the behavior of 0.18.3) (I was incorrect, this behavior had already been released as part of 0.18.3).

image

Thinking about this a bit more though, could it be the case that this is a sound optimization to make (in the interest of best approximating the given unitary with the available device gate fidelities), even though when simulating (on an ideal simulator) you'll end up quite far from the intended operation?

@levbishop
Copy link
Member

Thinking about this a bit more though, could it be the case that this is a sound optimization to make (in the interest of best approximating the given unitary with the available device gate fidelities), even though when simulating (on an ideal simulator) you'll end up quite far from the intended operation?

This is the intention, that you do the best you can given available gate fidelities. It may end up far from the intended, but better than not approximating. Of course the decision is heuristic and local, so may not always succeed - something like trotterization where the repeated trotter steps are intended to add coherently would be an example where the heuristic based on assuming a depolarizing channel of equivalent infidelity might choose poorly.

@kdk kdk removed this from the 0.19 milestone Dec 2, 2021
@kdk
Copy link
Member

kdk commented Dec 2, 2021

In that case, I'd lean toward the transpiler behavior being not a bug. It may be a bug though that users can compile based on a set of backend_properties and simulate without them without seeing a warning. Does that make sense to you @anedumla @nonhermitian @levbishop ?

@nonhermitian
Copy link
Contributor

As an user I am actually surprised that this was enabled by default. To me at least, it seems to be implicitly breaking the idea that the transpiler faithfully performs the unitary in question (up to certain permutations, transformations etc.). To me this feels akin to the fast-math compiler option that breaks IEEE 754 floating-point rules to gain performance, but occasionally really messes things up, eg see JuliaLang/julia#30073 (comment)

I think approximations like this should be explicit, and turned on by the user with knowledge that bad things can some times happen.

@anedumla
Copy link
Contributor Author

anedumla commented Dec 3, 2021

I agree with @nonhermitian. Personally, I would not expect that the circuit returned is an approximation and would prefer to explicitly activate an option to approximate the unitary given the hardware characteristics.

@mtreinish
Copy link
Member

Since #8595 has merged I think this has been implemented so I'm going to close this. If I'm missing something here though or misinterpreting what was needed for this issue please feel free to reopen this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants