In [3]:
from qibo import Circuit, gates

In [4]:
def build_circuit(nqubits, nlayers):
    """Build qibo's aavqe example circuit."""

    circuit = Circuit(nqubits)
    for _ in range(nlayers):
        circuit.add(gates.RY(q, theta=0) for q in range(nqubits))
        circuit.add(gates.RZ(q, theta=0) for q in range(nqubits))
        circuit.add(gates.CZ(q, q + 1) for q in range(0, nqubits - 1, 2))
        circuit.add(gates.RY(q, theta=0) for q in range(nqubits))
        circuit.add(gates.RZ(q, theta=0) for q in range(nqubits))
        circuit.add(gates.CZ(q, q + 1) for q in range(1, nqubits - 2, 2))
        circuit.add(gates.CZ(0, nqubits - 1))
    circuit.add(gates.RY(q, theta=0) for q in range(nqubits))

    return circuit

In [7]:
nqubits = 10
nlayers = 15

c = build_circuit(nqubits, nlayers)

In [8]:
print(c.summary())

Circuit depth = 91
Total number of gates = 760
Number of qubits = 10
Most common gates:
ry: 310
rz: 300
cz: 150


In the following, we propose a method to compare pure VQE training with hybrid VQE-DBI training. Before going into the details, we express the number of two-qubit gates executed over the entire training process in terms of the VQE hyperparameters.

#### Pure VQE Training

We denote by $N_{2q}$ the total number of two-qubit gates in the VQE circuit. This quantity depends on the number of layers $N_{layers}$, and with our ansatz it is directly proportional to it:

$$N_{2q} = N_{layers} \cdot d_{2q}$$

where $d_{2q}$ is the number of two-qubit gates per layer.
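
As a quick check of this proportionality, the sketch below counts the CZ gates that one layer of `build_circuit` adds, using the same index ranges as that function, so it runs without qibo. The helper names `cz_per_layer` and `two_qubit_gates` are illustrative and not part of qibo.

```python
def cz_per_layer(nqubits):
    """Count the CZ gates added by one layer of build_circuit (d_2q)."""
    even_pairs = len(range(0, nqubits - 1, 2))  # CZ(q, q + 1) starting at even q
    odd_pairs = len(range(1, nqubits - 2, 2))   # CZ(q, q + 1) starting at odd q
    boundary = 1                                # CZ(0, nqubits - 1)
    return even_pairs + odd_pairs + boundary


def two_qubit_gates(nqubits, nlayers):
    """N_2q = N_layers * d_2q for this ansatz."""
    return nlayers * cz_per_layer(nqubits)


d_2q = cz_per_layer(10)
n_2q = two_qubit_gates(10, 15)
print(d_2q, n_2q)
```

For the 10-qubit, 15-layer circuit built above this gives $d_{2q} = 10$ and $N_{2q} = 150$, in agreement with the `cz: 150` line of the circuit summary.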

Let's consider the worst-case scenario in terms of two-qubit gate evaluations: gradient descent on hardware. We need two expectation values for each VQE parameter to compute the derivative. These values need to be computed for each epoch of training. The total number of circuit executions to evaluate a gradient is $K = 2p$, where $p$ is the number of circuit parameters.

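For a training run of $N_{ep}$ epochs, the total number of circuit executions is $N_{ep} \cdot K$. Each execution runs $N_{2q}$ two-qubit gates, so the total number of two-qubit gates is

$$ N_{tot} = K \cdot N_{2q} \cdot N_{ep} = N_{vqe} \cdot N_{ep}$$

where $N_{vqe} = K \cdot N_{2q}$ is the number of two-qubit gates executed per epoch.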

In the best-case scenario, the loss function requires only one expectation value for each optimization iteration, hence $K = 1$.
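
To make the two scenarios concrete, here is a minimal sketch of the pure-VQE count. It assumes that every RY and RZ rotation in the circuit summary carries one trainable parameter, so $p = 310 + 300$. The function name `pure_vqe_cost` and the value of `n_epochs` are placeholders chosen for illustration only.

```python
def pure_vqe_cost(n_2q, n_epochs, n_params=None):
    """Total two-qubit gates N_tot = K * N_2q * N_ep.

    With n_params given, use the worst case K = 2p (two expectation
    values per parameter); otherwise use the best case K = 1.
    """
    k = 2 * n_params if n_params is not None else 1
    return k * n_2q * n_epochs


n_2q = 150            # CZ count from the circuit summary above
n_params = 310 + 300  # assumption: every RY and RZ angle is trainable
n_epochs = 100        # arbitrary placeholder, not a value from the text

worst = pure_vqe_cost(n_2q, n_epochs, n_params)
best = pure_vqe_cost(n_2q, n_epochs)
print(worst, best)
```

Under these assumptions the two cases differ by the factor $2p = 1220$, which is why the choice of optimizer dominates the gate budget.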

#### Hybrid VQE-DBI Training

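In this case, the total number of two-qubit gates is the sum of the gates used in the VQE ($N'_{VQE}$) and in the DBI ($N_{DBI}$). Primed quantities refer to the hybrid setup, and $N_{DBI} = N_{2q, dbi}(n_{step})$ is the number of two-qubit gates required by $n_{step}$ DBI steps.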

$$ N'_{tot} = N'_{VQE} + N_{DBI} = K' \cdot N'_{2q} \cdot N'_{ep} + N_{2q, dbi}(n_{step}) = N'_{vqe} \cdot N'_{ep} + N_{2q, dbi}(n_{step})$$
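
The hybrid count can be sketched in the same way. The text does not give a closed form for $N_{2q, dbi}(n_{step})$, so the hypothetical helper `hybrid_cost` below takes it as an input that would have to come from counting the gates of the compiled DBI circuit. All numbers in the usage line are placeholders.

```python
def hybrid_cost(k, n_2q, n_epochs, n_2q_dbi):
    """N'_tot = K' * N'_2q * N'_ep + N_2q,dbi(n_step).

    n_2q_dbi is the two-qubit gate count of the DBI part for the chosen
    number of steps; it is supplied by the caller, not computed here.
    """
    return k * n_2q * n_epochs + n_2q_dbi


# Placeholder values for illustration only (best case K' = 1).
example = hybrid_cost(k=1, n_2q=50, n_epochs=100, n_2q_dbi=2000)
print(example)
```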

#### Metric

We define the metric 

$$ \chi = \frac{N'_{tot}}{N_{tot}} = \frac{N'_{vqe} \cdot N'_{ep} + N_{2q, dbi}(n_{step})}{N_{vqe} \cdot N_{ep}} = \frac{K' \cdot N'_{layers} \cdot d'_{2q} \cdot N'_{ep} + N_{2q, dbi}(n_{step})}{K \cdot N_{layers} \cdot d_{2q} \cdot N_{ep}}$$ 

Since the VQE layers are the same in both cases, $d_{2q} = d'_{2q}$. In the worst-case scenario, $K$ and $K'$ are proportional to the number of parameters and therefore grow with the number of layers; in the best-case scenario, $K = K' = 1$ and they drop out. Once the number of DBI steps is fixed, $N_{2q, dbi}(n_{step})$ is a constant, so the ratio $\chi$ depends only on the number of epochs and layers of the two models.
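
The following sketch evaluates $\chi$ directly from layers and epochs. It assumes the ansatz built above, whose rotation count is $4 \cdot n_{qubits}$ per layer plus one final RY layer, and it treats every rotation as trainable in the worst case. The names `n_params` and `chi` are illustrative, and the DBI gate count is again a caller-supplied number.

```python
def n_params(nqubits, nlayers):
    """Rotation gates in the ansatz: 4 * nqubits per layer plus a final RY layer."""
    return 4 * nqubits * nlayers + nqubits


def chi(nqubits, d_2q, layers_h, epochs_h, n_2q_dbi,
        layers_p, epochs_p, worst_case=True):
    """Ratio N'_tot / N_tot of hybrid (h) to pure (p) two-qubit gate counts."""
    k_h = 2 * n_params(nqubits, layers_h) if worst_case else 1
    k_p = 2 * n_params(nqubits, layers_p) if worst_case else 1
    hybrid = k_h * layers_h * d_2q * epochs_h + n_2q_dbi
    pure = k_p * layers_p * d_2q * epochs_p
    return hybrid / pure


# Placeholder comparison: half the layers, same epochs, no DBI overhead.
ratio_best = chi(10, 10, 5, 100, 0, 10, 100, worst_case=False)
ratio_worst = chi(10, 10, 5, 100, 0, 10, 100, worst_case=True)
print(ratio_best, ratio_worst)
```

In the best case, halving the layers halves the cost. In the worst case the saving is larger, because $K'$ shrinks together with the number of layers.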

#### Evaluation Procedure

We now propose a procedure to determine which configurations, at a fixed model accuracy, satisfy $\chi \leq 1$, i.e., those for which the hybrid method needs no more two-qubit gates than the pure method.

1. Choose a number of qubits large enough to ensure the presence of barren plateaus.
2. Fix the number of DBI steps. DBI is effective even with a small number of iterations, and since the number of two-qubit gates grows exponentially with the number of iterations, it is better to choose it as small as possible.
3. Fix a target accuracy $\varepsilon$, defined as the difference between the true ground-state energy $E_0$ and the energy found by the model:
    1. **Pure Training:** In this step, we want to find the best VQE model given $\varepsilon$ and use it as our reference. We select the models that approximate $E_0$ with error $\leq \varepsilon$, and pick the epoch-layer pair that minimizes $N_{tot}$.
    2. **Hybrid Training:** Run hybrid VQE-DBI training with different numbers of layers and epochs, select the models with error $\leq \varepsilon$, and evaluate $N'_{tot}$ for each of them.
    3. Evaluate $\chi$ for each combination of layers and epochs of the hybrid models.

We can collect all the $\chi$ values in a matrix and study the regime where hybrid training performs similarly to or better than pure training.
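
The selection in step 3 can be sketched as below. This is only an outline under stated assumptions: each training run is summarized as a `(layers, epochs, error)` tuple, the best-case $K = K' = 1$ is used, and the DBI gate count is passed in as a number. The run lists are made-up placeholders that exercise the code; they are not results.

```python
D_2Q = 10  # two-qubit gates per layer for the 10-qubit ansatz above


def cost(layers, epochs, k=1, n_2q_dbi=0):
    """Two-qubit gate count K * N_layers * d_2q * N_ep plus optional DBI gates."""
    return k * layers * D_2Q * epochs + n_2q_dbi


def chi_matrix(pure_runs, hybrid_runs, eps, n_2q_dbi):
    """Map (layers, epochs) of each accepted hybrid run to its chi value.

    A run is accepted when its error |E - E_0| is at most eps. The
    reference N_tot is the cheapest accepted pure run.
    """
    pure_ok = [cost(layers, epochs)
               for layers, epochs, err in pure_runs if err <= eps]
    if not pure_ok:
        raise ValueError("no pure VQE run reaches the target accuracy")
    n_tot = min(pure_ok)
    return {
        (layers, epochs): cost(layers, epochs, n_2q_dbi=n_2q_dbi) / n_tot
        for layers, epochs, err in hybrid_runs
        if err <= eps
    }


# Made-up placeholder runs: (layers, epochs, error).
pure_runs = [(5, 200, 0.2), (10, 200, 0.05), (15, 100, 0.04)]
hybrid_runs = [(2, 100, 0.3), (5, 100, 0.05), (10, 100, 0.02)]

table = chi_matrix(pure_runs, hybrid_runs, eps=0.1, n_2q_dbi=3000)
for key, value in sorted(table.items()):
    print(key, round(value, 3))
```

Storing the result as a dictionary keyed by `(layers, epochs)` keeps rejected configurations out of the table; it can be reshaped into a dense matrix once the grid of layers and epochs is fixed.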
