**Derivation of the training ansatz for the 4-call qubit-unitary inversion protocol**

In [None]:
import torch
from torch.nn import ModuleList

import quairkit
from quairkit.application import PQCombNet
from quairkit.circuit import Circuit
from quairkit.database import zero_state
from quairkit.qinfo import dagger

quairkit.set_dtype("complex128")

Set up the basic parameters of the training ansatz.

In [None]:
num_slots, ancilla_qubits = 4, 3
num_qubits = ancilla_qubits + 1
num_V = num_slots + 1
slot_dim = 2

The initial training ansatz reaches a training fidelity of 0.99. Each comb tooth is a universal unitary on a 16-dimensional space (all 4 qubits), tunable by $16^2 - 1 = 255$ real parameters.

In [None]:
net = PQCombNet(
    target_function=dagger,
    num_slots=num_slots,
    ancilla=ancilla_qubits,
    slot_dim=slot_dim,
    train_mode="process",
)

net.plot()


In [None]:
net.train()
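
The parameter count quoted for the initial ansatz can be checked by hand. The following is a minimal sketch in plain Python that does not call QuAIRKit; the helper `universal_param_count` is a hypothetical name introduced here, and the total assumes that every tooth is parameterized independently, which may differ from how `PQCombNet` stores its parameters internally.

```python
# Parameter-count sketch that mirrors the notebook's settings (no QuAIRKit needed).
num_slots, ancilla_qubits = 4, 3
num_qubits = ancilla_qubits + 1   # 3 ancilla qubits + 1 system qubit
num_V = num_slots + 1             # teeth before, between, and after the slots


def universal_param_count(n_qubits: int) -> int:
    """Real parameters of a unitary on n_qubits up to a global phase: d^2 - 1."""
    dim = 2 ** n_qubits
    return dim * dim - 1


params_per_tooth = universal_param_count(num_qubits)
# Assumption: each of the num_V teeth carries its own independent parameters.
total_params = num_V * params_per_tooth

print(params_per_tooth)  # 255
print(total_params)      # 1275
```

This count is the baseline against which the more structured ansatzes later in the notebook can be compared.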

The ansatz is then improved by conjecturing that every entangling gate between an ancilla qubit and the main system is a controlled universal single-qubit gate $\textrm{U}3$.

In [None]:
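V_circuit_list_b = ModuleList()
for i in range(num_V):
    # Qubits 0-2 are the ancillas; qubit 3 is the main system.
    V_circuit = Circuit(num_qubits)
    if i > 0:
        # Every tooth after the first opens with a trainable controlled-U
        # gate on each (ancilla, system) pair.
        V_circuit.cu([0, 3])
        V_circuit.cu([1, 3])
        V_circuit.cu([2, 3])
    # Universal unitary acting on the ancilla qubits only.
    V_circuit.universal_qudits(list(range(ancilla_qubits)))
    # Every tooth closes with controlled-U gates in the reverse order.
    V_circuit.cu([2, 3])
    V_circuit.cu([1, 3])
    V_circuit.cu([0, 3])
    V_circuit_list_b.append(V_circuit)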
net.V_circuit_list = V_circuit_list_b
net.plot()
net.train()
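
To see how much this structured ansatz shrinks the search space, the gate layout above can be turned into a rough parameter count. This is only a sketch: `PARAMS_PER_CU = 3` is an assumption (three Euler angles per controlled-$\textrm{U}3$; the library's `cu` gate may carry a different number of parameters), and `universal_param_count` and `tooth_param_count` are hypothetical helper names.

```python
# Rough parameter count for the controlled-U ansatz (no QuAIRKit needed).
num_slots, ancilla_qubits = 4, 3
num_V = num_slots + 1

# Assumption: each controlled-U3 gate is described by three Euler angles.
PARAMS_PER_CU = 3


def universal_param_count(n_qubits: int) -> int:
    """Real parameters of a unitary on n_qubits up to a global phase: d^2 - 1."""
    dim = 2 ** n_qubits
    return dim * dim - 1


def tooth_param_count(i: int) -> int:
    """Parameters of tooth i, following the gate layout of the cell above."""
    n_cu = ancilla_qubits          # closing controlled-U gates, present in every tooth
    if i > 0:
        n_cu += ancilla_qubits     # opening controlled-U gates, skipped in the first tooth
    return n_cu * PARAMS_PER_CU + universal_param_count(ancilla_qubits)


structured_total = sum(tooth_param_count(i) for i in range(num_V))
full_total = num_V * universal_param_count(ancilla_qubits + 1)

print(structured_total, full_total)  # 396 1275
```

Under these assumptions the structured ansatz uses roughly a third of the parameters of the fully universal one.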

The ansatz is improved further by conjecturing that most of the universal single-qubit gates are Pauli operators, and that the first universal three-qubit gate is a layer of Hadamard gates on the ancilla qubits. At this stage, the best training fidelity approaches 0.999.

In [None]:
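V_circuit_list_c = ModuleList()
for i in range(num_V):
    V_circuit = Circuit(num_qubits)
    if i == 0:
        # First tooth: Hadamard gates replace the universal three-qubit gate.
        V_circuit.h(list(range(ancilla_qubits)))
    else:
        # Later teeth: fixed controlled-Pauli gates replace the opening controlled-U gates.
        V_circuit.cz([0, 3])
        V_circuit.cy([1, 3])
        V_circuit.cx([2, 3])
        V_circuit.universal_qudits(list(range(ancilla_qubits)))

    if i < num_slots:
        # All teeth but the last close with fixed controlled-Pauli gates.
        V_circuit.cx([2, 3])
        V_circuit.cy([1, 3])
        V_circuit.cz([0, 3])
    else:
        # The last tooth keeps trainable controlled-U gates.
        V_circuit.cu([2, 3])
        V_circuit.cu([1, 3])
        V_circuit.cu([0, 3])
    V_circuit_list_c.append(V_circuit)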
net.V_circuit_list = V_circuit_list_c
net.plot()
net.train()
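
Replacing controlled-$\textrm{U}3$ gates by controlled-Pauli gates is a restriction of the previous ansatz, because each Pauli operator is a special point of the $\textrm{U}3$ family. The sketch below checks this numerically in plain Python. It assumes the common convention $\textrm{U}3(\theta, \phi, \lambda) = \begin{pmatrix} \cos\frac{\theta}{2} & -e^{i\lambda}\sin\frac{\theta}{2} \\ e^{i\phi}\sin\frac{\theta}{2} & e^{i(\phi+\lambda)}\cos\frac{\theta}{2} \end{pmatrix}$; the library's own parameterization may differ, and `u3` and `close` are hypothetical helper names.

```python
import cmath
import math


def u3(theta: float, phi: float, lam: float):
    """U3 gate as a 2x2 nested list (assumed convention, see the text above)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [
        [c, -cmath.exp(1j * lam) * s],
        [cmath.exp(1j * phi) * s, cmath.exp(1j * (phi + lam)) * c],
    ]


def close(a, b, tol: float = 1e-12) -> bool:
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(2) for c in range(2))


PAULI_X = [[0, 1], [1, 0]]
PAULI_Y = [[0, -1j], [1j, 0]]
PAULI_Z = [[1, 0], [0, -1]]

# Angle choices at which U3 coincides with each Pauli operator.
is_x = close(u3(math.pi, 0.0, math.pi), PAULI_X)
is_y = close(u3(math.pi, math.pi / 2, math.pi / 2), PAULI_Y)
is_z = close(u3(0.0, 0.0, math.pi), PAULI_Z)

print(is_x, is_y, is_z)  # True True True
```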

The ansatz is improved once more by conjecturing that the first ancilla qubit has low coherence, so its Hadamard and controlled-$Z$ gates are removed, and by requiring this qubit to be clean at the end: the amplitude on $\ket{0}$ is now proportional to the training fidelity.

In [None]:
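V_circuit_list_d = ModuleList()
for i in range(num_V):
    V_circuit = Circuit(num_qubits)
    if i == 0:
        # Hadamard gates on ancillas 1 and 2 only; ancilla 0 is left untouched.
        V_circuit.h(list(range(1, ancilla_qubits)))
    else:
        # The controlled-Z gate on ancilla 0 is dropped.
        V_circuit.cy([1, 3])
        V_circuit.cx([2, 3])
        V_circuit.universal_qudits(list(range(ancilla_qubits)))

    if i < num_slots:
        V_circuit.cx([2, 3])
        V_circuit.cy([1, 3])
    else:
        # Last tooth: fixed controlled-Pauli gates replace the trainable controlled-U gates.
        V_circuit.cy([2, 3])
        V_circuit.cx([1, 3])
    V_circuit_list_d.append(V_circuit)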
net.V_circuit_list = V_circuit_list_d
net.plot()
net.train(
    projector=zero_state(1).density_matrix.kron(torch.eye(2 ** (ancilla_qubits - 1)))
)
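
The `projector` argument above is the operator $|0\rangle\langle 0| \otimes I_4$ on the three ancilla qubits. The following plain-Python sketch builds the same matrix with nested lists to make its structure explicit. It assumes the first tensor factor corresponds to the first ancilla qubit, matching the order of the `kron` call in the notebook; `kron`, `identity`, and `matmul` are hypothetical helpers, and nothing here describes how `train` uses the projector internally.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]


def identity(n):
    """n x n identity matrix as nested lists."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


ancilla_qubits = 3
zero_projector = [[1, 0], [0, 0]]  # |0><0| on the first ancilla qubit

# Same structure as zero_state(1).density_matrix.kron(torch.eye(4)).
projector = kron(zero_projector, identity(2 ** (ancilla_qubits - 1)))
diagonal = [projector[i][i] for i in range(len(projector))]

print(diagonal)  # [1, 1, 1, 1, 0, 0, 0, 0]
```

The matrix is diagonal and idempotent with trace 4: it keeps the ancilla basis states whose first qubit is $|0\rangle$ and discards the rest.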