# Extend Janus-CT To Identify Bugs in Quantum Circuits

**Author:** Congliang Lang \& Siwei Tan  

**Date:** 15/4/2024

Based on "[QuCT: A Framework for Analyzing Quantum Circuit by Extracting Contextual and Topological Features (MICRO 2023)][1]"

[1]: https://scholar.google.com/scholar_url?url=https://dl.acm.org/doi/abs/10.1145/3613424.3614274%3Fcasa_token%3DffjIB1hQ4ZwAAAAA:8MajDLrDOC74WoeMf7r7AoQ-koxCa4E1TNqQg3GSDz03xUX6XdE3toNTM-YdM_e4rKEusMceJ6BGJg&hl=zh-CN&sa=T&oi=gsb&ct=res&cd=0&d=11146218754516883150&ei=42YSZpPlFL6s6rQPtt6x6Ac&scisig=AFWwaeYaiu2hyx8HUJ_7Buf9Mwom

The vectorization of Janus-CT can be extended to more downstream tasks. For example, in this notebook, we use Janus-CT to identify potential bugs in a quantum algorithm implementation. We apply a data-driven method that trains a model to predict the error rate.

In [1]:
import sys
sys.path.append('..')
import os
os.chdir("..")
import logging
logging.basicConfig(level=logging.WARN)
import ray
ray.init(log_to_driver=False)

from janusq.data_objects.algorithms import get_algorithm_circuits
import random
import numpy as np
from collections import defaultdict
from janusq.analysis.vectorization import RandomwalkModel
from janusq.data_objects.backend import LinearBackend
from janusq.data_objects.circuit import Circuit


As shown in the following code, the downstream model should be a Python class that takes the upstream model as an attribute. This enables the model to extract topological and contextual information for further analysis. For example, in bug identification, we use this information to infer the functionality of each gate in the circuit and flag abnormal gates as bugs.

In [2]:
from collections import Counter

class BugIdentificationModel:
    def __init__(self, up_model: RandomwalkModel) -> None:
        '''
        description: bug identification model based on upstream model
        param {RandomwalkModel} up_model: random walk and turn gates to vecs
        '''
        self.up_model = up_model
    
    def train(self, algorithm_to_circuirts: dict[str, list[Circuit]]):
        '''
        description: use algorithm circuits as the training dataset and label every gate vector with the functionality (algorithm) it comes from
        param {dict[str, list[Circuit]]} algorithm_to_circuirts: maps each algorithm name to its circuits
        '''
        self.total_vecs = []
        self.functionalities = []
        
        algorithm_names = list(algorithm_to_circuirts.keys())
        for algorithm_name, circuits in algorithm_to_circuirts.items():
            for circuit in circuits:
                vecs = self.up_model.vectorize(circuit)
                self.total_vecs += list(vecs)
                self.functionalities += [algorithm_names.index(algorithm_name)] * len(vecs)

        self.total_vecs = np.array(self.total_vecs)
        self.functionalities = np.array(self.functionalities)

    
    def identify_bug(self, circuit: Circuit, top_k = 3, dist_threshold = .2):
        '''
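        description: identify bugs by comparing each gate vector with the labeled training vectors
        param {Circuit} circuit: circuit to be analyzed
        param {int} top_k: number of nearest training vectors considered per gate, and number of most common functionalities kept for the circuit
        param {float} dist_threshold: maximum distance at which a training vector still counts as a match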
        '''
        gate_vecs = self.up_model.vectorize(circuit)
        
        functionalities_per_gate = []
        all_functionalities = []
        for analyzed_vec in gate_vecs:
            dists = np.sqrt(np.sum((self.total_vecs - analyzed_vec)**2, axis=1))
            
            nearest_dist_indices = np.argsort(dists)[:top_k]
            nearest_dists = dists[nearest_dist_indices]
            
            nearest_dist_indices = nearest_dist_indices[nearest_dists < dist_threshold]
            
            nearest_functionalities = self.functionalities[nearest_dist_indices]
            functionalities_per_gate.append(nearest_functionalities)
            all_functionalities += list(nearest_functionalities)
        
        top_functionalities = [
            functionality
            for functionality, count in Counter(all_functionalities).most_common(top_k)
            if count / circuit.n_gates > 0.2
        ]
        
        predicted_gate_indices = []
        for i, possible_functionalities in enumerate(functionalities_per_gate):
            if len([functionality for functionality in possible_functionalities if functionality in top_functionalities]) != 0:
                continue
            predicted_gate_indices.append(i)
        
        # print(circuit)
        return predicted_gate_indices

Unlike traditional debugging approaches that need to repeatedly process large samples, our bug identification method provides a one-shot solution by using the gate vectors from the upstream model. It takes advantage of the many reusable modules that compose quantum algorithms. For example, the quantum Fourier transform (QFT) module and the Grover module are often applied to find the best decision strategy or estimate the expectation of discrete random processes. This makes it possible to infer the functionality of a gate by comparing its vector to a standard vector list derived from quantum algorithms. This vector list is an algorithm dataset generated offline from widely used algorithms, where each vector is labeled with a functionality. Note that, in contrast to fidelity datasets, this dataset is hardware-independent.


Since a gate vector involves multiple paths, it can be viewed as an abstraction of a module in the circuit. To infer the functionality of a given gate, we first calculate the distance between its vector and each vector in the standard vector list. We then label the gate with the functionality of every standard vector whose distance is below a threshold.
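
The labeling step can be sketched with plain NumPy. This is a minimal illustration rather than the Janus-CT implementation: the vectors, the labels, and the helper `label_gate` are hypothetical values and names chosen for this example.

```python
import numpy as np

# Hypothetical standard vector list: one row per labeled gate vector.
standard_vecs = np.array([
    [1.0, 0.0, 0.5],  # taken from a QFT circuit
    [0.9, 0.1, 0.5],  # taken from a QSVM circuit
    [0.0, 1.0, 0.0],  # taken from a Grover circuit
    [0.2, 0.8, 0.9],  # taken from a GHZ circuit
])
standard_labels = np.array(["qft", "qsvm", "grover", "ghz"])


def label_gate(gate_vec, vecs, labels, dist_threshold):
    """Return the labels of all standard vectors closer than dist_threshold."""
    # Euclidean distance between the gate vector and every standard vector.
    dists = np.linalg.norm(vecs - gate_vec, axis=1)
    matched = labels[dists < dist_threshold].tolist()
    return sorted(set(matched)), dists


# Hypothetical vector of the gate under analysis.
gate_vec = np.array([1.0, 0.0, 0.4])
gate_labels, dists = label_gate(gate_vec, standard_vecs, standard_labels, dist_threshold=0.7)
print(gate_labels)
```

The `identify_bug` method in this notebook follows the same idea, but it first keeps only the `top_k` nearest vectors and then applies the distance threshold to them.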

<div style="text-align:center;">
    <img src="pictures/2-6.detect_bug.jpg"  width="60%" height="60%">
</div>

For example, the figure above illustrates the distances between the vector of the CZ gate in the third layer and four standard vectors from the dataset. With a threshold of 0.7, this CZ gate is labeled with the QFT and quantum support vector machine (QSVM) functionalities.


<div style="text-align:center;">
    <img src="pictures/2-6.algorithm.png"  width="60%" height="60%">
</div>

After labeling all gates in the circuit, we apply Algorithm 1 to detect the gates that may exhibit bugs. Assuming that bugs only occur in a sub-circuit, we define a parameter $N_b$ as the maximum number of gates that can be involved in a bug. Each gate is first regarded as a bug (lines 1-2). We then collect the neighboring gates $G_{nb}$ that can be visited within ($N_b+1$) steps (line 3). The functionality shared by the most gates in $G_{nb}$ is denoted as $m_{func}$ (line 4).
For each functionality of the input vector, its gate is not identified as a bug in two cases:

* The functionality of this gate matches $m_{func}$, the most frequent functionality among its neighboring gates (lines 6-7).
* $num$ exceeds $N_b$: a sub-circuit of more than $N_b$ gates then shares the same functionality, so there is a high probability that this gate is not a bug (lines 8-10).
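
Algorithm 1 is only shown as an image, so the following is a minimal sketch reconstructed from the description above, not the authors' implementation. The helpers `neighbors_within` and `detect_bugs`, the `adjacency` dictionary, and the toy circuit are hypothetical. It also assumes that $num$ is the number of neighboring gates that share the functionality being checked.

```python
from collections import Counter, deque


def neighbors_within(adjacency, start, max_steps):
    """Breadth-first search: gates reachable from start in at most max_steps steps."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        gate, depth = frontier.popleft()
        if depth == max_steps:
            continue
        for nxt in adjacency[gate]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    seen.discard(start)
    return seen


def detect_bugs(gate_labels, adjacency, n_b):
    """gate_labels[i] is the set of functionalities assigned to gate i."""
    bugs = []
    for gate, labels in enumerate(gate_labels):
        is_bug = True  # every gate starts as a bug candidate
        g_nb = neighbors_within(adjacency, gate, n_b + 1)
        # Count how many neighboring gates carry each functionality.
        counts = Counter(f for nb in sorted(g_nb) for f in gate_labels[nb])
        m_func = counts.most_common(1)[0][0] if counts else None
        for func in labels:
            if func == m_func:
                is_bug = False  # case 1: agrees with the neighborhood majority
            elif counts[func] > n_b:
                is_bug = False  # case 2 (assumption: num == counts[func])
        if is_bug:
            bugs.append(gate)
    return bugs


# Toy circuit: a chain of 7 gates where only gate 3 has a foreign label.
n_gates = 7
adjacency = {i: [j for j in (i - 1, i + 1) if 0 <= j < n_gates] for i in range(n_gates)}
gate_labels = [{"qft"}, {"qft"}, {"qft"}, {"grover"}, {"qft"}, {"qft"}, {"qft"}]
print(detect_bugs(gate_labels, adjacency, n_b=1))
```

A gate without any label stays a bug candidate in this sketch, which matches how `identify_bug` above treats gates that have no matching functionality.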


In [3]:

def construct_negatives(circuit: Circuit, n_error_gates:int, basis_gates):
    '''
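    description: construct a buggy copy of a circuit by overwriting consecutive gates with random gates
    param {Circuit} circuit: circuit that the buggy circuit is based on
    param {int} n_error_gates: number of consecutive gates to overwrite
    param {list} basis_gates: basic gate types that the injected gates are drawn from
    return: the buggy circuit and the indices of the overwritten gates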
    '''
    
    n_qubits = circuit.n_qubits
    bug_circuit = circuit.copy()
    for gate in bug_circuit.gates:
        gate.vec = None
    import time
    random.seed(time.time())
    bug_start = random.randint(0, max(circuit.n_gates - 1 - n_error_gates, 1))
    bug_end = bug_start + n_error_gates
    bug_gate_ids = list(range(bug_start, min(bug_end, circuit.n_gates)))

    for bug_gate_id in bug_gate_ids:
        
        gate = bug_circuit.gates[bug_gate_id]

        name = random.choice(basis_gates) # ['rx', 'ry', 'rz', 'h', 'cz', 'cx']

        params = np.random.random((3,)) * 2 * np.pi
        params = params.tolist()
        
        qubit1 = random.randint(0, n_qubits - 1)
        qubit2 = random.choice([qubit for qubit in range(n_qubits) if qubit != qubit1])
        qubits = [qubit1, qubit2]
        
        gate['name'] = name
        if name in ('rx', 'ry', 'rz'):
            gate['qubits'] = qubits[:1]
            gate['params'] = params[:1]
            
        elif name in ('cz', 'cx'):
            gate['qubits'] = qubits
            gate['params'] = []
            
        elif name == 'h':
            gate['qubits'] = qubits[:1]
            gate['params'] = []
            
        elif name == 'u':
            gate['qubits'] = qubits[:1]
            gate['params'] = params
            
        else:
            logging.error("no such gate")
            # keep the (circuit, gate indices) return shape that callers unpack
            return circuit, []

    bug_circuit.name = circuit.name
    return bug_circuit, bug_gate_ids

In [4]:
algorithm_names = ['qft', 'hs', 'ising', 'qknn', 'qsvm', 'vqc', 'ghz', 'grover'] # 8 programs
algorithm_to_circuirts = defaultdict(list)
algorithm_circuits = []

backend = LinearBackend(8)
up_model = RandomwalkModel(n_steps = 2, n_walks = 30, backend = backend, decay=.5)

for n_qubits in range(5, backend.n_qubits + 1):
    for algorithm, circuit in zip(algorithm_names, get_algorithm_circuits(n_qubits, backend, algorithm_names)):
        algorithm_to_circuirts[algorithm].append(circuit)
        algorithm_circuits.append(circuit)

up_model.train(algorithm_circuits)

100%|███████████████████████████████████████████| 32/32 [01:04<00:00,  2.03s/it]


In [5]:
bug_indentify_model = BugIdentificationModel(up_model)
bug_indentify_model.train(algorithm_to_circuirts)

In [6]:
# evaluate the model
for circuit in algorithm_circuits:
    error_circuit, error_gate_indices = construct_negatives(circuit, n_error_gates=3, basis_gates= backend.basis_gates)
    predict_indices = bug_indentify_model.identify_bug(error_circuit)

    # number of injected bug gates that the model flagged
    n_correct = sum(1 for index in predict_indices if index in error_gate_indices)

    print(f"algorithm: {error_circuit.name}, identify_rate: {n_correct * 100 / len(error_gate_indices)}")

algorithm: qft_5, identify_rate: 100.0
algorithm: hs_5, identify_rate: 33.333333333333336
algorithm: ising_5, identify_rate: 100.0
algorithm: qknn_5, identify_rate: 100.0
algorithm: qsvm_5, identify_rate: 66.66666666666667
algorithm: vqc_5, identify_rate: 100.0
algorithm: ghz_5, identify_rate: 33.333333333333336
algorithm: grover_5, identify_rate: 100.0
algorithm: qft_6, identify_rate: 100.0
algorithm: hs_6, identify_rate: 100.0
algorithm: ising_6, identify_rate: 100.0
algorithm: qknn_6, identify_rate: 100.0
algorithm: qsvm_6, identify_rate: 33.333333333333336
algorithm: vqc_6, identify_rate: 100.0
algorithm: ghz_6, identify_rate: 66.66666666666667
algorithm: grover_6, identify_rate: 100.0
algorithm: qft_7, identify_rate: 100.0
algorithm: hs_7, identify_rate: 100.0
algorithm: ising_7, identify_rate: 100.0
algorithm: qknn_7, identify_rate: 100.0
algorithm: qsvm_7, identify_rate: 33.333333333333336
algorithm: vqc_7, identify_rate: 100.0
algorithm: ghz_7, identify_rate: 0.0
algorithm: grover_7, identify_rate: 10
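
The identify rate printed above is the share of injected bug gates that were flagged (recall). It does not count correct gates that were flagged by mistake. A minimal sketch of reporting both quantities is shown below; `score_prediction` and the index lists are hypothetical and not part of janusq.

```python
def score_prediction(predicted, injected):
    """Return (precision, recall) of predicted bug indices against injected ones."""
    predicted, injected = set(predicted), set(injected)
    hits = predicted & injected
    # recall: share of injected bugs that were found
    recall = len(hits) / len(injected) if injected else 0.0
    # precision: share of flagged gates that really are injected bugs
    precision = len(hits) / len(predicted) if predicted else 0.0
    return precision, recall


# Hypothetical result: all three injected gates found, plus one false alarm.
precision, recall = score_prediction(predicted=[4, 5, 6, 9], injected=[4, 5, 6])
print(f"precision: {precision:.2f}, recall: {recall:.2f}")
```

In the evaluation loop, `predict_indices` and `error_gate_indices` could be passed to such a function in place of the hypothetical lists.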