## Data Generation

This cell generates a dataset of 48 unoptimized quantum circuits using Qiskit’s `random_circuit()` function: qubit counts range from 2 to 5 and depths from 1 to 4, with 3 random instances per combination (4 × 4 × 3 = 48). The circuits are transpiled into a fixed basis that TKET can convert (`h`, `cx`, `rz`, `t`) with `optimization_level=0` so that they remain unoptimized, then converted to TKET’s `Circuit` format using `qiskit_to_tk`. This step provides the raw circuits that the later cells use for feature extraction and for evaluating optimization passes. Because `random_circuit()` is called without a seed, each run produces a different dataset.

In [2]:
from qiskit.circuit.random import random_circuit
from qiskit import transpile
from pytket.extensions.qiskit import qiskit_to_tk

# initialise empty random circuit list
random_circuits = []

# ensure generated gates are compatible with TKET
basis_gates = ['h', 'cx', 'rz', 't']

for n_qubits in [2, 3, 4, 5]:  # vary qubit count
    for depth in [1, 2, 3, 4]:  # vary circuit depth
        for _ in range(3):  # 3 instances per combination
            circ = random_circuit(n_qubits, depth, measure=False)
            
            # converts Qiskit circuit to circuits composed of basis gates defined above
            transpiled_circ = transpile(circ, basis_gates=basis_gates, optimization_level=0)
            random_circuits.append(transpiled_circ)

# convert transpiled Qiskit circuits to TKET for optimisation
tket_circuits = [qiskit_to_tk(circ) for circ in random_circuits]
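
The loop above draws fresh random circuits on every run. A minimal sketch of how the same 4 × 4 × 3 grid could be enumerated with one deterministic seed per circuit is shown here; `circuit_specs` and `base_seed` are hypothetical names introduced for illustration, and each seed is meant to be handed to `random_circuit` through its `seed` argument.

```python
from itertools import product

# Same grid as the cell above: 4 qubit counts x 4 depths x 3 repeats.
QUBIT_COUNTS = [2, 3, 4, 5]
DEPTHS = [1, 2, 3, 4]
REPEATS = 3


def circuit_specs(base_seed=0):
    """Return one (n_qubits, depth, seed) tuple per circuit in the grid.

    base_seed is a hypothetical parameter: every spec receives a distinct,
    deterministic seed so the dataset could be regenerated exactly.
    """
    grid = product(QUBIT_COUNTS, DEPTHS, range(REPEATS))
    return [(n, d, base_seed + i) for i, (n, d, _) in enumerate(grid)]


specs = circuit_specs()
print(len(specs))   # 48
print(specs[0])     # (2, 1, 0)
print(specs[-1])    # (5, 4, 47)
```

Each tuple could then drive a call such as `random_circuit(n_qubits, depth, measure=False, seed=seed)`, which would make the reported accuracy and feature importances repeatable between runs.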

## Feature Extraction

We define an `extract_features` function to analyze each of the 48 TKET circuits from Cell 1, extracting seven features: number of qubits (`n_qubits`), circuit depth (`depth`), counts of specific gates (`n_cx`, `n_h`, `n_rz`, `n_t`), and a connectivity metric (`connectivity`, the average number of CX gate endpoints per qubit, computed as `2 * n_cx / n_qubits`). These features are stored in `circuit_features` as a list of dictionaries. This step quantifies circuit properties that influence optimization, providing the input data for our ML model to predict the best TKET pass.

In [4]:
from pytket import OpType

# define a feature extraction function
def extract_features(circuit):
    features = {
        "n_qubits": circuit.n_qubits,  # number of qubits
        "depth": circuit.depth(),      # circuit depth
        "n_cx": circuit.n_gates_of_type(OpType.CX),  # number of CNOT gates
        "n_h": circuit.n_gates_of_type(OpType.H),    # number of Hadamard gates
        "n_rz": circuit.n_gates_of_type(OpType.Rz),  # number of Rz gates
        "n_t": circuit.n_gates_of_type(OpType.T),    # number of T gates
    }
    
    # calculate connectivity
    cx_count = features["n_cx"]
    if cx_count > 0:
        # each CX involves 2 qubits; estimate interactions
        features["connectivity"] = (2 * cx_count) / circuit.n_qubits
    else:
        features["connectivity"] = 0.0
    return features

# extract features
circuit_features = [extract_features(circ) for circ in tket_circuits]
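
As a worked example of the connectivity metric, the sketch below recomputes it outside pytket. The standalone `connectivity` helper is a hypothetical name, and the inputs (8 CX gates on 3 qubits) are taken from the example circuit printed at the end of this notebook.

```python
def connectivity(n_cx, n_qubits):
    """Average number of CX endpoints per qubit: 2 * n_cx / n_qubits."""
    if n_qubits <= 0:
        raise ValueError("n_qubits must be positive")
    return (2 * n_cx) / n_qubits


value = connectivity(n_cx=8, n_qubits=3)
print(round(value, 4))          # 5.3333
print(connectivity(0, 4))       # 0.0, so no special case is needed for zero CX gates
```

For a fixed qubit count this metric is proportional to `n_cx`, so the two features carry overlapping information.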

## TKET Optimisation

This cell evaluates three TKET optimization passes (`CliffordSimp`, `FullPeepholeOptimise`, `PauliSimp`) on each circuit from Cell 1 to determine which reduces gate count the most. The `evaluate_optimizations` function applies each pass to a copy of the circuit, computes the gate reduction (original gate count minus optimized gate count, so a negative value means the pass added gates), and selects the pass with the largest reduction as the label. The next cell pairs these labels with the features to create `training_data`, a list of 48 (features, label) tuples.

In [6]:
# import TKET optimization passes
from pytket.passes import CliffordSimp, FullPeepholeOptimise, PauliSimp
from pytket import Circuit

# define optimization passes for testing
optimization_passes = {
    "CliffordSimp": CliffordSimp(),
    "FullPeepholeOptimise": FullPeepholeOptimise(),
    "PauliSimp": PauliSimp()
}

# define evaluation function to evaluate passes
def evaluate_optimizations(circuit):
    """Return the best pass name and the gate reduction achieved by each pass.

    - takes a TKET Circuit as input
    - stores the original gate count (circuit.n_gates) as a baseline
    - initializes an empty dictionary to track each pass's performance
    - iterates over the optimization_passes dictionary
    - applies the pass and computes the gate reduction
    - selects the pass with the largest reduction
    """
    original_gates = circuit.n_gates
    pass_results = {}
    
    for pass_name, pass_obj in optimization_passes.items():
        
        # create a fresh copy of the circuit
        temp_circ = Circuit(circuit.n_qubits)
        temp_circ.append(circuit)
        
        # apply the optimization pass
        pass_obj.apply(temp_circ)
        
        # calculate gate reduction
        reduction = original_gates - temp_circ.n_gates
        pass_results[pass_name] = reduction
    
    # find the pass with the largest gate reduction
    best_pass = max(pass_results, key=pass_results.get)
    return best_pass, pass_results
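
One detail of the `max(pass_results, key=pass_results.get)` selection is worth illustrating: when several passes tie, Python returns the first key in dictionary order, and a pass is chosen even when no pass reduces the gate count. The sketch below uses made-up reduction values, and `best_passes` is a hypothetical helper, not part of the notebook.

```python
def best_passes(pass_results):
    """Return every pass that ties for the largest gate reduction."""
    top = max(pass_results.values())
    return [name for name, reduction in pass_results.items() if reduction == top]


# Illustrative reduction values, not measured results.
tied = {"CliffordSimp": 3, "FullPeepholeOptimise": 3, "PauliSimp": 0}
print(max(tied, key=tied.get))   # CliffordSimp (the first key wins the tie)
print(best_passes(tied))         # ['CliffordSimp', 'FullPeepholeOptimise']

# No pass helps here, yet max() still names a "best" pass.
no_gain = {"CliffordSimp": 0, "FullPeepholeOptimise": 0, "PauliSimp": -2}
print(max(no_gain, key=no_gain.get))   # CliffordSimp
```

Because `CliffordSimp` is listed first in `optimization_passes`, ties are labelled in its favour, which may skew the label distribution on small circuits.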

## Create Training Data

In [8]:
# create training data by evaluating optimisations and pairing the best pass with each circuit's features
training_data = []
for circ, feats in zip(tket_circuits, circuit_features):
    best_pass, results = evaluate_optimizations(circ)
    training_data.append((feats, best_pass))
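
Before training, it can help to check how the labels are distributed, since a classifier on 48 samples is easily dominated by one class. The sketch below is a minimal example with placeholder labels; `label_distribution` is a hypothetical helper, and in the notebook the input would be `[label for _, label in training_data]`.

```python
from collections import Counter


def label_distribution(labels):
    """Map each label to its share of the dataset, most common first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.most_common()}


# Placeholder labels for illustration only, not results from the notebook.
example_labels = (
    ["FullPeepholeOptimise"] * 6 + ["CliffordSimp"] * 3 + ["PauliSimp"] * 1
)
dist = label_distribution(example_labels)
majority_label = max(dist, key=dist.get)
print(dist)
print(majority_label, dist[majority_label])
```

The share of the majority label is the accuracy that a model would reach by always predicting that label, which gives a baseline for judging the trained classifier.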

## Model Training

We train a random forest classifier using scikit-learn on the `training_data`, converting features into a NumPy array (`X`) and labels into `y`. The data is split 80/20 into training and test sets, and the model learns to predict the best TKET pass from the seven features.

In [10]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import numpy as np

# convert features from training_data to 2d np array (X) and extract labels into 1-D array
X = np.array([[f["n_qubits"], f["depth"], f["n_cx"], f["n_h"], f["n_rz"], f["n_t"], f["connectivity"]] 
              for f, _ in training_data])
y = np.array([label for _, label in training_data])

# split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# initialize and train the model
# we use a simple random forest classifier as we have a small dataset
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
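
The feature order used to build `X` has to match the order used later at prediction time. A minimal sketch of one way to keep the two in step is a single ordered list of names plus a small conversion helper; `FEATURE_NAMES` and `to_vector` are hypothetical names, and the example values are those of the circuit printed at the end of this notebook.

```python
# Single source of truth for the column order of the feature matrix.
FEATURE_NAMES = ("n_qubits", "depth", "n_cx", "n_h", "n_rz", "n_t", "connectivity")


def to_vector(features, names=FEATURE_NAMES):
    """Convert a feature dictionary into a list ordered by `names`."""
    missing = [name for name in names if name not in features]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(features[name]) for name in names]


example = {
    "n_qubits": 3, "depth": 24, "n_cx": 8, "n_h": 4,
    "n_rz": 15, "n_t": 5, "connectivity": 16 / 3,
}
print(to_vector(example))
```

With such a helper, the training matrix could be built as `[to_vector(f) for f, _ in training_data]` and a new circuit converted with the same call.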

## Model Testing and Evaluation

We evaluate the trained model on the held-out test set. The accuracy is 40%, which is not far above the one-in-three rate expected from guessing uniformly among the three passes.

In [12]:
# input test set to the model and evaluate model accuracy
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model accuracy on test set: {accuracy:.2f}")

Model accuracy on test set: 0.40
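
To put the reported figure in context, the arithmetic below works out how many circuits the test set contains. It is a sketch that assumes scikit-learn rounds a fractional `test_size` up to the next whole sample, which matches its documented behaviour.

```python
import math

n_samples = 48      # circuits in the dataset
test_size = 0.2     # fraction passed to train_test_split

# Assumption: the test set size is ceil(test_size * n_samples).
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

accuracy = 0.40     # value printed by the cell above
n_correct = round(accuracy * n_test)

print(n_train, n_test)   # 38 10
print(n_correct)         # 4
print(1 / n_test)        # 0.1
```

With only 10 test circuits, each prediction moves the accuracy by 10 percentage points, so the estimate is coarse and would change noticeably with a different split.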


## Feature Importance

It is important to quantify which circuit features contribute to the model's decisions. In the output below, the number of CNOT gates (`n_cx`) is the most important feature at about 21.5%, followed closely by `connectivity` at 20.2%. The two are closely related, since `connectivity` is computed as `2 * n_cx / n_qubits`.

In [14]:
# Feature importance
feature_names = ["n_qubits", "depth", "n_cx", "n_h", "n_rz", "n_t", "connectivity"]
importances = model.feature_importances_
for name, importance in zip(feature_names, importances):
    print(f"Feature {name}: Importance = {importance:.3f}")

Feature n_qubits: Importance = 0.098
Feature depth: Importance = 0.125
Feature n_cx: Importance = 0.215
Feature n_h: Importance = 0.092
Feature n_rz: Importance = 0.106
Feature n_t: Importance = 0.162
Feature connectivity: Importance = 0.202
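
The importances are easier to compare when sorted. The sketch below ranks the values printed above; they are copied by hand from that output rather than read from the model.

```python
# Importances copied from the printed output above.
importances = {
    "n_qubits": 0.098,
    "depth": 0.125,
    "n_cx": 0.215,
    "n_h": 0.092,
    "n_rz": 0.106,
    "n_t": 0.162,
    "connectivity": 0.202,
}

# Sort from most to least important.
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(f"{name:<13}{value:6.1%}")

# The two CX-derived features together.
cx_share = importances["n_cx"] + importances["connectivity"]
print(f"n_cx + connectivity: {cx_share:.1%}")
```

Together the two CX-derived features account for roughly 42% of the total importance, while no single feature dominates.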


## Application Example on New, Unseen Single Circuit

Having trained and tested our model, we can now apply it to an unseen circuit to predict which optimisation pass to use. In this example the model predicts `FullPeepholeOptimise`, but verification shows that `CliffordSimp` removes 15 gates against 14 for `FullPeepholeOptimise`, so the prediction is a near miss rather than correct. Given the 40% test accuracy, predictions like this one should be checked against the actual passes.

In [16]:
# generate a new random 3-qubit circuit
new_circ = random_circuit(3, 3, measure=False)
transpiled_circ = transpile(new_circ, basis_gates=basis_gates, optimization_level=0)
tket_new_circ = qiskit_to_tk(transpiled_circ)

# extract features
new_features_dict = extract_features(tket_new_circ)
new_features = np.array([new_features_dict["n_qubits"], new_features_dict["depth"], 
                         new_features_dict["n_cx"], new_features_dict["n_h"], 
                         new_features_dict["n_rz"], new_features_dict["n_t"], 
                         new_features_dict["connectivity"]])

# predict the best optimization pass
predicted_pass = model.predict([new_features])[0]

# verify by applying every pass from Cell 3 to the new circuit
original_gates = tket_new_circ.n_gates
best_pass, pass_results = evaluate_optimizations(tket_new_circ)

# results
print(f"New circuit features: {new_features}")
print(f"Predicted optimization pass: {predicted_pass}")
print(f"Verification - Original gates: {original_gates}, Reductions: {pass_results}")
print(f"Correct optimization pass: {best_pass}")

New circuit features: [ 3.         24.          8.          4.         15.          5.
  5.33333333]
Predicted optimization pass: FullPeepholeOptimise
Verification - Original gates: 32, Reductions: {'CliffordSimp': 15, 'FullPeepholeOptimise': 14, 'PauliSimp': -1}
Correct optimization pass: CliffordSimp
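
A right-or-wrong label hides how costly a wrong prediction is. One possible way to express this, sketched here, is the number of extra gates left in the circuit by following the prediction instead of the best pass; `prediction_regret` is a hypothetical helper, and the reductions are those printed above.

```python
def prediction_regret(pass_results, predicted_pass):
    """Gates removed by the best pass minus gates removed by the predicted pass."""
    best_reduction = max(pass_results.values())
    return best_reduction - pass_results[predicted_pass]


# Reductions copied from the verification output above.
results = {"CliffordSimp": 15, "FullPeepholeOptimise": 14, "PauliSimp": -1}

regret = prediction_regret(results, "FullPeepholeOptimise")
print(regret)                                     # 1
print(prediction_regret(results, "PauliSimp"))    # 16
```

Here the predicted pass leaves one more gate than the best pass on a 32-gate circuit, whereas predicting `PauliSimp` would have cost 16 gates.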
