# Step 5f — QSVM with **3 qubits** and **8 PCA features** (data re‑upload)

This notebook builds a **quantum kernel SVM (QSVM)** using **3 qubits** while ingesting **8 PCA features** via **data re‑uploading**.  
We compute a **fidelity kernel** \(k(x,z) = |\langle \phi(x) | \phi(z) \rangle|^2\), where \(|\phi(x)\rangle = U(x)|0\rangle^{\otimes 3}\) is the state prepared by a fixed feature map \(U(x)\), and train an SVM on the **precomputed** kernel matrix.
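
The kernel can be read off a single circuit. A short derivation, writing \(|0\rangle\) for the all-zeros state of the 3 qubits (shorthand assumed here for brevity):

```latex
\begin{aligned}
|\phi(x)\rangle &= U(x)\,|0\rangle, \\
k(x,z) &= |\langle \phi(x) | \phi(z) \rangle|^2
        = \bigl|\langle 0 |\, U(x)^\dagger U(z) \,| 0 \rangle\bigr|^2 .
\end{aligned}
```

So applying \(U(z)\), then \(U(x)^\dagger\), and measuring the probability of the all-zeros outcome gives \(k(x,z)\). Two consequences follow directly: \(k(x,x) = 1\) and \(k(x,z) = k(z,x)\).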

**Why re‑uploading?** It lets us encode more features than qubits: the 8 features are spread over three successive single‑qubit rotation blocks (RY → RZ → RX) on the same 3 wires, with an entangling layer after each block.

**Inputs required** (from your PCA step):  
- `../data/processed/pca8_train.csv`  
- `../data/processed/pca8_test.csv`


## Config

In [None]:

from pathlib import Path
import math

BASE_DATA = Path("../data")
PROCESSED_DIR = BASE_DATA / "processed"

TRAIN_CSV = PROCESSED_DIR / "pca8_train.csv"
TEST_CSV  = PROCESSED_DIR / "pca8_test.csv"

PC_COLS = ["PC1","PC2","PC3","PC4","PC5","PC6","PC7","PC8"]
N_QUBITS = 3
ALL_CLASSES = ["angry", "fearful", "happy", "neutral", "sad"]

# SVM / kernel settings
SVM_C = 1.0          # regularization strength
ANGLE_CLIP = math.pi # angle bound: PCs are scaled by TRAIN max-abs (with a 10% margin), then clipped to [-pi, pi]
K_SAVE = True        # save kernel matrices to disk

OUT_DIR = PROCESSED_DIR / "qsvm3_8pca_kernel"
OUT_DIR.mkdir(parents=True, exist_ok=True)

print("TRAIN_CSV:", TRAIN_CSV.resolve())
print("TEST_CSV :", TEST_CSV.resolve())


## Verify inputs exist

In [None]:

missing = [str(p) for p in (TRAIN_CSV, TEST_CSV) if not p.exists()]
if missing:
    # FileNotFoundError gives a regular traceback in a notebook; SystemExit is meant for scripts
    raise FileNotFoundError("Missing required file(s):\n  - " + "\n  - ".join(missing))
print("PCA(8) files found ✓")


## Imports & utilities

In [None]:

import json, random, time
import numpy as onp
import numpy as np
import pandas as pd
import pennylane as qml
from sklearn.svm import SVC
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import matplotlib.pyplot as plt

SEED = 7
random.seed(SEED); np.random.seed(SEED)  # onp and np are the same module, so one NumPy seed suffices

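def load_data(train_csv, test_csv, pc_cols, valid_classes):
    tr = pd.read_csv(train_csv); te = pd.read_csv(test_csv)
    # Check required columns first, so a missing column fails with a clear message
    # instead of a KeyError from the label filter below
    for col in pc_cols + ["emotion"]:
        assert col in tr.columns and col in te.columns, f"Missing column: {col}"
    # Keep only rows whose label is one of the valid classes
    tr = tr[tr["emotion"].isin(valid_classes)].reset_index(drop=True)
    te = te[te["emotion"].isin(valid_classes)].reset_index(drop=True)
    return tr, te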

def build_angle_scaler(train_df, pc_cols, clip=np.pi):
    scales = {}
    for k in pc_cols:
        m = float(train_df[k].abs().max())
        scales[k] = {"denom": max(1e-8, 1.1*m), "clip": clip}
    return scales

def apply_angle_scaler(df, pc_cols, scales):
    X = df[pc_cols].to_numpy(onp.float32)
    for j, key in enumerate(pc_cols):
        d = scales[key]["denom"]; c = scales[key]["clip"]
        X[:, j] = np.clip((X[:, j] / d) * c, -c, c)
    return X
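
To make the scaling concrete, here is a minimal single-column sketch of the same rule (divide by 1.1 × TRAIN max-abs, multiply by the clip angle, then clip). The helper name `scale_to_angles` and the toy values are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def scale_to_angles(train_col, values, clip=np.pi, margin=1.1):
    """Scale by the TRAIN max-abs (with a margin), then clip to [-clip, clip]."""
    denom = max(1e-8, margin * float(np.abs(train_col).max()))
    scaled = np.asarray(values, dtype=np.float64) / denom * clip
    return np.clip(scaled, -clip, clip)

# Toy values for one principal component (hypothetical data)
train_pc = np.array([-2.0, 0.5, 1.0])
test_pc = np.array([2.0, 4.0, -10.0])  # test values may exceed the training range

train_angles = scale_to_angles(train_pc, train_pc)
test_angles = scale_to_angles(train_pc, test_pc)

print("train angles:", train_angles)
print("test angles :", test_angles)
```

Because of the 10% margin, training angles stay strictly inside (-π, π); only out-of-range test values hit the clip. The scaler is fit on the training set alone, so no test statistics leak into the encoding.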


## Quantum device and **feature map** U(x) (re‑uploading)

In [None]:

# Robust device: try lightning, else default.qubit
try:
    dev = qml.device("lightning.qubit", wires=N_QUBITS, shots=None)
    print("Using lightning.qubit ✅")
except Exception as e:
    print("Falling back to default.qubit due to:", repr(e))
    dev = qml.device("default.qubit", wires=N_QUBITS, shots=None)
    print("Using default.qubit ✅")

def feature_map(x):
    """Fixed data-encoding circuit U(x) on N_QUBITS=3 with 8 features via re-uploading.
    Blocks (x1..x8 are 1-based feature names, i.e. x[0]..x[7] in code):
      1) RY on q0..q2 using x1..x3
      2) RZ on q0..q2 using x4..x6
      3) RX on q0..q1 using x7..x8 (no RX on q2)
    Entangling layers: a CZ ring after block 1, an IsingZZ ring with
    data-coupled angles after block 2, and a final CZ ring after block 3.
    """
    # Block 1: RY
    qml.RY(x[0], wires=0); qml.RY(x[1], wires=1); qml.RY(x[2], wires=2)
    # Entangle
    qml.CZ(wires=[0,1]); qml.CZ(wires=[1,2]); qml.CZ(wires=[2,0])
    # Block 2: RZ
    qml.RZ(x[3], wires=0); qml.RZ(x[4], wires=1); qml.RZ(x[5], wires=2)
    # ZZ ring with data-coupled angles (product terms encourage nonlinearity)
    qml.IsingZZ(0.25*(x[0]*x[3]), wires=[0,1])
    qml.IsingZZ(0.25*(x[1]*x[4]), wires=[1,2])
    qml.IsingZZ(0.25*(x[2]*x[5]), wires=[2,0])
    # Block 3: RX (x7, x8 on q0, q1; no RX on q2)
    qml.RX(x[6], wires=0); qml.RX(x[7], wires=1)
    # Final CZ ring
    qml.CZ(wires=[0,1]); qml.CZ(wires=[1,2]); qml.CZ(wires=[2,0])
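
As an independent cross-check of the circuit above, the same gate sequence can be simulated with a plain NumPy statevector. This is a sketch under stated assumptions: the gate matrices follow the standard conventions (RY, RZ, RX as exp(-iθσ/2), IsingZZ as exp(-iθ Z⊗Z/2)), wire 0 is treated as the most significant bit, and the helper names (`apply_1q`, `apply_diag`, `feature_state`, `fidelity`) are hypothetical.

```python
import numpy as np

N = 3  # qubits; wire 0 is taken as the most significant bit (assumption of this sketch)
RING = [(0, 1), (1, 2), (2, 0)]
CZ = [1, 1, 1, -1]  # diagonal of CZ in the basis |00>, |01>, |10>, |11>

def apply_1q(state, gate, wire):
    """Apply a 2x2 gate to one wire of an N-qubit statevector."""
    psi = np.tensordot(gate, state.reshape([2] * N), axes=([1], [wire]))
    return np.moveaxis(psi, 0, wire).reshape(-1)

def apply_diag(state, phases, w0, w1):
    """Apply a diagonal two-qubit gate given its four diagonal entries."""
    idx = np.arange(2 ** N)
    b0 = (idx >> (N - 1 - w0)) & 1
    b1 = (idx >> (N - 1 - w1)) & 1
    return state * np.asarray(phases)[2 * b0 + b1]

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(t):
    return np.diag([np.exp(-0.5j * t), np.exp(0.5j * t)])

def rx(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def zz(t):
    a, b = np.exp(-0.5j * t), np.exp(0.5j * t)
    return [a, b, b, a]  # diagonal of exp(-i t/2 Z⊗Z)

def feature_state(x):
    """Return U(x)|000> for the 3-block re-uploading feature map."""
    state = np.zeros(2 ** N, dtype=complex)
    state[0] = 1.0
    for w in range(3):                      # Block 1: RY with x1..x3
        state = apply_1q(state, ry(x[w]), w)
    for a, b in RING:                       # CZ ring
        state = apply_diag(state, CZ, a, b)
    for w in range(3):                      # Block 2: RZ with x4..x6
        state = apply_1q(state, rz(x[3 + w]), w)
    for w, (a, b) in enumerate(RING):       # ZZ ring with data-coupled angles
        state = apply_diag(state, zz(0.25 * x[w] * x[3 + w]), a, b)
    state = apply_1q(state, rx(x[6]), 0)    # Block 3: RX with x7, x8
    state = apply_1q(state, rx(x[7]), 1)
    for a, b in RING:                       # final CZ ring
        state = apply_diag(state, CZ, a, b)
    return state

def fidelity(x, z):
    """Fidelity kernel |<phi(x)|phi(z)>|^2 from explicit statevectors."""
    return float(np.abs(np.vdot(feature_state(x), feature_state(z))) ** 2)

rng = np.random.default_rng(0)
x, z = rng.uniform(-np.pi, np.pi, size=(2, 8))
k_xx, k_xz, k_zx = fidelity(x, x), fidelity(x, z), fidelity(z, x)
print(k_xx, k_xz, k_zx)
```

The sketch is useful mainly as a sanity check: `k_xx` should be 1 up to rounding and `k_xz` should equal `k_zx`, whatever device the notebook ends up using.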


## Fidelity kernel k(x,z) = |⟨φ(x)|φ(z)⟩|²

In [None]:

@qml.qnode(dev)
def kernel_circuit(x, z):
    # Prepare |phi(z)> = U(z)|0>
    feature_map(z)
    # Apply U(x)†
    qml.adjoint(feature_map)(x)
    # Probability of |0..0> equals |⟨phi(x)|phi(z)⟩|^2
    return qml.probs(wires=range(N_QUBITS))

def fidelity_kernel(x, z):
    probs = kernel_circuit(x, z)
    return float(probs[0])  # prob of all-zeros

def compute_kernel_matrix(A, B, desc="K"):
    """Compute K[i,j] = k(A[i], B[j]) with progress."""
    n, m = len(A), len(B)
    K = np.empty((n, m), dtype=np.float64)
    t0 = time.time()
    for i in range(n):
        for j in range(m):
            K[i, j] = fidelity_kernel(A[i], B[j])
        if (i+1) % max(1, n//10) == 0 or i == n-1:
            elapsed = time.time()-t0
            print(f"{desc}: row {i+1}/{n}  (elapsed {elapsed:.1f}s)", end="\r")
    print(f"\n{desc} done in {time.time()-t0:.1f}s; shape={K.shape}")
    return K
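
For the training kernel, where both arguments are the same set, the loop above evaluates every pair twice. A possible shortcut, sketched below, uses the symmetry k(x,z) = k(z,x) and, for an exact (noiseless) fidelity kernel, k(x,x) = 1. The function `symmetric_kernel_matrix` and the stand-in `toy_kernel` are hypothetical; in the notebook the callable would be `fidelity_kernel`.

```python
import numpy as np

def symmetric_kernel_matrix(X, kernel_fn, unit_diagonal=True):
    """Fill K[i, j] = kernel_fn(X[i], X[j]) using only the upper triangle."""
    n = len(X)
    K = np.empty((n, n), dtype=np.float64)
    for i in range(n):
        # A noiseless fidelity kernel has k(x, x) = 1, so the diagonal can be skipped
        K[i, i] = 1.0 if unit_diagonal else kernel_fn(X[i], X[i])
        for j in range(i + 1, n):
            K[i, j] = K[j, i] = kernel_fn(X[i], X[j])
    return K

calls = {"n": 0}  # counts kernel evaluations for the demo

def toy_kernel(x, z):
    """Classical stand-in with the same symmetry and unit diagonal (not the quantum kernel)."""
    calls["n"] += 1
    return float(np.exp(-np.sum((x - z) ** 2)))

X_demo = np.random.default_rng(0).normal(size=(5, 8))
K_demo = symmetric_kernel_matrix(X_demo, toy_kernel)
print("kernel evaluations:", calls["n"], "instead of", len(X_demo) ** 2)
```

This roughly halves the circuit evaluations for `K_train`. It does not apply to `K_test`, whose rows and columns index different sample sets, and the unit-diagonal shortcut should be turned off if the device uses finite shots or noise.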


## Load data, scale angles, and encode labels

In [None]:

train_df, test_df = load_data(TRAIN_CSV, TEST_CSV, PC_COLS, ALL_CLASSES)

scales = build_angle_scaler(train_df, PC_COLS, clip=ANGLE_CLIP)
X_train = apply_angle_scaler(train_df, PC_COLS, scales).astype(np.float64)
X_test  = apply_angle_scaler(test_df,  PC_COLS, scales).astype(np.float64)

le = LabelEncoder()
y_train = le.fit_transform(train_df["emotion"].values)
y_test  = le.transform(test_df["emotion"].values)

print("Classes:", list(le.classes_))
print("X_train:", X_train.shape, "X_test:", X_test.shape)


## Compute quantum kernels (train & test)

In [None]:

# K_train is (n_train, n_train); K_test must be (n_test, n_train) for SVC(kernel="precomputed")
K_train = compute_kernel_matrix(X_train, X_train, desc="K_train")
K_test  = compute_kernel_matrix(X_test,  X_train, desc="K_test ")

if K_SAVE:
    np.save(OUT_DIR / "K_train.npy", K_train)
    np.save(OUT_DIR / "K_test.npy",  K_test)
    print("Saved kernels to:", (OUT_DIR / "K_train.npy").resolve())
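
Before training on the saved matrices, a few cheap sanity checks can catch encoding or indexing mistakes. The sketch below is one way to do it; `check_kernels` is a hypothetical helper, and the demo matrices are built from random unit vectors rather than from the notebook's data.

```python
import numpy as np

def check_kernels(K_train, K_test, atol=1e-6):
    """Basic consistency checks for an exact fidelity kernel."""
    n = K_train.shape[0]
    if K_train.ndim != 2 or K_train.shape != (n, n):
        return {"train_is_square": False}
    sym = (K_train + K_train.T) / 2
    lo = min(K_train.min(), K_test.min())
    hi = max(K_train.max(), K_test.max())
    return {
        "train_is_square": True,
        "train_is_symmetric": bool(np.allclose(K_train, K_train.T, atol=atol)),
        "train_diag_is_one": bool(np.allclose(np.diag(K_train), 1.0, atol=atol)),
        "train_is_psd": bool(np.linalg.eigvalsh(sym).min() >= -atol),
        "test_cols_match_train": K_test.shape[1] == n,
        "values_in_unit_interval": bool(lo >= -atol and hi <= 1 + atol),
    }

rng = np.random.default_rng(0)

def random_states(m, dim=8):
    """Random normalized complex vectors, standing in for |phi(x)> on 3 qubits."""
    v = rng.normal(size=(m, dim)) + 1j * rng.normal(size=(m, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

S_train, S_test = random_states(6), random_states(4)
K_train_demo = np.abs(S_train @ S_train.conj().T) ** 2  # squared overlaps
K_test_demo = np.abs(S_test @ S_train.conj().T) ** 2    # rows: test, cols: train

report = check_kernels(K_train_demo, K_test_demo)
for name, ok in report.items():
    print(f"{name}: {ok}")
```

In the notebook the call would be `check_kernels(K_train, K_test)`. A failed symmetry or unit-diagonal check usually points to a mismatch between the two arguments of the kernel circuit rather than to the SVM.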


## Train SVM (precomputed kernel)

In [None]:

svm = SVC(kernel="precomputed", C=SVM_C, decision_function_shape="ovr")
svm.fit(K_train, y_train)

yhat = svm.predict(K_test)
acc = accuracy_score(y_test, yhat)
print("Test accuracy:", round(acc, 3))

print("\nClassification report:\n", classification_report(y_test, yhat, target_names=le.classes_))
print("Confusion matrix (rows=true, cols=pred):\n", confusion_matrix(y_test, yhat))


## (Optional) Visualize K_train

In [None]:

plt.figure(figsize=(5,4))
plt.imshow(K_train, aspect='auto')
plt.title("K_train (quantum fidelity kernel)")
plt.colorbar(); plt.tight_layout()
plt.show()


## References (encoding more features with fewer qubits)


- **Data re‑uploading** (encode features across repeated blocks on the same qubits): Pérez‑Salinas et al., *Data re‑uploading for a universal quantum classifier* (arXiv 2019; *Quantum*, 2020).  
- **Quantum kernels / QSVM**: Havlíček et al., *Supervised learning with quantum‑enhanced feature spaces* (Nature, 2019); Schuld & Killoran, *Quantum machine learning in feature Hilbert spaces* (PRL, 2019).
