
# 🧬 Quantum Machine Learning Demo on Tox21 (Binary Task)

This notebook demonstrates a minimal quantum machine learning (QML) pipeline on a small subset of the Tox21 dataset.
It treats a single Tox21 task as a binary classification problem (toxic vs. non-toxic) and trains a quantum kernel classifier with PennyLane.

---

## What it does:
- Loads Tox21 and selects a single binary task (e.g., NR-AR)
- Downsamples to 100 molecules and 6 features so the simulation runs quickly
- Builds a simple quantum feature map and trains an SVM on the resulting precomputed kernel

**Note:** Requires `pennylane`, `scikit-learn`, `matplotlib`, and `deepchem`.
    

In [None]:

# Install requirements (uncomment if needed)
# !pip install pennylane scikit-learn matplotlib deepchem
    

In [None]:

import deepchem as dc
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Select one binary classification task
task_index = 0  # e.g., NR-AR
tox21_tasks, datasets, transformers = dc.molnet.load_tox21(featurizer='ECFP')
train_dataset, valid_dataset, test_dataset = datasets

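X = train_dataset.X
y = train_dataset.y[:, task_index]  # Single task
w = train_dataset.w[:, task_index]  # Per-sample weights for that task

# Drop unlabeled samples. DeepChem usually marks a missing label with
# weight 0 rather than NaN, so check both.
mask = ~np.isnan(y) & (w > 0)
X = X[mask]
y = y[mask]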

# Downsample for quick QML training
X_small, _, y_small, _ = train_test_split(X, y, train_size=100, stratify=y, random_state=42)
scaler = StandardScaler()
X_small = scaler.fit_transform(X_small)
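
The circuit cell further down keeps only the first `n_qubits` columns of the fingerprint, which for sparse ECFP bits may carry very little signal. One possible alternative, shown here only as a sketch on synthetic data, is to project onto a few principal components and rescale them to rotation angles. The array `X_bits`, the 5% bit density, and the `[0, pi]` angle range are assumptions made for illustration; they are not part of the notebook's pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-in for ECFP bit vectors: 100 molecules, 1024 sparse bits
X_bits = (rng.random((100, 1024)) < 0.05).astype(float)

n_components = 6  # would be set equal to n_qubits
pca = PCA(n_components=n_components, random_state=0)
X_reduced = pca.fit_transform(X_bits)

# Rescale each component to [0, pi] so the values can serve as rotation angles
mins = X_reduced.min(axis=0)
maxs = X_reduced.max(axis=0)
X_angles = np.pi * (X_reduced - mins) / (maxs - mins)

print(X_angles.shape)
```

In the notebook itself, `X_bits` would be replaced by the downsampled fingerprint matrix, and the PCA would be fitted on training data only.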
    

In [None]:

import pennylane as qml
from pennylane.kernels import kernel_matrix
from sklearn.svm import SVC
from sklearn.metrics import classification_report

n_qubits = 6
X_small = X_small[:, :n_qubits]  # Reduce to match qubit count

dev = qml.device("default.qubit", wires=n_qubits)

def feature_map(x):
    for i in range(n_qubits):
        qml.Hadamard(wires=i)
        qml.RZ(x[i], wires=i)
    for i in range(n_qubits - 1):
        qml.CNOT(wires=[i, i+1])

@qml.qnode(dev)
def kernel_circuit(x1, x2):
    # Overlap test: prepare |phi(x1)>, then undo the same feature map with x2
    feature_map(x1)
    qml.adjoint(feature_map)(x2)
    # Entry 0 (the all-zeros outcome) equals |<phi(x2)|phi(x1)>|^2
    return qml.probs(wires=range(n_qubits))
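
As a sanity check for the simulated kernel, the overlap of two `feature_map` states can be worked out by hand. Each qubit is prepared as `RZ(x_i) H |0>`, and the CNOT chain is the same unitary for both inputs, so it cancels in the inner product. Under those assumptions (noiseless simulation, the feature map exactly as defined above) the squared overlap reduces to a product of `cos^2((x1_i - x2_i) / 2)` terms. The helper names below are illustrative and not part of PennyLane.

```python
import numpy as np

def analytic_kernel(x1, x2):
    """Squared overlap of two H -> RZ -> CNOT-chain feature-map states."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(np.prod(np.cos(diff / 2.0) ** 2))

def analytic_kernel_matrix(A, B):
    """Pairwise kernel values between the rows of A and the rows of B."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    diff = A[:, None, :] - B[None, :, :]  # shape (len(A), len(B), n_features)
    return np.prod(np.cos(diff / 2.0) ** 2, axis=-1)

# Small demo on hand-picked two-feature inputs
A = np.array([[0.0, 0.5], [1.0, -0.5], [2.0, 0.0]])
K = analytic_kernel_matrix(A, A)
print(K)
```

Comparing a few entries of the simulated kernel matrix against this formula is a cheap way to catch wiring mistakes in the circuit. It also shows that this particular feature map is classically easy to evaluate, so it serves as a demo rather than as evidence of any quantum advantage.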
    

In [None]:

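print("Computing quantum kernel...")
# kernel_matrix expects a function that returns a scalar. Entry 0 of the
# probability vector (the all-zeros outcome) serves as the kernel value.
q_kernel = kernel_matrix(X_small, X_small, kernel=lambda x1, x2: kernel_circuit(x1, x2)[0])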

clf = SVC(kernel='precomputed')
clf.fit(q_kernel, y_small)

# Predict on the training samples themselves (reusing the training kernel),
# so this report measures training fit, not generalization
y_pred = clf.predict(q_kernel)
print(classification_report(y_small, y_pred))
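
The report above is computed on the same samples the classifier was trained on. To estimate generalization with a precomputed kernel, the test kernel has to be rectangular: one row per test sample, one column per training sample. The sketch below illustrates that shape convention on synthetic data with a classical stand-in kernel. The function `cos2_kernel_matrix`, the random features, and the toy labels are all assumptions for illustration; in the notebook the two matrices would come from `kernel_matrix` applied to the quantum kernel.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def cos2_kernel_matrix(A, B):
    """Classical stand-in kernel: product of cos^2 of half the feature differences."""
    diff = A[:, None, :] - B[None, :, :]
    return np.prod(np.cos(diff / 2.0) ** 2, axis=-1)

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 6))        # synthetic features, 6 per sample
y_demo = (X_demo[:, 0] > 0).astype(int)  # toy labels, not toxicity data

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0
)

K_train = cos2_kernel_matrix(X_tr, X_tr)  # shape (n_train, n_train)
K_test = cos2_kernel_matrix(X_te, X_tr)   # shape (n_test, n_train)

clf_demo = SVC(kernel="precomputed")
clf_demo.fit(K_train, y_tr)
y_te_pred = clf_demo.predict(K_test)

print(K_train.shape, K_test.shape)
```

Passing a square test-versus-test matrix to `predict` is a common mistake: the columns must always index the training samples used in `fit`.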
    