# Notebook 2: Advanced Quantum Tomography
## Deep Learning (DNN) & Variational Quantum Circuits (VQC) vs. Classical SVR

## 1. Scientific Context & Exploration Goals
In the previous notebook, we established a robust baseline using **Support Vector Regression (SVR)**, a classical kernel-based method. While effective, SVR relies on fixed kernels (RBF, Polynomial) which may not perfectly capture the complex geometry of quantum states under decoherence.

In this second exploratory phase, we move towards **Advanced Architectures**. Our goal is to determine if models with higher representational capacity (Deep Learning) or native quantum priors (VQC) can surpass the classical baseline, particularly in the regime of **impure states (decoherence)** and **limited data**.

We investigate two challengers:
1.  **Deep Neural Networks (DNN):** Relying on the universal approximation property of multi-layer networks to model the highly non-linear mapping from measurements to density matrices.
2.  **Variational Quantum Circuits (VQC):** A "Quantum Machine Learning" approach. We hypothesize that a quantum circuit possesses a natural *inductive bias* for quantum data, potentially requiring fewer parameters to represent the Hilbert space than a classical network.

## 2. A Paradigm Shift: "Physics-Informed" Training
Unlike standard regression which minimizes the Euclidean distance (MSE), we introduce a **Custom Loss Function** grounded in Quantum Information Theory.

### The "Fidelity-Based" Backpropagation
Standard ML optimizes geometry. We want to optimize **physics**.
Instead of blindly minimizing $MSE = ||\vec{r}_{pred} - \vec{r}_{real}||^2$, we configure our Neural Networks (both Classical and Quantum) to directly maximize the **Quantum Fidelity** ($F$).

During the **Backpropagation** pass, the gradient of the Fidelity is computed with respect to the model weights. This forces the optimizer to prioritize directions that increase the physical overlap between the predicted and true states.

**The Mathematical Loss Function:**
For a single qubit state defined by a Bloch vector $\vec{r}$, the loss $\mathcal{L}$ to minimize is:

$$\mathcal{L} = 1 - F(\rho_{pred}, \rho_{real})$$

Where the Fidelity $F$ for single-qubit Bloch vectors is given analytically by:
$$F(\vec{r}_{p}, \vec{r}_{t}) = \frac{1}{2} \left( 1 + \vec{r}_{p} \cdot \vec{r}_{t} + \sqrt{(1 - ||\vec{r}_{p}||^2)(1 - ||\vec{r}_{t}||^2)} \right)$$

* **Interpretation:** The term $\vec{r}_{p} \cdot \vec{r}_{t}$ aligns the vectors directionally. The term under the square root penalizes errors in **purity** (vector length). This allows the DNN to specifically "learn" decoherence.
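
The loss above can be sketched as a small PyTorch module. This is a minimal illustration, not necessarily the definition used later in the notebook: the class name `BlochFidelityLoss` and the `eps` stabilizer are assumptions introduced here. The `eps` term is added because the gradient of a square root diverges at zero, which happens whenever either state is pure.

```python
import torch
from torch import nn


class BlochFidelityLoss(nn.Module):
    """Hypothetical loss: mean of 1 - F over a batch of Bloch vectors."""

    def __init__(self, eps: float = 1e-8):
        super().__init__()
        self.eps = eps  # assumed stabilizer: keeps the sqrt gradient finite at 0

    def forward(self, pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        dot = (pred * target).sum(dim=-1)
        # Clamp so vectors slightly outside the Bloch ball do not produce NaNs.
        one_minus_p = torch.clamp(1.0 - (pred ** 2).sum(dim=-1), min=0.0)
        one_minus_t = torch.clamp(1.0 - (target ** 2).sum(dim=-1), min=0.0)
        fidelity = 0.5 * (1.0 + dot + torch.sqrt(one_minus_p * one_minus_t + self.eps))
        return (1.0 - fidelity).mean()


loss_fn = BlochFidelityLoss()
north = torch.tensor([[0.0, 0.0, 1.0]])
south = torch.tensor([[0.0, 0.0, -1.0]])
loss_same = loss_fn(north, north).item()      # identical pure states
loss_opposite = loss_fn(north, south).item()  # orthogonal pure states

# Gradients flow from the loss back to the prediction.
pred = torch.tensor([[0.3, 0.0, 0.4]], requires_grad=True)
loss_fn(pred, torch.tensor([[0.0, 0.0, 0.5]])).backward()
grad = pred.grad
```

Identical pure states give a loss close to 0 and antipodal pure states a loss close to 1, up to the small offset introduced by `eps`.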

## 3. Architecture Overview

### A. Deep Neural Network (DNN - PyTorch)
* **Structure:** A Multi-Layer Perceptron (MLP) with fully connected layers and non-linear activation functions (ReLU).
* **Why:** To test if a "Universal Approximator" can learn the noise models better than fixed kernels.
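
As an illustration of the structure described above, here is a minimal MLP sketch. The class name `BlochMLP`, the hidden width of 64, and the two hidden layers are assumptions chosen for the example; the notebook's actual architecture may differ.

```python
import torch
from torch import nn


class BlochMLP(nn.Module):
    """Hypothetical MLP: 3 noisy expectations in, 3 Bloch components out."""

    def __init__(self, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 3),
            nn.Tanh(),  # bounds each output component to [-1, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


torch.manual_seed(0)
model = BlochMLP()
batch = torch.rand(4, 3) * 2 - 1  # fake expectations in [-1, 1]
out = model(batch)
n_params = sum(p.numel() for p in model.parameters())
```

With these assumed sizes the model has 4611 trainable parameters, which gives a reference point when comparing against the much smaller parameter count of a variational circuit.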

### B. Variational Quantum Circuit (VQC - PennyLane)
* **Concept:** We use a parameterized quantum circuit as the model.
* **Mechanism:**
    1.  **Encoding:** Classical inputs ($X, Y, Z$) are embedded into a quantum state via rotation gates.
    2.  **Processing:** A sequence of trainable gates (Ansatz) manipulates the state.
    3.  **Measurement:** We measure the expectation values of Pauli operators to obtain the output vector.
* **Hypothesis:** "Quantum for Quantum". A quantum circuit naturally evolves on the Bloch sphere (or inside it for mixed states via subsystems), which might offer better generalization with fewer parameters.
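
The three steps of the mechanism can be illustrated with a plain NumPy simulation of a single qubit. This is a toy sketch and does not use PennyLane: the function names, the `RX`/`RY`/`RZ` encoding order, and the `ZYZ` ansatz are assumptions made for the example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
PAULIS = {
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def rot(axis: str, theta: float) -> np.ndarray:
    """Single-qubit rotation exp(-i * theta / 2 * sigma_axis)."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * PAULIS[axis]


def toy_vqc(inputs, weights) -> np.ndarray:
    state = np.array([1, 0], dtype=complex)  # start in |0>
    # 1. Encoding: one rotation per classical input.
    for axis, angle in zip("XYZ", inputs):
        state = rot(axis, angle) @ state
    # 2. Processing: a trainable ZYZ rotation (assumed ansatz).
    for axis, angle in zip("ZYZ", weights):
        state = rot(axis, angle) @ state
    # 3. Measurement: exact Pauli expectation values <psi|sigma|psi>.
    return np.array([np.real(np.vdot(state, PAULIS[a] @ state)) for a in "XYZ"])


bloch_zero = toy_vqc([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])    # untouched |0>
bloch_flip = toy_vqc([np.pi, 0.0, 0.0], [0.0, 0.0, 0.0])  # RX(pi) flips to |1>
bloch_any = toy_vqc([0.3, -0.7, 1.1], [0.2, 0.5, -0.4])
```

A single-qubit unitary circuit always outputs a unit-norm Bloch vector, so this toy version can only represent pure states. Reaching the interior of the Bloch ball requires at least one extra qubit that is entangled with the measured one and then ignored, which is the "subsystems" route mentioned in the hypothesis.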

## 4. Implementation: High-Performance Computing (GPU)
With SVR, training times were already fairly long. To handle the heavier computational load of training deep networks and simulating quantum circuits, we leverage **GPU Acceleration**:
* **PyTorch (CUDA/MPS):** For tensor operations and automatic differentiation of the DNN.
* **PennyLane Lightning:** Using high-performance state-vector simulators (`lightning.gpu` on GPU, or `lightning.qubit` as its CPU counterpart) to accelerate the VQC simulation and gradient calculation (adjoint differentiation).

We also do this as a way to learn modern high-performance ML pipelines.
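
A device-selection helper covering both CUDA and MPS might look like the sketch below. The function name `pick_device` is hypothetical, and the `getattr` guard is there because older PyTorch builds do not expose `torch.backends.mps`.

```python
import torch


def pick_device() -> torch.device:
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
x = torch.ones(2, 3, device=device)  # tensors are created directly on the device
```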

## 5. Input/Output Interfaces
To ensure a rigorous comparison with the SVR baseline from Notebook 1, the I/O structure remains identical:

* **Input $\mathbf{X}$:** Noisy measurement expectations $[ \langle X \rangle_{noise}, \langle Y \rangle_{noise}, \langle Z \rangle_{noise} ]$.
* **Output $\mathbf{y}$:** Predicted Bloch vector components $[\hat{x}, \hat{y}, \hat{z}]$.

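*Note: The predicted vector should correspond to a valid physical state (norm $\le$ 1). This is encouraged either via bounded activation functions (Tanh) or via penalty terms in the loss. A component-wise Tanh alone bounds each component to $[-1, 1]$, so the norm can still reach $\sqrt{3}$.*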
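
One way to enforce the norm constraint exactly is to rescale any prediction that falls outside the Bloch ball. The sketch below is an assumption for illustration, not the notebook's method: `project_to_bloch_ball` is a hypothetical helper that stays differentiable and leaves vectors inside the ball untouched.

```python
import torch


def project_to_bloch_ball(r: torch.Tensor) -> torch.Tensor:
    """Rescale vectors with norm > 1 onto the unit sphere; keep the others."""
    norm = r.norm(dim=-1, keepdim=True)
    scale = torch.clamp(norm, min=1.0)  # 1 inside the ball, ||r|| outside
    return r / scale


raw = torch.tensor([[0.9, 0.9, 0.9],   # norm about 1.56: unphysical
                    [0.1, 0.2, 0.3]])  # already inside the ball
projected = project_to_bloch_ball(raw)
norms = projected.norm(dim=-1)
```

The first vector is pulled back to norm 1 while keeping its direction, and the second is returned unchanged.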

In [None]:
import time
import numpy as np
import pandas as pd
import torch
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

# Import the dataset generator
try:
    from saint_dtSet import generate_qubit_tomography_dataset_base
except ImportError:
    from dataset_build.saint_dtSet import generate_qubit_tomography_dataset_base

DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
np.random.seed(0)
print(f"Device: {DEVICE}")


In [None]:
def fidelity_from_bloch(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """
    Analytic fidelity between two single-qubit Bloch vectors.
    pred and target: tensors of shape (..., 3).
    """
    dot = (pred * target).sum(dim=-1)
    norm_pred = (pred ** 2).sum(dim=-1)
    norm_target = (target ** 2).sum(dim=-1)
    under_sqrt = torch.clamp(1.0 - norm_pred, min=0.0) * torch.clamp(1.0 - norm_target, min=0.0)
    fidelity = 0.5 * (1.0 + dot + torch.sqrt(under_sqrt))
    return torch.clamp(fidelity, 0.0, 1.0)


def build_dataloaders(df: pd.DataFrame, batch_size: int = 32):
    features = df[['X_mean', 'Y_mean', 'Z_mean']].to_numpy(dtype=np.float32)
    targets = df[['X_real', 'Y_real', 'Z_real']].to_numpy(dtype=np.float32)
    X_train, X_test, y_train, y_test = train_test_split(features, targets, test_size=0.2, random_state=42, shuffle=True)

    train_ds = TensorDataset(torch.from_numpy(X_train), torch.from_numpy(y_train))
    test_ds = TensorDataset(torch.from_numpy(X_test), torch.from_numpy(y_test))

    train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_ds, batch_size=batch_size, shuffle=False)
    return train_loader, test_loader, (X_test, y_test)


def evaluate_model(model, data_loader, device=DEVICE):
    model.eval()
    preds, targets = [], []
    with torch.no_grad():
        for xb, yb in data_loader:
            xb = xb.to(device)
            yb = yb.to(device)
            pred = model(xb)
            preds.append(pred.detach().cpu())
            targets.append(yb.detach().cpu())
    preds_t = torch.cat(preds, dim=0)
    targets_t = torch.cat(targets, dim=0)
    fidelities = fidelity_from_bloch(preds_t, targets_t)
    return fidelities.mean().item()


def compute_baseline_fidelity(x_test: np.ndarray, y_test: np.ndarray) -> float:
    # Quick baseline (~SVR/MLE proxy): trivial predictor X_mean -> X_real (no Bloch-sphere constraint).
    preds = torch.from_numpy(x_test)
    targets = torch.from_numpy(y_test)
    return fidelity_from_bloch(preds, targets).mean().item()


In [None]:
def train_model(model, train_loader, optimizer, loss_fn, epochs: int, device=DEVICE):
    """
    Generic training loop.
    Moves inputs/targets to `device` before the forward pass.
    """
    model.to(device)
    history = []
    for epoch in range(1, epochs + 1):
        model.train()
        running_loss = 0.0
        for xb, yb in train_loader:
            xb = xb.to(device)
            yb = yb.to(device)
            optimizer.zero_grad()
            preds = model(xb)
            loss = loss_fn(preds, yb)
            loss.backward()
            optimizer.step()
            running_loss += loss.item() * xb.size(0)

        epoch_loss = running_loss / len(train_loader.dataset)
        history.append(epoch_loss)
        if epoch == 1 or epoch % max(1, epochs // 5) == 0:
            print(f"Epoch {epoch:>3}/{epochs} - loss: {epoch_loss:.4f}")
    return history


In [None]:
# ---------------------------------------------------------------------------
# Main experiment loop
# ---------------------------------------------------------------------------
N_STATES = 2000
N_SHOTS = 500
DECOHERENCE_LEVELS = [0.0, 0.2, 0.5, 0.8]
BATCH_SIZE = 32
EPOCHS_DNN = 25
EPOCHS_VQC = 12  # VQC is slower -> fewer epochs
LR_DNN = 0.01
LR_VQC = 0.05

results = {"dnn": [], "vqc": [], "baseline": []}

for level in DECOHERENCE_LEVELS:
    print(f"\n=== Decoherence {level} ===")
    df = generate_qubit_tomography_dataset_base(
        n_states=N_STATES,
        n_shots=N_SHOTS,
        include_decoherence=True,
        decoherence_level=level,
        mode="finite_shots",
        random_state=1234
    )

    train_loader, test_loader, (x_test, y_test) = build_dataloaders(df, batch_size=BATCH_SIZE)

    # ---------------------- DNN ----------------------
    dnn = TomographyDNN().to(DEVICE)
    dnn_optimizer = torch.optim.Adam(dnn.parameters(), lr=LR_DNN)
    dnn_loss = QuantumFidelityLoss()
    _ = train_model(dnn, train_loader, dnn_optimizer, dnn_loss, epochs=EPOCHS_DNN, device=DEVICE)
    dnn_fid = evaluate_model(dnn, test_loader, device=DEVICE)

    # ---------------------- VQC ----------------------
    vqc = TomographyVQC().to(DEVICE)
    vqc_optimizer = torch.optim.Adam(vqc.parameters(), lr=LR_VQC)
    vqc_loss = QuantumFidelityLoss()
    _ = train_model(vqc, train_loader, vqc_optimizer, vqc_loss, epochs=EPOCHS_VQC, device=DEVICE)
    vqc_fid = evaluate_model(vqc, test_loader, device=DEVICE)

    # -------------------- Baseline --------------------
    baseline_fid = compute_baseline_fidelity(x_test, y_test)

    results["dnn"].append(dnn_fid)
    results["vqc"].append(vqc_fid)
    results["baseline"].append(baseline_fid)

    print(f"DNN fidelity (test): {dnn_fid:.4f}")
    print(f"VQC fidelity (test): {vqc_fid:.4f}")
    print(f"Baseline fidelity    : {baseline_fid:.4f}")

print('Loop finished.')
results


In [None]:
# Plot the mean fidelities
plt.figure(figsize=(8, 5))
plt.plot(DECOHERENCE_LEVELS, results['dnn'], '-o', label='DNN')
plt.plot(DECOHERENCE_LEVELS, results['vqc'], '-o', label='VQC')
plt.plot(DECOHERENCE_LEVELS, results['baseline'], '-o', label='Baseline (quick SVR/MLE proxy)')
plt.xlabel('Decoherence level')
plt.ylabel('Mean fidelity (test)')
plt.ylim(0.0, 1.05)
plt.grid(True, alpha=0.3)
plt.legend()
plt.show()
