# Module 3: Large-Scale Circuits and Topology

Scaling from textbook 2-qubit circuits to the 100+ qubit regime required for the competition introduces specific constraints on classical simulation, device topology, and qubit mapping.

## 3.1 The Simulation Bottleneck

To train an AI model for quantum error mitigation (QEM), we typically need pairs of $(x_{noisy}, x_{ideal})$.
* **Small Circuits (< 30 qubits):** We can compute $x_{ideal}$ using a statevector simulator.
* **Large Circuits (50+ qubits):** Classical statevector simulation scales exponentially: 50 qubits already require $2^{50} \approx 10^{15}$ complex amplitudes. We cannot calculate the ground truth $x_{ideal}$ for arbitrary circuits.
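
To make the scaling concrete, here is a small back-of-the-envelope sketch. It assumes a dense statevector stored as double-precision complex numbers (16 bytes per amplitude); the function name `statevector_bytes` is only illustrative.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Memory needed for a dense statevector of 2**n complex amplitudes."""
    return (2 ** n_qubits) * bytes_per_amplitude


for n in (30, 40, 50):
    gib = statevector_bytes(n) / 2 ** 30
    print(f"{n} qubits: {gib:,.0f} GiB")
```

Under these assumptions 30 qubits fit in 16 GiB, while every additional qubit doubles the requirement.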

**The Solution:** **Clifford Circuits** and the **Gottesman-Knill Theorem**.
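Circuits consisting only of Clifford gates (H, S, CNOT, Pauli), applied to computational-basis states and followed by Pauli measurements, can be simulated in polynomial time on a classical computer. In the stabilizer tableau formalism each gate costs $O(N)$ and each measurement $O(N^2)$ for $N$ qubits, so 100+ qubits pose no problem.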
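
The idea behind this efficiency can be sketched in a few lines of plain Python. Instead of storing $2^N$ amplitudes, we track a single Pauli observable as two bit lists plus a sign and pull it back through the circuit gate by gate. This is a minimal illustration, not a full tableau simulator; the gate tuple format and the names `conjugate` and `ideal_expectation` are assumptions made for this example.

```python
def conjugate(pauli, gate):
    """Replace P by G P G^dagger in place for one Clifford gate G.

    pauli holds bit lists 'x' and 'z' plus a sign bit 'r' (0 -> +, 1 -> -);
    x = z = 1 on a qubit denotes Y.
    """
    x, z = pauli["x"], pauli["z"]
    name, *q = gate
    if name == "h":
        a = q[0]
        pauli["r"] ^= x[a] & z[a]      # H Y H = -Y
        x[a], z[a] = z[a], x[a]        # X <-> Z
    elif name == "s":
        a = q[0]
        pauli["r"] ^= x[a] & z[a]      # S Y S^dagger = -X
        z[a] ^= x[a]                   # X -> Y, Y -> X (sign handled above)
    elif name == "x":
        a = q[0]
        pauli["r"] ^= z[a]             # X Z X = -Z, X Y X = -Y
    elif name == "cx":
        a, b = q                       # a = control, b = target
        pauli["r"] ^= x[a] & z[b] & (x[b] ^ z[a] ^ 1)
        x[b] ^= x[a]
        z[a] ^= z[b]
    else:
        raise ValueError(f"not a supported Clifford gate: {name}")


def ideal_expectation(n_qubits, gates, z_support):
    """<0...0| U^dagger Z_support U |0...0> for a Clifford circuit U."""
    pauli = {
        "x": [0] * n_qubits,
        "z": [1 if q in z_support else 0 for q in range(n_qubits)],
        "r": 0,
    }
    # Heisenberg picture: walk the circuit backwards, conjugating by each
    # gate's inverse. h, x and cx are self-inverse; S^dagger equals S^3.
    for gate in reversed(gates):
        repeats = 3 if gate[0] == "s" else 1
        for _ in range(repeats):
            conjugate(pauli, gate)
    if any(pauli["x"]):
        return 0                       # any X or Y factor averages to zero on |0...0>
    return -1 if pauli["r"] else 1


bell = [("h", 0), ("cx", 0, 1)]
print(ideal_expectation(2, bell, {0, 1}))   # <Z0 Z1> on a Bell state
print(ideal_expectation(2, bell, {0}))      # <Z0> on a Bell state
```

Each gate touches only a constant number of bits, and the result is always exactly -1, 0, or +1, which is why Clifford circuits give cheap, exact training labels.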

**Clifford Data Regression (CDR):** We train our AI on Clifford (or near-Clifford) circuits, where $x_{ideal}$ can be computed classically, and assume the learned mapping from noisy to ideal values generalizes to non-Clifford (universal) circuits.
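
As a rough sketch of the regression step, the simplest possible model is a linear map $x_{ideal} \approx a \, x_{noisy} + b$ fitted by least squares. The training pairs below are synthetic and chosen only for illustration (a made-up device that shrinks every value by 0.6 and shifts it by 0.05); `fit_linear` is a hypothetical helper, and a real model for the competition would likely use richer features.

```python
def fit_linear(noisy, ideal):
    """Ordinary least squares for ideal ~ a * noisy + b."""
    n = len(noisy)
    mean_x = sum(noisy) / n
    mean_y = sum(ideal) / n
    sxx = sum((x - mean_x) ** 2 for x in noisy)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(noisy, ideal))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Synthetic training pairs (illustrative only, not measured data).
ideal_train = [-1, 0, 1, 1, -1, 0]
noisy_train = [0.6 * y + 0.05 for y in ideal_train]

a, b = fit_linear(noisy_train, ideal_train)

# Apply the learned correction to a new noisy reading.
noisy_target = 0.35
mitigated = a * noisy_target + b
print(f"a={a:.3f}, b={b:.3f}, mitigated={mitigated:.3f}")
```

The fit is trained only on circuits whose ideal values are known, then applied to a reading from a circuit we could not simulate.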

## 3.2 Connectivity and Mapping

Real quantum processors like IBM's Eagle/Osprey follow a **Heavy-Hex lattice** topology. They are not fully connected: each qubit is coupled to at most three neighbors.

**Constraint:** A CNOT can only be applied between physically connected qubits. If $q_A$ and $q_B$ are not neighbors, the compiler must insert **SWAP** gates.
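$$ \text{SWAP}(A, B) = \text{CNOT}(A, B)\,\text{CNOT}(B, A)\,\text{CNOT}(A, B) $$
Each SWAP therefore costs three CNOTs. Bringing two qubits that are $d$ edges apart next to each other takes at least $d - 1$ SWAPs, i.e. $3(d - 1)$ extra CNOTs, which increases both circuit depth and accumulated error.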
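
The SWAP identity above can be checked quickly on classical bit strings. Because SWAP and CNOT both permute computational-basis states, verifying all four 2-qubit basis states is enough. The helper names here are illustrative.

```python
def cnot(bits, control, target):
    """Apply a CNOT to a tuple of classical bits (a computational-basis state)."""
    out = list(bits)
    out[target] ^= out[control]
    return tuple(out)


def swap_via_cnots(bits, a, b):
    """SWAP(a, b) built from three alternating CNOTs."""
    bits = cnot(bits, a, b)
    bits = cnot(bits, b, a)
    bits = cnot(bits, a, b)
    return bits


for state in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    assert swap_via_cnots(state, 0, 1) == (state[1], state[0])
print("SWAP == 3 CNOTs on all basis states")
```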

**AI Feature:** Your model should account for the **Topological Distance**. A gate between distant qubits incurs a penalty proportional to the path length.
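
One possible way to turn this into a numeric feature is a breadth-first search over the device's coupling graph. The sketch below assumes the coupling map is available as an adjacency dictionary and uses a hypothetical 12-qubit ring (the shape of a single heavy-hex cell) rather than a real device layout. The penalty counts three CNOTs per SWAP for the $d - 1$ SWAPs needed to make two qubits adjacent.

```python
from collections import deque


def topological_distance(coupling, src, dst):
    """Shortest-path length (in edges) between two physical qubits via BFS."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in coupling.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    raise ValueError("qubits are not connected")


def routing_cnot_penalty(distance):
    """Extra CNOTs from the d - 1 SWAPs needed to make the qubits adjacent."""
    return 3 * max(distance - 1, 0)


# Hypothetical fragment: one 12-qubit ring, not an actual device coupling map.
ring = {q: [(q - 1) % 12, (q + 1) % 12] for q in range(12)}

d = topological_distance(ring, 0, 5)
print(f"distance={d}, extra CNOTs={routing_cnot_penalty(d)}")
```

Either the raw distance or the CNOT penalty could be fed to the model per two-qubit gate; which works better is an empirical question.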

## 3.3 Implementation: Generating Large-Scale Training Data

We will create a dataset generator that produces random **Clifford circuits**. These serve as the "training ground" for your AI, allowing it to learn the noise characteristics of a 100-qubit device without needing a supercomputer.

In [1]:
import numpy as np
from qiskit import QuantumCircuit, transpile
from qiskit.quantum_info import Clifford
from qiskit_aer import AerSimulator
import utils  # Import shared utilities

def generate_clifford_data(n_qubits=5, depth=10, n_samples=1000):
    """
    Generates dataset using the safe random Clifford generator from utils.
    """
    dataset = []
    # Build the stabilizer simulator once and reuse it for every circuit
    sim_ideal = AerSimulator(method='stabilizer')

    for i in range(n_samples):
        # 1. Use our SAFE generator from utils
        # It returns (qc, instructions), we only need qc here
        qc, _ = utils.create_random_clifford_circuit(n_qubits, depth)

        # 2. Estimate the "ideal" value efficiently using the stabilizer (Clifford) simulator
        qc_sim = qc.copy()
        qc_sim.measure_all()
        result_ideal = sim_ideal.run(qc_sim, shots=1000).result()
        counts_ideal = result_ideal.get_counts()

        # Estimate <Z> on qubit 0. Qiskit bitstrings are little-endian,
        # so the last character of each key is qubit 0.
        # With 1000 shots this is a sampled estimate, not the exact value.
        shots = sum(counts_ideal.values())
        p0 = sum(v for k, v in counts_ideal.items() if k.endswith('0')) / shots
        p1 = sum(v for k, v in counts_ideal.items() if k.endswith('1')) / shots
        ideal_z = p0 - p1

        dataset.append({
            'circuit_index': i,
            'n_qubits': n_qubits,
            'depth': depth,
            'ideal_expectation': ideal_z,
            'circuit_obj': qc
        })

    return dataset

# --- Generate Sample Data ---
data = generate_clifford_data(n_qubits=5, depth=10, n_samples=1000)
print(f"Generated {len(data)} Clifford training samples using Shared Utils.")
print(f"Sample 0 Ideal <Z>: {data[0]['ideal_expectation']}")

Generated 1000 Clifford training samples using Shared Utils.
Sample 0 Ideal <Z>: -0.028000000000000025
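
The printed value is close to zero but not exactly zero because it is estimated from 1000 shots. For a stabilizer state, the exact expectation of a Pauli observable is always -1, 0, or +1, so the sampled label carries shot noise that the model would otherwise try to learn. The sketch below estimates that noise and snaps the label to the nearest allowed value; both helper names are hypothetical and not part of `utils`.

```python
import math


def z_standard_error(z_estimate, shots):
    """Binomial standard error of a <Z> estimate built from `shots` samples."""
    return math.sqrt(max(1.0 - z_estimate ** 2, 0.0) / shots)


def snap_to_stabilizer_value(z_estimate):
    """Round to the nearest of the only values a stabilizer state allows."""
    return min((-1, 0, 1), key=lambda v: abs(v - z_estimate))


z_hat = -0.028                      # the sampled value printed above
se = z_standard_error(z_hat, 1000)
print(f"standard error ~ {se:.4f}")
print(f"snapped label: {snap_to_stabilizer_value(z_hat)}")
```

The estimate lies within one standard error of 0, so snapping is safe here. Qiskit's `qiskit.quantum_info.StabilizerState` also offers an `expectation_value` method that may avoid sampling altogether; check it against your installed version before relying on it.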
