
# S‑T01 GHZ Classical Shadows (Aer Simulator)

This notebook reproduces the Phase 1 S‑T01 benchmark entirely on the local Qiskit Aer simulator. It walks through the core QuartumSE workflow: generating classical shadows, persisting shot data, computing shot-savings ratio (SSR) and confidence-interval (CI) metrics, and replaying results from a provenance manifest.



## 0. Environment setup

The notebook writes manifests and shot data to a `notebook_artifacts/` folder so runs stay isolated from the main `data/` tree.


In [None]:

from pathlib import Path

import numpy as np

from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

from quartumse import ShadowEstimator
from quartumse.shadows import ShadowConfig, ShadowVersion
from quartumse.shadows.core import Observable
from quartumse.reporting.manifest import ProvenanceManifest
from quartumse.reporting.shot_data import ShotDataWriter
from quartumse.utils.metrics import compute_ssr


In [None]:

ARTIFACT_DIR = Path("notebook_artifacts")
ARTIFACT_DIR.mkdir(parents=True, exist_ok=True)

backend = AerSimulator(seed_simulator=2025)
print(f"Artifacts will be stored in {ARTIFACT_DIR.resolve()}")



## 1. Helper functions

We reuse the GHZ circuit generator and a simple direct measurement baseline to compare against QuartumSE's classical shadows estimator.


In [None]:

def create_ghz_circuit(num_qubits: int) -> QuantumCircuit:
    qc = QuantumCircuit(num_qubits)
    qc.h(0)
    for idx in range(1, num_qubits):
        qc.cx(0, idx)
    return qc

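def direct_measurement_baseline(circuit: QuantumCircuit, observable: Observable, shots: int) -> dict:
    """Estimate a Z-type Pauli observable from computational-basis counts.

    Only "Z" entries contribute to the parity. Any other entry is skipped, so
    this baseline is meaningful only for Pauli strings made of "I" and "Z".
    """
    baseline_circuit = circuit.copy()
    baseline_circuit.measure_all()
    job = backend.run(baseline_circuit, shots=shots)
    counts = job.result().get_counts()
    total = sum(counts.values())
    expectation = 0.0
    for bitstring, count in counts.items():
        parity = 1
        for qubit, pauli in enumerate(observable.pauli_string):
            if pauli != "Z":
                continue
            # Qiskit bitstrings are little-endian: reverse so index == qubit.
            bit = int(bitstring[::-1][qubit])
            parity *= 1 - 2 * bit
        expectation += parity * count / total
    # Variance of the sample mean of a ±1-valued outcome, scaled like the expectation.
    variance = observable.coefficient**2 * (1 - expectation**2) / shots
    return {"expectation": expectation * observable.coefficient, "variance": variance, "shots": shots}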
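To make the parity bookkeeping concrete, here is a minimal standalone sketch that applies the same rule to a hand-written counts dictionary (the ideal GHZ outcome, not simulator output). The helper name `z_parity_expectation` is hypothetical and not part of QuartumSE. It assumes, as the baseline helper does, that bitstrings are little-endian and that the Pauli string is indexed by qubit.

```python
def z_parity_expectation(counts: dict[str, int], pauli_string: str) -> tuple[float, float]:
    """Return (expectation, variance of the mean) for an I/Z Pauli string."""
    shots = sum(counts.values())
    expectation = 0.0
    for bitstring, count in counts.items():
        parity = 1
        for qubit, pauli in enumerate(pauli_string):
            if pauli == "Z":
                # Reverse the little-endian bitstring so the index matches the qubit.
                bit = int(bitstring[::-1][qubit])
                parity *= 1 - 2 * bit  # bit 0 -> +1, bit 1 -> -1
        expectation += parity * count / shots
    variance = (1 - expectation**2) / shots
    return expectation, variance


# Idealised GHZ counts: half the shots in 000, half in 111 (illustrative data).
ideal_counts = {"000": 512, "111": 512}
for pauli in ("ZII", "ZZI", "ZZZ"):
    value, variance = z_parity_expectation(ideal_counts, pauli)
    print(pauli, value, variance)
```

Strings with an even number of `Z` entries see the same parity on `000` and `111`, while an odd number flips the sign on `111` and averages to zero.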



## 2. Configure the experiment

We focus on a 3-qubit GHZ state with three Pauli observables. Feel free to change `shadow_size` or the observable list to see how the estimates, confidence intervals, and SSR respond.


In [None]:

num_qubits = 3
ghz_circuit = create_ghz_circuit(num_qubits)
observables = [
    Observable("ZII"),
    Observable("ZZI"),
    Observable("ZZZ"),
]

shadow_config = ShadowConfig(
    version=ShadowVersion.V0_BASELINE,
    shadow_size=256,
    random_seed=17,
    confidence_level=0.95,
)

estimator = ShadowEstimator(
    backend=backend,
    shadow_config=shadow_config,
    data_dir=ARTIFACT_DIR,
)



## 3. Run QuartumSE and the baseline

QuartumSE executes one shot per randomized Clifford circuit. We compare the results to a direct Z-basis measurement baseline that uses 1024 shots per observable.
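For intuition about what a classical-shadow estimate involves, the sketch below implements the textbook random local Pauli-basis variant in plain NumPy. This is an assumption for illustration: the notebook does not say which Clifford ensemble `ShadowVersion.V0_BASELINE` samples from, and none of the names here (`sample_snapshot`, `estimate_pauli`, and so on) are QuartumSE APIs. Each snapshot measures every qubit in a random X, Y, or Z basis; a Pauli observable then contributes a factor of 3 times the measured ±1 outcome on each qubit where the basis matches, and the snapshot contributes 0 if any supported qubit was measured in a different basis.

```python
import numpy as np

# Gates that rotate the X, Y, or Z eigenbasis onto the computational basis.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S_DAG = np.array([[1, 0], [0, -1j]], dtype=complex)
ROTATIONS = {"X": H, "Y": H @ S_DAG, "Z": np.eye(2, dtype=complex)}


def ghz_state(n):
    """Statevector (|0...0> + |1...1>) / sqrt(2)."""
    psi = np.zeros(2**n, dtype=complex)
    psi[0] = psi[-1] = 1 / np.sqrt(2)
    return psi


def sample_snapshot(psi, n, rng):
    """Measure each qubit once in a uniformly random Pauli basis."""
    bases = "".join(rng.choice(list("XYZ"), size=n))
    unitary = np.array([[1.0 + 0.0j]])
    for basis in bases:  # qubit 0 is the most significant bit here
        unitary = np.kron(unitary, ROTATIONS[basis])
    probs = np.abs(unitary @ psi) ** 2
    outcome = rng.choice(2**n, p=probs / probs.sum())
    bits = [(outcome >> (n - 1 - q)) & 1 for q in range(n)]
    return bases, bits


def estimate_pauli(pauli, snapshots):
    """Average the single-snapshot estimators for one Pauli string."""
    total = 0.0
    for bases, bits in snapshots:
        value = 1.0
        for p, basis, bit in zip(pauli, bases, bits):
            if p == "I":
                continue
            if p != basis:
                value = 0.0  # basis mismatch: this snapshot carries no signal
                break
            value *= 3 * (1 - 2 * bit)
        total += value
    return total / len(snapshots)


rng = np.random.default_rng(17)
psi = ghz_state(3)
snapshots = [sample_snapshot(psi, 3, rng) for _ in range(20000)]
for pauli in ("ZII", "ZZI", "ZZZ"):
    print(pauli, round(estimate_pauli(pauli, snapshots), 3))
```

The same snapshots serve every Pauli observable, which is what makes offline replay possible. The price is variance: a weight-k observable yields single-snapshot values of 0 or ±3^k, so small shadow sizes give wide confidence intervals for higher-weight strings.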


In [None]:

baseline_shots = 1024
shadow_result = estimator.estimate(ghz_circuit, observables, save_manifest=True)
baseline_results = {
    str(obs): direct_measurement_baseline(ghz_circuit, obs, baseline_shots)
    for obs in observables
}

print(f"Shadow manifest: {shadow_result.manifest_path}")
print(f"Shots used by QuartumSE: {shadow_result.shots_used}")



## 4. Compute SSR and CI coverage

The helper below prints observable-by-observable comparisons and aggregates the shot-savings ratio (SSR).
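The formula inside `compute_ssr` is not shown in this notebook. As a rough guide only, one common way to define a shot-savings ratio assumes the estimation error shrinks as 1/√shots; under that assumption the baseline would need `baseline_shots * (baseline_precision / quartumse_precision)**2` shots to match QuartumSE's precision. The hypothetical `ssr_sketch` below encodes that reasoning and should not be read as QuartumSE's implementation.

```python
def ssr_sketch(baseline_shots, quartumse_shots, baseline_precision, quartumse_precision):
    """Shot-savings ratio under an assumed 1/sqrt(shots) error scaling."""
    # Shots the baseline would need to reach the QuartumSE precision.
    equivalent_baseline_shots = baseline_shots * (baseline_precision / quartumse_precision) ** 2
    return equivalent_baseline_shots / quartumse_shots


# Equal precision with a quarter of the shots gives a ratio of 4.
print(ssr_sketch(1024, 256, 0.05, 0.05))
```

Because the precision ratio is squared, a ratio built from a single run's absolute error (as in the cell below) can swing widely from seed to seed.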


In [None]:

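def ghz_expectation(pauli: str) -> float:
    """Ideal GHZ expectation value for a Pauli string made of "I" and "Z" only.

    Strings containing "X" or "Y" fall through to 0.0, which is not correct in
    general (for example, XXX has expectation +1 on the 3-qubit GHZ state).
    """
    non_identity = [p for p in pauli if p != "I"]
    if not non_identity:
        return 1.0
    if all(p == "Z" for p in non_identity) and len(non_identity) % 2 == 0:
        return 1.0
    return 0.0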

rows = []
for obs in observables:
    obs_key = str(obs)
    shadow_data = shadow_result.observables[obs_key]
    baseline_data = baseline_results[obs_key]
    target = ghz_expectation(obs.pauli_string)
    ci = shadow_data["ci_95"]
    in_ci = ci[0] <= target <= ci[1]
    ssr = compute_ssr(
        baseline_shots,
        shadow_result.shots_used,
        baseline_precision=max(abs(baseline_data["expectation"] - target), 1e-9),
        quartumse_precision=max(abs(shadow_data["expectation_value"] - target), 1e-9),
    )
    rows.append((obs_key, shadow_data, baseline_data, target, ci, ssr, in_ci))

for obs_key, shadow_data, baseline_data, target, ci, ssr, in_ci in rows:
    print(
        f"{obs_key:>6}: shadow={shadow_data['expectation_value']:+.3f} "
        f"baseline={baseline_data['expectation']:+.3f} target={target:+.1f} "
        f"CI=({ci[0]:+.3f}, {ci[1]:+.3f}) SSR={ssr:.2f} in_CI={'yes' if in_ci else 'no'}"
    )

avg_ssr = float(np.mean([row[5] for row in rows]))
ci_coverage = sum(1 for row in rows if row[6]) / len(rows)
print(f"\nAverage SSR: {avg_ssr:.2f}x")
print(f"CI coverage: {ci_coverage:.0%}")
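The `ci_95` values come straight from QuartumSE, and the notebook does not state how they are constructed. For readers who want a reference point, the sketch below builds a normal-approximation interval from a list of per-snapshot estimates and computes coverage the same way the cell above does. The function names and sample values are illustrative assumptions, not QuartumSE output.

```python
from statistics import NormalDist, mean, stdev


def normal_ci(samples, confidence_level=0.95):
    """Two-sided normal-approximation CI for the mean of `samples`."""
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)  # about 1.96 for 95%
    centre = mean(samples)
    half_width = z * stdev(samples) / len(samples) ** 0.5
    return centre - half_width, centre + half_width


def coverage(intervals, targets):
    """Fraction of intervals that contain their target value."""
    hits = sum(low <= target <= high for (low, high), target in zip(intervals, targets))
    return hits / len(targets)


# Made-up per-snapshot values for a weight-1 observable (each is 0 or +/-3).
samples = [3.0, -3.0, 0.0, 0.0, 3.0, -3.0, 0.0, 0.0] * 32
interval = normal_ci(samples)
print(interval, coverage([interval], [0.0]))
```

With only three observables, coverage can take just four values (0%, 33%, 67%, 100%), so it is a coarse check rather than a calibration test.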



## 5. Inspect shot data diagnostics

The manifest and Parquet shot-data artefacts live under `notebook_artifacts/`. We can summarise the measurement-basis frequencies and the most frequent bitstrings.


In [None]:

manifest = ProvenanceManifest.from_json(shadow_result.manifest_path)
writer = ShotDataWriter(ARTIFACT_DIR)
diagnostics = writer.summarize_shadow_measurements(manifest.schema.experiment_id)
diagnostics.to_dict()
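The kind of summary described here can be pictured with a few lines of standard-library Python. The sketch below is only an illustration: the record layout (a basis string plus a bitstring per shot), the sample data, and the `summarise` helper are all hypothetical, and the actual Parquet schema and the return type of `summarize_shadow_measurements` are not shown in this notebook.

```python
from collections import Counter

# Hypothetical per-shot records: (measurement bases per qubit, observed bitstring).
records = [
    ("XZY", "010"),
    ("ZZZ", "111"),
    ("ZZZ", "000"),
    ("XXY", "010"),
    ("ZZZ", "111"),
    ("YXZ", "111"),
]


def summarise(records, top_k=2):
    """Count basis choices per qubit and return the most frequent bitstrings."""
    n_qubits = len(records[0][0])
    basis_freq = [Counter(bases[q] for bases, _ in records) for q in range(n_qubits)]
    top_bitstrings = Counter(bits for _, bits in records).most_common(top_k)
    return basis_freq, top_bitstrings


basis_freq, top_bitstrings = summarise(records)
print(basis_freq[0])
print(top_bitstrings)
```

For uniformly random bases, each per-qubit counter should approach a one-third split as the shadow size grows; a strong skew would suggest a seeding or sampling problem.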



## 6. Replay from manifest

Baseline (v0) experiments can be replayed offline. Here we add an `XXX` observable without re-running the circuit.


In [None]:

new_observable = Observable("XXX")
replay_result = estimator.replay_from_manifest(shadow_result.manifest_path, [new_observable])
replay_result.observables
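To judge whether a replayed estimate is plausible, it helps to know the exact value on the ideal state. The sketch below computes it by brute force with NumPy for any Pauli string; `exact_ghz_expectation` is a hypothetical helper written for this check and is not part of QuartumSE.

```python
from functools import reduce

import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}


def exact_ghz_expectation(pauli_string: str) -> float:
    """<GHZ|P|GHZ> for the ideal n-qubit GHZ state, n = len(pauli_string)."""
    n = len(pauli_string)
    psi = np.zeros(2**n, dtype=complex)
    psi[0] = psi[-1] = 1 / np.sqrt(2)
    # Tensor the single-qubit Paulis together into one 2^n x 2^n operator.
    operator = reduce(np.kron, (PAULIS[p] for p in pauli_string))
    return float(np.real(np.vdot(psi, operator @ psi)))


for pauli in ("XXX", "ZZI", "ZII"):
    print(pauli, exact_ghz_expectation(pauli))
```

For the 3-qubit GHZ state the exact `XXX` expectation is +1, so a replayed estimate far from +1 (relative to its confidence interval) would point to noise from the small shadow size or to a qubit-ordering mismatch.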



## Next steps

- Swap in `ShadowVersion.V1_NOISE_AWARE` with a `MitigationConfig` to explore MEM-corrected runs (see the dedicated `noise_aware_shadows_demo.ipynb`).
- Adapt the observable list and baseline helper for chemistry (H₂) or optimisation (MAX-CUT) targets.
- Publish SSR/CI trend plots directly from notebook outputs to track roadmap metrics.
