# QuartumSE Comprehensive Test Suite

**Complete end-to-end testing:** Qiskit simulator → IBM Quantum hardware → replay → diagnostics

This notebook exercises the main QuartumSE features:
- Classical Shadows v0 (baseline) and v1 (noise-aware)
- Measurement Error Mitigation (MEM)
- Shot data persistence and replay
- IBM Quantum connector
- Provenance tracking and diagnostics

**Runtime:** ~5 minutes on simulator, ~15 minutes with IBM hardware

**Prerequisites:**
- QuartumSE installed: `pip install -e ".[dev]"`
- (Optional) IBM Quantum token for hardware testing

---

## Section 1: Quick Start - Bell State with Shadows v0

Start with the simplest workflow: estimate observables on a Bell state using baseline classical shadows.
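
For intuition about what the estimator computes, here is a minimal pure-Python sketch of classical shadows with random single-qubit Pauli measurements on the Bell state. It does not use QuartumSE or Qiskit. The helper names (`sample_snapshot`, `estimate_pauli`) and the sampling shortcut are assumptions made for illustration: the shortcut relies on the fact that (|00⟩ + |11⟩) / √2 has ⟨XX⟩ = +1, ⟨YY⟩ = -1, ⟨ZZ⟩ = +1, while every other non-identity one- or two-qubit Pauli has expectation 0. Character `i` of a Pauli string refers to position `i` of the outcome tuple in this sketch, which is not a statement about Qiskit's qubit ordering.

```python
import random

# Two-qubit Pauli correlations of the Bell state (|00> + |11>)/sqrt(2).
# Every other basis pair, and every single-qubit Pauli, has expectation 0.
BELL_CORRELATIONS = {("X", "X"): 1, ("Y", "Y"): -1, ("Z", "Z"): 1}


def sample_snapshot(rng):
    """Measure each qubit in a uniformly random Pauli basis; outcomes are +1/-1."""
    bases = (rng.choice("XYZ"), rng.choice("XYZ"))
    first = rng.choice((1, -1))
    corr = BELL_CORRELATIONS.get(bases, 0)
    # Outcomes are perfectly (anti)correlated for XX, YY, ZZ; otherwise independent.
    second = first * corr if corr else rng.choice((1, -1))
    return bases, (first, second)


def estimate_pauli(snapshots, pauli):
    """Average the single-snapshot estimator for a Pauli string such as 'ZZ'.

    Each non-identity position contributes 3 * outcome when the measured
    basis matches the target Pauli, and 0 otherwise.
    """
    total = 0.0
    for bases, outcomes in snapshots:
        value = 1.0
        for target, basis, outcome in zip(pauli, bases, outcomes):
            if target == "I":
                continue
            value *= 3 * outcome if target == basis else 0
        total += value
    return total / len(snapshots)


rng = random.Random(42)
snapshots = [sample_snapshot(rng) for _ in range(20000)]
estimates = {p: estimate_pauli(snapshots, p) for p in ("ZI", "IZ", "ZZ")}
for pauli, value in estimates.items():
    print(f"{pauli}: {value:+.3f}")
```

In this idealised model the single-snapshot ZZ estimator takes the value 9 with probability 1/9 and 0 otherwise, so its variance is 8 and the standard error after N snapshots is √(8/N). This is why small shadow sizes produce wide confidence intervals.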

In [None]:
# Load environment variables from .env file (if it exists)
try:
    from dotenv import load_dotenv
    load_dotenv()
    print('✓ Environment variables loaded from .env file')
except ImportError:
    print('⚠️ python-dotenv not installed. Install with: pip install python-dotenv')
except Exception as e:
    print(f'⚠️ Could not load .env file: {e}')

import os
import numpy as np
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

from quartumse import ShadowEstimator
from quartumse.shadows import ShadowConfig
from quartumse.shadows.config import ShadowVersion
from quartumse.shadows.core import Observable

print("✓ QuartumSE imports successful")

In [None]:
# Create a Bell state (|00⟩ + |11⟩) / √2
bell_circuit = QuantumCircuit(2)
bell_circuit.h(0)
bell_circuit.cx(0, 1)

print("Bell state circuit:")
print(bell_circuit)

# Analytical expectations for Bell state
# ⟨ZI⟩ = 0, ⟨IZ⟩ = 0, ⟨ZZ⟩ = +1
# Keys must match Observable string representation: "coefficient*pauli"
analytical_expectations = {
    "1.0*ZI": 0.0,
    "1.0*IZ": 0.0,
    "1.0*ZZ": 1.0
}

In [None]:
# Configure baseline shadows (v0)
shadow_config_v0 = ShadowConfig(
    version=ShadowVersion.V0_BASELINE,
    shadow_size=100,
    random_seed=42,
)

# Create estimator with Aer simulator
estimator_v0 = ShadowEstimator(
    backend=AerSimulator(seed_simulator=123),
    shadow_config=shadow_config_v0,
    data_dir="./demo_data"
)

print(f"✓ Estimator created (v0 baseline)")
print(f"  Backend: {estimator_v0.backend.name}")
print(f"  Shadow size: {shadow_config_v0.shadow_size}")

In [None]:
# Define observables to estimate
observables = [
    Observable("ZI", coefficient=1.0),  # Expect 0
    Observable("IZ", coefficient=1.0),  # Expect 0
    Observable("ZZ", coefficient=1.0),  # Expect +1
]

# Run estimation
print("Running classical shadows estimation (v0)...")
result_v0 = estimator_v0.estimate(
    circuit=bell_circuit,
    observables=observables,
    save_manifest=True
)

print("\n✓ Estimation complete!")
print(f"  Manifest: {result_v0.manifest_path}")
print(f"  Shot data: {result_v0.shot_data_path}")

In [None]:
# Display results
print("\nResults (v0 baseline):")
print(f"{'Observable':<15} {'Estimated':<12} {'Expected':<12} {'Error':<10} {'95% CI':<20}")
print("-" * 75)

errors_v0 = []
for obs_str, data in result_v0.observables.items():
    exp_val = data['expectation_value']
    ci = data['ci_95']  # (lower, upper) bounds of the 95% confidence interval
    ci_width = data['ci_width']
    expected = analytical_expectations[obs_str]
    error = abs(exp_val - expected)
    errors_v0.append(error)
    
    print(f"{obs_str:<15} {exp_val:>11.4f} {expected:>11.4f} {error:>9.4f} [{ci[0]:>6.3f}, {ci[1]:>6.3f}]")

print(f"\nMean Absolute Error: {np.mean(errors_v0):.4f}")

## Section 2: Shot Persistence and Replay

Demonstrate "measure once, ask later" - compute new observables from saved measurement data.
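
The idea can be sketched without QuartumSE: store each snapshot's measurement bases and outcomes once, then evaluate observables that were not chosen at acquisition time. The JSON file layout and the helper names (`take_snapshot`, `estimate`) below are assumptions for illustration only. QuartumSE itself stores shots as Parquet with its own schema. The simulated data uses the Bell-state correlations ⟨XX⟩ = +1, ⟨YY⟩ = -1, ⟨ZZ⟩ = +1.

```python
import json
import random
import tempfile
from pathlib import Path

BELL_CORRELATIONS = {("X", "X"): 1, ("Y", "Y"): -1, ("Z", "Z"): 1}


def take_snapshot(rng):
    """One random-Pauli-basis measurement of the Bell state (outcomes are +1/-1)."""
    bases = rng.choice("XYZ") + rng.choice("XYZ")
    first = rng.choice((1, -1))
    corr = BELL_CORRELATIONS.get((bases[0], bases[1]), 0)
    second = first * corr if corr else rng.choice((1, -1))
    return {"bases": bases, "outcomes": [first, second]}


def estimate(records, pauli):
    """Classical-shadow estimate of a Pauli string from stored records."""
    total = 0.0
    for rec in records:
        value = 1.0
        for target, basis, outcome in zip(pauli, rec["bases"], rec["outcomes"]):
            if target != "I":
                value *= 3 * outcome if target == basis else 0
        total += value
    return total / len(records)


# "Measure once": acquire the data and persist it.
rng = random.Random(7)
path = Path(tempfile.mkdtemp()) / "shots.json"
path.write_text(json.dumps([take_snapshot(rng) for _ in range(20000)]))

# "Ask later": reload the file and evaluate observables chosen after the fact.
records = json.loads(path.read_text())
replayed = {p: estimate(records, p) for p in ("XX", "YY")}
print(replayed)
```

Because the bases were drawn at random rather than tailored to a specific observable, the same file can answer any Pauli query, at the cost of higher variance for high-weight Paulis.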

In [None]:
# Inspect saved shot data
import pandas as pd

shot_data = pd.read_parquet(result_v0.shot_data_path)
print("Shot data schema:")
print(shot_data.head(10))
print(f"\nTotal shots: {len(shot_data)}")
print(f"Columns: {list(shot_data.columns)}")

In [None]:
# Replay from manifest with NEW observables (no hardware re-execution)
print("Replaying from saved manifest with new observables...\n")

new_observables = [
    Observable("XX", coefficient=1.0),  # Expect +1 for Bell state
    Observable("YY", coefficient=1.0),  # Expect -1 for Bell state (|00⟩ + |11⟩) / √2
]

replayed_result = estimator_v0.replay_from_manifest(
    manifest_path=str(result_v0.manifest_path),
    observables=new_observables
)

print("Replay results (new observables):")
for obs_str, data in replayed_result.observables.items():
    exp_val = data['expectation_value']
    ci = data['ci_95']  # (lower, upper) bounds of the 95% confidence interval
    print(f"{obs_str}: {exp_val:.4f} [95% CI: {ci[0]:.3f}, {ci[1]:.3f}]")

print("\n✓ Replay successful - no hardware re-execution needed!")

## Section 3: Noise-Aware Shadows (v1) with MEM

Compare baseline (v0) vs noise-aware (v1) on a 3-qubit GHZ state.
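
As background for the MEM step, the following single-qubit sketch shows how a confusion matrix distorts outcome probabilities and how inverting it recovers them. The error rates and helper names (`invert_2x2`, `mat_vec`) are illustrative assumptions and are not taken from QuartumSE. The matrix uses the rows = measured, columns = prepared convention.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


# Assumed readout error rates for one qubit (illustrative values only).
p_read_1_given_0 = 0.02
p_read_0_given_1 = 0.05

# confusion[i][j] = P(measured i | prepared j): rows = measured, cols = prepared.
confusion = [
    [1 - p_read_1_given_0, p_read_0_given_1],
    [p_read_1_given_0, 1 - p_read_0_given_1],
]

ideal = [0.5, 0.5]                  # true outcome probabilities
noisy = mat_vec(confusion, ideal)   # what a noisy readout reports on average
mitigated = mat_vec(invert_2x2(confusion), noisy)

z_noisy = noisy[0] - noisy[1]
z_mitigated = mitigated[0] - mitigated[1]
print(f"noisy <Z> = {z_noisy:+.4f}, mitigated <Z> = {z_mitigated:+.4f}")
```

For a three-qubit circuit the same idea applies to an 8×8 matrix with one column per prepared basis state. With finite calibration shots the estimated matrix is itself noisy, so the inversion can produce small negative probabilities.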

In [None]:
from quartumse.reporting.manifest import MitigationConfig

# Create GHZ state (|000⟩ + |111⟩) / √2
ghz_circuit = QuantumCircuit(3)
ghz_circuit.h(0)
ghz_circuit.cx(0, 1)
ghz_circuit.cx(1, 2)

print("GHZ state circuit:")
print(ghz_circuit)

# Analytical expectations for GHZ
# Z-type observables with even number of Z's: +1
# Z-type observables with odd number of Z's: 0
ghz_observables = [
    Observable("ZII", coefficient=1.0),  # Expect 0 (single Z)
    Observable("ZZI", coefficient=1.0),  # Expect +1 (even Z's)
    Observable("ZZZ", coefficient=1.0),  # Expect 0 (odd Z's)
]

# Keys must match Observable string representation: "coefficient*pauli"
ghz_expectations = {
    "1.0*ZII": 0.0,
    "1.0*ZZI": 1.0,
    "1.0*ZZZ": 0.0
}

In [None]:
# Configure noise-aware shadows (v1) with MEM
shadow_config_v1 = ShadowConfig(
    version=ShadowVersion.V1_NOISE_AWARE,
    shadow_size=100,
    random_seed=42,
    apply_inverse_channel=True,
)

mitigation_config = MitigationConfig(
    techniques=[],  # Will be populated by estimator
    parameters={"mem_shots": 512}
)

estimator_v1 = ShadowEstimator(
    backend=AerSimulator(seed_simulator=321),
    shadow_config=shadow_config_v1,
    mitigation_config=mitigation_config,
    data_dir="./demo_data"
)

print("✓ Estimator created (v1 noise-aware)")
print(f"  Shadow impl: {type(estimator_v1.shadow_impl).__name__}")
print(f"  MEM enabled: {estimator_v1.measurement_error_mitigation is not None}")

In [None]:
# Run estimation with automatic MEM calibration
print("Running classical shadows estimation (v1 + MEM)...")
print("  Step 1: MEM calibration (8 basis states × 512 shots)")
print("  Step 2: Shadow measurements (100 shots)")
print("  Step 3: Noise correction via confusion matrix\n")

result_v1 = estimator_v1.estimate(
    circuit=ghz_circuit,
    observables=ghz_observables,
    save_manifest=True
)

print("\n✓ Estimation complete!")
print(f"  MEM in techniques: {'MEM' in estimator_v1.mitigation_config.techniques}")
print(f"  Confusion matrix shape: {estimator_v1.measurement_error_mitigation.confusion_matrix.shape}")

In [None]:
# Verify MEM calibration (should be near-identity for ideal simulator)
confusion_matrix = estimator_v1.measurement_error_mitigation.confusion_matrix

print("Confusion matrix (rows=measured, cols=prepared):")
print(confusion_matrix)

# Check if close to identity
identity_check = np.allclose(confusion_matrix, np.eye(8), atol=0.05)
print(f"\nClose to identity matrix: {identity_check} (expected for noiseless simulator)")
if identity_check:
    print("✓ MEM calibration successful - simulator is noiseless")

In [None]:
# Display v1 results on the GHZ state (the v0 run above used the Bell state)
print("\nv1 (noise-aware) results on GHZ state:")
print(f"{'Observable':<15} {'v1 Result':<12} {'Expected':<12} {'Error':<10}")
print("-" * 55)

errors_v1 = []
for obs_str, data in result_v1.observables.items():
    exp_val = data['expectation_value']
    expected = ghz_expectations[obs_str]
    error = abs(exp_val - expected)
    errors_v1.append(error)
    
    print(f"{obs_str:<15} {exp_val:>11.4f} {expected:>11.4f} {error:>9.4f}")

print(f"\nMean Absolute Error (v1): {np.mean(errors_v1):.4f}")
print("\nNote: On ideal simulator, v0 and v1 should be similar.")
print("On real hardware, v1 should show lower error due to MEM correction.")

## Section 4: IBM Quantum Connector

Connect to IBM Quantum backends using the vendor-neutral descriptor syntax.
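
A descriptor of the form `provider:backend_name` can be parsed with a few lines of Python. The sketch below is a hypothetical illustration of that syntax only. `BackendDescriptor` and `parse_descriptor` are made-up names, and this is not how `resolve_backend` is implemented.

```python
from typing import NamedTuple


class BackendDescriptor(NamedTuple):
    provider: str
    name: str


def parse_descriptor(descriptor: str) -> BackendDescriptor:
    """Split 'provider:name' on the first colon and validate both parts."""
    provider, sep, name = descriptor.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:backend_name', got {descriptor!r}")
    return BackendDescriptor(provider.lower(), name)


parsed = parse_descriptor("ibm:ibm_kyoto")
print(parsed.provider, parsed.name)
```

Splitting on the first colon only keeps any later colons inside the backend name, and rejecting empty parts early gives a clearer error than a failed lookup further down the stack.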

In [None]:
# Check for IBM credentials
ibm_token = os.environ.get("QISKIT_IBM_TOKEN") or os.environ.get("QISKIT_RUNTIME_API_TOKEN")

if ibm_token:
    print("✓ IBM Quantum credentials detected")
    print("  Token found in environment variables")
    has_credentials = True
else:
    print("⚠️ No IBM Quantum credentials found")
    print("  Will use local Aer simulator fallback")
    print("  To test with IBM hardware:")
    print("    1. Get token from https://quantum.ibm.com")
    print("    2. Set: export QISKIT_IBM_TOKEN=<YOUR_TOKEN>")
    print("       (or load it from .env with: source .env)")
    has_credentials = False

In [None]:
# Test IBM connector with backend descriptor
from quartumse.connectors import resolve_backend

# Try to resolve IBM backend (will fallback to Aer if no credentials)
backend_descriptor = "ibm:ibmq_qasm_simulator" if has_credentials else "ibm:aer_simulator"

print(f"Resolving backend: {backend_descriptor}")
backend, snapshot = resolve_backend(backend_descriptor)

print(f"\n✓ Backend resolved: {backend.name}")
print(f"  Backend snapshot:")
print(f"    - Backend name: {snapshot.backend_name}")
print(f"    - Num qubits: {snapshot.num_qubits}")
print(f"    - Basis gates: {snapshot.basis_gates[:5] if snapshot.basis_gates else 'N/A'}...")
print(f"    - Properties hash: {snapshot.properties_hash[:16] if snapshot.properties_hash else 'N/A'}...")

if snapshot.t1_times:
    print(f"    - T1 times (qubit 0): {snapshot.t1_times.get(0, 'N/A')} μs")
if snapshot.readout_errors:
    print(f"    - Readout error (qubit 0): {snapshot.readout_errors.get(0, 'N/A')}")

In [None]:
# Create estimator with IBM backend descriptor
estimator_ibm = ShadowEstimator(
    backend=backend_descriptor,  # Use descriptor instead of backend object
    shadow_config=ShadowConfig(shadow_size=50, random_seed=42),
    data_dir="./demo_data"
)

print(f"✓ Estimator created with IBM connector")
print(f"  Actual backend: {estimator_ibm.backend.name}")
print(f"  Snapshot captured: {estimator_ibm._backend_snapshot is not None}")

if not has_credentials:
    print("\n  (Using fallback Aer simulator - no credentials provided)")

## Section 5: End-to-End Workflow (Optional - Real Hardware)

**Skip this section if you don't have IBM credentials.**

This demonstrates the complete workflow:
1. Run on IBM quantum hardware
2. Save results with calibration snapshot
3. Replay later with new observables
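
One way to make step 2 auditable is to fingerprint the calibration data saved with each run, so a later replay can tell which calibration the shots were taken under. The sketch below is an assumption-laden illustration: `CalibrationSnapshot`, its fields, and the example values are hypothetical and do not reflect QuartumSE's actual snapshot class.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class CalibrationSnapshot:
    """Hypothetical stand-in for the snapshot stored with a hardware run."""
    backend_name: str
    calibration_timestamp: str
    readout_errors: dict = field(default_factory=dict)


def snapshot_hash(snapshot: CalibrationSnapshot) -> str:
    """SHA-256 over a canonical (sorted-key) JSON encoding of the snapshot."""
    payload = json.dumps(asdict(snapshot), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Illustrative values only; keys are qubit indices as strings.
before = CalibrationSnapshot("example_backend", "2024-01-01T00:00:00Z", {"0": 0.012, "1": 0.020})
same = CalibrationSnapshot("example_backend", "2024-01-01T00:00:00Z", {"1": 0.020, "0": 0.012})
after = CalibrationSnapshot("example_backend", "2024-01-02T00:00:00Z", {"0": 0.015, "1": 0.020})

print(snapshot_hash(before)[:16])
print(snapshot_hash(before) == snapshot_hash(same))   # key order does not matter
print(snapshot_hash(before) == snapshot_hash(after))  # recalibration changes the hash
```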

In [None]:
if has_credentials:
    try:
        print("Running on IBM hardware (this may take several minutes)...\n")
        
        # Target backend (hardcoded here; replace with a device available to your account)
        ibm_backend = "ibm:ibm_kyoto"  # Or ibm:ibmq_qasm_simulator for faster testing
        
        # Configure for hardware (smaller shot count to reduce wait time)
        hw_config = ShadowConfig(
            version=ShadowVersion.V1_NOISE_AWARE,
            shadow_size=30,  # Smaller for faster execution
            random_seed=42,
            apply_inverse_channel=True,
        )
        
        hw_mitigation = MitigationConfig(
            parameters={"mem_shots": 256}  # Smaller for faster calibration
        )
        
        estimator_hw = ShadowEstimator(
            backend=ibm_backend,
            shadow_config=hw_config,
            mitigation_config=hw_mitigation,
            data_dir="./demo_data"
        )
        
        # Run on hardware
        hw_result = estimator_hw.estimate(
            circuit=bell_circuit,
            observables=[Observable("ZZ", coefficient=1.0)],
            save_manifest=True
        )
        
        print("\n✓ Hardware execution complete!")
        print(f"  Backend: {estimator_hw.backend.name}")
        print(f"  Calibration timestamp: {estimator_hw._backend_snapshot.calibration_timestamp}")
        print(f"  Manifest: {hw_result.manifest_path}")
        
        # Display hardware confusion matrix (should show off-diagonal noise)
        hw_confusion = estimator_hw.measurement_error_mitigation.confusion_matrix
        print(f"\n  Hardware confusion matrix (first 4×4):")
        print(hw_confusion[:4, :4])
        print("\n  (Off-diagonal elements indicate readout noise)")
    except Exception as e:
        print(f"⚠️ Hardware execution failed: {e}")
        print("   This is expected in automated testing without valid credentials")
        print("   To test with hardware: ensure valid IBM Quantum credentials")
else:
    print("⏩ Skipping hardware execution (no IBM credentials)")
    print("   To enable: set QISKIT_IBM_TOKEN environment variable")

## Section 6: Diagnostics and Provenance Inspection

Inspect saved artifacts for reproducibility and analysis.
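
The diagnostics reported in this section (basis distribution, bitstring histogram, per-qubit marginals) are simple aggregates over the shot records. The sketch below computes them with the standard library on a handful of made-up records. The record layout (`basis`, `bitstring`) is an assumption for illustration and may differ from QuartumSE's Parquet schema.

```python
from collections import Counter

# Hypothetical shot records: measurement basis per qubit and the observed bitstring.
records = [
    {"basis": "ZZZ", "bitstring": "000"},
    {"basis": "ZZZ", "bitstring": "111"},
    {"basis": "XZY", "bitstring": "010"},
    {"basis": "ZXZ", "bitstring": "101"},
    {"basis": "XZY", "bitstring": "000"},
    {"basis": "ZZZ", "bitstring": "000"},
]

basis_distribution = Counter(rec["basis"] for rec in records)
bitstring_histogram = Counter(rec["bitstring"] for rec in records)

# Fraction of shots reading "0" at each position of the bitstring.
num_qubits = len(records[0]["bitstring"])
marginals = {
    q: sum(rec["bitstring"][q] == "0" for rec in records) / len(records)
    for q in range(num_qubits)
}

print(basis_distribution.most_common(2))
print(bitstring_histogram.most_common(2))
print(marginals)
```

`Counter.most_common` sorts by count, which is what a "top outcomes" listing needs. Slicing a plain dictionary returns entries in insertion order instead.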

In [None]:
# Load and inspect manifest
import json
from pathlib import Path

manifest_path = result_v1.manifest_path
with open(manifest_path, 'r') as f:
    manifest = json.load(f)

print("Provenance Manifest Contents:")
print(f"  Experiment ID: {manifest['experiment_id']}")
print(f"  Created at: {manifest['created_at']}")
print(f"  QuartumSE version: {manifest['quartumse_version']}")
print(f"  Python version: {manifest['python_version']}")
if 'random_seed' in manifest and manifest['random_seed'] is not None:
    print(f"  Random seed: {manifest['random_seed']}")
print(f"\n  Circuit:")
print(f"    - Num qubits: {manifest['circuit']['num_qubits']}")
print(f"    - Gate counts: {manifest['circuit']['gate_counts']}")
print(f"    - Circuit hash: {manifest['circuit']['circuit_hash'][:16]}...")
print(f"\n  Backend:")
print(f"    - Name: {manifest['backend']['backend_name']}")
print(f"    - Num qubits: {manifest['backend']['num_qubits']}")
print(f"\n  Shadows Config:")
print(f"    - Version: {manifest['shadows']['version']}")
print(f"    - Shadow size: {manifest['shadows']['shadow_size']}")
print(f"\n  Mitigation:")
print(f"    - Techniques: {manifest['mitigation']['techniques']}")
print(f"    - Parameters: {manifest['mitigation']['parameters']}")

In [None]:
# Compute shot data diagnostics
from quartumse.reporting.shot_data import ShotDataWriter

shot_writer = ShotDataWriter(data_dir=Path("./demo_data"))
experiment_id = result_v1.experiment_id

diagnostics = shot_writer.summarize_shadow_measurements(experiment_id)

print("\nShot Data Diagnostics:")
print(f"  Total shots: {diagnostics.total_shots}")
print(f"\n  Measurement basis distribution:")
for basis, count in sorted(diagnostics.measurement_basis_distribution.items(), key=lambda x: -x[1])[:5]:
    print(f"    {basis}: {count} shots ({100*count/diagnostics.total_shots:.1f}%)")

print(f"\n  Top bitstring outcomes:")
for bitstring, count in sorted(diagnostics.bitstring_histogram.items(), key=lambda x: -x[1])[:5]:
    print(f"    {bitstring}: {count} occurrences")

print(f"\n  Qubit marginals (P(|0⟩) for each qubit):")
for qubit, marginal in diagnostics.qubit_marginals.items():
    p0 = marginal['0']
    print(f"    Qubit {qubit}: {p0:.3f}")

In [None]:
# Generate HTML report (skip if function not available)
try:
    from quartumse.reporting.report import generate_html_report
    
    report_path = Path("./demo_data/reports") / f"{experiment_id}_report.html"
    report_path.parent.mkdir(parents=True, exist_ok=True)
    
    generate_html_report(
        manifest_path=manifest_path,
        output_path=str(report_path)
    )
    
    print(f"✓ HTML report generated: {report_path}")
    print(f"  Open in browser to view full experiment details")
except ImportError:
    print("⚠️ HTML report generation not available in this version")
    print("   Use 'quartumse report <manifest>' CLI command instead")

## Summary

**What we tested:**
- ✅ Classical Shadows v0 (baseline) on Bell state
- ✅ Shot data persistence and replay with new observables
- ✅ Classical Shadows v1 (noise-aware) with automatic MEM
- ✅ IBM Quantum connector with backend descriptors
- ✅ Provenance tracking (manifests, calibration snapshots)
- ✅ Diagnostics (basis distribution, bitstring histograms, marginals)
- ✅ HTML report generation

**Key capabilities demonstrated:**
1. **"Measure once, ask later"** - Replay from saved data without re-execution
2. **Noise mitigation** - Automatic MEM calibration and correction
3. **Vendor neutrality** - Backend descriptors (e.g. `ibm:ibm_kyoto`) decouple the workflow from a specific provider; only the IBM connector is exercised in this notebook
4. **Full provenance** - Every experiment has complete audit trail
5. **Reproducibility** - Seeds + manifests enable exact reproduction

**Next steps:**
- Run S-T01/S-T02 experiments to collect SSR validation data
- Test on real IBM hardware with noisy qubits
- Implement domain experiments (chemistry, optimization, benchmarking)

---

**QuartumSE v0.1.0** - Production-grade quantum measurement optimization framework
