# Review Hardware Smoke Test Results

This notebook shows how to replay and analyze the preliminary smoke test run on IBM's ibm_torino backend on Oct 22, 2025.

**Capabilities:**
1. Load complete results summary from JSON
2. Replay experiments from saved manifests
3. Compute NEW observables from saved shot data
4. Compare direct vs shadow methods
5. Analyze backend calibration data
6. Generate visualizations

## Setup

In [None]:
import json
from pathlib import Path
import pandas as pd
import numpy as np

from quartumse import ShadowEstimator
from quartumse.shadows.core import Observable
from quartumse.reporting.manifest import ProvenanceManifest
from quartumse.reporting.shot_data import ShotDataWriter

print('✓ Imports successful')

## Part 1: Load Smoke Test Summary

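The complete results are saved in `validation_data/smoke_test_results_TIMESTAMP.json`, where `TIMESTAMP` stands for the run's timestamp. The cell below picks the most recently modified file that matches this pattern.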

In [None]:
# Find the most recent smoke test results
validation_dir = Path('validation_data')
result_files = list(validation_dir.glob('smoke_test_results_*.json'))

if not result_files:
    raise FileNotFoundError('No smoke test results found in validation_data/')

# Use most recent
latest_results = max(result_files, key=lambda p: p.stat().st_mtime)
print(f'Loading: {latest_results.name}')

with open(latest_results, encoding='utf-8') as f:
    results = json.load(f)

print(f'✓ Loaded smoke test results from {results["metadata"]["timestamp"]}')
print(f'  Backend: {results["metadata"]["backend"]}')
print(f'  Direct-measurement shots: {results["direct_measurements"]["total_shots"]}')
print(f'  Git commit: {results["metadata"]["git_commit"][:7]}')
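
The cells in this notebook index into `results` with fixed key paths, so a missing key surfaces as a `KeyError` midway through. A minimal sketch of an upfront check, assuming only the key paths this notebook reads; the helper name `missing_key_paths` is hypothetical and not part of quartumse:

```python
from typing import Any, Iterable

# Key paths read by the cells in this notebook.
REQUIRED_PATHS = [
    ("metadata", "timestamp"),
    ("metadata", "backend"),
    ("metadata", "git_commit"),
    ("metadata", "backend_snapshot"),
    ("direct_measurements", "total_shots"),
    ("comparison", "table"),
    ("analysis", "quality_checks"),
    ("shadows_v0", "manifest_path"),
    ("shadows_v0", "shot_data_path"),
    ("shadows_v1", "manifest_path"),
    ("shadows_v1", "mem_shots"),
]


def missing_key_paths(data: dict, paths: Iterable[tuple] = REQUIRED_PATHS) -> list:
    """Return the dotted key paths that cannot be resolved in `data`."""
    missing = []
    for path in paths:
        node: Any = data
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


# Hypothetical, truncated summary used only to exercise the helper.
example = {"metadata": {"timestamp": "t", "backend": "b"}, "comparison": {"table": []}}
print(missing_key_paths(example))
```

Calling `missing_key_paths(results)` right after loading would list every absent entry at once instead of failing on the first one.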

## Part 2: Compare All Three Methods

View the comparison table for the three methods in the smoke test: direct measurement, shadows v0, and shadows v1.

In [None]:
# Display comparison table
comparison_df = pd.DataFrame(results['comparison']['table'])
print('\nSmoke Test Results - All Methods:')
print('=' * 100)
comparison_df

In [None]:
# Display quality check results
quality_df = pd.DataFrame(results['analysis']['quality_checks'])
print('\nQuality Checks:')
print('=' * 100)
quality_df[['observable', 'expected_value', 'direct_expectation', 
            'shadows_v0_expectation', 'shadows_v1_expectation',
            'direct_delta', 'shadows_v0_delta', 'shadows_v1_delta']]
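
The `*_delta` columns are presumably deviations from `expected_value`, but their sign convention is not documented here. The sketch below therefore recomputes absolute deviations from the expectation columns and reports which method lands closest. Column names follow the table above; the helper name `closest_method` and the numbers are illustrative assumptions, not results from the hardware run.

```python
# Maps a method label to its expectation column in the quality-check table.
METHOD_COLUMNS = {
    "direct": "direct_expectation",
    "shadows_v0": "shadows_v0_expectation",
    "shadows_v1": "shadows_v1_expectation",
}


def closest_method(row: dict) -> tuple:
    """Return (method, |estimate - expected|) for the method nearest the expected value."""
    deviations = {
        method: abs(row[column] - row["expected_value"])
        for method, column in METHOD_COLUMNS.items()
    }
    best = min(deviations, key=deviations.get)
    return best, deviations[best]


# Placeholder row with made-up values, shaped like one quality-check record.
row = {
    "observable": "ZZ",
    "expected_value": 1.0,
    "direct_expectation": 0.90,
    "shadows_v0_expectation": 0.80,
    "shadows_v1_expectation": 0.95,
}
method, deviation = closest_method(row)
print(f"{row['observable']}: closest method is {method} (|delta| = {deviation:.3f})")
```

With the real data, `quality_df.to_dict('records')` yields rows in this shape.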

## Part 3: Replay Shadow v0 Experiment

Reproduce the v0 results from saved shot data

In [None]:
# Load Shadow v0 manifest
v0_manifest_path = results['shadows_v0']['manifest_path']
print(f'Loading Shadow v0 manifest: {v0_manifest_path}')

# Replay with original observables
estimator = ShadowEstimator(backend='aer_simulator', data_dir='validation_data')
replayed_v0 = estimator.replay_from_manifest(v0_manifest_path)

print(f'✓ Replayed v0 using {replayed_v0.shots_used} saved shadows')
print(f'  Execution time: {replayed_v0.execution_time:.3f}s (instant replay!)')
print('\nReplayed Results:')
for obs_key, values in replayed_v0.observables.items():
    print(f"  {obs_key}: {values['expectation_value']:.3f} ± {values['ci_width']/2:.3f}")
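
The replay output reports each estimate as a value plus or minus half of `ci_width`. A small sketch of turning that into explicit bounds and checking whether a reference value falls inside, assuming `ci_width` is the full width of a symmetric interval (as the division by 2 above implies); the helper names are hypothetical:

```python
def ci_bounds(expectation_value: float, ci_width: float) -> tuple:
    """Lower and upper bound of a symmetric interval of total width `ci_width`."""
    half = ci_width / 2
    return expectation_value - half, expectation_value + half


def covers(expectation_value: float, ci_width: float, reference: float) -> bool:
    """True if `reference` lies within the interval around the estimate."""
    low, high = ci_bounds(expectation_value, ci_width)
    return low <= reference <= high


# Illustrative numbers only: an estimate of 0.9 with a total width of 0.3.
print(ci_bounds(0.9, 0.3))
print(covers(0.9, 0.3, reference=1.0))
```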

## Part 4: Compute NEW Observables from Hardware Data

The power of classical shadows: compute observables that weren't in the original experiment!
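
Why this works can be sketched with the textbook estimator for random Pauli-basis shadows: each snapshot contributes a factor of ±3 for every non-identity Pauli whose letter matches that qubit's measurement basis, and contributes 0 if any letter mismatches. Whether quartumse's internals follow exactly this form, and how it orders qubits within a string, are assumptions here; the function names and toy snapshots are hypothetical.

```python
from statistics import mean


def snapshot_estimate(pauli: str, bases: str, bits: str) -> float:
    """Single-snapshot estimate of a Pauli string.

    pauli: target observable per qubit, e.g. 'ZI'
    bases: measurement basis per qubit, e.g. 'ZX'
    bits:  measured outcome per qubit, e.g. '01'
    All three strings are aligned position by position.
    """
    value = 1.0
    for p, basis, bit in zip(pauli, bases, bits):
        if p == "I":
            continue  # identity contributes a factor of 1
        if p != basis:
            return 0.0  # basis mismatch: snapshot carries no information on this Pauli
        value *= 3.0 if bit == "0" else -3.0
    return value


def estimate(pauli: str, snapshots: list) -> float:
    """Average the single-snapshot estimates over all saved snapshots."""
    return mean(snapshot_estimate(pauli, bases, bits) for bases, bits in snapshots)


# Toy data: one hand-written snapshot per basis pair (not hardware data).
toy_snapshots = [
    ("ZZ", "00"), ("ZX", "01"), ("ZY", "00"),
    ("XZ", "10"), ("XX", "11"), ("XY", "10"),
    ("YZ", "11"), ("YX", "00"), ("YY", "01"),
]
for pauli in ("ZZ", "XX", "YY", "II"):
    print(pauli, estimate(pauli, toy_snapshots))
```

Because the same snapshots feed every Pauli string, asking for a new observable only changes the classical post-processing, not the measurements.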

In [None]:
# Define NEW observables (not measured in original smoke test)
new_observables = [
    Observable('II', 1.0),  # Identity (should be 1.0)
    Observable('ZI', 1.0),  # Single qubit Z on first qubit
    Observable('IZ', 1.0),  # Single qubit Z on second qubit
    Observable('YY', 1.0),  # YY correlation (another Bell stabilizer)
]

print('Computing NEW observables from hardware data...')
print('Original observables: ZZ, XX')
print('NEW observables:      II, ZI, IZ, YY')
print()

# Replay with new observables
new_result = estimator.replay_from_manifest(
    v0_manifest_path,
    observables=new_observables
)

print(f'✓ Computed {len(new_result.observables)} NEW observables')
print('  (No backend execution required - using saved hardware data!)')
print('\nNEW Observable Estimates from Hardware Data:')
for obs_key, values in new_result.observables.items():
    exp_val = values['expectation_value']
    ci_half = values['ci_width'] / 2
    print(f"  {obs_key}: {exp_val:>6.3f} ± {ci_half:.3f}")

print('\nExpected for an ideal Bell state |Φ+⟩ = (|00⟩ + |11⟩)/√2:')
print('  II: +1.0  (identity)')
print('  ZI:  0.0  (reduced single-qubit state is maximally mixed)')
print('  IZ:  0.0  (reduced single-qubit state is maximally mixed)')
print('  YY: -1.0  (YY = -(XX)(ZZ), so -YY is the stabilizer)')
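
Reference values for an ideal state can be checked directly on the state vector. The sketch below does so in plain Python, assuming the circuit prepares |Φ+⟩ = (|00⟩ + |11⟩)/√2 (Hadamard followed by CNOT); the notebook does not state which Bell state was used, and a different one flips some signs. For |Φ+⟩, ZZ and XX evaluate to +1 and YY to -1, because YY = -(XX)(ZZ). The helper names are hypothetical.

```python
import math


def apply_pauli(pauli: str, amplitudes: dict) -> dict:
    """Apply a Pauli string to a state given as {bitstring: amplitude}."""
    out = {}
    for bitstring, amp in amplitudes.items():
        phase = 1 + 0j
        new_bits = []
        for p, b in zip(pauli, bitstring):
            flipped = "1" if b == "0" else "0"
            if p == "I":
                new_bits.append(b)
            elif p == "Z":
                phase *= 1 if b == "0" else -1
                new_bits.append(b)
            elif p == "X":
                new_bits.append(flipped)
            elif p == "Y":
                # Y|0> = i|1>, Y|1> = -i|0>
                phase *= 1j if b == "0" else -1j
                new_bits.append(flipped)
            else:
                raise ValueError(f"Unknown Pauli letter: {p!r}")
        key = "".join(new_bits)
        out[key] = out.get(key, 0) + phase * amp
    return out


def expectation(pauli: str, amplitudes: dict) -> float:
    """Compute <psi|P|psi> for a Pauli string P."""
    transformed = apply_pauli(pauli, amplitudes)
    total = sum(amplitudes.get(k, 0).conjugate() * v for k, v in transformed.items())
    return total.real


phi_plus = {"00": 1 / math.sqrt(2), "11": 1 / math.sqrt(2)}
for pauli in ("II", "ZI", "IZ", "ZZ", "XX", "YY"):
    print(f"{pauli}: {expectation(pauli, phi_plus):+.1f}")
```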

## Part 5: Replay Shadow v1 (Noise-Aware) Experiment

Shadow v1 applies measurement error mitigation (MEM), using a confusion matrix calibrated on the backend.

In [None]:
# Load Shadow v1 manifest
v1_manifest_path = results['shadows_v1']['manifest_path']
print(f'Loading Shadow v1 manifest: {v1_manifest_path}')

# Check if MEM confusion matrix is available
mem_path = results['shadows_v1'].get('mitigation_confusion_matrix_path')
if mem_path:
    mem_file = Path(mem_path)
    if mem_file.exists():
        print(f'✓ MEM confusion matrix available: {mem_file.name}')
    else:
        print(f'⚠ MEM confusion matrix not found at: {mem_path}')
else:
    print('⚠ No MEM confusion matrix path recorded in the results summary')

# Replay v1 with noise-aware reconstruction
replayed_v1 = estimator.replay_from_manifest(v1_manifest_path)

print(f'\n✓ Replayed v1 (noise-aware) using {replayed_v1.shots_used} saved shadows')
print(f'  MEM calibration shots: {results["shadows_v1"]["mem_shots"]}')
print('\nReplayed v1 Results:')
for obs_key, values in replayed_v1.observables.items():
    print(f"  {obs_key}: {values['expectation_value']:.3f} ± {values['ci_width']/2:.3f}")
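
The idea behind confusion-matrix MEM can be sketched on a single qubit: if `confusion[i][j]` is the probability of reading outcome `i` when state `j` was prepared, the observed distribution is the confusion matrix applied to the true one, so inverting the matrix recovers an estimate of the true distribution. This is a generic illustration with made-up error rates; how quartumse builds and applies its matrix (for example tensored or least-squares variants) is not shown in this notebook, and the helper names are hypothetical.

```python
def invert_2x2(m: list) -> list:
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("Confusion matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def mitigate(observed: list, confusion: list) -> list:
    """Apply the inverse confusion matrix to observed outcome probabilities."""
    inv = invert_2x2(confusion)
    return [
        inv[0][0] * observed[0] + inv[0][1] * observed[1],
        inv[1][0] * observed[0] + inv[1][1] * observed[1],
    ]


# Illustrative rates: P(read 1 | prepared 0) = 0.02, P(read 0 | prepared 1) = 0.05.
confusion = [[0.98, 0.05],
             [0.02, 0.95]]
true_probs = [0.5, 0.5]
# Simulate what a noisy readout would report for the true distribution.
observed = [
    confusion[0][0] * true_probs[0] + confusion[0][1] * true_probs[1],
    confusion[1][0] * true_probs[0] + confusion[1][1] * true_probs[1],
]
mitigated = mitigate(observed, confusion)
print("observed: ", [round(p, 4) for p in observed])
print("mitigated:", [round(p, 4) for p in mitigated])
```

On finite shot counts, plain inversion can return slightly negative probabilities, which is one reason practical implementations often use a constrained fit instead.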

## Part 6: Analyze Backend Calibration Data

Review the IBM ibm_torino calibration snapshot captured during execution

In [None]:
# Load backend snapshot
backend_snapshot = results['metadata']['backend_snapshot']

print(f'Backend: {backend_snapshot["backend_name"]}')
print(f'Version: {backend_snapshot["backend_version"]}')
print(f'Qubits: {backend_snapshot["num_qubits"]}')
print(f'Calibration timestamp: {backend_snapshot["calibration_timestamp"]}')
print(f'\nBasis gates: {backend_snapshot["basis_gates"]}')

# Display gate errors
if 'gate_errors' in backend_snapshot:
    print('\nGate Errors:')
    for gate, error in backend_snapshot['gate_errors'].items():
        print(f"  {gate}: {error:.6f} ({error*100:.4f}%)")

# Show readout errors for the first 10 qubits by index
# (not necessarily the qubits used in the experiment)
if 'readout_errors' in backend_snapshot:
    print('\nReadout Errors (first 10 qubits):')
    readout_errors = {int(k): v for k, v in backend_snapshot['readout_errors'].items()}
    for qubit in sorted(readout_errors.keys())[:10]:
        error = readout_errors[qubit]
        print(f"  Qubit {qubit}: {error:.6f} ({error*100:.4f}%)")
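
Readout errors give a rough idea of how much an unmitigated expectation value shrinks. Under the simplifying assumption of independent, symmetric bit-flip readout errors, each non-identity Pauli on a qubit with error rate e scales the measured expectation by (1 - 2e). The sketch below applies that rule; the error rates and qubit indices are illustrative, since the notebook does not say which physical qubits were used, and the helper name is hypothetical.

```python
def readout_attenuation(pauli: str, readout_errors: dict) -> float:
    """Multiplicative attenuation of a Pauli expectation value.

    pauli: Pauli string, one letter per qubit position
    readout_errors: maps qubit position in the string to its readout error rate
    Assumes independent, symmetric bit flips at readout.
    """
    factor = 1.0
    for position, p in enumerate(pauli):
        if p != "I":
            factor *= 1.0 - 2.0 * readout_errors[position]
    return factor


# Illustrative error rates of 1% and 2% for the two qubits.
example_errors = {0: 0.01, 1: 0.02}
for pauli in ("ZZ", "ZI", "II"):
    print(f"{pauli}: ideal value scaled by {readout_attenuation(pauli, example_errors):.4f}")
```

Gate errors and asymmetric readout errors are ignored here, so this is a lower bound on the total noise impact rather than a prediction.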

## Part 7: Examine Raw Shot Data

Inspect the Parquet files containing all measurement outcomes

In [None]:
# Load v0 shot data
v0_shot_path = results['shadows_v0']['shot_data_path']
print(f'Loading v0 shot data: {v0_shot_path}')

try:
    v0_diagnostics = ShotDataWriter.summarize_parquet(v0_shot_path)
    v0_diag_dict = v0_diagnostics.to_dict()
    
    print('\n✓ Shadow v0 Shot Data:')
    print(f"  Total shots: {v0_diag_dict['total_shots']}")
    print('\n  Measurement Basis Distribution:')
    for basis, count in v0_diag_dict['measurement_basis_distribution'].items():
        print(f"    {basis}: {count} ({count/v0_diag_dict['total_shots']*100:.1f}%)")

    print('\n  Top Bitstrings:')
    # Sort by count so the five most frequent outcomes are shown
    top_bitstrings = sorted(
        v0_diag_dict['bitstring_histogram'].items(),
        key=lambda item: item[1],
        reverse=True,
    )[:5]
    for bitstring, count in top_bitstrings:
        print(f"    {bitstring}: {count}")

except Exception as e:
    print(f'⚠ Could not load shot data: {e}')
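
The diagnostics above boil down to counting over per-shot records. A sketch of the same tallies on an in-memory list, assuming each record carries a `basis` and a `bitstring` field; these field names are hypothetical and may not match the columns in the Parquet files:

```python
from collections import Counter


def basis_distribution(shots: list) -> dict:
    """Map each measurement basis to (count, fraction of all shots)."""
    counts = Counter(shot["basis"] for shot in shots)
    total = sum(counts.values())
    return {basis: (count, count / total) for basis, count in counts.most_common()}


def top_bitstrings(shots: list, n: int = 5) -> list:
    """Return the n most frequent bitstrings with their counts."""
    return Counter(shot["bitstring"] for shot in shots).most_common(n)


# Hand-written toy records, not hardware data.
toy_shots = [
    {"basis": "ZZ", "bitstring": "00"},
    {"basis": "ZZ", "bitstring": "11"},
    {"basis": "XX", "bitstring": "00"},
    {"basis": "XY", "bitstring": "01"},
]
print(basis_distribution(toy_shots))
print(top_bitstrings(toy_shots, n=2))
```

For uniformly random Pauli bases on two qubits, each of the nine basis pairs should appear in roughly one ninth of the shots, which makes the distribution a quick sanity check on the sampling.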

## Summary

**What we demonstrated:**

1. ✅ **Loaded complete smoke test results** from JSON summary
2. ✅ **Replayed v0 and v1 experiments** from saved manifests
3. ✅ **Computed NEW observables** from hardware data (measure once, analyze forever)
4. ✅ **Analyzed backend calibration** captured during execution
5. ✅ **Inspected raw shot data** in Parquet format

**Key Insight:** The smoke test data is fully reusable. You can:
- Compute any 2-qubit Pauli observable from the saved shadows
- Compare different reconstruction methods
- Analyze noise characteristics from calibration data
- Generate publication-quality figures
- Share complete provenance with collaborators

**All without re-running on IBM hardware!** 🎉