# IBM Smoke Test (Phase A) Notebook

This notebook reproduces the two-qubit Bell-state smoke test described in the experimental plan.
It gives you an interactive way to validate the IBM hardware path before running larger studies.

## Prerequisites

1. Install the Quartumse package in the same environment as this notebook (this repository already provides it).
2. Make sure Qiskit is installed and up to date.
3. Set the `QISKIT_IBM_TOKEN` environment variable to a valid IBM Quantum Platform API token **before** you run any hardware cells. The token check below also accepts `QISKIT_RUNTIME_API_TOKEN`.
4. Ensure you have runtime minutes available on the `ibm_torino` backend.

> ⚠️ The token check raises an `EnvironmentError` if neither variable is set.

### Option: Set the IBM token in-notebook

If you did not export `QISKIT_IBM_TOKEN` when launching the notebook server, you can set it here.
Copy the snippet into a code cell, paste your token string between the quotes, and run it before the token check.

```python
import os
os.environ["QISKIT_IBM_TOKEN"] = "paste-your-token"
```

In [None]:
# Load environment variables from .env file (if it exists)
try:
    from dotenv import load_dotenv
    load_dotenv()
    print('✓ Environment variables loaded from .env file')
except ImportError:
    print('⚠️ python-dotenv not installed. Install with: pip install python-dotenv')
    print('   Alternatively, set environment variables manually before starting Jupyter.')
except Exception as e:
    print(f'⚠️ Could not load .env file: {e}')

# Verify IBM token is available
import os
if not os.environ.get('QISKIT_IBM_TOKEN') and not os.environ.get('QISKIT_RUNTIME_API_TOKEN'):
    raise EnvironmentError(
        'QISKIT_IBM_TOKEN or QISKIT_RUNTIME_API_TOKEN is not set.\n'
        'Options:\n'
        '  1. Create a .env file with: QISKIT_RUNTIME_API_TOKEN=your-token-here\n'
        '  2. Set environment variable before starting Jupyter\n'
        '  3. Set the token in a code cell (see "Option: Set the IBM token in-notebook" above)\n'
        'Get your token from: https://quantum.ibm.com'
    )

# Show which token variable was found
token_var = 'QISKIT_IBM_TOKEN' if os.environ.get('QISKIT_IBM_TOKEN') else 'QISKIT_RUNTIME_API_TOKEN'
print(f'✓ IBM token detected from {token_var} (value not shown for security)')

## Imports and helpers

The helpers mirror the standalone smoke-test script so results line up with the plan.
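
To make the parity convention concrete, here is a minimal standalone sketch of the same calculation that the `parity_from_counts` helper in this section performs. The function name `parity_expectation` and the count dictionaries are illustrative and are not part of the smoke-test script. Qubit `i` is read from the right of each bitstring, matching Qiskit's little-endian ordering.

```python
def parity_expectation(counts, pauli_str):
    """Estimate <P> from counts measured in the basis matching pauli_str.

    Hypothetical standalone helper: qubit i is the i-th character from
    the right of each bitstring (Qiskit's little-endian convention).
    """
    shots = sum(counts.values())
    total = 0.0
    for bitstring, ct in counts.items():
        parity = 1
        for idx, axis in enumerate(pauli_str):
            if axis != 'I':
                bit = int(bitstring[-(idx + 1)])
                parity *= 1 - 2 * bit  # bit 0 -> +1, bit 1 -> -1
        total += parity * ct
    return total / shots


# Illustrative counts (not hardware data).
ideal_bell = {'00': 125, '11': 125}
noisy_bell = {'00': 118, '11': 120, '01': 7, '10': 5}

print(parity_expectation(ideal_bell, 'ZZ'))  # 1.0
print(parity_expectation(noisy_bell, 'ZZ'))  # 0.904
```

Every odd-parity outcome (`01` or `10`) subtracts from the estimate, so bit-flip errors pull the measured correlation below the ideal value of +1.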

In [None]:
from pathlib import Path
import numpy as np
from qiskit import QuantumCircuit, transpile

from quartumse.connectors import resolve_backend, is_ibm_runtime_backend, create_runtime_sampler
from quartumse.shadows.core import Observable
from quartumse import ShadowEstimator
from quartumse.shadows import ShadowConfig
from quartumse.shadows.config import ShadowVersion
from quartumse.reporting.manifest import MitigationConfig

In [None]:
def bell_circuit():
    qc = QuantumCircuit(2)
    qc.h(0)
    qc.cx(0, 1)
    return qc


def measure_in_basis(qc, pauli_str):
    circuit = qc.copy()
    for i, axis in enumerate(pauli_str):
        if axis == 'X':
            circuit.h(i)
        elif axis == 'Y':
            circuit.sdg(i)
            circuit.h(i)
    circuit.measure_all()
    return circuit


def parity_from_counts(counts, pauli_str, shots):
    def bit_value(bitstring, index):
        return int(bitstring[-(index + 1)])

    expectation = 0.0
    for bitstring, ct in counts.items():
        weight = ct / shots
        parity = 1.0
        for idx, axis in enumerate(pauli_str):
            if axis != 'I':
                parity *= 1 - 2 * bit_value(bitstring, idx)
        expectation += weight * parity
    return expectation

## Connect to IBM hardware

This resolves the IBM backend descriptor and prepares the output directory used by the smoke test.

In [None]:
backend, snapshot = resolve_backend('ibm:ibm_torino')
print('Connected to backend:', backend.name)
print('Backend version snapshot:', snapshot)
Path('validation_data').mkdir(exist_ok=True)

## Run direct measurement baselines

These replicate the 250-shot ZZ/XX parity checks. The results feed into the comparison table at the end.
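
For a sense of how much scatter to expect from 250 shots: each shot yields a parity of ±1, so an estimate with true expectation E has standard error sqrt((1 - E²)/N). The sketch below evaluates that formula. The helper name and the sample value E = 0.9 are illustrative and are not taken from the plan.

```python
import math


def parity_standard_error(expectation, shots):
    """Standard error of a mean of +/-1 outcomes with the given expectation."""
    variance = max(0.0, 1.0 - expectation ** 2)  # variance of one +/-1 outcome
    return math.sqrt(variance / shots)


shots = 250
for e in (0.0, 0.9):  # illustrative expectation values
    se = parity_standard_error(e, shots)
    print(f"E={e:.1f}: standard error ~ {se:.3f}, ~95% half-width ~ {1.96 * se:.3f}")
```

In the worst case (E = 0) the standard error at 250 shots is 1/sqrt(250) ≈ 0.063, so differences of a few hundredths between methods are within sampling noise.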

In [None]:
observables = [Observable('ZZ', 1.0), Observable('XX', 1.0)]
direct_shots = {
    'ZZ': 250,
    'XX': 250,
}

direct_results = {}
qc = bell_circuit()

# Check if we need to use IBM Runtime Sampler primitives
sampler = None
if is_ibm_runtime_backend(backend):
    sampler = create_runtime_sampler(backend)

for obs in observables:
    pauli = obs.pauli_string
    shots = direct_shots[pauli]
    circuit = measure_in_basis(qc, pauli)
    compiled = transpile(circuit, backend)
    
    if sampler is not None:
        job = sampler.run([compiled], shots=shots)
        result = job.result()
        counts = result[0].data.meas.get_counts()
    else:
        job = backend.run(compiled, shots=shots)
        counts = job.result().get_counts()
    
    expectation = parity_from_counts(counts, pauli, shots)
    direct_results[pauli] = {
        'expectation': float(expectation),
        'shots': shots,
        'counts': counts,
    }
    print(f"[Direct] {pauli}: expectation={expectation:.3f}, shots={shots}")

## Classical shadows v0 baseline

Uses the default (noise-agnostic) shadow workflow with 500 random measurements.
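
As background, the sketch below illustrates the standard random-Pauli classical-shadow estimator on an ideal, noiseless Bell state. It is not QuartumSE's implementation: the function names, the hard-coded Bell-state sampling model, and the seed are assumptions made for illustration. Each snapshot measures both qubits in independently chosen random Pauli bases. For a weight-k Pauli observable, a snapshot contributes 3^k times the product of its outcomes when the measured bases match the observable, and zero otherwise.

```python
import random

PAULIS = 'XYZ'
# Ideal |Phi+> correlations when both qubits are measured in the same basis.
BELL_CORRELATION = {'X': +1, 'Y': -1, 'Z': +1}


def sample_bell_snapshot(rng):
    """Measure an ideal Bell pair in random local Pauli bases (toy model)."""
    bases = (rng.choice(PAULIS), rng.choice(PAULIS))
    first = rng.choice((+1, -1))
    if bases[0] == bases[1]:
        # Matching bases: outcomes are perfectly (anti-)correlated.
        second = first * BELL_CORRELATION[bases[0]]
    else:
        # Mismatched bases: outcomes are independent and uniform.
        second = rng.choice((+1, -1))
    return bases, (first, second)


def shadow_estimate(snapshots, pauli_str):
    """Average the single-snapshot estimators for a Pauli string such as 'ZZ'."""
    total = 0.0
    for bases, outcomes in snapshots:
        value = 1.0
        for axis, basis, outcome in zip(pauli_str, bases, outcomes):
            if axis == 'I':
                continue  # identity factors contribute 1
            if axis != basis:
                value = 0.0  # wrong basis: snapshot carries no information
                break
            value *= 3 * outcome
        total += value
    return total / len(snapshots)


rng = random.Random(42)  # arbitrary seed for reproducibility
snapshots = [sample_bell_snapshot(rng) for _ in range(500)]
for pauli in ('ZZ', 'XX'):
    print(pauli, round(shadow_estimate(snapshots, pauli), 3))
```

Only about one snapshot in nine uses the ZZ (or XX) basis pair, so with 500 snapshots the estimates scatter noticeably around the ideal value of 1. That is the price of being able to estimate many observables from the same data.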

In [None]:
shadow_v0 = ShadowEstimator(
    backend='ibm:ibm_torino',
    shadow_config=ShadowConfig(
        version=ShadowVersion.V0_BASELINE,
        shadow_size=500,
        random_seed=42,
    ),
    data_dir='validation_data',
)

result_v0 = shadow_v0.estimate(qc, observables, save_manifest=True)
print('Shadow v0 manifest saved to:', result_v0.manifest_path)

## Classical shadows v1 with measurement error mitigation

This run matches the plan's MEM-assisted configuration using 4×128 calibration shots.
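
The sketch below tallies the shot budget of the three methods. It assumes that "4×128" means 128 shots for each of the 2² = 4 computational basis states of the two-qubit register, and that each shadow snapshot costs one shot. Both are readings of the configuration values in this notebook, not statements confirmed by the plan.

```python
from itertools import product

num_qubits = 2
mem_shots = 128  # per calibration state, as configured in this notebook

# Computational basis states assumed to be prepared for MEM calibration.
calibration_states = [''.join(bits) for bits in product('01', repeat=num_qubits)]
calibration_shots = len(calibration_states) * mem_shots

budget = {
    'direct (ZZ + XX)': 250 + 250,
    'shadows v0': 500,
    'shadows v1 (snapshots + calibration)': 200 + calibration_shots,
}

print('Calibration states:', calibration_states)
for method, shots in budget.items():
    print(f'{method}: {shots} shots')
```

Under these assumptions the v1 run spends more shots on calibration (512) than on snapshots (200), which is worth keeping in mind when comparing interval widths across methods.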

In [None]:
mem_shots = 128
shadow_v1 = ShadowEstimator(
    backend='ibm:ibm_torino',
    shadow_config=ShadowConfig(
        version=ShadowVersion.V1_NOISE_AWARE,
        shadow_size=200,
        random_seed=43,
        apply_inverse_channel=True,
    ),
    mitigation_config=MitigationConfig(techniques=[], parameters={'mem_shots': mem_shots}),
    data_dir='validation_data',
)

result_v1 = shadow_v1.estimate(qc, observables, save_manifest=True)
print('Shadow v1 manifest saved to:', result_v1.manifest_path)

## Compare the results

The cell below prints expectation values and (when available) the 95% confidence intervals pulled from the estimator outputs.
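
One simple way to read the table is to check whether each direct estimate falls inside the corresponding shadow confidence interval. The helper below is a hypothetical sketch: its name and the sample numbers are made up for illustration. It treats a missing interval as inconclusive, not as a failure.

```python
def within_ci(value, ci):
    """Return True/False if value lies inside ci, or None when ci is unavailable."""
    if ci is None:
        return None
    low, high = ci
    return low <= value <= high


# Made-up numbers for illustration only, not hardware results.
print(within_ci(0.90, (0.75, 1.05)))  # True
print(within_ci(0.90, (0.95, 1.10)))  # False
print(within_ci(0.90, None))          # None
```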

In [None]:
def format_ci(ci):
    if ci is None:
        return 'N/A'
    return f"[{ci[0]:.3f}, {ci[1]:.3f}]"

summary_rows = []
for obs in observables:
    pauli = obs.pauli_string
    direct = direct_results[pauli]['expectation']
    v0_stats = result_v0.observables[pauli]
    v1_stats = result_v1.observables[pauli]

    summary_rows.append({
        'Observable': pauli,
        'Direct expectation': f"{direct:.3f}",
        'Shadows v0 expectation': f"{v0_stats['expectation_value']:.3f}",
        'Shadows v0 CI95': format_ci(v0_stats.get('ci_95')),
        'Shadows v1 expectation': f"{v1_stats['expectation_value']:.3f}",
        'Shadows v1 CI95': format_ci(v1_stats.get('ci_95')),
    })

import pandas as pd
summary_df = pd.DataFrame(summary_rows)
summary_df

## Validation artifacts

The smoke test writes manifest files and raw shot data under `validation_data/`. Use the cell below to confirm that the files were written.
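
If the listing gets long, a per-extension tally is easier to scan. The following sketch uses only the standard library. `summarize_artifacts` is a hypothetical helper, not part of QuartumSE.

```python
from collections import defaultdict
from pathlib import Path


def summarize_artifacts(root):
    """Return {suffix: (file_count, total_bytes)} for all files under root."""
    root = Path(root)
    if not root.is_dir():
        return {}  # nothing captured yet
    totals = defaultdict(lambda: [0, 0])
    for path in root.rglob('*'):
        if path.is_file():
            entry = totals[path.suffix or '(no extension)']
            entry[0] += 1
            entry[1] += path.stat().st_size
    return {suffix: tuple(entry) for suffix, entry in totals.items()}


for suffix, (count, size) in sorted(summarize_artifacts('validation_data').items()):
    print(f'{suffix}: {count} file(s), {size / 1024:.1f} KB')
```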

In [None]:
validation_root = Path('validation_data')
for path in sorted(validation_root.rglob('*')):
    if path.is_file():
        print(path)

In [None]:
import json
from datetime import datetime

# Compile all results into a single dictionary
smoke_test_results = {
    'metadata': {
        'test_name': 'preliminary_smoke_test',
        'timestamp': datetime.utcnow().isoformat() + 'Z',
        'backend': backend.name,
        'backend_snapshot': {
            'backend_name': snapshot.backend_name,
            'backend_version': snapshot.backend_version,
            'num_qubits': snapshot.num_qubits,
            'calibration_timestamp': snapshot.calibration_timestamp.isoformat() if snapshot.calibration_timestamp else None,
        },
        'circuit': {
            'type': 'Bell state',
            'num_qubits': 2,
        },
        'observables': [str(obs) for obs in observables],
    },
    'direct_measurements': {
        'method': 'Direct Pauli measurement with basis rotations',
        'total_shots': sum(direct_results[k]['shots'] for k in direct_results),
        'results': {
            pauli: {
                'expectation': direct_results[pauli]['expectation'],
                'shots': direct_results[pauli]['shots'],
                'counts': dict(direct_results[pauli]['counts'])  # Convert to regular dict
            }
            for pauli in direct_results
        }
    },
    'shadows_v0': {
        'method': 'Classical Shadows v0 (baseline)',
        'experiment_id': result_v0.experiment_id,
        'manifest_path': result_v0.manifest_path,
        'shot_data_path': result_v0.shot_data_path,
        'shadow_size': 500,
        'execution_time': result_v0.execution_time,
        'results': {
            pauli: {
                'expectation_value': result_v0.observables[pauli]['expectation_value'],
                'variance': result_v0.observables[pauli]['variance'],
                'ci_95': result_v0.observables[pauli].get('ci_95'),
                'ci_width': result_v0.observables[pauli]['ci_width'],
            }
            for pauli in [str(obs) for obs in observables]
        }
    },
    'shadows_v1': {
        'method': 'Classical Shadows v1 (noise-aware with MEM)',
        'experiment_id': result_v1.experiment_id,
        'manifest_path': result_v1.manifest_path,
        'shot_data_path': result_v1.shot_data_path,
        'shadow_size': 200,
        'mem_shots': mem_shots,
        'execution_time': result_v1.execution_time,
        'results': {
            pauli: {
                'expectation_value': result_v1.observables[pauli]['expectation_value'],
                'variance': result_v1.observables[pauli]['variance'],
                'ci_95': result_v1.observables[pauli].get('ci_95'),
                'ci_width': result_v1.observables[pauli]['ci_width'],
            }
            for pauli in [str(obs) for obs in observables]
        }
    },
    'comparison': {
        'description': 'Side-by-side comparison of all three methods',
        'expected_values': {
            'ZZ': 1.0,  # Perfect correlation for Bell state
            'XX': 1.0,  # Perfect correlation for Bell state
        },
        'table': summary_rows,
    },
    'analysis': {
        'notes': [
            'Bell state should show ZZ = XX = +1 (perfect correlation)',
            'Noise and finite sampling cause deviations from ideal',
            'v1 (noise-aware) should show reduced error on noisy hardware',
            'All methods should give consistent results within confidence intervals',
        ]
    }
}

# Save to JSON file with timestamp
results_filename = f"smoke_test_results_{datetime.utcnow().strftime('%Y%m%d_%H%M%S')}.json"
results_path = Path('validation_data') / results_filename

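# default=str is a defensive fallback for values json cannot encode natively (e.g. Path objects)
with open(results_path, 'w', encoding='utf-8') as f:
    json.dump(smoke_test_results, f, indent=2, default=str)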

print(f"✓ Complete smoke test results saved to: {results_path}")
print(f"  File size: {results_path.stat().st_size / 1024:.1f} KB")
print("\nResults include:")
print("  - Metadata (backend, timestamp, circuit)")
print("  - Direct measurement results and counts")
print("  - Shadow v0 results (with manifest/shot data paths)")
print("  - Shadow v1 results (with MEM)")
print("  - Comparison table")
print("  - Analysis notes")
print("\nTo load later:")
print("  import json")
print(f"  with open('{results_path}', 'r') as f:")
print("      results = json.load(f)")

## Summary: What Gets Saved

**When you run this notebook, the following data is automatically saved to `validation_data/`:**

### 1. Shadow Estimator Manifests (JSON)
- **Location**: `validation_data/manifests/{experiment_id}.json`
- **Contains**: Complete provenance including circuit, backend calibration, mitigation config, results
- **Files**: 2 manifests (one for v0, one for v1)
- **Size**: ~10 KB each

### 2. Shot Data (Parquet)
- **Location**: `validation_data/shots/{experiment_id}.parquet`
- **Contains**: All raw measurement outcomes (basis, bitstring) for replay
- **Files**: 2 parquet files (one for v0, one for v1)
- **Size**: ~50-100 KB each

### 3. Smoke Test Summary (JSON)
- **Location**: `validation_data/smoke_test_results_{timestamp}.json`
- **Contains**: All results from all three methods (direct, v0, v1) plus comparison
- **Files**: 1 summary file per notebook run
- **Size**: ~5-10 KB

### 4. Direct Measurement Raw Counts
- **Location**: Included in smoke test summary JSON
- **Contains**: Full count dictionaries for ZZ and XX measurements

**Total storage per smoke test run**: roughly 125-230 KB, going by the per-file estimates above

**To review later**:
```python
# Load smoke test summary
import json
with open('validation_data/smoke_test_results_20251022_143000.json', 'r') as f:
    results = json.load(f)

# Access any result
print(results['shadows_v1']['results']['ZZ']['expectation_value'])

# Load full provenance manifests
from quartumse.reporting.manifest import ProvenanceManifest
manifest = ProvenanceManifest.from_json(results['shadows_v1']['manifest_path'])

# Replay with new observables
from quartumse import ShadowEstimator
estimator = ShadowEstimator(backend='...', data_dir='validation_data')
new_result = estimator.replay_from_manifest(results['shadows_v1']['manifest_path'])
```
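
Because each run writes a new timestamped summary, it can be convenient to pick up the most recent one instead of hard-coding a filename. A minimal sketch, assuming the `smoke_test_results_{timestamp}.json` naming shown above. `latest_summary` is a hypothetical helper.

```python
import json
from pathlib import Path


def latest_summary(root='validation_data'):
    """Return the newest smoke-test summary path under root, or None if absent.

    The YYYYMMDD_HHMMSS timestamp in the filename sorts lexicographically,
    so the last name in sorted order is the most recent run.
    """
    root = Path(root)
    if not root.is_dir():
        return None
    candidates = sorted(root.glob('smoke_test_results_*.json'))
    return candidates[-1] if candidates else None


path = latest_summary()
if path is None:
    print('No smoke-test summary found yet.')
else:
    with open(path, 'r', encoding='utf-8') as f:
        results = json.load(f)
    print('Loaded', path.name, 'with sections:', sorted(results))
```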

## Save Complete Results for Later Review

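The export cell that follows the validation-artifacts listing writes all results (direct, v0, v1, comparison) to a single JSON file under `validation_data/` for later analysis.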