# Classical Shadow Tomography

This notebook introduces the theory of Classical Shadow Tomography (CST), following [Aaronson's](https://arxiv.org/abs/1711.01053) (2018) original shadow tomography approach and the improvements of [Huang et al.](https://arxiv.org/abs/2002.08953) (2020) that lead to CST.

Aaronson states the central result of shadow tomography as follows:

> **Theorem 1 (Shadow Tomography; Aaronson, 2018)**
>
> Let $\rho$ be an unknown $D$-dimensional mixed state, and let $E_1,\dots,E_M$ be known two-outcome measurements.
>
> Then, for any $\varepsilon,\delta>0$, there exists a procedure that estimates
> $\Pr[E_i \text{ accepts } \rho]$ to within $\pm\varepsilon$ for all $i\in[M]$, with success probability at least $1-\delta$, using
>
> $$\widetilde{O}\!\left(\frac{\log^4 M \cdot \log D}{\varepsilon^5}\right)$$
>
> copies of $\rho$.

The gist of this theorem is that $k = \mathcal{O}(...)$ copies of $\rho$ suffice to build a classical description of the state that predicts all $M$ two-outcome measurements, without reconstructing $\rho$ in full. Classical shadows make this concrete: for each copy, draw a unitary $U$ from a random ensemble, apply it to obtain $U \rho U^{\dagger}$, and measure in the computational basis. The choice of ensemble is covered in [Huang et al.](https://arxiv.org/abs/2002.08953) (2020) and in this literature review from [Tufts](https://www.cs.tufts.edu/comp/150QC/Report1Mingqian.pdf). The number of times the unitary and measurement are applied to $\rho$ determines the size of the classical shadow.

From the classical shadow we estimate the expectation values of observables. These are the quantities we want to predict, because they carry useful information about the quantum properties of the state. An observable can be a projector, a combination of Pauli operators, or a full Hamiltonian.

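The measure-and-invert loop described above can be sketched for a single qubit with NumPy. This is a minimal illustration under stated assumptions, not the `CTomo` implementation used later in this notebook: the ensemble is a uniformly random choice of the X, Y, or Z measurement basis, the snapshot uses the single-qubit inverse channel $\hat{\rho} = 3\,U^{\dagger}\lvert b\rangle\!\langle b\rvert U - I$ from Huang et al., and the helper names `snapshot`, `sample_shadow`, and `exact_average` are hypothetical.

```python
import numpy as np

# Rotations that map the X, Y, Z eigenbases onto the computational basis.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S_DAG = np.array([[1, 0], [0, -1j]], dtype=complex)
BASIS_UNITARIES = {"X": H, "Y": H @ S_DAG, "Z": np.eye(2, dtype=complex)}


def snapshot(basis, bit):
    """Inverted single-qubit snapshot: 3 U^dagger |b><b| U - I."""
    u = BASIS_UNITARIES[basis]
    ket = np.zeros((2, 1), dtype=complex)
    ket[bit] = 1.0
    projector = ket @ ket.conj().T
    return 3 * u.conj().T @ projector @ u - np.eye(2)


def outcome_probabilities(rho, basis):
    """Born-rule probabilities of measuring 0/1 after rotating into `basis`."""
    u = BASIS_UNITARIES[basis]
    probs = np.real(np.diag(u @ rho @ u.conj().T))
    probs = np.clip(probs, 0.0, None)
    return probs / probs.sum()


def sample_shadow(rho, num_snapshots, rng):
    """Simulate the protocol: random basis, sampled outcome, inverted snapshot."""
    shadow = []
    for _ in range(num_snapshots):
        basis = str(rng.choice(list(BASIS_UNITARIES)))
        bit = int(rng.choice(2, p=outcome_probabilities(rho, basis)))
        shadow.append(snapshot(basis, bit))
    return shadow


def exact_average(rho):
    """Average the snapshot over every (basis, outcome) pair with its probability."""
    total = np.zeros((2, 2), dtype=complex)
    for basis in BASIS_UNITARIES:
        probs = outcome_probabilities(rho, basis)
        for bit in (0, 1):
            total += probs[bit] / 3 * snapshot(basis, bit)
    return total


# Example: |+><+| and an estimate of <X> from a finite shadow.
plus = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)
pauli_x = np.array([[0, 1], [1, 0]], dtype=complex)
rng = np.random.default_rng(0)
shadow = sample_shadow(plus, 5000, rng)
x_estimate = float(np.mean([np.real(np.trace(pauli_x @ s)) for s in shadow]))
print(f"estimated <X> = {x_estimate:.3f}")
```

`exact_average` checks that the snapshots are unbiased: averaged over bases and outcomes they return $\rho$ itself, which is why a finite shadow gives a consistent estimate of $\operatorname{Tr}(O\rho)$.
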
The theorem for CST is as follows:
> **Theorem 2 (Classical Shadows; Huang–Kueng–Preskill, 2020)**
>
> Let $\rho$ be an unknown $n$-qubit state. Fix a randomized measurement scheme that admits a classical-shadow estimator (e.g., independently sample a single-qubit Pauli/Clifford on each qubit, apply it to $\rho$, and classically invert the channel).  
> Given $N$ independent snapshots, there exists an estimator that outputs
> $\{\widehat{\mu}_i\}_{i=1}^M$ for $\mu_i=\operatorname{Tr}(O_i\rho)$ such that, with probability at least $1-\delta$,
> all $M$ estimates are $\varepsilon$-accurate simultaneously:
> $|\widehat{\mu}_i-\mu_i|\le \varepsilon$ for every $i\in[M]$, provided
>
> $$N \;\ge\; C\,\frac{\max_{i\in[M]}\,\|O_i\|_{\mathrm{sh}}^{2}\;\log(M/\delta)}{\varepsilon^{2}},$$
>
> where $\|O\|_{\mathrm{sh}}$ is the shadow norm (variance proxy) determined by the chosen measurement ensemble and $C$ is a universal constant.
>
> **In particular:**
> - For independent random single-qubit Pauli measurements and any $k$-local Pauli observable $O$, one has $\|O\|_{\mathrm{sh}}^{2}\!\le 3^{k}$ (for a general $k$-local observable with $\|O\|\le 1$ the bound weakens to $4^{k}$), so
>   $$N = O\!\left(\frac{3^{k}\,\log(M/\delta)}{\varepsilon^{2}}\right).$$
> - For fidelity (rank-1 projector) observables $O=\lvert\psi\rangle\!\langle\psi\rvert$, one has $\|O\|_{\mathrm{sh}}^{2}\!\le 2^{n}$, so
>   $$N = O\!\left(\frac{2^{n}\,\log(M/\delta)}{\varepsilon^{2}}\right).$$

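To get a feel for how the bound in Theorem 2 scales, the sketch below evaluates $C\,\|O\|_{\mathrm{sh}}^{2}\log(M/\delta)/\varepsilon^{2}$ for given inputs. The theorem only fixes the scaling, so the constant is an explicit parameter here; its default of `1.0` is a placeholder assumption, and the function names `shadow_sample_bound` and `pauli_sample_bound` are hypothetical.

```python
import math


def shadow_sample_bound(shadow_norm_sq, num_observables, epsilon, delta, constant=1.0):
    """Evaluate C * ||O||_sh^2 * log(M / delta) / epsilon^2, rounded up.

    `constant` stands in for the unspecified universal constant C.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not 0 < delta < 1:
        raise ValueError("delta must lie strictly between 0 and 1")
    if num_observables < 1:
        raise ValueError("need at least one observable")
    scale = math.log(num_observables / delta) / epsilon**2
    return math.ceil(constant * shadow_norm_sq * scale)


def pauli_sample_bound(locality, num_observables, epsilon, delta, constant=1.0):
    """Bound for random Pauli measurements, using ||O||_sh^2 <= 3^k."""
    return shadow_sample_bound(3**locality, num_observables, epsilon, delta, constant)


# Example: four 2-local Pauli observables, epsilon = 0.1, delta = 0.01.
print(pauli_sample_bound(locality=2, num_observables=4, epsilon=0.1, delta=0.01))
```

Because $M$ enters only through $\log(M/\delta)$, adding more observables is cheap, while halving $\varepsilon$ quadruples the number of snapshots.
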
## Recreating Classical Shadow Tomography
To recreate CST, we first need a way to apply random Clifford unitaries as the basis-change operation. Qiskit provides this out of the box; see the documentation for its `Clifford` class [here](https://quantum.cloud.ibm.com/docs/en/api/qiskit/qiskit.quantum_info.Clifford). Because the Cliffords can be sampled on the fly while building the full quantum circuit, we implement the whole pipeline in one place.

In [1]:
from pathlib import Path
import sys

sys.path.insert(0, str(Path.cwd().parent / "src"))

from CShadTomo import CTomo, ShadowTomoPlotter

In [1]:
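# Set the Hugging Face token (only needed for the optional LLM predictor).
# Paste your own token below; never commit a real token to version control.
import os
os.environ["HF_TOKEN"] = ""

!huggingface-cli login --token "$HF_TOKEN"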

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `hf`CLI if you want to set the git credential as well.
Token is valid (permission: fineGrained).
The token `ToSProject` has been saved to /Users/zacsmms/.cache/huggingface/stored_tokens
Traceback (most recent call last):
  File "/opt/anaconda3/bin/huggingface-cli", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/huggingface_hub/commands/huggingface_cli.py", line 61, in main
    service.run()
  File "/opt/anaconda3/lib/python3.11/site-packages/huggingface_hub/commands/user.py", line 113, in run
    login(
  File "/opt/anaconda3/lib/python3.11/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/huggingface_hub/utils/_deprecation.py"

In [None]:
# ---- driver.py (example usage) ----
import numpy as np
from qiskit_aer import AerSimulator
from qiskit import QuantumCircuit

# For set_llm() to work:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# 1) Define your state-prep routine
def prep_bell(qc: QuantumCircuit):
    qc.h(0)
    qc.cx(0, 1)
    return qc  # optional, class doesn't use the return value

# 2) Instantiate tomography (choose a scheme)
tomo = CTomo(
    n_qubits=2,
    scheme="local-clifford",          # or "local-pauli" / "global-clifford"
    num_snapshots=5000,
    default_predictor="classical",     # or "llm"
)

# 3) Backend + state-prep + observables
tomo.set_backend(AerSimulator())       # or an IBM backend name if using Runtime
tomo.set_state_prep(prep_bell)
tomo.set_observables(["ZI", "IZ", "XX", "ZZ"])

# 4) Run shots (pick one)
tomo.run_snapshots()                   # runs exactly num_snapshots
# or, e.g., cap by time/shots and auto-save checkpoints:
# tomo.run_snapshots_capped(N=10000, max_seconds_total=60, save_every=1000, save_path="data/snapshots.json")

# (Optional) Save the snapshot log to file for resuming later
tomo.save_now("data/snapshots.json")

# 5) Classical estimates (mean ± SE) for each observable
for O in tomo.observables:
    mean, se = tomo.estimate_observable_classical(O)
    print(f"{O:>2}: mean={mean:.4f}, SE={se:.4f}")

# 6) Save per-snapshot contributions to .npz (for plotting later)
#    Do this for whichever observable(s) you want hist/running-mean plots for.
tomo.save_results_classical("ZZ", "out/classical_ZZ.npz")

# 7) (Optional) LLM setup + values
#    Make sure your GPU/CPU can handle the model you pick.
#    Comment these lines out if you’re not using the LLM path.
tomo.set_llm("Qwen/Qwen2.5-1.5B-Instruct")  # example HF model
mean_llm, conf_llm = tomo.estimate_observable_llm("ZZ")
print(f"LLM(ZZ): mean={mean_llm:.4f}, avg_conf={conf_llm:.3f}")
tomo.save_results_llm("ZZ", "out/llm_ZZ.npz")

# 8) Plotting (vector outputs). Create a plotter attached to your tomo.
plotter = ShadowTomoPlotter(tomo)

# Individual vector plots (SVGs)
plotter.plot_running_mean_from_file("out/classical_ZZ.npz", save_dir="plots")
plotter.plot_hist_from_file("out/classical_ZZ.npz", save_dir="plots")
plotter.plot_bit_marginals(save_dir="plots")
plotter.plot_basis_usage(save_dir="plots")
plotter.plot_stats_dashboard(["ZI", "IZ", "XX", "ZZ"], save_dir="plots")
# If you created LLM results:
plotter.plot_llm_confidence_from_file("out/llm_ZZ.npz", save_dir="plots")
plotter.plot_llm_vs_classical(["ZI", "IZ", "XX", "ZZ"], save_dir="plots")

# One combined multi-panel vector figure (also SVG)
plotter.plot_all_vector(
    save_dir="plots",
    classical_path="out/classical_ZZ.npz",
    llm_path="out/llm_ZZ.npz",             # omit if you didn’t run LLM
    compare_observables=["ZI", "IZ", "XX", "ZZ"],
    stats_observables=["ZI", "IZ", "XX", "ZZ"],
    filename="all_plots.svg"
)

# 9) Resuming later (optional):
# tomo.resume_snapshots_capped(N=5000, load_path="data/snapshots.json",
#                              save_path="data/snapshots.json",
#                              max_seconds_total=60, save_every=1000)

ZI: mean=0.0294, SE=0.0246
IZ: mean=0.0114, SE=0.0247
XX: mean=0.9558, SE=0.0392
ZZ: mean=1.0422, SE=0.0407


Device set to use mps:0
The following generation flags are not valid and may be ignored: ['temperature', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.