# Post-Processing Experimental Two-Probe Data

This notebook post-processes raw experimental measurement data from two-probe quantum experiments into a format suitable for machine learning training. It performs post-selection based on ancilla-physical qubit pairs for error mitigation, extracts probe qubit measurements, constructs quantum shadow states from different measurement bases, and aggregates data across multiple experimental runs into consolidated training datasets.

### Qubit Layout: 6×6 Physical Lattice with Ancilla Perimeter

```
            ⊕0   ⊕1   ⊕2   ⊕3   ⊕4            ← ANCILLA (top)
             │    │    │    │    │
       ●5───●6───●7───●8───●9───●10──⊕11
        │    │    │    │    │    │    
  ⊕12──●13──●14──●15──●16──●17──●18──⊕19
        │    │    │    │    │    │    
  ⊕20──●21──●22──●23──●24──●25──●26──⊕27      ← 6×6 PHYSICAL
        │    │    │    │    │    │            
  ⊕28──●29──●30──●31──●32──●33──●34──⊕35       
        │    │    │    │    │    │    
  ⊕36──●37──●38──●39──●40──●41──●42──⊕43
        │    │    │    │    │    │    
       ●44──●45──●46──●47──●48──●49──⊕50
                  │    │    │    │
                 ⊕51  ⊕52  ⊕53  ⊕54           ← ANCILLA (bottom)
```

- **Physical qubits** (●): Interior 6×6 grid used for computation. Qubits 10 and 49 are the two probes; the remaining 34 are collected as `prep`
- **Ancilla qubits** (⊕): Perimeter qubits used for error-mitigation post-selection
- **Post-selection**: A shot is kept only if every ancilla agrees with its paired physical qubit (e.g., ancilla 0 ↔ physical 6)
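
The post-selection rule can be sketched on toy data: a shot survives only if each listed ancilla bit equals its partner physical bit. The helper name `post_select`, the 8-qubit toy tensor, and the shortened pair list are illustrative assumptions and not part of the notebook.

```python
import torch

def post_select(m, pairs):
    """Keep rows of m (shots, num_qubits) where m[:, anc] == m[:, phy] for all pairs."""
    msk = torch.ones(m.shape[0], dtype=torch.bool)
    for anc, phy in pairs:
        msk &= m[:, anc] == m[:, phy]
    return m[msk], msk

# Toy data: 4 shots on 8 hypothetical qubits (not the real layout).
toy = torch.tensor([
    [0, 1, 0, 0, 0, 0, 0, 1],   # both pairs agree -> kept
    [1, 0, 0, 0, 0, 0, 1, 0],   # both pairs agree -> kept
    [1, 0, 0, 0, 0, 0, 0, 0],   # column 0 != column 6 -> dropped
    [0, 1, 0, 0, 0, 0, 0, 0],   # column 1 != column 7 -> dropped
])
kept, msk = post_select(toy, pairs=[(0, 6), (1, 7)])
keep_fraction = msk.float().mean().item()
print(kept.shape, keep_fraction)
```

The mean of the mask is the fraction of shots that pass, which is the kind of quantity the notebook reports as "portion to keep".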

In [1]:
import torch # type: ignore
import os

dtype = torch.complex128
device = torch.device("cpu")  # torch.device("cuda" if torch.cuda.is_available() else "cpu")

pauli = torch.tensor([[[1,0],[0,1]],[[0,1],[1,0]],[[0,-1j],[1j,0]],[[1,0],[0,-1]]], device=device, dtype=dtype)
basis = torch.linalg.eig(pauli)[1][1:].mT # (3, 2, 2)


def shuffle(prepseq, shadow_state, rhoS):
    indices = torch.randperm(prepseq.shape[0])
    prepseq = prepseq[indices]
    shadow_state = shadow_state[indices]
    rhoS = rhoS[indices]
    return prepseq, shadow_state, rhoS
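
A quick sanity check on `basis` may be useful here. The sketch below rebuilds it the same way and confirms that each row is a normalized eigenvector of the corresponding Pauli matrix. Which row pairs with measured bit 0 depends on the eigenvalue ordering returned by `torch.linalg.eig`, so the sketch prints that ordering rather than assuming it.

```python
import torch

dtype = torch.complex128
pauli = torch.tensor(
    [[[1, 0], [0, 1]], [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]],
    dtype=dtype,
)
# Same construction as above: drop the identity, transpose so rows are eigenvectors.
basis = torch.linalg.eig(pauli)[1][1:].mT  # (3, 2, 2)

max_residual = 0.0
for k in range(3):          # 0: X, 1: Y, 2: Z
    P = pauli[k + 1]
    for b in range(2):      # b is the row later selected by the measured bit
        v = basis[k, b]
        lam = (v.conj() * (P @ v)).sum()    # Rayleigh quotient, expected to be +1 or -1
        residual = torch.linalg.norm(P @ v - lam * v).item()
        max_residual = max(max_residual, residual)
        print(f"basis {k}, row {b}: eigenvalue {lam.real.item():+.0f}")
print("max residual:", max_residual)
```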

In [2]:
# Process two-probe data: post-select shots, split prep/probe outcomes, build shadow states
def torch_data(filename):
    data = {}
    for i in range(3):
        for j in range(3):
            m = torch.load(filename+f'_({i},{j}).pt')
            # msk starts as all True, then gets progressively filtered by requiring ancilla-physical qubit pairs to be the same.
            # Finally used to keep only measurements that pass all post-selection criteria: m = m[msk]
            msk = torch.ones(m.shape[0], device=device, dtype=torch.bool)
            # post select on 2-qubit mitigation
            for anc, phy in [(0,6),
                             (1,7),
                             (2,8),
                             (3,9),
                             (12,13),
                             (20,21),
                             (28,29),
                             (36,37),
                             (19,18),
                             (27,26),
                             (35,34),
                             (43,42),
                             (51,46),
                             (52,47),
                             (53,48)
                             ]:
                msk = msk & (m[:,anc]==m[:,phy])
                #print((m[:,anc]==m[:,phy]).float().mean())
            # post select on 3-qubit mitigation
            for anc, phy in [(4,10),
                             (11,10),
                             (50,49),
                             (54,49)
                             ]:
                msk = msk & (m[:,anc]==m[:,phy])
                #print((m[:,anc]==m[:,phy]).float().mean())
            m = m[msk] # (batch, num_qubits)

            probe = torch.cat([m[:,10].view(-1,1), m[:,49].view(-1,1)], 1)
            prep = m[:,[5,6,7,8,9,
                        13,14,15,16,17,18,
                        21,22,23,24,25,26,
                        29,30,31,32,33,34,
                        37,38,39,40,41,42,
                        44,45,46,47,48]]
            data[(i,j)] = (prep, probe)
    prepseq, shadow_state, rhoS = [], [], []
    for k in data.keys():
        # construct post-measure state
        probseq = data[k][1].to(dtype=torch.int64).to(device=device) # (repetition, 2) last 2 outcomes
        obs_basis0 = basis[k[0]].unsqueeze(0).expand(probseq.shape[0], -1, -1) # (repetition, 2, 2)
        shadow_state0 = obs_basis0.gather(1, probseq[:,0].view(-1, 1, 1).expand(-1, -1, 2)).squeeze(1) # (repetition, 2)
        obs_basis1 = basis[k[1]].unsqueeze(0).expand(probseq.shape[0], -1, -1) # (repetition, 2, 2)
        shadow_state1 = obs_basis1.gather(1, probseq[:,1].view(-1, 1, 1).expand(-1, -1, 2)).squeeze(1) # (repetition, 2)
        shadow_state01 = torch.vmap(torch.kron)(shadow_state0, shadow_state1) # (batch, 4)
        # construct rhoS
        I = torch.eye(2, 2, device=device)[None,...].expand(shadow_state01.shape[0], -1, -1)
        rhoS0 = 3*torch.vmap(torch.outer)(shadow_state0, shadow_state0.conj()) - I
        rhoS1 = 3*torch.vmap(torch.outer)(shadow_state1, shadow_state1.conj()) - I
        rhoS01 = torch.vmap(torch.kron)(rhoS0, rhoS1)
        # collect result
        prepseq.append(data[k][0].to(dtype=torch.int64).to(device=device))
        shadow_state.append(shadow_state01)
        rhoS.append(rhoS01)
    prepseq = torch.cat(prepseq, 0).to(torch.int64)
    shadow_state = torch.cat(shadow_state, 0)
    rhoS = torch.cat(rhoS, 0)
    return prepseq, shadow_state, rhoS
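
The `expand` + `gather` pattern in `torch_data` selects, for each shot, the basis row indexed by the measured bit. A minimal sketch, using a random stand-in for `basis[k]` and made-up outcomes, suggests it matches plain advanced indexing:

```python
import torch

torch.manual_seed(0)
basis_k = torch.randn(2, 2, dtype=torch.complex128)   # stand-in for basis[k], rows = states
outcomes = torch.tensor([0, 1, 1, 0, 1])              # stand-in for probseq[:, 0]

# Notebook-style selection: expand to (shots, 2, 2), then gather one row per shot.
expanded = basis_k.unsqueeze(0).expand(outcomes.shape[0], -1, -1)
index = outcomes.view(-1, 1, 1).expand(-1, -1, 2)      # (shots, 1, 2)
via_gather = expanded.gather(1, index).squeeze(1)      # (shots, 2)

# Equivalent advanced indexing: row outcomes[n] of basis_k for each shot n.
via_index = basis_k[outcomes]                          # (shots, 2)

print(torch.equal(via_gather, via_index), via_index.shape)
```

Either form yields one two-component state vector per shot; the indexing form is shorter, while the `gather` form makes the per-shot selection explicit.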

### Data Object Descriptions

**Key saved objects and their meanings:**

- `all_prepseq` - Shape: (N, 34)  
  Measurement outcomes of the 34 non-probe physical qubits after post-selection filtering. Used as input features for ML models to predict probe states.

- `all_shadow_state` - Shape: (N, 4)  
  Shadow states of the two probe qubits: |ψ₁⟩⊗|ψ₂⟩ in the 4D Hilbert space. Constructed from probe measurements in X/Y/Z bases.

- `all_rhoS` - Shape: (N, 4, 4)  
  Tensor product of single-qubit shadow density matrices: ρS = ρS₁⊗ρS₂ where ρSᵢ = 3|ψᵢ⟩⟨ψᵢ| - I. These are the target density matrices for ML training.
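
To see why the single-qubit snapshot takes the form 3|ψ⟩⟨ψ| - I, one can average it exactly over a uniformly chosen X/Y/Z basis and the Born-rule outcome probabilities; the average reproduces the measured state. The sketch below does this for one qubit. The eigenvectors are written out by hand and the test state `phi` is an arbitrary illustrative choice, so neither is taken from the notebook.

```python
import torch

dtype = torch.complex128
s = 2 ** -0.5
# Rows are the two eigenstates of each Pauli basis (hand-written, independent of eig ordering).
states = torch.tensor(
    [
        [[s, s], [s, -s]],            # X basis
        [[s, s * 1j], [s, -s * 1j]],  # Y basis
        [[1, 0], [0, 1]],             # Z basis
    ],
    dtype=dtype,
)
I2 = torch.eye(2, dtype=dtype)

# Arbitrary normalized single-qubit test state (illustrative).
phi = torch.tensor([0.6, 0.8j], dtype=dtype)
rho = torch.outer(phi, phi.conj())

# Exact average over a uniformly random basis and Born-rule outcomes.
mean_snapshot = torch.zeros(2, 2, dtype=dtype)
for k in range(3):
    for b in range(2):
        psi = states[k, b]
        prob = torch.vdot(psi, rho @ psi).real           # <psi|rho|psi>
        snapshot = 3 * torch.outer(psi, psi.conj()) - I2
        mean_snapshot += (1 / 3) * prob * snapshot

print(torch.allclose(mean_snapshot, rho))
```

Each individual snapshot has unit trace but is not positive semidefinite; only the average over many shots approximates a physical density matrix. For two probes measured in independently chosen bases, the same argument applies factor by factor to ρS₁⊗ρS₂.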

---

### Data Storage Summary

**File Organization:**
```
data/theta{θ_idx}/
├── all_prepseq_theta={θ_idx}.pt     # Input: non-probe physical-qubit measurement patterns
├── all_shadow_state_theta={θ_idx}.pt # Target: probe shadow states
└── all_rhoS_theta={θ_idx}.pt        # Target: probe shadow density matrices
```
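
A minimal loading sketch for this layout follows. The helper name `load_theta` is hypothetical, and the demo writes placeholder tensors to a temporary directory rather than reading real data.

```python
import os
import tempfile
import torch

def load_theta(root, theta_idx):
    """Load the three consolidated tensors for one theta index and check their shapes."""
    folder = os.path.join(root, f"theta{theta_idx}")
    names = ("all_prepseq", "all_shadow_state", "all_rhoS")
    prepseq, shadow_state, rhoS = (
        torch.load(os.path.join(folder, f"{name}_theta={theta_idx}.pt")) for name in names
    )
    n = prepseq.shape[0]
    assert prepseq.shape == (n, 34)
    assert shadow_state.shape == (n, 4)
    assert rhoS.shape == (n, 4, 4)
    return prepseq, shadow_state, rhoS

# Demo on placeholder tensors written to a temporary directory (not real data).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "theta4"))
    toy = {
        "all_prepseq": torch.zeros(5, 34, dtype=torch.int64),
        "all_shadow_state": torch.zeros(5, 4, dtype=torch.complex128),
        "all_rhoS": torch.zeros(5, 4, 4, dtype=torch.complex128),
    }
    for name, tensor in toy.items():
        torch.save(tensor, os.path.join(root, "theta4", f"{name}_theta=4.pt"))
    prepseq, shadow_state, rhoS = load_theta(root, 4)
    print(prepseq.shape, shadow_state.shape, rhoS.shape)
```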

**Usage:** ML models learn to predict probe quantum states (shadow_state/rhoS) from the measurement outcomes of the remaining physical qubits (prepseq) for quantum error correction decoding.


In [None]:
for theta_idx in [4]:
    for d in [6]:
        all_prepseq = []
        all_shadow_state = []
        all_rhoS = []
        for loop in [0]:
            filename = f'data/theta{theta_idx}/loop{loop}/theta={theta_idx}'
            torch.manual_seed(loop)
            prepseq, shadow_state, rhoS = torch_data(filename)  # torch_data takes only the filename prefix
            prepseq, shadow_state, rhoS = shuffle(prepseq, shadow_state, rhoS)
            all_prepseq.append(prepseq)
            all_shadow_state.append(shadow_state)
            all_rhoS.append(rhoS)
            print(f'distance={d}, loop={loop}, theta_idx={theta_idx}, portion to keep={((prepseq.shape[0]/9000000)):.4f}')
        all_prepseq = torch.cat(all_prepseq, 0)
        all_shadow_state = torch.cat(all_shadow_state, 0)
        all_rhoS = torch.cat(all_rhoS, 0)
        torch.save(all_prepseq, f'data/theta{theta_idx}/all_prepseq_theta={theta_idx}.pt')
        torch.save(all_shadow_state, f'data/theta{theta_idx}/all_shadow_state_theta={theta_idx}.pt')
        torch.save(all_rhoS, f'data/theta{theta_idx}/all_rhoS_theta={theta_idx}.pt')
        print(all_prepseq.shape, theta_idx)
        print(all_shadow_state.shape, theta_idx)
        print(all_rhoS.shape, theta_idx)
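
Because `torch.manual_seed(loop)` is called before `shuffle`, the permutation should be reproducible from run to run, and the three tensors stay row-aligned since they share one index vector. A small sketch on toy tensors (the shapes and values are illustrative only):

```python
import torch

def shuffle(prepseq, shadow_state, rhoS):
    # Same logic as the notebook helper: one permutation applied to all three tensors.
    indices = torch.randperm(prepseq.shape[0])
    return prepseq[indices], shadow_state[indices], rhoS[indices]

# Toy tensors whose rows encode their own index, so alignment is easy to check.
n = 6
prepseq = torch.arange(n).view(n, 1)
shadow_state = torch.arange(n, dtype=torch.float64).view(n, 1)
rhoS = torch.arange(n, dtype=torch.float64).view(n, 1, 1)

torch.manual_seed(0)
a = shuffle(prepseq, shadow_state, rhoS)
torch.manual_seed(0)
b = shuffle(prepseq, shadow_state, rhoS)

same_order = all(torch.equal(x, y) for x, y in zip(a, b))
aligned = (
    torch.equal(a[0].flatten().double(), a[1].flatten())
    and torch.equal(a[1].flatten(), a[2].flatten())
)
print(same_order, aligned)
```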