# LFT: N=6 Scaling & $\mathbb{R}^4$ Embedding Stress

This notebook quantifies how well the **A$_5$** geometry (permutohedron $\Pi_5$ of **N=6**) embeds from the natural **5D** sum-zero space into **4D** via **PCA (optimal linear projection)**. We report:

1. **Edge distortion**: relative errors on all **adjacent-generator edges** (720 vertices of degree 5 give $720\cdot 5/2 = 1{,}800$ edges).
2. **Global stress**: fraction of variance lost by the best rank-4 linear map (Eckart–Young–Mirsky optimality), plus optional pairwise-distance RMS error on a random sample.

**Captions for manuscript** are included near each figure/output.

## 1. Build $\Pi_5$ in the sum-zero space $V\subset \mathbb{R}^6$

**Definition.** Take a centered, strictly increasing template $a=(a_0,\dots,a_5)$ with $\sum a_i=0$. The vertex set is $\{\sigma\cdot a\mid \sigma\in S_6\}$, expressed in coordinates with respect to an orthonormal basis of $V=\{x\in\mathbb{R}^6: \sum x_i=0\}\cong\mathbb{R}^5$.

We also build the **adjacent-generator** Cayley graph to enumerate the **1,800 edges**.

In [None]:
import numpy as np, itertools, networkx as nx

def sum_zero_basis(N):
    # Orthonormal basis for V = {x: sum x_i = 0}, via SVD of difference matrix
    diffs = np.zeros((N, N-1))
    for i in range(N-1):
        diffs[i, i] = 1.0
        diffs[i+1, i] = -1.0
    U, S, Vt = np.linalg.svd(diffs, full_matrices=False)
    return U  # N x (N-1)

def permutohedron_coords(N):
    B = sum_zero_basis(N)
    a = np.arange(N, dtype=float) - (N-1)/2.0
    perms = list(itertools.permutations(range(N)))
    Vcoords = np.zeros((len(perms), N-1))
    for k, p in enumerate(perms):
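        # Place template value a[j] at coordinate p[j]. Swapping adjacent entries
        # of p then exchanges two values that differ by 1, which is a polytope
        # edge of length sqrt(2). With v = a[list(p)] the same swap exchanges
        # adjacent coordinates, whose values can differ by up to N-1.
        v = np.empty(N)
        v[list(p)] = a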
        Vcoords[k] = B.T @ v
    return Vcoords, perms

def cayley_adjacent_graph(N, perms):
    idx = {p:i for i,p in enumerate(perms)}
    G = nx.Graph()
    G.add_nodes_from(range(len(perms)))
    gens = [(i, i+1) for i in range(N-1)]
    for p in perms:
        u = idx[p]
        for (i,j) in gens:
            q = list(p)
            q[i], q[j] = q[j], q[i]
            v = idx[tuple(q)]
            if u < v:
                G.add_edge(u, v)
    return G

V6, perms6 = permutohedron_coords(6)
G6 = cayley_adjacent_graph(6, perms6)
nodes6, edges6 = V6.shape[0], G6.number_of_edges()
print({'N':6, 'nodes':nodes6, 'edges_adjacent':edges6, 'avg_degree':2*edges6/nodes6})
assert nodes6 == 720 and edges6 == 1800
print('A5 vertex/edge counts verified.')
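
As a quick sanity check on the construction, the sketch below rebuilds a small case (N=4, the 24-vertex permutohedron) and confirms the counts $N!$ and $N!(N-1)/2$. It also checks that every edge has length $\sqrt{2}$ under one convention, assumed here: the vertex for permutation `p` places template value `a[j]` at coordinate `p[j]`, so swapping adjacent entries of `p` exchanges two values that differ by 1. The helper `small_permutohedron` is illustrative and not part of the notebook.

```python
import itertools
import numpy as np

def small_permutohedron(N):
    """Vertices (in R^N) and adjacent-transposition edges for a small N.

    Assumed convention: coordinate p[j] receives template value a[j].
    """
    a = np.arange(N, dtype=float) - (N - 1) / 2.0  # centered template
    perms = list(itertools.permutations(range(N)))
    index = {p: i for i, p in enumerate(perms)}
    verts = np.zeros((len(perms), N))
    for k, p in enumerate(perms):
        verts[k, list(p)] = a  # verts[k, p[j]] = a[j]
    edges = set()
    for p in perms:
        for i in range(N - 1):
            q = list(p)
            q[i], q[i + 1] = q[i + 1], q[i]  # adjacent transposition
            u, v = index[p], index[tuple(q)]
            edges.add((min(u, v), max(u, v)))
    return verts, sorted(edges)

verts4, edges4 = small_permutohedron(4)
lengths4 = np.array([np.linalg.norm(verts4[u] - verts4[v]) for u, v in edges4])
print(len(verts4), len(edges4), lengths4.min(), lengths4.max())
```

If the edge lengths are not all equal, the graph being measured is a Cayley graph on the vertex set but not the edge graph of the polytope.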

## 2. PCA to $\mathbb{R}^4$ and variance-retention (global stress)

**Theorem (Eckart–Young–Mirsky).** The rank-4 linear projection that minimizes squared reconstruction error is PCA onto the top four principal axes. If $S_1\ge S_2\ge\cdots$ are the singular values of the centered data matrix, the **variance retained** is $\sum_{i=1}^4 S_i^2/\sum_{i} S_i^2$, and we report **global stress** $=1-\text{retained}$.
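
For this particular point set the retained variance can be predicted in advance. $S_6$ permutes the coordinates and acts irreducibly on $V$, so the covariance of the vertex set restricted to $V$ is a multiple of the identity. The five nonzero singular values are therefore equal and any rank-4 projection within $V$ retains exactly $4/5$ of the variance. The sketch below checks this numerically, assuming the centered integer template used in this notebook; the variable names are illustrative.

```python
import itertools
import numpy as np

N = 6
a = np.arange(N, dtype=float) - (N - 1) / 2.0  # centered template
# All 720 permutations of the template, as rows in R^6 (each row sums to zero).
X = np.array([a[list(p)] for p in itertools.permutations(range(N))])
Xc = X - X.mean(axis=0, keepdims=True)

S = np.linalg.svd(Xc, compute_uv=False)  # 6 singular values, the last is ~0
nonzero = S[:N - 1]
retained_rank4 = (nonzero[:4] ** 2).sum() / (S ** 2).sum()
print(nonzero, retained_rank4)
```

One consequence is that the top-4 subspace is not unique: the SVD may return any 4D subspace of $V$, so edge-level statistics can depend on which direction happens to be dropped.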


In [None]:
def pca_project(X, k):
    Xc = X - X.mean(axis=0, keepdims=True)
    U,S,Vt = np.linalg.svd(Xc, full_matrices=False)
    Xk = Xc @ Vt[:k].T
    retained = (S[:k]**2).sum()/ (S**2).sum()
    return Xk, retained

X4, retained = pca_project(V6, 4)
global_stress = 1.0 - retained
print({'retained_variance': float(retained), 'global_stress': float(global_stress)})

## 3. Edge-length distortion (adjacent edges only)

For each Cayley **adjacent edge** $(u,v)$, compute original edge length
$$\ell_5 = \lVert V6[u]-V6[v]\rVert_2$$
and projected length
$$\ell_4 = \lVert X4[u]-X4[v]\rVert_2.$$
Report the **relative error** $|\ell_4-\ell_5|/\ell_5$ over all edges. Save CSV and a histogram.

**Figure caption (manuscript):** *Histogram of relative edge-length errors under PCA(5→4) for A$_5$ permutohedron edges (N=6). The distribution is tight with low mean and IQR, indicating a coherent 4D embedding.*
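
Because PCA(5→4) discards exactly one orthonormal direction $w$, the projected length of an edge vector $\Delta$ satisfies $\ell_4^2=\ell_5^2-(w\cdot\Delta)^2$. The relative error is then $1-\sqrt{1-\cos^2\theta}$, where $\theta$ is the angle between the edge and $w$. The minimal sketch below illustrates the identity on synthetic 5D data (random points, not permutohedron vertices); all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))           # synthetic 5D point cloud
Xc = X - X.mean(axis=0, keepdims=True)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # Vt is 5 x 5, orthonormal rows
X4_demo = Xc @ Vt[:4].T                 # PCA coordinates in the top-4 subspace
w = Vt[4]                               # the single discarded direction

delta = Xc[0] - Xc[1]                   # one "edge" between two points
l5 = np.linalg.norm(delta)
l4 = np.linalg.norm(X4_demo[0] - X4_demo[1])

cos_theta = (w @ delta) / l5
rel_err_direct = abs(l4 - l5) / l5
rel_err_formula = 1.0 - np.sqrt(1.0 - cos_theta ** 2)
print(rel_err_direct, rel_err_formula)
```

Read this way, the histogram is a histogram of edge angles to the dropped direction: edges orthogonal to $w$ are preserved exactly, and edges parallel to it collapse.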

In [None]:
import pandas as pd, matplotlib.pyplot as plt, os
os.makedirs('./outputs', exist_ok=True)

edge_rows=[]
for u,v in G6.edges():
    l5 = np.linalg.norm(V6[u]-V6[v])
    l4 = np.linalg.norm(X4[u]-X4[v])
    rel = abs(l4-l5)/l5 if l5>0 else 0.0
    edge_rows.append({'u':u,'v':v,'L5':l5,'L4':l4,'rel_err':rel})

df_edges = pd.DataFrame(edge_rows)
df_edges.to_csv('./outputs/N6_edge_distortions.csv', index=False)
print('Saved ./outputs/N6_edge_distortions.csv with', len(df_edges), 'edges')

plt.figure()
plt.hist(df_edges['rel_err'].values, bins=40)
plt.xlabel('Relative edge-length error')
plt.ylabel('Count')
plt.title('N=6 A5 edge distortions under PCA(5→4)')
plt.tight_layout()
plt.savefig('./outputs/N6_edge_relerr_hist.png', dpi=150)
plt.close()
print('Saved ./outputs/N6_edge_relerr_hist.png')

## 4. Optional: Pairwise-distance RMS error on a random sample
We sample up to 50,000 unordered vertex pairs (out of $720\cdot 719/2 = 258{,}840$) to estimate the RMS relative pairwise-distance error, which complements the variance-loss figure.

In [None]:
import random
random.seed(0)  # fixed seed so the sampled pairs (and the reported RMS) are reproducible
pairs = []
max_samples = 50000
n = V6.shape[0]
for _ in range(max_samples):
    i = random.randrange(n)
    j = random.randrange(n)
    if i==j:
        continue
    if i>j:
        i,j = j,i
    pairs.append((i,j))
pairs = list(set(pairs))  # dedupe

errs=[]
for i,j in pairs:
    d5 = np.linalg.norm(V6[i]-V6[j])
    d4 = np.linalg.norm(X4[i]-X4[j])
    if d5>0:
        errs.append(abs(d4-d5)/d5)
rms_pair_err = float(np.sqrt(np.mean(np.square(errs)))) if errs else 0.0
print({'sampled_pairs': len(errs), 'rms_pair_rel_err': rms_pair_err})
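
With only 258,840 unordered pairs at n=720, the RMS error can also be computed exhaustively instead of by sampling. The sketch below shows one way to do this with NumPy alone. `rms_pairwise_rel_err` is a hypothetical helper, demonstrated on small synthetic data rather than on `V6` and `X4`.

```python
import numpy as np

def rms_pairwise_rel_err(X_hi, X_lo):
    """RMS of |d_lo - d_hi| / d_hi over all unordered pairs with d_hi > 0."""
    i, j = np.triu_indices(X_hi.shape[0], k=1)  # all pairs with i < j
    d_hi = np.linalg.norm(X_hi[i] - X_hi[j], axis=1)
    d_lo = np.linalg.norm(X_lo[i] - X_lo[j], axis=1)
    keep = d_hi > 0                             # skip coincident points
    rel = np.abs(d_lo[keep] - d_hi[keep]) / d_hi[keep]
    return float(np.sqrt(np.mean(rel ** 2))), int(keep.sum())

rng = np.random.default_rng(0)
pts = rng.normal(size=(50, 5))
rms_same, n_pairs = rms_pairwise_rel_err(pts, pts)      # identical embeddings
rms_drop, _ = rms_pairwise_rel_err(pts, pts[:, :4])     # drop the last coordinate
print(n_pairs, rms_same, rms_drop)
```

In the notebook this would be called as `rms_pairwise_rel_err(V6, X4)`, which removes the sampling noise and the dependence on the random seed.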

## 5. Summary JSON (drop-in for manuscript)
We save a concise JSON with counts, global stress (variance loss), edge stats, and RMS pairwise error.

In [None]:
import json  # needed for json.dump below

summary = {
    'N': 6,
    'nodes': int(nodes6),
    'edges_adjacent': int(edges6),
    'avg_degree': float(2*edges6/nodes6),
    'retained_variance_PCA4': float(retained),
    'global_stress_variance_loss': float(global_stress),
    'edge_rel_err_mean': float(df_edges['rel_err'].mean()),
    'edge_rel_err_median': float(df_edges['rel_err'].median()),
    'edge_rel_err_q25': float(df_edges['rel_err'].quantile(0.25)),
    'edge_rel_err_q75': float(df_edges['rel_err'].quantile(0.75)),
    'edge_rel_err_max': float(df_edges['rel_err'].max()),
    'rms_pair_rel_err_sampled': float(rms_pair_err),
    'sampled_pairs': len(errs)
}
with open('./outputs/N6_summary.json', 'w') as f:
    json.dump(summary, f, indent=2)
print('Saved ./outputs/N6_summary.json')
summary

### Manuscript Captions
- **Fig. N6-1.** *Histogram of relative edge-length error for all 1,800 adjacent edges of $\Pi_5$ under PCA(5→4). Low mean and narrow IQR indicate a near-isometric 4D embedding.*
- **Table N6-1.** *Summary metrics for N=6: nodes, edges, PCA retained variance, global stress (variance loss), edge error (mean/median/IQR/max), and sampled pairwise RMS error.*