# 1.8d: Full-Dimensional Black Hole Separation

We know the 4 black holes are separated by 1-2 ULP (units in the last place, in bfloat16) along the (x, y, z) basis dimensions (322, 1564, 163).
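
To make "1-2 ULP" concrete, the sketch below computes the gap between adjacent bfloat16 values at a given magnitude, using only the standard library. The helper names (`bf16_bits`, `bf16_from_bits`, `bf16_ulp`) are illustrative and not part of this notebook, and the code assumes a finite input that is already representable in bfloat16.

```python
import struct

def bf16_bits(x: float) -> int:
    """Upper 16 bits of the float32 encoding of x, i.e. its bfloat16 pattern (truncated)."""
    (u32,) = struct.unpack("<I", struct.pack("<f", x))
    return u32 >> 16

def bf16_from_bits(bits: int) -> float:
    """Decode a 16-bit bfloat16 pattern into a Python float."""
    (value,) = struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))
    return value

def bf16_ulp(x: float) -> float:
    """Gap between |x| and the next larger bfloat16 value."""
    bits = bf16_bits(abs(x))
    return bf16_from_bits(bits + 1) - bf16_from_bits(bits)

print(bf16_ulp(1.0))       # gap just above 1.0, equal to 2**-7
print(bf16_ulp(2.0**-9))   # gap just above 2**-9, equal to 2**-16
```

A value of magnitude 2^-9 has a ULP of 2^-16 ≈ 1.526e-05, the same size as the coordinate steps printed later in this notebook. Whether the raw weights in the basis dimensions actually sit at that magnitude is not checked here.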

**Question:** Are they **only** separated in those 3 dimensions, or are they separated across all 2560 dimensions?

**Test:** Take BH1 (largest, 866 tokens) as reference and compute component-wise differences to BH2, BH3, BH4 across **all 2560 dimensions**.

- If they're separated in only those 3 dimensions → the lattice is 3D
- If they're separated in many dimensions → the lattice is high-dimensional
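
The test boils down to listing the indices where two vectors differ. A minimal sketch on made-up 8-dimensional vectors; `separated_dims` is a hypothetical helper, and the values are not model weights.

```python
def separated_dims(reference, other):
    """Return the indices where two equal-length vectors differ."""
    if len(reference) != len(other):
        raise ValueError("vectors must have the same length")
    return [i for i, (a, b) in enumerate(zip(reference, other)) if a != b]

# Toy vectors (made-up values, exactly representable in binary)
ref   = [0.0, 0.5, -0.25,  0.0, 1.0, 0.0, 0.0, 2.0]
other = [0.0, 0.5, -0.125, 0.0, 1.0, 0.0, 0.0, 2.5]

dims = separated_dims(ref, other)
print(f"{len(dims)} / {len(ref)} dimensions differ: {dims}")
```

Here only indices 2 and 7 differ, so the toy pair would count as separated in 2 of 8 dimensions.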

## Parameters

In [1]:
# Model to analyze
MODEL_NAME = "Qwen3-4B-Instruct-2507"

## Imports

In [2]:
import torch
import ml_dtypes
import numpy as np
import matplotlib.pyplot as plt
from safetensors.torch import load_file
from pathlib import Path
from collections import defaultdict, Counter

# Helper function
def torch_bf16_to_numpy_bf16(tensor):
    """Convert PyTorch bfloat16 tensor to numpy array with ml_dtypes.bfloat16 dtype."""
    return tensor.cpu().view(torch.uint16).numpy().view(ml_dtypes.bfloat16)

## Load Data

In [3]:
# Load W in bfloat16
W_path = Path(f"../tensors/{MODEL_NAME}/W.safetensors")
W_bf16 = load_file(W_path)["W"]

# Load core data
core_path = Path(f"../tensors/{MODEL_NAME}/1.8a_core.safetensors")
core_data = load_file(core_path)

core_mask = core_data["core_mask"].to(torch.bool)
core_token_ids = core_data["core_token_ids"].to(torch.int64)
n_core = core_data["n_core"].item()

# Basis indices
north_idx = core_data["north_idx"].item()
meridian_idx = core_data["meridian_idx"].item()
equinox_idx = core_data["equinox_idx"].item()

print(f"Loaded core: {n_core:,} tokens")
print(f"Basis dimensions: x={meridian_idx}, y={equinox_idx}, z={north_idx}")

Loaded core: 2,179 tokens
Basis dimensions: x=322, y=1564, z=163


## Compute Centered Coordinates (bfloat16)

In [4]:
print("\nComputing centered coordinates...\n")

# Extract core and center
W_core_bf16 = W_bf16[core_mask]
core_centroid_bf16 = W_core_bf16.mean(dim=0)
W_core_centered_bf16 = W_core_bf16 - core_centroid_bf16

# Extract coordinates for basis dimensions
x_bf16 = W_core_centered_bf16[:, meridian_idx]
y_bf16 = W_core_centered_bf16[:, equinox_idx]
z_bf16 = W_core_centered_bf16[:, north_idx]
r_squared_bf16 = x_bf16**2 + y_bf16**2 + z_bf16**2

print("✓ Computed centered coordinates in bfloat16")


Computing centered coordinates...

✓ Computed centered coordinates in bfloat16


## Group by Coordinates to Find Black Holes

In [5]:
print("\nGrouping tokens by coordinates...\n")

# Create coordinate tuples
coords_bf16 = torch.stack([r_squared_bf16, x_bf16, y_bf16, z_bf16], dim=1)
coords_np_bf16 = torch_bf16_to_numpy_bf16(coords_bf16)

# Group tokens
coord_groups = defaultdict(list)
for i in range(n_core):
    coord_tuple = tuple(coords_np_bf16[i])
    coord_groups[coord_tuple].append(i)

# Get black holes (groups with >1 token)
black_holes = [(coord, indices) for coord, indices in coord_groups.items() if len(indices) > 1]
black_holes.sort(key=lambda x: len(x[1]), reverse=True)

print(f"Found {len(black_holes)} black holes")
for i, (coord, indices) in enumerate(black_holes, 1):
    r2, x, y, z = coord
    print(f"  BH{i}: {len(indices):4,} tokens at (x={float(x):.6e}, y={float(y):.6e}, z={float(z):.6e})")


Grouping tokens by coordinates...

Found 4 black holes
  BH1:  866 tokens at (x=-1.525879e-05, y=0.000000e+00, z=0.000000e+00)
  BH2:  734 tokens at (x=1.525879e-05, y=1.525879e-05, z=0.000000e+00)
  BH3:  329 tokens at (x=1.525879e-05, y=0.000000e+00, z=0.000000e+00)
  BH4:  249 tokens at (x=-1.525879e-05, y=1.525879e-05, z=0.000000e+00)
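
Two things can be read off this output: how many core tokens fall outside the four black holes, and whether the four positions fill a complete grid in (x, y). The sketch below hard-codes the printed counts and coordinates; the variable names are illustrative, and the printed value 1.525879e-05 is taken to be 2^-16, the only bfloat16 value that rounds to it.

```python
from itertools import product

n_core = 2179
bh_sizes = {"BH1": 866, "BH2": 734, "BH3": 329, "BH4": 249}

step = 2.0**-16  # 1.52587890625e-05, printed as 1.525879e-05
bh_xyz = {
    "BH1": (-step, 0.0, 0.0),
    "BH2": (+step, step, 0.0),
    "BH3": (+step, 0.0, 0.0),
    "BH4": (-step, step, 0.0),
}

# Token accounting: core tokens that are not in any multi-token group
in_black_holes = sum(bh_sizes.values())
outside = n_core - in_black_holes

# Distinct coordinate values per axis
xs = sorted({p[0] for p in bh_xyz.values()})
ys = sorted({p[1] for p in bh_xyz.values()})
zs = sorted({p[2] for p in bh_xyz.values()})

# Do the four positions cover every (x, y) combination?
is_full_grid = {(p[0], p[1]) for p in bh_xyz.values()} == set(product(xs, ys))

print(f"Tokens in black holes: {in_black_holes:,} of {n_core:,} ({outside} outside)")
print(f"Distinct x: {len(xs)}, distinct y: {len(ys)}, distinct z: {len(zs)}")
print(f"Complete 2x2 grid in (x, y): {is_full_grid}")
```

On these numbers, 2,178 of the 2,179 core tokens sit in a black hole, the four positions form a complete 2x2 grid in (x, y), and z is 0 for all of them.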


## Get Representative Vectors for Each Black Hole

In [6]:
print("\nGetting representative vectors for each black hole...\n")

# For each black hole, get the first token's full centered embedding
bh_vectors_bf16 = []
bh_labels = []

for i, (coord, indices) in enumerate(black_holes, 1):
    # Get first token in this black hole
    first_idx = indices[0]
    
    # Get its full centered embedding (all 2560 dimensions)
    vector_bf16 = W_core_centered_bf16[first_idx]
    
    bh_vectors_bf16.append(vector_bf16)
    bh_labels.append(f"BH{i}")
    
    print(f"BH{i}: Token {core_token_ids[first_idx].item()} (representative)")

print(f"\n✓ Extracted {len(bh_vectors_bf16)} representative vectors")


Getting representative vectors for each black hole...

BH1: Token 80091 (representative)
BH2: Token 125 (representative)
BH3: Token 124 (representative)
BH4: Token 123939 (representative)

✓ Extracted 4 representative vectors
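
The grouping used only four values per token (r², x, y, z), so taking the first token as representative assumes that every member of a black hole shares the same full vector. A minimal check of that assumption, written against plain Python sequences: `group_is_uniform` and `first_mismatch` are hypothetical helpers, the toy rows stand in for the rows of `W_core_centered_bf16` selected by `indices`, and the code would need adapting to work on tensors directly.

```python
def group_is_uniform(rows):
    """True if every row equals the first row element-wise."""
    rows = [tuple(r) for r in rows]
    return all(r == rows[0] for r in rows)

def first_mismatch(rows):
    """Return (row, column) of the first element differing from row 0, or None."""
    rows = [tuple(r) for r in rows]
    for i, row in enumerate(rows[1:], start=1):
        for j, (a, b) in enumerate(zip(rows[0], row)):
            if a != b:
                return (i, j)
    return None

# Toy groups (made-up values): one uniform, one with a stray component
uniform_group = [[0.5, 0.0, -1.0], [0.5, 0.0, -1.0], [0.5, 0.0, -1.0]]
mixed_group   = [[0.5, 0.0, -1.0], [0.5, 0.0, -1.0], [0.5, 0.25, -1.0]]

print(group_is_uniform(uniform_group), first_mismatch(uniform_group))
print(group_is_uniform(mixed_group), first_mismatch(mixed_group))
```

If a group turns out not to be uniform, the differences computed from its first token describe that token rather than the black hole as a whole.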


## Compute Full-Dimensional Differences (BH1 vs Others)

In [7]:
print("\nComputing component-wise differences (BH1 vs others)...\n")

# BH1 is the reference (largest black hole)
bh1_vector = bh_vectors_bf16[0]

# Compute differences to BH2, BH3, BH4
diffs = {}
for i in range(1, len(bh_vectors_bf16)):
    diff = bh_vectors_bf16[i] - bh1_vector  # Component-wise difference in bfloat16
    diffs[f"BH{i+1}"] = diff

print(f"✓ Computed differences from BH1 to BH2, BH3, BH4")


Computing component-wise differences (BH1 vs others)...

✓ Computed differences from BH1 to BH2, BH3, BH4
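
Both vectors were centered with the same centroid, so the centroid cancels in exact arithmetic, but each centered value was rounded to bfloat16 before the subtraction. The toy sketch below shows that this rounding can merge two distinct raw values, in which case a count of non-zero differences on centered vectors would undercount relative to a comparison of raw values. `round_to_bf16` is a hypothetical pure-Python stand-in for bfloat16 round-to-nearest-even (NaN and overflow are not handled), and all numbers are made up.

```python
import struct

def round_to_bf16(x: float) -> float:
    """Round x to the nearest bfloat16 value (ties to even); NaN/overflow not handled."""
    (u32,) = struct.unpack("<I", struct.pack("<f", x))
    bias = 0x7FFF + ((u32 >> 16) & 1)   # standard round-to-nearest-even bias
    u32 = (u32 + bias) & 0xFFFF0000     # keep only the upper 16 bits
    (y,) = struct.unpack("<f", struct.pack("<I", u32))
    return y

# Two adjacent bfloat16 values and a made-up centroid component
a = 1.0
b = 1.0078125        # 1 + 2**-7, the next bfloat16 value after 1.0
centroid = -3.0

# Difference of the raw values
raw_diff = round_to_bf16(b - a)

# Difference after centering, rounding at every step as bfloat16 arithmetic would
a_centered = round_to_bf16(a - centroid)
b_centered = round_to_bf16(b - centroid)
centered_diff = round_to_bf16(b_centered - a_centered)

print(f"raw difference:      {raw_diff}")
print(f"centered difference: {centered_diff}")
```

In this toy case the raw values differ by 2^-7 while the centered values are identical. Whether anything similar affects the counts in this notebook depends on the size of the centroid relative to the weights, which is not printed here.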


## Analyze Non-Zero Dimensions

In [8]:
print("\nAnalyzing which dimensions are non-zero...\n")

for label, diff in diffs.items():
    # Convert to numpy bfloat16
    diff_np_bf16 = torch_bf16_to_numpy_bf16(diff)
    
    # Find non-zero dimensions
    nonzero_mask = diff_np_bf16 != ml_dtypes.bfloat16(0.0)
    nonzero_dims = np.where(nonzero_mask)[0]
    n_nonzero = len(nonzero_dims)
    
    print(f"BH1 → {label}:")
    n_dims = diff_np_bf16.size  # total embedding dimensions (2560 for this model)
    print(f"  Non-zero dimensions: {n_nonzero} / {n_dims} ({n_nonzero/n_dims*100:.2f}%)")
    
    if n_nonzero <= 20:
        print(f"  Dimensions: {list(nonzero_dims)}")
        print(f"  Values:")
        for dim in nonzero_dims:
            val = float(diff_np_bf16[dim])
            marker = " ← basis" if dim in [meridian_idx, equinox_idx, north_idx] else ""
            print(f"    dim {dim:4d}: {val:+.6e}{marker}")
    else:
        print(f"  First 10 non-zero dimensions: {list(nonzero_dims[:10])}...")
    
    print()


Analyzing which dimensions are non-zero...

BH1 → BH2:
  Non-zero dimensions: 6 / 2560 (0.23%)
  Dimensions: [np.int64(322), np.int64(1272), np.int64(1382), np.int64(1564), np.int64(2040), np.int64(2079)]
  Values:
    dim  322: +3.051758e-05 ← basis
    dim 1272: -9.536743e-07
    dim 1382: -1.490116e-08
    dim 1564: +1.525879e-05 ← basis
    dim 2040: -4.768372e-07
    dim 2079: -2.384186e-07

BH1 → BH3:
  Non-zero dimensions: 6 / 2560 (0.23%)
  Dimensions: [np.int64(282), np.int64(322), np.int64(1008), np.int64(1272), np.int64(1382), np.int64(1487)]
  Values:
    dim  282: -1.525879e-05
    dim  322: +3.051758e-05 ← basis
    dim 1008: -2.980232e-08
    dim 1272: -9.536743e-07
    dim 1382: -1.490116e-08
    dim 1487: -7.629395e-06

BH1 → BH4:
  Non-zero dimensions: 2 / 2560 (0.08%)
  Dimensions: [np.int64(216), np.int64(1564)]
  Values:
    dim  216: +7.629395e-06
    dim 1564: +1.525879e-05 ← basis
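
Every difference printed above appears to be a single power of two, which is consistent with steps of about one ULP taken at different magnitudes. The sketch below checks this against values copied from the output. Because those values are rounded to seven significant digits, the match is tested with a relative tolerance of 1e-6, which is an assumption of this sketch.

```python
import math

# Values copied from the printed output (rounded to 7 significant digits)
printed_diffs = {
    "BH2": {322: 3.051758e-05, 1272: -9.536743e-07, 1382: -1.490116e-08,
            1564: 1.525879e-05, 2040: -4.768372e-07, 2079: -2.384186e-07},
    "BH3": {282: -1.525879e-05, 322: 3.051758e-05, 1008: -2.980232e-08,
            1272: -9.536743e-07, 1382: -1.490116e-08, 1487: -7.629395e-06},
    "BH4": {216: 7.629395e-06, 1564: 1.525879e-05},
}

def nearest_power_of_two(value):
    """Return (exponent, relative error) of the power of two closest to |value|."""
    exponent = round(math.log2(abs(value)))
    rel_err = abs(abs(value) - 2.0**exponent) / 2.0**exponent
    return exponent, rel_err

exponents = {}
max_rel_err = 0.0
for label, dims in printed_diffs.items():
    for dim, value in dims.items():
        exponent, rel_err = nearest_power_of_two(value)
        exponents[(label, dim)] = exponent
        max_rel_err = max(max_rel_err, rel_err)
        sign = "-" if value < 0 else "+"
        print(f"BH1 → {label}, dim {dim:4d}: {sign}2^{exponent}  (rel. err {rel_err:.1e})")

print(f"Largest relative error: {max_rel_err:.1e}")
```

The exponents range from -15 (dimension 322) down to -26 (dimension 1382), so the separations span roughly three orders of magnitude even though each is a single power of two.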



## Summary: Are They Only Separated in 3D?

In [9]:
print("\n" + "="*60)
print("SUMMARY: DIMENSIONAL SEPARATION")
print("="*60)
print()

print(f"Basis dimensions for (x, y, z):")
print(f"  x (meridian): dimension {meridian_idx}")
print(f"  y (equinox):  dimension {equinox_idx}")
print(f"  z (north):    dimension {north_idx}")
print()

all_nonzero_counts = []
for label, diff in diffs.items():
    diff_np_bf16 = torch_bf16_to_numpy_bf16(diff)
    nonzero_mask = diff_np_bf16 != ml_dtypes.bfloat16(0.0)
    n_nonzero = np.sum(nonzero_mask)
    all_nonzero_counts.append(n_nonzero)

max_nonzero = max(all_nonzero_counts)
min_nonzero = min(all_nonzero_counts)

print(f"Non-zero dimensions across all BH pairs:")
print(f"  Min: {min_nonzero}")
print(f"  Max: {max_nonzero}")
print()

if max_nonzero <= 5:
    print("RESULT: Black holes are separated in ≤5 dimensions")
    print("  → The lattice is LOW-DIMENSIONAL (3D or similar)")
    print("  → Most of 2560D space is identical across black holes")
elif max_nonzero <= 50:
    print("RESULT: Black holes are separated in ≤50 dimensions")
    print("  → The lattice is MODERATE-DIMENSIONAL")
else:
    print("RESULT: Black holes are separated in many dimensions")
    print("  → The lattice is HIGH-DIMENSIONAL")

print()
print("="*60)


SUMMARY: DIMENSIONAL SEPARATION

Basis dimensions for (x, y, z):
  x (meridian): dimension 322
  y (equinox):  dimension 1564
  z (north):    dimension 163

Non-zero dimensions across all BH pairs:
  Min: 2
  Max: 6

RESULT: Black holes are separated in ≤50 dimensions
  → The lattice is MODERATE-DIMENSIONAL
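
The counts above are per comparison. Pooling the dimension lists printed in the non-zero analysis gives the number of distinct dimensions involved in any comparison against BH1 and shows which basis dimensions actually take part. The sets below are copied from that output; the variable names are illustrative.

```python
# Basis dimensions as printed by the notebook
basis = {"x": 322, "y": 1564, "z": 163}

# Non-zero dimensions per comparison, copied from the printed output
nonzero_dims = {
    "BH2": {322, 1272, 1382, 1564, 2040, 2079},
    "BH3": {282, 322, 1008, 1272, 1382, 1487},
    "BH4": {216, 1564},
}

# Distinct dimensions involved in at least one comparison
all_dims = set().union(*nonzero_dims.values())

# Dimensions outside the (x, y, z) basis, and basis dimensions never involved
off_basis = all_dims - set(basis.values())
unused_basis = {name for name, dim in basis.items() if dim not in all_dims}

print(f"Distinct dimensions involved: {len(all_dims)} -> {sorted(all_dims)}")
print(f"Outside the basis: {len(off_basis)} -> {sorted(off_basis)}")
print(f"Basis dimensions never involved: {sorted(unused_basis)}")
```

On these numbers, ten distinct dimensions are involved, eight of them lie outside the (x, y, z) basis, and z (dimension 163) never appears. This pools only the three comparisons against BH1; BH2, BH3 and BH4 are not compared with each other here.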

