# 1.8i: Intrinsic Dimensionality

We have 4 black holes in 2560D space. From 1.8h we know that their coordinates differ in only 10 of the 2560 dimensions, and that each of those dimensions takes at most 2 distinct values across the 4 points.

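**Question:** What is the **intrinsic dimensionality** of these 4 points, i.e. the dimension of the smallest flat (affine) subspace that contains all of them? With 4 points it can be at most 3.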

- **Rank 1:** All 4 points lie on a line (collinear)
- **Rank 2:** All 4 points lie in a plane (coplanar, flat)
- **Rank 3:** Points span a 3D subspace (form a tetrahedron with volume)

**Method:**
1. Translate the 4 points so that BH1 is at the origin
2. Stack the 3 displacement vectors (BH2-BH1, BH3-BH1, BH4-BH1) into a 3 × 2560 matrix
3. Compute the singular values of that matrix via SVD
4. Count the non-zero singular values: this rank is the intrinsic dimensionality
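
The four steps can be sketched on synthetic points where the answer is known in advance. This is a minimal illustration in NumPy rather than the notebook's torch code; the helper name `intrinsic_rank` and the toy coordinates are assumptions made for the example.

```python
import numpy as np

def intrinsic_rank(points, tol=1e-10):
    """Rank of the displacement matrix of `points` (one point per row)."""
    points = np.asarray(points, dtype=np.float64)
    displacements = points[1:] - points[0]   # step 1 + 2: first point becomes the origin
    singular_values = np.linalg.svd(displacements, compute_uv=False)  # step 3
    return int(np.sum(singular_values > tol))                        # step 4

# Four points embedded in 6D; only the first three coordinates are ever non-zero.
collinear = [[0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [2, 0, 0, 0, 0, 0], [3, 0, 0, 0, 0, 0]]
coplanar  = [[0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0]]
tetra     = [[0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]]

ranks = [intrinsic_rank(p) for p in (collinear, coplanar, tetra)]
print(ranks)  # [1, 2, 3]
```

The three toy configurations correspond to the three cases listed above: a line, a flat quadrilateral, and a tetrahedron.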

## Parameters

In [1]:
# Model to analyze
MODEL_NAME = "Qwen3-4B-Instruct-2507"

# Absolute threshold below which a singular value is treated as zero
# (fixed value; it is not scaled by the largest singular value)
RANK_TOLERANCE = 1e-10

## Imports

In [2]:
import torch
import numpy as np
from safetensors.torch import load_file
from pathlib import Path

## Device Detection

In [3]:
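# Detect available device (reported for reference only; the tensors below are
# loaded with load_file's default device and are never moved off the CPU)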
if torch.cuda.is_available():
    device = 'cuda'
elif torch.backends.mps.is_available():
    device = 'mps'
else:
    device = 'cpu'

print(f"Using device: {device}")

Using device: mps


## Load Data

In [4]:
# Load W in bfloat16
W_path = Path(f"../tensors/{MODEL_NAME}/W.safetensors")
W_bf16 = load_file(W_path)["W"]

print(f"Loaded W: {W_bf16.shape}")

Loaded W: torch.Size([151936, 2560])


In [5]:
# Load black hole data
bh_path = Path(f"../tensors/{MODEL_NAME}/1.8e_black_hole_masks.safetensors")
bh_data = load_file(bh_path)

bh1_token_ids = bh_data["bh1_token_ids"].to(torch.int64)
bh2_token_ids = bh_data["bh2_token_ids"].to(torch.int64)
bh3_token_ids = bh_data["bh3_token_ids"].to(torch.int64)
bh4_token_ids = bh_data["bh4_token_ids"].to(torch.int64)

print(f"\nLoaded black holes:")
print(f"  BH1: {len(bh1_token_ids):,} tokens")
print(f"  BH2: {len(bh2_token_ids):,} tokens")
print(f"  BH3: {len(bh3_token_ids):,} tokens")
print(f"  BH4: {len(bh4_token_ids):,} tokens")


Loaded black holes:
  BH1: 866 tokens
  BH2: 734 tokens
  BH3: 329 tokens
  BH4: 249 tokens


## Extract Black Hole Representative Vectors

In [6]:
print("\nExtracting black hole representative vectors...\n")

# Use the first token of each black hole as its representative
# (this assumes every token in a black hole has the same vector in W)
bh_token_ids = [
    bh1_token_ids[0].item(),
    bh2_token_ids[0].item(),
    bh3_token_ids[0].item(),
    bh4_token_ids[0].item()
]

# Get vectors from W
# Work in float32 for numerical stability in SVD
bh_vectors = []
for i, token_id in enumerate(bh_token_ids, 1):
    vector = W_bf16[token_id].to(torch.float32)
    bh_vectors.append(vector)
    print(f"BH{i}: Token {token_id}")

print(f"\n✓ Extracted {len(bh_vectors)} representative vectors (float32 for numerical stability)")


Extracting black hole representative vectors...

BH1: Token 80091
BH2: Token 125
BH3: Token 124
BH4: Token 123939

✓ Extracted 4 representative vectors (float32 for numerical stability)
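
Taking the first token as the representative is only safe if every token in a black hole has exactly the same row in W. A check along the following lines could confirm that before the analysis continues. The sketch runs on a small synthetic matrix, and `rows_identical` is a hypothetical helper, not something defined in the earlier notebooks.

```python
import numpy as np

def rows_identical(W, token_ids):
    """True if every selected row of W equals the first selected row exactly."""
    rows = W[np.asarray(token_ids)]
    return bool(np.all(rows == rows[0]))

# Synthetic stand-in for W: rows 0, 2 and 4 are exact copies; row 3 differs in one entry.
W_toy = np.zeros((5, 4), dtype=np.float32)
W_toy[[0, 2, 4]] = [0.5, -0.25, 0.0, 1.0]
W_toy[3] = [0.5, -0.25, 0.0, 1.0 + 2**-7]

same = rows_identical(W_toy, [0, 2, 4])       # exact duplicates
different = rows_identical(W_toy, [0, 2, 3])  # one row is slightly off

print(same, different)  # True False
```

The same comparison could presumably be written against the real tensors by indexing `W_bf16` with each `bhN_token_ids` tensor and comparing against the first row.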


## Compute Displacement Vectors

In [7]:
print("\nComputing displacement vectors...\n")

# Use BH1 as the origin
origin = bh_vectors[0]

# Compute displacements: BH2-BH1, BH3-BH1, BH4-BH1
displacements = []
for i in range(1, 4):
    disp = bh_vectors[i] - origin
    displacements.append(disp)
    
    # Compute L2 norm
    norm = torch.norm(disp).item()
    print(f"BH{i+1} - BH1: ||v|| = {norm:.6e}")

print(f"\n✓ Computed {len(displacements)} displacement vectors")


Computing displacement vectors...

BH2 - BH1: ||v|| = 3.413718e-05
BH3 - BH1: ||v|| = 3.497530e-05
BH4 - BH1: ||v|| = 1.705985e-05

✓ Computed 3 displacement vectors
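
Choosing BH1 as the origin is arbitrary. The rank of the displacement matrix is the same whichever point is used, although the individual norms and singular values generally change. The sketch below illustrates this on synthetic points; `displacement_rank`, the relative tolerance, and the random data are assumptions made for the example.

```python
import numpy as np

def displacement_rank(points, origin_index, rel_tol=1e-9):
    """Rank of the displacements of all other points from points[origin_index]."""
    others = np.delete(points, origin_index, axis=0)
    displacements = others - points[origin_index]
    s = np.linalg.svd(displacements, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(0)
points = rng.normal(size=(4, 10))   # 4 generic points in 10D: a tetrahedron

# Force the 4th point into the plane of the first three (weights sum to 1)
points_flat = points.copy()
points_flat[3] = 0.2 * points[0] + 0.3 * points[1] + 0.5 * points[2]

ranks_generic = [displacement_rank(points, i) for i in range(4)]
ranks_flat = [displacement_rank(points_flat, i) for i in range(4)]
print(ranks_generic, ranks_flat)  # [3, 3, 3, 3] [2, 2, 2, 2]
```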


## Stack into Matrix and Compute SVD

In [8]:
print("\nComputing SVD...\n")

# Stack displacement vectors into a 3 × 2560 matrix
# Each row is one displacement vector
displacement_matrix = torch.stack(displacements, dim=0)  # Shape: (3, 2560)

print(f"Displacement matrix shape: {displacement_matrix.shape}")

# Compute SVD: X = U @ diag(S) @ V^T
U, S, Vt = torch.linalg.svd(displacement_matrix, full_matrices=False)

print(f"\nSingular values:")
for i, s in enumerate(S, 1):
    print(f"  σ{i} = {s.item():.6e}")

print(f"\n✓ SVD complete")


Computing SVD...

Displacement matrix shape: torch.Size([3, 2560])

Singular values:
  σ1 = 4.627419e-05
  σ2 = 2.087784e-05
  σ3 = 1.012283e-05

✓ SVD complete
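
Because 1.8h found that the points differ in only 10 dimensions, at most 10 columns of the 3 × 2560 displacement matrix are non-zero, and zero columns contribute nothing to the singular values. The sketch below illustrates this on synthetic data, under the assumption that the inactive columns are exactly zero; the `active` index set and the random entries are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_dims = 2560
active = np.sort(rng.choice(n_dims, size=10, replace=False))  # hypothetical active dimensions

# 3 x 2560 displacement matrix that is non-zero only in the active columns
D = np.zeros((3, n_dims))
D[:, active] = rng.normal(size=(3, 10))

s_full = np.linalg.svd(D, compute_uv=False)               # SVD in the full ambient space
s_active = np.linalg.svd(D[:, active], compute_uv=False)  # SVD restricted to active columns

print(s_full)
print(s_active)
```

Both calls return the same three singular values up to rounding, so the 2560D analysis is equivalent to a 3 × 10 problem.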


## Determine Rank (Intrinsic Dimensionality)

In [9]:
print("\nDetermining rank...\n")

# Count singular values above tolerance
rank = torch.sum(S > RANK_TOLERANCE).item()

print(f"Rank (tolerance = {RANK_TOLERANCE:.0e}): {rank}")
print()

# Interpret rank
if rank == 1:
    interpretation = "COLLINEAR - All 4 points lie on a line"
    geometric_object = "1D line segment"
elif rank == 2:
    interpretation = "COPLANAR - All 4 points lie in a plane (flat)"
    geometric_object = "2D planar polygon"
elif rank == 3:
    interpretation = "VOLUMETRIC - Points span a 3D subspace"
    geometric_object = "3D tetrahedron"
else:
    interpretation = f"UNEXPECTED - Rank is {rank}"
    geometric_object = "Unknown"

print(f"Interpretation: {interpretation}")
print(f"Geometric object: {geometric_object}")
print()

print(f"✓ Intrinsic dimensionality determined")


Determining rank...

Rank (tolerance = 1e-10): 3

Interpretation: VOLUMETRIC - Points span a 3D subspace
Geometric object: 3D tetrahedron

✓ Intrinsic dimensionality determined
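
The fixed tolerance of 1e-10 does not scale with the data. A common alternative, and the default in `numpy.linalg.matrix_rank`, is the relative threshold σ1 · max(M, N) · ε. The sketch below applies that rule to the singular values printed by the SVD cell, taking ε = 2⁻²³ for float32. It is offered as a cross-check, not as the notebook's method.

```python
# Singular values copied from the SVD output in this notebook
singular_values = [4.627419e-05, 2.087784e-05, 1.012283e-05]
n_rows, n_cols = 3, 2560

FLOAT32_EPS = 2.0 ** -23  # machine epsilon for float32

# Relative threshold: sigma_max * max(M, N) * eps
threshold = max(singular_values) * max(n_rows, n_cols) * FLOAT32_EPS
relative_rank = sum(s > threshold for s in singular_values)

print(f"threshold = {threshold:.3e}")
print(f"rank      = {relative_rank}")
```

The threshold comes out near 1.4e-8, roughly three orders of magnitude below the smallest singular value, so the rank is 3 under either rule.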


## Analyze Singular Value Ratios

In [10]:
print("\nAnalyzing singular value ratios...\n")

# Compute ratios relative to largest singular value
if S[0] > 0:
    ratios = S / S[0]
    
    print(f"Singular values (normalized by σ1):")
    for i, (s, r) in enumerate(zip(S, ratios), 1):
        print(f"  σ{i} = {s.item():.6e}  (ratio: {r.item():.6e})")
    
    print()
    
    # Check if any dimensions are much smaller (near-degenerate)
    if rank > 1:
        ratio_2_to_1 = ratios[1].item()
        print(f"σ2/σ1 = {ratio_2_to_1:.6f}")
        
        if ratio_2_to_1 < 0.1:
            print("  → Second dimension is much smaller (nearly collinear)")
        elif ratio_2_to_1 > 0.9:
            print("  → Dimensions are roughly equal (isotropic)")
        else:
            print(f"  → Dimensions differ by factor of ~{1/ratio_2_to_1:.1f}x")
    
    if rank > 2:
        ratio_3_to_1 = ratios[2].item()
        print(f"\nσ3/σ1 = {ratio_3_to_1:.6f}")
        
        if ratio_3_to_1 < 0.1:
            print("  → Third dimension is much smaller (nearly planar)")
        else:
            print(f"  → Third dimension is significant (~{ratio_3_to_1*100:.1f}% of first)")
else:
    print("WARNING: Largest singular value is zero (all points identical?)")

print()
print("✓ Ratio analysis complete")


Analyzing singular value ratios...

Singular values (normalized by σ1):
  σ1 = 4.627419e-05  (ratio: 1.000000e+00)
  σ2 = 2.087784e-05  (ratio: 4.511768e-01)
  σ3 = 1.012283e-05  (ratio: 2.187576e-01)

σ2/σ1 = 0.451177
  → Dimensions differ by factor of ~2.2x

σ3/σ1 = 0.218758
  → Third dimension is significant (~21.9% of first)

✓ Ratio analysis complete
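
The singular values also fix the volume of the tetrahedron. For three displacement vectors stacked as the rows of D, the parallelepiped they span has volume sqrt(det(D Dᵀ)) = σ1 · σ2 · σ3, and the tetrahedron is one sixth of that. The sketch below checks the identity on synthetic vectors; `tetrahedron_volume` and the test data are illustrative.

```python
import numpy as np

def tetrahedron_volume(displacements):
    """Volume of the tetrahedron spanned by 3 displacement vectors (rows), in any ambient dimension."""
    s = np.linalg.svd(np.asarray(displacements, dtype=np.float64), compute_uv=False)
    return float(np.prod(s)) / 6.0

# Unit right-angled tetrahedron embedded in 5D: volume is 1/6
D_unit = np.eye(3, 5)
vol_unit = tetrahedron_volume(D_unit)

# Cross-check against the Gram determinant on random vectors
rng = np.random.default_rng(2)
D_rand = rng.normal(size=(3, 5))
vol_svd = tetrahedron_volume(D_rand)
vol_gram = float(np.sqrt(np.linalg.det(D_rand @ D_rand.T))) / 6.0

print(vol_unit, vol_svd, vol_gram)
```

A volume of exactly zero corresponds to rank 2 or lower, so a non-zero σ3 and a non-zero volume are the same statement.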


## Summary

In [11]:
print("\n" + "=" * 80)
print("SUMMARY: INTRINSIC DIMENSIONALITY OF 4 BLACK HOLES")
print("=" * 80)
print()

print(f"Points analyzed: 4 black holes")
print(f"Ambient space: 2560 dimensions")
print(f"Active dimensions (from 1.8h): 10")
print()

print(f"Intrinsic dimensionality (rank): {rank}")
print(f"Geometric interpretation: {interpretation}")
print(f"Geometric object: {geometric_object}")
print()

print("Singular values:")
for i, s in enumerate(S, 1):
    status = "✓ non-zero" if s > RANK_TOLERANCE else "✗ zero (within tolerance)"
    print(f"  σ{i} = {s.item():.6e}  {status}")

print()

if rank == 1:
    print("The 4 black holes are COLLINEAR:")
    print("  → They lie on a single line in 2560D space")
    print("  → Maximum degeneracy: 1D structure")
elif rank == 2:
    print("The 4 black holes are COPLANAR but not collinear:")
    print("  → They lie in a 2D plane embedded in 2560D space")
    print("  → The 4th point is NOT out-of-plane")
    print("  → The structure is FLAT (no volume)")
    print("  → They form a planar quadrilateral")
elif rank == 3:
    print("The 4 black holes are NOT coplanar:")
    print("  → They span a 3D subspace in 2560D space")
    print("  → The 4th point is out-of-plane")
    print("  → The structure has VOLUME")
    print("  → They form a tetrahedron")

print()
print("This rank is the TRUE intrinsic dimensionality, independent of")
print("coordinate system or which dimensions happen to be active.")
print()
print("=" * 80)


SUMMARY: INTRINSIC DIMENSIONALITY OF 4 BLACK HOLES

Points analyzed: 4 black holes
Ambient space: 2560 dimensions
Active dimensions (from 1.8h): 10

Intrinsic dimensionality (rank): 3
Geometric interpretation: VOLUMETRIC - Points span a 3D subspace
Geometric object: 3D tetrahedron

Singular values:
  σ1 = 4.627419e-05  ✓ non-zero
  σ2 = 2.087784e-05  ✓ non-zero
  σ3 = 1.012283e-05  ✓ non-zero

The 4 black holes are NOT coplanar:
  → They span a 3D subspace in 2560D space
  → The 4th point is out-of-plane
  → The structure has VOLUME
  → They form a tetrahedron

This rank is the TRUE intrinsic dimensionality, independent of
coordinate system or which dimensions happen to be active.
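
The coordinate-independence claim can be illustrated directly: rotating or reflecting the coordinate system (multiplying by an orthogonal matrix) leaves the singular values, and therefore the rank, unchanged. The sketch below uses a small synthetic displacement matrix and a random orthogonal matrix obtained from a QR decomposition; all data here is made up for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

# Three displacement vectors in 8D (synthetic stand-in for the 3 x 2560 matrix)
D = rng.normal(size=(3, 8))

# Random orthogonal matrix Q: an arbitrary rotation/reflection of the coordinate axes
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
D_rotated = D @ Q

s_original = np.linalg.svd(D, compute_uv=False)
s_rotated = np.linalg.svd(D_rotated, compute_uv=False)

print(s_original)
print(s_rotated)
```

The rank is preserved under any invertible linear change of coordinates, whereas the singular values themselves are preserved only under orthogonal ones.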

