# 06.1d: Compute Black Hole Distances

**Goal:** Compute and save distances from the degenerate singularity to all tokens.

This is a **generator** notebook: run it once, save the results, and load them in downstream analysis.

This notebook:
- Computes the Euclidean distance from the black hole reference token to each of the 151,936 tokens
- Saves the distances to safetensors for fast loading in 06.1e
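
As a sketch of the quantity involved, write $\gamma'_i$ for row $i$ of the centered embedding matrix, $d$ for the embedding dimension, and $r$ for the reference token ID. The symbols $r$, $d_i$, and $N$ are labels assumed here for illustration, not notation taken from the notebook:

$$
d_i = \lVert \gamma'_i - \gamma'_r \rVert_2 = \sqrt{\sum_{k=1}^{d} \left(\gamma'_{i,k} - \gamma'_{r,k}\right)^2}, \qquad i = 0, \dots, N - 1
$$

By construction $d_r = 0$, so the minimum over all tokens is zero.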

Part of Volume 6: Pathologies and Singularities.

## Parameters

In [1]:
TENSOR_DIR = "../data/tensors"
OUTPUT_FILE = "black_hole_distances_qwen3_4b_instruct_2507.safetensors"

# Reference token (center of black hole)
BLACK_HOLE_TOKEN_ID = 149415  # Hebrew 'נּ'

## Imports

In [2]:
import torch
from safetensors.torch import load_file, save_file
from pathlib import Path

print("Imports loaded successfully.")

Imports loaded successfully.


## Step 1: Load Centered Embeddings

In [3]:
gamma_prime_path = Path(TENSOR_DIR) / "gamma_centered_qwen3_4b_instruct_2507.safetensors"
gamma_prime = load_file(gamma_prime_path)['gamma_centered']

N, d = gamma_prime.shape

print("Loaded γ' (centered):")
print(f"  Tokens: {N:,}")
print(f"  Dimensions: {d:,}")

Loaded γ' (centered):
  Tokens: 151,936
  Dimensions: 2,560


## Step 2: Compute Distances from Black Hole

In [4]:
print(f"\nComputing distances from black hole token {BLACK_HOLE_TOKEN_ID}...")

# Get reference embedding
reference = gamma_prime[BLACK_HOLE_TOKEN_ID]

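# Compute Euclidean distances to all tokens.
# Broadcasting subtracts the (d,) reference vector from every row of the (N, d) matrix.
# The reference token is included, so the minimum distance is exactly 0.
# torch.linalg.vector_norm replaces the deprecated torch.norm.
distances = torch.linalg.vector_norm(gamma_prime - reference, dim=1)

print("\nDistance statistics:")
print(f"  Min: {distances.min().item():.12f}")
print(f"  Max: {distances.max().item():.12f}")
print(f"  Mean: {distances.mean().item():.12f}")
print(f"  Median: {distances.median().item():.12f}")
print("\nComputation complete!")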


Computing distances from black hole token 149415...

Distance statistics:
  Min: 0.000000000000
  Max: 1.571872115135
  Mean: 1.050221085548
  Median: 1.088246703148

Computation complete!
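
The broadcasting pattern used above can be checked on a toy tensor. The following is a minimal sketch, not part of the original notebook: the names `toy`, `ref_id`, `toy_distances`, `cdist_distances`, and `zero_ids` are hypothetical, and the 4×3 matrix stands in for the real 151,936 × 2,560 embedding matrix. It cross-checks the result against `torch.cdist` and lists the rows at exactly zero distance from the reference.

```python
import torch

# Toy stand-in for the centered embedding matrix (4 tokens, 3 dimensions).
toy = torch.tensor([
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [0.0, 0.0, 2.0],
    [3.0, 4.0, 0.0],  # exact duplicate of row 1
])
ref_id = 1  # hypothetical reference row for this toy example

# Same pattern as the notebook cell: (N, d) - (d,) broadcasts to (N, d),
# then the norm is taken along the embedding dimension.
toy_distances = torch.linalg.vector_norm(toy - toy[ref_id], dim=1)

# Independent cross-check: torch.cdist needs 2-D inputs, so the reference
# becomes a (1, d) matrix and the (N, 1) result is squeezed back to (N,).
cdist_distances = torch.cdist(toy, toy[ref_id].unsqueeze(0)).squeeze(1)

# Rows that coincide exactly with the reference (always includes ref_id itself).
zero_ids = torch.nonzero(toy_distances == 0).flatten().tolist()

print(toy_distances.tolist())
print(zero_ids)
```

Applied to the real `distances` tensor, the same `torch.nonzero(distances == 0)` call would list every token whose embedding is identical to the reference token's.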


## Step 3: Save to Disk

In [5]:
output_path = Path(TENSOR_DIR) / OUTPUT_FILE

print(f"\nSaving distances to: {output_path}")

save_file(
    {
        'distances': distances,
        # safetensors stores tensors only, so the ID is wrapped in a 0-dim tensor
        'reference_token_id': torch.tensor(BLACK_HOLE_TOKEN_ID),
    },
    output_path
)

file_size_mb = output_path.stat().st_size / (1024 * 1024)

print("\nSaved successfully!")
print(f"  File size: {file_size_mb:.2f} MB")
print(f"  Distances shape: {distances.shape}")
print(f"  Reference token: {BLACK_HOLE_TOKEN_ID}")


Saving distances to: ../data/tensors/black_hole_distances_qwen3_4b_instruct_2507.safetensors

Saved successfully!
  File size: 0.58 MB
  Distances shape: torch.Size([151936])
  Reference token: 149415
