# 04.1b: Compute Spherical Coordinates

**Goal:** Convert all tokens to spherical coordinates for a chosen coordinate system.

Given three orthogonal basis vectors (e.g., PC1, PC2, PC3), compute:
- **r**: radial distance from the origin within the 3D subspace spanned by the chosen basis (gamma units)
- **phi**: longitude (-180° to +180°)
- **theta**: latitude (-90° to +90°, original)
- **theta_flat**: CDF-flattened latitude (-90° to +90°, visualization)
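
For reference, the four quantities can be written out as follows. Here $x$, $y$, $z$ denote a token's projections onto the meridian, equinox, and north vectors (the assignment used in Step 3), and $k$ is assumed to be the zero-based rank of the token's latitude among all $N$ tokens, matching the evenly spaced CDF values used in Step 5:

$$
\begin{aligned}
r &= \sqrt{x^2 + y^2 + z^2} \\
\phi &= \operatorname{atan2}(y,\, x) \\
\theta &= \arcsin\!\left(\frac{z}{r}\right) \\
\theta_{\text{flat}} &= 90^\circ \left( \frac{2k}{N-1} - 1 \right)
\end{aligned}
$$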

Save to `spherical_coords_pc{N}_pc{M}_pc{E}.safetensors` for reuse across notebooks.

This is a **generator** notebook: run it once per coordinate system, then load the saved data from other notebooks.

## Parameters

In [1]:
TENSOR_DIR = "../data/tensors"

# Coordinate system: choose three distinct PCs (principal components are mutually orthogonal)
NORTH_PC = 1      # North pole direction (z axis)
MERIDIAN_PC = 2   # Prime meridian direction (x axis)
EQUINOX_PC = 3    # Equinox direction (y axis); handedness depends on eigenvector signs

## Imports

In [2]:
import torch
import numpy as np
from safetensors.torch import save_file, load_file
from pathlib import Path

print("Imports loaded successfully.")

Imports loaded successfully.


## Step 1: Load γ' (Centered Embeddings)

In [3]:
gamma_prime_path = Path(TENSOR_DIR) / "gamma_centered_qwen3_4b_instruct_2507.safetensors"
gamma_prime = load_file(gamma_prime_path)['gamma_centered']

N, d = gamma_prime.shape

print(f"Loaded γ' (centered):")
print(f"  Tokens: {N:,}")
print(f"  Dimensions: {d:,}")

Loaded γ' (centered):
  Tokens: 151,936
  Dimensions: 2,560


## Step 2: Compute PCA

In [4]:
print("\nComputing covariance matrix...")
Cov = (gamma_prime.T @ gamma_prime) / (N - 1)

print("Computing eigendecomposition...")
eigenvalues, eigenvectors = torch.linalg.eigh(Cov)

# eigh returns eigenvalues in ascending order; reverse so that index 0 is PC1
eigenvalues = eigenvalues.flip(0)
eigenvectors = eigenvectors.flip(1)

# Extract basis vectors
north_vector = eigenvectors[:, NORTH_PC - 1]
meridian_vector = eigenvectors[:, MERIDIAN_PC - 1]
equinox_vector = eigenvectors[:, EQUINOX_PC - 1]

print(f"\nPCA computed. Using coordinate system:")
print(f"  North: PC{NORTH_PC}")
print(f"  Meridian: PC{MERIDIAN_PC}")
print(f"  Equinox: PC{EQUINOX_PC}")


Computing covariance matrix...


Computing eigendecomposition...



PCA computed. Using coordinate system:
  North: PC1
  Meridian: PC2
  Equinox: PC3
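
The eigendecomposition step can be sanity-checked on a small synthetic dataset. The sketch below uses NumPy rather than torch to stay lightweight, and the data, shapes, and variable names are illustrative assumptions, not the notebook's tensors. It confirms that reversing the output of `eigh` gives descending eigenvalues and that the selected columns are orthonormal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the centered embeddings: 500 points in 6 dimensions
# with a different spread along each axis (purely illustrative).
data = rng.normal(size=(500, 6)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.1])
data = data - data.mean(axis=0)

# Covariance and symmetric eigendecomposition (eigenvalues come back ascending).
cov = (data.T @ data) / (data.shape[0] - 1)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Reverse so that column 0 is PC1, column 1 is PC2, and so on.
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]

# Pick PC1, PC2, PC3 and check that they form an orthonormal set.
basis = eigenvectors[:, [0, 1, 2]]
gram = basis.T @ basis
is_descending = bool(np.all(np.diff(eigenvalues) <= 0))
is_orthonormal = bool(np.allclose(gram, np.eye(3), atol=1e-8))
print(is_descending, is_orthonormal)
```

The same two checks could be run on the real `eigenvalues` and `eigenvectors` before the basis vectors are used.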


## Step 3: Compute Cartesian Coordinates in Chosen Basis

In [5]:
print("\nProjecting tokens onto basis vectors...")

x = gamma_prime @ meridian_vector
y = gamma_prime @ equinox_vector
z = gamma_prime @ north_vector

print(f"  x (meridian): [{x.min().item():.6f}, {x.max().item():.6f}]")
print(f"  y (equinox):  [{y.min().item():.6f}, {y.max().item():.6f}]")
print(f"  z (north):    [{z.min().item():.6f}, {z.max().item():.6f}]")


Projecting tokens onto basis vectors...
  x (meridian): [-0.285209, 0.206568]
  y (equinox):  [-0.218484, 0.166067]
  z (north):    [-1.064268, 0.466445]
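
Two properties of this projection are worth keeping in mind when reusing the saved coordinates. The radius computed from `x`, `y`, `z` is the length of the projection onto three directions only, so it cannot exceed the token's full norm. Eigenvectors are also defined only up to sign, so a sign flip in one basis vector mirrors the corresponding coordinate. The sketch below illustrates both on random data; the points and the QR-derived basis are hypothetical stand-ins, not the notebook's PCs.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins: 4 points in 5 dimensions and an orthonormal
# 3-vector basis taken from a reduced QR decomposition.
points = rng.normal(size=(4, 5))
q, _ = np.linalg.qr(rng.normal(size=(5, 3)))
meridian, equinox, north = q[:, 0], q[:, 1], q[:, 2]

x, y, z = points @ meridian, points @ equinox, points @ north
r = np.sqrt(x**2 + y**2 + z**2)
full_norm = np.linalg.norm(points, axis=1)

# r is the length of the projection onto the 3D subspace, so it can never
# exceed the full norm of the point (small tolerance for rounding).
r_within_full_norm = bool(np.all(r <= full_norm + 1e-12))

# Flipping the sign of the equinox vector negates y, which mirrors longitude.
phi = np.degrees(np.arctan2(y, x))
phi_flipped = np.degrees(np.arctan2(points @ (-equinox), x))
longitude_mirrored = bool(np.allclose(phi_flipped, -phi))

print(r_within_full_norm, longitude_mirrored)
```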


## Step 4: Convert to Spherical Coordinates

In [6]:
print("\nConverting to spherical coordinates...\n")

# Radial distance
r = torch.sqrt(x**2 + y**2 + z**2)

# Longitude: atan2(y, x)
phi_rad = torch.atan2(y, x)
phi_deg = torch.rad2deg(phi_rad)

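# Latitude: asin(z/r). Clamping r away from zero avoids 0/0 for a token at the origin;
# the outer clamp keeps rounding error from pushing z/r outside asin's domain.
theta_rad = torch.asin(torch.clamp(z / r.clamp(min=torch.finfo(r.dtype).tiny), -1, 1))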
theta_deg = torch.rad2deg(theta_rad)

print(f"Spherical coordinates computed:")
print(f"  r (radius):      [{r.min().item():.6f}, {r.max().item():.6f}] gamma units")
print(f"  phi (longitude): [{phi_deg.min().item():.2f}°, {phi_deg.max().item():.2f}°]")
print(f"  theta (latitude): [{theta_deg.min().item():.2f}°, {theta_deg.max().item():.2f}°]")


Converting to spherical coordinates...

Spherical coordinates computed:
  r (radius):      [0.002472, 1.095257] gamma units
  phi (longitude): [-179.99°, 180.00°]
  theta (latitude): [-89.92°, 89.68°]
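
To make the convention concrete, here is a minimal scalar version of the same conversion using only the standard library. The function name `to_spherical` and the choice to return zeros at the origin are assumptions made for this sketch; the notebook itself works on whole tensors.

```python
import math

def to_spherical(x, y, z):
    """Return (r, longitude_deg, latitude_deg) for one Cartesian point."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        # Angles are undefined at the origin; returning zeros is an arbitrary choice.
        return 0.0, 0.0, 0.0
    phi = math.degrees(math.atan2(y, x))
    # The clamp guards against z / r drifting just outside [-1, 1] from rounding.
    theta = math.degrees(math.asin(max(-1.0, min(1.0, z / r))))
    return r, phi, theta

print(to_spherical(1.0, 0.0, 0.0))  # a point on the prime meridian, at the equator
print(to_spherical(0.0, 0.0, 2.0))  # a point at the north pole, radius 2
```

Under this convention a token lying purely along the meridian vector sits at longitude 0° and latitude 0°, and a token lying purely along the north vector sits at latitude +90°.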


## Step 5: Compute CDF-Flattened Latitude

In [7]:
print("\nComputing CDF-flattened latitude...\n")

# Sort latitudes
theta_sorted, sort_indices = torch.sort(theta_deg)

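# Empirical CDF by rank: the k-th smallest latitude (k = 0..N-1) maps to k / (N - 1).
# Tied latitudes still receive distinct values, in the order torch.sort returns them.
cdf_values = torch.linspace(0, 1, N, dtype=theta_deg.dtype)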

# Map to uniform [-90°, +90°]
theta_flat_sorted = 90.0 * (2.0 * cdf_values - 1.0)

# Unsort to get back to original token order
theta_flat = torch.zeros_like(theta_deg)
theta_flat[sort_indices] = theta_flat_sorted

print(f"CDF-flattened latitude:")
print(f"  theta_flat: [{theta_flat.min().item():.2f}°, {theta_flat.max().item():.2f}°]")
print(f"  (Preserves rank order, spreads to full range for visualization)")


Computing CDF-flattened latitude...

CDF-flattened latitude:
  theta_flat: [-90.00°, 90.00°]
  (Preserves rank order, spreads to full range for visualization)
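
The flattening is a rank transform: only the ordering of the latitudes matters, not their values. A small NumPy sketch of the same idea follows; `flatten_latitude` is a hypothetical helper name and the five input latitudes are made up for illustration.

```python
import numpy as np

def flatten_latitude(theta_deg):
    """Map latitudes to evenly spaced values in [-90, 90] by rank."""
    theta_deg = np.asarray(theta_deg, dtype=float)
    n = theta_deg.size
    order = np.argsort(theta_deg, kind="stable")
    flat = np.empty(n)
    # The k-th smallest latitude receives the k-th evenly spaced value.
    flat[order] = np.linspace(-90.0, 90.0, n)
    return flat

theta = np.array([10.0, -5.0, 80.0, 12.0, 11.0])
theta_flat = flatten_latitude(theta)
print(theta_flat)  # values: -45, -90, 90, 45, 0
```

The three clustered inputs (10°, 11°, 12°) end up 45° apart, which is the intended effect for visualization. The cost is that `theta_flat` no longer carries angular meaning, so the original latitude should be used for any geometric calculation.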


## Step 6: Save to Disk

In [8]:
output_filename = f"spherical_coords_pc{NORTH_PC}_pc{MERIDIAN_PC}_pc{EQUINOX_PC}.safetensors"
output_path = Path(TENSOR_DIR) / output_filename

print(f"\nSaving to: {output_path}\n")

save_file(
    {
        'r': r,
        'phi_deg': phi_deg,
        'theta_deg': theta_deg,
        'theta_flat': theta_flat,
    },
    str(output_path)
)

print(f"Saved spherical coordinates for {N:,} tokens.")
print(f"\nCoordinate system: North=PC{NORTH_PC}, Meridian=PC{MERIDIAN_PC}, Equinox=PC{EQUINOX_PC}")
print(f"\nTensors saved:")
print(f"  r:          radial distance (gamma units)")
print(f"  phi_deg:    longitude (-180° to +180°)")
print(f"  theta_deg:  latitude (-90° to +90°, original)")
print(f"  theta_flat: latitude (-90° to +90°, CDF-flattened)")


Saving to: ../data/tensors/spherical_coords_pc1_pc2_pc3.safetensors

Saved spherical coordinates for 151,936 tokens.

Coordinate system: North=PC1, Meridian=PC2, Equinox=PC3

Tensors saved:
  r:          radial distance (gamma units)
  phi_deg:    longitude (-180° to +180°)
  theta_deg:  latitude (-90° to +90°, original)
  theta_flat: latitude (-90° to +90°, CDF-flattened)


## Summary

Converted all tokens to spherical coordinates in the chosen basis.

**To use different coordinate systems:**
1. Change `NORTH_PC`, `MERIDIAN_PC`, `EQUINOX_PC` parameters
2. Rerun this notebook
3. A new tensor file is saved under a filename that encodes the three PC indices
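
When switching coordinate systems often, it may help to validate the three indices and derive the filename in one place. The helper below is a hypothetical sketch, not part of the notebook. It assumes PC indices are 1-based and bounded by the embedding dimension (2,560 for the embeddings loaded in Step 1).

```python
from pathlib import Path

def coords_path(tensor_dir, north_pc, meridian_pc, equinox_pc, n_dims=2560):
    """Build the output path for a coordinate system (hypothetical helper)."""
    pcs = (north_pc, meridian_pc, equinox_pc)
    if len(set(pcs)) != 3:
        raise ValueError(f"PC indices must be distinct, got {pcs}")
    if not all(1 <= pc <= n_dims for pc in pcs):
        raise ValueError(f"PC indices must lie in 1..{n_dims}, got {pcs}")
    name = f"spherical_coords_pc{north_pc}_pc{meridian_pc}_pc{equinox_pc}.safetensors"
    return Path(tensor_dir) / name

print(coords_path("../data/tensors", 1, 2, 3))
```

Reusing one PC for two axes would make two of the projections identical, so the helper rejects that case before any computation is run.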

**Next steps:**
- Use 04.1c to visualize this coordinate system interactively
- Use 05.2a to inspect individual tokens by ID