# Exponent Closeness — Sanity Check

The previous notebook reported that all 13 black holes share the same bfloat16 exponent in every one of the 2,560 dimensions. That is suspiciously clean, so let's verify it isn't a bug.

---

In [1]:
import torch
from safetensors.torch import load_file

In [2]:
tensors = load_file("../tensors/qwen3_4b_instruct_2507.safetensors")
centroids = tensors["black_hole_centroids"]  # [13, 2560] bf16

print(f"Shape: {centroids.shape}")
print(f"Dtype: {centroids.dtype}")

Shape: torch.Size([13, 2560])
Dtype: torch.bfloat16


## Check 1: Are the centroids actually different vectors?

Maybe we accidentally saved the same vector 13 times?

In [3]:
# Check pairwise equality
print("Are the 13 centroids distinct vectors?\n")

for i in range(13):
    for j in range(i+1, 13):
        same = torch.equal(centroids[i], centroids[j])
        if same:
            print(f"  BH{i+1} == BH{j+1}  ← DUPLICATE!")

# Count unique
unique_centroids = torch.unique(centroids, dim=0)
print(f"Unique centroid vectors: {unique_centroids.shape[0]} (should be 13)")

Are the 13 centroids distinct vectors?

Unique centroid vectors: 13 (should be 13)


## Check 2: What do the actual values look like?

Pick a few dimensions and show the raw values across all 13 black holes.

In [4]:
# Look at first 5 dimensions
print("First 5 dimensions, all 13 black holes:\n")
print("       dim0       dim1       dim2       dim3       dim4")
for i in range(13):
    vals = centroids[i, :5].float()
    print(f"BH{i+1:02d}  {vals[0]:+.6f}  {vals[1]:+.6f}  {vals[2]:+.6f}  {vals[3]:+.6f}  {vals[4]:+.6f}")

First 5 dimensions, all 13 black holes:

       dim0       dim1       dim2       dim3       dim4
BH01  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH02  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH03  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH04  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH05  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH06  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH07  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH08  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH09  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH10  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH11  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH12  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
BH13  +0.006073  +0.013245  +0.011780  +0.036621  +0.015869
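
These five dimensions are identical across all 13 rows, yet Check 1 found the rows distinct, so the differences must sit in other dimensions. A minimal sketch of how one might locate them, in plain Python on made-up toy data (the helper name `differing_dims` and the `toy` values are illustrative assumptions, not part of the notebook):

```python
def differing_dims(rows):
    """Return the column indices where the rows do not all hold the same value."""
    n_cols = len(rows[0])
    return [d for d in range(n_cols) if len({row[d] for row in rows}) > 1]

# Toy stand-in for the centroids: 3 rows, 6 dims, values invented for illustration.
toy = [
    [0.5, 0.25, 0.125, 1.0, 2.0, 4.0],
    [0.5, 0.25, 0.125, 1.0, 2.5, 4.0],
    [0.5, 0.25, 0.125, 1.0, 2.0, 4.0],
]

diff = differing_dims(toy)
print(f"Dims that differ: {diff} ({len(diff)} of {len(toy[0])})")
```

Running the same idea over the real centroid rows would show how many of the 2,560 dimensions actually carry the differences between black holes.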


## Check 3: Exponent extraction — is our bit manipulation correct?

In [5]:
def extract_bf16_exponent(tensor_bf16):
    """Extract the 8-bit exponent from each bfloat16 value."""
    bits = tensor_bf16.view(torch.uint16).to(torch.int32)
    exponent = (bits >> 7) & 0xFF
    return exponent

# Test on known values
test_vals = torch.tensor([1.0, 2.0, 0.5, 0.25, 0.125, 4.0, 8.0, 0.0625], dtype=torch.bfloat16)
test_exps = extract_bf16_exponent(test_vals)

print("Exponent extraction test:")
print("  Value    | Exponent (raw)")
print("  ---------|----------------")
for v, e in zip(test_vals.tolist(), test_exps.tolist()):
    # For value 2^n, exponent should be 127 + n (checked in the next cell)
    print(f"  {v:8.4f} | {e:14d}")

Exponent extraction test:
  Value    | Exponent (raw)
  ---------|----------------
    1.0000 |            127
    2.0000 |            128
    0.5000 |            126
    0.2500 |            125
    0.1250 |            124
    4.0000 |            129
    8.0000 |            130
    0.0625 |            123


In [6]:
# 1.0 = 2^0 → exponent = 127
# 2.0 = 2^1 → exponent = 128
# 0.5 = 2^-1 → exponent = 126
# etc.

expected = [127, 128, 126, 125, 124, 129, 130, 123]
actual = test_exps.tolist()

print("\nVerification:")
for v, e, exp in zip(test_vals.tolist(), actual, expected):
    match = "✓" if e == exp else "✗"
    print(f"  {v}: got {e}, expected {exp} {match}")


Verification:
  1.0: got 127, expected 127 ✓
  2.0: got 128, expected 128 ✓
  0.5: got 126, expected 126 ✓
  0.25: got 125, expected 125 ✓
  0.125: got 124, expected 124 ✓
  4.0: got 129, expected 129 ✓
  8.0: got 130, expected 130 ✓
  0.0625: got 123, expected 123 ✓
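
As an independent cross-check that does not go through `torch` at all, here is a small sketch using only the standard library. It relies on the fact that a bfloat16 value is the top 16 bits of the corresponding float32 encoding, and compares the bit-shift result against `math.frexp`. The helper names are illustrative, and the test values are chosen to be exactly representable in bfloat16 so that truncation loses nothing:

```python
import math
import struct

def bf16_bits(x):
    """Top 16 bits of the float32 encoding of x (exact if x is representable in bfloat16)."""
    (f32,) = struct.unpack(">I", struct.pack(">f", x))
    return f32 >> 16

def bf16_exponent(x):
    """Raw 8-bit exponent field, same shift-and-mask as extract_bf16_exponent."""
    return (bf16_bits(x) >> 7) & 0xFF

def frexp_exponent(x):
    """Biased exponent via math.frexp: |x| = m * 2^e with 0.5 <= m < 1 (normal values only)."""
    return math.frexp(abs(x))[1] - 1 + 127

# Includes a negative value and a non-power-of-two to exercise sign and mantissa bits.
for v in [1.0, 2.0, 0.5, 0.0625, -0.5, 3.0]:
    print(f"{v:8.4f}: bits={bf16_exponent(v)}, frexp={frexp_exponent(v)}")
```

The two routes should agree on every value, including the negative one, which confirms the mask drops the sign bit.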


## Check 4: What's the exponent distribution across centroids?

In [7]:
exponents = extract_bf16_exponent(centroids)  # [13, 2560]

print(f"Exponent stats across all 13×2560 = {13*2560:,} values:")
print(f"  Min: {exponents.min().item()}")
print(f"  Max: {exponents.max().item()}")
print(f"  Unique exponents: {torch.unique(exponents).tolist()}")

Exponent stats across all 13×2560 = 33,280 values:
  Min: 107
  Max: 123
  Unique exponents: [107, 108, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123]


In [8]:
# For each dimension, what's the exponent? (should be same across all 13 BHs)
# Let's verify by checking variance per dimension
exp_per_dim = exponents.float()  # [13, 2560]
variance_per_dim = exp_per_dim.var(dim=0)  # [2560]

print(f"Variance of exponent across 13 BHs, per dimension:")
print(f"  Min variance: {variance_per_dim.min().item()}")
print(f"  Max variance: {variance_per_dim.max().item()}")
print(f"  Dimensions with zero variance: {(variance_per_dim == 0).sum().item()} / 2560")

Variance of exponent across 13 BHs, per dimension:
  Min variance: 0.0
  Max variance: 0.0
  Dimensions with zero variance: 2560 / 2560
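
Zero exponent variance means that, in each dimension, all 13 values fall in the same power-of-two band. It does not by itself mean the values are equal: two bfloat16 numbers with the same exponent can still differ in their sign bit or in their 7 mantissa bits. A small standard-library sketch of that decomposition (the helper `bf16_fields` and the sample values are illustrative assumptions, picked to be exactly representable in bfloat16):

```python
import struct

def bf16_fields(x):
    """Split the bfloat16 encoding of x into (sign, exponent, mantissa)."""
    (f32,) = struct.unpack(">I", struct.pack(">f", x))
    bits = f32 >> 16               # bfloat16 = top 16 bits of float32
    sign = bits >> 15              # 1 bit
    exponent = (bits >> 7) & 0xFF  # 8 bits
    mantissa = bits & 0x7F         # 7 bits
    return sign, exponent, mantissa

a = 0.0078125    # 2^-7
b = 0.01171875   # 1.5 * 2^-7, same exponent, different mantissa
c = -0.0078125   # same exponent and mantissa as a, different sign

for name, v in [("a", a), ("b", b), ("c", c)]:
    print(name, v, bf16_fields(v))
```

All three values share exponent 120 even though none of them are equal, which is why the exponent check alone cannot replace the distinctness check in Check 1.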


## Check 5: Maybe they're just VERY close in magnitude?

Let's look at the actual magnitude range of the centroid values.

In [9]:
# Get absolute values
abs_centroids = centroids.float().abs()

print("Magnitude statistics:")
print(f"  Overall min |value|: {abs_centroids.min().item():.6f}")
print(f"  Overall max |value|: {abs_centroids.max().item():.6f}")
print(f"  Overall mean |value|: {abs_centroids.mean().item():.6f}")

# Per black hole
print("\nPer black hole:")
for i in range(13):
    row = abs_centroids[i]
    print(f"  BH{i+1:02d}: min={row.min().item():.6f}, max={row.max().item():.6f}, mean={row.mean().item():.6f}")

Magnitude statistics:
  Overall min |value|: 0.000001
  Overall max |value|: 0.080078
  Overall mean |value|: 0.004804

Per black hole:
  BH01: min=0.000001, max=0.080078, mean=0.004804
  BH02: min=0.000001, max=0.080078, mean=0.004804
  BH03: min=0.000001, max=0.080078, mean=0.004804
  BH04: min=0.000001, max=0.080078, mean=0.004804
  BH05: min=0.000001, max=0.080078, mean=0.004804
  BH06: min=0.000001, max=0.080078, mean=0.004804
  BH07: min=0.000001, max=0.080078, mean=0.004804
  BH08: min=0.000001, max=0.080078, mean=0.004804
  BH09: min=0.000001, max=0.080078, mean=0.004804
  BH10: min=0.000001, max=0.080078, mean=0.004804
  BH11: min=0.000001, max=0.080078, mean=0.004804
  BH12: min=0.000001, max=0.080078, mean=0.004804
  BH13: min=0.000001, max=0.080078, mean=0.004804


In [10]:
# What's the range of magnitudes that would fit in a single exponent?
# For exponent E, the range is [2^(E-127), 2^(E-126))
# So a single exponent spans a factor of 2.

# If all values are between, say, 0.001 and 0.01, that's only one order of magnitude
# which could fit in ~3-4 exponent values.

# Let's see the actual exponent histogram
from collections import Counter

all_exps = exponents.flatten().tolist()
exp_counts = Counter(all_exps)

print("Exponent histogram (across all 13×2560 values):")
for exp in sorted(exp_counts.keys()):
    count = exp_counts[exp]
    # Convert to actual magnitude range
    low = 2**(exp - 127)
    high = 2**(exp - 126)
    print(f"  E={exp}: {count:6d} values  (magnitude range [{low:.2e}, {high:.2e}))")

Exponent histogram (across all 13×2560 values):
  E=107:     26 values  (magnitude range [9.54e-07, 1.91e-06))
  E=108:     13 values  (magnitude range [1.91e-06, 3.81e-06))
  E=110:     26 values  (magnitude range [7.63e-06, 1.53e-05))
  E=111:     26 values  (magnitude range [1.53e-05, 3.05e-05))
  E=112:    156 values  (magnitude range [3.05e-05, 6.10e-05))
  E=113:    234 values  (magnitude range [6.10e-05, 1.22e-04))
  E=114:    754 values  (magnitude range [1.22e-04, 2.44e-04))
  E=115:   1209 values  (magnitude range [2.44e-04, 4.88e-04))
  E=116:   2756 values  (magnitude range [4.88e-04, 9.77e-04))
  E=117:   4433 values  (magnitude range [9.77e-04, 1.95e-03))
  E=118:   8216 values  (magnitude range [1.95e-03, 3.91e-03))
  E=119:  10465 values  (magnitude range [3.91e-03, 7.81e-03))
  E=120:   3900 values  (magnitude range [7.81e-03, 1.56e-02))
  E=121:    793 values  (magnitude range [1.56e-02, 3.12e-02))
  E=122:    247 values  (magnitude range [3.12e-02, 6.25e-02))
  E=123:     26 values  (magnitude range [6.25e-02, 1.25e-01))
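
To put a number on "close": bfloat16 stores 7 mantissa bits, so each exponent band holds 128 representable magnitudes, evenly spaced by 2^(E-127-7). The sketch below prints that spacing for a few of the exponents seen in the histogram. It applies to normal numbers only, and the helper name `bf16_spacing` is an illustrative assumption:

```python
def bf16_spacing(exp_raw):
    """Gap between adjacent bfloat16 magnitudes with raw exponent exp_raw (normal numbers)."""
    return 2.0 ** (exp_raw - 127 - 7)

for e in (107, 119, 123):
    low = 2.0 ** (e - 127)   # start of the band, as in the histogram
    print(f"E={e}: band starts at {low:.2e}, spacing {bf16_spacing(e):.2e}")
```

Two centroids that share an exponent and a sign in some dimension can therefore differ by at most 127 of these steps there.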

## Verdict

In [11]:
# Final check: for each dimension, do all 13 BHs have the same exponent?
exp_min_per_dim = exponents.min(dim=0).values
exp_max_per_dim = exponents.max(dim=0).values
all_same = (exp_min_per_dim == exp_max_per_dim).all().item()

print("="*60)
if all_same:
    print("CONFIRMED: All 13 black holes share the same exponent")
    print("on all 2,560 dimensions.")
    print("")
    print("This is NOT a bug. They're just that close together.")
else:
    num_diff = (exp_min_per_dim != exp_max_per_dim).sum().item()
    print(f"MISMATCH: {num_diff} dimensions have varying exponents.")
print("="*60)

CONFIRMED: All 13 black holes share the same exponent
on all 2,560 dimensions.

This is NOT a bug. They're just that close together.
