# 🔬 Delta Observer: Quick-Start Demo

**Fast overview of the Delta Observer project with pre-computed results**

This notebook provides a quick demonstration of the key findings from our paper:
> *"Delta Observer: Learning Continuous Semantic Manifolds Between Neural Network Representations"*

📄 **Paper:** [OSF MetaArXiv](https://doi.org/10.17605/OSF.IO/CNJTP)  
🔗 **Code:** [github.com/EntroMorphic/delta-observer](https://github.com/EntroMorphic/delta-observer)  
📧 **Contact:** tripp@entromorphic.com

---

## 🎯 Key Findings

| Finding | Evidence |
|---------|----------|
| **Linear accessibility without clustering** | R²=0.9879, Silhouette=-0.02 |
| **Transient clustering** | Silhouette peaks at 0.33, then dissolves |
| **Online beats post-hoc** | +4% improvement over PCA baseline |

**Key Insight:** Clustering is *scaffolding*, not *structure*. Networks build geometric organization to learn semantic concepts, then discard it.

---

## 📦 Setup

In [None]:
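# Install dependencies if needed (Colab)
import importlib
import subprocess
import sys

def install_if_needed(package, module_name):
    """Install `package` with pip only if `module_name` cannot be imported."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-q', package])

# The pip package name and the importable module name differ for some libraries
install_if_needed('umap-learn', 'umap')
install_if_needed('scikit-learn', 'sklearn')
install_if_needed('matplotlib', 'matplotlib')

print('✅ Dependencies ready!')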

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score, silhouette_score
import os

# Plotting style
plt.style.use('default')
plt.rcParams['figure.facecolor'] = 'white'
plt.rcParams['axes.facecolor'] = 'white'
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 11

# Colors for consistency
COLORS = {
    'r2': '#2ecc71',        # Green
    'silhouette': '#e74c3c', # Red
    'online': '#3498db',     # Blue
    'posthoc': '#9b59b6',    # Purple
    'pca': '#95a5a6'         # Gray
}

print('✅ Imports complete!')

## 📂 Load Pre-computed Data

We load the **Online Delta Observer's** latent space representations, computed by training the observer concurrently with source models on 4-bit binary addition.
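
Before indexing into an `.npz` archive, it can help to list what it actually contains. The sketch below is one way to do that with NumPy's `NpzFile.files` attribute. The helper name `describe_npz` and the throwaway demo file are illustrative assumptions, not part of the repository.

```python
import os
import tempfile

import numpy as np


def describe_npz(path):
    """Return {array_name: (shape, dtype)} for every array stored in an .npz file."""
    summary = {}
    # np.load on an .npz returns a lazy NpzFile; the context manager closes it.
    with np.load(path) as archive:
        for name in archive.files:
            arr = archive[name]
            summary[name] = (arr.shape, str(arr.dtype))
    return summary


# Demo on a throwaway archive (hypothetical shapes, chosen only for illustration).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'demo_latents.npz')
np.savez(demo_path,
         latents=np.zeros((8, 16), dtype=np.float32),
         carry_counts=np.zeros(8, dtype=np.int64))

demo_summary = describe_npz(demo_path)
for name, (shape, dtype) in demo_summary.items():
    print(f'{name}: shape={shape}, dtype={dtype}')
```

Calling `describe_npz(latent_path)` on the real file would show whether the keys used in the next cell are present before any of them is accessed.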

In [None]:
# Clone repository if running in Colab
repo_dir = 'delta-observer'
if not os.path.exists(repo_dir) and not os.path.exists('../data'):
    print('📥 Cloning delta-observer repository...')
    !git clone https://github.com/EntroMorphic/delta-observer.git
    print('✅ Repository cloned!')

# Smart path detection for different environments
latent_paths = [
    '../data/online_observer_latents.npz',
    'data/online_observer_latents.npz',
    'delta-observer/data/online_observer_latents.npz'
]

trajectory_paths = [
    '../data/online_observer_trajectory.npz',
    'data/online_observer_trajectory.npz',
    'delta-observer/data/online_observer_trajectory.npz'
]

latent_path = next((p for p in latent_paths if os.path.exists(p)), None)
trajectory_path = next((p for p in trajectory_paths if os.path.exists(p)), None)

if not latent_path or not trajectory_path:
    raise FileNotFoundError('Data files not found. Please ensure the repository is cloned correctly.')

print(f'📁 Loading latents from: {latent_path}')
print(f'📁 Loading trajectory from: {trajectory_path}')

# Load data
latents_data = np.load(latent_path)
trajectory_data = np.load(trajectory_path)

# Extract arrays
latent_space = latents_data['latents']         # Final latent representations
carry_counts = latents_data['carry_counts']    # 0-4 carries
mono_act = latents_data['mono_activations']    # Monolithic activations
comp_act = latents_data['comp_activations']    # Compositional activations

snapshots = trajectory_data['snapshots']       # Latent evolution during training
epochs = trajectory_data['epochs']             # Epoch numbers

print('\n📊 Data Summary:')
print(f'   Latent space: {latent_space.shape}')
print(f'   Carry counts: {np.bincount(carry_counts)}')
print(f'   Trajectory snapshots: {snapshots.shape}')
print(f'   Epochs recorded: {epochs[0]} to {epochs[-1]}')
print('\n✅ Data loaded successfully!')

---

## 🎨 Visualization 1: The Latent Space

We project the 16D latent space to 2D for visualization, using UMAP when it is installed and falling back to PCA otherwise. Points are colored by **carry count** (the semantic variable we're studying).

In [None]:
# Try UMAP, fall back to PCA
try:
    from umap import UMAP
    reducer = UMAP(n_components=2, random_state=42, n_neighbors=15, min_dist=0.1)
    method_name = 'UMAP'
except ImportError:
    reducer = PCA(n_components=2, random_state=42)
    method_name = 'PCA'

latent_2d = reducer.fit_transform(latent_space)

# Create figure
fig, ax = plt.subplots(figsize=(10, 8))

scatter = ax.scatter(latent_2d[:, 0], latent_2d[:, 1],
                     c=carry_counts, cmap='viridis',
                     s=60, alpha=0.7, edgecolors='white', linewidth=0.5)

cbar = plt.colorbar(scatter, ax=ax)
cbar.set_label('Carry Count', fontsize=12)
cbar.set_ticks([0, 1, 2, 3, 4])

# Compute metrics for annotation
reg = LinearRegression().fit(latent_space, carry_counts)
r2 = r2_score(carry_counts, reg.predict(latent_space))
sil = silhouette_score(latent_space, carry_counts)

# Add metrics box
textstr = f'R² = {r2:.4f}\nSilhouette = {sil:.4f}'
props = dict(boxstyle='round', facecolor='white', alpha=0.9, edgecolor='gray')
ax.text(0.02, 0.98, textstr, transform=ax.transAxes, fontsize=12,
        verticalalignment='top', bbox=props, fontweight='bold')

ax.set_xlabel(f'{method_name} Dimension 1', fontsize=12)
ax.set_ylabel(f'{method_name} Dimension 2', fontsize=12)
ax.set_title('🔬 Online Delta Observer Latent Space\n(Colored by Carry Count)',
             fontsize=14, fontweight='bold')

plt.tight_layout()
plt.show()

print('\n📌 Notice: Points are NOT clustered by color, yet carry count is highly predictable!')

---

## 📈 Key Finding 1: Linear Accessibility

**Question:** Can we predict semantic properties (carry count) from the latent space using a simple linear model?

**Answer:** Yes! R² = 0.9879, nearly perfect linear prediction.
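
For reference, R² here is the coefficient of determination, one minus the ratio of the residual sum of squares to the total sum of squares, which is what `sklearn.metrics.r2_score` computes. The sketch below evaluates it by hand on made-up predictions. The helper name `r2_manual` and the toy numbers are illustrative assumptions, not values from the paper.

```python
import numpy as np


def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # error left after prediction
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # error of always predicting the mean
    return 1.0 - ss_res / ss_tot


# Toy carry counts and hypothetical probe outputs (illustration only).
toy_true = [0, 1, 2, 3, 4]
toy_pred = [0.1, 0.9, 2.0, 3.2, 3.8]
toy_r2 = r2_manual(toy_true, toy_pred)
print(f'Toy R^2 = {toy_r2:.4f}')
```

An R² near 1 means the linear probe's residual error is small compared with the spread of the carry counts themselves.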

In [None]:
# Train linear probe
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    latent_space, carry_counts, test_size=0.2, random_state=42
)

probe = Ridge(alpha=1.0)
probe.fit(X_train, y_train)
y_pred = probe.predict(X_test)
r2_test = r2_score(y_test, y_pred)

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Left: Prediction scatter
ax1 = axes[0]
ax1.scatter(y_test, y_pred, alpha=0.6, c=COLORS['r2'], s=50)
ax1.plot([0, 4], [0, 4], 'k--', linewidth=2, label='Perfect prediction')
ax1.set_xlabel('True Carry Count', fontsize=12)
ax1.set_ylabel('Predicted Carry Count', fontsize=12)
ax1.set_title(f'📊 Linear Probe Performance\nR² = {r2_test:.4f}', fontsize=13, fontweight='bold')
ax1.legend(fontsize=10)
ax1.grid(True, alpha=0.3)
ax1.set_xlim(-0.5, 4.5)
ax1.set_ylim(-0.5, 4.5)

# Right: Metric interpretation
ax2 = axes[1]
ax2.barh(['Linear\nAccessibility\n(R²)'], [r2_test], color=COLORS['r2'], alpha=0.8, height=0.5)
ax2.axvline(0.95, color='green', linestyle='--', alpha=0.7, linewidth=2)
ax2.text(0.96, 0, 'Excellent', fontsize=10, color='green', va='center')
ax2.set_xlim(0, 1.05)
ax2.set_xlabel('Score', fontsize=12)
ax2.set_title('📈 Linear Accessibility Score', fontsize=13, fontweight='bold')
ax2.text(r2_test + 0.02, 0, f'{r2_test:.4f}', va='center', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()

print('\n' + '='*60)
print('✅ LINEAR ACCESSIBILITY: R² = {:.4f}'.format(r2_test))
print('   → {:.1%} of carry count variance explained by a LINEAR model'.format(r2_test))
print('='*60)

---

## 📉 Key Finding 2: No Geometric Clustering

**Question:** Are points with similar carry counts clustered together in space?

**Answer:** No! Silhouette ≈ -0.02, so there is no meaningful geometric clustering.

In [None]:
# Compute silhouette score
silhouette = silhouette_score(latent_space, carry_counts)

# Visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Left: Silhouette interpretation scale
ax1 = axes[0]
scale_values = np.linspace(-0.2, 1.0, 100)
scale_colors = plt.cm.RdYlGn((scale_values + 0.2) / 1.2)
for i, (v, c) in enumerate(zip(scale_values[:-1], scale_colors[:-1])):
    ax1.barh(0, scale_values[i+1] - v, left=v, color=c, height=0.3)

ax1.axvline(silhouette, color='black', linewidth=3, label=f'Our result: {silhouette:.3f}')
ax1.axvline(0, color='gray', linestyle='--', linewidth=1.5, alpha=0.7)
ax1.scatter([silhouette], [0], color='black', s=200, zorder=5, marker='v')

ax1.set_xlim(-0.25, 1.05)
ax1.set_ylim(-0.3, 0.3)
ax1.set_xlabel('Silhouette Score', fontsize=12)
ax1.set_title('📉 Silhouette Score Scale', fontsize=13, fontweight='bold')
ax1.set_yticks([])

# Labels
ax1.text(-0.15, -0.2, 'No\nClustering', ha='center', fontsize=9, color='#c0392b')
ax1.text(0.5, -0.2, 'Weak\nClustering', ha='center', fontsize=9, color='#f39c12')
ax1.text(0.85, -0.2, 'Strong\nClustering', ha='center', fontsize=9, color='#27ae60')
ax1.legend(loc='upper right', fontsize=10)

# Right: Comparison bar
ax2 = axes[1]
ax2.barh(['Geometric\nClustering\n(Silhouette)'], [max(0, silhouette)], 
         color=COLORS['silhouette'], alpha=0.8, height=0.5)
ax2.axvline(0.3, color='red', linestyle='--', alpha=0.7, linewidth=2)
ax2.text(0.31, 0, 'Weak threshold', fontsize=10, color='red', va='center')
ax2.set_xlim(-0.1, 0.5)
ax2.set_xlabel('Score', fontsize=12)
ax2.set_title('📊 Clustering Score', fontsize=13, fontweight='bold')
ax2.text(0.02, 0, f'{silhouette:.4f}', va='center', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()

print('\n' + '='*60)
print('❌ GEOMETRIC CLUSTERING: Silhouette = {:.4f}'.format(silhouette))
print('   → Points are NOT organized into discrete clusters')
print('='*60)

---

## 🌟 The Paradox: Accessibility Without Clustering

**High R² + Low Silhouette = Continuous Semantic Gradients**

This challenges the assumption that interpretability requires discrete, spatially separated feature clusters.
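
To see how a label can be linearly readable without forming clusters, consider a synthetic stand-in for the latent space. In the sketch below, one direction of a 16-dimensional Gaussian cloud carries an ordered five-level label and the remaining directions are noise. The data, the random direction, and all `toy_*` names are assumptions made for illustration; none of this is the paper's data or method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, silhouette_score

rng = np.random.default_rng(0)

# Hypothetical latent cloud: 500 points in 16 dimensions.
toy_latents = rng.normal(size=(500, 16))

# One unit direction carries the "semantic" variable; the rest is noise.
direction = rng.normal(size=16)
direction /= np.linalg.norm(direction)
projection = toy_latents @ direction

# Cut the projection into five ordered, equally populated levels (0..4).
edges = np.quantile(projection, [0.2, 0.4, 0.6, 0.8])
toy_labels = np.digitize(projection, edges)

# Linear accessibility: regress the label on the full latent vector.
toy_reg = LinearRegression().fit(toy_latents, toy_labels)
toy_r2 = r2_score(toy_labels, toy_reg.predict(toy_latents))

# Geometric clustering: silhouette of the same labels in the same space.
toy_sil = silhouette_score(toy_latents, toy_labels)

print(f'Toy R^2        = {toy_r2:.3f}')
print(f'Toy silhouette = {toy_sil:.3f}')
```

Because the label varies along a single direction while distances are dominated by the other fifteen, a linear probe recovers it well even though points with the same label are not noticeably closer to each other than to their neighbors.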

In [None]:
# Summary visualization
fig, ax = plt.subplots(figsize=(10, 6))

metrics = ['Linear Accessibility\n(R²)', 'Geometric Clustering\n(Silhouette)']
values = [r2_test, max(0, silhouette)]  # Clip silhouette for visualization
colors = [COLORS['r2'], COLORS['silhouette']]

bars = ax.barh(metrics, values, color=colors, alpha=0.8, height=0.6, edgecolor='black')

ax.set_xlim(0, 1.1)
ax.set_xlabel('Score', fontsize=12)
ax.set_title('🎯 The Accessibility-Clustering Paradox', fontsize=14, fontweight='bold')

# Add value labels
ax.text(r2_test + 0.02, 0, f'{r2_test:.4f}', va='center', fontsize=12, fontweight='bold', color=COLORS['r2'])
ax.text(max(0.02, silhouette) + 0.02, 1, f'{silhouette:.4f}', va='center', fontsize=12, fontweight='bold', color=COLORS['silhouette'])

# Add interpretation
ax.axvline(0.5, color='gray', linestyle='--', alpha=0.5)

# Text box with key insight
insight_text = ('KEY INSIGHT:\n'
                '• R² ≈ 0.99: Semantics are linearly accessible\n'
                '• Silhouette ≈ 0: No geometric clustering\n\n'
                '→ Information exists as CONTINUOUS GRADIENTS,\n'
                '   not discrete clusters!')
props = dict(boxstyle='round', facecolor='lightyellow', alpha=0.9, edgecolor='orange')
ax.text(0.55, 0.5, insight_text, transform=ax.transAxes, fontsize=10,
        verticalalignment='center', bbox=props)

plt.tight_layout()
plt.show()

---

## ⏳ Key Finding 3: Transient Clustering

**The most surprising discovery:** Clustering isn't absent; it's *transient*.

During training, geometric clustering emerges (Silhouette peaks at ~0.33) then dissolves to near-zero in the final representation.

**Insight:** Clustering is **scaffolding**, not **structure**.
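
One simple way to quantify "emerges, then dissolves" is to compare the peak of the silhouette curve with its final value. The sketch below does this in plain Python. The function name `summarize_transience` and the trajectory values are hypothetical and chosen only to illustrate the idea; they are not numbers from the paper.

```python
def summarize_transience(epochs, silhouettes):
    """Locate the silhouette peak and measure how far the final value falls below it."""
    if len(epochs) != len(silhouettes) or len(epochs) == 0:
        raise ValueError('epochs and silhouettes must be non-empty and equal in length')
    # Index of the largest silhouette value (first occurrence on ties).
    peak_index = max(range(len(silhouettes)), key=lambda i: silhouettes[i])
    return {
        'peak_epoch': epochs[peak_index],
        'peak_value': silhouettes[peak_index],
        'final_value': silhouettes[-1],
        'drop_from_peak': silhouettes[peak_index] - silhouettes[-1],
    }


# Made-up trajectory for illustration only.
toy_epochs = [0, 10, 20, 30, 40, 50]
toy_sil_curve = [0.00, 0.12, 0.30, 0.21, 0.05, -0.01]

toy_summary = summarize_transience(toy_epochs, toy_sil_curve)
print(toy_summary)
```

A large `drop_from_peak` together with a `final_value` near zero is the signature described here: structure that was present mid-training and is gone at the end.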

In [None]:
# Compute metrics at each epoch
r2_trajectory = []
sil_trajectory = []

for i, epoch in enumerate(epochs):
    z = snapshots[i]
    
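    # R²
    reg = LinearRegression().fit(z, carry_counts)
    r2_trajectory.append(r2_score(carry_counts, reg.predict(z)))

    # Silhouette (silhouette_score raises ValueError on invalid input,
    # e.g. non-finite values in a snapshot; record 0 in that case)
    try:
        sil_trajectory.append(silhouette_score(z, carry_counts))
    except ValueError:
        sil_trajectory.append(0)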

r2_trajectory = np.array(r2_trajectory)
sil_trajectory = np.array(sil_trajectory)

# Find peak
peak_idx = np.argmax(sil_trajectory)
peak_epoch = epochs[peak_idx]
peak_sil = sil_trajectory[peak_idx]

# Visualization
fig, ax1 = plt.subplots(figsize=(12, 6))

# R² on left axis
ax1.set_xlabel('Training Epoch', fontsize=12)
ax1.set_ylabel('R² (Linear Accessibility)', color=COLORS['r2'], fontsize=12)
line1, = ax1.plot(epochs, r2_trajectory, color=COLORS['r2'], linewidth=2.5,
                  marker='o', markersize=4, label='R² (Accessibility)')
ax1.tick_params(axis='y', labelcolor=COLORS['r2'])
ax1.set_ylim(0, 1.05)

# Silhouette on right axis
ax2 = ax1.twinx()
ax2.set_ylabel('Silhouette (Clustering)', color=COLORS['silhouette'], fontsize=12)
line2, = ax2.plot(epochs, sil_trajectory, color=COLORS['silhouette'], linewidth=2.5,
                  marker='s', markersize=4, label='Silhouette (Clustering)')
ax2.tick_params(axis='y', labelcolor=COLORS['silhouette'])
ax2.set_ylim(-0.1, 0.5)

# Annotate peak
ax2.annotate(f'Peak: {peak_sil:.2f}\n(epoch {peak_epoch})',
             xy=(peak_epoch, peak_sil),
             xytext=(peak_epoch + 30, peak_sil + 0.08),
             fontsize=11, fontweight='bold',
             arrowprops=dict(arrowstyle='->', color=COLORS['silhouette'], lw=2),
             color=COLORS['silhouette'])

# Title and legend
ax1.set_title('⏳ Transient Clustering: Scaffolding Emerges Then Dissolves',
              fontsize=14, fontweight='bold')

lines = [line1, line2]
labels = [l.get_label() for l in lines]
ax1.legend(lines, labels, loc='center right', fontsize=10)

plt.tight_layout()
plt.show()

print('\n' + '='*60)
print('⏳ TRANSIENT CLUSTERING DISCOVERY')
print(f'   Peak clustering: Silhouette = {peak_sil:.4f} at epoch {peak_epoch}')
print(f'   Final clustering: Silhouette = {sil_trajectory[-1]:.4f}')
print('\n   → Clustering is SCAFFOLDING, not STRUCTURE!')
print('='*60)

---

## 🏆 Key Finding 4: Online Beats Post-hoc

The Online Delta Observer outperforms post-hoc analysis by roughly **4%** (R² = 0.9879, versus 0.9482 for the PCA baseline and 0.9505 for the post-hoc observer) because it captures temporal information unavailable to static analysis.
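
The size of the gain depends on whether it is read as an absolute difference in R² or as a change relative to the baseline. The short calculation below works through both, using the three R² values hard-coded in this notebook's method comparison. The helper name `gains` is an illustrative assumption.

```python
# R² values as listed in this notebook's method comparison.
r2_online = 0.9879
r2_posthoc = 0.9505
r2_pca = 0.9482


def gains(candidate, baseline):
    """Return (absolute difference, difference relative to the baseline)."""
    absolute = candidate - baseline
    relative = absolute / baseline
    return absolute, relative


abs_vs_pca, rel_vs_pca = gains(r2_online, r2_pca)
abs_vs_posthoc, rel_vs_posthoc = gains(r2_online, r2_posthoc)

print(f'Online vs PCA:      +{abs_vs_pca:.4f} absolute, +{rel_vs_pca:.2%} relative')
print(f'Online vs post-hoc: +{abs_vs_posthoc:.4f} absolute, +{rel_vs_posthoc:.2%} relative')
```

Both readings land close to four percent against the PCA baseline, and the margin over the post-hoc observer is slightly smaller.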

In [None]:
# Method comparison (using pre-computed values from paper)
methods = ['Online\nObserver', 'Post-hoc\nObserver', 'PCA\nBaseline']
r2_values_compare = [0.9879, 0.9505, 0.9482]
colors_compare = [COLORS['online'], COLORS['posthoc'], COLORS['pca']]

fig, ax = plt.subplots(figsize=(10, 6))

bars = ax.bar(methods, r2_values_compare, color=colors_compare, 
              edgecolor='black', linewidth=2, alpha=0.8)

# Value labels
for bar, val in zip(bars, r2_values_compare):
    ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.003,
            f'{val:.4f}', ha='center', va='bottom', fontsize=12, fontweight='bold')

# Delta annotation
ax.annotate('', xy=(0, 0.9879), xytext=(2, 0.9482),
            arrowprops=dict(arrowstyle='<->', color='black', lw=2))
ax.text(1, 0.963, '+4.0%', ha='center', fontsize=14, fontweight='bold', color='green')

ax.set_ylabel('R² (Linear Accessibility)', fontsize=12)
ax.set_title('🏆 Method Comparison: Online Observation Wins', fontsize=14, fontweight='bold')
ax.set_ylim(0.93, 1.01)
ax.axhline(y=0.9482, color='gray', linestyle='--', alpha=0.5)

plt.tight_layout()
plt.show()

print('\n' + '='*60)
print('🏆 ONLINE OBSERVATION ADVANTAGE')
print('   Online Observer: R² = 0.9879')
print('   PCA Baseline:    R² = 0.9482')
print('   Improvement:     +4.0%')
print('\n   → Watching training captures temporal structure!')
print('='*60)

---

## 📝 Summary

| Finding | Evidence | Implication |
|---------|----------|-------------|
| **Linear Accessibility** | R² = 0.9879 | Semantics extractable with simple probes |
| **No Final Clustering** | Silhouette ≈ -0.02 | Information as gradients, not clusters |
| **Transient Clustering** | Peak Silhouette ≈ 0.33 | Clustering is scaffolding for learning |
| **Online Wins** | +4% over PCA | Temporal info matters for interpretability |

**Key Insight:** The semantic primitive isn't in the final representation; it's in the *learning trajectory*.

In [None]:
# Final summary figure
fig = plt.figure(figsize=(14, 8))

# Create grid
gs = fig.add_gridspec(2, 3, hspace=0.3, wspace=0.3)

# 1. Latent space
ax1 = fig.add_subplot(gs[0, 0])
ax1.scatter(latent_2d[:, 0], latent_2d[:, 1], c=carry_counts, cmap='viridis', s=20, alpha=0.6)
ax1.set_title('Latent Space', fontsize=11, fontweight='bold')
ax1.set_xlabel(f'{method_name} 1', fontsize=9)
ax1.set_ylabel(f'{method_name} 2', fontsize=9)

# 2. Linear probe
ax2 = fig.add_subplot(gs[0, 1])
ax2.scatter(y_test, y_pred, alpha=0.5, c=COLORS['r2'], s=20)
ax2.plot([0, 4], [0, 4], 'k--', linewidth=1.5)
ax2.set_title(f'Linear Probe (R²={r2_test:.3f})', fontsize=11, fontweight='bold')
ax2.set_xlabel('True', fontsize=9)
ax2.set_ylabel('Predicted', fontsize=9)

# 3. Transient clustering
ax3 = fig.add_subplot(gs[0, 2])
ax3.plot(epochs, sil_trajectory, color=COLORS['silhouette'], linewidth=2)
ax3.axhline(0, color='gray', linestyle='--', alpha=0.5)
ax3.scatter([peak_epoch], [peak_sil], color=COLORS['silhouette'], s=100, zorder=5)
ax3.set_title('Transient Clustering', fontsize=11, fontweight='bold')
ax3.set_xlabel('Epoch', fontsize=9)
ax3.set_ylabel('Silhouette', fontsize=9)

# 4. Method comparison (spanning bottom)
ax4 = fig.add_subplot(gs[1, :])
x_pos = np.arange(3)
bars = ax4.bar(x_pos, r2_values_compare, color=colors_compare, alpha=0.8, edgecolor='black')
ax4.set_xticks(x_pos)
ax4.set_xticklabels(methods)
ax4.set_ylabel('R²', fontsize=11)
ax4.set_title('Method Comparison', fontsize=11, fontweight='bold')
ax4.set_ylim(0.93, 1.01)
for bar, val in zip(bars, r2_values_compare):
    ax4.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.002,
            f'{val:.4f}', ha='center', fontsize=10, fontweight='bold')

fig.suptitle('🔬 Delta Observer: Summary of Key Findings', fontsize=16, fontweight='bold', y=1.02)

plt.tight_layout()
plt.show()

print('\n' + '='*70)
print('🔬 DELTA OBSERVER - KEY TAKEAWAYS')
print('='*70)
print('\n1. Semantic information is LINEARLY ACCESSIBLE without clustering')
print('2. Clustering is TRANSIENT - scaffolding that dissolves')
print('3. ONLINE observation captures what post-hoc analysis misses')
print('\n→ The semantic primitive is in the TRAJECTORY, not the final state!')
print('='*70)

---

## 🚀 Next Steps

Explore the detailed notebooks:

| Notebook | Description | Colab |
|----------|-------------|-------|
| **01_training_models** | Train monolithic and compositional models | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/EntroMorphic/delta-observer/blob/main/notebooks/01_training_models.ipynb) |
| **02_delta_observer_training** | Train the Delta Observer | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/EntroMorphic/delta-observer/blob/main/notebooks/02_delta_observer_training.ipynb) |
| **03_analysis_visualization** | Deep dive into geometric analysis | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/EntroMorphic/delta-observer/blob/main/notebooks/03_analysis_visualization.ipynb) |
| **99_full_reproduction** | Complete end-to-end reproduction | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/EntroMorphic/delta-observer/blob/main/notebooks/99_full_reproduction.ipynb) |

Or read the paper: [OSF MetaArXiv](https://doi.org/10.17605/OSF.IO/CNJTP)

---

**For Science!** 🔬🌊