# Algorithm 2: Gaussian Smearing (Boltz-2)

Gaussian smearing encodes a scalar distance as a smooth feature vector by evaluating it against a set of evenly spaced Gaussian basis functions.
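
Written out, the encoding computed by the NumPy function in this notebook takes the following form. The symbols $K$ (for `num_gaussians`), $\Delta$ (the spacing between centers), and $\mu_k$ (the $k$-th center) are notation introduced here for illustration and do not appear in the source code.

$$
\mu_k = \text{start} + k\,\Delta, \qquad \Delta = \frac{\text{stop} - \text{start}}{K - 1}, \qquad k = 0, \dots, K - 1
$$

$$
\phi_k(d) = \exp\!\left(-\frac{(d - \mu_k)^2}{2\,\Delta^2}\right)
$$

Every Gaussian shares the same width, and that width equals the spacing between neighboring centers.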

## Source Code Location
- **File**: `Boltz-Ref-src/boltz-official/src/boltz/model/modules/affinity.py`
- **Class**: `GaussianSmearing`

In [None]:
import numpy as np
np.random.seed(42)

In [None]:
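def gaussian_smearing(distances, start=0.0, stop=5.0, num_gaussians=50):
    """
    Gaussian smearing for distance encoding.

    Converts scalar distances into smooth, differentiable embeddings
    using a series of evenly spaced Gaussian basis functions.

    Args:
        distances: Input distances, array-like of shape [...]
        start: Center of the first Gaussian
        stop: Center of the last Gaussian
        num_gaussians: Number of Gaussian centers (must be at least 2)

    Returns:
        Gaussian features of shape [..., num_gaussians], values in [0, 1]
    """
    if num_gaussians < 2:
        # The width is derived from the gap between the first two centers
        raise ValueError("num_gaussians must be at least 2 to define a spacing")

    # Accept scalars and lists as well as arrays
    distances = np.asarray(distances, dtype=float)

    # Evenly spaced Gaussian centers
    offset = np.linspace(start, stop, num_gaussians)

    # Width coefficient (shared by all Gaussians): -1 / (2 * spacing^2)
    coeff = -0.5 / (offset[1] - offset[0]) ** 2

    # Broadcast each distance against every center: [..., num_gaussians]
    diff = distances[..., np.newaxis] - offset
    features = np.exp(coeff * diff ** 2)

    return features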

In [None]:
# Test
print("Test: Gaussian Smearing")
print("="*60)

# Single distance
d = 2.5
features = gaussian_smearing(np.array([d]), start=0, stop=5, num_gaussians=10)
print(f"Distance: {d}")
print(f"Features: {features.flatten()}")

# Multiple distances
distances = np.array([0.5, 1.5, 2.5, 3.5, 4.5])
features = gaussian_smearing(distances, start=0, stop=5, num_gaussians=10)
print(f"\nDistances: {distances}")
print(f"Features shape: {features.shape}")
print(f"Peak indices: {np.argmax(features, axis=1)}")

In [None]:
# Visualization (printed as a text table rather than a plot)
print("\nVisualization: Gaussian basis functions")
print("="*60)

x = np.linspace(0, 5, 100)
features = gaussian_smearing(x, start=0, stop=5, num_gaussians=10)

print("Distance -> Gaussian activations (first 5 Gaussians):")
for i in range(0, 100, 20):
    print(f"  d={x[i]:.2f}: {features[i, :5].round(2)}")

## Key Insights

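1. **Smooth Encoding**: Each distance maps to a continuous, differentiable feature vector rather than a hard bin index
2. **RBF Basis**: The features are the same kind of basis expansion used in radial basis function (RBF) networks
3. **Overlap**: The Gaussian width equals the spacing between centers, so adjacent Gaussians overlap and the features change smoothly as the distance varies
4. **Affinity Use**: In the Boltz-2 affinity module, the encoding represents ligand-protein distances for binding prediction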
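
The overlap between adjacent Gaussians can be checked numerically. The block below is a minimal sketch, not the Boltz-2 code: `smear` is a hypothetical helper that re-implements this notebook's NumPy function so the example runs on its own. It evaluates the encoding at the centers themselves and compares each center's own activation with that of its right-hand neighbor.

```python
import numpy as np

def smear(distances, start=0.0, stop=5.0, num_gaussians=10):
    """Hypothetical stand-alone helper mirroring the notebook's function."""
    centers = np.linspace(start, stop, num_gaussians)
    spacing = centers[1] - centers[0]
    diff = np.asarray(distances, dtype=float)[..., np.newaxis] - centers
    # Same width for every Gaussian: sigma equals the center spacing
    return np.exp(-0.5 * (diff / spacing) ** 2), centers

# Evaluate the encoding at the centers themselves: row i is center i
_, centers = smear(np.array([0.0]))
features, _ = smear(centers)

# Activation of each center's own Gaussian (main diagonal)
own = np.diag(features)
# Activation of the Gaussian one position to the right (first superdiagonal)
neighbor = np.diag(features, k=1)

# A distance far outside [start, stop] activates nothing noticeably
far, _ = smear(np.array([20.0]))
far_max = far.max()

print("own activations all 1:", np.allclose(own, 1.0))
print("neighbor activations all exp(-0.5):", np.allclose(neighbor, np.exp(-0.5)))
print("largest activation at d=20:", far_max)
```

Because the width is tied to the spacing, the neighbor activation is `exp(-0.5)` (about 0.61) regardless of `start`, `stop`, or `num_gaussians`. Distances well beyond `stop` produce a near-zero feature vector, so the range should cover the distances the model is expected to see.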