# Leaky Tensors as a Model of Neuromodulation in Deep Networks

This notebook demonstrates a training scheme in which random noise is added to the network weights at every training step. The noise serves as a simple model of neuromodulation in biological neural networks, and it pushes the model toward representations that tolerate weight perturbations.

## Key Concepts

- Leaky Tensors: Network weights that have additive noise injected during training
- Neuromodulation: Modeled here by a learnable noise model whose finite, per-layer variance sets the scale of the noise added to the weights
- Covariance Shift Robustness: The model must learn to be robust to weight perturbations
- Noise Injection: Independent random noise, freshly sampled and added at every training step
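
The concepts above can be written compactly. The symbols below are introduced only for illustration (they do not appear elsewhere in the notebook): $W_\ell$ is the weight tensor of leaky layer $\ell$, $s_\ell$ its learnable log-variance, and $z_\ell$ a standard normal sample drawn anew at each step. The clamp bounds are the ones used in the `NoiseModel` cell of this notebook.

```latex
\tilde{W}_\ell = W_\ell + \varepsilon_\ell,
\qquad
\varepsilon_\ell = \sigma_\ell \, z_\ell,
\qquad
z_\ell \sim \mathcal{N}(0, I),
\qquad
\sigma_\ell = \operatorname{clip}\!\left(e^{s_\ell / 2},\; 10^{-6},\; 0.5\right)
```

Writing the noise as $\sigma_\ell z_\ell$ keeps the sample differentiable with respect to $s_\ell$, which is what allows the noise model to be trained by backpropagation.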


In [None]:
# Import required libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import numpy as np
import matplotlib.pyplot as plt
from typing import Dict, List

# Import from neural_model module
from neural_model import (LeakyLinear, LeakyConv2d, LeakyMLP, LeakyCNN, create_model)

# Set random seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)

# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')


## Noise Model for Neuromodulation

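We create a learnable noise model that generates zero-mean Gaussian noise with one learnable variance per layer. The variance is stored on a log scale and the resulting standard deviation is clamped, so it always stays finite. This noise model is trained alongside the main network, on the same loss, to adapt the noise level of each layer.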
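
One detail worth checking is whether gradients actually reach the log-variance parameters. The sketch below is a standalone illustration, not part of the notebook's pipeline: `clamped_std` and `grad_wrt_log_var` are hypothetical helpers, and the squared-noise loss is a stand-in for the real training loss. It shows that the reparameterized sample `randn * std` is differentiable with respect to the log-variance, but that `torch.clamp` passes no gradient once the standard deviation lies outside the clamp range.

```python
import torch


def clamped_std(log_var: torch.Tensor) -> torch.Tensor:
    """Convert a log-variance to a standard deviation clamped to [1e-6, 0.5]."""
    return torch.clamp(torch.exp(0.5 * log_var), min=1e-6, max=0.5)


def grad_wrt_log_var(init_value: float) -> float:
    """Gradient of a toy loss with respect to the log-variance parameter."""
    torch.manual_seed(0)
    log_var = torch.full((1,), init_value, requires_grad=True)
    # Reparameterized sample: a fixed standard normal draw scaled by std
    noise = torch.randn(4, 3) * clamped_std(log_var)
    loss = (noise ** 2).sum()  # stand-in loss, for illustration only
    loss.backward()
    return log_var.grad.item()


# std = exp(-2) is about 0.135, inside the clamp range: gradient flows
grad_inside = grad_wrt_log_var(-4.0)
# std = exp(0) = 1.0 is clamped to 0.5: the clamp blocks the gradient
grad_clamped = grad_wrt_log_var(0.0)

print(f'gradient inside clamp range: {grad_inside:.6f}')
print(f'gradient when clamped:       {grad_clamped:.6f}')
```

A log-variance whose standard deviation sits above the upper bound therefore stays fixed during training, so the initial value matters.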


In [None]:
class NoiseModel(nn.Module):
    """Learnable noise model that generates noise with finite variance."""
    def __init__(self, layer_shapes: Dict[str, tuple]):
        super(NoiseModel, self).__init__()
        self.layer_shapes = layer_shapes
        
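        # Learnable log-variance per layer (log scale for stability).
        # Start at std = 0.1 (an arbitrary choice inside the clamp range used in
        # generate_noise). Starting at log-variance 0 gives std = 1, which is
        # clamped to 0.5, and the clamp then passes no gradient to the parameter.
        init_log_var = 2.0 * float(np.log(0.1))
        self.log_variances = nn.ParameterDict({
            name: nn.Parameter(torch.full((1,), init_log_var))
            for name in layer_shapes.keys()
        })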
    
    def generate_noise(self) -> Dict[str, torch.Tensor]:
        """Generate noise tensors for each layer."""
        noise_dict = {}
        for name, shape in self.layer_shapes.items():
            # Convert log variance to standard deviation
            std = torch.exp(0.5 * self.log_variances[name])
            # Clamp to prevent explosion
            std = torch.clamp(std, min=1e-6, max=0.5)
            # Generate Gaussian noise
            noise = torch.randn(shape, device=std.device) * std
            noise_dict[name] = noise
        return noise_dict
    
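    def get_variances(self) -> Dict[str, float]:
        """Get the effective (post-clamp) variance of each layer for monitoring."""
        # Apply the same clamp as generate_noise so the value matches the injected noise
        return {name: torch.clamp(torch.exp(0.5 * log_var), min=1e-6, max=0.5).pow(2).item()
                for name, log_var in self.log_variances.items()}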


## Data Preparation

We'll use the MNIST dataset for demonstration. It is small enough to train quickly while still being rich enough to show the effect of noise injection.


In [None]:
# Data preparation
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create data loaders
batch_size = 128
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

print(f'Training samples: {len(train_dataset)}')
print(f'Test samples: {len(test_dataset)}')


## Model Creation and Initialization

We create a LeakyMLP model that supports noise injection. We also initialize the noise model with the appropriate shapes for each layer.
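
The `neural_model` module is not shown in this notebook, so the exact behavior of its leaky layers is not visible here. As a rough guide, a leaky linear layer could look like the sketch below. `SketchLeakyLinear` is a hypothetical stand-in written for illustration; the real `LeakyLinear` may store or apply noise differently.

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class SketchLeakyLinear(nn.Linear):
    """Hypothetical linear layer with additive, externally supplied weight noise."""

    def __init__(self, in_features: int, out_features: int, bias: bool = True):
        super().__init__(in_features, out_features, bias=bias)
        # Noise is held as a plain attribute, not a parameter or buffer
        self.weight_noise: Optional[torch.Tensor] = None

    def inject_noise(self, noise: torch.Tensor) -> None:
        """Store a noise tensor to be added to the weight on the next forward pass."""
        if noise.shape != self.weight.shape:
            raise ValueError(f'expected noise of shape {tuple(self.weight.shape)}, '
                             f'got {tuple(noise.shape)}')
        self.weight_noise = noise

    def clear_noise(self) -> None:
        """Remove any stored noise so the layer uses its clean weight again."""
        self.weight_noise = None

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The stored weight is never modified; noise is added only in the forward pass
        weight = self.weight if self.weight_noise is None else self.weight + self.weight_noise
        return F.linear(x, weight, self.bias)


torch.manual_seed(0)
layer = SketchLeakyLinear(4, 2)
x = torch.randn(3, 4)

clean = layer(x)
layer.inject_noise(0.1 * torch.randn_like(layer.weight))
noisy = layer(x)
layer.clear_noise()
restored = layer(x)
```

Adding the noise inside `forward` rather than overwriting `weight` keeps the clean parameters intact, and it lets gradients flow both to the weight and to whatever produced the noise tensor.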


In [None]:
# Create the main model using the factory function
model = create_model(
    model_type='mlp',
    input_dim=28*28,
    hidden_dims=[512, 256],
    output_dim=10,
    dropout=0.2
).to(device)

# Get layer shapes for noise model
layer_shapes = {}
for i, layer in enumerate(model.get_leaky_layers()):
    layer_shapes[f'layer_{i}'] = layer.weight.shape

# Create noise model
noise_model = NoiseModel(layer_shapes).to(device)

print(f'Model architecture:\n{model}')
print(f'\nNoise model layer shapes: {layer_shapes}')


## Training with Neuromodulation

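The training loop samples fresh noise and injects it at every step, so the model is always optimized under perturbed weights. The main model and the noise model are trained simultaneously on the same cross-entropy loss, each with its own Adam optimizer; the noise model uses a learning rate ten times smaller.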
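
A single joint update can be sketched on a toy problem. Everything here is an assumption made for illustration: a bare weight matrix stands in for the model, a single log-variance stands in for the noise model, and the data is random.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)

# Toy stand-ins: one weight matrix and one log-variance (std starts near 0.135)
weight = nn.Parameter(torch.randn(2, 4) * 0.1)
log_var = nn.Parameter(torch.full((1,), -4.0))

model_optimizer = optim.Adam([weight], lr=1e-3)
noise_optimizer = optim.Adam([log_var], lr=1e-4)  # ten times smaller

x = torch.randn(8, 4)           # random inputs
y = torch.randint(0, 2, (8,))   # random binary labels

weight_before = weight.detach().clone()
log_var_before = log_var.item()

# Sample reparameterized noise and use it in the forward pass
std = torch.clamp(torch.exp(0.5 * log_var), min=1e-6, max=0.5)
noise = torch.randn_like(weight) * std

model_optimizer.zero_grad()
noise_optimizer.zero_grad()
loss = F.cross_entropy(F.linear(x, weight + noise), y)
loss.backward()  # one backward pass fills gradients for both parameter sets
model_optimizer.step()
noise_optimizer.step()

print(f'log-variance: {log_var_before:.6f} -> {log_var.item():.6f}')
```

Both parameters move after one step because the same loss depends on the weight directly and on the log-variance through the sampled noise.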


In [None]:
def train_with_neuromodulation(model, noise_model, train_loader, epochs=5, lr=0.001):
    """Train model with learnable neuromodulation."""
    criterion = nn.CrossEntropyLoss()
    
    # Separate optimizers for model and noise model
    model_optimizer = optim.Adam(model.parameters(), lr=lr)
    noise_optimizer = optim.Adam(noise_model.parameters(), lr=lr * 0.1)
    
    history = {'train_loss': [], 'train_acc': [], 'noise_vars': []}
    
    for epoch in range(epochs):
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        
        for batch_idx, (data, target) in enumerate(train_loader):
            data, target = data.to(device), target.to(device)
            data = data.view(data.size(0), -1)  # Flatten for MLP
            
            # Generate and inject noise
            noise_dict = noise_model.generate_noise()
            model.inject_noise(noise_dict)
            
            # Forward pass
            model_optimizer.zero_grad()
            noise_optimizer.zero_grad()
            
            output = model(data)
            loss = criterion(output, target)
            
            # Backward pass
            loss.backward()
            model_optimizer.step()
            noise_optimizer.step()
            
            # Clear noise after update
            model.clear_noise()
            
            # Statistics
            running_loss += loss.item()
            _, predicted = output.max(1)
            total += target.size(0)
            correct += predicted.eq(target).sum().item()
            
            if batch_idx % 100 == 0:
                print(f'Epoch {epoch+1}/{epochs}, Batch {batch_idx}/{len(train_loader)}, '
                      f'Loss: {loss.item():.4f}, Acc: {100.*correct/total:.2f}%')
        
        # Epoch statistics
        epoch_loss = running_loss / len(train_loader)
        epoch_acc = 100. * correct / total
        noise_vars = noise_model.get_variances()
        
        history['train_loss'].append(epoch_loss)
        history['train_acc'].append(epoch_acc)
        history['noise_vars'].append(noise_vars)
        
        print(f'\nEpoch {epoch+1} Summary: Loss={epoch_loss:.4f}, Acc={epoch_acc:.2f}%')
        print(f'Noise variances: {noise_vars}\n')
    
    return history


## Run Training

Execute the training loop with neuromodulation enabled.


In [None]:
# Train the model
history = train_with_neuromodulation(model, noise_model, train_loader, epochs=3, lr=0.001)


## Evaluation

Evaluate the model on the test set both with and without noise injection to measure how much accuracy is retained when the learned noise is applied to the weights.
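
Because noisy evaluation draws fresh random noise, a single pass yields only one sample of the noisy accuracy. A minimal sketch of averaging several passes follows; `repeat_evaluation` is a hypothetical helper, and in this notebook `eval_fn` would be something like `lambda: evaluate(model, test_loader, with_noise=True, noise_model=noise_model)`. The fixed values below are placeholders used to exercise the helper, not measured results.

```python
import statistics
from typing import Callable, Tuple


def repeat_evaluation(eval_fn: Callable[[], float], n_runs: int = 5) -> Tuple[float, float]:
    """Call eval_fn n_runs times and return the mean and sample standard deviation."""
    if n_runs < 1:
        raise ValueError('n_runs must be at least 1')
    accuracies = [eval_fn() for _ in range(n_runs)]
    mean_acc = statistics.fmean(accuracies)
    # The sample standard deviation is undefined for a single run
    std_acc = statistics.stdev(accuracies) if n_runs > 1 else 0.0
    return mean_acc, std_acc


# Placeholder accuracies standing in for three noisy evaluation passes
placeholder_accs = iter([90.0, 92.0, 94.0])
mean_acc, std_acc = repeat_evaluation(lambda: next(placeholder_accs), n_runs=3)
print(f'Noisy accuracy: {mean_acc:.2f}% +/- {std_acc:.2f}')
```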


In [None]:
def evaluate(model, test_loader, with_noise=False, noise_model=None):
    """Evaluate model on test set."""
    model.eval()
    correct = 0
    total = 0
    
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            data = data.view(data.size(0), -1)
            
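            # Inject noise only when requested and a noise model is supplied
            use_noise = with_noise and noise_model is not None
            if use_noise:
                noise_dict = noise_model.generate_noise()
                model.inject_noise(noise_dict)
            
            output = model(data)
            _, predicted = output.max(1)
            total += target.size(0)
            correct += predicted.eq(target).sum().item()
            
            # Clear noise only if it was actually injected for this batch
            if use_noise:
                model.clear_noise()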
    
    accuracy = 100. * correct / total
    return accuracy

# Evaluate without noise
test_acc_clean = evaluate(model, test_loader, with_noise=False)
print(f'Test Accuracy (without noise): {test_acc_clean:.2f}%')

# Evaluate with noise
test_acc_noisy = evaluate(model, test_loader, with_noise=True, noise_model=noise_model)
print(f'Test Accuracy (with noise): {test_acc_noisy:.2f}%')

# Robustness metric
robustness = test_acc_noisy / test_acc_clean
print(f'\nRobustness ratio: {robustness:.4f}')


## Visualization

Visualize the training progress and learned noise variances to understand how the neuromodulation evolved during training.


In [None]:
# Plot training history
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Plot loss
axes[0].plot(history['train_loss'], marker='o')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].set_title('Training Loss')
axes[0].grid(True)

# Plot accuracy
axes[1].plot(history['train_acc'], marker='o', color='green')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy (%)')
axes[1].set_title('Training Accuracy')
axes[1].grid(True)

# Plot noise variances
for layer_name in history['noise_vars'][0].keys():
    variances = [epoch_vars[layer_name] for epoch_vars in history['noise_vars']]
    axes[2].plot(variances, marker='o', label=layer_name)
axes[2].set_xlabel('Epoch')
axes[2].set_ylabel('Variance')
axes[2].set_title('Learned Noise Variances')
axes[2].legend()
axes[2].grid(True)

plt.tight_layout()
plt.show()


## Robustness Analysis

Test the trained model under fixed noise levels, applied with the same standard deviation to every leaky layer, to see how test accuracy degrades as the weight perturbation grows.


In [None]:
# Test robustness to varying noise levels
noise_scales = [0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3]
accuracies = []

for scale in noise_scales:
    model.eval()
    correct = 0
    total = 0
    
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            data = data.view(data.size(0), -1)
            
            # Inject fixed-scale noise
            if scale > 0:
                noise_dict = {}
                for i, layer in enumerate(model.get_leaky_layers()):
                    noise = torch.randn_like(layer.weight) * scale
                    noise_dict[f'layer_{i}'] = noise
                model.inject_noise(noise_dict)
            
            output = model(data)
            _, predicted = output.max(1)
            total += target.size(0)
            correct += predicted.eq(target).sum().item()
            
            if scale > 0:
                model.clear_noise()
    
    acc = 100. * correct / total
    accuracies.append(acc)
    print(f'Noise scale {scale:.2f}: Accuracy = {acc:.2f}%')

# Plot robustness curve
plt.figure(figsize=(8, 5))
plt.plot(noise_scales, accuracies, marker='o', linewidth=2, markersize=8)
plt.xlabel('Noise Scale (Standard Deviation)', fontsize=12)
plt.ylabel('Test Accuracy (%)', fontsize=12)
plt.title('Model Robustness to Weight Noise', fontsize=14)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()


## Conclusion

This notebook walked through leaky tensors as a model of neuromodulation in deep networks. The main points:

1. Noise is injected into the weights at every training step, so the model is optimized to classify correctly under perturbed weights
2. The learnable noise model adapts one variance parameter per layer, and the variance plot shows how these parameters evolve during training
3. The clean-versus-noisy test accuracy and the robustness curve quantify how well performance holds up when weights are perturbed
4. This approach simulates biological neuromodulation where neural systems must operate reliably despite ongoing perturbations

The leaky tensor framework provides a principled way to study robustness and neuromodulation in artificial neural networks, with potential applications in creating more robust and adaptive AI systems.
