# Classification with SingleDendrite Readout Layer

In this notebook, we use **SingleDendrite neurons for the readout layer** instead of a passive linear readout.

## The Problem with Single-Neuron Readout

SingleDendrite output `s` is **always non-negative** (it's a current in a superconducting loop).
This means:
- `sigmoid(s) >= 0.5` always
- The network **cannot predict class 0**!
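
The bound can be checked numerically. The sketch below is a minimal illustration in plain Python; the state values are made up for the example and are not taken from a SOEN simulation.

```python
import math

def sigmoid(s):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-s))

# Hypothetical non-negative readout states (illustrative values only)
states = [0.0, 0.05, 0.5, 2.0]
probs = [sigmoid(s) for s in states]

for s, p in zip(states, probs):
    print(f"s={s:.2f} -> sigmoid(s)={p:.4f}")

# The predicted probability of class 1 never falls below 0.5
print(min(probs) >= 0.5)
```

Because the smallest reachable probability is `sigmoid(0) = 0.5`, no choice of weights can push a single non-negative readout below the decision threshold.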

## The Solution: Two Competing Neurons with CrossEntropyLoss

We use **two SingleDendrite readout neurons** as direct class scores:

```
           ┌─► SingleDendrite₀ ─► s₀ (class 0 score)
Hidden ────┤
           └─► SingleDendrite₁ ─► s₁ (class 1 score)

           P(class=1) = softmax([s₀, s₁])[1] = exp(s₁) / (exp(s₀) + exp(s₁))
```

**Training**: CrossEntropyLoss on `[s₀, s₁]` - each neuron gets its own gradient signal.

**Hardware inference**: Simple comparison `s₁ > s₀ ?` - no softmax needed!
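
The comparison rule gives the same decision as thresholding the softmax probability at 0.5, since `exp` is monotonic. A minimal sketch in plain Python, using made-up score pairs and a hypothetical helper `p_class1`:

```python
import math

def p_class1(s0, s1):
    """Softmax probability of class 1 for the score pair [s0, s1]."""
    e0, e1 = math.exp(s0), math.exp(s1)
    return e1 / (e0 + e1)

# Hypothetical non-negative score pairs (illustrative values only)
pairs = [(0.10, 0.40), (0.35, 0.05), (0.00, 0.20), (0.30, 0.25)]

for s0, s1 in pairs:
    by_softmax = p_class1(s0, s1) > 0.5   # software decision
    by_compare = s1 > s0                  # hardware-style decision
    print(f"s0={s0:.2f} s1={s1:.2f} softmax={by_softmax} compare={by_compare}")
```

Softmax is therefore only needed during training, where the loss requires a differentiable probability.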

## Why CrossEntropyLoss?

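| Approach | Loss | Gradient to s₀ (per sample) | Gradient to s₁ (per sample) |
|----------|------|-----------------------------|-----------------------------|
| BCE on (s₁-s₀) | BCEWithLogitsLoss | -(σ(s₁-s₀) - y) | σ(s₁-s₀) - y |
| CrossEntropy | CrossEntropyLoss | p₀ - (1-y) | p₁ - y |

Here p = softmax([s₀, s₁]) and y ∈ {0, 1} is the label. For two classes, p₁ = σ(s₁-s₀) and p₀ = 1 - p₁, so both rows give the **same gradients**: the two neurons always receive equal and opposite signals. CrossEntropyLoss remains a convenient choice because it takes `[s₀, s₁]` directly as class scores and extends unchanged to more than two readout neurons.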
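
For the two-class case, the two rows of the table can be compared numerically. The sketch below uses PyTorch autograd on made-up scores and labels (they are assumptions for illustration, not outputs of the SOEN model) and computes both losses and their gradients with respect to s₀ and s₁.

```python
import torch
import torch.nn as nn

# Hypothetical scores for four samples; columns are [s0, s1]
scores = torch.tensor([[0.10, 0.40],
                       [0.35, 0.05],
                       [0.00, 0.20],
                       [0.30, 0.25]])
labels = torch.tensor([1, 0, 0, 1])

# Cross-entropy applied directly to the score pair
s_ce = scores.clone().requires_grad_(True)
loss_ce = nn.CrossEntropyLoss()(s_ce, labels)
loss_ce.backward()

# BCE-with-logits applied to the difference s1 - s0
s_bce = scores.clone().requires_grad_(True)
logit = s_bce[:, 1] - s_bce[:, 0]
loss_bce = nn.BCEWithLogitsLoss()(logit, labels.float())
loss_bce.backward()

# Same loss value, same gradients, and equal-and-opposite columns
print(torch.allclose(loss_ce.detach(), loss_bce.detach(), atol=1e-6))
print(torch.allclose(s_ce.grad, s_bce.grad, atol=1e-6))
print(torch.allclose(s_ce.grad[:, 0], -s_ce.grad[:, 1], atol=1e-6))
```

With two classes, `softmax([s₀, s₁])[1]` equals `sigmoid(s₁ - s₀)`, so the two losses and their gradients coincide, and the gradient reaching s₀ is the negative of the gradient reaching s₁.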

In [None]:
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

from soen_toolkit.core import (
    ConnectionConfig,
    LayerConfig,
    SimulationConfig,
    SOENModelCore,
)

torch.manual_seed(42)
np.random.seed(42)

print(f"PyTorch version: {torch.__version__}")

## 1. Generate Circle-in-Ring Dataset

Same nonlinear binary classification task as before.
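
The generator in the next cell rescales the raw coordinates with the affine map `(X + 1) / 2 * 0.25 + 0.025`. As a quick sanity check of that arithmetic, the sketch below evaluates the map at the ends and middle of `[-1, 1]`; the helper name `to_soen_range` is hypothetical and used only here.

```python
def to_soen_range(v):
    """Affine map used by the generator: [-1, 1] -> [0.025, 0.275]."""
    return (v + 1) / 2 * 0.25 + 0.025

# Endpoints and midpoint of the raw coordinate range
for v in (-1.0, 0.0, 1.0):
    print(f"{v:+.1f} -> {to_soen_range(v):.3f}")
```

Raw coordinates only stay inside `[-1, 1]` as long as the outer radius plus noise does not exceed 1, which holds comfortably for the default `outer_radius_max=0.8` and `noise=0.05` but is not enforced by the generator.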

In [None]:
def generate_circle_ring_data(n_samples=500, inner_radius=0.3, outer_radius_min=0.5, 
                               outer_radius_max=0.8, noise=0.05):
    """
    Generate 2D classification data: circle inside a ring.
    
    Class 0: Points inside inner circle (r < inner_radius)
    Class 1: Points in outer ring (outer_radius_min < r < outer_radius_max)
    """
    n_each = n_samples // 2
    
    # Class 0: Inner circle
    theta_inner = np.random.uniform(0, 2*np.pi, n_each)
    r_inner = np.random.uniform(0, inner_radius, n_each)
    x_inner = r_inner * np.cos(theta_inner) + np.random.normal(0, noise, n_each)
    y_inner = r_inner * np.sin(theta_inner) + np.random.normal(0, noise, n_each)
    
    # Class 1: Outer ring
    theta_outer = np.random.uniform(0, 2*np.pi, n_each)
    r_outer = np.random.uniform(outer_radius_min, outer_radius_max, n_each)
    x_outer = r_outer * np.cos(theta_outer) + np.random.normal(0, noise, n_each)
    y_outer = r_outer * np.sin(theta_outer) + np.random.normal(0, noise, n_each)
    
    # Combine
    X = np.vstack([
        np.column_stack([x_inner, y_inner]),
        np.column_stack([x_outer, y_outer])
    ])
    y = np.array([0] * n_each + [1] * n_each)
    
    # Shuffle
    idx = np.random.permutation(len(y))
    X, y = X[idx], y[idx]
    
    # Scale to SOEN operating range [0, 0.3]
    X = (X + 1) / 2 * 0.25 + 0.025  # Map [-1, 1] to [0.025, 0.275]
    
    return torch.FloatTensor(X), torch.FloatTensor(y)


# Generate data
N_SAMPLES = 500
X_data, y_data = generate_circle_ring_data(N_SAMPLES)

print(f"Dataset shape: X={X_data.shape}, y={y_data.shape}")
print(f"Class distribution: {(y_data == 0).sum().item()} inner, {(y_data == 1).sum().item()} outer")
print(f"X range: [{X_data.min():.3f}, {X_data.max():.3f}]")

# Visualize
plt.figure(figsize=(8, 8))
colors = ['blue', 'red']
for c in [0, 1]:
    mask = y_data == c
    label = 'Inner circle (class 0)' if c == 0 else 'Outer ring (class 1)'
    plt.scatter(X_data[mask, 0], X_data[mask, 1], c=colors[c], 
                alpha=0.6, s=30, label=label)
plt.xlabel('x')
plt.ylabel('y')
plt.title('Binary Classification: Circle vs Ring')
plt.legend()
plt.axis('equal')
plt.grid(True, alpha=0.3)
plt.show()

## 2. Prepare Data for SOEN
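
Each 2D point is presented as a constant input held for `SEQ_LEN` time steps. The sketch below shows the shape manipulation used in the next cell on three made-up points: `expand` creates a view that repeats the point along the time axis without copying, and `clone` then copies the data into its own storage.

```python
import torch

SEQ_LEN = 50  # same value as in the cell that follows

# Three hypothetical 2D points, already inside the input range
points = torch.tensor([[0.05, 0.10],
                       [0.15, 0.15],
                       [0.25, 0.20]])

view = points.unsqueeze(1).expand(-1, SEQ_LEN, -1)  # [3, 50, 2], stride 0 on the time axis
seq = view.clone()                                   # independent copy of the repeated data

print(seq.shape)
print(view.stride())
# The input is identical at the first and last time step
print(torch.equal(seq[:, 0, :], seq[:, -1, :]))
```

Cloning matters if the sequence is later modified in place, because writes to the expanded view would alias every time step of the same sample.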

In [None]:
SEQ_LEN = 50  # Time steps for SOEN dynamics to settle

# Expand to sequence: [N, T, 2]
X_seq = X_data.unsqueeze(1).expand(-1, SEQ_LEN, -1).clone()
y_labels = y_data.unsqueeze(1)  # [N, 1]

print(f"SOEN input shape: {X_seq.shape}")
print(f"Labels shape: {y_labels.shape}")

## 3. Model Builders: Linear vs SingleDendrite Readout

We create two builders:
1. **Linear readout**: Output layer is `Input` type (linear projection) with dim=1
2. **SingleDendrite readout**: Output layer is `SingleDendrite` with **dim=2** (two competing neurons)
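
The two builders differ only in the output layer, so the trainable parameter counts should differ only in the last connection. The sketch below estimates those counts under the assumption that the all-to-all connection weights are the only trainable parameters and carry no bias terms; the actual numbers reported by `count_params` may differ if the toolkit adds other learnable tensors. The helper `count_all_to_all_weights` is hypothetical.

```python
def count_all_to_all_weights(layer_dims):
    """Number of weights in a chain of all-to-all connections (no bias terms assumed)."""
    return sum(a * b for a, b in zip(layer_dims[:-1], layer_dims[1:]))

for hidden in ([4], [8], [8, 8]):
    linear = count_all_to_all_weights([2, *hidden, 1])   # single linear output
    two_sd = count_all_to_all_weights([2, *hidden, 2])   # two SingleDendrite outputs
    print(f"hidden={hidden}: linear={linear}, two-neuron={two_sd}")
```

Under this assumption the two-neuron readout costs one extra weight per neuron in the last hidden layer.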

In [None]:
def build_classifier_linear_readout(hidden_dims, input_dim=2, dt=50.0):
    """
    Build a SOEN classifier with LINEAR readout (Input layer).
    
    Architecture: 2 (input) → [hidden SingleDendrites] → 1 (linear readout)
    """
    sim_cfg = SimulationConfig(
        dt=dt,
        input_type="state",
        track_phi=False,
        track_power=False,
    )
    
    layers = []
    connections = []
    
    # Input layer (dim=2 for x, y)
    layers.append(LayerConfig(
        layer_id=0,
        layer_type="Input",
        params={"dim": input_dim},
    ))
    
    # Hidden layers (SingleDendrite)
    prev_dim = input_dim
    for i, hidden_dim in enumerate(hidden_dims):
        layer_id = i + 1
        
        layers.append(LayerConfig(
            layer_id=layer_id,
            layer_type="SingleDendrite",
            params={
                "dim": hidden_dim,
                "solver": "FE",
                "source_func": "Heaviside_fit_state_dep",
                "phi_offset": 0.02,
                "bias_current": 1.98,
                "gamma_plus": 0.0005,
                "gamma_minus": 1e-6,
                "learnable_params": {
                    "phi_offset": False,
                    "bias_current": False,
                    "gamma_plus": False,
                    "gamma_minus": False,
                },
            },
        ))
        
        connections.append(ConnectionConfig(
            from_layer=layer_id - 1,
            to_layer=layer_id,
            connection_type="all_to_all",
            learnable=True,
            params={"init": "xavier_uniform"},
        ))
        
        prev_dim = hidden_dim
    
    # Output layer - LINEAR (Input type)
    output_layer_id = len(hidden_dims) + 1
    layers.append(LayerConfig(
        layer_id=output_layer_id,
        layer_type="Input",  # Linear readout
        params={"dim": 1},
    ))
    
    connections.append(ConnectionConfig(
        from_layer=output_layer_id - 1,
        to_layer=output_layer_id,
        connection_type="all_to_all",
        learnable=True,
        params={"init": "xavier_uniform"},
    ))
    
    model = SOENModelCore(
        sim_config=sim_cfg,
        layers_config=layers,
        connections_config=connections,
    )
    
    return model


def build_classifier_singledendrite_readout(hidden_dims, input_dim=2, dt=50.0, 
                                             readout_phi_offset=0.23):
    """
    Build a SOEN classifier with TWO-NEURON SINGLEDENDRITE readout.
    
    Architecture: 2 (input) → [hidden SingleDendrites] → 2 (SingleDendrite readout)
    
    The two readout neurons compete: logit = s₁ - s₀
    This allows the output to be positive or negative, enabling both class predictions.
    """
    sim_cfg = SimulationConfig(
        dt=dt,
        input_type="state",
        track_phi=False,
        track_power=False,
    )
    
    layers = []
    connections = []
    
    # Input layer (dim=2 for x, y)
    layers.append(LayerConfig(
        layer_id=0,
        layer_type="Input",
        params={"dim": input_dim},
    ))
    
    # Hidden layers (SingleDendrite)
    prev_dim = input_dim
    for i, hidden_dim in enumerate(hidden_dims):
        layer_id = i + 1
        
        layers.append(LayerConfig(
            layer_id=layer_id,
            layer_type="SingleDendrite",
            params={
                "dim": hidden_dim,
                "solver": "FE",
                "source_func": "Heaviside_fit_state_dep",
                "phi_offset": 0.02,
                "bias_current": 1.98,
                "gamma_plus": 0.0005,
                "gamma_minus": 1e-6,
                "learnable_params": {
                    "phi_offset": False,
                    "bias_current": False,
                    "gamma_plus": False,
                    "gamma_minus": False,
                },
            },
        ))
        
        connections.append(ConnectionConfig(
            from_layer=layer_id - 1,
            to_layer=layer_id,
            connection_type="all_to_all",
            learnable=True,
            params={"init": "xavier_uniform"},
        ))
        
        prev_dim = hidden_dim
    
    # Output layer - TWO SingleDendrite neurons (competing)
    output_layer_id = len(hidden_dims) + 1
    layers.append(LayerConfig(
        layer_id=output_layer_id,
        layer_type="SingleDendrite",
        params={
            "dim": 2,  # TWO neurons: one for each class
            "solver": "FE",
            "source_func": "Heaviside_fit_state_dep",
            "phi_offset": readout_phi_offset,  # At threshold!
            "bias_current": 1.98,
            "gamma_plus": 0.0005,
            "gamma_minus": 1e-6,
            "learnable_params": {
                "phi_offset": False,
                "bias_current": False,
                "gamma_plus": False,
                "gamma_minus": False,
            },
        },
    ))
    
    connections.append(ConnectionConfig(
        from_layer=output_layer_id - 1,
        to_layer=output_layer_id,
        connection_type="all_to_all",
        learnable=True,
        params={"init": "xavier_uniform"},
    ))
    
    model = SOENModelCore(
        sim_config=sim_cfg,
        layers_config=layers,
        connections_config=connections,
    )
    
    return model


def count_params(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


# Test builders
print("Testing model builders...")
print("\nLinear readout models:")
for hidden_dims, name in [([4], "2→4→1"), ([8], "2→8→1")]:
    model = build_classifier_linear_readout(hidden_dims)
    n_params = count_params(model)
    layer_dims = [l.dim for l in model.layers]
    print(f"  {name}: dims={layer_dims}, params={n_params}")

print("\nSingleDendrite readout models (2 competing neurons):")
for hidden_dims, name in [([4], "2→4→2"), ([8], "2→8→2")]:
    model = build_classifier_singledendrite_readout(hidden_dims)
    n_params = count_params(model)
    layer_dims = [l.dim for l in model.layers]
    print(f"  {name}: dims={layer_dims}, params={n_params}")

## 4. Training Function

In [None]:
def train_classifier(model, X_train, y_train, n_epochs=300, lr=0.02, verbose=False, 
                     two_neuron_readout=False):
    """
    Train SOEN classifier.
    
    Args:
        two_neuron_readout: If True, use CrossEntropyLoss on [s₀, s₁] class scores.
                           If False, use BCEWithLogitsLoss on single output.
    """
    model.train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    
    # Choose loss based on readout type
    if two_neuron_readout:
        criterion = nn.CrossEntropyLoss()
        # Convert labels to class indices for CrossEntropyLoss
        y_target = y_train.squeeze(-1).long()  # [N] with values 0 or 1 (squeeze only the label axis, so N=1 still works)
    else:
        criterion = nn.BCEWithLogitsLoss()
        y_target = y_train  # [N, 1] float
    
    losses = []
    accuracies = []
    
    for epoch in range(n_epochs):
        optimizer.zero_grad()
        
        # Forward
        final_hist, _ = model(X_train)
        output = final_hist[:, -1, :]  # [N, output_dim]
        
        # Compute loss
        # Both readouts share the same call; only the shapes differ:
        #   two-neuron: output [N, 2] class scores, y_target [N] (softmax applied inside the loss)
        #   single:     output [N, 1] logit,        y_target [N, 1]
        loss = criterion(output, y_target)
        
        # Backward
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        
        # Metrics
        with torch.no_grad():
            if two_neuron_readout:
                # Prediction: argmax of [s₀, s₁]
                preds = output.argmax(dim=1)  # [N]
                acc = (preds == y_target).float().mean().item()
            else:
                preds = (torch.sigmoid(output) > 0.5).float()
                acc = (preds == y_target).float().mean().item()
        
        losses.append(loss.item())
        accuracies.append(acc)
        
        if verbose and (epoch + 1) % 50 == 0:
            print(f"  Epoch {epoch+1}: Loss={loss.item():.4f}, Acc={acc:.4f}")
    
    return losses, accuracies


def evaluate_classifier(model, X_test, y_test, two_neuron_readout=False):
    """
    Evaluate classifier and return predictions.
    """
    model.eval()
    with torch.no_grad():
        final_hist, _ = model(X_test)
        output = final_hist[:, -1, :]  # [N, output_dim]
        
        if two_neuron_readout:
            # Softmax probabilities
            probs_all = torch.softmax(output, dim=1)  # [N, 2]
            probs = probs_all[:, 1].numpy()  # P(class=1)
            preds = output.argmax(dim=1).numpy()  # [N]
        else:
            probs = torch.sigmoid(output).squeeze().numpy()
            preds = (probs > 0.5).astype(float)
    
    y_true = y_test.squeeze().numpy()
    accuracy = (preds == y_true).mean()
    
    return preds, probs, accuracy

## 5. Comparison: Linear vs SingleDendrite Readout

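We compare the same hidden architectures with the two readout types. The notebook makes no train/test split, so every accuracy reported below is measured on the training data.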

In [None]:
# Architectures to compare
HIDDEN_CONFIGS = {
    "4 hidden": [4],
    "8 hidden": [8],
    "16 hidden": [16],
    "4→4 deep": [4, 4],
    "8→8 deep": [8, 8],
}

N_EPOCHS = 400
LR = 0.02

results_linear = {}
results_sd = {}  # SingleDendrite readout

print("Training all architectures...")
print("=" * 70)

for name, hidden_dims in HIDDEN_CONFIGS.items():
    print(f"\n{name}:")
    
    # Linear readout (single neuron, no two_neuron_readout)
    model_linear = build_classifier_linear_readout(hidden_dims)
    n_params_linear = count_params(model_linear)
    losses_l, accs_l = train_classifier(model_linear, X_seq, y_labels, n_epochs=N_EPOCHS, lr=LR,
                                        two_neuron_readout=False)
    _, _, final_acc_l = evaluate_classifier(model_linear, X_seq, y_labels, two_neuron_readout=False)
    results_linear[name] = {
        'hidden_dims': hidden_dims,
        'n_params': n_params_linear,
        'losses': losses_l,
        'accuracies': accs_l,
        'final_acc': final_acc_l,
        'model': model_linear,
    }
    print(f"  Linear readout:       Acc={final_acc_l:.4f}, Params={n_params_linear}")
    
    # SingleDendrite readout (TWO competing neurons, use two_neuron_readout=True)
    model_sd = build_classifier_singledendrite_readout(hidden_dims, readout_phi_offset=0.23)
    n_params_sd = count_params(model_sd)
    losses_sd_vals, accs_sd_vals = train_classifier(model_sd, X_seq, y_labels, n_epochs=N_EPOCHS, lr=LR,
                                                     two_neuron_readout=True)  # KEY FIX!
    _, _, final_acc_sd = evaluate_classifier(model_sd, X_seq, y_labels, two_neuron_readout=True)
    results_sd[name] = {
        'hidden_dims': hidden_dims,
        'n_params': n_params_sd,
        'losses': losses_sd_vals,
        'accuracies': accs_sd_vals,
        'final_acc': final_acc_sd,
        'model': model_sd,
    }
    print(f"  SD readout (φ=0.23): Acc={final_acc_sd:.4f}, Params={n_params_sd}")

print("\n" + "=" * 70)
print("Training complete!")

## 6. Training Curves Comparison

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Top row: Loss curves
ax1, ax2 = axes[0]

ax1.set_title('Linear Readout - Loss')
for name, res in results_linear.items():
    ax1.plot(res['losses'], label=name, lw=1.5)
ax1.set_xlabel('Epoch')
ax1.set_ylabel('BCE Loss')
ax1.legend(fontsize=8)
ax1.grid(True, alpha=0.3)

ax2.set_title('SingleDendrite Readout (φ=0.23) - Loss')
for name, res in results_sd.items():
    ax2.plot(res['losses'], label=name, lw=1.5)
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Cross-Entropy Loss')  # the two-neuron readout is trained with CrossEntropyLoss
ax2.legend(fontsize=8)
ax2.grid(True, alpha=0.3)

# Bottom row: Accuracy curves
ax3, ax4 = axes[1]

ax3.set_title('Linear Readout - Accuracy')
for name, res in results_linear.items():
    ax3.plot(res['accuracies'], label=name, lw=1.5)
ax3.axhline(y=0.5, color='gray', linestyle='--', alpha=0.5)
ax3.axhline(y=1.0, color='red', linestyle='--', alpha=0.3)
ax3.set_xlabel('Epoch')
ax3.set_ylabel('Accuracy')
ax3.legend(fontsize=8)
ax3.grid(True, alpha=0.3)
ax3.set_ylim(0.4, 1.05)

ax4.set_title('SingleDendrite Readout (φ=0.23) - Accuracy')
for name, res in results_sd.items():
    ax4.plot(res['accuracies'], label=name, lw=1.5)
ax4.axhline(y=0.5, color='gray', linestyle='--', alpha=0.5)
ax4.axhline(y=1.0, color='red', linestyle='--', alpha=0.3)
ax4.set_xlabel('Epoch')
ax4.set_ylabel('Accuracy')
ax4.legend(fontsize=8)
ax4.grid(True, alpha=0.3)
ax4.set_ylim(0.4, 1.05)

plt.tight_layout()
plt.show()

## 7. Direct Comparison: Linear vs SingleDendrite Readout

In [None]:
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

names = list(HIDDEN_CONFIGS.keys())
x = np.arange(len(names))
width = 0.35

linear_accs = [results_linear[n]['final_acc'] for n in names]
sd_accs = [results_sd[n]['final_acc'] for n in names]

# Bar chart comparison
ax1 = axes[0]
bars1 = ax1.bar(x - width/2, linear_accs, width, label='Linear Readout', color='steelblue')
bars2 = ax1.bar(x + width/2, sd_accs, width, label='SingleDendrite Readout (φ=0.23)', color='coral')

ax1.axhline(y=0.5, color='gray', linestyle='--', alpha=0.5, label='Random')
ax1.set_xticks(x)
ax1.set_xticklabels(names, rotation=30, ha='right')
ax1.set_ylabel('Accuracy')
ax1.set_title('Final Accuracy: Linear vs SingleDendrite Readout')
ax1.legend()
ax1.set_ylim(0.4, 1.05)
ax1.grid(True, alpha=0.3, axis='y')

# Add value labels
for bar, acc in zip(bars1, linear_accs):
    ax1.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01, 
             f'{acc:.3f}', ha='center', va='bottom', fontsize=8)
for bar, acc in zip(bars2, sd_accs):
    ax1.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01, 
             f'{acc:.3f}', ha='center', va='bottom', fontsize=8)

# Difference plot
ax2 = axes[1]
diffs = [sd - lin for sd, lin in zip(sd_accs, linear_accs)]
colors = ['green' if d > 0 else 'red' for d in diffs]
bars = ax2.bar(x, diffs, color=colors, alpha=0.7)
ax2.axhline(y=0, color='black', linewidth=1)
ax2.set_xticks(x)
ax2.set_xticklabels(names, rotation=30, ha='right')
ax2.set_ylabel('Accuracy Difference (SD - Linear)')
ax2.set_title('SingleDendrite Readout Advantage')
ax2.grid(True, alpha=0.3, axis='y')

for bar, diff in zip(bars, diffs):
    sign = '+' if diff > 0 else ''
    y_pos = diff + 0.005 if diff > 0 else diff - 0.015
    ax2.text(bar.get_x() + bar.get_width()/2, y_pos, 
             f'{sign}{diff:.3f}', ha='center', va='bottom' if diff > 0 else 'top', fontsize=9)

plt.tight_layout()
plt.show()

## 8. Decision Boundary Visualization

In [None]:
def plot_decision_boundary(model, X_data, y_data, ax, title, resolution=100, 
                           two_neuron_readout=False):
    """Plot decision boundary for a 2D classifier."""
    x_min, x_max = X_data[:, 0].min() - 0.02, X_data[:, 0].max() + 0.02
    y_min, y_max = X_data[:, 1].min() - 0.02, X_data[:, 1].max() + 0.02
    
    xx, yy = np.meshgrid(
        np.linspace(x_min, x_max, resolution),
        np.linspace(y_min, y_max, resolution)
    )
    
    grid_points = np.c_[xx.ravel(), yy.ravel()]
    grid_tensor = torch.FloatTensor(grid_points)
    grid_seq = grid_tensor.unsqueeze(1).expand(-1, SEQ_LEN, -1).clone()
    
    model.eval()
    with torch.no_grad():
        final_hist, _ = model(grid_seq)
        output = final_hist[:, -1, :]  # [N, output_dim]
        
        if two_neuron_readout:
            # Softmax probabilities: P(class=1)
            probs = torch.softmax(output, dim=1)[:, 1].numpy()
        else:
            # Sigmoid probability
            probs = torch.sigmoid(output).squeeze().numpy()
    
    Z = probs.reshape(xx.shape)
    
    ax.contourf(xx, yy, Z, levels=50, cmap='RdBu', alpha=0.7)
    ax.contour(xx, yy, Z, levels=[0.5], colors='black', linewidths=2)
    
    for c, color in enumerate(['blue', 'red']):
        mask = y_data.squeeze() == c
        ax.scatter(X_data[mask, 0], X_data[mask, 1], c=color, 
                   s=15, alpha=0.5, edgecolors='white', linewidths=0.3)
    
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_title(title, fontsize=10)
    ax.set_aspect('equal')


X_np = X_data.numpy()
y_np = y_data.numpy()

# Compare decision boundaries for each architecture
n_configs = len(HIDDEN_CONFIGS)
fig, axes = plt.subplots(n_configs, 2, figsize=(12, 4*n_configs))

for idx, name in enumerate(HIDDEN_CONFIGS.keys()):
    ax_lin = axes[idx, 0]
    ax_sd = axes[idx, 1]
    
    res_lin = results_linear[name]
    res_sd = results_sd[name]
    
    plot_decision_boundary(
        res_lin['model'], X_np, y_np, ax_lin,
        f"{name} - Linear Readout\nAcc={res_lin['final_acc']:.3f}",
        two_neuron_readout=False
    )
    
    plot_decision_boundary(
        res_sd['model'], X_np, y_np, ax_sd,
        f"{name} - SD Readout (φ=0.23)\nAcc={res_sd['final_acc']:.3f}",
        two_neuron_readout=True
    )

plt.tight_layout()
plt.show()

## 9. Effect of Readout phi_offset

Let's test different `phi_offset` values for the readout layer to see how threshold positioning affects training.

In [None]:
# Test different phi_offset values for readout
PHI_OFFSETS = [0.02, 0.10, 0.15, 0.20, 0.23, 0.25, 0.30]
HIDDEN_DIM = [8]  # Use 8 hidden neurons

phi_results = {}

print("Testing different phi_offset values for readout layer...")
print("Hidden architecture: 2 → 8 → 2 (two competing SingleDendrite readout neurons)")
print("=" * 60)

for phi in PHI_OFFSETS:
    model = build_classifier_singledendrite_readout(HIDDEN_DIM, readout_phi_offset=phi)
    losses, accs = train_classifier(model, X_seq, y_labels, n_epochs=N_EPOCHS, lr=LR,
                                    two_neuron_readout=True)  # KEY: use two-neuron readout
    _, _, final_acc = evaluate_classifier(model, X_seq, y_labels, two_neuron_readout=True)
    
    phi_results[phi] = {
        'losses': losses,
        'accuracies': accs,
        'final_acc': final_acc,
    }
    print(f"  phi_offset={phi:.2f}: Final Accuracy = {final_acc:.4f}")

print("=" * 60)

In [None]:
fig, axes = plt.subplots(1, 3, figsize=(16, 4))

colors = plt.cm.viridis(np.linspace(0, 1, len(PHI_OFFSETS)))

# Loss curves
ax1 = axes[0]
for (phi, res), color in zip(phi_results.items(), colors):
    ax1.plot(res['losses'], label=f'φ={phi:.2f}', color=color, lw=1.5)
ax1.set_xlabel('Epoch')
ax1.set_ylabel('Cross-Entropy Loss')  # the phi_offset sweep uses the two-neuron readout
ax1.set_title('Training Loss by Readout phi_offset')
ax1.legend(fontsize=8)
ax1.grid(True, alpha=0.3)

# Accuracy curves
ax2 = axes[1]
for (phi, res), color in zip(phi_results.items(), colors):
    ax2.plot(res['accuracies'], label=f'φ={phi:.2f}', color=color, lw=1.5)
ax2.axhline(y=0.5, color='gray', linestyle='--', alpha=0.5)
ax2.set_xlabel('Epoch')
ax2.set_ylabel('Accuracy')
ax2.set_title('Training Accuracy by Readout phi_offset')
ax2.legend(fontsize=8)
ax2.grid(True, alpha=0.3)
ax2.set_ylim(0.4, 1.05)

# Final accuracy vs phi_offset
ax3 = axes[2]
final_accs = [phi_results[phi]['final_acc'] for phi in PHI_OFFSETS]
ax3.plot(PHI_OFFSETS, final_accs, 'o-', markersize=10, lw=2, color='steelblue')
ax3.axvline(x=0.23, color='red', linestyle='--', alpha=0.7, label='Threshold (0.23)')
ax3.axhline(y=0.5, color='gray', linestyle='--', alpha=0.5)
ax3.set_xlabel('Readout phi_offset')
ax3.set_ylabel('Final Accuracy')
ax3.set_title('Final Accuracy vs Readout phi_offset')
ax3.legend()
ax3.grid(True, alpha=0.3)
ax3.set_ylim(0.4, 1.05)

plt.tight_layout()
plt.show()

## 10. Summary Table

In [None]:
import pandas as pd

# Summary comparison
summary_data = []
for name in HIDDEN_CONFIGS.keys():
    res_lin = results_linear[name]
    res_sd = results_sd[name]
    
    summary_data.append({
        'Architecture': name,
        'Hidden Neurons': sum(res_lin['hidden_dims']),
        'Params (Linear)': res_lin['n_params'],
        'Params (SD)': res_sd['n_params'],
        'Acc (Linear)': f"{res_lin['final_acc']:.4f}",
        'Acc (SD φ=0.23)': f"{res_sd['final_acc']:.4f}",
        'Difference': f"{res_sd['final_acc'] - res_lin['final_acc']:+.4f}",
    })

df = pd.DataFrame(summary_data)

print("=" * 100)
print("COMPARISON: LINEAR vs SINGLEDENDRITE READOUT")
print("=" * 100)
print("\nTask: Binary classification (Circle vs Ring)")
print("SingleDendrite readout: phi_offset = 0.23 (at threshold)")
print(f"Training epochs: {N_EPOCHS}")
print()
print(df.to_string(index=False))
print("=" * 100)

## 11. Conclusions

In [None]:
print("=" * 70)
print("CONCLUSIONS")
print("=" * 70)

# Analyze results
linear_best = max(results_linear.values(), key=lambda x: x['final_acc'])
sd_best = max(results_sd.values(), key=lambda x: x['final_acc'])

print("\n1. READOUT COMPARISON:")
print(f"   Best Linear Readout:       {linear_best['final_acc']:.4f}")
print(f"   Best SD Readout (φ=0.23):  {sd_best['final_acc']:.4f}")

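# Count wins (ties are reported separately instead of being credited to the linear readout)
sd_wins = sum(1 for n in HIDDEN_CONFIGS
              if results_sd[n]['final_acc'] > results_linear[n]['final_acc'])
linear_wins = sum(1 for n in HIDDEN_CONFIGS
                  if results_linear[n]['final_acc'] > results_sd[n]['final_acc'])
ties = len(HIDDEN_CONFIGS) - sd_wins - linear_wins

print("\n2. WIN COUNT:")
print(f"   SingleDendrite readout wins: {sd_wins}/{len(HIDDEN_CONFIGS)}")
print(f"   Linear readout wins: {linear_wins}/{len(HIDDEN_CONFIGS)}")
print(f"   Ties: {ties}/{len(HIDDEN_CONFIGS)}")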

print("\n3. EFFECT OF READOUT phi_offset:")
best_phi = max(phi_results.keys(), key=lambda x: phi_results[x]['final_acc'])
worst_phi = min(phi_results.keys(), key=lambda x: phi_results[x]['final_acc'])
print(f"   Best phi_offset:  {best_phi:.2f} (Acc={phi_results[best_phi]['final_acc']:.4f})")
print(f"   Worst phi_offset: {worst_phi:.2f} (Acc={phi_results[worst_phi]['final_acc']:.4f})")
print(f"   phi=0.23 (threshold): Acc={phi_results[0.23]['final_acc']:.4f}")

print("\n4. HARDWARE IMPLICATIONS:")
print("   • SingleDendrite readout is more hardware-faithful")
print("   • phi_offset=0.23 places neuron at threshold for easier training")
print("   • Linear readout requires idealized hardware (perfect linear response)")

print("\n5. KEY INSIGHT:")
if sd_wins >= linear_wins:
    print("   ✓ SingleDendrite readout performs comparably or better than linear")
    print("   ✓ Hardware-faithful architectures are viable!")
else:
    print("   Linear readout has an advantage for this task")
    print("   But SingleDendrite is still viable with proper phi_offset tuning")

print("\n" + "=" * 70)

## 12. Visualize Output Dynamics

Let's visualize how the states of the two SingleDendrite readout neurons evolve over time.

In [None]:
# Select best SingleDendrite readout model
best_sd_name = max(results_sd.keys(), key=lambda x: results_sd[x]['final_acc'])
best_model = results_sd[best_sd_name]['model']

# Get a few samples from each class
class0_idx = torch.where(y_data == 0)[0][:3]
class1_idx = torch.where(y_data == 1)[0][:3]

sample_idx = torch.cat([class0_idx, class1_idx])
X_samples = X_seq[sample_idx]
y_samples = y_data[sample_idx]

# Forward pass to get full history
best_model.eval()
with torch.no_grad():
    output_hist, layer_states = best_model(X_samples)

# Plot output dynamics - now we have TWO neurons
fig, axes = plt.subplots(1, 3, figsize=(15, 4))

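# Take the time axis from the history itself so it matches whatever length the model returns
time_steps = np.arange(output_hist.shape[1])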

# Left: Neuron 0 (s₀) dynamics
ax1 = axes[0]
for i, (idx, y_val) in enumerate(zip(sample_idx, y_samples)):
    s0 = output_hist[i, :, 0].numpy()  # First neuron
    color = 'blue' if y_val == 0 else 'red'
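    # Label the first curve of each class (samples are ordered class 0 first, then class 1)
    label = f'Class {int(y_val)}' if i in (0, len(class0_idx)) else None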
    ax1.plot(time_steps, s0, color=color, alpha=0.7, lw=1.5, label=label)
ax1.set_xlabel('Time Step')
ax1.set_ylabel('State s₀ (class 0 score)')
ax1.set_title('Neuron 0 (class 0 detector)')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Middle: Neuron 1 (s₁) dynamics
ax2 = axes[1]
for i, (idx, y_val) in enumerate(zip(sample_idx, y_samples)):
    s1 = output_hist[i, :, 1].numpy()  # Second neuron
    color = 'blue' if y_val == 0 else 'red'
    ax2.plot(time_steps, s1, color=color, alpha=0.7, lw=1.5)
ax2.set_xlabel('Time Step')
ax2.set_ylabel('State s₁ (class 1 score)')
ax2.set_title('Neuron 1 (class 1 detector)')
ax2.grid(True, alpha=0.3)

# Right: Softmax probability P(class=1) over time
ax3 = axes[2]
for i, (idx, y_val) in enumerate(zip(sample_idx, y_samples)):
    s0 = output_hist[i, :, 0]
    s1 = output_hist[i, :, 1]
    # Compute softmax P(class=1) = exp(s1) / (exp(s0) + exp(s1))
    probs = torch.softmax(torch.stack([s0, s1], dim=1), dim=1)[:, 1].numpy()
    color = 'blue' if y_val == 0 else 'red'
    ax3.plot(time_steps, probs, color=color, alpha=0.7, lw=1.5)
ax3.axhline(y=0.5, color='black', linestyle='--', alpha=0.5, label='Decision boundary')
ax3.set_xlabel('Time Step')
ax3.set_ylabel('P(class=1)')
ax3.set_title('Softmax Probability Over Time')
ax3.legend()
ax3.grid(True, alpha=0.3)
ax3.set_ylim(-0.05, 1.05)

plt.suptitle(f'Two-Neuron Readout Dynamics ({best_sd_name}) - CrossEntropyLoss\nBlue=Class 0 (inner), Red=Class 1 (outer)', 
             fontsize=12, y=1.02)
plt.tight_layout()
plt.show()

print(f"\nFinal output states (two competing neurons with softmax):")
print(f"{'Sample':<8} {'True':<6} {'s₀':<10} {'s₁':<10} {'P(cls0)':<10} {'P(cls1)':<10} {'Pred':<6} {'Status'}")
print("-" * 76)
for i, (idx, y_val) in enumerate(zip(sample_idx, y_samples)):
    s0 = output_hist[i, -1, 0].item()
    s1 = output_hist[i, -1, 1].item()
    probs = torch.softmax(torch.tensor([s0, s1]), dim=0)
    p0, p1 = probs[0].item(), probs[1].item()
    pred = 1 if p1 > p0 else 0
    status = '✓' if pred == y_val else '✗'
    print(f"{i:<8} {int(y_val):<6} {s0:<10.4f} {s1:<10.4f} {p0:<10.4f} {p1:<10.4f} {pred:<6} {status}")