# Creating Custom Metrics in NeuroBench: A Deep Dive

This workshop guide explains how to create custom metrics in the NeuroBench framework, using the AverageActivity metric as a practical example. We'll explore how to implement metrics that can analyze neural network behavior, particularly focusing on spiking neural networks.

## Understanding Metric Types in NeuroBench

NeuroBench supports three types of metrics:

1. **Static Metrics**: Evaluate fixed properties of the model (e.g., parameter count)
2. **Workload Metrics**: Evaluate performance during inference
3. **Accumulated Metrics**: A special type of workload metric that accumulates statistics across batches

In this workshop, we'll focus on creating an AccumulatedMetric. It is particularly useful for analyzing spike-related information because it lets us gather statistics across multiple batches of data and compute the desired metrics at the end.
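
The accumulate-then-compute pattern can be illustrated without NeuroBench at all. The sketch below is a minimal stand-in, not the framework's `AccumulatedMetric` base class: the class name `RunningMean` is hypothetical and only mirrors the `__call__` / `compute` / `reset` split used later in this guide.

```python
class RunningMean:
    """Toy accumulator (hypothetical, not part of NeuroBench)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def __call__(self, batch):
        # Called once per batch: only update the running state
        self.total += sum(batch)
        self.count += len(batch)

    def compute(self):
        # Called once at the end: turn the state into the final value
        return self.total / self.count if self.count else 0.0

    def reset(self):
        # Clear the state so the metric can be reused
        self.total = 0.0
        self.count = 0


metric = RunningMean()
for batch in ([1.0, 2.0], [3.0], [4.0, 5.0, 6.0]):
    metric(batch)

# Mean over all six values, which differs from the mean of the batch means
overall_mean = metric.compute()
```

Because the batches have different sizes, averaging the per-batch means would weight the samples unevenly; accumulating the raw sums avoids that.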

## The AverageActivity Metric: A Case Study

Let's examine how to create a metric that analyzes the activity patterns of neurons across different layers in a spiking neural network.

We are interested in the average activity of each neuron, summarized per layer. For every layer, we want the min, max, mean, median, and the first and third quartiles (q1 and q3) of its neurons' activity.
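
As a small worked example of these summary statistics, the sketch below applies NumPy to made-up per-neuron activity values for a single layer. The numbers are hypothetical and only serve to show which call produces which statistic.

```python
import numpy as np

# Hypothetical per-neuron activity of one layer:
# the fraction of time steps in which each neuron spiked
activity = np.array([0.0, 0.1, 0.2, 0.3, 0.4])

summary = {
    "min": float(activity.min()),
    "max": float(activity.max()),
    "mean": float(activity.mean()),
    "median": float(np.median(activity)),
    "q1": float(np.percentile(activity, 25)),  # first quartile
    "q3": float(np.percentile(activity, 75)),  # third quartile
}
```

A neuron that never fires shows up here as a minimum of 0, which is one way such a summary can hint at dead neurons.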

### 1. Setting Up the Metric Class

First, import the libraries we need and the metric base class. Because we are creating an AccumulatedMetric, we import the base class from `neurobench.metrics.abstract.workload_metric`.

In [1]:
import torch
import numpy as np
import matplotlib.pyplot as plt
from collections import defaultdict
from typing import Dict, List, Tuple

from neurobench.metrics.abstract.workload_metric import AccumulatedMetric


Now define the class for our metric, inheriting from the AccumulatedMetric base class.

Key points:
- Inherit from `AccumulatedMetric` for batch-wise accumulation
- Set `requires_hooks=True` to access activation information
- Initialize dictionaries to store statistics across batches

#### Understanding Hooks

Hooks are essential for accessing spike information after running a model. They allow us to intercept and record activation values during the forward pass. Both layer and neuron hooks are supported. We can access the data being passed in each layer or activation and the output values of each layer or activation.

When we set `requires_hooks=True`, hooks are automatically added to the model. Because we are interested in activations, we access them through the model's `activation_hooks` attribute. Optionally, we can also add hooks manually.
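
To make the idea concrete, the sketch below records activations with plain PyTorch forward hooks. It is not NeuroBench's hook implementation: the `Recorder` class is hypothetical and only imitates the `activation_outputs` list used later in this guide, while `register_forward_hook` is the standard `torch.nn.Module` API.

```python
import torch
import torch.nn as nn


class Recorder:
    """Hypothetical stand-in for a hook object that stores layer outputs."""

    def __init__(self):
        self.activation_outputs = []

    def __call__(self, module, inputs, output):
        # PyTorch calls this after every forward pass of the hooked module
        self.activation_outputs.append(output.detach())


net = nn.Sequential(nn.Linear(4, 3), nn.ReLU())
recorder = Recorder()
handle = net[1].register_forward_hook(recorder)  # hook the activation layer

for _ in range(5):  # five forward passes, e.g. five time steps
    net(torch.rand(2, 4))

handle.remove()  # detach the hook when done

# One recorded output per forward pass, stacked into shape (5, 2, 3)
stacked = torch.stack(recorder.activation_outputs, dim=0)
```

Stacking the recorded outputs along a new first dimension is the same step the metric below performs on each hook's `activation_outputs`.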

In [None]:
class AverageActivityMetric(AccumulatedMetric):
    """
    A workload metric that computes activity statistics for each layer in a neural network.
    For each layer, it tracks the min, max, mean, median, and q1 and q3 of the activity of its neurons.
    
    This metric is particularly useful for analyzing:
    1. Layer-wise activation patterns
    2. Neuron activity distribution
    3. Potential dead neurons or saturation
    4. Layer-specific sparsity patterns
    """
    
    def __init__(self):
        super().__init__(requires_hooks=True)
        self.layer_activities = defaultdict(list)  # Store activities for each layer
        self.num_batches = 0
        self.all_spikes = {}
        self.min_activities = {}
        self.max_activities = {}
        self.mean_activities = {}
        self.q1_activities = {}
        self.q3_activities = {}
  
    def _get_layer_activities(self, model) -> None:
        """
        Collect the recorded activation outputs of every hooked layer and
        append them to self.all_spikes.
        This method should be customized based on the model architecture.
        
        Args:
            model: The neural network model
        """
        for i, hook in enumerate(model.activation_hooks):
            # Skip hooks that recorded nothing for this batch
            if not hook.activation_outputs:
                continue

            # Stack the outputs recorded for this batch into one tensor
            # (assumed here to be one entry per time step)
            batch_spikes = torch.stack(hook.activation_outputs, dim=0)
            self.all_spikes.setdefault(f'activation_layer_{i+1}', []).append(batch_spikes)
    
    def __call__(self, model, preds, data):
        """
        Accumulate the spikes recorded for the current batch.
        
        Args:
            model: The neural network model
            preds: Model predictions (not used in this metric)
            data: Input data (not used in this metric)

        Returns:
            The statistics computed over all batches seen so far (see compute()).
        """
        # Get activities for all layers
        self._get_layer_activities(model)
        
        self.num_batches += 1

        return self.compute()
    
    def compute(self) -> Dict[str, Dict[str, np.ndarray]]:
        """
        Compute the final activity statistics for each layer.
        
        Returns:
            Dictionary containing statistics for each layer:
            {
                'layer_name': {
                    'min', 'max', 'mean', 'std', 'median', 'q1', 'q3':
                        summary statistics of the per-neuron activity,
                    'histogram': array with the activity of every neuron
                }
            }
        """
        results = {}
        
        for layer_name, spikes in self.all_spikes.items():
            # concatenate all batches along the sample dimension (dim 1)
            spikes = torch.cat(spikes, dim=1)
            # average over the sequence (dim 0) first, then over all samples
            spike_percentages = ((spikes.sum(dim=0)/spikes.shape[0]).sum(dim=0)/spikes.shape[1]).cpu().detach().numpy()
            results[layer_name] = {
                'min': spike_percentages.min(),
                'max': spike_percentages.max(),
                'mean': spike_percentages.mean(),
                'std': spike_percentages.std(),
                'median': np.median(spike_percentages),
                'q1': np.percentile(spike_percentages, 25),
                'q3': np.percentile(spike_percentages, 75),
                'histogram': spike_percentages
            }
        return results
    
    def reset(self):
        """Reset the accumulated statistics."""
        self.layer_activities.clear()
        self.num_batches = 0
        # self.all_spikes = {}
    
    @classmethod
    def plot_activity_distributions(cls, results: Dict[str, Dict[str, np.ndarray]]):
        """
        Plot boxplots of activity distributions for each layer side by side on one figure.
        
        Args:
            results: Results from compute() method
        """
        # Create a single figure
        plt.figure(figsize=(12, 6))
        
        results = results['AverageActivityMetric']
        # Prepare data for plotting
        layer_names = list(results.keys())
        activity_data = [results[layer]['histogram'] for layer in layer_names]
        
        # Create boxplot
        bp = plt.boxplot(activity_data, 
                        labels=layer_names,
                        vert=True,
                        widths=0.5,
                        showmeans=True,
                        meanline=True,
                        patch_artist=True)
        
        # Customize the plot
        plt.title('Average Activity of Neurons per Layer')
        plt.xlabel('Layer')
        plt.ylabel('Activity')
        plt.ylim(0, 0.5)
        
        # Rotate x-axis labels for better readability
        plt.xticks(rotation=45, ha='right')
        
        # Add grid for better readability
        plt.grid(True, axis='y', linestyle='--', alpha=0.7)
        
        # Adjust layout to prevent label cutoff
        plt.tight_layout()
        
        plt.show()
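
The reduction inside `compute` is easier to follow on a tiny tensor. The sketch below assumes, as the stacking and concatenation in the class suggest, that each stored batch has the shape `(time steps, batch size, neurons)`. The spike values are made up.

```python
import torch

# Two hypothetical batches: 4 time steps, 3 neurons, batch sizes 2 and 1
batch_a = torch.zeros(4, 2, 3)
batch_b = torch.zeros(4, 1, 3)
batch_a[:, :, 0] = 1.0   # neuron 0 spikes at every step for both samples of batch_a
batch_a[:2, 0, 1] = 1.0  # neuron 1 spikes at 2 of 4 steps, for one sample only
# neuron 2 never spikes

# Join the batches along the sample axis -> shape (4, 3, 3)
spikes = torch.cat([batch_a, batch_b], dim=1)

# Average over time steps -> shape (3 samples, 3 neurons)
per_sample = spikes.sum(dim=0) / spikes.shape[0]

# Average over samples -> one activity value per neuron, shape (3,)
per_neuron = per_sample.sum(dim=0) / spikes.shape[1]
```

Concatenating along the sample axis lets batches of different sizes contribute in proportion to their number of samples, and a neuron that stays silent ends up with an activity of exactly 0.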

### Visualization Support

We have included a class method that draws boxplots of each layer's activity distribution, side by side in one figure. It also illustrates a general point: any data you need after the benchmark has run must be included in the dictionary returned by the metric's `__call__` method, which is why the per-neuron values are kept under the `histogram` key.
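
The layout that the plotting method reads can be sketched with a hand-built dictionary. The layer names follow the `activation_layer_{i}` pattern from the class above, but the values are invented and the other statistics are left out for brevity.

```python
import numpy as np

# Hypothetical benchmark output with made-up values,
# shaped like what plot_activity_distributions reads
results = {
    "AverageActivityMetric": {
        "activation_layer_1": {"histogram": np.array([0.05, 0.10, 0.00])},
        "activation_layer_2": {"histogram": np.array([0.20, 0.15])},
    }
}

metric_results = results["AverageActivityMetric"]
layer_names = list(metric_results.keys())

# One array of per-neuron activities per layer: the input for one boxplot each
activity_data = [metric_results[name]["histogram"] for name in layer_names]
```

Layers may have different numbers of neurons, so `activity_data` is a list of arrays rather than a single 2-D array.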

### 2. Using the Metric in a Benchmark

We build on the GSC example to show how to use the metric in a benchmark.


In [2]:
import os
import torch

from torch.utils.data import DataLoader

from neurobench.datasets import SpeechCommands
from neurobench.processors.preprocessors import S2SPreProcessor
from neurobench.processors.postprocessors import ChooseMaxCount

from neurobench.models import SNNTorchModel
from neurobench.benchmarks import Benchmark

from neurobench.metrics.workload import (
    ActivationSparsity,
    SynapticOperations,
    ClassificationAccuracy
)
from neurobench.metrics.static import (
    Footprint,
    ConnectionSparsity,
)
from average_activity_metric import AverageActivityMetric

from examples.gsc.SNN import net
from pprint import pprint
import pathlib
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

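# __file__ is undefined inside a notebook; the fallback assumes the notebook
# is run from the directory that would otherwise contain this file
try:
    file_path = pathlib.Path(__file__).parent.parent
except NameError:
    file_path = pathlib.Path.cwd().parent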
model_path = os.path.join(file_path, "gsc/model_data/s2s_gsc_snntorch")
data_dir = os.path.join(file_path, "../../data/speech_commands") # data in repo root dir

test_set = SpeechCommands(path=data_dir, subset="testing")

test_set_loader = DataLoader(test_set, batch_size=500, shuffle=True)

net.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))

## Define model ##
model = SNNTorchModel(net)

preprocessors = [S2SPreProcessor(device=device)]
postprocessors = [ChooseMaxCount()]

static_metrics = [Footprint, ConnectionSparsity]
workload_metrics = [AverageActivityMetric]

benchmark = Benchmark(model, test_set_loader, preprocessors, postprocessors, [static_metrics, workload_metrics])
results = benchmark.run(device=device)
pprint(results)

# plot activity distributions
AverageActivityMetric.plot_activity_distributions(results)

# Results:
# {'Footprint': 583900, 'ConnectionSparsity': 0.0, 
# 'ClassificationAccuracy': 0.85633802969095, 'ActivationSparsity': 0.9668664144456199, 
# 'SynapticOperations': {'Effective_MACs': 0.0, 'Effective_ACs': 3289834.3206724217, 'Dense': 29030400.0}}

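Note: `__file__` is only defined when Python runs a script file. Evaluating it in a notebook cell raises `NameError: name '__file__' is not defined`.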

## Key Takeaways

1. **Accumulated Metrics**:
   - Well suited to analyzing spiking neural networks
   - Allow gathering statistics across multiple batches
   - Can use hooks (`requires_hooks=True`) to access activation information

2. **Hook System**:
   - Provides access to spike information
   - Enables analysis of temporal dynamics
   - Works with various spiking neural network architectures

3. **Metric Structure**:
   - `__init__`: Set up hooks and storage
   - `__call__`: Process each batch
   - `compute`: Calculate final statistics
   - `reset`: Clear accumulated data
