# Mackey Glass Benchmark Tutorial

This tutorial gives an overview of how the NeuroBench framework is organized and how you can use it to benchmark your own models!

## About Mackey Glass:
The Mackey Glass task is a chaotic function prediction task. Unlike the other tasks in NeuroBench, the Mackey Glass dataset is synthetic. Real-world data can be high-dimensional and require large networks to achieve high accuracy, which is a challenge for solution types with limited I/O support and network capacity, such as mixed-signal prototype solutions. A small, one-dimensional synthetic task is more accessible to such solutions.
### Dataset:
The Mackey Glass dataset is generated from a one-dimensional non-linear time-delay differential equation, in which the evolution of the signal can be altered by a number of parameters:
$$ \frac{dx}{dt} = \frac{\beta x(t-\tau)}{1 + x(t-\tau)^n} - \gamma x(t) $$
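To make the equation concrete, here is a minimal sketch that integrates it with a fixed-step Euler scheme in plain Python. The function name `mackey_glass`, the values `beta=0.2`, `gamma=0.1`, `n=10` (values commonly used in the literature), the step size, and the constant initial history are all assumptions for illustration. This is not how the NeuroBench series were generated, and it will not reproduce them exactly.

```python
def mackey_glass(tau=17.0, beta=0.2, gamma=0.1, n=10, dt=0.1, steps=5000, x0=1.2):
    """Integrate the Mackey Glass delay equation with a fixed-step Euler scheme.

    All default values are illustrative assumptions, not NeuroBench settings.
    """
    delay = int(round(tau / dt))      # number of steps that span the delay tau
    history = [x0] * (delay + 1)      # assumed constant history for t <= 0
    for _ in range(steps):
        x_now = history[-1]               # x(t)
        x_delayed = history[-1 - delay]   # x(t - tau)
        dx = beta * x_delayed / (1.0 + x_delayed ** n) - gamma * x_now
        history.append(x_now + dt * dx)   # Euler update
    return history[delay + 1:]        # drop the artificial initial history


series = mackey_glass(tau=17.0)
print(len(series), min(series), max(series))
```

Changing `tau` in this sketch changes the delay and therefore the character of the resulting signal, which is the parameter the NeuroBench series vary.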

For the NeuroBench task, fourteen time series are generated from the Mackey Glass function, available at this link: https://huggingface.co/datasets/NeuroBench/mackey_glass. The series differ in the $\tau$ parameter, which ranges from 17 to 30.

### Benchmark Task:
The task is a sequence-to-sequence prediction problem, similar to the non-human primate motor prediction task, which is also included in NeuroBench. The task involves predicting the next timestep value $f(t+\Delta t)$, from the current value $f(t)$. Models are trained on the first half of the time series, during which the ground truth state $f(t)$ is provided to the model to make its prediction $f'(t+\Delta t)$. During evaluation on the second half of the sequence, the model uses its prior prediction $f'(t)$ in order to generate each next value $f'(t+\Delta t)$, autoregressively generating the second half of the time series.
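The difference between the two regimes described above can be sketched in a few lines of plain Python. The helper names `teacher_forced` and `rollout`, and the `toy_model` stand-in, are hypothetical and exist only for illustration. The NeuroBench harness and the ESN handle this internally.

```python
def teacher_forced(step_fn, ground_truth):
    """Training-style prediction: each output is computed from the true f(t)."""
    return [step_fn(x) for x in ground_truth]


def rollout(step_fn, first_input, n_steps):
    """Evaluation-style prediction: each output f'(t + dt) is fed back as the next input."""
    predictions = []
    current = first_input
    for _ in range(n_steps):
        current = step_fn(current)   # the prediction becomes the next input
        predictions.append(current)
    return predictions


def toy_model(x):
    """Hypothetical one-step predictor, standing in for a trained model."""
    return 0.5 * x + 1.0


print(teacher_forced(toy_model, [0.0, 2.0, 4.0]))
print(rollout(toy_model, 0.0, 3))
```

In the closed-loop case, any error in one prediction is carried into every later step, which is why autoregressive evaluation on a chaotic series is demanding.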

Symmetric mean absolute percentage error (sMAPE) is used to evaluate the correctness of the model's prediction. The length of the time series depends on the Lyapunov time, which is listed with the dataset.
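As a rough illustration of the metric, the sketch below implements one common sMAPE definition, which ranges from 0 (perfect) to 200. The function name `smape` and the handling of zero denominators are assumptions. The exact formula used by NeuroBench's `sMAPE` metric may differ in its details.

```python
def smape(targets, predictions):
    """One common sMAPE definition: 200 * mean(|y - p| / (|y| + |p|)).

    Terms where both values are zero contribute nothing (an assumption made here).
    """
    total = 0.0
    for y, p in zip(targets, predictions):
        denom = abs(y) + abs(p)
        if denom > 0.0:
            total += abs(y - p) / denom
    return 200.0 * total / len(targets)


print(smape([1.0, 2.0], [1.0, 2.0]))   # identical sequences
print(smape([1.0], [3.0]))             # prediction three times the target
```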

First, we import the relevant libraries. These include the dataset, model wrapper, and benchmark object.

In [None]:
import torch

from torch.utils.data import Subset, DataLoader

import pandas as pd

from neurobench.datasets import MackeyGlass
from neurobench.models import TorchModel
from neurobench.benchmarks import Benchmark

For this tutorial, we will make use of an Echo State Network (ESN) model architecture.

In [None]:
# this is the network we will be using in this tutorial
from examples.mackey_glass.echo_state_network import EchoStateNetwork

Next, we load the hyperparameters of the ESN that were found using a random grid search.

In [None]:
esn_parameters = pd.read_csv("./model_data/echo_state_network_hyperparameters.csv")

The Mackey Glass task contains 14 series of varying complexity. For simplicity, this tutorial uses only the first series, `tau=17`. The dataset is downloaded if it is not already available.

In [None]:
tau = 17
# data in repo root dir
file_path = "../../../data/mackey_glass/mg_17.npy"

# Load the time series from the file path above
mg = MackeyGlass(file_path = file_path)

# Split test and train set
train_set = Subset(mg, mg.ind_train)
test_set = Subset(mg, mg.ind_test)

Each element of the time series is treated as a separate sample. The sample is passed to the model, which predicts the next sample. The data is shaped as `[len_time_series, 1, 1]` to fit the three-dimensional data format standard of NeuroBench.
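The sample layout can be pictured with nested lists. This is a sketch under the assumption that each input is paired with the value one step later. The helper `to_samples` is hypothetical and is not part of the `MackeyGlass` dataset class, which builds its own tensors.

```python
def to_samples(series):
    """Pair each value with its successor and wrap both as [1, 1] samples."""
    inputs = [[[x]] for x in series[:-1]]    # layout: [len - 1, 1, 1]
    targets = [[[y]] for y in series[1:]]    # next-step value for each input
    return inputs, targets


inputs, targets = to_samples([0.1, 0.2, 0.3, 0.4])
print(len(inputs), inputs[0], targets[0])
```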

In [None]:
train_data, train_labels = train_set[:]
train_data.shape

Instantiate the model with the searched hyperparameters and fit the model.

In [None]:
# Index of the hyperparameters for the current time series
ind_tau = esn_parameters.index[esn_parameters['tau'] == tau].tolist()[0]

## Fitting Model ##
# Load the model with the parameters loaded from esn_parameters
esn = EchoStateNetwork(in_channels=1, 
    reservoir_size = esn_parameters['reservoir_size'][ind_tau], 
    input_scale = torch.tensor([esn_parameters['scale_bias'][ind_tau], esn_parameters['scale_input'][ind_tau],],dtype = torch.float64), 
    connect_prob = esn_parameters['connect_prob'][ind_tau], 
    spectral_radius = esn_parameters['spectral_radius'][ind_tau],
    leakage = esn_parameters['leakage'][ind_tau], 
    ridge_param = esn_parameters['ridge_param'][ind_tau])

esn.train()
train_data, train_labels = train_set[:] # outputs (batch, 1, 1)
warmup = 0.6 # in Lyapunov times
warmup_pts = round(warmup*mg.pts_per_lyaptime)
train_labels = train_labels[warmup_pts:]
esn.fit(train_data, train_labels, warmup_pts)

No pre- or post-processors are used in this task.

In [None]:
preprocessors = []
postprocessors = []

Next, specify the metrics to calculate. For this task, sMAPE is used to evaluate correctness.

In [None]:
static_metrics = ["footprint", "connection_sparsity"]
workload_metrics = ["sMAPE", "activation_sparsity", "synaptic_operations"]

The test set is wrapped in a DataLoader. Importantly, shuffle should be False, since the samples of the time series should be passed in order.

In [None]:
test_set_loader = DataLoader(test_set, batch_size=mg.testtime_pts, shuffle=False)

# Wrap the model
model = TorchModel(esn)

# Pass the (empty) processor lists defined above, followed by the metrics
benchmark = Benchmark(model, test_set_loader, preprocessors, postprocessors, [static_metrics, workload_metrics])
results = benchmark.run()
print(results)

Expected results:
{'footprint': 7488448, 'connection_sparsity': 0.0297, 'sMAPE': 10.893483007394062, 'activation_sparsity': 0.0, 'synaptic_operations': {'Effective_MACs': 908294.0, 'Effective_ACs': 0.0, 'Dense': 936056.0}}

Note that due to the ESN fit functionality being dependent on lower-level arithmetic libraries, your results may be different on a different machine.