# XOR: A minimalistic regression tutorial for LAVA-DL
by [Alexander Henkes](https://orcid.org/0000-0003-4615-9271)
---

In this tutorial we solve a simple regression task using [spiking neural
networks](https://en.wikipedia.org/wiki/Spiking_neural_network) (SNNs) and the [**LAVA-DL** library](https://github.com/lava-nc/lava-dl). The presented approach stays as
minimalistic as possible, yet it is easy to extend to much more complex
problems. A basic understanding of spiking neural networks is assumed. If you are starting from zero, take a look at [these tutorials](https://snntorch.readthedocs.io/en/latest/tutorials/index.html) first.


# Installation
> If you are using Google Colab, uncomment the following cell to install all the
> necessary packages! You can safely ignore all GPU-related errors and warnings; we only need a CPU!

In [1]:
# First, install the 'lava-nc' base package:
!pip install --upgrade pip
!pip install lava-nc

# Second, install the 'lava-dl' deep learning addition, available from github:
!rm ./lava*
!curl -s https://api.github.com/repos/lava-nc/lava-dl/releases/latest \
| grep browser_download_url \
| cut -d '"' -f 4 \
| grep tar.gz \
| wget -i -
!pip install ./lava_dl*


Collecting pip
  Downloading pip-23.2.1-py3-none-any.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 13.5 MB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.1.2
    Uninstalling pip-23.1.2:
      Successfully uninstalled pip-23.1.2
Successfully installed pip-23.2.1
Collecting lava-nc
  Obtaining dependency information for lava-nc from https://files.pythonhosted.org/packages/29/39/607eafb3e98935ce1f195404081f2a11f55f72f6f08717b2899c742025e6/lava_nc-0.8.0-py3-none-any.whl.metadata
  Downloading lava_nc-0.8.0-py3-none-any.whl.metadata (10 kB)
Collecting asteval<0.10.0,>=0.9.31 (from lava-nc)
  Obtaining dependency information for asteval<0.10.0,>=0.9.31 from https://files.pythonhosted.org/packages/05/34/bdb51767967cb29302ee7dfe95662b057af7f23c62dd1967fc4b373656aa/asteval-0.9.31-py3-none-any.whl.metadata
  Downloading asteval-0.9.31-py3-none-any.whl.metad

rm: cannot remove './lava*': No such file or directory
--2023-08-21 11:39:12--  https://github.com/lava-nc/lava-dl/releases/download/v0.4.0/lava_dl-0.4.0.tar.gz
Resolving github.com (github.com)... 140.82.121.4
Connecting to github.com (github.com)|140.82.121.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://objects.githubusercontent.com/github-production-release-asset-2e65be/411730917/1a095795-5494-4a6a-80a2-85b8ac1b5ee7?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20230821%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20230821T113912Z&X-Amz-Expires=300&X-Amz-Signature=4de76a0b596e27cb969337a7c03cde510e8ce1eb26a2e015b2dc0f8d2436d52e&X-Amz-SignedHeaders=host&actor_id=0&key_id=0&repo_id=411730917&response-content-disposition=attachment%3B%20filename%3Dlava_dl-0.4.0.tar.gz&response-content-type=application%2Foctet-stream [following]
--2023-08-21 11:39:12--  https://objects.githubusercontent.com/github-production-release-asse

# Imports and random seeds

In [2]:
"""Solve XOR with LIFs using LAVA/SLAYER."""
import lava.lib.dl.slayer as slayer
import numpy as np
import random
import torch

SEED = 666
np.random.seed(SEED)
random.seed(SEED)
torch.manual_seed(SEED)


No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'


<torch._C.Generator at 0x7fb3a864e6d0>

# Regression
[Regression](https://en.wikipedia.org/wiki/Regression_analysis) is the task of relating some ($n_x$-dimensional, real-valued) input to some ($n_y$-dimensional, real-valued) output, such that

\begin{align}
    f: \mathbb{R}^{n_x} &\to \mathbb{R}^{n_y} \\
    \mathbf{x} &\mapsto \mathbf{y}.
\end{align}

This simple statement can describe all kinds of nonlinear functions with possibly very complicated behavior. If we do not know an analytical expression for our complicated function $f$ but have access to some input data $\mathbf{x}$ and output data $\mathbf{y}$, we can approximate it using a neural network $\mathcal{N}$, such that

\begin{equation}
    f \approx \mathcal{N}.
\end{equation}


# XOR
Here, we take a look at the [XOR problem](https://en.wikipedia.org/wiki/Exclusive_or), which relates a two-dimensional input pair to a one-dimensional output

\begin{align}
    \operatorname{XOR}: \mathbb{R} \times \mathbb{R} &\to \mathbb{R} \\
    (A, B) &\mapsto A \oplus B.
\end{align}

It can be described by the following table:

\begin{array}{ccc}
A & B & A \oplus B \\ \hline
0 & 0 & 0 \\
0 & 1 & 1 \\
1 & 0 & 1 \\
1 & 1 & 0 \\ \hline
\end{array}
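
As a quick sanity check, the table can be reproduced in a few lines of plain Python using the bitwise XOR operator `^`. This is only an illustrative sketch; the name `xor_table` is hypothetical and not part of the tutorial code.

```python
# Enumerate all input pairs (A, B) and compute A XOR B with the ^ operator.
xor_table = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]

for a, b, y in xor_table:
    print(f"{a} XOR {b} = {y}")
```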

We see that our dataset is indeed quite minimalistic: it consists of merely four samples. We can define the dataset using the [standard PyTorch approach](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html#creating-a-custom-dataset-for-your-files). SNNs are inherently time-dependent, but the XOR dataset is static. We therefore generate a pseudo-time axis by repeating every sample for as many time steps as we like.
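
The repetition along the pseudo-time axis can be sketched without any dependencies. The snippet below uses nested Python lists instead of tensors; `add_time_axis` is a hypothetical helper that only illustrates the idea of repeating each static value along a new trailing time axis, as the dataset class does with `torch.repeat_interleave`.

```python
def add_time_axis(samples, time):
    """Repeat every scalar entry `time` times along a new last axis."""
    return [[[value] * time for value in sample] for sample in samples]


features = [[0, 0], [0, 1], [1, 0], [1, 1]]
features_t = add_time_axis(features, time=3)

print(features_t[1])  # [[0, 0, 0], [1, 1, 1]]
# Resulting layout is (batch, feature, time) = (4, 2, 3).
print(len(features_t), len(features_t[0]), len(features_t[0][0]))
```

Every sample keeps its value for all time steps, so reading any single time step recovers the original static data.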

In [3]:
class XOR(torch.utils.data.Dataset):
    """XOR dataset.

    Produce a torch.utils.data.Dataset for the XOR problem. It consists of
    two inputs and one output corresponding to the following logic table:

    Input   |   Output
    ==================
    (0, 0)  |   (0)
    (0, 1)  |   (1)
    (1, 0)  |   (1)
    (1, 1)  |   (0)
    ==================

    """

    def __init__(self, time):
        """Initialize dataset.

        The dataset consists of two-dimensional input features and
        one-dimensional output labels. The axis convention

            (BATCH, FEATURE, TIME)

        is used. The parameter 'time' controls the number of
        discrete pseudo-time steps.

        Parameters
        ----------
        time : int
            Number of discrete time steps needed for LIF-type neurons.

        """
        self.feature = torch.Tensor(
            [
                [0, 0],
                [0, 1],
                [1, 0],
                [1, 1],
            ]
        )

        self.label = torch.Tensor(
            [
                [0],
                [1],
                [1],
                [0],
            ]
        )

        self.feature = torch.unsqueeze(self.feature, -1)
        self.label = torch.unsqueeze(self.label, -1)

        self.feature = torch.repeat_interleave(
            input=self.feature, repeats=time, dim=-1
        )
        self.label = torch.repeat_interleave(
            input=self.label, repeats=time, dim=-1
        )

    def __len__(self):
        """Return length of dataset.

        The length of the dataset is defined as the length of the
        first axis, the batch axis.

        Returns
        -------
        int
            Number of unique samples in the dataset.

        """
        return len(self.feature[:, 0, 0])

    def __getitem__(self, idx):
        """Return a single sample from the dataset.

        Return a single sample from the dataset using the index variable 'idx'.

        Parameters
        ----------
        idx : int
            Index of the sample.

        Returns
        -------
        torch.Tensor
            Sample 'idx' from the dataset.

        """
        return self.feature[idx, :, :], self.label[idx, :, :]


# Neural network

In this tutorial we want to use SNNs to fit the real-valued XOR problem. You could solve this as a classification task, probably using spikes directly, but we choose a different approach which translates well to all kinds of regression scenarios.

The difficulty of regression with SNNs lies in the binary (or unary, if you like) nature of the information traveling between neurons. On the one hand, this leads to temporal and inter-spike sparsity and therefore to massive energy savings on neuromorphic hardware. On the other hand, representing real-valued functions with spikes is not straightforward.

First, we need to convert our real-valued input to spikes using some sort of encoder. In this tutorial, we will use a simple CUBA (current-based) neuron, a second-order variant of the classical [LIF](https://neuronaldynamics.epfl.ch/online/Ch1.S3.html) neuron with richer neural dynamics. The encoder simply feeds the constant input value into a CUBA neuron at every time step, which in turn generates spikes

\begin{align}
    \operatorname{encoder}: \mathbb{R} \times \mathbb{R} &\to \{0, 1\}^t.
\end{align}
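
To make the encoding step concrete, here is a minimal, dependency-free sketch of a single current-based LIF neuron driven by a constant input. It assumes the common discrete-time form in which current and voltage each decay by a factor `(1 - decay)` per step and the voltage is reset to zero after a spike. It ignores quantization, delays, and other details of the actual `slayer.block.cuba.Input` layer, and the function name `cuba_encode` is hypothetical. The default parameter values mirror the `cuba_params` used in the network definition.

```python
def cuba_encode(value, steps, threshold=0.1, current_decay=0.9, voltage_decay=0.9):
    """Encode a constant real value as a spike train (simplified CUBA model)."""
    current, voltage = 0.0, 0.0
    spikes = []
    for _ in range(steps):
        # Assumed dynamics: leaky integration of the input into the current,
        # then leaky integration of the current into the voltage.
        current = (1.0 - current_decay) * current + value
        voltage = (1.0 - voltage_decay) * voltage + current
        spike = 1 if voltage >= threshold else 0
        if spike:
            voltage = 0.0  # hard reset after a spike
        spikes.append(spike)
    return spikes


print(cuba_encode(0.0, steps=5))  # [0, 0, 0, 0, 0]
print(cuba_encode(1.0, steps=5))  # [1, 1, 1, 1, 1]
```

With the XOR inputs, a `0` never drives the voltage over the threshold, while a `1` does so at every step under these assumptions.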

In **LAVA-DL**, this is realized by the `slayer.block.cuba.Input` layer, which organises several neurons in a stack. In between we can use as many spiking layers as we want

\begin{align}
    \operatorname{spiking\ neuron}: \{0, 1\}^t &\to \{0, 1\}^t.
\end{align}

Here, we choose the `slayer.block.cuba.Dense` layer, which combines a dense feed-forward neural network with CUBA dynamics. For the output we take the membrane potential of the neurons in the last layer. For more details on this approach, take a look at [https://arxiv.org/abs/2210.03515](https://arxiv.org/abs/2210.03515). The potential is a real-valued number and can be extracted via the `slayer.block.cuba.Affine` layer. It acts as a kind of decoder, from spikes to real-valued numbers

\begin{align}
    \operatorname{decoder}: \{0, 1\}^t &\to \mathbb{R}.
\end{align}
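
The idea of decoding spikes into a real number can be illustrated with a weighted sum of binary spike trains. This is a deliberately simplified sketch and not a description of how `slayer.block.cuba.Affine` works internally. The helper `decode`, the spike trains, and the weights are all hypothetical values chosen for illustration.

```python
def decode(spike_trains, weights, bias=0.0):
    """Weighted sum of binary spike trains, one real value per time step."""
    steps = len(spike_trains[0])
    return [
        bias + sum(w * train[t] for w, train in zip(weights, spike_trains))
        for t in range(steps)
    ]


# Three presynaptic neurons, four time steps each (made-up example data).
spike_trains = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 0]]
weights = [0.5, -0.25, 0.75]

readout = decode(spike_trains, weights)
print(readout)      # one real value per time step
print(readout[-1])  # value at the last time step
```

In the tutorial the weights are learned, and only the value at the last time step enters the loss.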

Again, we can define the network using [PyTorch](https://pytorch.org/tutorials/recipes/recipes/defining_a_neural_network.html). For the nasty details, see the [**LAVA-DL** documentation](https://lava-nc.org/lava-lib-dl/index.html).

In [4]:
class Network(torch.nn.Module):
    """LIF network.

    A network consisting of the following topology:

    Layer
    ===============
    - BlockCubaInput
    - BlockCubaDense
    - BlockCubaDense
    - BlockCubaAffine

    """

    def __init__(self):
        """Initialize network."""
        super(Network, self).__init__()

        cuba_params = {
            "threshold": 0.1,
            "current_decay": 0.9,
            "voltage_decay": 0.9,
            "tau_grad": 1,
            "scale_grad": 1,
            "scale": 1 << 6,
            "norm": None,
            "dropout": None,
            "shared_param": True,
            "persistent_state": False,
            "requires_grad": False,
            "graded_spike": False,
        }

        width = 32

        self.blocks = torch.nn.ModuleList(
            [
                slayer.block.cuba.Input(
                    neuron_params=cuba_params, count_log=False
                ),
                slayer.block.cuba.Dense(
                    neuron_params=cuba_params,
                    in_neurons=2,
                    out_neurons=width,
                    count_log=False,
                ),
                slayer.block.cuba.Dense(
                    neuron_params=cuba_params,
                    in_neurons=width,
                    out_neurons=width,
                    count_log=False,
                ),
                slayer.block.cuba.Affine(
                    neuron_params=cuba_params,
                    in_neurons=width,
                    out_neurons=1,
                    dynamics=False,
                    count_log=False,
                ),
            ]
        )

    def forward(self, x):
        """Forward pass."""
        count = []
        for block in self.blocks:
            x = block(x)
            count.append(torch.mean(x).item())

        return x, torch.as_tensor(count)


# Training loop

Training with [**LAVA-DL**](https://lava-nc.org/dl.html#lava-dl-workflow) is carried out like a PyTorch training loop, but some details are handled by the library directly. First, we choose an [optimizer](https://pytorch.org/docs/stable/optim.html) and define a [dataloader](https://pytorch.org/docs/stable/data.html). The training itself, including logging, is carried out by `slayer.utils.Assistant()` in conjunction with `slayer.utils.LearningStats()`. With the help of a `lambda` function we define a simple mean-squared error loss on the last time step of our pseudo-time (remember: SNNs are inherently time-dependent, so we introduced a pseudo-time in our static data to make use of the neuron dynamics). Finally, we loop over our training set. Additionally, we track the spike activity of every layer, which tells us how sparse our network is.
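
What "mean-squared error on the last time step" means can be sketched in plain Python. The snippet assumes nested lists laid out as (batch, feature, time); `mse_last_step` is a hypothetical helper and the numbers are made up for illustration. The tutorial itself uses `torch.nn.functional.mse_loss` on tensor slices.

```python
def mse_last_step(output, target):
    """Mean-squared error over the last time step of (batch, feature, time) lists."""
    errors = [
        (out_feat[-1] - tgt_feat[-1]) ** 2
        for out_sample, tgt_sample in zip(output, target)
        for out_feat, tgt_feat in zip(out_sample, tgt_sample)
    ]
    return sum(errors) / len(errors)


# Four samples, one output feature, two time steps (made-up values).
output = [[[0.3, 0.1]], [[0.2, 0.9]], [[0.5, 1.0]], [[0.9, 0.0]]]
target = [[[0.0, 0.0]], [[1.0, 1.0]], [[1.0, 1.0]], [[0.0, 0.0]]]

print(mse_last_step(output, target))  # approximately 0.005
```

Only the final entries along the time axis contribute, so the earlier time steps are free to be used by the neuron dynamics.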

In [5]:
def train(net, dataset, epochs):
    """Train the network."""
    optimizer = torch.optim.AdamW(net.parameters(), lr=1e-3)

    dataloader = torch.utils.data.DataLoader(
        dataset=dataset, batch_size=4, pin_memory=True
    )

    stats = slayer.utils.LearningStats(
        loss_str="loss",
        loss_unit="",
        accuracy_str="",
        accuracy_unit="",
    )

    assistant = slayer.utils.Assistant(
        net=net,
        error=lambda output, target: torch.nn.functional.mse_loss(
            output[:, :, -1].flatten(), target[:, :, -1].flatten()
        ),
        optimizer=optimizer,
        stats=stats,
        classifier=None,
        count_log=True,
    )

    for epoch in range(epochs):
        for i, (feature, label) in enumerate(dataloader):
            _, count = assistant.train(feature, label)
            print(f"\r[Epoch {epoch:3d}/{epochs}] {stats}", end="")

        stats.update()

    return stats, count


# Main function

In the main function we go through the following steps:


1.   Create the SNN `net`
2.   Create the XOR-dataset `dataset` with pseudo-time `time`
3.   Train the network using `train()`
4.   Print sparsity information `spike_activity`
5.   Predict `prediction` and print some results!



In [6]:
def main():
    """Execute main function."""
    net = Network()
    dataset = XOR(time=10)

    _, count = train(net=net, dataset=dataset, epochs=500)

    spike_activity = [str(round(i.item() * 100, 2)) for i in count.numpy()]
    spike_activity = "| " + "".join(x + "% | " for x in spike_activity)
    print(f"\n\nSpike activity per layer: {spike_activity}\n")

    prediction = torch.round(net(dataset.feature)[0])

    print(
        f"{'Input:':<12}{dataset.feature[:, :, -1].detach().numpy().tolist()}"
    )
    print(
        f"{'Output:':<12}{dataset.label[:, :, -1].detach().numpy().tolist()}"
    )
    print(
        f"{'Prediction:':<12}{prediction[:, :, -1].detach().numpy().tolist()}"
    )

    return None


# Run it!

You can run everything (again and again) using the following cell. The layout of the notebook was chosen in order to get a nice `.py` script when exporting. It can serve as a solid basis for your own experiments!

In [8]:
if __name__ == "__main__":
    main()


[Epoch 499/500] Train loss =     0.00098 (min =     0.00000)      = 0.00000 (max = 0.00000) 

Spike activity per layer: | 45.0% | 26.09% | 18.28% | 35.78% | 

Input:      [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Output:     [[0.0], [1.0], [1.0], [0.0]]
Prediction: [[0.0], [1.0], [1.0], [0.0]]
