## VPRTempoQuant - Training and Inferencing Tutorial

### By Adam D Hines (https://research.qut.edu.au/qcr/people/adam-hines/)

VPRTempo is based on the following paper, if you use or find this code helpful for your research please consider citing the source:
    
[Adam D Hines, Peter G Stratton, Michael Milford, & Tobias Fischer. "VPRTempo: A Fast Temporally Encoded Spiking Neural Network for Visual Place Recognition". arXiv, September 2023](https://arxiv.org/abs/2309.10225)

### Introduction

Traditional methods for visual place recognition (VPR) typically use convolutional neural networks such as ResNet, trained on large datasets to extract features from incoming query images, rather than learning the specific places to be queried. These networks localise accurately, but they are slow to train and to run inference with, and they are costly to store.

Spiking neural networks (SNNs), by contrast, are more energy efficient and compute with low latency, which makes them promising for deployment in VPR. Specifically, a network can be trained on the exact locations you wish to query, which is a fundamentally different approach to the VPR task.

VPRTempo uses a temporal encoding scheme for spikes, where the amplitude of a spike is determined by the pixel intensity of an incoming training or query image. This amplitude defines the 'timing' of the spike, similar to a latency code. As spikes propagate through the system, spike-timing dependent plasticity (STDP) learning rules train neuronal connections based on these pixel-intensity spike amplitudes.
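As a rough illustration of the encoding idea, the sketch below maps pixel intensities to spike amplitudes in the range [0, 1]. It assumes a simple linear scaling of 8-bit greyscale values; the helper name `intensity_to_amplitude` is hypothetical, and the actual VPRTempo pipeline applies further preprocessing before spikes are generated.

```python
import numpy as np

def intensity_to_amplitude(image, max_value=255.0):
    """Map pixel intensities to spike amplitudes in [0, 1] (hypothetical helper)."""
    image = np.asarray(image, dtype=np.float32)
    # Brighter pixels give larger amplitudes; flatten to one value per input neuron
    return np.clip(image / max_value, 0.0, 1.0).ravel()

# A tiny 2x2 greyscale "image" standing in for a real frame
image = np.array([[0, 51], [102, 255]], dtype=np.uint8)
amplitudes = intensity_to_amplitude(image)
print(amplitudes)
```

Each entry of `amplitudes` would drive one input neuron, so a 28x28 image yields 784 input values.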

In this tutorial, we are going to take the base VPRTempo model and use it to train and run inference on a network with PyTorch's Quantization Aware Training ([QAT](https://pytorch.org/docs/stable/quantization.html)).

To get started, please ensure you have installed and currently have activated the `conda` environment for VPRTempo.

In [None]:
!conda activate vprtempo

## 1. Get the Nordland dataset

### 1.1 Download the dataset

Please [download the Nordland datasets](https://webdiis.unizar.es/~jmfacil/pr-nordland/#download-dataset) (Summer, Spring, Fall, & Winter). There are two datasets available, the full size and downsampled versions. Either will work fine but our paper details the full size dataset. If disk space is a concern, please use the downsampled version.

Save the data in the `./VPRTempo-quant/dataset/` subfolder.

### 1.2 Prepare the dataset for the model

The dataset seasons are downloaded in `.zip` format and need to be extracted into a single folder. The `nord_sort` function from `nordland.py` is provided to do this automatically and to rename the images to match those listed in the `nordland.csv` file.

In [1]:
import os
import re
import shutil
import zipfile
import sys
sys.path.append('./src')
sys.path.append('./VPRTempo-quant/dataset')

from os import walk
from nordland import nord_sort

# unzip, re-organise, and re-name the Nordland datasets
nord_sort()

AssertionError: Please set the outDir to the desired output location for unzipping the Nordland datasets
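For readers who want to see what the unzip step amounts to, here is a generic sketch using only the standard library. It is not the implementation of `nord_sort`; the helper name `extract_all`, the archive name, and the folder layout are all hypothetical, and the image renaming step is not covered.

```python
import tempfile
import zipfile
from pathlib import Path

def extract_all(zip_dir, out_dir):
    """Extract every .zip archive in zip_dir into out_dir (hypothetical helper)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(out_dir)
            extracted.extend(zf.namelist())
    return extracted

# Self-contained demo using a throwaway archive instead of the real dataset
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    with zipfile.ZipFile(tmp / "summer.zip", "w") as zf:
        zf.writestr("summer/images-00001.png", b"placeholder")
    names = extract_all(tmp, tmp / "nordland")
    unpacked = (tmp / "nordland" / "summer" / "images-00001.png").exists()
```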

## 2. Prepare the model for training

Let's now look at preparing our network to train our first model. There are a few initial steps to take care of first.

### 2.1 Import modules

In [15]:
import torch
import jdc
import torch.nn as nn
import blitnet as bn
import numpy as np
from torch.ao.quantization import QuantStub, DeQuantStub
from tqdm import tqdm
from settings import configure, image_csv, model_logger

### 2.2 Define and initialize the VPRTempo model

We'll first define the VPRTempo class, which handles the configuration set in `./src/settings.py`, determines which images to load, and establishes the layers used for training. For this tutorial, leave the settings as the default.

`__init__` is where we define the layers used for the model. In this case, we define a `feature_layer` and an `output_layer`. `dims` gives the number of input neurons and the number of neurons in the layer itself, drawn here from `self.input`, `self.feature`, and `self.output`. Note that the input size of each subsequent layer is the size of the previous layer. In this example, we have an input of 784 neurons (for 28x28 images) connected to a 1568 neuron feature layer, which then connects to a final output layer of 500 neurons.

The other hyperparameters for each layer are set here as well.
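The size bookkeeping described above can be checked with a few lines of plain Python. This is only a sketch: the weight counts assume fully connected layers, which may overstate the number of connections actually used.

```python
# Layer sizes quoted in the text above
input_size, feature_size, output_size = 784, 1568, 500

# Each entry is [number of inputs, number of neurons] for one layer
layer_dims = {
    "feature_layer": [input_size, feature_size],
    "output_layer": [feature_size, output_size],
}

# The input size of each layer must equal the size of the layer before it
sizes = list(layer_dims.values())
for (_, prev_out), (next_in, _) in zip(sizes, sizes[1:]):
    assert prev_out == next_in

# Connection counts, assuming fully connected layers (an assumption)
n_weights = {name: n_in * n_out for name, (n_in, n_out) in layer_dims.items()}
print(n_weights)
```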

In [5]:
class VPRTempo(nn.Module):
    def __init__(self):
        super(VPRTempo, self).__init__()

        # Configure the network
        configure(self)
        
        # Define the images to load (both training and inference)
        image_csv(self)

        # Add quantization stubs for Quantization Aware Training (QAT)
        self.quant = QuantStub()
        self.dequant = DeQuantStub()
        
        # Define the add function for quantized addition
        self.add = nn.quantized.FloatFunctional()      

        # Layer dict to keep track of layer names and their order
        self.layer_dict = {}
        self.layer_counter = 0

        """
        Define trainable layers here
        """
        self.add_layer(
            'feature_layer',
            dims=[self.input, self.feature],
            thr_range=[0, 0.5],
            fire_rate=[0.2, 0.9],
            ip_rate=0.15,
            stdp_rate=0.005,
            const_inp=[0, 0.1],
            p=[0.1, 0.5]
        )
        self.add_layer(
            'output_layer',
            dims=[self.feature, self.output],
            ip_rate=0.15,
            stdp_rate=0.005,
            spk_force=True
        )
    def add_layer(self, name, **kwargs):
        """
        Dynamically add a layer with given name and keyword arguments.

        :param name: Name of the layer to be added
        :type name: str
        :param kwargs: Hyperparameters for the layer
        """
        # Check for layer name duplicates
        if name in self.layer_dict:
            raise ValueError(f"Layer with name {name} already exists.")

        # Add a new SNNLayer with provided kwargs
        setattr(self, name, bn.SNNLayer(**kwargs))

        # Add layer name and index to the layer_dict
        self.layer_dict[name] = self.layer_counter
        self.layer_counter += 1  

        print('Succesfully added '+name)

    def train_model(self, train_loader, layer, prev_layers=None):
        """
        Train a layer of the network model.

        :param train_loader: Training data loader
        :param layer: Layer to train
        :param prev_layers: Previous layers to pass data through
        """

        # Initialize the tqdm progress bar
        pbar = tqdm(total=int(self.T * self.epoch),
                    desc="Training ",
                    position=0)

        # Initialize the learning rates for each layer (used for annealment)
        init_itp = layer.eta_ip.detach()
        init_stdp = layer.eta_stdp.detach()

        # Run training for the specified number of epochs
        for epoch in range(self.epoch):
            mod = 0  # Used to determine the learning rate annealment, resets at each epoch
            # Run training for the specified number of timesteps
            for spikes, labels in train_loader:
                spikes, labels = spikes.to(self.device), labels.to(self.device)
                idx = labels / self.filter # Set output index for spike forcing
                # Pass through previous layers if they exist
                if prev_layers:
                    with torch.no_grad():
                        for prev_layer_name in prev_layers:
                            prev_layer = getattr(self, prev_layer_name) # Get the previous layer object
                            spikes = self.forward(spikes, prev_layer) # Pass spikes through the previous layer
                            spikes = bn.clamp_spikes(spikes, prev_layer) # Clamp spikes [0, 0.9]
                else:
                    prev_layer = None
                # Get the output spikes from the current layer
                pre_spike = spikes.detach() # Previous layer spikes for STDP
                spikes = self.forward(spikes, layer) # Current layer spikes
                spikes_noclp = spikes.detach() # Used for inhibitory homeostasis
                spikes = bn.clamp_spikes(spikes, layer) # Clamp spikes [0, 0.9]
                # Calculate STDP
                layer = bn.calc_stdp(pre_spike,spikes,spikes_noclp,layer, idx, prev_layer=prev_layer)
                # Adjust learning rates
                layer = self._anneal_learning_rate(layer, mod, init_itp, init_stdp)
                # Update the annealing mod & progress bar 
                mod += 1
                pbar.update(1)

        # Close the tqdm progress bar
        pbar.close()

    def forward(self, spikes, layer):
        """
        Compute the forward pass of the model.

        Parameters:
        - spikes (Tensor): Input spikes.

        Returns:
        - Tensor: Output after processing.
        """

        spikes = self.quant(spikes)
        spikes = self.add.add(layer.exc(spikes), layer.inh(spikes))
        spikes = self.dequant(spikes)

        return spikes

    def _anneal_learning_rate(self, layer, mod, itp, stdp):
        """
        Anneal the learning rate for the current layer.
        """
        if np.mod(mod, 100) == 0: # Modify learning rate every 100 timesteps
            pt = pow(float(self.T - mod) / self.T, self.annl_pow)
            layer.eta_ip = torch.mul(itp, pt) # Anneal intrinsic threshold plasticity learning rate
            layer.eta_stdp = torch.mul(stdp, pt) # Anneal STDP learning rate

        return layer

Layers are added dynamically: if you wish to add more layers, you simply define another one in the `__init__` method and the system will iterate through all of them.

To add the layers, we call the `add_layer()` function which will set all the hyperparameters and seed the initial weights.
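The registration pattern used by `add_layer()` can be sketched in isolation with a toy class. Here a plain dictionary stands in for the real layer object, and the class name `LayerRegistry` is hypothetical.

```python
class LayerRegistry:
    """Toy stand-in for a model that registers layers by name (hypothetical)."""

    def __init__(self):
        self.layer_dict = {}
        self.layer_counter = 0

    def add_layer(self, name, **kwargs):
        # Refuse duplicate names so that layers cannot be silently overwritten
        if name in self.layer_dict:
            raise ValueError(f"Layer with name {name} already exists.")
        # A plain dict stands in for the real layer object here
        setattr(self, name, dict(kwargs))
        # Remember the insertion order of each layer
        self.layer_dict[name] = self.layer_counter
        self.layer_counter += 1

registry = LayerRegistry()
registry.add_layer("feature_layer", dims=[784, 1568])
registry.add_layer("output_layer", dims=[1568, 500])

# Recover the layer names in the order they were added
ordered = [name for name, _ in sorted(registry.layer_dict.items(), key=lambda kv: kv[1])]
```

Storing an index alongside each name is what later allows the layers to be trained in the order they were defined.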

Now, we can initialize the model and add the layers.

In [6]:
model = VPRTempo()

Succesfully added feature_layer
Succesfully added output_layer


### 2.3 Set the DataLoader

Now that we've defined the model, we will set up the DataLoaders. These use the `CustomImageDataset` and `ProcessImage` classes from `dataset.py` to import images and process them for training or inference. In brief, images are loaded, gamma corrected, resized, and then patch-normalized before being converted into input spikes that are propagated through the network.

Since we present the network with one image at a time, the `batch_size` is kept to 1.

In [7]:
from dataset import CustomImageDataset, ProcessImage
from torch.utils.data import DataLoader

image_transform = ProcessImage(model.dims, model.patches)
train_dataset = CustomImageDataset(annotations_file=model.dataset_file, 
                                       img_dirs=model.training_dirs,
                                       transform=image_transform,
                                       skip=model.filter,
                                       max_samples=model.number_training_images,
                                       test=False)
# Initialize the data loader
train_loader = DataLoader(train_dataset, 
                          batch_size=1, 
                          shuffle=False,
                          num_workers=8,
                          persistent_workers=True)
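To give a feel for the patch-normalization step mentioned above, the sketch below normalizes each non-overlapping patch of an image to zero mean and unit standard deviation. This is an illustrative assumption about the technique, not the code in `ProcessImage`, which may use a different windowing scheme; the helper name `patch_normalise` and the patch size of 7 are hypothetical.

```python
import numpy as np

def patch_normalise(image, patch_size):
    """Normalise each non-overlapping patch to zero mean and unit std (hypothetical helper)."""
    image = np.asarray(image, dtype=np.float64)
    out = np.zeros_like(image)
    height, width = image.shape
    for r in range(0, height, patch_size):
        for c in range(0, width, patch_size):
            patch = image[r:r + patch_size, c:c + patch_size]
            std = patch.std()
            # Guard against flat patches, where the standard deviation is zero
            scale = std if std > 0 else 1.0
            out[r:r + patch_size, c:c + patch_size] = (patch - patch.mean()) / scale
    return out

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28))  # random stand-in for a resized frame
normalised = patch_normalise(image, patch_size=7)
```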

### 2.4 Other network settings

We will finally set up a unique model name based on the network architecture so we can save and reload our trained model.

In [3]:
import os
import torch
import gc
import sys
sys.path.append('./src')
sys.path.append('./models')
sys.path.append('./settings')
sys.path.append('./output')
sys.path.append('./dataset')
sys.path.append('./config')
torch.multiprocessing.set_sharing_strategy("file_system")
import blitnet as bn
import utils as ut
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.quantization as quantization

from config import configure, image_csv, model_logger
from dataset import CustomImageDataset, SetImageAsSpikes, ProcessImage
from torch.utils.data import DataLoader
from torch.ao.quantization import QuantStub, DeQuantStub
from tqdm import tqdm

In [8]:
def generate_model_name(model):
    """
    Generate the model name based on its parameters.
    """
    return ("VPRTempo" +
            str(model.input) +
            str(model.feature) +
            str(model.output) +
            str(model.number_modules) +
            '.pth')

model_name = generate_model_name(model)

print(model_name)

VPRTempo78415685001.pth


In [22]:
%%add_to VPRTempo
def model_logger(self):
    """
    Log the model configuration to the console.
    """
    model_logger(self)

### 2.5 Model quantization

VPRTempoQuant makes use of Quantization Aware Training (QAT), and a few simple steps are needed to prepare the model to accommodate it. First, we will get the default QAT configuration for the `fbgemm` backend.

In [9]:
import torch.quantization as quantization

# Set the quantization configuration
qconfig = quantization.get_default_qat_qconfig('fbgemm')

Next, we will put the model into training mode, move it to the CPU, and attach our quantization configuration.

In [10]:
# Set the model to training mode and move to device
model.train()
model.to('cpu')
model.qconfig = qconfig

Now we will prepare the model for QAT.

In [11]:
# Apply quantization configurations to the model
model = quantization.prepare_qat(model, inplace=False)
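The same preparation steps can be tried on a much smaller module to see what `prepare_qat` does. The sketch below uses a hypothetical `TinyNet` with a single linear layer rather than the VPRTempo model, and imports from `torch.ao.quantization`, the namespace already used for the stubs earlier in this tutorial.

```python
import torch
import torch.nn as nn
from torch.ao.quantization import (
    DeQuantStub,
    QuantStub,
    get_default_qat_qconfig,
    prepare_qat,
)

class TinyNet(nn.Module):
    """Hypothetical stand-in model: quantize -> linear -> dequantize."""

    def __init__(self):
        super().__init__()
        self.quant = QuantStub()
        self.fc = nn.Linear(4, 3)
        self.dequant = DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

tiny = TinyNet()
tiny.train()  # prepare_qat expects a model in training mode
tiny.qconfig = get_default_qat_qconfig("fbgemm")
prepared = prepare_qat(tiny, inplace=False)

# The linear layer now carries a fake-quantization module for its weights
has_fake_quant = hasattr(prepared.fc, "weight_fake_quant")
out = prepared(torch.rand(1, 4))
```

During training the prepared model still computes in floating point while simulating quantization effects. A later call to `torch.ao.quantization.convert` on the trained model in evaluation mode is the usual way to obtain the quantized version.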



At this point, we are ready to start training our network!

## 3. Set up and run the training 

### 3.1 Define the training regime

Training through the layers is dynamic, such that all you need to do to train everything is to define a new layer. We will first define the training function and explain how it operates.

This training regime runs every image through the network for a defined number of epochs. Images are loaded and converted to input spikes using the `train_loader` we defined earlier. For the first layer, the input spikes are passed directly to the layer being trained. For later layers, the spikes are first propagated through each previously trained layer, using its learned connection weights, until they reach the layer currently being trained.
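The pass-through described above can be sketched with NumPy. The code sums an excitatory and an inhibitory contribution per layer and clamps the result to [0, 0.9], mirroring the comments in `train_model`. The layer sizes, the random weights, and the helper name `layer_forward` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_forward(spikes, w_exc, w_inh):
    """Sum excitatory and inhibitory drive, then clamp to [0, 0.9] (hypothetical helper)."""
    out = spikes @ w_exc.T + spikes @ w_inh.T
    return np.clip(out, 0.0, 0.9)

# Toy sizes (hypothetical): 4 inputs -> 6 feature neurons -> 3 output neurons
w1_exc = rng.uniform(0.0, 0.5, size=(6, 4))
w1_inh = -rng.uniform(0.0, 0.1, size=(6, 4))
w2_exc = rng.uniform(0.0, 0.5, size=(3, 6))
w2_inh = -rng.uniform(0.0, 0.1, size=(3, 6))

input_spikes = rng.uniform(0.0, 1.0, size=(1, 4))

# Training the second layer: pass the input through the already-trained
# first layer, then compute the spikes of the layer being trained
pre_spikes = layer_forward(input_spikes, w1_exc, w1_inh)
post_spikes = layer_forward(pre_spikes, w2_exc, w2_inh)
```

Here `pre_spikes` and `post_spikes` play the roles of the pre- and post-synaptic activity that a learning rule would compare.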

The main calculation is in the `self.forward()` function. The weights in our layers are held in `nn.Linear()` modules, which multiply input spikes by the connection weights between the layers. Once the spikes from the input to the layer have been calculated, spike-timing dependent plasticity (STDP) learning rules are applied. The learning rates are annealed after every 100 images that are propagated.
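The annealing schedule in `_anneal_learning_rate` scales the initial learning rate by `((T - mod) / T) ** annl_pow`, where `mod` counts the images seen in the current epoch. A small worked example follows; the values of `T`, `annl_pow`, and the initial rate are chosen for illustration and are not read from the model configuration.

```python
def anneal_factor(step, total_steps, power):
    """Scale factor ((T - step) / T) ** power applied to the initial learning rate."""
    return (float(total_steps - step) / total_steps) ** power

# Hypothetical values for illustration; the real ones come from the model config
T, annl_pow, init_stdp = 500, 2, 0.005

schedule = {}
for step in range(T):
    if step % 100 == 0:  # the rate is only updated every 100 images
        schedule[step] = init_stdp * anneal_factor(step, T, annl_pow)

for step, rate in schedule.items():
    print(step, rate)
```

With these values the rate starts at its initial value and decays towards zero as the epoch progresses, then restarts when `mod` is reset at the next epoch.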

In [None]:
# Keep track of trained layers to pass data through them
trained_layers = [] 

# Training each layer
for layer_name, _ in sorted(model.layer_dict.items(), key=lambda item: item[1]):
    print(f"Training layer: {layer_name}")
    # Retrieve the layer object
    layer = getattr(model, layer_name)
    # Train the layer
    model.train_model(train_loader, layer, prev_layers=trained_layers)
    # After training the current layer, add it to the list of trained layers
    trained_layers.append(layer_name)

Training layer: feature_layer


Training : 100%|████████████████████████████| 4000/4000 [01:25<00:00, 46.94it/s]


Training layer: output_layer


Training :  67%|██████████████████▋         | 2671/4000 [00:54<00:26, 49.88it/s]