## VPRTempo - Introduction

### By Adam D Hines (https://research.qut.edu.au/qcr/people/adam-hines/)

VPRTempo is based on the following paper. If you use this code or find it helpful for your research, please consider citing the source:

[Adam D Hines, Peter G Stratton, Michael Milford, & Tobias Fischer. "VPRTempo: A Fast Temporally Encoded Spiking Neural Network for Visual Place Recognition." arXiv, September 2023](https://arxiv.org/abs/2309.10225)

### Introduction

Traditional methods for visual place recognition (VPR) typically use convolutional neural networks such as ResNet, trained on large datasets, to extract features from incoming query images rather than learning the query places themselves. These networks localise accurately, but they are slow to train, slow at inference, and costly to store.

Spiking neural networks (SNNs), by contrast, are more energy efficient and compute with low latency, which makes them extremely promising for VPR deployment. Specifically, a network can be trained on the exact locations you wish to query, which is a fundamentally different approach to the VPR task.

VPRTempo uses a temporal encoding scheme for spikes, where the amplitude of a spike is determined by the pixel intensity of an incoming training or query image. This amplitude defines the 'timing' of the spike, similar to a latency code. As spikes propagate through the system, spike-timing dependent plasticity (STDP) learning rules train neuronal connections based on these pixel-intensity spike amplitudes.
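
To make the encoding idea concrete, here is a minimal sketch that scales 8-bit pixel intensities into spike amplitudes between 0 and 1. The function name `pixels_to_amplitudes` and the plain division by 255 are assumptions for illustration only; the actual pipeline also applies gamma correction, resizing, and patch normalization before spikes are produced.

```python
def pixels_to_amplitudes(pixels, max_intensity=255):
    """Scale raw pixel intensities into spike amplitudes in [0, 1].

    Hypothetical helper: brighter pixels yield larger amplitudes.
    """
    return [p / max_intensity for p in pixels]


# A tiny four-pixel "patch" of 8-bit intensities
patch = [0, 51, 204, 255]
amplitudes = pixels_to_amplitudes(patch)
print(amplitudes)
```

Brighter pixels map to larger amplitudes, so the ordering of intensities within the patch is preserved in the spike amplitudes.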

To get started, please ensure you have installed and activated the `conda` environment for VPRTempo. For more information on how to install and set up the environment, please see the [README.md](https://github.com/AdamDHines/VPRTempo-quant/blob/main/README.md).

In [None]:
!conda activate vprtempo

## 1. Get the Nordland dataset

### 1.1 Download the dataset

Please [download the Nordland datasets](https://webdiis.unizar.es/~jmfacil/pr-nordland/#download-dataset) (Summer, Spring, Fall, & Winter). Two versions are available: full size and downsampled. Either will work fine, but our paper details the full-size dataset. If disk space is a concern, please use the downsampled version.

Save the data in the `./VPRTempo-quant/dataset/` subfolder.

### 1.2 Import modules

Once we have downloaded the dataset, we'll start by importing all the necessary modules.

For this tutorial, we use [Jupyter Dynamic Classes](https://alexhagen.github.io/jdc/) (`jdc`), so please install it if you have not already.

In [None]:
!pip install jdc

In [None]:
import jdc
import os
import torch
import gc
import sys
sys.path.append('../src')
sys.path.append('../models')
sys.path.append('../output')
sys.path.append('../dataset')

import blitnet as bn
import numpy as np
import torch.nn as nn
import torch.quantization as quantization

from settings import configure, image_csv, model_logger
from dataset import CustomImageDataset, ProcessImage
from torch.utils.data import DataLoader
from torch.ao.quantization import QuantStub, DeQuantStub
from tqdm import tqdm

### 1.3 Prepare the dataset for the model (optional)

The dataset seasons are downloaded in .zip format and need to be extracted into a single folder. The `nord_sort` function is provided to do this automatically and to rename the images to match those in the `nordland.csv` file.

If you have already done this from the previous tutorial, you can skip this step.

In [None]:
from os import walk
from nordland import nord_sort

# unzip, re-organise, and re-name the Nordland datasets
nord_sort()

## 2. Set up the network

### 2.1 Define and initialize the VPRTempo model class

We'll first define the VPRTempo class, which handles the configuration as set in `./src/settings.py`, determines which images to load, and establishes the layers used for training. For this tutorial, leave the settings as the default.

`__init__` is where we define the layers used for the model. In this case, we define a `feature_layer` and an `output_layer`. `dims` gives the number of neurons feeding into the layer and the number in the layer itself, here taken from `self.input`, `self.feature`, and `self.output`. Note that the input size of each subsequent layer is the size of the previous layer. In this example, we have an input of 784 neurons (for 28x28 images) connected to a 1568 neuron feature layer, which then connects to a final output layer of 500 neurons.

The other hyperparameters for each layer are set here as well.

In [None]:
class VPRTempo(nn.Module):
    def __init__(self):
        super(VPRTempo, self).__init__()

        # Configure the network
        configure(self)
        
        # Define the images to load (both training and inference)
        image_csv(self)

        # Add quantization stubs for Quantization Aware Training (QAT)
        self.quant = QuantStub()
        self.dequant = DeQuantStub()
        
        # Define the add function for quantized addition
        self.add = nn.quantized.FloatFunctional()      

        # Layer dict to keep track of layer names and their order
        self.layer_dict = {}
        self.layer_counter = 0

        """
        Define trainable layers here
        """
        self.add_layer(
            'feature_layer',
            dims=[self.input, self.feature],
            thr_range=[0, 0.5],
            fire_rate=[0.2, 0.9],
            ip_rate=0.15,
            stdp_rate=0.005,
            const_inp=[0, 0.1],
            p=[0.1, 0.5]
        )
        self.add_layer(
            'output_layer',
            dims=[self.feature, self.output],
            ip_rate=0.15,
            stdp_rate=0.005,
            spk_force=True
        )
        
        print('VPRTempo successfully initialized')
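
As a quick check of how `dims` chains from layer to layer, the sketch below derives the (inputs, neurons) pair for each layer from the sizes quoted in this tutorial. The helper `connection_shapes` is hypothetical, and the connection count assumes one weight per input/neuron pair, which may not match how `blitnet.SNNLayer` stores its weights.

```python
def connection_shapes(layer_sizes):
    """Return (inputs, neurons) for each pair of consecutive layers."""
    return list(zip(layer_sizes[:-1], layer_sizes[1:]))


# input, feature, and output sizes used in this tutorial
sizes = [784, 1568, 500]
shapes = connection_shapes(sizes)

# Assumption: one weight per input/neuron pair in every layer
n_connections = sum(n_in * n_out for n_in, n_out in shapes)
print(shapes, n_connections)
```

Each pair corresponds to the `dims` argument of one `add_layer` call, so the second element of one pair is always the first element of the next.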

### 2.2 Dynamically add layers

As above, the only thing we need to do to add layers to our model is to include a `self.add_layer(args)` call in the `__init__` method. The actual layer generation is handled by the `blitnet.SNNLayer()` class from `blitnet.py`. Here, hyperparameters are stored with the layer and the initial weights are seeded and normalized for training.

In [None]:
%%add_to VPRTempo
def add_layer(self, name, **kwargs):
    """
    Dynamically add a layer with given name and keyword arguments.

    :param name: Name of the layer to be added
    :type name: str
    :param kwargs: Hyperparameters for the layer
    """
    # Check for layer name duplicates
    if name in self.layer_dict:
        raise ValueError(f"Layer with name {name} already exists.")

    # Add a new SNNLayer with provided kwargs
    setattr(self, name, bn.SNNLayer(**kwargs))

    # Add layer name and index to the layer_dict
    self.layer_dict[name] = self.layer_counter
    self.layer_counter += 1  

    print('Successfully added '+name)
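
The registration pattern used by `add_layer` can be tried in isolation. The sketch below uses a hypothetical `StubLayer` in place of `bn.SNNLayer` (it only stores its keyword arguments) and a hypothetical `LayerRegistry` in place of the model class, so it shows the bookkeeping only, not any weight initialization.

```python
class StubLayer:
    """Hypothetical stand-in for bn.SNNLayer; it only stores its kwargs."""

    def __init__(self, **kwargs):
        self.params = kwargs


class LayerRegistry:
    """Hypothetical container mirroring the add_layer bookkeeping."""

    def __init__(self):
        self.layer_dict = {}    # layer name -> position in the network
        self.layer_counter = 0

    def add_layer(self, name, **kwargs):
        # Reject duplicate names so an existing layer is never overwritten
        if name in self.layer_dict:
            raise ValueError(f"Layer with name {name} already exists.")
        # Attach the layer as an attribute and record its order
        setattr(self, name, StubLayer(**kwargs))
        self.layer_dict[name] = self.layer_counter
        self.layer_counter += 1


registry = LayerRegistry()
registry.add_layer("feature_layer", dims=[784, 1568])
registry.add_layer("output_layer", dims=[1568, 500])

# Adding the same name twice should raise a ValueError
try:
    registry.add_layer("feature_layer", dims=[784, 1568])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(registry.layer_dict, duplicate_rejected)
```

Storing the insertion index in `layer_dict` is what later allows the layers to be trained and evaluated in the order they were added.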

### 2.3 Set the training regime

Training is also handled by the `VPRTempo()` class and runs layer by layer until all the defined layers are trained. The initial learning rates are copied so that they can be annealed appropriately over the defined number of timesteps. Training runs for the specified number of epochs, with the number of timesteps per epoch set by the `train_loader` (a simple [PyTorch DataLoader](https://pytorch.org/tutorials/beginner/basics/data_tutorial.html), covered later).

Once a layer has been trained, the learning for that layer will be turned off and training deeper layers will propagate the input spikes through each trained layer until it reaches the one being currently learned. Learning involves spike-timing dependent plasticity (STDP) rules, firing threshold adjustments, and inhibitory connection normalization.

In [None]:
%%add_to VPRTempo
def train_model(self, train_loader, layer, prev_layers=None):
    """
    Train a layer of the network model.

    :param train_loader: Training data loader
    :param layer: Layer to train
    :param prev_layers: Previous layers to pass data through
    """

    # Initialize the tqdm progress bar
    pbar = tqdm(total=int(self.T * self.epoch),
                desc="Training ",
                position=0)

    # Initialize the learning rates for each layer (used for annealment)
    init_itp = layer.eta_ip.detach()
    init_stdp = layer.eta_stdp.detach()

    # Run training for the specified number of epochs
    for epoch in range(self.epoch):
        mod = 0  # Used to determine the learning rate annealment, resets at each epoch
        # Run training for the specified number of timesteps
        for spikes, labels in train_loader:
            spikes, labels = spikes.to(self.device), labels.to(self.device)
            idx = labels / self.filter # Set output index for spike forcing
            # Pass through previous layers if they exist
            if prev_layers:
                with torch.no_grad():
                    for prev_layer_name in prev_layers:
                        prev_layer = getattr(self, prev_layer_name) # Get the previous layer object
                        spikes = self.forward(spikes, prev_layer) # Pass spikes through the previous layer
                        spikes = bn.clamp_spikes(spikes, prev_layer) # Clamp spikes [0, 0.9]
            else:
                prev_layer = None
            # Get the output spikes from the current layer
            pre_spike = spikes.detach() # Previous layer spikes for STDP
            spikes = self.forward(spikes, layer) # Current layer spikes
            spikes_noclp = spikes.detach() # Used for inhibitory homeostasis
            spikes = bn.clamp_spikes(spikes, layer) # Clamp spikes [0, 0.9]
            # Calculate STDP
            layer = bn.calc_stdp(pre_spike,spikes,spikes_noclp,layer, idx, prev_layer=prev_layer)
            # Adjust learning rates
            layer = self._anneal_learning_rate(layer, mod, init_itp, init_stdp)
            # Update the annealing mod & progress bar 
            mod += 1
            pbar.update(1)

    # Close the tqdm progress bar
    pbar.close()

### 2.4 Create the forward pass

Layers in VPRTempo are defined as [nn.Linear](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html) layers, with incoming spikes being linearly transformed by the layer weights. The forward pass simply takes incoming spikes, calculates the transform with the positive (`layer.exc`) and negative (`layer.inh`) weights, adds the two results together, and returns the transformed spikes.

In [None]:
%%add_to VPRTempo
def forward(self, spikes, layer):
    """
    Compute the forward pass of the model.

    Parameters:
    - spikes (Tensor): Input spikes.
    - layer (SNNLayer): Layer to pass the spikes through.

    Returns:
    - Tensor: Output after processing.
    """

    spikes = self.quant(spikes)
    spikes = self.add.add(layer.exc(spikes), layer.inh(spikes))
    spikes = self.dequant(spikes)

    return spikes
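
The arithmetic of the forward pass can be followed by hand on a tiny example. The sketch below is plain Python rather than PyTorch, omits bias terms and quantization stubs, and uses made-up weights; the `clamp` helper is a hypothetical stand-in for `bn.clamp_spikes`, based only on the `[0, 0.9]` range noted in the training code comments.

```python
def linear(weights, spikes):
    """Matrix-vector product: one output value per row of weights."""
    return [sum(w * s for w, s in zip(row, spikes)) for row in weights]


def forward(spikes, exc_weights, inh_weights):
    """Sum the excitatory and inhibitory transforms of the input spikes."""
    exc = linear(exc_weights, spikes)
    inh = linear(inh_weights, spikes)
    return [e + i for e, i in zip(exc, inh)]


def clamp(spikes, low=0.0, high=0.9):
    """Hypothetical stand-in for bn.clamp_spikes: limit values to [low, high]."""
    return [min(max(s, low), high) for s in spikes]


spikes = [1.0, 0.5]                          # two input neurons
exc_weights = [[1.0, 2.0], [0.5, 0.0]]       # positive weights, one row per output neuron
inh_weights = [[-0.5, 0.0], [-1.0, -1.0]]    # negative weights, same layout

raw = forward(spikes, exc_weights, inh_weights)
out = clamp(raw)
print(raw, out)
```

The first output neuron receives more excitation than inhibition and saturates at the upper clamp, while the second is pushed below zero and is clamped to silence.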

### 2.5 Learning rate annealment & model loader/saver

Finally, the last thing we will add to the model is the learning rate annealment regime and the functions for loading and saving trained models.

In [None]:
%%add_to VPRTempo
def _anneal_learning_rate(self, layer, mod, itp, stdp):
    """
    Anneal the learning rate for the current layer.
    """
    if np.mod(mod, 100) == 0: # Modify learning rate every 100 timesteps
        pt = pow(float(self.T - mod) / self.T, self.annl_pow)
        layer.eta_ip = torch.mul(itp, pt) # Anneal intrinsic threshold plasticity learning rate
        layer.eta_stdp = torch.mul(stdp, pt) # Anneal STDP learning rate

    return layer

def save_model(self, model_out):    
    """
    Save the trained model to models output folder.
    """
    torch.save(self.state_dict(), model_out) 

def load_model(self, model_path):
    """
    Load pre-trained model and set the state dictionary keys.
    """
    self.load_state_dict(torch.load(model_path, map_location=self.device),
                         strict=True)
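
To see how the annealing schedule behaves, the sketch below reproduces the factor computed in `_anneal_learning_rate` with plain floats. The values `total_steps=500` and `power=2` are illustrative assumptions, not the defaults from `settings.py`, and `annealed_rates` is a hypothetical helper.

```python
def annealing_factor(step, total_steps, power):
    """Fraction of the initial learning rate remaining at a given step."""
    return (float(total_steps - step) / total_steps) ** power


def annealed_rates(init_rate, total_steps, power, every=100):
    """Learning rate at every step when it is only refreshed every `every` steps."""
    rate = init_rate
    rates = []
    for step in range(total_steps):
        if step % every == 0:
            rate = init_rate * annealing_factor(step, total_steps, power)
        rates.append(rate)
    return rates


# Illustrative values only: 500 timesteps, annealing power of 2
rates = annealed_rates(init_rate=0.005, total_steps=500, power=2)
print(rates[0], rates[100], rates[400])
```

Because the rate is only refreshed every 100 timesteps, it decays in a staircase rather than a smooth curve, and it is always scaled from the stored initial rate rather than from the previous value.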

### 2.6 Initialize the model

Now that the model has been defined, we can initialize it and start with the quantization process.

In [None]:
model = VPRTempo()
model_logger(model)
model.train()

### 2.7 Generate unique model name

We will finally set up a unique model name based on the network architecture so we can save and reload our trained model.

In [None]:
def generate_model_name(model):
    """
    Generate the model name based on its parameters.
    """
    return ("VPRTempo" +
            str(model.input) +
            str(model.feature) +
            str(model.output) +
            str(model.number_modules) +
            '.pth')

model_name = generate_model_name(model)

print(model_name)
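
For reference, the sketch below runs the same naming logic on a stand-in object built with `types.SimpleNamespace`, using the layer sizes from this tutorial. The value `number_modules=1` is an assumption for illustration; the real value comes from the model configuration.

```python
from types import SimpleNamespace


def generate_model_name(model):
    """Concatenate the architecture sizes into a file name."""
    return ("VPRTempo" +
            str(model.input) +
            str(model.feature) +
            str(model.output) +
            str(model.number_modules) +
            '.pth')


# Stand-in for a configured model; number_modules=1 is an assumed value
example = SimpleNamespace(input=784, feature=1568, output=500, number_modules=1)
name = generate_model_name(example)
print(name)
```

The sizes are joined without separators, so the name identifies the architecture but is not intended to be parsed back into its parts.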

## 3. Define the DataLoader

### 3.1 Set the DataLoader

Now that we've defined the model, we will set up the DataLoaders. These use the `CustomImageDataset` and `ProcessImage` classes to import images and process them for training or inference. In brief, images are loaded, gamma corrected, resized, and then patch-normalized before being converted into spikes to be propagated through the system.

Since we present the network with one image at a time, the `batch_size` is kept to 1.

In [None]:
from dataset import CustomImageDataset, ProcessImage
from torch.utils.data import DataLoader

image_transform = ProcessImage(model.dims, model.patches)
train_dataset = CustomImageDataset(annotations_file=model.dataset_file,
                                   img_dirs=model.training_dirs,
                                   transform=image_transform,
                                   skip=model.filter,
                                   max_samples=model.number_training_images,
                                   test=False)
# Initialize the data loader
train_loader = DataLoader(train_dataset, 
                          batch_size=1, 
                          shuffle=False,
                          num_workers=8,
                          persistent_workers=True)
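
The `skip` and `max_samples` arguments control which images are presented to the network. The sketch below shows one plausible reading, assuming that every `skip`-th image is kept up to `max_samples` and that the image label is divided by the filter to obtain the output neuron index (as the training loop does with `labels / self.filter`). Both helpers are hypothetical and do not reproduce `CustomImageDataset` itself.

```python
def select_samples(num_images, skip, max_samples):
    """Pick every `skip`-th image index, keeping at most `max_samples`."""
    return list(range(0, num_images, skip))[:max_samples]


def output_index(label, skip):
    """Map an image label back to the output neuron that represents it."""
    return label // skip


# Illustrative values: 40 images on disk, keep every 8th, at most 4 places
labels = select_samples(num_images=40, skip=8, max_samples=4)
indices = [output_index(label, skip=8) for label in labels]
print(labels, indices)
```

Under this reading, each selected place maps to its own consecutive output neuron, which is why the output layer size needs to match the number of training images.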

## 5. Set up and run the training 

### 5.1 Define and run the training regime

The training will loop through each defined layer until every one has been trained. In order to propagate spikes through the system, trained layers are appended to a list so that input spikes can be passed through them, using their learned weights, before reaching the layer currently being trained.

Run the below cell to train our `feature_layer` and `output_layer`!

In [None]:
# Keep track of trained layers to pass data through them
trained_layers = [] 

# Training each layer
for layer_name, _ in sorted(model.layer_dict.items(), key=lambda item: item[1]):
    print(f"Training layer: {layer_name}")
    # Retrieve the layer object
    layer = getattr(model, layer_name)
    # Train the layer
    model.train_model(train_loader, layer, prev_layers=trained_layers)
    # After training the current layer, add it to the list of trained layers
    trained_layers.append(layer_name)
    
print('All layers trained successfully')
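
The ordering logic of this loop can be checked without training anything. The sketch below uses a hand-written `layer_dict` (deliberately out of order) and records which previously trained layers each layer would receive as `prev_layers`; the `schedule` list is a hypothetical addition for inspection only.

```python
# Layer name -> insertion index, written out of order on purpose
layer_dict = {"output_layer": 1, "feature_layer": 0}

trained_layers = []
schedule = []  # (layer being trained, layers the spikes pass through first)

for layer_name, _ in sorted(layer_dict.items(), key=lambda item: item[1]):
    # Copy the list so the record is not changed by later appends
    schedule.append((layer_name, list(trained_layers)))
    trained_layers.append(layer_name)

print(schedule)
```

Sorting by the stored index means the feature layer is always trained first, and the output layer then sees spikes that have already passed through the trained feature layer.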

### 5.2 Convert and save the model

Now that the training has been completed, we can convert the QAT model over to be fully quantized. As the layers were trained, scale and zero-point factors will have been learned for all the elements of the model and can now be applied to the layers. Once converted, we will save the model for use in inferencing.

In [None]:
# Convert the model to eval mode
model.eval()
# Save the model
model.save_model(os.path.join('../models', model_name))  
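
To illustrate what scale and zero-point factors do, the sketch below applies standard affine quantization to a few values by hand. The `scale` and `zero_point` used here are illustrative assumptions rather than values learned by the model, and an unsigned 8-bit range is assumed.

```python
def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map a float to an integer level, clipping to the 8-bit range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer level back to its approximate float value."""
    return (q - zero_point) * scale


# Illustrative factors, not values learned during training
scale, zero_point = 1 / 256, 0

q_half = quantize(0.5, scale, zero_point)
q_spike = quantize(0.9, scale, zero_point)
restored = dequantize(q_spike, scale, zero_point)
q_clipped = quantize(2.0, scale, zero_point)
print(q_half, q_spike, restored, q_clipped)
```

The round trip loses at most half a quantization step for values inside the representable range, while values outside it are clipped to the nearest end of the range.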

## 6. Inferencing

As in the previous tutorial, inferencing with a trained model is quite simple. The only additional thing we need to do is reinitialize the VPRTempo class and convert it to quantized before loading the model. Without pre-quantizing the inference model, state dictionary keys will not match since all the layers and associated components have new parameters such as scale and zero-point.

### 6.1 Add the inference function to the VPRTempo class

We will start by adding the inference function to VPRTempo. It is similar to the training regime but omits the learning component `calc_stdp` and simply runs through all the layers until it reaches the output.

In [None]:
%%add_to VPRTempo
def evaluate(self, model, test_loader, layers=None):
    """
    Run the inferencing model and calculate the accuracy.

    :param model: Model whose logger records the accuracy
    :param test_loader: Testing data loader
    :param layers: Layers to pass data through
    """

    # Initialize the number of correct predictions
    numcorr = 0
    idx = 0

    # Initialize the tqdm progress bar
    pbar = tqdm(total=self.number_testing_images,
                desc="Running the test network",
                position=0)

    # Run inference for the specified number of timesteps
    for spikes, labels in test_loader:
        # Set device
        spikes, labels = spikes.to(self.device), labels.to(self.device)
        # Pass through previous layers if they exist
        if layers:
            for layer_name in layers:
                layer = getattr(self, layer_name)
                spikes = self.forward(spikes, layer)
                spikes = bn.clamp_spikes(spikes, layer)

        # Evaluate if the prediction is correct
        if torch.argmax(spikes.reshape(1, self.number_training_images)) == idx:
            numcorr += 1

        # Update the index and progress bar
        idx += 1
        pbar.update(1)

    # Close the tqdm progress bar
    pbar.close()
    # Calculate and record the accuracy
    accuracy = round((numcorr/self.number_testing_images)*100,2)
    model.logger.info("P@100R: "+ str(accuracy) + '%')
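
The accuracy calculation can be checked on a toy example. The sketch below assumes, as `evaluate` does, that query `i` should activate output neuron `i` most strongly; the output values are made up, and `argmax` and `precision_percent` are hypothetical helpers written in plain Python.

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def precision_percent(outputs):
    """Percentage of queries whose strongest output matches their own index."""
    correct = sum(1 for idx, out in enumerate(outputs) if argmax(out) == idx)
    return round(correct / len(outputs) * 100, 2)


# Made-up output spikes for three queries over three learned places
outputs = [
    [0.9, 0.1, 0.0],  # query 0 -> place 0 (correct)
    [0.2, 0.7, 0.1],  # query 1 -> place 1 (correct)
    [0.6, 0.3, 0.1],  # query 2 -> place 0 (incorrect)
]
accuracy = precision_percent(outputs)
print(accuracy)
```

Because every query yields a prediction, this figure corresponds to precision at 100% recall, which is what the `P@100R` log message reports.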

### 6.2 Define the inferencing DataLoader

The only difference between the training and testing DataLoaders is the directory from which images are imported.

In [None]:
# Initialize the image transforms and datasets
image_transform = ProcessImage(model.dims, model.patches)
test_dataset = CustomImageDataset(annotations_file=model.dataset_file, 
                                  img_dirs=model.testing_dirs,
                                  transform=image_transform,
                                  skip=model.filter,
                                  max_samples=model.number_testing_images)
# Initialize the data loader
test_loader = DataLoader(test_dataset, 
                         batch_size=1, 
                         shuffle=False,
                         num_workers=8,
                         persistent_workers=True)

### 6.3 Re-initialize the model class, convert to quantization, and load the model

Now we will re-initialize the VPRTempo class model, set to eval mode, and convert it over to quantized so that we can import our newly trained model.

In [None]:
# Set the model to evaluation mode and set configuration
model = VPRTempo()
model_logger(model)
model.eval()

# Load the model
model.load_model(os.path.join('../models', model_name))

# Retrieve layer names for inference
layer_names = list(model.layer_dict.keys())

### 6.4 Run the model inference

Now we are ready to inference the model!

In [None]:
# Use evaluate method for inference accuracy
model.evaluate(model, test_loader, layers=layer_names)

## 7. Conclusions

This tutorial covered how to convert the VPRTempo model to use Quantization Aware Training (QAT) to keep the model lightweight. If you compare the system in FP32 with Int8, you may notice that the model works equally well at the reduced bit depth, with the added benefit of a smaller model size.

To read more about QAT and quantization in general, PyTorch provides many useful articles:
https://pytorch.org/docs/stable/quantization.html
https://pytorch.org/blog/quantization-in-practice/

The key benefit of this is fast training and inferencing on CPU architecture, which is critical for resource-limited compute scenarios.