## VPRTempoQuant - Basic Demo for a quantized version of VPRTempo

### By Adam D Hines (https://research.qut.edu.au/qcr/people/adam-hines/)

VPRTempo is based on the following paper. If you use this code or find it helpful for your research, please consider citing the source:
    
[Adam D Hines, Peter G Stratton, Michael Milford, & Tobias Fischer. "VPRTempo: A Fast Temporally Encoded Spiking Neural Network for Visual Place Recognition." arXiv, September 2023](https://arxiv.org/abs/2309.10225)

### Introduction

This is a basic, extremely simplified version of VPRTempo that highlights how images are transformed, how spikes and weights are used, and how performance is read out, using a model trained with Quantization Aware Training (QAT). We will view the system through the lens of integer-based weights and spikes to see how a quantized version of VPRTempo operates under the hood.

*Note: In this example, we will lose some amount of precision because we are only using integers for all calculations. In the deployed version, PyTorch quantizes and dequantizes spikes and weights so that some calculations are performed in the floating point domain. As such, this tutorial should be taken purely for conceptual understanding of a quantized version of VPRTempo and not for implementation purposes.*
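
To make the precision loss concrete, here is a minimal sketch that compares integer-only rescaling (floor division, as used throughout this tutorial) with the same operation in floating point. The accumulator values and the `scale` are made up for illustration and are not taken from the trained model.

```python
import numpy as np

# Hypothetical integer accumulator values and scale (illustration only)
acc = np.array([1000, 1999, 2500, 12345], dtype=np.int64)
scale = 1000

# Integer-only path: floor division discards the fractional part
int_only = acc // scale

# Floating point path: the fractional part is kept
float_path = acc / scale

# Precision lost by staying in the integer domain (always in [0, 1) here)
lost = float_path - int_only

print("Integer only:  ", int_only)
print("Floating point:", float_path)
print("Lost precision:", lost)
```

Every non-negative value loses somewhere between 0 and just under 1 unit per floor division, which is why chained integer rescaling drifts from the floating point result.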

Before starting, make sure the following packages are installed and imported:

In [None]:
# Import opencv-python, NumPy, and matplotlib.pyplot
try:
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
except ImportError:
    # pip install if modules are not present (package names are space-separated)
    !pip install numpy opencv-python matplotlib
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt

### Image processing

As in the previous tutorial, we will load in a 360x640 image and show the patch-normalized version.  

In [None]:
# Load the input image
raw_img = cv2.imread('./mats/1_basicdemoquant/summer.png')
rgb_img = cv2.cvtColor(raw_img, cv2.COLOR_BGR2RGB) # Convert to RGB

# Load the patch normalized image
patch_img = np.load('./mats/1_basicdemoquant/summer_patchnorm.npy', allow_pickle=True)
patch_img = patch_img.astype(np.int32)
# Create a figure to hold the subplots
plt.figure(figsize=(10, 4))

# Plot the first image
plt.subplot(1, 2, 1)  # 1 row, 2 columns, 1st subplot
plt.imshow(rgb_img)
plt.title('Nordland Summer')

# Plot the second image
plt.subplot(1, 2, 2)  # 1 row, 2 columns, 2nd subplot
plt.matshow(patch_img, fignum=False)
plt.title('Nordland Summer Patch Normalized')
plt.colorbar(shrink=0.75, label="Pixel intensity")

# Show the plot
plt.show()
max_int = np.max(patch_img)
print(f"The maximum integer pixel value is {max_int}")

The patch-normalized image is made up of floating point values in the range [0, 1]. For the base VPRTempo system, this is fine because the entire system works in floating point. However, our quantized model uses integers. To demonstrate the conversion from floating point to integer, we'll manually quantize our input spikes by dividing by the `scale_factor` determined from the QAT.
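
As a rough sketch of that conversion, the snippet below quantizes a tiny floating point "image" by dividing by a scale factor and rounding. The 2x2 array and the value `1/255` are assumptions chosen for illustration; the real `scale_factor` comes from the QAT.

```python
import numpy as np

# Hypothetical 2x2 patch-normalized image with values in [0, 1]
patch_float = np.array([[0.0, 0.2],
                        [0.6, 1.0]])

# Hypothetical scale factor (the real one is learned during QAT)
scale_factor = 1.0 / 255.0

# Quantize: divide by the scale factor, round, and store as integers
patch_int = np.round(patch_float / scale_factor).astype(np.int32)

# Dequantize to see how close the integers are to the original values
recovered = patch_int * scale_factor

print(patch_int)
print(f"Largest round-trip error: {np.abs(recovered - patch_float).max()}")
```

The round-trip error is bounded by half of the scale factor, so a smaller scale factor gives a finer integer grid.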

Let's load in some model scale factors; we'll use some of these later for the weight calculations.

In [None]:
# Load network scale factor
scale_factors = np.load('./mats/1_basicdemoquant/if_scales.npy',allow_pickle=True)

Like in the previous tutorial, we will convert this to a 1D-array to pass through the layers.

In [None]:
# Convert 2D image to a 1D-array
patch_1d = np.reshape(patch_img, (784,))

### Load the pre-trained network weights

Our network consists of the same architecture as in the previous tutorial. The excitatory and inhibitory weights have been converted to the integer representations from the QAT and will be applied directly to the quantized input spikes.

In [None]:
# Load the input to feature excitatory and inhibitory network weights
if_exc = np.load('./mats/1_basicdemoquant/if_exc.npy')
if_inh = np.load('./mats/1_basicdemoquant/if_inh.npy')

# Create a figure and a set of subplots
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 5)) # Adjust the figure size as needed

# Plot the excitatory weights
exc_plot = axes[0].matshow(if_exc.T)
axes[0].set_title('Input > Feature Excitatory Weights')
fig.colorbar(exc_plot, ax=axes[0], shrink=0.4, label="Weight strength")

# Plot the inhibitory weights
inh_plot = axes[1].matshow(if_inh.T, cmap='viridis_r')
axes[1].set_title('Input > Feature Inhibitory Weights')
fig.colorbar(inh_plot, ax=axes[1], shrink=0.4, label="Weight strength")

# Display the plots
plt.show()

# Print dtype
print(f"Excitatory weights integer type is {if_exc.dtype}")
print(f"Inhibitory weights integer type is {if_inh.dtype}")

In addition to this, we will set the zero points for these weights. From the QAT, the zero point for the excitatory weights was 0 and 127 for the inhibitory weights.

In [None]:
# Set the zero point for the inhibitory weights
zeropoint_inh = 127

### Propagate network spikes

Now we'll propagate the input spikes across the feature layer to get the output, like in the previous tutorial. Let's start with the excitatory weights, since the output needs different scaling depending on the zero point.

In [None]:
# Calculate feature spikes for the positive weight calculation
exc_feature_spikes = (np.matmul(if_exc,patch_1d))

# Now create the line plot
plt.plot(np.arange(len(exc_feature_spikes)), exc_feature_spikes)

# Add title and labels if you wish
plt.title('Excitatory Feature Layer Spikes')
plt.xlabel('Neuron ID')
plt.ylabel('Spike Amplitude')
# Show the plot
plt.show()

We can see here that the spike values calculated for the feature layer are huge. That is because they need to be properly re-scaled after calculation. To do this, we need to take a couple of the scaling factors we imported earlier to transform the output to a reasonable range.
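
To see why the raw values are so large, consider a minimal sketch with random integer weights and spikes. The value ranges (weights up to 127, spikes up to 255), the four output neurons, and the power-of-two `rescale` are assumptions for illustration only; the 784 inputs match the flattened image above.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 784, 4  # 784 inputs as above; 4 output neurons is arbitrary

# Hypothetical integer weights in [0, 127] and spikes in [0, 255]
weights = rng.integers(0, 128, size=(n_out, n_in), dtype=np.int32)
spikes = rng.integers(0, 256, size=n_in, dtype=np.int32)

# Each output sums 784 products, so the accumulator grows quickly
acc = weights @ spikes

# Worst case if every weight and spike were at its maximum
upper_bound = n_in * 127 * 255

# Hypothetical integer rescale to bring values back to a small range
rescale = 1 << 12
scaled = acc // rescale

print("Raw accumulator:", acc)
print("Upper bound:    ", upper_bound)
print("Rescaled:       ", scaled)
```

The worst case here is roughly 25 million, which still fits in a 32-bit integer but is far outside the range of the input spikes, hence the rescaling step.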

In [None]:
# Get the required scale factors to transform the feature spikes
perslice_scale_exc = scale_factors[0]
perchannel_scale_exc = scale_factors[2]

# Transform the feature layer spikes based on the scale factors
scaled_exc_feature_spikes = (exc_feature_spikes//(perslice_scale_exc*perchannel_scale_exc))//perchannel_scale_exc
scaled_exc_feature_spikes = scaled_exc_feature_spikes.astype(np.int32)
# Plot out the scaled feature layer spikes
# Now create the line plot
plt.plot(np.arange(len(scaled_exc_feature_spikes)), scaled_exc_feature_spikes)

# Add title and labels if you wish
plt.title('Excitatory Feature Layer Spikes')
plt.xlabel('Neuron ID')
plt.ylabel('Spike Amplitude')
# Show the plot
plt.show()

Now let's do the same thing for our inhibitory weights.

In [None]:
# Calculate feature spikes for the negative weight calculation
inh_feature_spikes = (np.matmul(if_inh, patch_1d))

# Get the required scale factors to transform the feature spikes
perslice_scale_inh = scale_factors[1]
perchannel_scale_inh = scale_factors[3]

# Transform the feature layer spikes based on the scale factors
scaled_inh_feature_spikes = ((inh_feature_spikes - zeropoint_inh) // (perslice_scale_inh * perchannel_scale_inh)) // perchannel_scale_inh + zeropoint_inh
scaled_inh_feature_spikes = scaled_inh_feature_spikes.astype(np.int32)

# Create a figure and a set of subplots
fig, axs = plt.subplots(1, 2, figsize=(10, 5))  # 'figsize' can be adjusted as needed

# First subplot
axs[0].plot(np.arange(len(inh_feature_spikes)), inh_feature_spikes)
axs[0].set_title('Inhibitory Feature Layer Spikes')
axs[0].set_xlabel('Neuron ID')
axs[0].set_ylabel('Spike Amplitude')

# Second subplot
axs[1].plot(np.arange(len(scaled_inh_feature_spikes)), scaled_inh_feature_spikes)
axs[1].set_title('Scaled Inhibitory Feature Layer Spikes')
axs[1].set_xlabel('Neuron ID')
axs[1].set_ylabel('Spike Amplitude')

# Adjust the layout
plt.tight_layout()
# Show the plot
plt.show()


One thing you may notice is that although we used negative weights, this operation outputs positive spikes. That is because of the `zeropoint_inh` of 127, which we add to the final spike calculation.
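
The role of the zero point can be sketched in a few lines. The zero point of 127 is the one quoted above; the `scale` and the example integers are hypothetical.

```python
import numpy as np

zero_point = 127  # inhibitory zero point quoted in the text
scale = 0.01      # hypothetical scale, for illustration only

# Stored (non-negative) integers and the real values they represent
stored = np.array([0, 100, 127, 200], dtype=np.int32)
real = scale * (stored - zero_point)
print("Represented values:", real)

# Going the other way: negative integer spikes become positive once
# the zero point is added back
signed_spikes = np.array([-50, -10, 0], dtype=np.int32)
shifted = signed_spikes + zero_point
print("Shifted spikes:", shifted)  # [ 77 117 127]
```

Any stored value below 127 represents a negative number, and a stored value of exactly 127 represents zero.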

Now that we have separately calculated our positive and negative feature layer spikes, we need to add them together to get the final feature spikes. However, because the scales and zero points for the two operations are different, the spikes need an additional transformation to bring them onto a common scale. In the VPRTempoQuant model, we derive this addition scale and zero point from the `nn.quantized.FloatFunctional.add` function, which learns these values during QAT.
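
Conceptually, adding two quantized tensors with different scales can be sketched as dequantize, add, then requantize with the output scale and zero point. The helper functions and all numeric values below are hypothetical and only illustrate the idea; this is not PyTorch's actual implementation.

```python
import numpy as np

def dequantize(q, scale, zero_point):
    """Map stored integers back to the real values they represent."""
    return (q.astype(np.float64) - zero_point) * scale

def quantize(x, scale, zero_point, qmin=0, qmax=255):
    """Map real values onto the integer grid defined by scale and zero point."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, qmin, qmax).astype(np.int32)

# Hypothetical excitatory tensor: scale 0.5, zero point 0
a_q = np.array([10, 20, 30], dtype=np.int32)
a_scale, a_zp = 0.5, 0

# Hypothetical inhibitory tensor: scale 0.25, zero point 127
b_q = np.array([119, 125, 127], dtype=np.int32)
b_scale, b_zp = 0.25, 127

# Hypothetical output scale and zero point for the sum
out_scale, out_zp = 0.5, 61

# Dequantize both operands, add, then requantize onto the output grid
total = dequantize(a_q, a_scale, a_zp) + dequantize(b_q, b_scale, b_zp)
out_q = quantize(total, out_scale, out_zp)

print("Real-valued sum:", total)   # [ 3.   9.5 15. ]
print("Quantized sum:  ", out_q)   # [67 80 91]
```

The cell that follows performs a comparable rescaling using integer arithmetic and the values taken from the trained model.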

In [None]:
# Combined scale factors
combined_scale = 58
combined_zeropoint = 61

# Remove zeropoint from inhibitory spikes
scaled_inh_feature_spikes_zero = scaled_inh_feature_spikes - zeropoint_inh

# Combine the excitatory and inhibitory feature spikes
exc_rescaled = (scaled_exc_feature_spikes/perchannel_scale_exc) * combined_scale
inh_rescaled = (scaled_inh_feature_spikes_zero/perchannel_scale_exc) * combined_scale
print(perchannel_scale_inh.dtype)
combined = (exc_rescaled.astype(np.int32) + inh_rescaled.astype(np.int32)) + combined_zeropoint
combined = np.clip(combined,0,max_int)

# Plot the combined spikes
plt.plot(np.arange(len(combined)), combined)

# Add title and labels if you wish
plt.title('Combined Feature Layer Spikes')
plt.xlabel('Neuron ID')
plt.ylabel('Spike Amplitude')
# Show the plot
plt.show()

Now we will apply the same process for the output layer to get the output spikes.

In [None]:
# Load the feature to output excitatory and inhibitory network weights
fo_exc = np.load('./mats/1_basicdemoquant/fo_exc.npy')
fo_inh = np.load('./mats/1_basicdemoquant/fo_inh.npy')

# Create a figure and a set of subplots
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 5)) # Adjust the figure size as needed

# Plot the excitatory weights
exc_plot = axes[0].matshow(fo_exc)
axes[0].set_title('Feature > Output Excitatory Weights')
fig.colorbar(exc_plot, ax=axes[0], shrink=0.4, label="Weight strength")

# Plot the inhibitory weights
inh_plot = axes[1].matshow(fo_inh, cmap='viridis_r')
axes[1].set_title('Feature > Output Inhibitory Weights')
fig.colorbar(inh_plot, ax=axes[1], shrink=0.4, label="Weight strength")

# Display the plots
plt.show()

# Print dtype
print(f"Excitatory weights integer type is {fo_exc.dtype}")
print(f"Inhibitory weights integer type is {fo_inh.dtype}")

We'll get our excitatory and inhibitory spikes for the output and scale them.

In [None]:
# Load the output layer scales
fo_scales = np.load('./mats/1_basicdemoquant/fo_scales.npy',allow_pickle=True)

# Calculate the excitatory and inhibitory spikes and scale them
exc_output_spikes = np.round(np.matmul(fo_exc,combined))
scaled_exc_output_spikes = exc_output_spikes // (fo_scales[0]) 

inh_output_spikes = (np.matmul(fo_inh,combined.astype(np.int32)))
scaled_inh_output_spikes = (inh_output_spikes - zeropoint_inh) // fo_scales[1]

# Combine the excitatory and inhibitory output spikes
exc_rescaled = (scaled_exc_output_spikes/fo_scales[2]) * combined_scale
inh_rescaled = ((scaled_inh_output_spikes - zeropoint_inh)/fo_scales[3]) * combined_scale

output_spikes = (exc_rescaled.astype(np.int32) + inh_rescaled.astype(np.int32)) + combined_zeropoint
output_spikes = np.clip(output_spikes,0,max_int)

# Plot the combined spikes
plt.plot(np.arange(len(output_spikes)), output_spikes)

# Add title and labels if you wish
plt.title('Combined Output Layer Spikes')
plt.xlabel('Neuron ID')
plt.ylabel('Spike Amplitude')
# Show the plot
plt.show()

And now, as in the previous tutorial, we can clearly see that a single neuron has the highest output spike amplitude, which should correspond to our first learned location.

Let's quickly prove it.

In [None]:
# Output the argmax from the output spikes
prediction = np.argmax(output_spikes)
print(f"Neuron ID with the highest output is {prediction}")

### Conclusions

We have gone through a very basic demo of how VPRTempoQuant works and the operations involved in quantizing floating point spikes and weights into the integer domain. Although this isn't exactly how PyTorch performs these tasks (many of them are done in the FP space, especially rescaling for addition), it should give you a good idea of how we can perform these kinds of operations in whole integers. This is particularly useful for implementation on hardware such as neuromorphic processors.

If you would like to go more in-depth with training and inference, check out some of the [other tutorials](https://github.com/AdamDHines/VPRTempo-quant/tree/main/tutorials), which show you how to train your own model and go through the more sophisticated implementation of VPRTempo.