# Learning Goals
In Assignment 2, we learnt how to construct networks of spiking neurons and propagate information through a network of fixed weights. In this assignment, you will learn how to train network weights for a given task using brain-inspired learning rules.

Let's import all the libraries required for this assignment. 

In [1]:
import math
import numpy as np
import matplotlib.pyplot as plt

# Question 1: Training a Network

## 1a. 
What is the purpose of a learning algorithm? In other words, what does a learning algorithm dictate, and what is the objective of it?

## Answer 1a. 
Learning algorithms consist of a set of procedures designed to enable a computer program to learn from data and improve its performance over time. These algorithms are part of a subfield of artificial intelligence. It dictates how well a computer program can learn using a training model. The objective of learning algorithms is to observe that the rate of learning decreases with time during training, and to enable the algorithm to make formidable predictions during testing. When it comes to learning, two majors obstacles is choosing the correct loss function to measure reliance of the data and preventing overfitting.

There are also various types of learning algorithms, including supervised and unsupervising learning. In my Introduction to AI course, our final project dealt with supervised learning! We had a 20 x 20 grid layout of a pixel image with four colored wires, two across and two down. These colored wires were laid randomly, and our goal was to cut the third wire laid in order to diffuse an impending bomb! We used a linear regression model for the first task, and multinomial regression for the second task. In our write-up, we had to clearly outline key components of a learning algorithm: input/output space, model space, loss function, type of regularization, training algorithm (or gradient descent), and testing performance.

## 1b. 
Categorize and explain the various learning algorithms w.r.t. biological plausibility. Can you explain the tradeoffs involved with the different learning rules? *Hint: Think computational advantages and disadvantages of biological plausibility.*

## Answer 1b. 
As mentioned, two major types of learning algorithms are supervising learning versus unsupervised learning. Both can be discussed with respect to biological plausibility in the following manner.

In supervised learning, the learning algorithm learns from direct data, where each input-output pair is given during training. After training, the algorithm is given new data where it learns to map inputs to outputs by classifying a label during testing. In terms of biological plausability, such a strict form of learning may not prove to be efficient because it tends to lead to overfitting to the training data, leading to complications when it comes to adapting to new data. While supervised learning will work for less complex mechanisms, it may not be the most suitable to datasets that are susceptible to change.

In unsupervised learning, the learning algorithm learns patterns from unlabeled data. The algorithm tries to find hidden or intrinsic structures within the data, without human intervention, via clustering or dimensionality reduction. However, this type of learning leads to less accuracy, as opposed to supervised learning. In fact, a downside to unsupervised learning is computational complexity. In terms of biological plausability, this 'free'-form of learning may be more efficient because it allows for passive changes to be taken into account when conducting statistical correlations. Therefore, even if there are outliers in the data, they are still taken into account in the process of learning.

Next, we talk about two specific learning algorithms: Hebbian learning and STDP.

In Hebbian learning, it can best be summarized by "neurons that fire together wire together". In terms of biological plausability, this means that the connection between neurons are stronger when activated together, or weaker otherwise.

In STDP, it is a type of Hebbian learning that takes into account temporal encoding. In terms of biological plasuability, the strength of a connection between neurons is modified based on the relative timing of spikes.

More on Hebbian learning and STDP will be discussed later.

# Question 2: Hebbian Learning

## 2a.

In this exercise, you will implement the hebbian learning rule to solve AND Gate. First, we need to create a helper function to generate the training data. The function should return lists of tuples where each tuple comprises of numpy arrays of rate-coded inputs and the corresponding rate-coded output. 

Below is the function to generate the training data. Fill the components to return the training data. 

In [2]:
def genANDTrainData(snn_timestep):
    """ 
    Function to generate the training data for AND.
        Args:
            snn_timestep (int): timesteps for SNN simulation
        Return:
            train_data (list): list of tuples where each tuple comprises of numpy arrays of rate-coded inputs and output
        
        Write the expressions for encoding 0 and 1. Then append all 4 cases of AND gate to the list train_data
    """
    # Initialize an empty list for train data.
    train_data = []
    
    # Encode 0. Numpy random choice function might be useful here.
    zero = np.random.choice([1, 0], snn_timestep)
    print('Zero:\n', zero)
    
    # Encode 1. Numpy random choice function might be useful here.
    one = np.random.choice([1, 0], snn_timestep)
    print('One:\n', one)
    
    # Append all 4 cases of AND gate to train_data. Numpy stack operation might be useful here.
    sub_case1 = np.array([zero, zero])
    sub_case2 = np.array([zero, one])
    sub_case3 = np.array([one, zero])
    sub_case4 = np.array([one, one])
    
    case1 = np.array([sub_case1, zero])
    case2 = np.array([sub_case2, zero])
    case3 = np.array([sub_case3, zero])
    case4 = np.array([sub_case4, one])
    
    train_data = np.stack((case1, case2, case3, case4))
    print('Training Data:\n', train_data)
    
    return train_data

## 2b. 
We will use the implementation of the network from assignment 2 to create an SNN comprising of one input layer and one output layer. Can you explain algorithmically, how you can use this simple architecture to learn AND gate. Your algorithm should comprise of encoding, forward propagation, network training, and decoding steps. 

## Answer 2b.
First, we define that Hebbian learning is a type of unsupervised learning that is based on the idea that connections between neurons strengthen when those neurons fire together. Therefore, during learning, the connections of neurons that fire together strengthen, while the connections of neurons that do not fire together weaken.

Second, we define an AND gate is an implementation of two input neurons and one output neuron. Each input neuron represents one of the binary inputs to the AND gate (0 or 1), and the output neuron represents the output of the gate (0 or 1). If throwback to Discrete Structures, we know that the expression (A & B) is only true if both A and B are true.

We can now combine the two concepts together. When both input neurons are active, the output neuron should also be activated to produce a true output. Strengthening the synaptic connections between the input neurons and the output neuron when both inputs are active ensures that the output neuron is more likely to fire in response to this input pattern in the future. When both input neurons are not active or only _one_ input neuron is active, the output neuron should also not be as activated, if at all active.

Using this fundamental idea, we can now begin to explain algorithmically how an SNN comprising of one input layer and one output layer can learn a simple logical gate.

**Encoding:**
First, we encode the input layer into spike trains. Each neuron in this layer is resembled by one input bit of the AND gate (0,0), (0,1), (1,0), or (1,1).

**Forward Propagation:**
Next, we propagate spikes from the input layer to the output layer. Each connection between an input neuron and an output neuron has a weight associated with it. The output neuron integrates the weighted spikes from the input neurons over time.

**Training Data:**
   - Initialize the connection weights connecting the input neurons to the output neurons with random values.
   - Iterate over the training data and calculate the mean firing rate.
   - Adjust the connection weights based on mean firing rate. This should strengthen the connections that contribute to correct outputs and weaken those that contribute to incorrect outputs.
   - Run the model for specified number of epochs.

**Decoding:**
Lastly, we decode the output spikes to determine the predicted output of the AND gate. We can do so by using a threshold value: Over the threshold would be classified as 1, or 0 otherwise.

**Testing Data:**
Present new input data to the trained SNN and compare the decoded outputs with the expected outputs. We can measure the accuracy of the SNN in correctly predicting the output of the AND gate for different input patterns.

For an AND gate, this means testing it with input patterns where both inputs are 0, one input is 0 while the other is 1, and both inputs are 1.

If the SNN-based AND gate has learned correctly through Hebbian learning, it should produce the correct output pattern (0 or 1) for each corresponding input pattern.

The SNN has already been implemented for you. You do not need to do anything here. Just understand the implementation so that you can use it in the later parts. 

In [3]:
class LIFNeurons:
    """ 
        Define Leaky Integrate-and-Fire Neuron Layer 
        This class is complete. You do not need to do anything here.
    """

    def __init__(self, dimension, vdecay, vth):
        """
        Args:
            dimension (int): Number of LIF neurons in the layer
            vdecay (float): voltage decay of LIF neurons
            vth (float): voltage threshold of LIF neurons
        
        """
        self.dimension = dimension
        self.vdecay = vdecay
        self.vth = vth

        # Initialize LIF neuron states.
        self.volt = np.zeros(self.dimension)
        self.spike = np.zeros(self.dimension)
    
    def __call__(self, psp_input):
        """
        Args:
            psp_input (ndarray): synaptic inputs 
        Return:
            self.spike: output spikes from the layer
                """
        self.volt = self.vdecay * self.volt * (1. - self.spike) + psp_input
        self.spike = (self.volt > self.vth).astype(float)
        return self.spike

class Connections:
    """ Define connections between spiking neuron layers """

    def __init__(self, weights, pre_dimension, post_dimension):
        """
        Args:
            weights (ndarray): connection weights
            pre_dimension (int): dimension for pre-synaptic neurons
            post_dimension (int): dimension for post-synaptic neurons
        """
        self.weights = weights
        self.pre_dimension = pre_dimension
        self.post_dimension = post_dimension
    
    def __call__(self, spike_input):
        """
        Args:
            spike_input (ndarray): spikes generated by the pre-synaptic neurons
        Return:
            psp: postsynaptic layer activations
        """
        psp = np.matmul(self.weights, spike_input)
        return psp
    
    
class SNN:
    """ Define a Spiking Neural Network with No Hidden Layer """

    def __init__(self, input_2_output_weight, 
                 input_dimension=2, output_dimension=2,
                 vdecay=0.5, vth=0.5, snn_timestep=20):
        """
        Args:
            input_2_hidden_weight (ndarray): weights for connection between input and hidden layer
            hidden_2_output_weight (ndarray): weights for connection between hidden and output layer
            input_dimension (int): input dimension
            hidden_dimension (int): hidden_dimension
            output_dimension (int): output_dimension
            vdecay (float): voltage decay of LIF neuron
            vth (float): voltage threshold of LIF neuron
            snn_timestep (int): number of timesteps for inference
        """
        self.snn_timestep = snn_timestep
        self.output_layer = LIFNeurons(output_dimension, vdecay, vth)
        self.input_2_output_connection = Connections(input_2_output_weight, input_dimension, output_dimension)
    
    def __call__(self, spike_encoding):
        """
        Args:
            spike_encoding (ndarray): spike encoding of input
        Return:
            spike outputs of the network
        """
        spike_output = np.zeros(self.output_layer.dimension)
        for tt in range(self.snn_timestep):
            input_2_output_psp = self.input_2_output_connection(spike_encoding[:, tt])
            output_spikes = self.output_layer(input_2_output_psp)
            spike_output += output_spikes
        return spike_output/self.snn_timestep      

## 2c. 
Next, you need to write a function for network training using hebbian learning rule. The function is defined below. You need to fill in the components so that the network weights are updated in the right manner. 

In [4]:
def hebbian(network, train_data, lr=1e-5, epochs=10):
    """ 
    Function to train a network using Hebbian learning rule
        Args:
            network (SNN): SNN network object
            train_data (list): training data 
            lr (float): learning rate
            epochs (int): number of epochs to train with. Each epoch is defined as one pass over all training samples. 
        
        Write the operations required to compute the weight increment according to the hebbian learning rule.
        Then increment the network weights. 
    """
    
    # Iterate over the epochs.
    for ee in range(epochs):
        # Iterate over all samples in train_data.
        for data in train_data:
            # Compute the firing rate for the input.
            input_firing_rate = np.mean(data[0], 1)
            
            # Compute the firing rate for the output.
            output_firing_rate = np.mean(data[1])
            
            # Compute the correlation using the firing rates calculated above.
            correlation = input_firing_rate * output_firing_rate 
            
            # Compute the weight increment.
            weight_increment = lr * correlation
            
            # Increment the weight.
            network.input_2_output_connection.weights += weight_increment

## 2d. 
In this exercise, you will use your implementations above to train an SNN to learn AND gate. 

In [5]:
# Define a variable for input dimension.
input_dimension = 2

# Define a variable for output dimension.
output_dimension = 1

# Define a variable for voltage decay.
vdecay = 0.5

# Define a variable for voltage threshold.
vth = 0.5

# Define a variable for snn timesteps.
snn_timestep = 10

# Initialize randomly the weights from input to output. Numpy random rand function might be useful here.
input_2_output_weight = np.random.rand(output_dimension, input_dimension)

# Print the initial network weights.
print('\nInitial Network Weights:\n', input_2_output_weight)

# Initialize a SNN using the arguments defined above.
snn_1 = SNN(input_2_output_weight, input_dimension, output_dimension, vdecay, vth, snn_timestep)

# Get the training data for AND gate using the function defined in 2a.
print('\n***TRAINING***')
train_data = genANDTrainData(snn_timestep)

# Train the network using the function defined in 2c with the appropriate arguments.
hebbian(snn_1, train_data, lr=2e-2, epochs=10)

# Test the trained network and print the network output for all 4 cases.
print('\n***TESTING***')
test_data = []

zero = np.random.choice([1, 0], snn_timestep)
print('Zero:\n', zero)

one = np.random.choice([1, 0], snn_timestep)
print('One:\n', one)

case1 = np.array([zero, zero])
case2 = np.array([zero, one])
case3 = np.array([one, zero])
case4 = np.array([one, one])

# Case 1:
output1 = snn_1(case1)
print("Case 1:", output1)
if output1 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 2:
output2 = snn_1(case2)
print("Case 2:", output2)
if output2 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 3:
output3 = snn_1(case3)
print("Case 3:", output3)
if output3 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 4:
output4 = snn_1(case4)
print("Case 4:", output4)
if output4 > vth:
    test_data.append(1)
else:
    test_data.append(0)

print('Testing Data:\n', test_data)

# Print the final network weights.
print("\nFinal Network Weights:\n", snn_1.input_2_output_connection.weights)


Initial Network Weights:
 [[0.33186292 0.45042939]]

***TRAINING***
Zero:
 [0 1 1 1 0 1 1 1 1 0]
One:
 [1 1 1 0 1 0 0 0 0 1]
Training Data:
 [[array([[0, 1, 1, 1, 0, 1, 1, 1, 1, 0],
       [0, 1, 1, 1, 0, 1, 1, 1, 1, 0]])
  array([0, 1, 1, 1, 0, 1, 1, 1, 1, 0])]
 [array([[0, 1, 1, 1, 0, 1, 1, 1, 1, 0],
       [1, 1, 1, 0, 1, 0, 0, 0, 0, 1]])
  array([0, 1, 1, 1, 0, 1, 1, 1, 1, 0])]
 [array([[1, 1, 1, 0, 1, 0, 0, 0, 0, 1],
       [0, 1, 1, 1, 0, 1, 1, 1, 1, 0]])
  array([0, 1, 1, 1, 0, 1, 1, 1, 1, 0])]
 [array([[1, 1, 1, 0, 1, 0, 0, 0, 0, 1],
       [1, 1, 1, 0, 1, 0, 0, 0, 0, 1]])
  array([1, 1, 1, 0, 1, 0, 0, 0, 0, 1])]]

***TESTING***
Zero:
 [0 1 1 0 0 1 1 1 1 0]
One:
 [1 0 0 0 0 1 1 0 0 1]
Case 1: [0.6]
Case 2: [0.8]
Case 3: [0.8]
Case 4: [0.4]
Testing Data:
 [1, 1, 1, 0]

Final Network Weights:
 [[0.64786292 0.76642939]]


# Question 3: Limitations of Hebbian Learning rule

## 3a. 
Can you learn the AND gate using 2 neurons in the output layer instead of one? If yes, describe what changes you might need to make to your algorithm in 2b. If not, explain why not, and what consequences it might entail for the use of hebbian learning for complex real-world tasks. 

## Answer 3a. 
We can use two neurons in the output layer! However, since this is Hebbian learning, we would need to adjust the weight update rule to accommodate multiple output neurons by using negative weights. Negative weights enable the SNN to learn that certain input combinations should result in a higher activation ('strengthen') for specific output neurons. So, in the decoding phase: Over the threshold would be classified as 0.5 (rather than 1), or -0.5 (instead of 0) otherwise.

## 3b. 
Train the network using hebbian learning for AND gate with the same arguments as defined in 2d. but now multiply the number of epochs by 20. Can your network still learn AND gate correctly? Inspect the initial and final network weights, and compare them against the network weights in 2d. Based on this, explain your observations for the network behavior. 

In [6]:
# Implementation for 3b (Same as 2d, but with change of one argument).

# ...
# Initialize a SNN using the arguments defined above.
snn_2 = SNN(input_2_output_weight, input_dimension, output_dimension, vdecay, vth, snn_timestep)

# Get the training data for AND gate using the function defined in 2a.
print('\n***TRAINING***')
train_data = genANDTrainData(snn_timestep)

# Train the network using the function defined in 2c with the appropriate arguments.
hebbian(snn_2, train_data, lr=2e-2, epochs=200) # 10 * 20 = 200

# Test the trained network and print the network output for all 4 cases.
print('\n***TESTING***')
test_data = []

zero = np.random.choice([1, 0], snn_timestep)
print('Zero:\n', zero)

one = np.random.choice([1, 0], snn_timestep)
print('One:\n', one)

case1 = np.array([zero, zero])
case2 = np.array([zero, one])
case3 = np.array([one, zero])
case4 = np.array([one, one])

# Case 1:
output1 = snn_2(case1)
print("Case 1:", output1)
if output1 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 2:
output2 = snn_2(case2)
print("Case 2:", output2)
if output2 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 3:
output3 = snn_2(case3)
print("Case 3:", output3)
if output3 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 4:
output4 = snn_2(case4)
print("Case 4:", output4)
if output4 > vth:
    test_data.append(1)
else:
    test_data.append(0)

print('Testing Data:\n', test_data)

# Print the final network weights.
print("\nFinal Network Weights:\n", snn_2.input_2_output_connection.weights)


***TRAINING***
Zero:
 [0 0 0 1 0 1 0 1 0 1]
One:
 [0 1 0 1 0 0 1 1 0 1]
Training Data:
 [[array([[0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
       [0, 0, 0, 1, 0, 1, 0, 1, 0, 1]])
  array([0, 0, 0, 1, 0, 1, 0, 1, 0, 1])]
 [array([[0, 0, 0, 1, 0, 1, 0, 1, 0, 1],
       [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]])
  array([0, 0, 0, 1, 0, 1, 0, 1, 0, 1])]
 [array([[0, 1, 0, 1, 0, 0, 1, 1, 0, 1],
       [0, 0, 0, 1, 0, 1, 0, 1, 0, 1]])
  array([0, 0, 0, 1, 0, 1, 0, 1, 0, 1])]
 [array([[0, 1, 0, 1, 0, 0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0, 0, 1, 1, 0, 1]])
  array([0, 1, 0, 1, 0, 0, 1, 1, 0, 1])]]

***TESTING***
Zero:
 [0 1 0 0 0 0 1 1 0 1]
One:
 [0 1 1 0 1 0 0 0 0 1]
Case 1: [0.4]
Case 2: [0.6]
Case 3: [0.6]
Case 4: [0.4]
Testing Data:
 [0, 1, 1, 0]

Final Network Weights:
 [[3.72786292 3.84642939]]


## Answer 3b. 
In `2d`, when `epochs=10`, the final network weights were generally in the range between 0 and 1 (more or less inclusively). In `3b`, when `epochs=200`, the final network weights were generally much greater than 1. This shows a drastic difference in weight calculation when comparing the respective initial weights to the final weights.

Such a change in weightsÂ is likely an indicator of stronger connections between the input and output neurons. We refer back to `2b` where we state that our main goal with the AND gate is to strengthen the connection weights between the input neurons and output neurons. With the increase in weight, we can see that increasing epochs enabled that goal. So, our network can, indeed, still learn and make better predictions! However, this might be resulting a positive feedback loop. With no bounds in place, it seems that the connection weights are able to grow without fault over time.

## 3c. 
Based on your observations and response in 3b., can you explain another limitation of hebbian learning rule w.r.t. weight growth? Can you also suggest a possible remedy for it?

## Answer 3c. 
Another limitation of hebbian learning with respect to weight growth is oversaturation. As we observed in `3b`, the connection weights of the network grew without bounds. This could lead to neurons saturating as they reach their maximum activation levels. This saturation can prevent further learning and limit the network's capacity to adapt to new information.

To remedy this, (after possibly sneaking a peek at 3d...), one approach is Oja's rule. Oja's rule is useful for principal component analysis (PCA) and for extracting the principal components of the input data. It operates on a single neuron and updates its weight vector in the direction of the input vector that has the maximal variance with the current weight vector. We do this because maximum variances leads to maximum information. By focusing on the most informative dimensions of the input data, Oja's rule can lead to more efficient learning and adaptation.

## 3d. 
To resolve the issues with hebbian learning, one possibility is Oja's rule. In this exercise, you will implement and train an SNN using Oja's learning rule. 

In [7]:
def oja(network, train_data, lr=1e-5, epochs=10):
    """ 
    Function to train a network using Hebbian learning rule.
        Args:
            network (SNN): SNN network object
            train_data (list): training data 
            lr (float): learning rate
            epochs (int): number of epochs to train with. Each epoch is defined as one pass over all training samples. 
        
        Write the operations required to compute the weight increment according to the hebbian learning rule. Then increment the network weights. 
    """
    
    # Iterate over the epochs.
    for ee in range(epochs):
        # Iterate over all samples in train_data.
        for data in train_data:
            # Compute the firing rate for the input.
            input_firing_rate = np.mean(data[0], 1)
            
            # Compute the firing rate for the output.
            output_firing_rate = np.mean(data[1])
            
            # Compute the weight increment.
            weight_increment = lr * np.outer(output_firing_rate, input_firing_rate)
            
            # Increment the weight.
            network.input_2_output_connection.weights += weight_increment

Now, test your implementation below. 

In [8]:
# Define a variable for input dimension.
input_dimension = 2

# Define a variable for output dimension.
output_dimension = 1

# Define a variable for voltage decay.
vdecay = 0.5

# Define a variable for voltage threshold.
vth = 0.5

# Define a variable for snn timesteps.
snn_timestep = 10

# Initialize randomly the weights from input to output. Numpy random rand function might be useful here.
input_2_output_weight = np.random.rand(output_dimension, input_dimension)

# Print the initial network weights.
print('\nInitial Network Weights:\n', input_2_output_weight)

# Initialize a SNN using the arguments defined above.
snn_3 = SNN(input_2_output_weight, input_dimension, output_dimension, vdecay, vth, snn_timestep)

# Get the training data for AND gate using the function defined in 2a.
print('\n***TRAINING***')
train_data = genANDTrainData(snn_timestep)

# Train the network using the function defined in 3d with the appropriate arguments.
oja(snn_3, train_data, lr=2e-2, epochs=10)

# Test the trained network and print the network output for all 4 cases.
print('\n***TESTING***')
test_data = []

zero = np.random.choice([1, 0], snn_timestep)
print('Zero:\n', zero)

one = np.random.choice([1, 0], snn_timestep)
print('One:\n', one)

case1 = np.array([zero, zero])
case2 = np.array([zero, one])
case3 = np.array([one, zero])
case4 = np.array([one, one])

# Case 1:
output1 = snn_3(case1)
print("Case 1:", output1)
if output1 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 2:
output2 = snn_3(case2)
print("Case 2:", output2)
if output2 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 3:
output3 = snn_3(case3)
print("Case 3:", output3)
if output3 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 4:
output4 = snn_3(case4)
print("Case 4:", output4)
if output4 > vth:
    test_data.append(1)
else:
    test_data.append(0)

print('Testing Data:\n', test_data)

# Print the final network weights.
print("\nFinal Network Weights:\n", snn_3.input_2_output_connection.weights)


Initial Network Weights:
 [[0.78223878 0.0657187 ]]

***TRAINING***
Zero:
 [0 0 1 1 1 1 1 1 1 0]
One:
 [0 0 0 0 0 0 1 1 0 1]
Training Data:
 [[array([[0, 0, 1, 1, 1, 1, 1, 1, 1, 0],
       [0, 0, 1, 1, 1, 1, 1, 1, 1, 0]])
  array([0, 0, 1, 1, 1, 1, 1, 1, 1, 0])]
 [array([[0, 0, 1, 1, 1, 1, 1, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]])
  array([0, 0, 1, 1, 1, 1, 1, 1, 1, 0])]
 [array([[0, 0, 0, 0, 0, 0, 1, 1, 0, 1],
       [0, 0, 1, 1, 1, 1, 1, 1, 1, 0]])
  array([0, 0, 1, 1, 1, 1, 1, 1, 1, 0])]
 [array([[0, 0, 0, 0, 0, 0, 1, 1, 0, 1],
       [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]])
  array([0, 0, 0, 0, 0, 0, 1, 1, 0, 1])]]

***TESTING***
Zero:
 [1 0 0 0 1 0 0 0 1 0]
One:
 [1 1 0 1 1 0 1 1 0 0]
Case 1: [0.3]
Case 2: [0.3]
Case 3: [0.6]
Case 4: [0.6]
Testing Data:
 [0, 0, 1, 1]

Final Network Weights:
 [[1.03823878 0.3217187 ]]


# Question 4: Spike-time dependent plasticity (STDP)

## 4a. 
What is the limitation with hebbian learning that STDP aims to resolve?

## Answer 4a. 
Hebbian learning simply strengthens connection weights between neurons that fire together, without distinguishing between the relative timing of their activity. As a result, it cannot capture the causal relationship between the firing of pre- and postsynaptic neurons. Therefore, the limitation of hebbian learning that STDP aims to resolve is the precise timing of pre- and postsynaptic spikes.

## 4b. 
Describe the algorithm to train a network using STDP learning rule. You do not need to describe encoding here. Your algorithm should be such that its naturally translatable to a program. 

## Answer 4b. 
According to the STDP model referenced in the article below, there is a weight change if there is a pre-synaptic spike in the temporal vicinity of a post-synaptic spike. The change is positive if the pre-synaptic spike occurs immediately _before_ the post-synaptic spike. Otherwise, the change is negative. Unlike hebbian learning, each connection weight in our network introduces a time delay, representing the difference between post-synaptic firing and pre-synaptic potential rise (where time in in discrete units). Membrane potential increases by the weight value of each incoming spike, but a constant is subtracted periodically to account for time delay. When the potential crosses the threshold, a spike is produced, and the potential resets to the resting level. This is the basic algorithm for STDP.

Note that the same ideas from hebbian learning still apply, such as encoding, propagation, training, decoding, testing...etc. But, connection weights are updated much differently.

## 4c. 
In this exercise, you will implement the STDP learning algorithm to train a network. STDP has many different flavors. For this exercise, we will use the learning rule defined in: https://dl.acm.org/doi/pdf/10.1609/aaai.v33i01.330110021. Pay special attention to Equations 2 and 3. 

Below is the class definition for STDP learning algorithm. Your task is to fill in the components so that the weights are updated in the right manner. 

In [9]:
class STDP():
    """Train a network using STDP learning rule"""
    def __init__(self, network, A_plus, A_minus, tau_plus, tau_minus, lr, snn_timesteps=20, epochs=30, w_min=0, w_max=1):
        """
        Args:
            network (SNN): network which needs to be trained
            A_plus (float): STDP hyperparameter
            A_minus (float): STDP hyperparameter
            tau_plus (float): STDP hyperparameter
            tau_minus (float): STDP hyperparameter
            lr (float): learning rate
            snn_timesteps (int): SNN simulation timesteps
            epochs (int): number of epochs to train with. Each epoch is defined as one pass over all training samples.  
            w_min (float): lower bound for the weights
            w_max (float): upper bound for the weights
        """
        self.network = network
        self.A_plus = A_plus
        self.A_minus = A_minus
        self.tau_plus = tau_plus
        self.tau_minus = tau_minus
        self.snn_timesteps = snn_timesteps
        self.lr = lr
        self.time = np.arange(0, self.snn_timesteps, 1)
        self.sliding_window = np.arange(-4, 4, 1) # Defines a sliding window for STDP operation. 
        self.epochs = epochs
        self.w_min = w_min
        self.w_max = w_max
    
    def update_weights(self, t, i):
        """
        Function to update the network weights using STDP learning rule.
        Args:
            t (int): time difference between postsynaptic spike and a presynaptic spike in a sliding window
            i(int): index of the presynaptic neuron
        
        Fill the details of STDP implementation.
        """
        # Compute delta_w for positive time difference.
        if t > 0:
            delta_w = self.A_plus * np.exp(-t / self.tau_plus)
        
        # Compute delta_w for negative time difference.
        else:
            delta_w = -self.A_minus * np.exp(-t / self.tau_minus)
        
        # Update the network weights if weight increment is negative.
        if delta_w < 0:
            w_old = self.network.input_2_output_connection.weights
            w_new = w_old + self.lr * delta_w * (w_old - self.w_min)
            self.network.input_2_output_connection.weights = w_new
        
        # Update the network weights if weight increment is positive.
        elif delta_w > 0:
            w_old = self.network.input_2_output_connection.weights
            w_new = w_old + self.lr * delta_w * (self.w_max - w_old)
            self.network.input_2_output_connection.weights = w_new
    
    def train_step(self, train_data_sample):
        """
        Function to train the network for one training sample using the update function defined above. 
        Args:
            train_data_sample (list): a sample from the training data
            
        This function is complete. You do not need to do anything here. 
        """
        input = train_data_sample[0]
        output = train_data_sample[1]
        for t in self.time:
            if output[t] == 1:
                for i in range(2):
                    for t1 in self.sliding_window:
                        if (0<= t + t1 < self.snn_timesteps) and (t1!=0) and (input[i][t+t1] == 1):
                            self.update_weights(t1, i)
    
    def train(self, training_data):
        """
        Function to train the network
        
        Args:
            training_data (list): training data
        
        This function is complete. You do not need to do anything here. 
        """
        for ee in range(self.epochs):
            for train_data_sample in training_data:
                self.train_step(train_data_sample)

Let's test the implementation

In [10]:
# Define a variable for input dimension.
input_dimension = 2

# Define a variable for output dimension.
output_dimension = 1

# Define a variable for voltage decay.
vdecay = 0.5

# Define a variable for voltage threshold.
vth = 0.5

# Define a variable for snn timesteps.
snn_timestep = 10

# Initialize randomly the weights from input to output. Numpy random rand function might be useful here.
input_2_output_weight = np.random.rand(output_dimension, input_dimension)

# Print the initial network weights.
print('\nInitial Network Weights:\n', input_2_output_weight)

# Initialize a SNN using the arguments defined above.
snn_4 = SNN(input_2_output_weight, input_dimension, output_dimension, vdecay, vth, snn_timestep)

# Get the training data for AND gate using the function defined in 2a.
print('\n***TRAINING***')
train_data = genANDTrainData(snn_timestep)

# Create an object of STDP class with appropriate arguments.
stdp = STDP(snn_4, A_plus=0.6, A_minus=0.3, tau_plus=8, tau_minus=4, 
            lr=0.25, snn_timesteps=10, epochs=10, w_min=0, w_max=1)

# Train the network using STDP.
stdp.train(train_data)

# Test the trained network and print the network output for all 4 cases.
print('\n***TESTING***')
test_data = []

zero = np.random.choice([1, 0], snn_timestep)
print('Zero:\n', zero)

one = np.random.choice([1, 0], snn_timestep)
print('One:\n', one)

case1 = np.array([zero, zero])
case2 = np.array([zero, one])
case3 = np.array([one, zero])
case4 = np.array([one, one])

# Case 1:
output1 = snn_4(case1)
print("Case 1:", output1)
if output1 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 2:
output2 = snn_4(case2)
print("Case 2:", output2)
if output2 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 3:
output3 = snn_4(case3)
print("Case 3:", output3)
if output3 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 4:
output4 = snn_4(case4)
print("Case 4:", output4)
if output4 > vth:
    test_data.append(1)
else:
    test_data.append(0)

print('Testing Data:\n', test_data)

# Print the final network weights.
print("\nFinal Network Weights:\n", snn_4.input_2_output_connection.weights)


Initial Network Weights:
 [[0.96377245 0.99858174]]

***TRAINING***
Zero:
 [1 0 0 0 1 1 0 0 0 0]
One:
 [0 1 1 1 0 0 1 0 1 0]
Training Data:
 [[array([[1, 0, 0, 0, 1, 1, 0, 0, 0, 0],
       [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]])
  array([1, 0, 0, 0, 1, 1, 0, 0, 0, 0])]
 [array([[1, 0, 0, 0, 1, 1, 0, 0, 0, 0],
       [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]])
  array([1, 0, 0, 0, 1, 1, 0, 0, 0, 0])]
 [array([[0, 1, 1, 1, 0, 0, 1, 0, 1, 0],
       [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]])
  array([1, 0, 0, 0, 1, 1, 0, 0, 0, 0])]
 [array([[0, 1, 1, 1, 0, 0, 1, 0, 1, 0],
       [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]])
  array([0, 1, 1, 1, 0, 0, 1, 0, 1, 0])]]

***TESTING***
Zero:
 [1 0 1 1 1 0 1 0 1 1]
One:
 [1 0 1 0 0 1 1 0 0 0]
Case 1: [0.7]
Case 2: [0.3]
Case 3: [0.3]
Case 4: [0.4]
Testing Data:
 [1, 0, 0, 0]

Final Network Weights:
 [[0.27544411 0.27544411]]


# Question 5: OR Gate
Can you train the network with the same architecture in Q2-4 for learning the OR gate. You will need to create another function called genORTrainData. Then create an SNN and train it using STDP. 

In [11]:
# Write your implementation of genORTrainData here. 
def genORTrainData(snn_timestep):
    """ 
    Function to generate the training data for AND.
        Args:
            snn_timestep (int): timesteps for SNN simulation
        Return:
            train_data (list): list of tuples where each tuple comprises of numpy arrays of rate-coded inputs and output
        
        Write the expressions for encoding 0 and 1. Then append all 4 cases of OR gate to the list train_data.
    """
    # Initialize an empty list for train data.
    train_data = []
    
    # Encode 0. Numpy random choice function might be useful here.
    zero = np.random.choice([1, 0], snn_timestep)
    print('Zero:\n', zero)
    
    # Encode 1. Numpy random choice function might be useful here.
    one = np.random.choice([1, 0], snn_timestep)
    print('One:\n', one)
    
    # Append all 4 cases of AND gate to train_data. Numpy stack operation might be useful here.
    sub_case1 = np.array([zero, zero])
    sub_case2 = np.array([zero, one])
    sub_case3 = np.array([one, zero])
    sub_case4 = np.array([one, one])
    
    case1 = np.array([sub_case1, zero]) # Change second param.
    case2 = np.array([sub_case2, one])
    case3 = np.array([sub_case3, one])
    case4 = np.array([sub_case4, one])
    
    train_data = np.stack((case1, case2, case3, case4))
    print('Training Data:\n', train_data)
    
    return train_data

In [12]:
# Train the network for OR gate here using the implementation from 4c.
# Define a variable for input dimension.
input_dimension = 2

# Define a variable for output dimension.
output_dimension = 1

# Define a variable for voltage decay.
vdecay = 0.5

# Define a variable for voltage threshold.
vth = 0.5

# Define a variable for snn timesteps.
snn_timestep = 10

# Initialize randomly the weights from input to output. Numpy random rand function might be useful here.
input_2_output_weight = np.random.rand(output_dimension, input_dimension)

# Print the initial network weights.
print('\nInitial Network Weights:\n', input_2_output_weight)

# Initialize a SNN using the arguments defined above.
snn_5 = SNN(input_2_output_weight, input_dimension, output_dimension, vdecay, vth, snn_timestep)

# Get the training data for AND gate using the function defined in 2a.
print('\n***TRAINING***')
train_data = genORTrainData(snn_timestep)

# Create an object of STDP class with appropriate arguments.
stdp = STDP(snn_5, A_plus=0.6, A_minus=0.3, tau_plus=8, tau_minus=4, 
            lr=0.25, snn_timesteps=10, epochs=10, w_min=0, w_max=1)

# Train the network using STDP.
stdp.train(train_data)

# Test the trained network and print the network output for all 4 cases.
print('\n***TESTING***')
test_data = []

zero = np.random.choice([1, 0], snn_timestep)
print('Zero:\n', zero)

one = np.random.choice([1, 0], snn_timestep)
print('One:\n', one)

case1 = np.array([zero, zero])
case2 = np.array([zero, one])
case3 = np.array([one, zero])
case4 = np.array([one, one])

# Case 1:
output1 = snn_5(case1)
print("Case 1:", output1)
if output1 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 2:
output2 = snn_5(case2)
print("Case 2:", output2)
if output2 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 3:
output3 = snn_5(case3)
print("Case 3:", output3)
if output3 > vth:
    test_data.append(1)
else:
    test_data.append(0)

# Case 4:
output4 = snn_5(case4)
print("Case 4:", output4)
if output4 > vth:
    test_data.append(1)
else:
    test_data.append(0)

print('Testing Data:\n', test_data)

# Print the final network weights.
print("\nFinal Network Weights:\n", snn_5.input_2_output_connection.weights)


Initial Network Weights:
 [[0.58827154 0.5858281 ]]

***TRAINING***
Zero:
 [0 0 1 1 1 0 1 1 0 0]
One:
 [1 1 0 0 0 1 1 1 0 1]
Training Data:
 [[array([[0, 0, 1, 1, 1, 0, 1, 1, 0, 0],
       [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]])
  array([0, 0, 1, 1, 1, 0, 1, 1, 0, 0])]
 [array([[0, 0, 1, 1, 1, 0, 1, 1, 0, 0],
       [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]])
  array([1, 1, 0, 0, 0, 1, 1, 1, 0, 1])]
 [array([[1, 1, 0, 0, 0, 1, 1, 1, 0, 1],
       [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]])
  array([1, 1, 0, 0, 0, 1, 1, 1, 0, 1])]
 [array([[1, 1, 0, 0, 0, 1, 1, 1, 0, 1],
       [1, 1, 0, 0, 0, 1, 1, 1, 0, 1]])
  array([1, 1, 0, 0, 0, 1, 1, 1, 0, 1])]]

***TESTING***
Zero:
 [0 1 1 1 0 0 0 0 1 0]
One:
 [1 0 1 0 1 0 0 1 1 1]
Case 1: [0.1]
Case 2: [0.]
Case 3: [0.1]
Case 4: [0.2]
Testing Data:
 [0, 0, 0, 0]

Final Network Weights:
 [[0.17177952 0.17177952]]
