<img src="../vs265header.svg"/>
<h1 align="center"> Lab 4 - Sparse, Distributed Representation <font color="red"> [SOLUTIONS] </font> </h1>

## Sparse Coding of Binary Patterns

In this problem you will implement Foldiak's model for learning features of binary data. Most of the code has been written for you; you only have to add the interesting parts: the dynamics, the learning rules, and the statistics calculations. We recommend that you read the Foldiak 1990 paper, <i>Forming Sparse Representations by Local Anti-Hebbian Learning</i>, before moving forward with this homework.

In [1]:
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import utils.plotFunctions as pf

First, we define a function that generates our dataset, which is the same as the one shown in Figure 2 of the Foldiak paper.

In [2]:
def subtractMeanPerImage(data):
    return np.stack([data[:, idx] - np.mean(data[:, idx]) for idx in range(data.shape[1])], axis=1)

def genLines(numInputs, numDatapoints, probabilityOn):
    """
    Generate random dataset of images containing lines. Each image has a mean value of 0.
    Inputs:
        numInputs [int] number of pixels for each image, must have integer sqrt()
        numDatapoints [int] number of images to generate
        probabilityOn [float] probability of a line (row or column of 1 pixels) appearing in the image,
            must be between 0.0 (all zeros) and 1.0 (all ones)
    Outputs:
        outImages [np.ndarray] batch of images with shape
            (numInputs, numDatapoints)
    """
    assert numInputs%np.sqrt(numInputs) == 0, (
      "numInputs must have an integer square root")
    assert 0.0 <= probabilityOn <= 1.0, (
      "probabilityOn must be between 0.0 and 1.0")
        
    # Each image is a square, rasterized into a vector
    outImages = np.zeros((numInputs, numDatapoints))
    numEdgePixels = int(np.sqrt(numInputs))
    for batchIdx in range(numDatapoints):
        outImage = np.zeros((numEdgePixels, numEdgePixels))
        # Each row and each column independently contains a line with probability probabilityOn
        rowIdx = np.where(np.random.uniform(low=0, high=1, size=numEdgePixels) < probabilityOn)[0]
        colIdx = np.where(np.random.uniform(low=0, high=1, size=numEdgePixels) < probabilityOn)[0]
        # Check how many indices were selected, not their values: np.any(rowIdx)
        # is False when the only selected index is 0, which would drop that line
        if rowIdx.size > 0:
            outImage[rowIdx, :] = 1
        if colIdx.size > 0:
            outImage[:, colIdx] = 1
        outImages[:, batchIdx] = outImage.reshape((numInputs))
    return subtractMeanPerImage(outImages)

In [3]:
# Dataset generation
numDatapoints = 100 # Number of images in a batch
numInputs = 64 # Number of pixels in an image (should have integer sqrt)

# probabilityOn is used to determine, on average, what portion of the dataset
# has lines. It is also used to set the target output firing rate
# (num active / total num outputs) for the Foldiak model.
probabilityOn = 0.1

np.random.seed(123456)

# Learning
eta = 0.001 # learning rate for Hebbian learning
numTrials = 1000 # number of learning steps to take - for both Hebbian and Foldiak learning

# Network architecture
numOutputs = 16 # number of neurons in the network layer - for both Hebbian and Foldiak models

Now we use the genLines function to construct a binary line dataset.

In [4]:
dataset = genLines(numInputs, numDatapoints, probabilityOn)
pf.plotDataTiled(dataset, title="Dataset")

Our overall goal is to build a model that can learn the structure of this data. We know (because we made the data) that the basic building blocks of the data are a set of 16 lines, which can be combined and thresholded to make any of the images we see in our sample dataset.
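
As a rough illustration of this claim, the sketch below builds the 16 line templates for an 8x8 image and combines two of them with a threshold. The helpers `makeLineBasis` and `combineLines` are hypothetical names introduced here, not part of the lab code, and the images are left un-centered for readability.

```python
import numpy as np

def makeLineBasis(numEdgePixels):
    """Return an array of shape (numEdgePixels**2, 2*numEdgePixels) whose
    columns are rasterized images of single horizontal or vertical lines."""
    basis = []
    for idx in range(numEdgePixels):
        rowImage = np.zeros((numEdgePixels, numEdgePixels))
        rowImage[idx, :] = 1  # horizontal line
        basis.append(rowImage.reshape(-1))
    for idx in range(numEdgePixels):
        colImage = np.zeros((numEdgePixels, numEdgePixels))
        colImage[:, idx] = 1  # vertical line
        basis.append(colImage.reshape(-1))
    return np.stack(basis, axis=1)

def combineLines(basis, activeLines):
    """Sum the selected lines, then threshold so overlapping pixels stay at 1."""
    summed = basis[:, activeLines].sum(axis=1)
    return (summed > 0).astype(float)

lineBasis = makeLineBasis(8)
exampleImage = combineLines(lineBasis, [2, 11])  # row 2 and column 3
```

A plain sum would give 2 at the pixel where row 2 and column 3 cross. The threshold keeps the image binary, which is why the components combine nonlinearly.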

In [5]:
def imprintWeights(weights, dataset):
    """
    Imprint weights with samples from the dataset
    Inputs:
        weights [np.ndarray] of shape (numInputs, numOutputs)
        dataset [np.ndarray] batch of images, each of size
            (numInputs, numDataPoints)
    """
    for neuronIndex in range(weights.shape[1]):
        # Choose random datapoint
        image = dataset[:, np.random.randint(dataset.shape[1])]
        weights[:, neuronIndex] = image
    return weights

In [6]:
# Initialize weights
weights = imprintWeights(np.random.randn(numInputs, numOutputs), dataset)
pf.plotDataTiled(weights, title="Weights Initialized with Imprinting")

If the data were linearly separable, then we should be able to use Oja's rule or Sanger's rule to separate the individual components that combine to form the images in the dataset.
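
For reference, here is a minimal sketch of Oja's rule for a single linear neuron, $\Delta w = \eta \, y \, (x - y w)$ with $y = w^T x$, applied to toy two-dimensional data. The function `ojaUpdate` and the toy dataset are assumptions made for illustration; they are not part of the lab code.

```python
import numpy as np

def ojaUpdate(weight, data, learningRate):
    """One batch step of Oja's rule for a single linear neuron.
    weight: (numInputs,), data: (numInputs, numDatapoints)."""
    output = weight @ data  # y = w^T x for every datapoint
    # Hebbian term <y x> minus the decay term <y^2> w, averaged over the batch
    dw = (data @ output - np.sum(output**2) * weight) / data.shape[1]
    return weight + learningRate * dw

rng = np.random.default_rng(0)
# Toy zero-mean data (an assumption for illustration) with most variance along [1, 1]
latent = rng.normal(size=(2, 2000)) * np.array([[3.0], [0.5]])
rotation = np.array([[1.0, -1.0], [1.0, 1.0]]) / np.sqrt(2)
toyData = rotation @ latent

ojaWeight = rng.normal(size=2) * 0.1
for step in range(500):
    ojaWeight = ojaUpdate(ojaWeight, toyData, learningRate=0.05)
```

Under these assumptions the weight vector should settle near unit length along the direction of largest variance. Sanger's rule extends the same idea to several neurons by subtracting the reconstruction of the earlier neurons before each Hebbian update.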

In [7]:
def sangerLearn(dataset, weights, learningRate):
    
    output = weights.T @ dataset # compute neuron output for all data
    numOutputs = output.shape[0]
    
    residual = dataset
    dw = np.zeros(weights.shape)
    
    for i in range(numOutputs):
        residual = residual - (weights[:, i, None] @ output[None, i, :])
        dw[:, i] = residual @ output[i, :].T

    weights += learningRate * dw # update weight vector by dw
    
    return weights

In [8]:
for trial in range(numTrials):
    dataset = genLines(numInputs, numDatapoints, probabilityOn)
    weights = sangerLearn(dataset, weights, eta / numDatapoints) # Divide learning rate by batch size
pf.plotDataTiled(weights, title="Weights learned with Sanger's rule")

These weights resemble the first few principal components of the dataset and fail to capture the individual lines that make up the data. Next, we will try to learn the data structure using Foldiak's network.
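
One way to check the comparison with principal components is to compute them directly with an SVD. The sketch below is a rough illustration: `sampleLineImages` is a hypothetical, vectorized stand-in for `genLines`, and the SVD is applied to the data without subtracting a per-pixel mean, matching how `sangerLearn` uses the data.

```python
import numpy as np

def sampleLineImages(numEdgePixels, numDatapoints, probabilityOn, rng):
    """Hypothetical stand-in for genLines: zero-mean images of random lines,
    returned with shape (numEdgePixels**2, numDatapoints)."""
    rows = rng.uniform(size=(numDatapoints, numEdgePixels, 1)) < probabilityOn
    cols = rng.uniform(size=(numDatapoints, 1, numEdgePixels)) < probabilityOn
    images = (rows | cols).astype(float).reshape(numDatapoints, -1).T
    return images - images.mean(axis=0, keepdims=True)  # subtract each image's mean

rng = np.random.default_rng(0)
pcaData = sampleLineImages(8, 5000, 0.1, rng)

# Columns of U are the principal directions, ordered by decreasing singular value
U, singularValues, _ = np.linalg.svd(pcaData, full_matrices=False)
principalComponents = U[:, :16]
```

The columns of `principalComponents` are orthogonal by construction, whereas a horizontal and a vertical line always share a pixel. That is one way to see why these components tend to mix several lines rather than isolate single ones.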

In [13]:
# Foldiak learning rates
alpha = 0.1
beta = 0.05
gamma = 0.1
averageEta = 0.1

In [14]:
def sigmoid(data):
    return 1.0 / (1.0 + np.exp(-data))

In [15]:
def foldiakSparsify(dataset, forwardWeights, lateralWeights, thresholds, numTrials):
    (numInputs, numDatapoints) = dataset.shape
    numOutputs = forwardWeights.shape[1]
    numEdgePixels = np.sqrt(numInputs)
    
    y = np.zeros((numOutputs, numDatapoints))
    
    # Activations are computed by projecting the dataset onto the feedforward weights
    activations = forwardWeights.T @ dataset
    
    # Integrate the output dynamics with Euler steps; the global eta is used as the step size
    for trial in range(numTrials):
        dy = sigmoid(activations + lateralWeights @ y - np.expand_dims(thresholds, 1))
        y += eta * dy
        
    output = np.zeros((numOutputs, numDatapoints))
    output[np.where(y>0.5)] = 1
    
    return output
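
For comparison, Foldiak (1990) writes the output dynamics with a decay term, $dy_i/dt = f(\sum_j q_{ij} x_j + \sum_j w_{ij} y_j - t_i) - y_i$, whereas the loop above accumulates the sigmoid output without it. The sketch below integrates the form with the decay term on a two-unit toy network. The function `settleOutputs` and all toy values are assumptions for illustration, not the lab's reference solution.

```python
import numpy as np

def settleOutputs(drive, lateralWeights, thresholds, stepSize, numSteps):
    """Euler-integrate dy/dt = f(drive + W y - t) - y, then threshold at 0.5.
    drive: (numOutputs, numDatapoints) feedforward input."""
    y = np.zeros_like(drive)
    for _ in range(numSteps):
        total = drive + lateralWeights @ y - thresholds[:, None]
        y += stepSize * (1.0 / (1.0 + np.exp(-total)) - y)  # decay term -y
    return (y > 0.5).astype(float)

# Two units receiving the same pattern; unit 0 has the stronger drive
toyDrive = np.array([[4.0], [2.0]])
toyLateral = np.array([[0.0, -6.0], [-6.0, 0.0]])  # mutual inhibition
toyThresholds = np.array([1.0, 1.0])
toyOutput = settleOutputs(toyDrive, toyLateral, toyThresholds, stepSize=0.1, numSteps=500)
```

With mutual inhibition, the unit with the stronger drive stays active and suppresses the other. This competition is what pushes different units toward representing different lines.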

In [16]:
def foldiakLearn(dataset, activity, forwardWeights, lateralWeights, thresholds, alpha, beta, gamma):
    numDataPoints = dataset.shape[1]
    
    # Compute mean and correlation statistics (avg across datapoints)
    avgActivity = np.mean(activity, axis=1)
    corrOutOut = activity @ activity.T / numDataPoints
    corrOutIn = dataset @ activity.T / numDataPoints
    
    # Update lateral weights (w in Foldiak paper)
    dw = -alpha * (corrOutOut - probabilityOn**2)
    lateralWeights += dw
    lateralWeights -= np.diag(np.diag(lateralWeights)) # We do not want the units to inhibit themselves
    lateralWeights[np.where(lateralWeights>0)] = 0
    
    # Update feedforward weights (q in Foldiak paper)
    dq = beta * (corrOutIn - forwardWeights @ np.diag(avgActivity))
    forwardWeights += dq
    
    # Update thresholds (t in Foldiak paper)
    dthresh = gamma * (avgActivity - probabilityOn)
    thresholds += dthresh
    
    return (forwardWeights, lateralWeights, thresholds)
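
The feedforward update above is a batch-averaged form of the per-pattern rule in the Foldiak paper, $\Delta q_{ij} = \beta \, y_i (x_j - q_{ij})$. The sketch below checks on random toy data that the matrix expression matches an explicit loop over patterns. The array names and sizes are assumptions chosen for illustration, and the factor `beta` is left out of both sides.

```python
import numpy as np

rng = np.random.default_rng(0)
numIn, numOut, numData = 6, 3, 50  # toy sizes, chosen only for illustration
x = rng.normal(size=(numIn, numData))
y = (rng.uniform(size=(numOut, numData)) < 0.3).astype(float)  # binary outputs
q = rng.normal(size=(numIn, numOut))  # same layout as forwardWeights

# Per-pattern rule averaged over patterns: dq[j, i] = <y_i * (x_j - q[j, i])>
dqLoop = np.zeros_like(q)
for n in range(numData):
    dqLoop += (x[:, n, None] - q) * y[None, :, n]
dqLoop /= numData

# Batch form used in foldiakLearn (without the beta factor)
dqBatch = x @ y.T / numData - q @ np.diag(y.mean(axis=1))
```

The two results agree because averaging $y_i x_j$ gives the output-input correlation, while averaging $y_i q_{ij}$ scales each weight column by that unit's mean activity.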

In [17]:
# Foldiak data structures
forwardWeights = np.random.randn(numInputs, numOutputs)
lateralWeights = np.zeros((numOutputs, numOutputs))
thresholds = np.ones((numOutputs))
movingAverageActivity = probabilityOn
movingAverageCorrelation = probabilityOn**2

# Learn Foldiak model
for trial in range(numTrials):
    dataset = genLines(numInputs, numDatapoints, probabilityOn)
    
    activity = foldiakSparsify(dataset, forwardWeights, lateralWeights, thresholds, numTrials)
    
    (forwardWeights, lateralWeights, thresholds) = foldiakLearn(dataset, activity, forwardWeights, lateralWeights, thresholds, alpha, beta, gamma)

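    # Exponential moving averages: move each estimate a fraction averageEta toward the new batch value
    movingAverageActivity += averageEta * (np.mean(activity, axis=1) - movingAverageActivity)
    movingAverageCorrelation += averageEta * (activity @ activity.T / numDatapoints - movingAverageCorrelation)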
    
pf.plotDataTiled(forwardWeights, "Feedforward Weights")
