# **Unsupervised Learning**

The coding questions are independent of the theoretical questions.

In this assignment, we will build the simplest models for unsupervised learning for a linear neuron:

$$y = \sum\limits_{i=1}^{n} w_i x_i. \tag{1}$$


Here $x_i$ are input neurons and $w_i$ are the associated synaptic weights. Let's start with the assumption that synaptic weights are increased in proportion to the correlation between pre-synaptic and post-synaptic activity (synchronized firing $\rightarrow$ stronger wiring):


$$\Delta w_i = \tau \langle y, x_i \rangle, \tag{2}$$

where $\Delta w_i$ is the change in the strength of synapse $w_i$ and the parameter $\tau > 0$ is a slow learning rate.



1) Suppose you have $m$ observations for each input $x_i$: $x_i^1, x_i^2, \cdots, x_i^m$. These observations occur on a faster timescale than learning. After all the observations have occurred, we want to evaluate the synaptic weights. Show that the weight vector $\mathbf{w}^T = (w_1, w_2, \ldots, w_n)$ is the solution to a differential equation of the form:

$$\gamma \frac{d \mathbf{w}}{d t}  = C \, \mathbf{w},$$
where $\gamma > 0$ and $C$ is the correlation matrix with entries $C_{i,j} = \sum\limits_{k=1}^{m} x_i^k x_j^k$.




## Answer here 
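For one observation, the input is $n$-dimensional and the output $y$ is a scalar:
$$y = \sum\limits_{i=1}^{n} w_i x_i $$
Rule (2) then gives, for weight $w_j$:
$$\Delta w_j = \tau \langle y, x_j \rangle = \tau y x_j = \tau \sum\limits_{i=1}^{n} w_i x_i x_j $$
Learning is slow, so the weights can be held fixed while the $m$ observations are summed:
$$\Delta w_j = \tau \sum\limits_{k=1}^{m} \sum\limits_{i=1}^{n} w_i x_i^k x_j^k = \tau \sum\limits_{i=1}^{n} C_{j,i} w_i = \tau (C \mathbf{w})_j $$
Reading $\Delta \mathbf{w}$ as the change over one unit of time gives $\gamma \frac{d \mathbf{w}}{d t} = C \, \mathbf{w}$ with $\gamma = 1/\tau > 0$.

The observations are independent, so the weights can be estimated by stochastic gradient ascent, using one randomly chosen observation per step.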
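
As a numerical sanity check of the claim in question 1, the sketch below sums the Hebbian updates of rule (2) over $m$ observations with the weights held fixed and compares the result with $\tau C \mathbf{w}$. The sizes, the random data, and the names `X`, `w`, and `delta_w` are illustrative assumptions, not part of the assignment data.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 50                     # hypothetical sizes: n inputs, m observations
X = rng.normal(size=(n, m))      # column k holds the observation x^k
w = rng.normal(size=n)           # arbitrary fixed weight vector
tau = 0.01

# Sum the per-observation updates tau * y^k * x^k with the weights held fixed.
delta_w = np.zeros(n)
for k in range(m):
    y_k = w @ X[:, k]                 # rule (1): linear output
    delta_w += tau * y_k * X[:, k]    # rule (2): Hebbian update

C = X @ X.T                      # C[i, j] = sum_k x_i^k x_j^k
print(np.allclose(delta_w, tau * (C @ w)))   # prints True
```

The two quantities agree up to floating-point error, which is the discrete counterpart of $\gamma \frac{d \mathbf{w}}{d t} = C \, \mathbf{w}$.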

2) The dataset "input.npy" contains presynaptic activity from two input neurons at random time points. Using rules (1) and (2), write a program that simulates the output $y$ and estimates the weight vector $\mathbf{w}$. You can select the points in the data randomly and thus estimate the weights by stochastic gradient ascent.

3) Run the program with 150 iterations (select 150 observations out of the 1000) and plot the weights and the data. What is the estimated vector? Do you understand why?
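
One possible shape for the loop in questions 2 and 3 is sketched here. Because `input.npy` is not available in this sketch, it uses synthetic, zero-mean, correlated data as a stand-in (an assumption), and the function name `hebbian_sgd` and its arguments are hypothetical. It is a minimal sketch, not the expected solution.

```python
import numpy as np

def hebbian_sgd(data, weights, lrate, n_iter, rng):
    """Apply rules (1) and (2) to randomly drawn observations.

    data has shape (n_inputs, n_samples); weights has shape (n_inputs,).
    Returns the final weights and the weights recorded after every step.
    """
    w = weights.copy()
    history = [w.copy()]
    for _ in range(n_iter):
        x = data[:, rng.integers(data.shape[1])]  # one random observation
        y = w @ x                                 # rule (1): linear output
        w = w + lrate * y * x                     # rule (2): Hebbian update
        history.append(w.copy())
    return w, np.array(history)

# Synthetic stand-in for input.npy (an assumption): two correlated inputs.
rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
data = rng.multivariate_normal([0.0, 0.0], cov, size=1000).T  # shape (2, 1000)

w0 = rng.random(2)
w, history = hebbian_sgd(data, w0, lrate=0.01, n_iter=150, rng=rng)
print("initial weights:", w0)
print("final weights:  ", w)
```

The array `history` holds one weight vector per step and can be drawn on top of a scatter plot of the data. With the real dataset, `data` would be replaced by the array loaded from `input.npy`.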



In [1]:
import numpy as np
import matplotlib.pyplot as plt
from google.colab import files

# Upload input.npy from the local drive
# presynaptic activity from two input neurons
uploaded = files.upload()

Saving input.npy to input.npy


In [15]:
inputdata = np.load('input.npy')
nb_samples = inputdata.shape[1]
# Set random seed for reproducibility
np.random.seed(1)
# 2 inputs, 1 output (initialize the weights)
weights = np.random.rand(2,1) 

lrate = 0.01 # This rate should work

#  CODE THE LOOP HERE AND PLOT RESULTS


In [None]:
# weights has shape (2, 1), so transpose it before taking the dot product
y = np.dot(weights.T, inputdata[:, 1])

In [12]:
inputdata[:,1]

array([0.50299801, 1.17842903])

4) Run the function with 300 and 500 iterations (same weight initialization and learning rate) and plot the results. Do you understand why the weights behave this way? Is that biologically plausible?
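
To explore question 4 numerically, a sketch along the following lines compares the norm of the weight vector, and its alignment with the leading eigenvector of the correlation matrix, after 150, 300, and 500 Hebbian steps. The data are again a synthetic stand-in for `input.npy` (an assumption), and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-in for input.npy (an assumption): two correlated inputs.
cov = [[1.0, 0.8], [0.8, 1.0]]
data = rng.multivariate_normal([0.0, 0.0], cov, size=1000).T  # shape (2, 1000)
w_init = rng.random(2)

# Leading eigenvector of C = X X^T (eigh returns eigenvalues in ascending order).
eigvals, eigvecs = np.linalg.eigh(data @ data.T)
top = eigvecs[:, -1]

norms = []
for n_iter in (150, 300, 500):
    step_rng = np.random.default_rng(2)   # same sequence of draws in every run
    w = w_init.copy()
    for _ in range(n_iter):
        x = data[:, step_rng.integers(data.shape[1])]
        w += 0.01 * (w @ x) * x           # rules (1) and (2)
    norm = np.linalg.norm(w)
    alignment = abs(w @ top) / norm       # |cos| of the angle to the top eigenvector
    norms.append(norm)
    print(n_iter, norm, alignment)
```

Each Hebbian step can only increase the norm of $\mathbf{w}$, since $\|\mathbf{w} + \tau y \mathbf{x}\|^2 = \|\mathbf{w}\|^2 + 2 \tau y^2 + \tau^2 y^2 \|\mathbf{x}\|^2$, so the printed norms grow with the number of iterations while the direction is expected to settle along the leading eigenvector.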


In [None]:
# Set random seed for reproducibility
np.random.seed(1)


# CODE and Answer here



From questions (1-4), you might have noticed that the purely linear model makes some weights very large while reducing the others to zero. This is sometimes not desirable. Think of a layer 4 neuron taking inputs from two LGN neurons (each associated with an input from the left or the right eye). If one of the weights becomes very large compared to the other, we would reach a biologically implausible ocular dominance. One way to solve this problem is to constrain the growth of each weight in the following way:

$$\Delta w_i = \tau (\langle y, x_i \rangle - \bar{w}_i y^2), \tag{3}$$

where $\bar{w}_i$ is the current value of the weight. This way, we make sure the weights remain bounded.
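
Rule (3) is commonly known as Oja's rule. A minimal sketch of it is given below, applied as an increment to the current weights and run on synthetic stand-in data (an assumption, since `input.npy` is not available here). The function name `oja_sgd` is hypothetical.

```python
import numpy as np

def oja_sgd(data, weights, lrate, n_iter, rng):
    """Stochastic updates following rule (3) on randomly drawn observations."""
    w = weights.copy()
    for _ in range(n_iter):
        x = data[:, rng.integers(data.shape[1])]  # one random observation
        y = w @ x                                 # rule (1): linear output
        w += lrate * (y * x - y**2 * w)           # rule (3): Hebbian term minus decay
    return w

# Synthetic stand-in for input.npy (an assumption): two correlated inputs.
rng = np.random.default_rng(1)
cov = [[1.0, 0.8], [0.8, 1.0]]
data = rng.multivariate_normal([0.0, 0.0], cov, size=1000).T  # shape (2, 1000)

w = oja_sgd(data, rng.random(2), lrate=0.01, n_iter=2000, rng=rng)

# Compare with the leading eigenvector of the correlation matrix.
eigvals, eigvecs = np.linalg.eigh(data @ data.T)
top = eigvecs[:, -1]
print("norm of w:", np.linalg.norm(w))
print("alignment with top eigenvector:", abs(w @ top) / np.linalg.norm(w))
```

On data like this, the norm is expected to hover near 1 instead of growing without bound, while the direction still follows the leading eigenvector.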


5) Update the program you wrote before to perform rule (3). Run the function with $2000$ iterations. Comment.

In [None]:
# Set random seed for reproducibility
np.random.seed(1)


# CODE and Answer here




Let's consider the case of multiple linear neurons, each following rule (1):

$$y_j = \sum\limits_{i=1}^{n} w_{ij} x_i. \tag{4}$$

One method to estimate the weights is by generalizing rule (3) in the following way:

$$\Delta w_{ij} = \tau (\langle y_j, x_i \rangle - y_j \sum\limits_{k \leq j}\bar{w}_{ik} \, y_k). \tag{5}$$

For the first output neuron, this rule is the same as rule (3). The weights of each subsequent neuron are updated on the residual that remains after subtracting the contributions of the preceding neurons.
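
Rule (5) is commonly known as Sanger's rule, or the generalized Hebbian algorithm. The sketch below writes it in matrix form, with `W[i, j]` standing for $w_{ij}$, and runs it on synthetic stand-in data (an assumption). The function name `sanger_sgd` and the matrix layout are illustrative choices.

```python
import numpy as np

def sanger_sgd(data, weights, lrate, n_iter, rng):
    """Stochastic updates following rule (5); weights has shape (n_inputs, n_outputs)."""
    W = weights.copy()
    for _ in range(n_iter):
        x = data[:, rng.integers(data.shape[1])]  # one random observation
        y = W.T @ x                               # rule (4): one output per column of W
        # np.triu keeps the entries y_k * y_j with k <= j, so that
        # (W @ triu)[i, j] = y_j * sum_{k <= j} W[i, k] * y_k, as in rule (5).
        W += lrate * (np.outer(x, y) - W @ np.triu(np.outer(y, y)))
    return W

# Synthetic stand-in for input.npy (an assumption): two correlated inputs.
rng = np.random.default_rng(1)
cov = [[1.0, 0.8], [0.8, 1.0]]
data = rng.multivariate_normal([0.0, 0.0], cov, size=1000).T  # shape (2, 1000)

W = sanger_sgd(data, rng.random((2, 2)), lrate=0.01, n_iter=20000, rng=rng)
print("weights (one column per output neuron):")
print(W)
print("W^T W (expected to be close to the identity):")
print(W.T @ W)
```

If the algorithm has converged, the columns of `W` are approximately orthonormal, with the first column along the leading eigenvector of the correlation matrix and the second along the other one.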

6) Update your program in order to take into account multiple neurons following rule (5). Run the program with two output neurons using the same data and plot the weights. 

(This algorithm might have trouble converging. Use $20000$ iterations and initialize the weights randomly like in the other questions)


In [None]:
# Set random seed for reproducibility
np.random.seed(1)


# CODE and Answer here


