<a href="https://colab.research.google.com/github/lucas-pinto/NUIN443/blob/main/problemSets/NUIN443_ps5_FA%2BGPs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Import Packages**

In [2]:
import numpy as np
import matplotlib.pyplot as plt

#Some helper functions for Gaussians
from numpy.random import normal, multivariate_normal

#scikit-learn factor analysis and PCA
from sklearn.decomposition import FactorAnalysis, PCA

## Factor Analysis

1) Write code to simulate data from the factor analysis model with the following parameters, using a 1-dimensional latent and 3-dimensional observations:

$W = \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}$
$\Psi = \begin{bmatrix} 10 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
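
For reference, here is one way to write the generative model these parameters plug into. This assumes the standard zero-mean factor analysis model with a unit-variance latent; the symbol $\epsilon$ for the observation noise is introduced here for illustration and may differ from the notation used in class.

$$
z \sim \mathcal{N}(0, 1), \qquad
\epsilon \sim \mathcal{N}(0, \Psi), \qquad
x = W z + \epsilon
\quad\Longrightarrow\quad
x \sim \mathcal{N}\!\left(0,\; W W^\top + \Psi\right)
$$

With the values above, the implied covariance of $x$ would be

$$
W W^\top + \Psi =
\begin{bmatrix} 1 & -1 & -2 \\ -1 & 1 & 2 \\ -2 & 2 & 4 \end{bmatrix}
+
\begin{bmatrix} 10 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix} 11 & -1 & -2 \\ -1 & 2 & 2 \\ -2 & 2 & 5 \end{bmatrix}
$$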

<br>

*Credit: This problem is inspired by a problem from Jonathan Pillow's Computational Neuroscience course at Princeton*

a) First, generate 2000 samples of 1-dimensional latent $z$

In [5]:
#Set the random seed so we get the same results every time
np.random.seed(1)

#Fill in below to generate samples of the latent



In [6]:
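#Change the code below if necessary, based on the variable name you gave to Z

# Sort samples of Z in order to make visualization of the latent more obvious later on
# Note: np.sort sorts along the last axis by default, so this assumes Z is 1-D;
# if Z is a column of shape (2000,1), use np.sort(Z, axis=0) instead
Z=np.sort(Z)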

b) From the latent, generate 3-dimensional samples $x$. <br> Also, create a version, $x_{noiseless}$, that doesn't include the observation noise ($\Psi$).
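
As a rough illustration of how sampling from a factor analysis model can be organized, here is a minimal sketch. All names ending in `_demo`, the seed, and the use of `np.random.default_rng` are assumptions made for this example and are not part of the assignment's required code; it also skips the sorting step used in this notebook.

```python
import numpy as np

# Hypothetical generator and seed, independent of the notebook's np.random.seed(1)
rng = np.random.default_rng(0)

n_samples = 2000
W_demo = np.array([[-1.0], [1.0], [2.0]])     # loadings, shape (3, 1)
psi_diag_demo = np.array([10.0, 1.0, 1.0])    # diagonal of Psi (noise variances)

# Latent: standard normal, one value per sample, shape (2000, 1)
z_demo = rng.standard_normal((n_samples, 1))

# Shared signal only: every row is a multiple of W, shape (2000, 3)
x_noiseless_demo = z_demo @ W_demo.T

# Independent noise per dimension, scaled by the standard deviation sqrt(Psi_ii)
noise_demo = rng.standard_normal((n_samples, 3)) * np.sqrt(psi_diag_demo)
x_demo = x_noiseless_demo + noise_demo

# The empirical covariance should be close to W W^T + Psi
print(np.round(np.cov(x_demo, rowvar=False), 1))
```

Because $\Psi$ is diagonal, scaling independent standard normal draws by $\sqrt{\Psi_{ii}}$ is equivalent to sampling the noise with `multivariate_normal`.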

c) Make a scatter plot showing the first two dimensions of $x$ samples.
<br> Overlay the same for $x_{noiseless}$, to get an intuition for how the shared signal and noise differ.
<br> Make the x and y axes have equal limits

In [None]:
#Fill in below to generate a scatter plot of the first two dimensions of x samples
plt.scatter()

#Fill in below to generate a scatter plot of the first two dimensions of x_noiseless samples
plt.scatter()

plt.ylim([-12,12])
plt.xlim([-12,12])

d) To connect this more clearly to neuroscience, let's assume these samples were recorded at 2000 consecutive time points. Plot the latent and the first two dimensions of $x$ (the 'activity of two neurons') by filling in the code below.

In [None]:
#Fill in below to plot latent
plt.subplot(3,1,1)
plt.plot()

#Fill in below to plot first dimension of x
plt.subplot(3,1,2)
plt.plot()

#Fill in below to plot second dimension of x
plt.subplot(3,1,3)
plt.plot()

e) Fit a factor analysis model with 1 latent to the data using scikit-learn, and get the fitted latent (we'll plot this latent later). Note that the FactorAnalysis class has already been imported.

f) Print the model's loadings (components_) and noise (noise_variance_) to check that they're approximately the same as the model you generated the data from.

g) Fit a PCA model to the data, again using scikit-learn. Print the loadings (components_) to see how they differ from those of factor analysis.
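
For orientation, the sketch below shows the general shape of the scikit-learn calls involved, run on a small toy dataset. The toy loadings, noise variances, seed, and every variable name ending in `_toy` are assumptions for illustration only and are deliberately different from the problem's parameters.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis, PCA

rng = np.random.default_rng(0)  # hypothetical seed

# Toy data: 1 latent, 3 observed dimensions (not the problem's W and Psi)
W_toy = np.array([[2.0], [1.0], [-1.0]])
z_toy = rng.standard_normal((5000, 1))
noise_toy = rng.standard_normal((5000, 3)) * np.sqrt([0.5, 3.0, 1.0])
X_toy = z_toy @ W_toy.T + noise_toy

# Factor analysis: fit_transform returns the estimated latent for each sample
fa_toy = FactorAnalysis(n_components=1)
z_fa_toy = fa_toy.fit_transform(X_toy)
print(fa_toy.components_)      # loadings, shape (n_components, n_features)
print(fa_toy.noise_variance_)  # per-dimension noise variance, shape (n_features,)

# PCA: components_ holds unit-norm principal axes rather than scaled loadings
pca_toy = PCA(n_components=1)
z_pca_toy = pca_toy.fit_transform(X_toy)
print(pca_toy.components_)
```

Both models are only identified up to a sign flip, so the printed loadings may come out as the negative of the generating ones.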

h) Replot the scatter plot from (c). We will now overlay a line with the FA loading axis and a line with the PCA loading axis (just the first two dimensions of those).

In [None]:
#Fill in below to generate a scatter plot of the first two dimensions of x samples
plt.scatter()

#Fill in below to generate a scatter plot of the first two dimensions of x_noiseless samples
plt.scatter()

#This will plot the axes of the PCA and FA components_ (loading axes)
plt.plot([-10*pca.components_[0][0],10*pca.components_[0][0]],[-10*pca.components_[0][1],10*pca.components_[0][1]],'r--',linewidth=3)
plt.plot([-10*fa.components_[0][0],10*fa.components_[0][0]],[-10*fa.components_[0][1],10*fa.components_[0][1]],'k--',linewidth=3)

plt.ylim([-12,12])
plt.xlim([-12,12])

#This will add a legend
plt.legend(['X','X_noiseless','PCA axis','FA axis'])

i) Explain in words, in the cell below, why the loadings for PCA and FA are different

j) Plot the ground truth latent, and the recovered latent via FA, overlaid

In [None]:
#Fill in below to plot the latent recovered from factor analysis
plt.plot()

#Fill in below to plot the ground truth latent
plt.plot()

k) Plot the ground truth latent and the recovered latent via PCA, overlaid

In [None]:
#Fill in below to plot the latent recovered from PCA
#Note that, by chance, the PCA axes in this example are flipped relative to the ground truth
#(this is a model degeneracy, where both the latents and axes can be flipped and we get an equal result)
#Therefore, actually plot the negative of the recovered latent from PCA below for a better match to the ground truth
plt.plot()

#Fill in below to plot the ground truth latent
plt.plot()

l) Explain in words, in the cell below, why PCA's estimate of the latent is so inaccurate relative to FA's. In particular, consider which dimensions are being most heavily utilized to estimate the latent.

m) While you were able to use scikit-learn for factor analysis rather than fully implementing EM, let's do just the "E" step here using the final model parameters found with scikit-learn (fa.noise_variance_ and fa.components_). That is, find the posterior p(z|x), using the equation shown in class. You only need to compute the mean, not the variance.<br>

Then, create a scatter plot of the latent estimated manually, versus the latent directly output from sci-kit learn. If the above calculations were correct, this should be a diagonal line.
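
As a reminder, one common way to write the posterior mean for the factor analysis model is shown below. The equation from class may be written in a different but equivalent form, so treat this as a sketch under the assumption of a standard normal prior on $z$, with $W$ the loading matrix (observed dimensions by latents), $\Psi$ the diagonal noise covariance, and $\mu$ the data mean.

$$
\mathbb{E}[z \mid x]
= \left(I + W^\top \Psi^{-1} W\right)^{-1} W^\top \Psi^{-1} (x - \mu)
= W^\top \left(W W^\top + \Psi\right)^{-1} (x - \mu)
$$

In scikit-learn's fitted model, fa.components_ stores $W^\top$ (shape 1 by 3 here), fa.noise_variance_ stores the diagonal of $\Psi$, and fa.mean_ stores $\mu$.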

In [None]:
#Fill in code below to find p(z|x)


#Fill in code below to create a scatter plot of the latent estimated above, versus the latent output from sci-kit learn (from part e)
plt.scatter()

n) The example above used a deliberately extreme noise model to show how factor analysis and PCA differ; for most neural datasets, the difference is smaller. <br><br> Re-run the FA vs. PCA comparison on data generated with the following, less extreme, noise model:
$\Psi = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
<br>
Just output the same plot as in (h)

By re-run, I mean copy your code from (a), (b), (e), (g), and (h) into the cell below and change $\Psi$ there, so you don't overwrite your previous results.

## 2) Gaussian Processes

I have provided a function for a radial basis function kernel below

In [None]:
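#In the function below, x is a 2-D row vector of datapoints with shape (1,N), and L (the lengthscale) is a scalar
#(x must be 2-D so that x-x.T broadcasts to an N x N matrix of pairwise differences)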

def cov_RBF(x,L):
  cov_rbf = np.exp(-(x-x.T)**2/(2*L**2))
  return cov_rbf
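
To make the shape requirement concrete, here is a small check on three made-up datapoints. The kernel is repeated so the snippet runs on its own; the names `x_row`, `K_demo`, and `K_flat` are hypothetical.

```python
import numpy as np

def cov_RBF(x, L):
    # Same kernel as above, repeated so this snippet is self-contained
    return np.exp(-(x - x.T)**2 / (2 * L**2))

# Row vector of shape (1, 3): x - x.T broadcasts to all pairwise differences
x_row = np.array([0.0, 1.0, 3.0])[None, :]
K_demo = cov_RBF(x_row, 1.0)
print(K_demo.shape)  # (3, 3)

# 1-D input: .T does nothing, so x - x.T is all zeros and no matrix is formed
K_flat = cov_RBF(np.array([0.0, 1.0, 3.0]), 1.0)
print(K_flat)  # [1. 1. 1.]
```

The diagonal of `K_demo` is 1 (each point is perfectly correlated with itself), and entries shrink toward 0 as the distance between points grows relative to `L`.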

a) Let's say our datapoints have values 1,2,...,399,400 (e.g. these are the values of timepoints).  Plot (imshow) the RBF covariance for those values, for lengthscale=100 and lengthscale=1. Fill in the missing code below to do so.

In [None]:
#Datapoints X (np.arange excludes the stop value, so 401 is needed to get 1,...,400)
X = np.arange(1,401)[None,:]

#Fill in below to generate the RBF kernel for lengthscale=100
scale1=
K_rbf1 = cov_RBF()

#Plot the image
plt.figure()
plt.imshow(K_rbf1,clim=[0,1])
plt.title('Lengthscale 100')
plt.colorbar()

#Fill in below to generate the RBF kernel for lengthscale=1
scale2=
K_rbf2 = cov_RBF()

#Plot the image
plt.figure()
plt.imshow(K_rbf2,clim=[0,1])
plt.title('Lengthscale 1')
plt.colorbar()

b) For each of the RBF covariance matrices above, draw a sample from the multivariate normal distribution with mean 0 and that covariance, and plot both samples. Note that each sample will contain 400 datapoints. As a hint, the mean that you pass to the multivariate_normal function needs to be an array of 400 zeros.
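
The following is a minimal sketch of drawing GP samples on a smaller grid of 50 points with two arbitrary lengthscales. The helper `rbf_demo`, the seed, the grid size, and the small diagonal "jitter" term are all assumptions for this example; the jitter is a common numerical safeguard for nearly singular covariances and is not required by the problem.

```python
import numpy as np

rng = np.random.default_rng(0)  # hypothetical seed

def rbf_demo(x, L):
    # RBF kernel on a row vector x of shape (1, N)
    return np.exp(-(x - x.T)**2 / (2 * L**2))

t_demo = np.arange(1, 51, dtype=float)[None, :]   # 50 time points, shape (1, 50)
n_demo = t_demo.shape[1]
mu_demo = np.zeros(n_demo)                        # zero mean, one entry per point
jitter = 1e-8 * np.eye(n_demo)                    # assumed stabilizer, not in the problem

# One draw from N(0, K) for each lengthscale; each draw has shape (50,)
sample_long = rng.multivariate_normal(mu_demo, rbf_demo(t_demo, 20.0) + jitter)
sample_short = rng.multivariate_normal(mu_demo, rbf_demo(t_demo, 1.0) + jitter)

# Average size of the step between neighboring points in each draw
print(np.mean(np.abs(np.diff(sample_long))))
print(np.mean(np.abs(np.diff(sample_short))))
```

NumPy may emit a warning that the covariance is not positive semidefinite when the kernel is close to singular; the sample is still returned in that case.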

In [None]:
#Fill in below to generate mean of distributions
mu=

#Fill in below to generate samples from the above mean and the covariance of an RBF kernel with lengthscale 100 (from above)
sample1 = np.random.multivariate_normal(mean=, cov=)
#Fill in below to generate samples from the above mean and the covariance of an RBF kernel with lengthscale 1 (from above)
sample2 = np.random.multivariate_normal(mean=, cov=)

#Plot the samples
plt.figure()
plt.plot(sample1)
plt.title('GP Sample for lengthscale=100')

plt.figure()
plt.plot(sample2)
plt.title('GP Sample for lengthscale=1')

c) Explain in words how the differences between the covariance functions in (a) (for the two lengthscales) account for the differences between the samples in (b).