First install the repo and requirements.

In [None]:
%pip --quiet install git+https://github.com/wilson-labs/cola.git



# CoLA Library Exercise

This exercise is designed to help you get familiar with the CoLA linear algebra library. 

## Installation

Make sure you have a Python 3.10+ environment with either JAX or PyTorch installed. You can install CoLA using pip:



In [None]:

%pip install git+https://github.com/wilson-labs/cola.git


Alternatively, you can open the documentation in Colab and start working from there: [Quick Start](https://colab.research.google.com/github/wilson-labs/cola/blob/master/docs/notebooks/colabs/Quick_Start.ipynb)

Have a read through the [documentation](https://cola.readthedocs.io/en/latest/index.html) to understand the library better.

## Basic Exercises

1. Create a LinearOperator using the `ops.Diagonal` and `ops.Dense` classes. Perform basic operations like addition, subtraction, and multiplication on these operators.

2. 

3. 

4. 



## Large Scale Machine Learning with CoLA

Using JAX or PyTorch, pick any 3 out of the 5:

### 1. GP

Implement Gaussian Process (GP) inference with a Radial Basis Function (RBF) kernel from scratch, using `inverse()`, on a dataset with at least 10k observations. You are not allowed to use gpytorch. The formula for the GP posterior is:

$$f_* | X, y, X_* \sim \mathcal{N}(\mu_*, \Sigma_*)$$

where:

$$\mu_* = K(X_*, X)[K(X, X) + \sigma^2_n I]^{-1}y$$

$$\Sigma_* = K(X_*, X_*) - K(X_*, X)[K(X, X) + \sigma^2_n I]^{-1}K(X, X_*)$$

Here, $K$ is the RBF kernel, $X$ are the training inputs, $y$ are the training targets, $X_*$ are the test inputs, and $\sigma^2_n$ is the noise variance.
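
The posterior formulas above can be checked on a small problem before scaling up. The following is a minimal NumPy sketch, not a CoLA solution: the helper names (`rbf_kernel`, `gp_posterior`), the lengthscale, and the toy data are assumptions made for illustration, and dense `np.linalg.solve` calls stand in for the `inverse()`-based solve the exercise asks for.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """RBF kernel matrix between the rows of A (n, d) and B (m, d)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return variance * np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)

def gp_posterior(X, y, X_star, noise_var=0.1):
    """Posterior mean and covariance of f_* given training data (X, y)."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))  # K(X, X) + sigma_n^2 I
    K_s = rbf_kernel(X_star, X)                        # K(X_*, X)
    K_ss = rbf_kernel(X_star, X_star)                  # K(X_*, X_*)
    # Solve linear systems rather than forming the inverse explicitly.
    alpha = np.linalg.solve(K, y)
    V = np.linalg.solve(K, K_s.T)
    mu = K_s @ alpha
    Sigma = K_ss - K_s @ V
    return mu, Sigma

# Toy data (hypothetical): noisy samples of sin(x) on [-3, 3].
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
X_star = np.linspace(-3, 3, 50)[:, None]
mu, Sigma = gp_posterior(X, y, X_star, noise_var=0.01)
```

With 10k observations the two dense solves dominate the cost, which is the part the exercise asks you to replace with CoLA.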



### 2. Hessian Spectrum
Compute the eigenspectrum of the Hessian of a pretrained neural network. You can download weights of image classifiers pretrained on CIFAR10. Use `cola.eigs` or `cola.algorithms.lanczos` and the spectral KDE smoothing method from [this paper](https://arxiv.org/pdf/1901.10159.pdf) to get a smoothed spectrum estimate.
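
One way to picture the smoothed spectrum estimate is stochastic Lanczos quadrature followed by Gaussian kernel smoothing. The sketch below is written in plain NumPy in the spirit of the linked paper; it is not CoLA's implementation. The function names, the bandwidth, the number of probes, and the small explicit matrix `H` (a stand-in for a Hessian that would normally be accessed only through Hessian-vector products) are all assumptions.

```python
import numpy as np

def lanczos(matvec, v0, num_steps):
    """Lanczos tridiagonalization with full reorthogonalization."""
    n = v0.shape[0]
    Q = np.zeros((num_steps, n))  # rows are the Lanczos vectors
    alphas, betas = [], []
    q = v0 / np.linalg.norm(v0)
    for j in range(num_steps):
        Q[j] = q
        w = matvec(q)
        alphas.append(q @ w)
        # Remove the components along all Lanczos vectors found so far.
        w = w - Q[: j + 1].T @ (Q[: j + 1] @ w)
        beta = np.linalg.norm(w)
        if j == num_steps - 1 or beta < 1e-10:
            break
        betas.append(beta)
        q = w / beta
    return np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)

def spectral_density(matvec, n, grid, num_steps=30, num_probes=10,
                     bandwidth=0.05, seed=0):
    """Average Gaussian-smoothed Lanczos quadrature over random probes."""
    rng = np.random.default_rng(seed)
    density = np.zeros_like(grid)
    for _ in range(num_probes):
        T = lanczos(matvec, rng.standard_normal(n), num_steps)
        nodes, vecs = np.linalg.eigh(T)
        weights = vecs[0] ** 2  # quadrature weights; they sum to 1
        # Place a Gaussian bump of width `bandwidth` at each Ritz value.
        bumps = np.exp(-0.5 * ((grid[:, None] - nodes[None, :]) / bandwidth) ** 2)
        bumps /= bandwidth * np.sqrt(2.0 * np.pi)
        density += bumps @ weights
    return density / num_probes

# Stand-in "Hessian" (hypothetical): bulk eigenvalues in [0, 1], one outlier at 5.
rng = np.random.default_rng(0)
n = 50
basis, _ = np.linalg.qr(rng.standard_normal((n, n)))
true_eigs = np.concatenate([np.linspace(0.0, 1.0, n - 1), [5.0]])
H = (basis * true_eigs) @ basis.T
grid = np.linspace(-1.0, 6.0, 701)
density = spectral_density(lambda v: H @ v, n, grid)
```

Only `matvec` touches the matrix, so the same structure applies when the Hessian is too large to store and is available only as a matrix-vector product.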



### 3. Linear Regression
Implement linear regression with a heteroscedastic noise model, where $\Phi$ is the design matrix, $\beta$ are the parameters, and $\sigma_i$ is the standard deviation of the measurement noise on observation $i$. The model is:

$$y = \Phi \beta + \epsilon, \quad \epsilon \sim \mathcal{N}(0, D)$$

where $D$ is a diagonal matrix with $\sigma_i^2$ on the diagonal. Add a Gaussian prior (regularization) if necessary.


Hint: $\hat{\beta}_{MLE} = (\Phi^T D^{-1} \Phi)^{-1} \Phi^T D^{-1} y$
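
The hint translates almost directly into code. Below is a minimal NumPy sketch under stated assumptions: the function name `heteroscedastic_mle`, the `prior_precision` parameter (an isotropic Gaussian prior $\beta \sim \mathcal{N}(0, \lambda^{-1} I)$, which adds $\lambda I$ to the normal equations), and the synthetic data are illustrative choices, not part of the exercise statement. In a CoLA solution, $D$ would more naturally be a diagonal operator than an explicit weight vector.

```python
import numpy as np

def heteroscedastic_mle(Phi, y, sigma, prior_precision=0.0):
    """Weighted least squares for y = Phi beta + eps, eps ~ N(0, diag(sigma^2)).

    With prior_precision > 0 this returns the MAP estimate under an
    isotropic Gaussian prior instead of the MLE.
    """
    w = 1.0 / sigma**2                               # diagonal of D^{-1}
    A = Phi.T @ (w[:, None] * Phi)                   # Phi^T D^{-1} Phi
    A = A + prior_precision * np.eye(Phi.shape[1])   # optional regularization
    b = Phi.T @ (w * y)                              # Phi^T D^{-1} y
    return np.linalg.solve(A, b)

# Synthetic data (hypothetical) with per-observation noise levels.
rng = np.random.default_rng(0)
n, d = 500, 3
Phi = rng.standard_normal((n, d))
beta_true = np.array([1.0, -2.0, 0.5])
sigma = rng.uniform(0.1, 1.0, size=n)
y = Phi @ beta_true + sigma * rng.standard_normal(n)
beta_hat = heteroscedastic_mle(Phi, y, sigma)
```

The estimate is the same as ordinary least squares on the whitened problem, where each row of $\Phi$ and each entry of $y$ is divided by $\sigma_i$. That equivalence gives a convenient check on an implementation.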

### 4. Implement PageRank to find the most influential pages of Wikipedia.
From the transition matrix on the [Linked WikiText-2 dataset](https://rloganiv.github.io/linked-wikitext-2/#/), compute the eigenvector of the largest eigenvalue using `cola.eigmax`. Rank the entries of this eigenvector to determine which pages are most influential.

The PageRank algorithm computes the stationary distribution of a Markov chain. Given a transition matrix $P$, the PageRank vector $r$ is the eigenvector corresponding to the largest eigenvalue (which should be 1 for a stochastic matrix).

The transition matrix $P$ is defined as:

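$$P = (1-\alpha)W + \frac{\alpha}{n} \mathbf{1}\mathbf{1}^T$$

where $W$ is the adjacency matrix normalized by the degree, $n$ is the number of pages, $\alpha$ is the teleportation probability (usually set to 0.15, i.e. a damping factor of 0.85), and $\mathbf{1}$ is a vector of ones. The factor $1/n$ keeps each row of $P$ summing to 1, so that $P$ is stochastic.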

The adjacency matrix entry $A_{ij}$ is 1 if there is a link from page $i$ to page $j$ (not the other way around), and 0 otherwise. The degree-normalized adjacency matrix $W$ is obtained by dividing each row of $A$ by its sum.

The PageRank vector $r$ can be found by solving the eigenproblem:

$$P^T r = r$$

The entries of $r$ give the PageRank scores of the pages. The pages can then be ranked by these scores to find the most influential ones.
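
As a sanity check for a CoLA-based solution, the eigenproblem can be solved on a toy graph with plain power iteration. The NumPy sketch below makes several assumptions: the teleport term is scaled by $1/n$ so that $P$ is row-stochastic, pages without outgoing links jump uniformly to any page, and the 4-page adjacency matrix and the function name `pagerank` are made up for illustration.

```python
import numpy as np

def pagerank(A, alpha=0.15, tol=1e-12, max_iter=1000):
    """Power iteration for P^T r = r with P = (1 - alpha) W + (alpha / n) 1 1^T."""
    n = A.shape[0]
    out_deg = A.sum(axis=1)
    # Row-normalize A; rows without outgoing links become uniform (an assumption).
    W = np.where(out_deg[:, None] > 0,
                 A / np.maximum(out_deg, 1)[:, None],
                 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Since sum(r) == 1, the rank-one term contributes alpha / n to every
        # entry, so the dense matrix 1 1^T is never formed.
        r_next = (1 - alpha) * (W.T @ r) + alpha / n
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# Toy graph (hypothetical): A[i, j] = 1 means page i links to page j.
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
], dtype=float)
r = pagerank(A)
ranking = np.argsort(-r)  # page indices from most to least influential
```

Keeping $W$ and the rank-one term separate matters at Wikipedia scale: $W$ is sparse, while $\mathbf{1}\mathbf{1}^T$ is dense and should only ever be applied implicitly.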



### 5. Make a pull request to CoLA.
For example: an improvement to the documentation, a new commonly used linear operator (e.g., Fisher information matrix, banded matrix, FFT matrix), or a bug fix. If your code for one of the above exercises is particularly clean, consider adding markdown text explaining the steps, and let's add it to the CoLA documentation under examples.
