# Lab 2: One-hidden-layer neural network with finite-sample expressivity

This lab studies a neural network with one hidden layer of $N$ neurons that can fit exactly any finite sample of size $N$.

Author: Lionel Fillatre

# Objective

For weight vectors $w,b\in \mathbb{R}^N$ and $a\in \mathbb{R}^n$, we consider the function $f:\mathbb{R}^n \rightarrow \mathbb{R}$,
$$
f(x)=\sum_{j=1}^{N}w_j \max\{a^Tx-b_j,0\}.
$$

Let $S = \{x_1,\ldots,x_N\}$ be any sample of size $N$ with $x_i \in\mathbb{R}^n$, together with target values $y_i \in \mathbb{R}$.
It is assumed that the $x_i$'s are all distinct.

We want to find weights $a$, $b$, $w$ so that $y_i = f(x_i)$ for all $i \in \{1,\ldots,N\}$.
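As a reading aid, the definition of $f$ can be sketched directly in PyTorch. This is only one possible way to write it; the function name `f` and the numerical values of `a`, `b`, `w` and `x` below are hypothetical and chosen so that the result can be checked by hand.

```python
import torch

def f(x, a, b, w):
    """Evaluate f(x) = sum_j w_j * max(a^T x - b_j, 0) for each row of x."""
    z = x @ a                                     # projections a^T x, shape (num_points,)
    hidden = torch.relu(z[:, None] - b[None, :])  # hidden activations, shape (num_points, N)
    return hidden @ w                             # weighted sum over the N hidden units

# Hypothetical values (not from the lab), small enough to verify by hand.
a = torch.tensor([1.0, 0.0])
b = torch.tensor([0.0, 1.0])
w = torch.tensor([2.0, -1.0])
x = torch.tensor([[2.0, 5.0], [0.5, 0.0], [-1.0, 3.0]])
out = f(x, a, b, w)
```

For the first point, $a^Tx = 2$, so the hidden activations are $2$ and $1$ and the output is $2\cdot 2 - 1\cdot 1 = 3$.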

### Question 1:

Verify that $f(x)$ can be expressed by a depth $2$ network (one hidden layer only) with ReLU activations.

Write your answer here.

In [None]:
import torch
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
seed = 79790
torch.manual_seed(seed) # set the seed of the random generator

<torch._C.Generator at 0x7fa945343230>

# Data generation

### Question 2:

As a numerical example to test our mathematical results, we simulate some synthetic samples. Use `torch.randn` to generate the random samples $x_i$. Each coordinate $x_{i,j}$ must follow the distribution $\mathcal{N}(0,\frac{1}{n})$. For instance, generate $N=5$ random samples with $n=2$.
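One possible sketch is given below. It assumes that the second parameter of $\mathcal{N}(0,\frac{1}{n})$ is the variance, so that standard normal draws are rescaled by $1/\sqrt{n}$; the variable names are illustrative.

```python
import torch

torch.manual_seed(79790)  # same seed value as in the cell above

N, n = 5, 2
# torch.randn draws from N(0, 1); dividing by sqrt(n) gives variance 1/n
# (assumption: the second parameter of N(0, 1/n) denotes the variance).
x = torch.randn(N, n) / n ** 0.5
```

Each row of `x` is one sample $x_i \in \mathbb{R}^n$.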

In [None]:
# Write your code here.

### Question 3:

We also simulate the labels. For each $x_i$, generate the label $y_i$ as
$$
y_i=\min\left\{10,\max\left\{1,\frac{1}{n}\sum_{j=1}^{n} \left|\sinh(n^2\,x_{i,j})\right|\right\}\right\}.
$$
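A minimal sketch of this formula, assuming the samples are stored row-wise in a tensor `x` of shape $(N, n)$ (the data generation is repeated here only to keep the snippet self-contained):

```python
import torch

torch.manual_seed(79790)
N, n = 5, 2
x = torch.randn(N, n) / n ** 0.5  # assumed layout: one sample per row

# Mean over the n coordinates of |sinh(n^2 x_ij)|, then clipped to [1, 10].
y = torch.sinh(n ** 2 * x).abs().mean(dim=1).clamp(min=1.0, max=10.0)
```

The inner `max` with $1$ and the outer `min` with $10$ are both handled by the single `clamp` call.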

In [None]:
# Write your code here.

# Neural network parameters

### Question 4:

Show that we can find $a\in\mathbb{R}^n$ such that, with $z_i = a^Tx_i$, we have $z_i\neq z_j$ for all $1\leq i\neq j\leq N$. In the rest of the exercise, we assume that $z_1 < z_2 < \ldots < z_N$ (relabeling the $x_i$'s and the $y_i$'s if necessary).

Hint: a finite union of hyperplanes in $\mathbb{R}^n$ cannot cover $\mathbb{R}^n$.

Write your answer here.

### Question 5:

Devise a random mechanism to generate $a$. Generate $a$ in Python.

Hint: a finite union of hyperplanes in $\mathbb{R}^n$ cannot cover $\mathbb{R}^n$.
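One candidate mechanism, given as a sketch rather than the expected answer: draw $a$ from a standard Gaussian. The directions for which two projections coincide satisfy $a^T(x_i - x_j)=0$ for some pair $i\neq j$, which is a finite union of hyperplanes, so a Gaussian draw avoids them with probability one. The variable names and the use of double precision are assumptions of this example.

```python
import torch

torch.manual_seed(79790)
N, n = 5, 2
x = torch.randn(N, n, dtype=torch.float64) / n ** 0.5

# Candidate mechanism (an assumption of this sketch): a standard Gaussian direction.
a = torch.randn(n, dtype=torch.float64)

# Numerical check that the projections z_i = a^T x_i are pairwise distinct:
# after sorting, every consecutive gap must be strictly positive.
z = x @ a
gaps = torch.diff(torch.sort(z).values)
all_distinct = bool((gaps > 0).all())
```

If `all_distinct` were ever `False`, redrawing `a` would be enough.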

In [None]:
# Write your code here.

### Question 6:

Show that we can find $a$ and $b$ such that, with $z_i = a^Tx_i$, we have the interleaving property $b_1 < z_1 < b_2 < z_2 < \ldots < b_N < z_N$.

Write your answer here.

### Question 7:

Compute the $z_i$'s in Python.

In [None]:
# Write your code here.

### Question 8:

Compute the $b_i$'s in Python.
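A possible construction, sketched under the assumption that the $z_i$'s are first sorted in increasing order (with the samples reordered accordingly): take $b_1$ anywhere below $z_1$ and $b_i$ as the midpoint of $z_{i-1}$ and $z_i$ for $i \geq 2$. The offset `1.0` used for $b_1$ is arbitrary.

```python
import torch

torch.manual_seed(79790)
N, n = 5, 2
x = torch.randn(N, n, dtype=torch.float64) / n ** 0.5
a = torch.randn(n, dtype=torch.float64)

# Sort the projections so that z_1 < z_2 < ... < z_N, and reorder the samples to match.
# (Any labels y would have to be reordered with the same index tensor.)
z, order = torch.sort(x @ a)
x = x[order]

b = torch.empty(N, dtype=torch.float64)
b[0] = z[0] - 1.0                # any value below z_1 works; 1.0 is arbitrary
b[1:] = 0.5 * (z[:-1] + z[1:])   # midpoints between consecutive z's
```

With this choice, $b_i < z_i$ for every $i$ and $z_{i-1} < b_i$ for $i \geq 2$, which is the interleaving property.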

In [None]:
# Write your code here.

### Question 9:

Show that the $N \times N$ matrix $A =\left( a_{i,j} \right)$ with  $a_{i,j}=\max\{z_i-b_j , 0\}$ is lower triangular.

Write your answer here.

### Question 10:

Compute the matrix $A$ in Python.
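The matrix can be built in one line with broadcasting. The sketch below uses small hypothetical values of $z$ and $b$ that satisfy the interleaving property, so that the triangular structure is visible by inspection.

```python
import torch

# Hypothetical interleaved values: b_1 < z_1 < b_2 < z_2 < b_3 < z_3.
z = torch.tensor([0.5, 1.5, 2.5], dtype=torch.float64)
b = torch.tensor([0.0, 1.0, 2.0], dtype=torch.float64)

# A[i, j] = max(z_i - b_j, 0): a column of z's minus a row of b's, then ReLU.
A = torch.relu(z[:, None] - b[None, :])

# Compare with the lower-triangular part to confirm the structure numerically.
is_lower_triangular = torch.equal(A, torch.tril(A))
```

Here the rows of `A` are $(0.5, 0, 0)$, $(1.5, 0.5, 0)$ and $(2.5, 1.5, 0.5)$.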

In [None]:
# Write your code here.

### Question 11:

Show that $A$ has full rank.

Write your answer here.

### Question 12:

Compute the determinant of $A$ in Python.
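Since the determinant of a triangular matrix is the product of its diagonal entries, the two quantities can be compared numerically. The values of $z$ and $b$ below are hypothetical and only illustrate the check.

```python
import torch

# Hypothetical interleaved values, as a stand-in for the z and b computed earlier.
z = torch.tensor([0.5, 1.5, 2.5], dtype=torch.float64)
b = torch.tensor([0.0, 1.0, 2.0], dtype=torch.float64)
A = torch.relu(z[:, None] - b[None, :])

det = torch.linalg.det(A)                 # general-purpose determinant
diag_product = torch.diagonal(A).prod()   # product of the entries z_i - b_i
```

Both values equal $0.5^3 = 0.125$ in this example; the diagonal entries $z_i - b_i$ are strictly positive, so the product cannot vanish.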

In [None]:
# Write your code here.

### Question 13:

Consider the set of $N$ equations $y_i=f(x_i)$ for $i \in \{1,\ldots,N\}$. Show that $f(x_i) = A_i w$ where $A_i$ is the $i$-th row of $A$.

Write your answer here.

### Question 14:

Show that we can find $w$ such that $y_i = f(x_i)$ for all $i \in \{1,\ldots,N\}$.

Write your answer here.

### Question 15:

Compute this $w$ in Python.
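Because $A$ is square and invertible, $w$ is the solution of the linear system $Aw = y$. The sketch below uses `torch.linalg.solve` on hypothetical values; forward substitution would also work since $A$ is lower triangular.

```python
import torch

# Hypothetical interleaved values and targets, small enough to solve by hand.
z = torch.tensor([0.5, 1.5, 2.5], dtype=torch.float64)
b = torch.tensor([0.0, 1.0, 2.0], dtype=torch.float64)
y = torch.tensor([1.0, 2.0, 4.0], dtype=torch.float64)

A = torch.relu(z[:, None] - b[None, :])
w = torch.linalg.solve(A, y)   # solves A w = y

residual = (A @ w - y).abs().max().item()  # should be at round-off level
```

Solving row by row gives $w_1 = 2$, then $w_2 = -2$, then $w_3 = 4$.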

In [None]:
# Write your code here.

# Neural network implementation

### Question 16:

Write a neural network class `Net(nn.Module)` which implements the depth-2 network with the parameters you have computed in the previous cells.
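One way such a class could look is sketched below, using two `nn.Linear` layers whose weights are overwritten with $a$, $-b$ and $w$. The constructor signature `Net(a, b, w)` and the attribute names are assumptions of this example, and the parameter values are hypothetical.

```python
import torch
from torch import nn

class Net(nn.Module):
    """Depth-2 ReLU network computing f(x) = sum_j w_j * max(a^T x - b_j, 0)."""

    def __init__(self, a, b, w):
        super().__init__()
        N, n = b.numel(), a.numel()
        self.hidden = nn.Linear(n, N)             # computes a^T x - b_j for each unit j
        self.out = nn.Linear(N, 1, bias=False)    # computes the weighted sum with w
        with torch.no_grad():
            self.hidden.weight.copy_(a.expand(N, n))  # every hidden unit shares the same a
            self.hidden.bias.copy_(-b)
            self.out.weight.copy_(w[None, :])

    def forward(self, x):
        return self.out(torch.relu(self.hidden(x))).squeeze(-1)

# Hypothetical parameters, small enough to verify by hand.
a = torch.tensor([1.0, 0.0])
b = torch.tensor([0.0, 1.0])
w = torch.tensor([2.0, -1.0])
net = Net(a, b, w)
x = torch.tensor([[2.0, 5.0], [0.5, 0.0], [-1.0, 3.0]])
with torch.no_grad():
    out = net(x)
```

The hidden layer has the same weight row $a$ for every neuron; only the biases $-b_j$ differ.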

In [None]:
# Write your code here.

### Question 17:

Generate a test set with the same size as the training set. Compare the training loss and the test loss.
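The comparison could be sketched end to end as follows. The lab does not specify the loss, so the mean squared error is an assumption here, as are the helper names `sample` and `predict` and the use of double precision.

```python
import torch

torch.manual_seed(79790)
N, n = 5, 2
dtype = torch.float64  # assumption: double precision keeps the linear solve accurate

def sample(num):
    """Draw inputs with N(0, 1/n) entries and the clipped sinh labels of Question 3."""
    x = torch.randn(num, n, dtype=dtype) / n ** 0.5
    y = torch.sinh(n ** 2 * x).abs().mean(dim=1).clamp(min=1.0, max=10.0)
    return x, y

def predict(x, a, b, w):
    """Evaluate f(x) = sum_j w_j * max(a^T x - b_j, 0) for each row of x."""
    return torch.relu((x @ a)[:, None] - b[None, :]) @ w

x_train, y_train = sample(N)
x_test, y_test = sample(N)

# Fit the parameters on the training set only.
a = torch.randn(n, dtype=dtype)
z, order = torch.sort(x_train @ a)
x_train, y_train = x_train[order], y_train[order]
b = torch.cat([z[:1] - 1.0, 0.5 * (z[:-1] + z[1:])])
A = torch.relu(z[:, None] - b[None, :])
w = torch.linalg.solve(A, y_train)

mse = torch.nn.functional.mse_loss  # assumed loss; the lab does not fix one
train_loss = mse(predict(x_train, a, b, w), y_train).item()
test_loss = mse(predict(x_test, a, b, w), y_test).item()
print(f"train loss: {train_loss:.3e}, test loss: {test_loss:.3e}")
```

The training loss should be at round-off level by construction, whereas nothing in the construction constrains the test loss.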

In [None]:
# Write your code here.