Damian Anderson, Whitney Anderson, Erika Ibarra, Bryce Lunceford, Paul Smith, and Sebastian Valencia
Suppose that we generate data according to some process:

$$y_i = f(x_i) + \varepsilon_i,$$

where $f$ is a fixed function and $\varepsilon_i$ is random noise.
- Train a neural network with parameters $\theta$ to predict $y_i$ from $x_i$. Call it $\hat{f}_\theta(x_i)$.
- Let $X$ be a random variable that is distributed like the data points $x_i$.
- Compare the distribution of $f(X) - \hat{f}_\theta(X)$ to the distribution of $\varepsilon_i$.
- If the two distributions are similar, this could explain why double descent is observed.
- We will try this with a bunch of different neural network architectures, functions $f$, and distributions over $\varepsilon_i$.
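
The steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' setup: it assumes an additive noise model $y_i = f(x_i) + \varepsilon_i$, and the target function `true_f`, the uniform distribution of $x_i$, the Gaussian noise, the one-hidden-layer tanh network, and the two-sample Kolmogorov-Smirnov statistic used to compare the distributions are all hypothetical choices made here for concreteness.

```python
import numpy as np


def true_f(x):
    # Hypothetical target function f; the source does not fix a choice.
    return np.sin(3.0 * x)


def make_data(n, noise_std, rng):
    # Assumed data process: y_i = f(x_i) + eps_i with Gaussian eps_i.
    x = rng.uniform(-1.0, 1.0, size=(n, 1))
    eps = rng.normal(0.0, noise_std, size=(n, 1))
    return x, true_f(x) + eps, eps


def train_mlp(x, y, hidden=16, lr=0.01, steps=3000, rng=None):
    # One-hidden-layer tanh network, full-batch gradient descent on MSE.
    if rng is None:
        rng = np.random.default_rng(0)
    w1 = rng.normal(0.0, 1.0, size=(1, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), size=(hidden, 1))
    b2 = np.zeros(1)
    n = x.shape[0]
    for _ in range(steps):
        h = np.tanh(x @ w1 + b1)                # hidden activations, (n, hidden)
        pred = h @ w2 + b2                      # predictions, (n, 1)
        g_pred = 2.0 * (pred - y) / n           # d(MSE)/d(pred)
        g_h = (g_pred @ w2.T) * (1.0 - h ** 2)  # backprop through tanh
        w2 -= lr * (h.T @ g_pred)
        b2 -= lr * g_pred.sum(axis=0)
        w1 -= lr * (x.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    # Return the fitted predictor f_hat.
    return lambda x_new: np.tanh(x_new @ w1 + b1) @ w2 + b2


def ks_statistic(a, b):
    # Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    # the two empirical CDFs, evaluated at every observed point.
    a = np.sort(np.ravel(a))
    b = np.sort(np.ravel(b))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def run_experiment(n_train=200, n_test=2000, noise_std=0.3, seed=0):
    rng = np.random.default_rng(seed)
    x, y, eps = make_data(n_train, noise_std, rng)
    f_hat = train_mlp(x, y, rng=rng)
    # Fresh draws of X, distributed like the training inputs x_i.
    x_new = rng.uniform(-1.0, 1.0, size=(n_test, 1))
    residuals = true_f(x_new) - f_hat(x_new)    # samples of f(X) - f_hat(X)
    return residuals, eps, ks_statistic(residuals, eps)


if __name__ == "__main__":
    residuals, eps, ks = run_experiment()
    print("KS distance between f(X) - f_hat(X) and eps:", ks)
```

A KS statistic near 0 would suggest the two empirical distributions are close, and a value near 1 that they barely overlap. Repeating `run_experiment` while varying `hidden`, `true_f`, and the noise distribution would be one way to carry out the sweep described in the last bullet.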