# Paper - AR-Net
> A simple Auto-Regressive Neural Network for time series

- layout: post
- categories: [paper]
- search_exclude: true

https://arxiv.org/abs/1911.12436

## Introduction
#### What is auto-regression? 
An autoregressive model regresses a value from a time series on previous values from that same series. For example,
$$
y_t = \beta_0 + {\beta_1}{y_{t-1}} + \epsilon_t
$$

The order of an autoregression is the number of immediately preceding values in the series that are used to predict the value at the present time. So, the preceding model is a first-order autoregression, written as AR(1).
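
To make the AR(1) equation concrete, here is a minimal sketch that simulates such a series and recovers the two coefficients by regressing $y_t$ on $y_{t-1}$. The helper names (`simulate_ar1`, `fit_ar1`), the coefficient values, and the unit-variance Gaussian noise are all assumptions picked for illustration; none of them come from the paper.

```python
import numpy as np

def simulate_ar1(beta0, beta1, n, noise_std=1.0, seed=0):
    """Generate n values of y_t = beta0 + beta1 * y_{t-1} + eps_t (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = beta0 + beta1 * y[t - 1] + rng.normal(0.0, noise_std)
    return y

def fit_ar1(y):
    """Regress y_t on y_{t-1} with ordinary least squares."""
    # Design matrix: a column of ones for the intercept, then the lag-1 values.
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef  # [beta0_hat, beta1_hat]

# Illustrative values only: beta0 = 0.5, beta1 = 0.7.
y = simulate_ar1(beta0=0.5, beta1=0.7, n=5000)
beta0_hat, beta1_hat = fit_ar1(y)
print(beta0_hat, beta1_hat)
```

With a few thousand samples the estimates should land close to the values used in the simulation, which is all "regressed on previous values" means here.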

## Classic-AR model vs AR-Net
<blockquote>
We formulate a simple neural network that mimics the Classic-AR model, with the only difference being how they are
fitted to data. Our model termed AR-Net, in its simplest form, is identical to linear regression, fitted with stochastic
gradient descent (SGD). We show that AR-Net is identically interpretable as a Classic-AR model and scales to large
p-orders. As we discuss in the future work section, our vision is to leverage more powerful temporal modeling
techniques of deep learning without sacrificing interpretability via explicit modeling of time-series components
</blockquote>

<blockquote>
We intentionally did not use more powerful methods, such as modeling latent states with recurrent networks or convolution because our goal
was to bridge, not widen, the gap between traditional time-series and deep learning methods. We hope to show with
AR-Net that deep learning models can be simple, interpretable, fast and easy to use, so that the time-series community
may consider deep learning a viable option.
</blockquote>

| Statistical Models                                                                   | Neural Networks                                                                                                                                    |
|--------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|
| Concise model, but requires strong assumptions about the data, such as the p-order.  | Non-parametric, data-driven model that does not require restrictive assumptions about the underlying process from which the data are generated.   |
| Models with a large p-order become slow.                                             | Non-linear function mapping can approximate any continuous function, and hence can solve many complex problems.                                    |

## Contribution


## Methods
### Classic-AR
An AR model of order *p* can be written as
$$
y_t = c + \sum_{i=1}^{p} w_i \, y_{t-i} + e_t
$$

where $c$ is the intercept, $w_i$ are the AR coefficients, and $e_t$ is the noise term.
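
As a rough illustration of the AR(*p*) equation, the sketch below builds the matrix of lagged values and estimates $c$ and $w_i$ with ordinary least squares. Least squares is an assumed fitting choice here, since these notes do not say how Classic-AR is fitted, and `lag_matrix`, `fit_classic_ar`, and the AR(2) coefficients are hypothetical names and values.

```python
import numpy as np

def lag_matrix(y, p):
    """Return (X, target); each row of X is [y_{t-1}, ..., y_{t-p}] for target y_t."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = np.column_stack([y[p - i : n - i] for i in range(1, p + 1)])
    return X, y[p:]

def fit_classic_ar(y, p):
    """Estimate the intercept c and weights w_1..w_p by ordinary least squares."""
    X, target = lag_matrix(y, p)
    design = np.column_stack([np.ones(len(target)), X])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[0], coef[1:]

# Hypothetical AR(2) process used only to exercise the functions.
rng = np.random.default_rng(0)
true_c, true_w = 0.2, np.array([0.5, -0.3])
y = np.zeros(20000)
for t in range(2, len(y)):
    y[t] = true_c + true_w[0] * y[t - 1] + true_w[1] * y[t - 2] + rng.normal()

c_hat, w_hat = fit_classic_ar(y, p=2)
print(c_hat, w_hat)
```

Column `i - 1` of `X` holds lag `i`, so `w_hat[0]` corresponds to $w_1$, `w_hat[1]` to $w_2$, and so on.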

### AR-Net model
The AR-Net model is a neural network whose first-layer parameters are equivalent to the Classic-AR coefficients. *AR-Net* can be extended with hidden layers to achieve greater forecasting accuracy, **at the cost of direct interpretability**. The loss is the mean squared error (MSE), which keeps it comparable with *Classic-AR*.
$$
L(y, \hat{y}, \theta) = \frac{1}{n}\sum_{t=1}^{n}\left(y_t-\hat{y}_{\theta,t}\right)^2
$$
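
In its simplest form (no hidden layers) this is a single linear layer over the last *p* values, trained on the MSE with SGD. The sketch below writes that out in plain NumPy instead of a deep learning framework, to keep it dependency-free. The learning rate, batch size, epoch count, and the helper names `simulate_ar`, `make_lags`, and `fit_ar_net` are assumptions, not settings or code from the paper.

```python
import numpy as np

def simulate_ar(c, w, n, seed=0):
    """Simulate an AR(p) series with unit-variance Gaussian noise (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    p = len(w)
    y = np.zeros(n + p)
    for t in range(p, n + p):
        # y[t - p:t][::-1] is [y_{t-1}, ..., y_{t-p}]
        y[t] = c + np.dot(w, y[t - p:t][::-1]) + rng.normal()
    return y[p:]

def make_lags(y, p):
    """Return (X, target); each row of X is [y_{t-1}, ..., y_{t-p}] for target y_t."""
    n = len(y)
    X = np.column_stack([y[p - i : n - i] for i in range(1, p + 1)])
    return X, y[p:]

def mse(target, pred):
    """Mean squared error, the loss L from the equation above."""
    return np.mean((target - pred) ** 2)

def fit_ar_net(y, p, lr=0.01, epochs=50, batch_size=64, seed=0):
    """Fit one linear layer (weights w, bias c) with mini-batch SGD on the MSE."""
    rng = np.random.default_rng(seed)
    X, target = make_lags(y, p)
    w, c = np.zeros(p), 0.0
    n = len(target)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = X[idx] @ w + c - target[idx]
            # Gradients of the mini-batch MSE with respect to w and c.
            w -= lr * 2.0 * X[idx].T @ err / len(idx)
            c -= lr * 2.0 * err.mean()
    return c, w

true_w = np.array([0.5, -0.3])  # illustrative AR(2) coefficients
y = simulate_ar(c=0.2, w=true_w, n=5000)
c_hat, w_hat = fit_ar_net(y, p=2)
X, target = make_lags(y, 2)
print(c_hat, w_hat, mse(target, X @ w_hat + c_hat))
```

Because there is no hidden layer, `w_hat` can be read directly as the AR coefficients, which is the interpretability argument made in the quotes above.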

### Sparse AR-Net
<blockquote>
In order to relax the constraint of knowing the true AR order, we can fit a larger model with sparse AR coefficients.
This will also do away with the assumption that the AR-coefficients must consist of consecutive lags. We achieve this
by adding a regularization term R to the loss L being minimized.
</blockquote>
TODO
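
The quote does not give the form of R, so the sketch below swaps in a plain L1 penalty as a stand-in, just to show the mechanics of minimizing L plus a regularization term when the model order is larger than the true one. It also uses full-batch proximal gradient descent (soft-thresholding) rather than SGD, and drops the intercept by simulating a zero-mean series. The penalty, the optimizer, the strength `lam`, and the function names are all assumptions and should not be read as the paper's method.

```python
import numpy as np

def make_lags(y, p):
    """Return (X, target); each row of X is [y_{t-1}, ..., y_{t-p}] for target y_t."""
    n = len(y)
    X = np.column_stack([y[p - i : n - i] for i in range(1, p + 1)])
    return X, y[p:]

def soft_threshold(w, thresh):
    """Proximal step for the L1 penalty: shrink toward zero, clipping small values to 0."""
    return np.sign(w) * np.maximum(np.abs(w) - thresh, 0.0)

def fit_sparse_ar(y, p, lam, steps=2000):
    """Minimize MSE + lam * sum(|w|) by proximal gradient descent (L1 stand-in for R)."""
    X, target = make_lags(y, p)
    n = len(target)
    # Step size from the largest curvature of the MSE term, so the iteration is stable.
    lr = 1.0 / np.linalg.eigvalsh(2.0 * X.T @ X / n).max()
    w = np.zeros(p)
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - target) / n
        w = soft_threshold(w - lr * grad, lr * lam)
    return w

# Hypothetical sparse process: only lags 1 and 3 are active.
rng = np.random.default_rng(0)
y = np.zeros(5000)
for t in range(3, len(y)):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 3] + rng.normal()

w_dense = fit_sparse_ar(y, p=6, lam=0.0)   # no penalty: ordinary least-squares fit
w_sparse = fit_sparse_ar(y, p=6, lam=0.1)  # penalized fit with a deliberately large p
print(np.round(w_dense, 3))
print(np.round(w_sparse, 3))
```

The model is fitted with `p=6` even though only two lags matter, and the penalty pushes the unused coefficients toward zero, which mirrors the two points in the quote: the true order need not be known, and the active lags need not be consecutive.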

#### References
https://online.stat.psu.edu/stat501/lesson/14/14.1