A common application of AR, ARMA, and ARIMA models is inventory planning. Planning inventory for a small shop can be difficult enough, but you've just been hired to plan inventory for a _big_ store - Walmart.

In this lab, you will analyze weekly Walmart sales data over a two-year period from 2010 to 2012. The data is separated by store and by department, but you should focus on analyzing one store for simplicity. Your supervisor has set out the following goals for this project:

- Record any observed trends in the data
- Produce a trained model to predict future sales numbers
- Assemble observations and graphs as well as time series models in a notebook

Try your best to tune your model. It can be difficult, but don't worry - time series analysis is a difficult and specialized topic.

#### Starter code

```python
import pandas as pd
import numpy as np

%matplotlib inline

# Load the data and index it by date
data = pd.read_csv('./datasets/train.csv')
data.set_index('Date', inplace=True)
data.head()
```

| Date | Store | Dept | Weekly_Sales | IsHoliday |
| --- | --- | --- | --- | --- |
| 2010-02-05 | 1 | 1 | 24924.5 | False |
| 2010-02-12 | 1 | 1 | 46039.49 | True |
| 2010-02-19 | 1 | 1 | 41595.55 | False |
| 2010-02-26 | 1 | 1 | 19403.54 | False |
| 2010-03-05 | 1 | 1 | 21827.9 | False |


```python
import datetime
```

#### Deliverable

**Look back at the README from Wednesday's lecture on time series for good information about these models and ACF/PACF plots!**

1. Filter the dataframe to Store 1 sales and aggregate over departments to compute the total sales per week.
2. Plot the rolling mean for `Weekly_Sales`. What general trends do you observe?
3. Compute the lag-1, lag-2, and lag-52 autocorrelations for `Weekly_Sales` and create an autocorrelation and a partial autocorrelation plot.
4. Split the weekly sales data into a training set and a test set, using 75% of the data for training.
5. "Difference" the data by converting the sales into change in sales (the `diff` function is convenient for this).
6. Create an AR(1) model on the training data and compute the mean absolute error of the predictions. How effective is this model?
7. Plot the residuals. Where are there significant errors?
8. Compute an AR(2) model and an ARMA(2, 2) model. Does this improve your mean absolute error on the held-out set?
9. Compute an ARIMA model.

#### Additional Resources

1. [ARMA Example](http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/tsa_arma.html)
2. [ARMA Models for TSA](https://www.quantstart.com/articles/Autoregressive-Moving-Average-ARMA-p-q-Models-for-Time-Series-Analysis-Part-1)

### 1. Filter the dataframe to Store 1 sales and aggregate over departments to compute the total sales per week.
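One way to approach this step is sketched below. The helper name `store_weekly_sales` and the toy frame are made up for illustration; the toy frame only mimics the shape of `data` from the starter code (string dates in the index, `Store`, `Dept`, and `Weekly_Sales` columns), and its numbers are not real sales figures.

```python
import pandas as pd


def store_weekly_sales(df, store=1):
    """Total Weekly_Sales per date for one store, summed over departments."""
    one_store = df[df["Store"] == store]
    # The dates live in the index, so group on index level 0
    weekly = one_store.groupby(level=0)["Weekly_Sales"].sum()
    # The starter code leaves the dates as strings; convert them to timestamps
    weekly.index = pd.to_datetime(weekly.index)
    return weekly.sort_index()


# Toy stand-in for `data` (invented values, two departments in Store 1)
toy = pd.DataFrame(
    {
        "Store": [1, 1, 1, 1, 2],
        "Dept": [1, 2, 1, 2, 1],
        "Weekly_Sales": [100.0, 50.0, 120.0, 30.0, 999.0],
    },
    index=pd.Index(
        ["2010-02-05", "2010-02-05", "2010-02-12", "2010-02-12", "2010-02-05"],
        name="Date",
    ),
)

toy_weekly = store_weekly_sales(toy)
print(toy_weekly)
```

In the notebook, the same helper would be called as `store_weekly_sales(data)`, assuming `data` was loaded as in the starter code.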

### 2. Plot the rolling mean for Weekly_Sales. What general trends do you observe?
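Older pandas versions exposed a top-level `pd.rolling_mean` function; current versions use the `.rolling(...).mean()` method instead. The sketch below shows that method on a short placeholder series (the values and the 4-week window are arbitrary choices for illustration, not a recommendation).

```python
import numpy as np
import pandas as pd

# Placeholder weekly series standing in for the Store 1 totals
weekly = pd.Series(
    np.arange(1.0, 9.0),
    index=pd.date_range("2010-02-05", periods=8, freq="W-FRI"),
    name="Weekly_Sales",
)

# 4-week rolling mean; the first window - 1 entries are NaN
rolling = weekly.rolling(window=4).mean()
print(rolling)
```

In the notebook, calling `weekly.plot()` followed by `rolling.plot()` in the same cell should draw both lines on one set of axes, which makes the trend easier to compare against the raw series.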

### 3. Compute the lag-1, lag-2, and lag-52 autocorrelations for Weekly_Sales and create an autocorrelation and a partial autocorrelation plot.
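The individual autocorrelations can be read off with `Series.autocorr`, as in this minimal sketch. The series here is synthetic (a 52-week sine wave plus noise) and only stands in for the real weekly totals, so the printed values say nothing about the Walmart data.

```python
import numpy as np
import pandas as pd

# Synthetic series with a yearly (52-week) cycle, for illustration only
rng = np.random.default_rng(0)
n = 156
t = np.arange(n)
weekly = pd.Series(
    np.sin(2 * np.pi * t / 52) + rng.normal(scale=0.1, size=n),
    index=pd.date_range("2010-02-05", periods=n, freq="W-FRI"),
)

# Pearson correlation between the series and a copy of itself shifted by `lag`
autocorrs = {lag: weekly.autocorr(lag=lag) for lag in (1, 2, 52)}
for lag, value in autocorrs.items():
    print(f"lag {lag:>2}: {value:.3f}")
```

For the plots, `plot_acf` and `plot_pacf` from `statsmodels.graphics.tsaplots` accept the series and a `lags` argument; setting `lags` to at least 52 keeps the yearly lag visible. Their estimates can differ slightly from `Series.autocorr` because the estimators are not identical.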


### 4. Split the weekly sales data into a training set and a test set, using 75% of the data for training.
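For time series, the split is usually chronological rather than random, so the model never sees observations that come after the ones it is asked to predict. A minimal sketch, with a hypothetical helper name and a placeholder series of 100 weeks:

```python
import numpy as np
import pandas as pd


def chronological_split(series, train_frac=0.75):
    """Split a time-ordered series into (train, test) without shuffling."""
    n_train = int(len(series) * train_frac)
    return series.iloc[:n_train], series.iloc[n_train:]


# Placeholder series; the length is arbitrary
weekly = pd.Series(
    np.arange(100, dtype=float),
    index=pd.date_range("2010-02-05", periods=100, freq="W-FRI"),
)

train, test = chronological_split(weekly)
print(len(train), len(test))
```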

### 5. "Difference" the data by converting the sales into change in sales (diff function is convenient for doing this.)

```python
series.diff(periods = n)
```
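As a small worked example of what `diff` returns, assuming a made-up four-week series: each entry becomes the change from the previous week, and the first entry has no predecessor, so it is `NaN` and is dropped here.

```python
import pandas as pd

# Made-up weekly totals, for illustration only
weekly = pd.Series(
    [10.0, 12.0, 9.0, 15.0],
    index=pd.date_range("2010-02-05", periods=4, freq="W-FRI"),
)

# Week-over-week change; drop the leading NaN
change = weekly.diff(periods=1).dropna()
print(change)
```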

### 6. Create an AR(1) model on the training data and compute the mean absolute error of the predictions. How effective is this model?

### 7. Plot the residuals - where are there significant errors?

### 8. Compute an AR(2) model and an ARMA(2, 2) model - does this improve your mean absolute error on the held-out set?

### 9. Create an ARIMA model with the data. Assess its performance.