# Chapter 2 Time Series Analysis: AR, MA, ARMA, ARMAX, ARIMA Models

This notebook contains exercises related to fitting different time series models including AR, MA, ARMA, ARMAX, and ARIMA.
Each section contains explanations and code cells to perform the required tasks.

---


## Fitting AR and MA Models

In this exercise you will fit an AR and an MA model to some data. The data here has been generated using the `arma_generate_sample()` function.

Because you know the true AR and MA parameters used to create this data, fitting models to it is a good way to gain confidence with ARMA models.
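
For reference, data of this kind could be produced along the following lines. This is a minimal sketch, not the code used to build `sample`: the coefficients, seed, and sample size are made-up values. Note that `arma_generate_sample()` expects lag-polynomial coefficients, so the leading 1 is included and the AR coefficients enter with their signs flipped.

```python
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample

np.random.seed(1)  # arbitrary seed, only for reproducibility

# Hypothetical AR(2) process: y_t = 0.5*y_{t-1} - 0.2*y_{t-2} + eps_t
ar_coefs = [1, -0.5, 0.2]  # lag polynomial 1 - 0.5L + 0.2L^2
ma_coefs = [1]             # no MA terms beyond the current shock

# Draw 200 observations with unit-variance shocks
y = arma_generate_sample(ar_coefs, ma_coefs, nsample=200, scale=1.0)
print(y[:5])
```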

Fit an **AR(2)** model to the `'timeseries_1'` column of `sample` and print the summary.

Remember that an ARMA(p,0) model is the same as an AR(p) model.

In [None]:
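# Instantiate the model
# (ARMA is assumed to be preloaded; it was removed in statsmodels 0.13,
# where ARIMA with order=(p, 0, q) is the replacement)
model = ARMA(sample['timeseries_1'], order=(2,0))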

# Fit the model
results = model.fit()

# Print summary
print(results.summary())


## Fitting an MA(3) Model

Fit an **MA(3)** model to the `'timeseries_2'` column of `sample` and print the summary.

In [None]:
# Instantiate the model
model = ARMA(sample['timeseries_2'], order=(0,3))

# Fit the model
results = model.fit()

# Print summary
print(results.summary())


## Fitting an ARMA Model

In this exercise, you will fit an **ARMA(3,1)** model to the `earthquake` dataset and print the summary.

In [None]:
# Instantiate the model
model = ARMA(earthquake, order=(3,1))

# Fit the model
results = model.fit()

# Print model fit summary
print(results.summary())


## Fitting an ARMAX Model

In this exercise you will fit an ARMAX model to a time series which represents the wait times at an accident and emergency room for urgent medical care.

Fit an **ARMAX(2,1)** model to the `'wait_times_hrs'` column of `hospital` using `'nurse_count'` as an exogenous variable.

The variable you would like to model is `wait_times_hrs`, the wait time to be seen by a medical professional. This may be related to an exogenous variable that you measured, `nurse_count`, which is the number of nurses on shift at any given time. These can be seen below.

This is a particularly interesting case of time series modeling because, if the number of nurses has an effect, you could change it to influence the wait times.

In [None]:
# Instantiate the model
model = ARMA(hospital['wait_times_hrs'], order=(2,1),
             exog=hospital['nurse_count'])

# Fit the model
results = model.fit()

# Print model fit summary
print(results.summary())


## Generating One-Step-Ahead Predictions

It is very hard to forecast stock prices. Classic economics actually tells us that this should be impossible because of market clearing.

Your task in this exercise is to attempt the impossible and predict the Amazon stock price anyway.

In this exercise you will generate one-step-ahead predictions for the stock price as well as the uncertainty of these predictions.

A model has already been fitted to the Amazon data for you. The results object from this model is available in your environment as `results`.

- Use the `results` object to make one-step-ahead predictions over the latest 30 days of data and assign the result to `one_step_forecast`.
- Assign your mean predictions to `mean_forecast` using one of the attributes of the `one_step_forecast` object.
- Extract the confidence intervals of your predictions from the `one_step_forecast` object and assign them to `confidence_intervals`.
- Print your mean predictions.

In [None]:
# Generate predictions
one_step_forecast = results.get_prediction(start=-30)

# Extract prediction mean
mean_forecast = one_step_forecast.predicted_mean

# Get confidence intervals of predictions
confidence_intervals = one_step_forecast.conf_int()

# Select lower and upper confidence limits
lower_limits = confidence_intervals.loc[:,'lower close']
upper_limits = confidence_intervals.loc[:,'upper close']

# Print best estimate predictions
print(mean_forecast)


## Plotting One-Step-Ahead Predictions

Now that you have your predictions on the Amazon stock, you should plot these predictions to see how you've done.

You made predictions over the latest 30 days of data available, always forecasting just one day ahead. By evaluating these predictions you can judge how the model performs in making predictions for just the next day, where you don't know the answer.

The `lower_limits`, `upper_limits` and `amazon` DataFrames as well as your mean prediction `mean_forecast` that you created in the last exercise are available in your environment.

- Plot the `amazon` data, using `amazon.index` as the x-coordinates.
- Plot the `mean_forecast` prediction similarly, using `mean_forecast.index` as the x-coordinates.
- Plot a shaded area between `lower_limits` and `upper_limits` of your confidence interval. Use the index of `lower_limits` as the x-coordinates.

In [None]:
# plot the amazon data
plt.plot(amazon.index, amazon, label='observed')

# plot your mean predictions
plt.plot(mean_forecast.index, mean_forecast, color='r', label='forecast')

# shade the area between your confidence limits
plt.fill_between(lower_limits.index, lower_limits,
                 upper_limits, color='pink')

# set labels, legends and show plot
plt.xlabel('Date')
plt.ylabel('Amazon Stock Price - Close USD')
plt.legend()
plt.show()


## Generating Dynamic Forecasts

Now let's move a little further into the future, to dynamic predictions. What if you wanted to predict the Amazon stock price, not just for tomorrow, but for next week or next month? This is where dynamic predictions come in.

Remember that in the video you learned how it is more difficult to make precise long-term forecasts because the shock terms add up. The further into the future the predictions go, the more uncertain. This is especially true with stock data and so you will likely find that your predictions in this exercise are not as precise as those in the last exercise.


- Use the `results` object to make dynamic predictions for the latest 30 days and assign the result to `dynamic_forecast`.
- Assign your predictions to a new variable called `mean_forecast` using one of the attributes of the `dynamic_forecast` object.
- Extract the confidence intervals of your predictions from the `dynamic_forecast` object and assign them to a new variable `confidence_intervals`.
- Print your mean predictions.

Hints:

- Use the `.get_prediction()` method of the `results` object to make dynamic predictions for the latest 30 steps. Remember to set the `dynamic` argument to `True`.
- You can use the `.predicted_mean` attribute of `dynamic_forecast` to find the mean predictions.
- You can use the `.conf_int()` method of `dynamic_forecast` to generate a confidence interval.

In [None]:
# Generate predictions
dynamic_forecast = results.get_prediction(start=-30, dynamic=True)

# Extract prediction mean
mean_forecast = dynamic_forecast.predicted_mean

# Get confidence intervals of predictions
confidence_intervals = dynamic_forecast.conf_int()

# Select lower and upper confidence limits
lower_limits = confidence_intervals.loc[:,'lower close']
upper_limits = confidence_intervals.loc[:,'upper close']

# Print best estimate predictions
print(mean_forecast)


## Plotting Dynamic Forecasts

Time to plot your predictions. Remember that making dynamic predictions means that your model makes predictions with no corrections, unlike the one-step-ahead predictions. This is a bit like making a forecast now for the next 30 days, and then waiting to see what happens before judging how good your predictions were.

The `lower_limits`, `upper_limits` and `amazon` DataFrames as well as your mean predictions `mean_forecast` that you created in the last exercise are available in your environment.

- Plot the `amazon` data using the dates in the index of this DataFrame as the x-coordinates and the values as the y-coordinates.
- Plot the `mean_forecast` predictions similarly.
- Plot a shaded area between `lower_limits` and `upper_limits` of your confidence interval. Use the index of one of these DataFrames as the x-coordinates.

In [None]:
# plot the amazon data
plt.plot(amazon.index, amazon, label='observed')

# plot your mean predictions
plt.plot(mean_forecast.index, mean_forecast, color='r', label='forecast')

# shade the area between your confidence limits
plt.fill_between(lower_limits.index, lower_limits,
                 upper_limits, color='pink')

# set labels, legends and show plot
plt.xlabel('Date')
plt.ylabel('Amazon Stock Price - Close USD')
plt.legend()
plt.show()


## Differencing and Fitting ARMA

In this exercise you will fit an ARMA model to the Amazon stocks dataset. As you saw before, this is a non-stationary dataset. You will use differencing to make it stationary so that you can fit an ARMA model.

In the next section you'll make a forecast of the differences and use this to forecast the actual values.

The Amazon stock time series is available in your environment as `amazon`. The `SARIMAX` model class is also available in your environment.

- Use the `.diff()` method of `amazon` to make the time series stationary by taking the first difference. Don't forget to drop the `NaN` values using the `.dropna()` method.
- Create an ARMA(2,2) model using the `SARIMAX` class, passing it the stationary data.
- Fit the model.

In [None]:
# Take the first difference of the data
amazon_diff = amazon.diff().dropna()

# Create ARMA(2,2) model
arma = SARIMAX(amazon_diff, order=(2,0,2))

# Fit model
arma_results = arma.fit()

# Print fit summary
print(arma_results.summary())


## Unrolling ARMA Forecast

Now you will use the model that you trained in the previous exercise, `arma`, to forecast the absolute value of the Amazon stock price (the price level itself, rather than the day-to-day change). Sometimes predicting the difference is enough (will the stock go up or down?), but sometimes the absolute value is key.

The results object from the model you trained in the last exercise is available in your environment as `arma_results`. The `np.cumsum()` function and the original DataFrame `amazon` are also available.

- Use the `.get_forecast()` method of the `arma_results` object and select the predicted mean of the next 10 differences.
- Use the `np.cumsum()` function to integrate your difference forecast.
- Add the last value of the original DataFrame to make your forecast an absolute value.


In [None]:
# Make arma forecast of next 10 differences
arma_diff_forecast = arma_results.get_forecast(steps=10).predicted_mean

# Integrate the difference forecast
arma_int_forecast = np.cumsum(arma_diff_forecast)

# Make absolute value forecast
arma_value_forecast = arma_int_forecast + amazon.iloc[-1,0]

# Print forecast
print(arma_value_forecast)
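
Why a cumulative sum plus a starting value undoes differencing can be seen on a tiny example. The numbers below are made up; the point is only that summing the differences and adding the value that precedes them recovers the original levels. In the forecast above, that preceding value is the last observed price.

```python
import numpy as np
import pandas as pd

# Made-up price levels
values = pd.Series([100.0, 102.0, 101.0, 105.0, 104.0])

# First differences: 2, -1, 4, -1
diffs = values.diff().dropna()

# Integrate the differences and add back the starting level
rebuilt = np.cumsum(diffs) + values.iloc[0]

print(rebuilt.tolist())  # [102.0, 101.0, 105.0, 104.0]
```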


## Fitting an ARIMA Model

In this exercise you'll learn how to be lazy in time series modeling. Instead of taking the difference, modeling the difference and then integrating, you're just going to let statsmodels do the hard work for you.

You'll repeat the same exercise that you did before, of forecasting the absolute values of the Amazon stocks dataset, but this time with an ARIMA model.

A subset of the stocks dataset is available in your environment as `amazon` and so is the `SARIMAX` model class.

- Create an ARIMA(2,1,2) model, using the `SARIMAX` class, passing it the Amazon stocks data `amazon`.
- Fit the model.
- Make a forecast of mean values of the Amazon data for the next 10 time steps. Assign the result to `arima_value_forecast`.

In [None]:
# Create ARIMA(2,1,2) model
arima = SARIMAX(amazon, order=(2,1,2))

# Fit ARIMA model
arima_results = arima.fit()

# Make ARIMA forecast of next 10 values
arima_value_forecast = arima_results.get_forecast(steps=10).predicted_mean

# Print forecast
print(arima_value_forecast)
