# Time Series (Forecasting)

Let's explore the quarterly US Real GDP time series. The data can be obtained from [FRED](https://fred.stlouisfed.org/series/GDPC1).

First, import the libraries and load the data set from the file <b>GDP.xls</b>.

In [None]:
# Importing useful libraries

import numpy as np
import pandas as pd 
import pandas.plotting as pp
import statsmodels.api as sm
import matplotlib.pyplot as plt
import statsmodels.formula.api as smf
import statsmodels.tsa.stattools as stt
import statsmodels.tsa.api as tsa
import statsmodels.stats.api as sms
from sklearn.metrics import mean_squared_error
import statsmodels.tsa.ar_model as arm
import warnings 
warnings.filterwarnings('ignore') # Suppressing warnings

In [None]:
# Loading GDP series

df = pd.read_excel("GDP.xls")
df["date"] = pd.to_datetime(df["date"])

In [None]:
# Plotting the US Real GDP Series

plt.plot(df.date,df.GDPC)

title = "US Real GDP (US$ billions of 2012)"

plt.title(title)                             # Plot title
plt.xlabel("Date")                        # Plot x-axis label
plt.ylabel("Real GDP")                  # Plot y-axis label

plt.show()

In [None]:
# Testing the US Real GDP series for stationarity with the Augmented Dickey-Fuller (ADF) test

test = stt.adfuller(df.GDPC)

adf_statistic = test[0]

p_value = test[1]

# Printing ADF test statistic and p-value for the US Real GDP series
print("GDP: ADF statistic", round(adf_statistic,4),"and p-value",p_value)
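
The null hypothesis of the ADF test is that the series has a unit root (is non-stationary), so a small p-value counts as evidence of stationarity. A minimal sketch of that decision rule, assuming a 5% significance level; the helper `interpret_adf` is a hypothetical name for illustration and is not part of statsmodels:

```python
def interpret_adf(p_value, alpha=0.05):
    """Return a short verdict for an ADF p-value (hypothetical helper)."""
    if p_value < alpha:
        # Small p-value: reject the unit-root null hypothesis
        return "reject unit root: evidence of stationarity"
    # Large p-value: the unit-root null cannot be rejected
    return "cannot reject unit root: treat as non-stationary"

# Illustrative p-values, not results from the GDP data
print(interpret_adf(0.99))
print(interpret_adf(0.001))
```

In the notebook it could be called as `interpret_adf(p_value)` right after the test.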

In [None]:
# Create the rate of growth of the US Real GDP series

df["delta_GDPC"] = np.log(df["GDPC"]).diff()
df = df.dropna().reset_index(drop=True)
df
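
The log difference is used as the growth rate because ln(y_t) - ln(y_{t-1}) = ln(1 + g_t), which is close to the percentage change g_t when g_t is small. A short sketch of that approximation, using made-up levels rather than the FRED data:

```python
import numpy as np

# Made-up GDP levels for illustration only (not FRED data)
levels = np.array([100.0, 101.0, 102.5, 102.0])

log_growth = np.diff(np.log(levels))          # ln(y_t) - ln(y_{t-1})
pct_growth = levels[1:] / levels[:-1] - 1.0   # (y_t - y_{t-1}) / y_{t-1}

# For small quarterly changes the two measures nearly coincide
max_gap = np.max(np.abs(log_growth - pct_growth))
print(log_growth)
print(pct_growth)
print(max_gap)
```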

In [None]:
# Plotting the rate of growth of the US Real GDP Series

plt.plot(df.date,df.delta_GDPC)

title = "US Real GDP Growth Rate"

plt.title(title)                             # Plot title
plt.xlabel("Date")                        # Plot x-axis label
plt.ylabel("Growth Rate")                  # Plot y-axis label

plt.show()

In [None]:
# Testing the rate of growth of the US Real GDP Series for stationarity with the Augmented Dickey-Fuller (ADF) test

test = stt.adfuller(df.delta_GDPC)

adf_statistic = test[0]

p_value = test[1]

# Printing ADF test statistic and p-value for the US Real GDP growth rate series
print("GDP growth: ADF statistic", round(adf_statistic,4),"and p-value",p_value)

## Expanding Window Forecast

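We will apply an expanding window forecast to the last three years (12 quarters) of the series: each one-step-ahead forecast comes from an AR(1) model fitted to all observations available before that quarter.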
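
The training slices of an expanding window can be checked on a toy series before running the real forecast. The sketch below assumes a 10-point stand-in series and a test size of 3; the names `toy_series` and `toy_test_size` are illustrative only:

```python
toy_series = list(range(10))   # toy stand-in for the growth-rate series
toy_test_size = 3

expanding_windows = []
for i in range(toy_test_size):
    # The training set always starts at the first observation and grows by one each step
    train = toy_series[0:i - toy_test_size]
    # The observation being forecast is the one right after the training set
    target = toy_series[len(toy_series) - toy_test_size + i]
    expanding_windows.append((train, target))
    print(len(train), train[-1], target)
```

Each step adds one observation to the training set, and the forecast target is always the next point in the series.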

In [None]:
# Creating a function that fits an AR(k) model on a training set
# and returns the one-step-ahead forecast as a float

def ar_pred(x,k):
    model = arm.AutoReg(x,lags=k).fit()
    yhat = model.forecast(steps=1)        # Forecast of length 1
    return float(np.asarray(yhat)[0])

In [None]:
# Creating Expanding Window Forecasts

X = df["delta_GDPC"]

expanding_pred = list()

test_size = 12

for i in range(test_size):
    prediction = ar_pred(X[0:i-test_size],k=1)
    expanding_pred.append(prediction)

In [None]:
# Calculating the Root Mean Squared Error of the Forecast
rmse = np.sqrt(mean_squared_error(X[-test_size:], expanding_pred))
print("The root mean squared error is:",rmse)
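
The RMSE is the square root of the average squared forecast error. A small worked sketch with made-up numbers (not results from the GDP data) shows the arithmetic that `mean_squared_error` followed by `np.sqrt` performs:

```python
import math

# Made-up actual values and forecasts for illustration only
toy_actual = [0.010, 0.005, -0.002]
toy_forecast = [0.008, 0.006, 0.001]

# Squared error for each period
squared_errors = [(a - f) ** 2 for a, f in zip(toy_actual, toy_forecast)]

# Mean of the squared errors, then the square root
rmse_toy = math.sqrt(sum(squared_errors) / len(squared_errors))
print(rmse_toy)
```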

In [None]:
# Creating figure and axis objects
fig, ax = plt.subplots(figsize=(16,9))

# Plotting the rate of growth of the US Real GDP for the test and predictions

ax.plot(df.date[-test_size:],X[-test_size:])
ax.plot(df.date[-test_size:],expanding_pred)

# Setting title and axis labels
ax.set(title="US Real GDP Growth Rate and Expanding Window Forecast",
       ylabel="Growth Rate",xlabel="Date")

# Rotate x-axis labels
plt.xticks(rotation=45, ha='right')

# Creating legend and setting to center right
fig.legend(['True','Forecast'], loc='center right')

# Show plot
plt.show()

## Rolling Window Forecast
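
A rolling window keeps the training sample at a fixed length: each step drops the oldest observation and adds the newest one. The sketch below checks the slicing on a toy series, assuming 10 points and a test size of 3; the names `toy_series` and `toy_test_size` are illustrative only:

```python
toy_series = list(range(10))   # toy stand-in for the growth-rate series
toy_test_size = 3

rolling_windows = []
for i in range(toy_test_size):
    # The start moves forward with i, so the training length stays constant
    train = toy_series[i:i - toy_test_size]
    # The observation being forecast is the one right after the training set
    target = toy_series[len(toy_series) - toy_test_size + i]
    rolling_windows.append((train, target))
    print(train[0], train[-1], len(train), target)
```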

In [None]:
# Creating Rolling Window Forecasts

X = df["delta_GDPC"]

rolling_pred = list()

test_size = 12

for i in range(test_size):
    prediction = ar_pred(X[i:i-test_size],k=1)
    rolling_pred.append(prediction)

In [None]:
# Calculating the Root Mean Squared Error of the Forecast
rmse = np.sqrt(mean_squared_error(X[-test_size:], rolling_pred))
print("The root mean squared error is:",rmse)

In [None]:
# Creating figure and axis objects
fig, ax = plt.subplots(figsize=(16,9))

# Plotting the rate of growth of the US Real GDP for the test and predictions

ax.plot(df.date[-test_size:],X[-test_size:])
ax.plot(df.date[-test_size:],rolling_pred)

# Setting title and axis labels
ax.set(title="US Real GDP Growth Rate and Rolling Window Forecast",
       ylabel="Growth Rate",xlabel="Date")

# Rotate x-axis labels
plt.xticks(rotation=45, ha='right')

# Creating legend and setting to center right
fig.legend(['True','Forecast'], loc='center right')

# Show plot
plt.show()

## Johnson & Johnson EPS

Let's explore the time series in the file <b>jj_eps.xlsx</b>, which contains the quarterly earnings per share (EPS) of Johnson & Johnson from 1960 to 1980.

The data set is from Kaggle and can be obtained [here](https://www.kaggle.com/datasets/nirmalsankalana/johnson-and-johnson-quarterly-earnings).

In [None]:
# Loading J&J EPS series

data = pd.read_excel("jj_eps.xlsx")
data["date"] = pd.to_datetime(data["date"])

1. Plot the J&J EPS Series

2. Test the series for stationarity

3. Create the rate of growth of the J&J EPS series

4. Test the rate of growth of the J&J EPS series for stationarity

5. Create Expanding Window Forecasts for the last 12 periods using an AR(1) model

6. Plot the Expanding Window Forecasts versus the actual data of the last 12 periods

7. Create Rolling Window Forecasts for the last 12 periods using an AR(1) model

8. Plot the Rolling Window Forecasts versus the actual data of the last 12 periods

9. Calculate the RMSE of the expanding window and rolling window forecasts.