<img src="../Images/DSC_Logo.png" style="width: 400px;">

# Time Series Theory in Python - Part 2: Stationary Time Series Models

This notebook provides an overview of stationary time series models, with a specific emphasis on the autoregressive moving average (ARMA) process. ARMA models facilitate intuitive interpretations of how past values and past errors influence current observations, which can inform decision-making processes. Additionally, the ARMA model serves as the foundation for more complex models, including ARIMA (which incorporates differencing for non-stationary data) and seasonal or multivariate variants such as SARIMA and VARMA. While ARMA models offer valuable tools for understanding time series behavior, it is important to recognize their limitations when dealing with complex real-world data, which may require alternative modeling approaches that account for non-linear behavior or non-stationary processes.

The notebook also outlines the general steps involved in the development of time series models. These steps include identifying an appropriate model type, estimating model parameters, fitting the selected model, testing the residuals to ensure model adequacy, and generating forecasts based on the fitted model. Notably, these methodologies are not limited to stationary models; they are also applicable to non-stationary or non-linear time series, providing a general framework for time series analysis and prediction.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import acf
from pandas.plotting import lag_plot
from statsmodels.stats.diagnostic import acorr_ljungbox
import statsmodels.api as sm

from statsmodels.tsa.arima_process import arma_generate_sample
from statsmodels.tsa.arima.model import ARIMA 
from statsmodels.tsa.stattools import kpss
from scipy import stats 

from PythonTsa.datadir import getdtapath
dtapath=getdtapath()

## 1. Moving Average (MA) Models

Generating MA(2) Model given by the equation:
<span style="font-size: 24px;">$$ X_t = \varepsilon_t + 0.6 \varepsilon_{t-1} - 0.3 \varepsilon_{t-2} $$</span>
where  <span style="font-size: 18px;">$ \varepsilon_t \sim \text{iid} \, N(0, 1). $</span> 

With the Python function `arma_generate_sample`, we can get samples from an autoregressive moving average (ARMA) process defined by specified parameters.


The time series displays neither trend nor seasonality and appears stationary. The significant ACF values at lags 1 and 2, followed by a cutoff at higher lags, together with the tailing-off PACF and the lag plots, are consistent with an MA(2) process. The MA(2) model expresses the current value of the time series as a function of the current and previous two noise terms. The lag plot for lag 1 shows a positive relationship, while the lag plot for lag 2 suggests a weaker, negative correlation. Furthermore, the absence of discernible patterns in the lag plots for lags 3 and 4 indicates that these lags do not contribute to the current value, in line with the typical behavior of an MA(2) process.
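
As a cross-check on what the sample ACF should look like, the theoretical ACF of an MA(q) process can be computed directly from its coefficients. The sketch below is a minimal illustration using only NumPy; the helper name `ma_acf` is hypothetical and not part of any library.

```python
import numpy as np

def ma_acf(theta, max_lag):
    """Theoretical ACF of an MA(q) process.

    theta holds the MA polynomial coefficients including the leading 1,
    e.g. [1, 0.6, -0.3] for X_t = e_t + 0.6 e_{t-1} - 0.3 e_{t-2}.
    """
    theta = np.asarray(theta, dtype=float)
    q = len(theta) - 1
    gamma0 = np.sum(theta ** 2)          # variance up to the factor sigma^2
    rho = np.zeros(max_lag + 1)
    for k in range(max_lag + 1):
        if k <= q:                       # the ACF is exactly zero beyond lag q
            rho[k] = np.sum(theta[: q + 1 - k] * theta[k:]) / gamma0
    return rho

rho = ma_acf([1, 0.6, -0.3], max_lag=4)
print(np.round(rho, 4))
```

For the model above, this gives a lag-1 autocorrelation of about 0.29, a lag-2 autocorrelation of about -0.21, and exactly zero from lag 3 onward, which is the cutoff the sample ACF should approximate.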

In [None]:
# Define the MA parameters
ma = np.array([1, 0.6, -0.3])  # MA coefficients

# Set the random seed for reproducibility
np.random.seed(123457)

# Generate a sample from the ARMA process
x = arma_generate_sample(ar=[1], ma=ma, nsample=500)  # AR part is set to [1] for no AR component; sample of size (length) 500

# Check the type of x (should be a numpy array)
print(type(x))  # Output: <class 'numpy.ndarray'>

# Convert x to a pandas Series
x = pd.Series(x)

# Check the type of x again (now it should be a Series)
print(type(x))  # Output: <class 'pandas.core.series.Series'>

# Plot the time series
x.plot()
plt.title('Generated MA(2) Series')
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()

# Plot ACF and PACF
plot_acf(x, lags=20)
plt.title('ACF of the Generated MA(2) Series')  # For MA(2), the ACF at lags 1 and 2 lies outside the confidence band
plt.show()
plot_pacf(x, lags=20)
plt.title('PACF of the Generated MA(2) Series')
plt.show()

# Lag plots for lag 1 and 2
fig = plt.figure()
lag_plot(x, lag=1, ax=fig.add_subplot(211))
plt.title('Lag Plot (Lag 1)')
lag_plot(x, lag=2, ax=fig.add_subplot(212))
plt.title('Lag Plot (Lag 2)')
plt.tight_layout()
plt.show()

# Lag plots for lags 3 and 4
fig = plt.figure()
lag_plot(x, lag=3, ax=fig.add_subplot(211))
plt.title('Lag Plot (Lag 3)')
lag_plot(x, lag=4, ax=fig.add_subplot(212))
plt.title('Lag Plot (Lag 4)')
plt.tight_layout()
plt.show()

## 2. Autoregressive Models

Generating AR(2) Model given by the equation:  
<span style="font-size: 24px;">$$ X_t = 0.8 X_{t-1} - 0.3 X_{t-2} + \varepsilon_t $$</span>  

where  <span style="font-size: 18px;">$ \varepsilon_t \sim \text{iid} \, N(0, 1). $</span>

The time series shows no trend or seasonality and appears stationary. The PACF values are close to zero from lag 3 onward, that is, the PACF cuts off after lag 2, indicating that the series follows an AR(2) process in which the current observation is directly influenced by the two most recent values. The ACF tails off, which is also characteristic of autoregressive processes. The significant positive autocorrelation at lag 11 is most likely due to sampling variability: the data were simulated from an AR(2) model, and roughly one in twenty lags is expected to fall outside the 95% band by chance. In real data, such a spike would warrant further investigation.
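
The stationarity of this AR(2) model and the tailing-off shape of its ACF can also be verified analytically. The following sketch, using only NumPy, checks that the roots of the AR polynomial lie outside the unit circle and computes the theoretical ACF with the Yule-Walker recursion; the variable names are chosen for illustration.

```python
import numpy as np

phi1, phi2 = 0.8, -0.3   # X_t = 0.8 X_{t-1} - 0.3 X_{t-2} + e_t

# Roots of the AR polynomial 1 - phi1*z - phi2*z^2
# (np.roots expects coefficients from the highest power down)
roots = np.roots([-phi2, -phi1, 1.0])
is_stationary = bool(np.all(np.abs(roots) > 1))
print("Root moduli:", np.abs(roots), "-> stationary:", is_stationary)

# Yule-Walker recursion for the theoretical ACF
max_lag = 6
rho = np.zeros(max_lag + 1)
rho[0] = 1.0
rho[1] = phi1 / (1 - phi2)
for k in range(2, max_lag + 1):
    rho[k] = phi1 * rho[k - 1] + phi2 * rho[k - 2]
print("Theoretical ACF:", np.round(rho, 4))
```

Because the roots are complex, the theoretical ACF decays as a damped oscillation rather than cutting off, in contrast to the MA case.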

In [None]:
# Define the AR parameters (AR(2) model)
ar = np.array([1, -0.8, 0.3])  # AR coefficients

# Set the random seed for reproducibility
np.random.seed(123457)

# Generate a sample from the AR process
x = arma_generate_sample(ar=ar, ma=[1], nsample=500)  # ma=[1] means no MA part in the model; sample of size (length) 500

# Convert the generated sample to a pandas Series
x = pd.Series(x)

# Plot the time series
x.plot()
plt.title('Generated AR(2) Sample')
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()

# Plot ACF and PACF
plot_acf(x, lags=20)
plt.title('ACF of the Generated AR(2) Sample')
plt.show()
plot_pacf(x, lags=20)
plt.title('PACF of the Generated AR(2) Sample')
plt.show()

# Lag plot for lag 11
lag_plot(x, lag=11)
plt.title('Lag Plot (Lag 11)')
plt.show()


## 3. Autoregressive Moving Average (ARMA) Models

Consider the following ARMA(2,2) model:

<span style="font-size: 18px;">$$
X_t = 0.8 X_{t-1} - 0.6 X_{t-2} + \varepsilon_t + 0.7 \varepsilon_{t-1} + 0.4 \varepsilon_{t-2}
$$</span>

In an ARMA model, which combines both AR and MA components, the current value $X_t$ is expressed as a linear combination of its previous values **and** past error terms ($\varepsilon $). Specifically, an ARMA(p,q) model combines the features of AR(p), which captures the influence of the past values, and MA(q), which incorporates the effects of past error terms.

The generated time series is stationary and exhibits a rapid decay to zero for both the ACF and PACF. However, many ACF and PACF values remain nonzero for lags equal to or greater than 3, indicating the tailing-off behavior typical of a mixed ARMA process. The plots alone do not identify the orders (2,2).
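
To make the ARMA(2,2) equation concrete, the recursion can be written out by hand. This is a minimal NumPy sketch and not how `arma_generate_sample` is implemented; the seed, burn-in length, and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)      # illustrative seed
n, burn = 500, 200                  # burn-in discards the effect of the zero start values
phi = np.array([0.8, -0.6])         # AR coefficients as they appear in the equation
theta = np.array([0.7, 0.4])        # MA coefficients as they appear in the equation

eps = rng.standard_normal(n + burn)
x = np.zeros(n + burn)
for t in range(2, n + burn):
    x[t] = (phi[0] * x[t - 1] + phi[1] * x[t - 2]
            + eps[t] + theta[0] * eps[t - 1] + theta[1] * eps[t - 2])
x = x[burn:]

# Stationarity and invertibility: all roots must lie outside the unit circle
ar_roots = np.roots([-phi[1], -phi[0], 1.0])    # 1 - 0.8 z + 0.6 z^2
ma_roots = np.roots([theta[1], theta[0], 1.0])  # 1 + 0.7 z + 0.4 z^2
print("AR root moduli:", np.abs(ar_roots))
print("MA root moduli:", np.abs(ma_roots))
```

Note the sign convention: the equation uses the coefficients 0.8 and -0.6 directly, whereas `statsmodels` expects the AR lag polynomial `[1, -0.8, 0.6]`.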

In [None]:
# Set random seed for reproducibility
np.random.seed(12357)

# Define AR and MA parameters for the ARMA model
ar = np.array([1, -0.8, 0.6])  # AR coefficients
ma = np.array([1, 0.7, 0.4])   # MA coefficients

# Create an ARMA process
arma_process = sm.tsa.ArmaProcess(ar, ma)

# Check for stationarity and invertibility
print("Stationarity:", arma_process.isstationary)  # Check if the process is stationary
print("Invertibility:", arma_process.isinvertible)  # Check if the process is invertible

# Generate a sample from the ARMA process
y = arma_generate_sample(ar=ar, ma=ma, nsample=500)
y = pd.Series(y, name='y')  # Convert the generated data to a pandas Series

# Plot the time series
plt.figure()
y.plot(title='ARMA(2, 2) Time Series', xlabel='Time', ylabel='Value')
plt.grid()
plt.show()

# Plot ACF and PACF
plot_acf(y, lags=20)
plt.show()
plot_pacf(y, lags=20)
plt.show()

# Lag plot for lag 11
lag_plot(y, lag=11)
plt.title('Lag Plot (Lag 11)')
plt.show()

### Choose model:

We will now assume that we no longer remember the source of the sample and aim to construct an ARMA(p,q) model for it. The challenge lies in determining the appropriate orders (p,q). To address this, we model various combinations of orders p and q and compare them using the metrics AIC, BIC, and HQIC. AIC (Akaike Information Criterion), BIC (Bayesian Information Criterion), and HQIC (Hannan-Quinn Information Criterion) are statistical criteria used to compare the goodness of fit of multiple models while penalizing for model complexity, helping to avoid overfitting. A smaller value indicates a better model fit relative to the complexity of the model. 

The AIC tends to keep decreasing as the order (p and/or q) increases, because its complexity penalty is weaker than that of BIC and HQIC; this does not by itself mean that the higher-order models are better. Here, BIC and HQIC suggest (2,2), while AIC suggests a higher-order model.
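
The three criteria differ only in how strongly they penalize the number of estimated parameters. The sketch below uses the standard textbook formulas; the log-likelihood values and parameter counts are made-up numbers chosen for illustration, not results from the fitted models.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, BIC, HQIC) for log-likelihood loglik, k parameters, n observations."""
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    hqic = 2 * k * math.log(math.log(n)) - 2 * loglik
    return aic, bic, hqic

n = 500
# Hypothetical values: the larger model fits slightly better but uses 4 more parameters
small = information_criteria(loglik=-710.0, k=5, n=n)
large = information_criteria(loglik=-705.0, k=9, n=n)

for name, s, l in zip(("AIC", "BIC", "HQIC"), small, large):
    choice = "small" if s < l else "large"
    print(f"{name}: small={s:.2f}, large={l:.2f} -> prefers {choice} model")
```

With these illustrative numbers, AIC prefers the larger model while BIC and HQIC prefer the smaller one, which is the kind of disagreement described above.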

Note: 

- ARMA modeling using the `statsmodels` library is performed with the Python function ARIMA, where the differencing parameter is set to zero (i.e., order=(p,0,q)). This fits an ARMA(p,q) model to the data.

- Warnings (for example, about convergence during fitting) signal that the model assumptions, the fitting process, and possibly the data itself require review. Paying attention to them helps ensure that the chosen models are appropriate and that the estimates produced are reliable.

In [None]:
max_p = 6 # how many past values of the series are used to predict future values
max_q = 7 # number of lagged forecast errors in the prediction equation
model_metrics = []

print("Fitting ARMA models...")

for p in range(max_p + 1):
    for q in range(max_q + 1):
        if p == 0 and q == 0:
            continue  # Skip the case where both p and q are zero
        
        try:
            model = ARIMA(y, order=(p, 0, q)).fit()  # Use ARIMA with d=0 for ARMA
            model_metrics.append((p, q, model.aic, model.bic, model.hqic))  # Store p, q, metrics
            print(f"Fitted ARIMA({p}, 0, {q}) with AIC: {model.aic}, BIC: {model.bic}, HQIC: {model.hqic}")  # Print metrics for each fitted model
        except Exception as e:
            print(f"Error fitting ARIMA({p}, 0, {q}): {e}")

# Convert model metrics to DataFrame for easier analysis
metrics_df = pd.DataFrame(model_metrics, columns=['p', 'q', 'AIC', 'BIC', 'HQIC'])

# Print model metrics
print("\nModel Metrics for Different ARMA Models:")
print(metrics_df)

# Get the best model based on the minimum AIC
if not metrics_df.empty:
    best_model = metrics_df.loc[metrics_df['AIC'].idxmin()] # choose best model based on AIC only
    print("\nBest ARMA model by AIC:")
    print(best_model)
else:
    print("No models were fitted successfully.")

The function `sm.tsa.arma_order_select_ic()` is a built-in model selection tool in the `statsmodels` library that automates the process of determining the best ARMA model orders based on specified criteria (AIC, BIC, HQIC):

In [None]:
inf = sm.tsa.arma_order_select_ic(y, max_ar=6, max_ma=7, ic=['aic', 'bic', 'hqic'], trend='c')  # trend='c' includes a constant; in statsmodels 0.13.0 and later, use trend='n' for no constant

In [None]:
print("Best AR term (p for minimum AIC):", inf.aic_min_order[0])
print("Best MA term (q for minimum AIC):", inf.aic_min_order[1])

print("Best AR term (p for minimum BIC):", inf.bic_min_order[0])
print("Best MA term (q for minimum BIC):", inf.bic_min_order[1])

print("Best AR term (p for minimum HQIC):", inf.hqic_min_order[0])
print("Best MA term (q for minimum HQIC):", inf.hqic_min_order[1])

### Fit model:

Based on the results of the model selection step and the knowledge that the sample originates from an ARMA(2, 2) model, we proceed to fit this model to the data and analyze the fitted model and its residuals.

The summary statistics reveal key characteristics of the fitted model, including the estimated coefficients, standard errors, and the goodness-of-fit measures. Analyzing the residuals is crucial for diagnosing how well the chosen model fits the data and testing for white noise residuals. If the residuals exhibit significant autocorrelation in the ACF/PACF plots or show significant deviations from normality, it may indicate that the model is not properly specified or suitable for the data.

The model appears to fit the data well. The residuals from the model exhibit no significant autocorrelation or evidence of non-normality (hence, behave like those of white noise), suggesting that the model is appropriately specified for the time series data.
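
To show what the Ljung-Box test measures, the statistic can be computed by hand from the sample autocorrelations. This is a minimal sketch with NumPy and SciPy, applied to simulated data rather than to the residuals above; the function name `ljung_box` and its `model_df` argument are hypothetical names for this illustration.

```python
import numpy as np
from scipy import stats

def ljung_box(x, h, model_df=0):
    """Ljung-Box Q statistic and p-value using the first h autocorrelations.

    model_df is the number of fitted ARMA parameters to subtract from the
    degrees of freedom when x is a residual series (0 for raw data).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    lags = np.arange(1, h + 1)
    r = np.array([np.sum(xc[k:] * xc[:-k]) / denom for k in lags])
    q = n * (n + 2) * np.sum(r ** 2 / (n - lags))
    p = stats.chi2.sf(q, df=h - model_df)
    return q, p

rng = np.random.default_rng(1)       # illustrative seed
noise = rng.standard_normal(500)     # white noise
ar1 = np.zeros(500)                  # strongly autocorrelated AR(1) series
for t in range(1, 500):
    ar1[t] = 0.9 * ar1[t - 1] + noise[t]

q_noise, p_noise = ljung_box(noise, h=10)
q_ar1, p_ar1 = ljung_box(ar1, h=10)
print(f"White noise: Q={q_noise:.2f}, p={p_noise:.3f}")
print(f"AR(1):       Q={q_ar1:.2f}, p={p_ar1:.3g}")
```

A small p-value rejects the hypothesis of no autocorrelation, so for residuals of an adequate model the p-values should stay above the chosen significance level.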

In [None]:
# Fit the ARMA(2,2) model
arma22 = ARIMA(y, order=(2,0,2), trend='n').fit()
print(arma22.summary())

# Analyze the residuals
resid22 = arma22.resid

# Plot ACF and PACF of the residuals
plot_acf(resid22, lags=20)
plt.show()
plot_pacf(resid22, lags=20)
plt.show()

# Perform the Ljung-Box test for residuals
# Calculate Ljung-Box test statistics and p-values
ljung_box_results = acorr_ljungbox(resid22, lags=25, return_df=True)
# Create a plot for the p-values
plt.figure()
plt.plot(ljung_box_results['lb_pvalue'], marker='o', linestyle='-', color='b')
plt.axhline(y=0.05, color='r', linestyle='--')  # 5% significance level
plt.title('Ljung-Box Test P-Values')
plt.xlabel('Lags')
plt.ylabel('P-Value')
plt.xticks(ljung_box_results.index)  # Set x-ticks to all lags
plt.gca().set_xticklabels([str(int(lag)) if lag % 2 == 0 else '' for lag in ljung_box_results.index])  # Label even lags only
plt.grid()
plt.show()

# Q-Q plot
plt.figure()
sm.qqplot(resid22, line='q', ax=plt.gca())
plt.title('Q-Q Plot (Sample vs Theoretical Quantiles)')
plt.grid()
plt.show()

# Perform the normality test on residuals
normaltest_result = stats.normaltest(resid22)
print("Normality test result:", normaltest_result)

### Predict model:

For the specified time range, both the in-sample predictions and the first few out-of-sample forecasts appear satisfactory. In ARMA forecasting, predictions made further into the future converge toward a constant value, namely the estimated mean of the process (zero here, since the model was fitted with `trend='n'`). The increasing uncertainty is evident in the wider confidence intervals for more distant out-of-sample forecasts: in-sample predictions are one-step-ahead predictions that use all observations up to the previous time point, whereas forecasts beyond the end of the data rely on progressively less information.
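
The mean reversion and the widening intervals can be seen in closed form for the simplest case, an AR(1) process. The sketch below uses the textbook h-step forecast formulas; all parameter values are hypothetical and unrelated to the fitted ARMA(2,2) model.

```python
import numpy as np

# Hypothetical AR(1): X_t - mu = phi * (X_{t-1} - mu) + e_t, with Var(e_t) = sigma2
mu, phi, sigma2 = 2.0, 0.7, 1.0
x_last = 4.0                      # last observed value (made up)

h = np.arange(1, 21)              # forecast horizons 1..20
forecast = mu + phi ** h * (x_last - mu)                 # decays geometrically to mu
fvar = sigma2 * (1 - phi ** (2 * h)) / (1 - phi ** 2)    # forecast error variance
half_width = 1.96 * np.sqrt(fvar)                        # approximate 95% interval half-width

for step in (1, 2, 5, 10, 20):
    i = step - 1
    print(f"h={step:2d}: forecast={forecast[i]:.3f} +/- {half_width[i]:.3f}")
```

The forecast approaches the mean `mu`, and the interval width grows toward the limit implied by the unconditional variance `sigma2 / (1 - phi**2)`.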

In [None]:
# Get predictions for a specified range (start=450, end=520)
pred = arma22.get_prediction(start=450, end=520)
predicts = pred.predicted_mean
predconf = pred.conf_int()

# Combine observed data, predictions, and confidence intervals into a DataFrame
predframe = pd.concat([y[450:], predicts, predconf], axis=1)
predframe.columns = ['Observed', 'Predicted', 'Lower CI', 'Upper CI']

# Plot observed, predicted, and confidence intervals
plt.figure(figsize=(10, 5))
plt.plot(predframe['Observed'], label='Observed', color='blue')
plt.plot(predframe['Predicted'], label='Predicted', color='red')
plt.fill_between(predframe.index, predframe['Lower CI'], predframe['Upper CI'], color='gray', alpha=0.5, label='Confidence Interval')
plt.title('ARMA(2, 2) Predictions and Confidence Intervals')
plt.legend()
plt.show()

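Although the fitted model passes the residual diagnostics, the out-of-sample forecasts are informative only for the first few steps, after which they revert to the process mean.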

### **Example 1: The NAO Index Since January 1950**

The time series is the monthly mean North Atlantic Oscillation (NAO) index since January 1950. The series appears stationary (compare notebook C1). We will treat the NAO series as originating from an ARMA process and aim to build an ARMA(p,q) model. Both AR and MA components could be viable options; however, the evident cutoff in the PACF plot after lag 1 suggests that an autoregressive process may be more appropriate.

In [None]:
# Load the NAO dataset
nao = pd.read_csv(dtapath + 'nao.csv', header=0)

# Create a time index
timeindex = pd.date_range('1950-01', periods=len(nao), freq='ME')
nao.index = timeindex

# Extract NAO index as a Series
naots = nao['index']  # Ensure 'index' corresponds to the correct column name

# Plot the NAO index time series
naots.plot(title='NAO Index Time Series', xlabel='Date', ylabel='NAO Index')
plt.show()

# Plot ACF (plot_acf creates its own figure)
plot_acf(naots, lags=50)
plt.title("ACF of the Time Series")
plt.show()

# Plot PACF (plot_pacf creates its own figure)
plot_pacf(naots, lags=50)
plt.title("PACF of the Time Series")
plt.show()

# Plot Lag Plot
lag_plot(naots)
plt.title('Lag Plot of the Time Series')
plt.tight_layout()
plt.show()

**Exercise:** Fit and evaluate suitable models for the time series and select the best model based on AIC, BIC, and HQIC criteria.

**Exercise:** Fit the best model. What is the value of the AR coefficient?

**Exercise:** Analyze the residuals from the fitted ARIMA model to assess whether the model fits well.

**Exercise:** Use the fitted model for out-of-sample and in-sample predictions from April 2010 to December 2019. Plot the predictions and confidence intervals.

## 4. Stationarity Test and Differencing

One limitation of ARMA models is the stationarity condition. In many real-world time series, the data can be thought of as being composed of two components: a non-stationary trend component and a zero-mean stationary component. Several strategies exist to achieve stationarity in a non-stationary time series, including differencing, detrending/decomposition, and smoothing (see also notebook C3). Here, we demonstrate the differencing technique. Statistical tests such as the KPSS (Kwiatkowski-Phillips-Schmidt-Shin) test can be used to assess the stationarity of a time series. Note that the null hypothesis of the KPSS test is that the series is stationary, so a small p-value is evidence against stationarity. In the `statsmodels.tsa.stattools` module, the function `kpss` performs the test; the argument `regression='c'` tests for stationarity around a constant level (appropriate for a series without a clear trend), whereas `regression='ct'` tests for stationarity around a deterministic linear trend.

### **Example 2: Global Annual Mean Surface Air Temperature Changes Series (1880-1985)**

The time series dataset contains global mean surface air temperature changes from 1880 to 1985, as reported by Hansen and Lebedeff (1987). The temperature changes are relative to the average global mean surface air temperature of the baseline period 1951-1980. The frequency of the time series is annual, meaning that seasonal variations within the annual cycle are not included. The trend in the time series illustrates global warming. We apply the first difference to the time series to remove the trend and make it stationary (tested with KPSS).

In [None]:
# Load the dataset
tep = pd.read_csv(dtapath + 'Global mean surface air temp changes 1880-1985.csv', header=None)

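# Create a date index (year-end frequency; 'YE-DEC' is the pandas >= 2.2 alias, consistent with 'ME' and 'QE' used elsewhere)
dates = pd.date_range('1880-12', periods=len(tep), freq='YE-DEC')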
tep.index = dates
tepts = pd.Series(tep[0], name='tep')

# Plot the original time series
plt.plot(tepts, color='b')
plt.title('Global Mean Surface Air Temperature Changes (1880-1985)')
plt.show()

# Differencing the time series
dtepts = tepts.diff(1)
dtepts = dtepts.dropna()
dtepts.name = 'dtep'

# Plot the differenced time series
plt.plot(dtepts, color='b')
plt.title('Differenced Time Series')
plt.show()

# Plot ACF and PACF
plot_acf(dtepts, lags=20)
plt.show()
plot_pacf(dtepts, lags=20)
plt.show()

# KPSS test for stationarity
kpss_stat, p_value, lags, crit_values = kpss(dtepts, regression='c', nlags='auto')

# Output the results of the KPSS test
print(f'KPSS Statistic: {kpss_stat}')
print(f'p-value: {p_value}')
print(f'Lags: {lags}')
print('Critical Values:', crit_values)
if p_value < 0.05:
    print("The series is likely non-stationary.")
else:
    print("The series is likely stationary.")

### **Example 3: Chinese Quarterly GDP**

In notebook C1, we found that the Chinese quarterly GDP time series has both trend and seasonality. Since the data are quarterly, the seasonal period is 4. We therefore seasonally difference the series with a lag of 4 and plot the result to visualize the effect of removing the seasonality, which makes the remaining trend easier to see. Next, we take the first difference of the seasonally differenced series to eliminate the remaining trend component. We then plot this series and examine its ACF and PACF to assess the correlation structure, which guides the ARIMA model selection. Finally, we conduct the KPSS test for stationarity.
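
The two differencing steps can be checked on a small deterministic series where the outcome is known in advance. The synthetic data below (a linear trend plus a repeating four-quarter pattern) are invented for illustration only.

```python
import numpy as np
import pandas as pd

t = np.arange(24)                                   # 24 quarters
seasonal = np.tile([5.0, -2.0, 1.0, -4.0], 6)       # repeating quarterly pattern
x = pd.Series(2.0 * t + seasonal)                   # linear trend + seasonality

dx = x.diff(4).dropna()      # lag-4 difference: the seasonal pattern cancels, leaving 2 * 4 = 8
d1dx = dx.diff(1).dropna()   # first difference of the result: the constant cancels, leaving 0

print(dx.unique())
print(d1dx.unique())
```

Each differencing step also shortens the series: the lag-4 difference drops four observations and the subsequent first difference drops one more.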

In [None]:
# Load the data
x = pd.read_csv(dtapath + 'gdpquarterlychina1992.1-2017.4.csv', header=0)

# Create a date range starting from 1992 with quarterly frequency
dates = pd.date_range(start='1992', periods=len(x), freq='QE')
x.index = dates

# Plot the original time series
x.plot()
plt.title('Chinese Quarterly GDP 1992-2017')
plt.ylabel('billions of RMB')
plt.show()

# Create a time series from the 'GDP' column
x = pd.Series(x['GDP'])

# Seasonal differencing with lag 4
dx = x.diff(4).dropna()

# Plot the seasonally differenced series
dx.plot(marker='o', ms=3)  # ms refers to marker size
plt.title('Seasonally Differenced GDP (Lag 4)')
plt.xlabel('Date')
plt.ylabel('Differenced GDP')
plt.show()

# First differencing the seasonally differenced series
d1dx = dx.diff(1).dropna()

# Plot the first difference of the seasonally differenced series
d1dx.plot(marker='o', ms=3)
plt.title('First Difference of Seasonally Differenced GDP')
plt.xlabel('Date')
plt.ylabel('Differenced GDP')
plt.show()

# Plot ACF and PACF for the first difference of seasonally differenced series
plot_acf(d1dx, lags=44)
plt.title('ACF of Differenced Series')
plt.show()
plot_pacf(d1dx, lags=44)
plt.title('PACF of Differenced Series')
plt.show()

# KPSS test for stationarity
kpss_stat, p_value, lags, crit_values = kpss(d1dx, regression='c', nlags='auto')

# Output the results of the KPSS test
print(f'KPSS Statistic: {kpss_stat}')
print(f'p-value: {p_value}')
print(f'Lags: {lags}')
print('Critical Values:', crit_values)
if p_value < 0.05:
    print("The series is likely non-stationary.")
else:
    print("The series is likely stationary.")

### **Example 1 [continued]: The NAO Index Since January 1950**

**Exercise:** Conduct the KPSS test for stationarity on the NAO data (variable 'naots') with a maximum of 50 lags.

## 5. Autoregressive Integrated Moving Average (ARIMA) Models

An ARIMA(p,d,q) model is an ARMA(p,q) model applied to the series after it has been differenced d times.

### **Example 2 [continued]: Global Annual Mean Surface Air Temperature Changes Series (1880-1985)**

### Choose model:

In [None]:
inf = sm.tsa.arma_order_select_ic(dtepts, max_ar=3, max_ma=3, ic=['aic', 'bic', 'hqic'], trend='c')

In [None]:
print("Best AR term (p for minimum AIC):", inf.aic_min_order[0])
print("Best MA term (q for minimum AIC):", inf.aic_min_order[1])

print("Best AR term (p for minimum BIC):", inf.bic_min_order[0])
print("Best MA term (q for minimum BIC):", inf.bic_min_order[1])

print("Best AR term (p for minimum HQIC):", inf.hqic_min_order[0])
print("Best MA term (q for minimum HQIC):", inf.hqic_min_order[1])

AIC and HQIC select (p,q) = (1,3), while BIC selects (p,q) = (1,1). We first fit the ARMA(1,1) model and use it for prediction.

### Fit model:

In [None]:
arma11 = ARIMA(dtepts, order=(1,0,1)).fit()
print(arma11.summary())

# Analyze the residuals
resid11 = arma11.resid

# Plot ACF and PACF of the residuals
plot_acf(resid11, lags=20)
plt.show()
plot_pacf(resid11, lags=20)
plt.show()

# Perform the Ljung-Box test for residuals
# Calculate Ljung-Box test statistics and p-values
ljung_box_results = acorr_ljungbox(resid11, lags=20, return_df=True)
# Create a plot for the p-values
plt.figure()
plt.plot(ljung_box_results['lb_pvalue'], marker='o', linestyle='-', color='b')
plt.axhline(y=0.05, color='r', linestyle='--')  # 5% significance level
plt.title('Ljung-Box Test P-Values')
plt.xlabel('Lags')
plt.ylabel('P-Value')
plt.xticks(ljung_box_results.index)  # Set x-ticks to all lags
plt.gca().set_xticklabels([str(int(lag)) if lag % 2 == 0 else '' for lag in ljung_box_results.index])  # Label even lags only
plt.grid()
plt.show()

# Q-Q plot
plt.figure()
sm.qqplot(resid11, line='q', ax=plt.gca())
plt.title('Q-Q Plot (Sample vs Theoretical Quantiles)')
plt.grid()
plt.show()

# Perform the normality test on residuals
normaltest_result = stats.normaltest(resid11)
print("Normality test result:", normaltest_result)

The residual series behaves like Gaussian white noise, so the estimated model fits the differenced series well.

### Predict model:

In [None]:
# Generate prediction results
pred = arma11.get_prediction(start='1960-12', end='1990-12')

# Extract predicted mean and confidence intervals
predicts = pred.predicted_mean
predconf = pred.conf_int()

# Plot
plt.figure(figsize=(10, 5))
plt.plot(dtepts, label='Observed', color='blue')
predicts.plot(label='Forecast', color='red')
plt.fill_between(predicts.index,
                 predconf.iloc[:, 0],
                 predconf.iloc[:, 1], color='gray', alpha=0.3, label='Confidence Interval')
plt.title('ARIMA Forecast (1960-1990)')
plt.xlabel('Year')
plt.ylabel('Differenced Temperature')
plt.legend()
plt.show()

The forecasts generated by the fitted ARMA(1,1) model, both in-sample and out-of-sample, are not really satisfactory.

**Exercise:** Fit and predict the ARMA(1,3) model and compare the results to the ARMA(1,1) model.

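Differencing the series manually and then fitting an ARMA model, versus letting `ARIMA` difference internally via `order=(p,1,q)`, are two ways of handling non-stationarity that can lead to different results, as the following example demonstrates. One reason is that `ARIMA` includes a constant by default only when `d=0`; with `d>0`, no trend term is included unless requested. In addition, predictions from the built-in approach are on the original (undifferenced) scale.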
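
When differencing is done manually, forecasts of the differenced series have to be converted back to the original scale by cumulative summation from the last observed level. The sketch below uses invented numbers purely to show the arithmetic.

```python
import numpy as np

# Hypothetical observed levels and hypothetical forecasts of the first differences
levels = np.array([0.10, 0.15, 0.12, 0.20, 0.26])
diff_forecasts = np.array([0.03, 0.02, 0.02])

# Level forecast at horizon h = last level + sum of the first h forecasted differences
level_forecasts = levels[-1] + np.cumsum(diff_forecasts)
print(level_forecasts)
```

With built-in differencing (`order=(p,1,q)`), this back-transformation is handled internally by the model.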

In [None]:
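arima111 = ARIMA(tepts, order=(1,1,1)).fit()
# Drop the first residual: with d=1 there is no earlier value to difference against,
# so it equals the first observation and would distort the diagnostics
resid111 = arima111.resid.iloc[1:]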

# Plot ACF and PACF of the residuals
plot_acf(resid111, lags=20)
plt.show()
plot_pacf(resid111, lags=20)
plt.show()

# Perform the Ljung-Box test for residuals
# Calculate Ljung-Box test statistics and p-values
ljung_box_results = acorr_ljungbox(resid111, lags=20, return_df=True)
# Create a plot for the p-values
plt.figure()
plt.plot(ljung_box_results['lb_pvalue'], marker='o', linestyle='-', color='b')
plt.axhline(y=0.05, color='r', linestyle='--')  # 5% significance level
plt.title('Ljung-Box Test P-Values')
plt.xlabel('Lags')
plt.ylabel('P-Value')
plt.xticks(ljung_box_results.index)  # Set x-ticks to all lags
plt.gca().set_xticklabels([str(int(lag)) if lag % 2 == 0 else '' for lag in ljung_box_results.index])  # Label even lags only
plt.grid()
plt.show()

# Q-Q plot
plt.figure()
sm.qqplot(resid111, line='q', ax=plt.gca())
plt.title('Q-Q Plot (Sample vs Theoretical Quantiles)')
plt.grid()
plt.show()

# Perform the normality test on residuals
normaltest_result = stats.normaltest(resid111)
print("Normality test result:", normaltest_result)

# Generate prediction results
pred = arima111.get_prediction(start='1960-12', end='1990-12')

# Extract predicted mean and confidence intervals
predicts = pred.predicted_mean
predconf = pred.conf_int()

# Plot (ARIMA(1,1,1) predictions are on the original scale, so compare with tepts)
plt.figure(figsize=(10, 5))
plt.plot(tepts, label='Observed', color='blue')
predicts.plot(label='Forecast', color='red')
plt.fill_between(predicts.index,
                 predconf.iloc[:, 0],
                 predconf.iloc[:, 1], color='gray', alpha=0.3, label='Confidence Interval')
plt.title('ARIMA(1,1,1) Forecast (1960-1990)')
plt.xlabel('Year')
plt.ylabel('Temperature Anomaly')
plt.legend()
plt.show()

Both the in-sample predictions (via `fittedvalues`) and the out-of-sample forecasts (via `forecast` or `get_forecast`) are obtained directly on the original scale:

In [None]:
# Fit model
arima111 = ARIMA(tepts, order=(1,1,1)).fit()

# Define out-of-sample forecast
forecast_steps = 5
last_date = tepts.index[-1]  # Get the last date of actual data
forecast_dates = pd.date_range(start=last_date, periods=forecast_steps + 1, freq='YE')[1:]
forecast_result = arima111.get_forecast(steps=forecast_steps)
forecast_series = forecast_result.predicted_mean  # Point forecasts
forecast_conf = forecast_result.conf_int()  # Confidence intervals

# Plot
plt.figure(figsize=(10, 5))
plt.xlabel('Year')
plt.ylabel('Temperature Anomaly')
plt.plot(tepts.index, arima111.fittedvalues, color="red", label="Fitted Values")
plt.plot(forecast_dates, forecast_series, color='black', label='Forecast Out-Sample')
tepts.plot(color="blue", label="Observed")
plt.fill_between(forecast_dates,
                 forecast_conf.iloc[:, 0],  # Lower confidence interval
                 forecast_conf.iloc[:, 1],  # Upper confidence interval
                 color='gray', alpha=0.3, label='Confidence Interval')
plt.legend()
plt.show()

In [None]:
# Fit model
arima113 = ARIMA(tepts, order=(1,1,3)).fit()

# Define out-of-sample forecast
forecast_steps = 5
last_date = tepts.index[-1]  # Get the last date of actual data
forecast_dates = pd.date_range(start=last_date, periods=forecast_steps + 1, freq='YE')[1:]
forecast_result = arima113.get_forecast(steps=forecast_steps)
forecast_series = forecast_result.predicted_mean  # Point forecasts
forecast_conf = forecast_result.conf_int()  # Confidence intervals

# Plot
plt.figure(figsize=(10, 5))
plt.xlabel('Year')
plt.ylabel('Temperature Anomaly')
plt.plot(tepts.index, arima113.fittedvalues, color="red", label="Fitted Values")
plt.plot(forecast_dates, forecast_series, color='black', label='Forecast Out-Sample')
tepts.plot(color="blue", label="Observed")
plt.fill_between(forecast_dates,
                 forecast_conf.iloc[:, 0],  # Lower confidence interval
                 forecast_conf.iloc[:, 1],  # Upper confidence interval
                 color='gray', alpha=0.3, label='Confidence Interval')
plt.legend()
plt.show()