# **Chapter 21: Traditional Statistical Models**

## **Learning Objectives**

By the end of this chapter, you will be able to:

- Understand the foundational concepts of classical time‑series models
- Explain the difference between autoregressive (AR), moving average (MA), and integrated (I) components
- Identify the appropriate model order using autocorrelation (ACF) and partial autocorrelation (PACF) plots
- Build, estimate, and diagnose ARIMA and SARIMA models for the NEPSE dataset
- Apply exponential smoothing methods for forecasting
- Recognize when to use vector autoregression (VAR) for multivariate series
- Interpret model diagnostics and ensure residuals are white noise
- Compare the strengths and limitations of statistical models versus machine learning approaches
- Implement these models in Python using `statsmodels`

---

## **21.1 Introduction to Statistical Models**

Before the rise of machine learning, forecasting time‑series was dominated by statistical models that explicitly describe the stochastic process generating the data. These models are built on probability theory and assume that the time series has a certain structure (e.g., linear dependence on past values, moving average of past errors, seasonality). The most famous family is the **ARIMA** (Autoregressive Integrated Moving Average) model, popularized by Box and Jenkins. Other important models include **Exponential Smoothing** (Holt‑Winters) and **Vector Autoregression** (VAR) for multivariate series.

For the NEPSE prediction system, statistical models serve as excellent baselines. They are interpretable, have well‑established theory, and often perform surprisingly well, especially on shorter horizons. Moreover, they force us to think carefully about the data generating process, which helps in understanding the market dynamics.

### **21.1.1 When to Use Statistical Models**

- When the series exhibits clear linear autocorrelation structure.
- When interpretability is important (e.g., for regulatory reporting).
- When data is limited; statistical models generally require fewer observations than deep learning.
- As a benchmark to compare against more complex models.

### **21.1.2 The Box‑Jenkins Methodology**

The Box‑Jenkins approach for ARIMA modeling consists of three steps:

1. **Identification** – using plots, ACF, and PACF to guess the model order (p, d, q).
2. **Estimation** – fitting the model parameters via maximum likelihood or least squares.
3. **Diagnostic checking** – verifying that residuals are white noise (no autocorrelation) and that the model is adequate.

We will follow this methodology throughout the chapter.

---

## **21.2 Autoregressive Models (AR)**

An autoregressive model of order p, denoted AR(p), assumes that the current value of the series depends linearly on its own previous p values plus a random error.

**Mathematically:**  
`y_t = c + φ₁ y_{t-1} + φ₂ y_{t-2} + ... + φ_p y_{t-p} + ε_t`  
where `ε_t` is white noise.

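In the context of NEPSE, if we model daily returns, an AR(1) model says that today's return equals a constant plus a fraction φ₁ of yesterday's return plus a random shock. A positive φ₁ captures momentum (moves tend to continue), while a negative φ₁ captures short‑term reversal. The process is stationary only if |φ₁| < 1.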
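To make the AR(1) equation concrete, the following minimal sketch simulates a synthetic AR(1) series and recovers its parameters by ordinary least squares. The data are simulated rather than taken from NEPSE, and the helper names `simulate_ar1` and `fit_ar1_ols` are hypothetical.

```python
import numpy as np

def simulate_ar1(phi, c, n, seed=0):
    """Simulate y_t = c + phi * y_{t-1} + eps_t with standard normal shocks."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    y = np.empty(n)
    y[0] = c / (1 - phi)  # start at the unconditional mean
    for t in range(1, n):
        y[t] = c + phi * y[t - 1] + eps[t]
    return y

def fit_ar1_ols(y):
    """Regress y_t on a constant and y_{t-1}; return (c_hat, phi_hat)."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef[0], coef[1]

# Illustrative parameter values, not estimates from real data
y_sim = simulate_ar1(phi=0.6, c=0.1, n=5000)
c_hat, phi_hat = fit_ar1_ols(y_sim)
print(f"c_hat = {c_hat:.3f}, phi_hat = {phi_hat:.3f}")
```

With a long enough sample the estimates should land close to the values used in the simulation, which is the same regression that an OLS-based AR fit performs.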

### **21.2.1 Identifying AR Order with PACF**

The partial autocorrelation function (PACF) measures the correlation between `y_t` and `y_{t-k}` after removing the effects of the intermediate lags. For an AR(p) process, the PACF cuts off after lag p. That is, it becomes zero for all lags > p.

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.ar_model import AutoReg

# Load NEPSE data for a single symbol (e.g., first symbol)
df = pd.read_csv('nepse_data.csv')
df['Date'] = pd.to_datetime(df['Date'])
df = df.sort_values(['Symbol', 'Date']).reset_index(drop=True)
symbol = df['Symbol'].unique()[0]  # pick first symbol
df_one = df[df['Symbol'] == symbol].copy()

# Compute daily returns (more stationary than prices)
df_one['Return'] = df_one['Close'].pct_change() * 100  # percentage returns
returns = df_one['Return'].dropna()

# Plot PACF to identify AR order
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 8))
plot_acf(returns, lags=30, ax=ax1, title='ACF of Returns')
plot_pacf(returns, lags=30, ax=ax2, title='PACF of Returns')
plt.tight_layout()
plt.show()
```

**Explanation:**

- For a stationary AR(p) process, the ACF decays gradually, while the PACF has a sharp cutoff at the AR order. In practice, we look for significant spikes in the PACF up to lag p, after which they become insignificant (within the confidence bands).
- For financial returns, often the PACF shows significance only at lag 1, suggesting an AR(1) model. However, this may vary by stock and period.
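The cutoff property can be checked numerically. The sketch below computes the PACF at lag k as the last coefficient of an OLS regression of `y_t` on its first k lags, applied to a synthetic AR(2) series. The function `pacf_ols` is a hypothetical helper written for illustration; it is not the estimator that `plot_pacf` uses by default.

```python
import numpy as np

def pacf_ols(y, max_lag):
    """PACF at lags 1..max_lag: last coefficient of an OLS AR(k) fit for each k."""
    y = np.asarray(y, dtype=float)
    out = []
    for k in range(1, max_lag + 1):
        # Columns: constant, y_{t-1}, ..., y_{t-k}, aligned with the target y[k:]
        cols = [np.ones(len(y) - k)]
        cols += [y[k - j:len(y) - j] for j in range(1, k + 1)]
        X = np.column_stack(cols)
        coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
        out.append(coef[-1])
    return np.array(out)

# Synthetic AR(2) series with illustrative coefficients 0.5 and 0.3
rng = np.random.default_rng(1)
n = 5000
eps = rng.standard_normal(n)
y_ar2 = np.zeros(n)
for t in range(2, n):
    y_ar2[t] = 0.5 * y_ar2[t - 1] + 0.3 * y_ar2[t - 2] + eps[t]

pacf_vals = pacf_ols(y_ar2, max_lag=5)
band = 1.96 / np.sqrt(n)  # approximate 95% band under the white-noise null
for lag, value in enumerate(pacf_vals, start=1):
    flag = "significant" if abs(value) > band else "inside band"
    print(f"lag {lag}: {value:+.3f} ({flag})")
```

For this simulated AR(2), the first two partial autocorrelations should be clearly non-zero and the remaining ones should be close to zero.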

### **21.2.2 Fitting an AR Model in Python**

We can fit an AR(p) model using `statsmodels.tsa.ar_model.AutoReg`.

```python
from statsmodels.tsa.ar_model import AutoReg

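# Choose p based on PACF (here we'll try p=1,2,3).
# hold_back=max_p fits every model on the same observations, so the AICs are comparable.
max_p = 3
for p in range(1, max_p + 1):
    model = AutoReg(returns, lags=p, trend='c', hold_back=max_p)  # 'c' includes constant
    result = model.fit()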
    print(f"\nAR({p}) AIC: {result.aic:.2f}")
    print(result.summary().tables[1])
```

**Explanation:**

- `AutoReg` fits the model using OLS. The `trend='c'` adds a constant term.
- We compare models using AIC (Akaike Information Criterion); lower is better. AIC balances goodness‑of‑fit with model complexity.
- The summary provides coefficient estimates, standard errors, and p‑values. Significant coefficients suggest those lags are useful.
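The trade-off that AIC encodes can be sketched without any library support. Assuming Gaussian errors, the AIC of an OLS fit equals `n * log(SSE / n) + 2k` up to an additive constant, where k is the number of estimated coefficients. The helper `ar_aic` below is hypothetical and the data are simulated; its values will not match those reported by `statsmodels`, but the ranking logic is the same.

```python
import numpy as np

def ar_aic(y, p, hold_back):
    """Gaussian AIC (up to a constant) of an OLS AR(p) fit on y[hold_back:]."""
    y = np.asarray(y, dtype=float)
    target = y[hold_back:]
    cols = [np.ones(len(target))]
    cols += [y[hold_back - j:len(y) - j] for j in range(1, p + 1)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    sse = np.sum((target - X @ coef) ** 2)
    n_obs = len(target)
    k = p + 1  # AR coefficients plus the constant
    return n_obs * np.log(sse / n_obs) + 2 * k

# Synthetic AR(2) data with illustrative coefficients
rng = np.random.default_rng(11)
n = 2000
eps = rng.standard_normal(n)
y_demo = np.zeros(n)
for t in range(2, n):
    y_demo[t] = 0.5 * y_demo[t - 1] + 0.3 * y_demo[t - 2] + eps[t]

max_p = 4
aic_by_p = {p: ar_aic(y_demo, p, hold_back=max_p) for p in range(1, max_p + 1)}
for p, aic in aic_by_p.items():
    print(f"AR({p}): AIC = {aic:.1f}")
```

Every candidate is scored on the same observations (`hold_back=max_p`), since comparing AIC across different samples is not meaningful. Here the step from AR(1) to AR(2) should lower the AIC sharply, while further lags add penalty with little gain in fit.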

### **21.2.3 Forecasting with AR Model**

```python
# Forecast next 5 days
forecast = result.forecast(steps=5)
print("Forecasted returns (%):")
print(forecast)

# To forecast prices, invert the return transformation by compounding.
# Note: `result` is the last model fitted in the loop above, i.e. AR(3).
last_price = df_one['Close'].iloc[-1]
forecast_prices = last_price * (1 + forecast / 100).cumprod()  # returns are in percent
print("\nForecasted prices:")
print(forecast_prices)
```

**Explanation:**

- `forecast` returns point predictions for the next `steps` periods.
- To convert return forecasts back to prices, we compound them with a cumulative product, starting from the last observed price. Summing the returns instead would ignore compounding. The result is a path of point forecasts only; to obtain a distribution of future prices, simulate many return paths from the fitted model.
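The difference between summing and compounding percentage returns is easy to see with a few made-up numbers. In this small sketch, both helper functions and the return values are hypothetical.

```python
def prices_from_returns(last_price, pct_returns):
    """Compound percentage returns forward from the last observed price."""
    prices = []
    price = last_price
    for r in pct_returns:
        price *= 1 + r / 100
        prices.append(price)
    return prices

def prices_from_summed_returns(last_price, pct_returns):
    """Additive approximation: last_price * (1 + cumulative sum of returns)."""
    prices, total = [], 0.0
    for r in pct_returns:
        total += r / 100
        prices.append(last_price * (1 + total))
    return prices

demo_returns = [1.0, -2.0, 0.5]  # made-up daily returns in percent
compounded = prices_from_returns(100.0, demo_returns)
approximate = prices_from_summed_returns(100.0, demo_returns)
for day, (exact, approx) in enumerate(zip(compounded, approximate), start=1):
    print(f"day {day}: compounded = {exact:.4f}, summed = {approx:.4f}")
```

The two paths agree on the first day and drift apart afterwards. The gap is small for small returns over short horizons and grows with the horizon and the size of the moves.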

---

## **21.3 Moving Average Models (MA)**

A moving average model of order q, MA(q), expresses the current value as a linear combination of current and past error terms.

**Mathematically:**  
`y_t = μ + ε_t + θ₁ ε_{t-1} + θ₂ ε_{t-2} + ... + θ_q ε_{t-q}`  
where `ε_t` is white noise.

MA models are useful for series that are influenced by random shocks that persist for a few periods (e.g., news events affecting stock prices for a couple of days).

### **21.3.1 Identifying MA Order with ACF**

For an MA(q) process, the ACF cuts off after lag q, while the PACF decays gradually. This is the opposite of AR.

```python
# Plot ACF and PACF again; if ACF cuts off at lag q, consider MA(q)
# For financial returns, often ACF shows no significant lags (white noise) or a small spike at lag 1.
```
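As a rough check of the ACF cutoff, the sketch below simulates an MA(1) process and compares the sample autocorrelation at lag 1 with the theoretical value θ₁ / (1 + θ₁²). The series is synthetic, θ₁ = 0.5 is an arbitrary illustrative value, and `sample_acf` is a hypothetical helper.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x ** 2)
    return np.array([np.sum(x[k:] * x[:len(x) - k]) / denom
                     for k in range(max_lag + 1)])

theta = 0.5
n = 5000
rng = np.random.default_rng(21)
eps = rng.standard_normal(n + 1)
y_ma1 = eps[1:] + theta * eps[:-1]   # y_t = eps_t + theta * eps_{t-1}

acf_ma1 = sample_acf(y_ma1, max_lag=5)
rho1_theory = theta / (1 + theta ** 2)
print(f"theoretical rho_1 = {rho1_theory:.3f}")
for lag in range(1, 6):
    print(f"lag {lag}: sample ACF = {acf_ma1[lag]:+.3f}")
```

Only lag 1 should stand out; the autocorrelations at lags 2 and beyond should be close to zero, which is the signature used to identify an MA(1).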

### **21.3.2 Fitting an MA Model**

In `statsmodels`, MA models are part of the ARIMA framework. We can fit a pure MA model using `ARIMA` with p=0, d=0.

```python
from statsmodels.tsa.arima.model import ARIMA

# MA(1) model
model_ma1 = ARIMA(returns, order=(0,0,1))  # (p,d,q) = (0,0,1)
result_ma1 = model_ma1.fit()
print(result_ma1.summary())
```

**Explanation:**

- `order=(0,0,1)` specifies AR=0, differencing=0, MA=1.
- The summary includes the MA coefficient (ma.L1) and its significance.

---

## **21.4 ARMA Models**

Combining AR and MA terms gives the ARMA(p,q) model, which is often more parsimonious (fewer parameters) than a pure AR or MA of high order.

**Mathematically:**  
`y_t = c + φ₁ y_{t-1} + ... + φ_p y_{t-p} + ε_t + θ₁ ε_{t-1} + ... + θ_q ε_{t-q}`

For stationary series (no trend), ARMA is appropriate. For financial returns, ARMA(1,1) is a common candidate.

### **21.4.1 Fitting ARMA Models**

```python
# Try ARMA(1,1)
model_arma11 = ARIMA(returns, order=(1,0,1))
result_arma11 = model_arma11.fit()
print(result_arma11.summary())
```

**Explanation:**

- The model includes both AR(1) and MA(1) terms. Often one of them may be insignificant; we can drop it if so.
- Compare AIC with pure AR or MA models to select the best.

---

## **21.5 ARIMA Models**

ARIMA (Autoregressive Integrated Moving Average) extends ARMA to non‑stationary series by including a differencing step. The "I" stands for integrated, meaning we difference the series d times to make it stationary.

**Notation:** ARIMA(p,d,q)  
- p = order of autoregressive part  
- d = degree of differencing  
- q = order of moving average part

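For stock prices, which are typically non‑stationary (they trend), we often set d=1 so that the model works with first differences. An ARIMA(p,1,q) on prices is an ARMA(p,q) on day‑to‑day price changes; an ARIMA(p,1,q) on log prices is an ARMA(p,q) on log returns, which are close to percentage returns when daily moves are small.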
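The relationship between differencing and returns can be verified on a synthetic price path. The sketch below builds prices from simulated log returns and then differences them in two ways. All numbers are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
log_returns = rng.normal(loc=0.0005, scale=0.01, size=250)  # synthetic daily log returns
prices_sim = 1000.0 * np.exp(np.cumsum(log_returns))        # integrate to get a price path

price_changes = np.diff(prices_sim)                  # what d=1 on raw prices models
log_diff = np.diff(np.log(prices_sim))               # what d=1 on log prices models
simple_returns = prices_sim[1:] / prices_sim[:-1] - 1

# Differencing log prices recovers the log returns exactly
print(np.allclose(log_diff, log_returns[1:]))
# Log returns and simple returns are close for small daily moves
print(np.max(np.abs(log_diff - simple_returns)))
# Integration (cumulative sum) undoes the differencing
rebuilt = prices_sim[0] * np.exp(np.cumsum(log_diff))
print(np.allclose(rebuilt, prices_sim[1:]))
```

Raw price changes are measured in currency units and scale with the price level, which is one reason log prices or returns are often preferred for modelling.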

### **21.5.1 Integration (I) and Stationarity**

We must check if the series is stationary. For NEPSE closing prices, an Augmented Dickey‑Fuller test will likely fail to reject the null of non‑stationarity. Differencing once usually makes it stationary.

```python
from statsmodels.tsa.stattools import adfuller

# Test on prices
prices = df_one['Close'].dropna()
adf_result = adfuller(prices)
print(f"Prices ADF p-value: {adf_result[1]:.4f}")

# Test on returns
adf_result_ret = adfuller(returns.dropna())
print(f"Returns ADF p-value: {adf_result_ret[1]:.4f}")
```

**Explanation:**

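- The null hypothesis of the ADF test is that the series has a unit root (is non‑stationary). A p‑value < 0.05 rejects that null, which is evidence of stationarity. Prices usually fail to reject (non‑stationary); returns usually reject (stationary).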
- Therefore, for price forecasting, we would use d=1. For return forecasting, d=0.
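The idea behind the test can be sketched with the simplest Dickey-Fuller regression, `Δy_t = α + γ y_{t-1} + e_t`, where a strongly negative t-statistic on γ speaks against a unit root. This is a simplified illustration on synthetic data: `dickey_fuller_tstat` is a hypothetical helper that omits the lagged differences `adfuller` adds, and the statistic must be compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level for this specification in large samples), not with the usual t-table.

```python
import numpy as np

def dickey_fuller_tstat(y):
    """t-statistic on gamma in  dy_t = alpha + gamma * y_{t-1} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    sigma2 = resid @ resid / (len(dy) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS coefficient covariance
    return coef[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(3)
shocks = rng.standard_normal(1000)
random_walk = np.cumsum(shocks)   # has a unit root, like a price series
white_noise = shocks              # stationary, like returns

t_rw = dickey_fuller_tstat(random_walk)
t_wn = dickey_fuller_tstat(white_noise)
print(f"random walk t-stat: {t_rw:.2f}")
print(f"white noise t-stat: {t_wn:.2f}")
```

The stationary series should produce a far more negative statistic than the random walk, mirroring the contrast between returns and prices.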

### **21.5.2 Parameter Selection**

Selecting p and q for ARIMA on returns is the same as for ARMA. We can use ACF/PACF of the differenced series (returns) to guide initial choices, then compare AIC.

```python
# ACF/PACF of returns already plotted; suggest possible p,q
# For example, if PACF cuts at 1 and ACF decays, maybe AR(1). If ACF cuts at 1 and PACF decays, maybe MA(1).

# Fit candidate models and compare AIC
candidates = [(1,0,0), (0,0,1), (1,0,1), (2,0,1), (1,0,2)]
results = []
for order in candidates:
    model = ARIMA(returns, order=order)
    result = model.fit()
    results.append({'order': order, 'AIC': result.aic})
    
results_df = pd.DataFrame(results).sort_values('AIC')
print(results_df)
```

**Explanation:**

- We loop over several plausible orders and pick the one with the smallest AIC.
- Note: This is on returns (d=0). If we wanted to model prices directly, we would use order (p,1,q) and fit on prices.

### **21.5.3 Model Fitting and Interpretation**

Let's fit the best model (lowest AIC) and examine its coefficients.

```python
best_order = results_df.iloc[0]['order']
best_model = ARIMA(returns, order=best_order)
best_result = best_model.fit()
print(best_result.summary())
```

**Explanation:**

- The summary shows coefficients, standard errors, and p‑values. If a coefficient is not significant (p > 0.05), we might consider dropping that term.
- Also check the Ljung‑Box test on residuals to ensure no autocorrelation remains (see diagnostics).

### **21.5.4 Forecasting with ARIMA**

```python
# Forecast next 5 returns
forecast_returns = best_result.forecast(steps=5)
print(forecast_returns)

# Convert to a price forecast if needed by compounding the returns
last_price = df_one['Close'].iloc[-1]
price_forecast = last_price * (1 + forecast_returns / 100).cumprod()  # returns are in percent
print(price_forecast)
```

**Explanation:**

- `forecast()` returns point predictions only. To get confidence intervals as well, call `get_forecast(steps=5)` and use its `predicted_mean` and `conf_int()`.
- For price forecasting, we can also fit ARIMA directly on prices with d=1.
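To see where multi-step forecasts and their intervals come from, consider the AR(1) special case, for which closed forms exist: the h-step forecast is `μ + φ^h (y_T − μ)` with variance `σ² (1 − φ^{2h}) / (1 − φ²)`, where `μ = c / (1 − φ)`. The sketch below evaluates these formulas. The function `ar1_forecast` is hypothetical, the parameter values are illustrative, and the intervals ignore parameter uncertainty.

```python
import math

def ar1_forecast(last_value, c, phi, sigma, steps, z=1.96):
    """h-step-ahead AR(1) point forecasts with approximate 95% intervals."""
    mu = c / (1 - phi)  # unconditional mean of the process
    out = []
    for h in range(1, steps + 1):
        mean = mu + phi ** h * (last_value - mu)
        var = sigma ** 2 * (1 - phi ** (2 * h)) / (1 - phi ** 2)
        half = z * math.sqrt(var)
        out.append((mean, mean - half, mean + half))
    return out

ar1_path = ar1_forecast(last_value=2.0, c=0.1, phi=0.5, sigma=1.0, steps=5)
for h, (mean, lower, upper) in enumerate(ar1_path, start=1):
    print(f"h={h}: forecast = {mean:.4f}, interval = [{lower:.3f}, {upper:.3f}]")
```

The point forecast decays geometrically toward the long-run mean while the interval widens toward the unconditional spread, which is why ARMA forecasts of returns carry little information beyond a few steps.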

---

## **21.6 SARIMA Models (Seasonal ARIMA)**

If the series exhibits seasonality (e.g., weekly, monthly, yearly patterns), we need a seasonal ARIMA model, denoted SARIMA(p,d,q)(P,D,Q)s, where s is the seasonal period.

For NEPSE, we might see weekly patterns (e.g., a day‑of‑the‑week effect) or monthly patterns (e.g., an end‑of‑month effect). The seasonal component captures these.

### **21.6.1 Identifying Seasonality**

Plot the series and look for repeating patterns. Also, examine the ACF: significant spikes at seasonal lags (e.g., lag 5 for weekly, lag 20 for monthly) suggest seasonality.

```python
# For daily data with potential weekly seasonality (5 trading days), we might check ACF at lags 5,10,...
# We'll use returns data
plot_acf(returns, lags=40)
plt.show()
```

**Explanation:**

- If we see a spike at lag 5 (one week) and possibly at lag 10 (two weeks), that indicates weekly seasonality.
- However, financial returns often have weak seasonality; it's more common in volumes or volatility.
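For reference, this is what a clear weekly pattern looks like in the ACF. The sketch adds a fixed, made-up day-of-week effect to white noise and computes the sample autocorrelations. The pattern is deliberately exaggerated; real return series rarely show anything this strong.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.sum(x ** 2)
    return np.array([np.sum(x[k:] * x[:len(x) - k]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(4)
pattern = np.array([2.0, -1.0, 0.0, 1.0, -2.0])  # hypothetical effect per trading day
n_weeks = 400
seasonal_series = np.tile(pattern, n_weeks) + rng.standard_normal(5 * n_weeks)

acf_seasonal = sample_acf(seasonal_series, max_lag=10)
for lag in range(1, 11):
    marker = "  <- seasonal lag" if lag % 5 == 0 else ""
    print(f"lag {lag:2d}: {acf_seasonal[lag]:+.3f}{marker}")
```

The spikes at lags 5 and 10 repeat with the period of the pattern, which is the visual cue that suggests a seasonal term with s=5.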

### **21.6.2 Fitting a SARIMA Model**

We use `SARIMAX` from `statsmodels` (which can also handle exogenous variables).

```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Example: SARIMA(1,0,1)(1,0,1,5) - weekly seasonality
seasonal_order = (1,0,1,5)  # (P,D,Q,s)
model_sarima = SARIMAX(returns, order=(1,0,1), seasonal_order=seasonal_order)
result_sarima = model_sarima.fit(disp=False)
print(result_sarima.summary())
```

**Explanation:**

- The `seasonal_order` tuple specifies the seasonal AR, differencing, MA, and the period s.
- For NEPSE, s=5 for weekly (trading days). If we had monthly data, s=12. For daily with monthly effects, s could be 20 (approx trading days per month).
- The model can become quite complex; we must ensure we have enough data to estimate all parameters.
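One way to see what the seasonal terms add is to multiply out the lag polynomials. For a multiplicative model with one non-seasonal and one seasonal AR term, the AR side is `(1 − φB)(1 − ΦB^s)`. The sketch below expands this product for illustrative values of φ and Φ (not estimates from any data).

```python
import numpy as np

phi, Phi, s = 0.5, 0.3, 5  # illustrative values

# Coefficients in ascending powers of the lag operator B
nonseasonal = np.array([1.0, -phi])        # 1 - phi*B
seasonal = np.zeros(s + 1)
seasonal[0], seasonal[s] = 1.0, -Phi       # 1 - Phi*B^s

combined = np.convolve(nonseasonal, seasonal)  # product of the two polynomials
for power, coef in enumerate(combined):
    if coef != 0:
        print(f"B^{power}: {coef:+.2f}")
```

The product has terms at lags 1, 5, and 6, so two parameters generate dependence on three past values. The lag-6 coefficient is implied by the other two and is not estimated separately.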

### **21.6.3 Model Selection for SARIMA**

Because the parameter space is large, we often use automated search (e.g., `pmdarima` library) to find the best SARIMA order.

```python
# Install pmdarima if needed: pip install pmdarima
import pmdarima as pm

# Auto-arima to find best SARIMA model
auto_model = pm.auto_arima(returns, seasonal=True, m=5,  # m=5 for weekly
                            start_p=0, start_q=0, max_p=3, max_q=3,
                            start_P=0, start_Q=0, max_P=2, max_Q=2,
                            trace=True, error_action='ignore', suppress_warnings=True)
print(auto_model.summary())
```

**Explanation:**

- `pmdarima` automatically searches over combinations of p,d,q,P,D,Q with given constraints.
- The `m` parameter is the seasonal period. For daily data with weekly seasonality, m=5 (trading days). For monthly seasonality, m=20 or m=22.
- The output includes the best model order and its AIC.

---

## **21.7 Exponential Smoothing**

Exponential smoothing methods forecast as weighted averages of past observations, with weights decaying exponentially as observations get older. They are intuitive and often perform well for series with trend and/or seasonality.

### **21.7.1 Simple Exponential Smoothing (SES)**

For series with no trend or seasonality, SES forecasts as a weighted average of all past values, with weights decaying exponentially. The smoothing parameter α (0<α<1) controls the decay.

**Model:**  
`ŷ_{t+1} = α y_t + (1-α) ŷ_t`

In Python, we can use `SimpleExpSmoothing` from `statsmodels`.

```python
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

# Fit SES on returns with a fixed alpha (returns rarely need smoothing; this is a demo)
ses_model = SimpleExpSmoothing(returns).fit(smoothing_level=0.2, optimized=False)
# or let it optimize alpha:
ses_model_opt = SimpleExpSmoothing(returns).fit()
print(f"Optimized alpha: {ses_model_opt.params['smoothing_level']:.4f}")

# Forecast next 5
forecast_ses = ses_model_opt.forecast(5)
print(forecast_ses)
```

**Explanation:**

- Passing `smoothing_level=0.2` with `optimized=False` fixes alpha at the given value; with `optimized=True` (the default), `fit()` estimates alpha by minimizing the sum of squared one‑step errors (SSE).
- SES is rarely used for financial returns because returns are often unpredictable (white noise), but it can be used for volatility or other smoother series.
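The SES recursion itself takes only a few lines. The sketch below applies `ŷ_{t+1} = α y_t + (1 − α) ŷ_t` directly, initializing the first forecast with the first observation. That initialization is an assumption made here for simplicity; `statsmodels` offers other initialization schemes, so its numbers can differ slightly. The function `ses_forecast` and the data are hypothetical.

```python
def ses_forecast(y, alpha, initial=None):
    """One-step-ahead SES forecast after processing every observation in y."""
    forecast = y[0] if initial is None else initial
    for obs in y:
        # New forecast = weighted average of the latest value and the old forecast
        forecast = alpha * obs + (1 - alpha) * forecast
    return forecast

demo = [10.0, 12.0, 11.0]  # made-up observations
ses_half = ses_forecast(demo, alpha=0.5)
ses_one = ses_forecast(demo, alpha=1.0)   # alpha = 1 reproduces the last observation
print(ses_half, ses_one)
```

A larger alpha reacts faster to recent observations, while a smaller alpha averages over a longer history. SES forecasts are flat: every horizon gets the same value.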

### **21.7.2 Double Exponential Smoothing (Holt's Method)**

Adds a trend component. Suitable for series with trend but no seasonality.

**Model:** level and trend equations.

```python
from statsmodels.tsa.holtwinters import Holt

holt_model = Holt(returns).fit()
print(holt_model.summary())
forecast_holt = holt_model.forecast(5)
```

**Explanation:**

- Holt's method estimates a level and a trend, both smoothed exponentially.
- Returns fluctuate around a roughly constant mean near zero, so a trend component is usually unnecessary for them; for prices, Holt's method can capture a local trend.
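The level and trend equations of Holt's linear method can be written out explicitly: `l_t = α y_t + (1 − α)(l_{t-1} + b_{t-1})`, `b_t = β (l_t − l_{t-1}) + (1 − β) b_{t-1}`, with the h-step forecast `l_T + h b_T`. The sketch below is a minimal implementation with a simple initialization (first value as level, first difference as trend), which is an assumption and not necessarily what `statsmodels` does. The function name `holt_forecast` is hypothetical.

```python
def holt_forecast(y, alpha, beta, steps):
    """Holt's linear trend method with a simple two-point initialization."""
    level, trend = y[0], y[1] - y[0]
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Forecasts extrapolate the final level along the final trend
    return [level + h * trend for h in range(1, steps + 1)]

line = [2.0 * t + 1.0 for t in range(10)]  # exactly linear series: 1, 3, 5, ...
holt_path = holt_forecast(line, alpha=0.3, beta=0.1, steps=3)
print(holt_path)
```

On an exactly linear series the method continues the line regardless of the smoothing parameters, which makes a convenient sanity check before applying it to noisy data.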

### **21.7.3 Triple Exponential Smoothing (Holt‑Winters)**

Adds a seasonal component. Suitable for series with both trend and seasonality.

**Model:** level, trend, and seasonal components.

```python
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# For daily returns, if we believe weekly seasonality exists
hw_model = ExponentialSmoothing(returns, seasonal_periods=5, trend='add', seasonal='add').fit()
forecast_hw = hw_model.forecast(5)
```

**Explanation:**

- `seasonal_periods` is the number of periods in a season (e.g., 5 for weekly trading days).
- `trend` and `seasonal` can be 'add' or 'mul' for additive or multiplicative components.
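- Multiplicative components require strictly positive data, so they cannot be used on returns (which take negative values); additive components are the usual choice there. For prices, a multiplicative form is an option when seasonal swings scale with the price level.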

---

## **21.8 State Space Models**

State space models provide a unified framework for many time‑series models, including ARIMA and exponential smoothing. They represent the observed series as a function of an unobserved state vector that evolves over time.

In `statsmodels`, `SARIMAX` is actually a state space model. Also, `UnobservedComponents` allows flexible specification of trend, seasonality, and cycle.

```python
from statsmodels.tsa.statespace.structural import UnobservedComponents

# Model with a local level (random walk) and a seasonal component
ss_model = UnobservedComponents(returns, level='local level', seasonal=5)
ss_result = ss_model.fit()
print(ss_result.summary())
```

**Explanation:**

- This is a sophisticated approach that can capture many features. It's beyond the scope of this chapter, but worth knowing that such tools exist.
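For intuition about what happens inside such a model, the following sketch runs a Kalman filter for the simplest case, the local level model `y_t = μ_t + ε_t`, `μ_{t+1} = μ_t + η_t`. The variances are supplied by hand here, whereas a library fit would estimate them by maximum likelihood. The function `local_level_filter` and the large initial variance `p0` (standing in for a diffuse prior) are assumptions made for illustration.

```python
def local_level_filter(y, sigma2_eps, sigma2_eta, a0=0.0, p0=1e6):
    """Kalman filter for y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t."""
    a, p = a0, p0            # state estimate and its variance
    filtered = []
    for obs in y:
        f = p + sigma2_eps   # variance of the one-step prediction error
        k = p / f            # Kalman gain: weight given to the new observation
        a = a + k * (obs - a)
        p = p * (1 - k)      # variance after seeing the observation
        filtered.append(a)
        p = p + sigma2_eta   # the level may drift before the next observation
    return filtered

constant_levels = local_level_filter([5.0] * 20, sigma2_eps=1.0, sigma2_eta=0.1)
mean_levels = local_level_filter([1.0, 3.0], sigma2_eps=1.0, sigma2_eta=0.0)
print(constant_levels[-1], mean_levels[-1])
```

With `sigma2_eta=0` the level never moves, so the filter reduces to a running mean. A larger `sigma2_eta` makes the filtered level track recent observations more closely, which parallels the role of alpha in exponential smoothing.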

---

## **21.9 Vector Autoregression (VAR)**

When we have multiple related time series (e.g., several stocks, or a stock and an index), we might want to model them jointly. VAR models each variable as a linear function of past values of itself and past values of all other variables.

**Mathematically:**  
`Y_t = c + A₁ Y_{t-1} + ... + A_p Y_{t-p} + ε_t`  
where `Y_t` is a vector of variables, `A_i` are coefficient matrices.

### **21.9.1 When to Use VAR**

- To capture interdependencies (e.g., how NEPSE bank stocks influence each other).
- For forecasting multiple series simultaneously.
- For Granger causality testing (whether one series helps predict another).
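A VAR(1) is just one linear regression per variable on the lagged values of all variables. The sketch below simulates a two-variable VAR(1) with a made-up coefficient matrix and recovers it by least squares. It uses synthetic data and illustrates the structure only; it is not a substitute for the `statsmodels` estimator.

```python
import numpy as np

# Illustrative coefficient matrix: row i holds the equation for variable i
A_true = np.array([[0.5, 0.1],
                   [0.2, 0.3]])

rng = np.random.default_rng(5)
n = 5000
Y = np.zeros((n, 2))
for t in range(1, n):
    Y[t] = A_true @ Y[t - 1] + rng.standard_normal(2)

# Regress Y_t on a constant and Y_{t-1}; each column of coef is one equation
X = np.column_stack([np.ones(n - 1), Y[:-1]])
coef, *_ = np.linalg.lstsq(X, Y[1:], rcond=None)   # shape (3, 2)
const_hat = coef[0]
A_hat = coef[1:].T

print("Estimated A:")
print(np.round(A_hat, 3))
```

The off-diagonal entries are the cross-effects: `A_hat[0, 1]` measures how the lagged second variable feeds into the first. Testing whether such entries are zero is the idea behind Granger causality.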

### **21.9.2 Fitting a VAR Model**

We'll select a few stocks from the NEPSE dataset.

```python
from statsmodels.tsa.api import VAR

# Select a few symbols
symbols = df['Symbol'].unique()[:3]  # first 3 symbols
dfs = []
for sym in symbols:
    sym_df = df[df['Symbol'] == sym].set_index('Date')['Close'].rename(sym)
    dfs.append(sym_df)

# Combine into a single DataFrame (may have missing dates; we'll inner join)
multi = pd.concat(dfs, axis=1).dropna()
returns_multi = multi.pct_change().dropna() * 100

# Fit VAR model (select lag order by AIC)
var_model = VAR(returns_multi)
lag_order = var_model.select_order(maxlags=10)
print(lag_order.summary())

# Fit with the AIC-selected lag; use at least 1 lag so the model has dynamics to forecast with
results_var = var_model.fit(max(lag_order.aic, 1))
print(results_var.summary())
```

**Explanation:**

- `select_order` computes information criteria for different lags. We choose the lag that minimizes AIC.
- `fit` estimates the model. The summary shows coefficients for each equation (each stock's return as a function of lagged returns of all stocks).
- Forecasting with VAR uses the `forecast` method, which takes the last `p` observations as input.

### **21.9.3 Forecasting with VAR**

```python
# Forecast next 5 days
p = results_var.k_ar                      # number of lags in the fitted model
last_obs = returns_multi.values[-p:]      # the most recent p rows seed the forecast
forecast_var = results_var.forecast(y=last_obs, steps=5)
print(forecast_var)
```

**Explanation:**

- `forecast` returns an array of shape (steps, n_variables).
- We can then convert to price forecasts for each stock.

---

## **21.10 Model Diagnostics**

After fitting any statistical model, we must check whether the residuals resemble white noise. If they don't, the model is misspecified.

### **21.10.1 Residual Analysis**

- Plot residuals over time: should look random, no pattern.
- ACF of residuals: should show no significant autocorrelation.
- Ljung‑Box test: tests whether any group of autocorrelations is significantly different from zero.

```python
from statsmodels.stats.diagnostic import acorr_ljungbox

residuals = best_result.resid

# Plot residuals
plt.figure(figsize=(12,4))
plt.plot(residuals)
plt.title('Residuals of ARIMA Model')
plt.show()

# ACF of residuals
plot_acf(residuals, lags=30)
plt.show()

# Ljung-Box test
lb_test = acorr_ljungbox(residuals, lags=[10, 20, 30], return_df=True)
print(lb_test)
```

**Explanation:**

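- For a good model, the Ljung‑Box p‑values should be > 0.05, meaning we cannot reject the null hypothesis of no autocorrelation up to each tested lag. When testing residuals from a fitted ARMA(p,q) model, pass `model_df=p+q` to `acorr_ljungbox` so that the degrees of freedom account for the estimated parameters.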
- If residuals show autocorrelation, we need to increase the model order or consider seasonality.
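The Ljung-Box statistic is simple to compute by hand: `Q = n (n + 2) Σ_{k=1}^{m} ρ̂_k² / (n − k)`, compared against a chi-square distribution with m degrees of freedom (fewer when applied to model residuals). The sketch below computes Q for synthetic white noise and for a synthetic autocorrelated series. The helper `ljung_box_q` is hypothetical, and 18.307 is the 95% chi-square critical value for 10 degrees of freedom.

```python
import numpy as np

def ljung_box_q(x, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    denom = np.sum(x ** 2)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = np.sum(x[k:] * x[:-k]) / denom   # sample autocorrelation at lag k
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

rng = np.random.default_rng(6)
noise = rng.standard_normal(1000)
correlated = np.zeros(1000)
for t in range(1, 1000):
    correlated[t] = 0.5 * correlated[t - 1] + noise[t]   # AR(1) with phi = 0.5

CHI2_95_DF10 = 18.307
q_noise = ljung_box_q(noise, max_lag=10)
q_corr = ljung_box_q(correlated, max_lag=10)
print(f"white noise Q(10) = {q_noise:.1f}")
print(f"AR(1) series Q(10) = {q_corr:.1f} (critical value {CHI2_95_DF10})")
```

A Q value far above the critical value, as for the autocorrelated series, is what leftover structure in the residuals of an inadequate model looks like.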

### **21.10.2 Normality of Residuals (Optional)**

Financial returns often have fat tails, so normality is not required, but extreme non‑normality may indicate outliers or model inadequacy. We can check with a Q‑Q plot.

```python
import scipy.stats as stats

stats.probplot(residuals, dist="norm", plot=plt)
plt.show()
```

### **21.10.3 Out‑of‑Sample Validation**

Ultimately, the best diagnostic is out‑of‑sample forecasting performance. We should use the splitting techniques from Chapter 20 to evaluate the model on unseen data.
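One common scheme is an expanding-window walk-forward evaluation: at each test date, refit on all data before that date, forecast one step, and record the error. The sketch below applies this to a synthetic AR(1) series and compares an AR(1) forecast with a naive historical-mean benchmark. The data, the split sizes, and the choice of benchmark are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n, phi_true = 600, 0.6
series = np.zeros(n)
for t in range(1, n):
    series[t] = phi_true * series[t - 1] + rng.standard_normal()

n_test = 200
err_ar, err_mean = [], []
for t in range(n - n_test, n):
    history = series[:t]                          # only data available before time t
    X = np.column_stack([np.ones(t - 1), history[:-1]])
    coef, *_ = np.linalg.lstsq(X, history[1:], rcond=None)
    pred_ar = coef[0] + coef[1] * history[-1]     # one-step AR(1) forecast
    pred_mean = history.mean()                    # naive benchmark
    err_ar.append(series[t] - pred_ar)
    err_mean.append(series[t] - pred_mean)

rmse_ar = np.sqrt(np.mean(np.square(err_ar)))
rmse_mean = np.sqrt(np.mean(np.square(err_mean)))
print(f"AR(1) RMSE: {rmse_ar:.3f}")
print(f"Mean  RMSE: {rmse_mean:.3f}")
```

Because every forecast uses only past observations, the resulting errors are an honest estimate of out-of-sample performance. On real return series the gap between the model and the naive benchmark is typically much smaller than in this simulated example.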

---

## **21.11 Strengths and Limitations**

### **Strengths of Statistical Models**

- **Interpretability:** Coefficients have clear meanings (e.g., φ₁ measures persistence).
- **Theoretical foundation:** Well‑understood properties, confidence intervals for forecasts.
- **Parsimony:** Often require few parameters, reducing overfitting risk.
- **Benchmarking:** Serve as natural baselines for ML models.
- **Handling of uncertainty:** Provide prediction intervals naturally (via model assumptions).

### **Limitations**

- **Linearity:** Assume linear relationships, which may not hold in financial markets.
- **Stationarity:** Require series to be stationary (or differenced), which may discard long‑term information.
- **Univariate focus:** Basic ARIMA models don't incorporate external regressors easily (though ARIMAX does).
- **Limited flexibility:** Cannot capture complex non‑linear patterns or interactions.
- **Sensitivity to outliers:** Parameter estimates can be distorted by extreme events.

### **Comparison with Machine Learning Models**

| Aspect | Statistical Models | Machine Learning Models |
|--------|-------------------|------------------------|
| Interpretability | High | Low (black box) |
| Data requirements | Moderate | High |
| Handling non‑linearity | Poor | Excellent |
| Feature engineering | Minimal | Extensive |
| Uncertainty quantification | Natural | Requires special methods |
| Computational cost | Low | High |
| Performance on simple series | Good | May overfit |

For the NEPSE system, we might start with a statistical model as a baseline, then see if ML models can improve upon it.

---

## **21.12 Implementation on NEPSE Data: Step‑by‑Step**

Let's walk through a complete example of building an ARIMA model for a single NEPSE stock to forecast its closing price over the next five trading days.

### **21.12.1 Data Preparation**

```python
# Load and prepare data for a specific symbol
symbol = "NEPSE"  # placeholder: replace with a value that actually appears in df['Symbol']
df_stock = df[df['Symbol'] == symbol].copy()
df_stock = df_stock.set_index('Date').sort_index()

# Use closing price
prices = df_stock['Close'].dropna()

# Plot
plt.figure(figsize=(12,4))
plt.plot(prices)
plt.title(f'{symbol} Closing Price')
plt.show()
```

### **21.12.2 Stationarity Check and Differencing**

```python
# ADF test on prices
adf_p = adfuller(prices)
print(f"Prices ADF p-value: {adf_p[1]:.4f}")

# If p > 0.05, difference once
if adf_p[1] > 0.05:
    prices_diff = prices.diff().dropna()
    # Test again
    adf_diff = adfuller(prices_diff)
    print(f"1st difference ADF p-value: {adf_diff[1]:.4f}")
    d = 1
else:
    prices_diff = prices
    d = 0
```

### **21.12.3 Identify AR and MA Orders**

```python
# Plot ACF and PACF of differenced series
fig, (ax1, ax2) = plt.subplots(2,1, figsize=(12,8))
plot_acf(prices_diff, lags=30, ax=ax1)
plot_pacf(prices_diff, lags=30, ax=ax2)
plt.show()
```

### **21.12.4 Fit Candidate ARIMA Models**

```python
# Based on plots, try a few orders
orders = [(1,d,0), (0,d,1), (1,d,1), (2,d,1), (1,d,2)]
results = []
for order in orders:
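    try:
        model = ARIMA(prices, order=order)
        fit = model.fit()
        results.append({'order': order, 'AIC': fit.aic})
    except Exception as exc:
        # Skip orders that fail to estimate, but report them instead of hiding the error
        print(f"ARIMA{order} failed: {exc}")
        continue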

results_df = pd.DataFrame(results).sort_values('AIC')
print(results_df)
```

### **21.12.5 Choose Best Model and Diagnose**

```python
best_order = results_df.iloc[0]['order']
best_model = ARIMA(prices, order=best_order)
best_fit = best_model.fit()
print(best_fit.summary())

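# Residual diagnostics: skip the first d residuals, which are start-up artifacts
# of differencing (on the scale of the price itself), not genuine one-step errors
resid = best_fit.resid.iloc[d:]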
plot_acf(resid, lags=30)
plt.show()

lb_test = acorr_ljungbox(resid, lags=[10,20], return_df=True)
print(lb_test)
```

### **21.12.6 Forecasting**

```python
# Forecast next 5 days
forecast_result = best_fit.get_forecast(steps=5)
forecast_mean = forecast_result.predicted_mean
forecast_ci = forecast_result.conf_int()

# Plot historical and forecast
plt.figure(figsize=(12,4))
plt.plot(prices.index[-100:], prices.values[-100:], label='Historical')
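# Assumes NEPSE's Sunday-Thursday trading week; public holidays are not handled
nepse_day = pd.offsets.CustomBusinessDay(weekmask='Sun Mon Tue Wed Thu')
forecast_index = pd.date_range(start=prices.index[-1] + nepse_day, periods=5, freq=nepse_day)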
plt.plot(forecast_index, forecast_mean, label='Forecast', color='red')
plt.fill_between(forecast_index, forecast_ci.iloc[:,0], forecast_ci.iloc[:,1], color='red', alpha=0.2)
plt.legend()
plt.title(f'{symbol} Price Forecast')
plt.show()
```

---

## **21.13 Chapter Summary**

In this chapter, we covered the essential traditional statistical models for time‑series forecasting, with applications to the NEPSE dataset.

- **AR, MA, ARMA, ARIMA** models form the core of the Box‑Jenkins approach. We learned how to identify orders using ACF/PACF, fit models, and diagnose residuals.
- **SARIMA** extends ARIMA to handle seasonality, which may be present in daily trading data.
- **Exponential smoothing** methods (Holt‑Winters) provide an intuitive alternative for series with trend and seasonality.
- **Vector Autoregression (VAR)** models multivariate time series, capturing interdependencies among multiple stocks.
- **Model diagnostics** (Ljung‑Box test, residual ACF) are crucial to ensure model adequacy.
- We compared strengths and limitations of statistical models with machine learning, highlighting that statistical models are excellent baselines.

### **Practical Takeaways for the NEPSE System:**

- Always start with a simple statistical model as a baseline (e.g., ARIMA on returns).
- Use AIC to guide model selection, but also validate out‑of‑sample.
- Check residuals carefully; if patterns remain, consider more complex models or ML.
- For multivariate forecasting (e.g., multiple stocks), VAR can be a powerful tool.
- Statistical models provide interpretable insights into market dynamics (e.g., persistence, mean reversion).

In the next chapter, **Chapter 22: Tree‑Based Models**, we will explore how decision trees, random forests, and gradient boosting can be applied to the same forecasting problems, often yielding higher accuracy at the cost of interpretability.

---

**End of Chapter 21**