[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/danpele/Time-Series-Analysis/blob/main/chapter2_seminar_notebook.ipynb)

---

# Chapter 2: Seminar - ARMA Models Exercises

**Course:** Time Series Analysis and Forecasting  
**Program:** Bachelor program, Faculty of Cybernetics, Statistics and Economic Informatics, Bucharest University of Economic Studies, Romania  
**Academic Year:** 2025-2026

---

## Seminar Objectives

In this seminar, you will:
1. Practice working with lag operators and backshift notation
2. Calculate AR and MA process properties (mean, variance, autocovariance)
3. Identify ARMA models from ACF/PACF patterns
4. Fit and diagnose ARMA models in Python
5. Apply the Box-Jenkins methodology to real data

## Setup

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

import yfinance as yf
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import adfuller, acf, pacf
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.stats.diagnostic import acorr_ljungbox
from scipy import stats

# Plotting style - clean, professional
plt.rcParams['figure.figsize'] = (12, 5)
plt.rcParams['axes.facecolor'] = 'none'  # Transparent background
plt.rcParams['figure.facecolor'] = 'none'  # Transparent figure
plt.rcParams['savefig.facecolor'] = 'none'
plt.rcParams['axes.grid'] = False  # No grid
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False

# Colors
BLUE = '#1A3A6E'
RED = '#DC3545'
GREEN = '#2E7D32'

np.random.seed(42)
print("Setup complete!")

---
# Part 1: Multiple Choice Quiz

Enter your answer in the code cell below each question, then run the cell to check it.

### Quiz 1: Lag Operator

**Question:** What is the result of applying $(1-L)^2$ to $X_t$?

- A) $X_t - X_{t-1}$
- B) $X_t - 2X_{t-1} + X_{t-2}$
- C) $X_t + X_{t-1} + X_{t-2}$
- D) $X_t - X_{t-2}$

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz1_answer = ''  # <-- Enter your answer here

# Check answer
if quiz1_answer.upper() == 'B':
    print("CORRECT!")
    print("(1-L)^2 = 1 - 2L + L^2")
    print("Applied to X_t: X_t - 2X_{t-1} + X_{t-2}")
    print("This is the SECOND DIFFERENCE of X_t.")
elif quiz1_answer:
    print("Incorrect. Try again!")
    print("Hint: Expand (1-L)^2 = (1-L)(1-L) using FOIL method.")
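
The expansion can also be checked numerically. The sketch below uses a short made-up series `x` (an assumption for illustration, not seminar data) and compares NumPy's second difference with the expanded form $X_t - 2X_{t-1} + X_{t-2}$.

```python
import numpy as np

# Arbitrary illustrative series (an assumption, not seminar data)
x = np.array([3.0, 5.0, 4.0, 8.0, 7.0, 11.0])

# (1 - L)^2 X_t computed by differencing twice
second_diff = np.diff(x, n=2)

# Expanded form X_t - 2*X_{t-1} + X_{t-2}, defined for t >= 2
expanded = x[2:] - 2 * x[1:-1] + x[:-2]

print("np.diff(x, n=2):", second_diff)
print("Expanded form:  ", expanded)
print("Identical:", np.allclose(second_diff, expanded))
```

Two observations are lost at the start of the series, one for each application of $(1-L)$.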

### Quiz 2: AR(1) Stationarity

**Question:** For which value of $\phi$ is the AR(1) process $X_t = 0.5 + \phi X_{t-1} + \varepsilon_t$ stationary?

- A) $\phi = 1.2$
- B) $\phi = 1.0$
- C) $\phi = -0.8$
- D) $\phi = -1.5$

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz2_answer = ''  # <-- Enter your answer here

# Check answer
if quiz2_answer.upper() == 'C':
    print("CORRECT!")
    print("AR(1) is stationary if and only if |phi| < 1.")
    print()
    print("Checking each option:")
    print("A. |1.2| = 1.2 > 1 -> Non-stationary (explosive)")
    print("B. |1.0| = 1.0 -> Non-stationary (unit root)")
    print("C. |-0.8| = 0.8 < 1 -> STATIONARY")
    print("D. |-1.5| = 1.5 > 1 -> Non-stationary (explosive)")
elif quiz2_answer:
    print("Incorrect. Try again!")
    print("Hint: The stationarity condition for AR(1) requires |phi| < 1.")
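
One way to see why $|\phi| < 1$ matters is to iterate the variance recursion $Var(X_t) = \phi^2 Var(X_{t-1}) + \sigma^2$. The sketch below assumes a fixed starting value (so the initial variance is 0) and $\sigma^2 = 1$; both are illustrative choices, not part of the question.

```python
def variance_path(phi, sigma_sq=1.0, steps=50):
    """Iterate Var(X_t) = phi^2 * Var(X_{t-1}) + sigma^2, starting from variance 0."""
    var = 0.0
    path = []
    for _ in range(steps):
        var = phi**2 * var + sigma_sq
        path.append(var)
    return path

# The four candidate values of phi from the question
for phi in (1.2, 1.0, -0.8, -1.5):
    final = variance_path(phi)[-1]
    print(f"phi = {phi:5.1f}: variance after 50 steps = {final:.4g}")
```

Only for $\phi = -0.8$ does the variance settle, at $\sigma^2/(1-\phi^2)$. With a unit root it grows linearly in $t$, and for $|\phi| > 1$ it grows geometrically.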

### Quiz 3: ACF/PACF Pattern

**Question:** You observe the following pattern:
- ACF: Significant spike at lag 1, then all within confidence bands
- PACF: Gradual exponential decay

What model is suggested?

- A) AR(1)
- B) MA(1)
- C) ARMA(1,1)
- D) White noise

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz3_answer = ''  # <-- Enter your answer here

# Check answer
if quiz3_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("Model Identification Summary:")
    print("-" * 40)
    print("Model    | ACF          | PACF")
    print("-" * 40)
    print("AR(p)    | Decays       | Cuts off at lag p")
    print("MA(q)    | Cuts off at q| Decays")
    print("ARMA     | Decays       | Decays")
    print("-" * 40)
    print()
    print("ACF cuts off after 1 + PACF decays = MA(1)")
elif quiz3_answer:
    print("Incorrect. Try again!")
    print("Hint: Which model has ACF that cuts off and PACF that decays?")
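
The MA(1) signature can be reproduced from theory. The sketch below computes the theoretical ACF of an MA(1) and derives the PACF from it with the Durbin-Levinson recursion. The value $\theta = 0.5$ and the helper names `ma1_acf` and `pacf_from_acf` are assumptions made for this illustration.

```python
def ma1_acf(theta, max_lag):
    """Theoretical ACF of X_t = eps_t + theta*eps_{t-1} for lags 0..max_lag."""
    rho = [0.0] * (max_lag + 1)
    rho[0] = 1.0
    if max_lag >= 1:
        rho[1] = theta / (1 + theta**2)
    return rho

def pacf_from_acf(rho):
    """Durbin-Levinson recursion: PACF at lags 1..len(rho)-1 from ACF values."""
    pacf = []
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, len(rho)):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_new
    return pacf

rho = ma1_acf(0.5, 5)            # theta = 0.5 is an illustrative choice
pacf_values = pacf_from_acf(rho)

print("ACF  (lags 1-5):", [round(r, 4) for r in rho[1:]])
print("PACF (lags 1-5):", [round(p, 4) for p in pacf_values])
```

The ACF is exactly zero beyond lag 1, while the PACF alternates in sign and shrinks geometrically without ever reaching zero.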

### Quiz 4: MA(1) Invertibility

**Question:** The MA(1) process $X_t = \varepsilon_t + \theta \varepsilon_{t-1}$ is invertible when:

- A) $|\theta| > 1$
- B) $|\theta| < 1$
- C) $\theta > 0$
- D) $\theta < 0$

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz4_answer = ''  # <-- Enter your answer here

# Check answer
if quiz4_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("Invertibility condition for MA(1): |theta| < 1")
    print()
    print("Why does it matter?")
    print("- Invertibility allows us to express the MA as an infinite AR")
    print("- It ensures uniqueness of the representation")
    print("- It makes estimation well-defined")
    print()
    print("If invertible: epsilon_t = sum_{j=0}^{inf} (-theta)^j X_{t-j}")
elif quiz4_answer:
    print("Incorrect. Try again!")
    print("Hint: Invertibility is similar to stationarity for AR - it requires roots outside unit circle.")
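
For an invertible MA(1), the current shock can be written as $\varepsilon_t = \sum_{j=0}^{\infty} (-\theta)^j X_{t-j}$. The sketch below simulates an MA(1) with Gaussian shocks and recovers the last shock from a truncated version of that sum. The values of `theta` and `n`, the seed, and the helper `recover_shock` are illustrative assumptions.

```python
import random

random.seed(0)
theta = 0.5   # invertible, since |theta| < 1
n = 200       # arbitrary sample length

eps = [random.gauss(0.0, 1.0) for _ in range(n)]
# MA(1): X_t = eps_t + theta*eps_{t-1}, with the pre-sample shock set to 0
x = [eps[t] + theta * (eps[t - 1] if t > 0 else 0.0) for t in range(n)]

def recover_shock(x, theta, t, n_terms):
    """Approximate eps_t by the truncated sum of (-theta)^j * X_{t-j}."""
    return sum((-theta) ** j * x[t - j] for j in range(n_terms))

t = n - 1
for n_terms in (1, 5, 20):
    approx = recover_shock(x, theta, t, n_terms)
    print(f"{n_terms:2d} terms: error = {abs(approx - eps[t]):.2e}")
```

The truncation error shrinks like $|\theta|^{J}$ with $J$ terms. With $|\theta| > 1$ the weights $(-\theta)^j$ would grow instead, which is why a non-invertible MA has no such representation.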

### Quiz 5: Model Selection

**Question:** Which criterion penalizes model complexity MORE strongly for large samples?

- A) AIC (Akaike Information Criterion)
- B) BIC (Bayesian Information Criterion)
- C) Both penalize equally
- D) Neither penalizes complexity

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz5_answer = ''  # <-- Enter your answer here

# Check answer
if quiz5_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("AIC = -2*log(L) + 2*k")
    print("BIC = -2*log(L) + k*log(n)")
    print()
    print("For n >= 8: log(n) > 2 (since e^2 is about 7.39)")
    print("So BIC penalizes additional parameters more heavily for larger samples.")
    print()
    print("Rule of thumb:")
    print("- AIC: Better for prediction")
    print("- BIC: Better for identifying 'true' model")
elif quiz5_answer:
    print("Incorrect. Try again!")
    print("Hint: Compare the penalty terms: 2k vs k*log(n). When is log(n) > 2?")
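
The two penalty terms are easy to compare directly. In the sketch below, `k = 3` and the sample sizes are hypothetical values chosen only to show where the BIC penalty overtakes the AIC penalty.

```python
import math

def aic_penalty(k):
    """Complexity penalty in AIC: 2k."""
    return 2 * k

def bic_penalty(k, n):
    """Complexity penalty in BIC: k * log(n), using the natural logarithm."""
    return k * math.log(n)

k = 3  # hypothetical number of estimated parameters
for n in (5, 7, 8, 50, 200, 1000):
    print(f"n = {n:4d}: AIC penalty = {aic_penalty(k):.2f}, "
          f"BIC penalty = {bic_penalty(k, n):.2f}")
```

The crossover lies between $n = 7$ and $n = 8$, because $\log(n) > 2$ requires $n > e^2 \approx 7.39$.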

### Quiz 6: Ljung-Box Test

**Question:** You fit an ARMA(1,1) model and run the Ljung-Box test on residuals with 10 lags. The p-value is 0.02. What do you conclude?

- A) The residuals are white noise; model is adequate
- B) The residuals have significant autocorrelation; model needs improvement
- C) The model is overfitting
- D) The data is non-stationary

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz6_answer = ''  # <-- Enter your answer here

# Check answer
if quiz6_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("Ljung-Box Test:")
    print("  H0: Residuals are white noise (no autocorrelation up to lag h)")
    print("  H1: Residuals have significant autocorrelation")
    print()
    print("With p-value = 0.02 < 0.05:")
    print("  We REJECT H0")
    print("  Residuals are NOT white noise")
    print("  Model is INADEQUATE - there's unexplained structure!")
    print()
    print("Next steps: Try higher order ARMA or check for seasonal patterns.")
elif quiz6_answer:
    print("Incorrect. Try again!")
    print("Hint: What does the null hypothesis of the Ljung-Box test state?")

### Quiz 7: AR(2) Process

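**Question:** For the AR(2) process $X_t = 0.6X_{t-1} - 0.08X_{t-2} + \varepsilon_t$, which is the characteristic equation in lag-polynomial form (the convention in which stationarity requires roots outside the unit circle)?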

- A) $1 - 0.6z - 0.08z^2 = 0$
- B) $1 - 0.6z + 0.08z^2 = 0$
- C) $z^2 - 0.6z + 0.08 = 0$
- D) $z^2 - 0.6z - 0.08 = 0$

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz7_answer = ''  # <-- Enter your answer here

# Check answer
if quiz7_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("AR(2): X_t = phi_1*X_{t-1} + phi_2*X_{t-2} + epsilon_t")
    print("Here: phi_1 = 0.6, phi_2 = -0.08")
    print()
    print("Characteristic polynomial: phi(z) = 1 - phi_1*z - phi_2*z^2")
    print("                                  = 1 - 0.6z - (-0.08)z^2")
    print("                                  = 1 - 0.6z + 0.08z^2")
    print()
    print("Stationarity requires all roots of phi(z)=0 to be OUTSIDE unit circle.")
elif quiz7_answer:
    print("Incorrect. Try again!")
    print("Hint: The characteristic polynomial is phi(z) = 1 - phi_1*z - phi_2*z^2")
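
The roots of this characteristic polynomial can be found numerically. The sketch below uses `np.roots`, which expects coefficients ordered from the highest power down, so $\phi(z) = 1 - \phi_1 z - \phi_2 z^2$ is passed as `[-phi_2, -phi_1, 1]`.

```python
import numpy as np

# AR(2) from the question: X_t = 0.6*X_{t-1} - 0.08*X_{t-2} + eps_t
phi_1, phi_2 = 0.6, -0.08

# phi(z) = 1 - phi_1*z - phi_2*z^2, coefficients given highest power first
roots = np.roots([-phi_2, -phi_1, 1.0])
moduli = np.sort(np.abs(roots))

print("Roots: ", roots)
print("Moduli:", moduli)
print("Stationary:", bool(np.all(moduli > 1)))
```

Both roots lie outside the unit circle, so this AR(2) process is stationary.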

### Quiz 8: Forecasting

**Question:** For an AR(1) process with $\phi = 0.8$ and unconditional mean $\mu = 10$, what happens to forecasts as the horizon $h \to \infty$?

- A) Forecasts diverge to infinity
- B) Forecasts converge to the unconditional mean 10
- C) Forecasts converge to 0
- D) Forecasts oscillate

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz8_answer = ''  # <-- Enter your answer here

# Check answer
if quiz8_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("For stationary AR(1):")
    print("  X_hat_{t+h} = mu + phi^h * (X_t - mu)")
    print()
    print("As h -> infinity:")
    print("  phi^h -> 0 (since |phi| < 1)")
    print("  X_hat_{t+h} -> mu")
    print()
    print("Long-term forecasts revert to the unconditional mean.")
    print("This is called MEAN REVERSION.")
elif quiz8_answer:
    print("Incorrect. Try again!")
    print("Hint: What happens to phi^h as h gets large when |phi| < 1?")
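
The convergence can be traced with a few lines of code. The sketch below evaluates $\hat{X}_{t+h} = \mu + \phi^h (X_t - \mu)$ for increasing horizons. The last observation `x_last = 15` is a hypothetical value, since the question does not specify one.

```python
def ar1_forecast(x_last, phi, mu, h):
    """h-step-ahead forecast of a stationary AR(1): mu + phi^h * (x_last - mu)."""
    return mu + phi**h * (x_last - mu)

phi, mu = 0.8, 10.0   # values from the question
x_last = 15.0         # hypothetical last observation, chosen for illustration

for h in (1, 5, 10, 50):
    print(f"h = {h:2d}: forecast = {ar1_forecast(x_last, phi, mu, h):.4f}")
```

Whatever the starting value, the gap to the mean shrinks by the factor $\phi$ at every step.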

### Quiz 9: AR(1) Coefficient Interpretation

**Question:** In the AR(1) model $X_t = \mu + \phi(X_{t-1} - \mu) + \varepsilon_t$, if $\phi = 0.9$, what does this coefficient tell us?

- A) The series has weak persistence and reverts quickly to the mean
- B) The series has strong persistence; a shock today will still have 90% of its effect next period
- C) The variance of the series is 0.9 times the variance of the error term
- D) The series will become non-stationary after 0.9 time periods

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz9_answer = ''  # <-- Enter your answer here

# Check answer
if quiz9_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("The AR(1) coefficient phi measures PERSISTENCE:")
    print("- phi = 0.9 means 90% of a shock carries over to the next period")
    print("- After h periods, phi^h of the shock remains")
    print()
    print("Example with phi = 0.9:")
    print("  After 1 period: 0.9^1 = 0.90 (90% remains)")
    print("  After 5 periods: 0.9^5 = 0.59 (59% remains)")
    print("  After 10 periods: 0.9^10 = 0.35 (35% remains)")
    print()
    print("Higher |phi| = slower mean reversion = stronger persistence")
elif quiz9_answer:
    print("Incorrect. Try again!")
    print("Hint: Think about what happens to a shock over time in an AR(1) process.")

### Quiz 10: MA Invertibility

**Question:** Consider the MA(2) process $X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}$. For the process to be invertible, what condition must the roots of $1 + \theta_1 z + \theta_2 z^2 = 0$ satisfy?

- A) All roots must be inside the unit circle (|z| < 1)
- B) All roots must be outside the unit circle (|z| > 1)
- C) All roots must be real numbers
- D) All roots must equal 1

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz10_answer = ''  # <-- Enter your answer here

# Check answer
if quiz10_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("MA Invertibility Condition:")
    print("All roots of the MA polynomial theta(z) must lie OUTSIDE the unit circle.")
    print()
    print("Why invertibility matters:")
    print("1. Allows expressing MA as an infinite AR: X_t = sum(pi_j * X_{t-j}) + epsilon_t")
    print("2. Ensures uniqueness of the representation")
    print("3. Makes estimation well-defined and consistent")
    print()
    print("For MA(1): |theta| < 1 ensures the root z = -1/theta is outside unit circle")
    print("For MA(2): Check roots of 1 + theta_1*z + theta_2*z^2 = 0")
elif quiz10_answer:
    print("Incorrect. Try again!")
    print("Hint: Invertibility for MA is analogous to stationarity for AR.")

### Quiz 11: ARMA Stationarity Conditions

**Question:** For the ARMA(1,1) process $X_t = \phi X_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}$, which statement about stationarity is TRUE?

- A) The process is stationary if and only if both |phi| < 1 AND |theta| < 1
- B) The process is stationary if and only if |phi| < 1 (the MA part doesn't affect stationarity)
- C) The process is stationary if and only if |theta| < 1 (the AR part doesn't affect stationarity)
- D) The process is stationary if |phi| + |theta| < 1

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz11_answer = ''  # <-- Enter your answer here

# Check answer
if quiz11_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("Key insight: STATIONARITY depends ONLY on the AR part!")
    print()
    print("For ARMA(p,q):")
    print("- Stationarity: roots of phi(z) = 0 outside unit circle (AR condition)")
    print("- Invertibility: roots of theta(z) = 0 outside unit circle (MA condition)")
    print()
    print("Why doesn't MA affect stationarity?")
    print("- MA(q) is ALWAYS stationary (it's a finite weighted sum of white noise)")
    print("- The MA part has finite memory - it cannot explode")
    print("- Only the AR part (with infinite memory) can cause non-stationarity")
elif quiz11_answer:
    print("Incorrect. Try again!")
    print("Hint: Is MA(q) ever non-stationary? What determines stationarity?")
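
The separation between the two conditions can be made concrete by checking the AR and MA polynomials independently. The helper names and the parameter pairs in the sketch below are assumptions chosen to cover the different cases.

```python
import numpy as np

def roots_outside_unit_circle(poly_ascending):
    """True if every root of c0 + c1*z + ... + cn*z^n has modulus > 1."""
    roots = np.roots(poly_ascending[::-1])  # np.roots wants the highest power first
    return bool(np.all(np.abs(roots) > 1))

def check_arma11(phi, theta):
    """Return (stationary, invertible) for X_t = phi*X_{t-1} + eps_t + theta*eps_{t-1}."""
    stationary = roots_outside_unit_circle([1.0, -phi])   # phi(z) = 1 - phi*z
    invertible = roots_outside_unit_circle([1.0, theta])  # theta(z) = 1 + theta*z
    return stationary, invertible

# Hypothetical parameter pairs chosen to cover the cases
for phi, theta in [(0.5, 0.4), (0.5, 1.5), (1.1, 0.4)]:
    stationary, invertible = check_arma11(phi, theta)
    print(f"phi = {phi}, theta = {theta}: "
          f"stationary = {stationary}, invertible = {invertible}")
```

The second pair is stationary but not invertible: changing $\theta$ never alters the stationarity verdict.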

### Quiz 12: Yule-Walker Equations

**Question:** The Yule-Walker equations are used to estimate AR parameters by relating them to autocorrelations. For an AR(2) process, the Yule-Walker equations give:

$$\rho(1) = \phi_1 + \phi_2 \rho(1)$$
$$\rho(2) = \phi_1 \rho(1) + \phi_2$$

If the sample autocorrelations are $\hat{\rho}(1) = 0.6$ and $\hat{\rho}(2) = 0.4$, what is the Yule-Walker estimate of $\phi_1$?

- A) $\hat{\phi}_1 = 0.5$
- B) $\hat{\phi}_1 = 0.6$
- C) $\hat{\phi}_1 = 0.5625$
- D) $\hat{\phi}_1 = 0.4$

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz12_answer = ''  # <-- Enter your answer here

# Check answer
if quiz12_answer.upper() == 'C':
    print("CORRECT!")
    print()
    print("Solving the Yule-Walker equations:")
    print()
    print("Given: rho(1) = 0.6, rho(2) = 0.4")
    print()
    print("From equation 1: rho(1) = phi_1 + phi_2 * rho(1)")
    print("  0.6 = phi_1 + phi_2 * 0.6")
    print("  phi_1 = 0.6 - 0.6*phi_2  ... (i)")
    print()
    print("From equation 2: rho(2) = phi_1 * rho(1) + phi_2")
    print("  0.4 = phi_1 * 0.6 + phi_2  ... (ii)")
    print()
    print("Substituting (i) into (ii):")
    print("  0.4 = (0.6 - 0.6*phi_2) * 0.6 + phi_2")
    print("  0.4 = 0.36 - 0.36*phi_2 + phi_2")
    print("  0.4 = 0.36 + 0.64*phi_2")
    print("  0.04 = 0.64*phi_2")
    print("  phi_2 = 0.0625")
    print()
    print("From (i): phi_1 = 0.6 - 0.6*0.0625 = 0.6 - 0.0375 = 0.5625")
    print()
    print("Cross-check with the matrix form R * phi = rho:")

    # Matrix solution of the same 2x2 system (np is imported in the Setup cell)
    rho1, rho2 = 0.6, 0.4
    R = np.array([[1, rho1], [rho1, 1]])
    rho_vec = np.array([rho1, rho2])
    phi_hat = np.linalg.solve(R, rho_vec)
    print(f"\nMatrix solution: phi_1 = {phi_hat[0]:.4f}, phi_2 = {phi_hat[1]:.4f}")
elif quiz12_answer:
    print("Incorrect. Try again!")
    print("Hint: Write the equations as a system and solve for phi_1 and phi_2.")

### Quiz 13: Information Criteria (AIC/BIC)

**Question:** You are comparing three models fitted to 200 observations:
- Model A: ARMA(1,0) with log-likelihood = -250
- Model B: ARMA(2,1) with log-likelihood = -245
- Model C: ARMA(3,2) with log-likelihood = -243

Which model would be selected by BIC? (Recall: BIC = -2*log(L) + k*log(n), where k = number of parameters including variance)

- A) Model A (ARMA(1,0))
- B) Model B (ARMA(2,1))
- C) Model C (ARMA(3,2))
- D) Cannot determine without the actual data

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz13_answer = ''  # <-- Enter your answer here

# Check answer
if quiz13_answer.upper() == 'A':
    print("CORRECT!")
    print()
    print("Calculating BIC for each model:")
    print("BIC = -2*log(L) + k*log(n), where n = 200, log(200) ≈ 5.30")
    print()
    print("Model A: ARMA(1,0)")
    print("  k = 1 (phi) + 1 (constant) + 1 (variance) = 3")
    print("  BIC = -2*(-250) + 3*5.30 = 500 + 15.90 = 515.90")
    print()
    print("Model B: ARMA(2,1)")
    print("  k = 2 (phi's) + 1 (theta) + 1 (constant) + 1 (variance) = 5")
    print("  BIC = -2*(-245) + 5*5.30 = 490 + 26.50 = 516.50")
    print()
    print("Model C: ARMA(3,2)")
    print("  k = 3 (phi's) + 2 (theta's) + 1 (constant) + 1 (variance) = 7")
    print("  BIC = -2*(-243) + 7*5.30 = 486 + 37.10 = 523.10")
    print()
    print("Lowest BIC = 515.90 -> Model A wins!")
    print()
    print("Note: BIC penalizes complexity more than AIC, favoring simpler models.")
    print("Model B lowers -2*log(L) by 10, but its 2 extra parameters cost 2*5.30 = 10.60.")
    print()

    # Robustness check: if the constant is not counted, k = p + q + 1 (variance only)
    n = 200
    log_n = np.log(n)
    k_A = 2  # 1 AR + 1 variance
    k_B = 4  # 2 AR + 1 MA + 1 variance
    k_C = 6  # 3 AR + 2 MA + 1 variance

    bic_A = -2*(-250) + k_A*log_n
    bic_B = -2*(-245) + k_B*log_n
    bic_C = -2*(-243) + k_C*log_n

    print("With k = p + q + 1 (constant not counted):")
    print(f"  BIC_A = {bic_A:.2f}")
    print(f"  BIC_B = {bic_B:.2f}")
    print(f"  BIC_C = {bic_C:.2f}")
    print("Model A still has the lowest BIC.")
elif quiz13_answer:
    print("Incorrect. Try again!")
    print("Hint: Calculate BIC = -2*log(L) + k*log(n) for each model.")

### Quiz 14: Residual Diagnostics

**Question:** After fitting an ARMA model, you examine the residuals and find:
- ACF shows significant spikes at lags 12 and 24
- No significant autocorrelation at other lags
- Ljung-Box test p-value = 0.03 at lag 20

What is the most likely issue?

- A) The model is overfitting the data
- B) The series has unmodeled seasonal patterns
- C) The residuals are perfectly white noise
- D) The model has too few AR terms

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz14_answer = ''  # <-- Enter your answer here

# Check answer
if quiz14_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("Diagnostic Interpretation:")
    print()
    print("Key clues:")
    print("1. Significant ACF at lags 12 and 24 (multiples of 12)")
    print("2. This suggests seasonality with period 12 (e.g., a yearly cycle in monthly data)")
    print("3. Ljung-Box p-value < 0.05 confirms residual autocorrelation")
    print()
    print("Diagnosis: SEASONAL PATTERNS not captured by the ARMA model")
    print()
    print("Solutions:")
    print("- Use SARIMA (Seasonal ARIMA) instead of ARMA")
    print("- Add seasonal differencing: (1 - L^12)")
    print("- Include seasonal AR/MA terms: SAR(1), SMA(1)")
    print()
    print("Example SARIMA notation: ARIMA(p,d,q)(P,D,Q)_12")
    print("where (P,D,Q) are seasonal AR, differencing, and MA orders")
elif quiz14_answer:
    print("Incorrect. Try again!")
    print("Hint: What do spikes at lags 12 and 24 suggest about the data structure?")

### Quiz 15: Box-Jenkins Methodology

**Question:** The Box-Jenkins methodology consists of three main stages. What is the correct ORDER of these stages?

- A) Estimation -> Identification -> Diagnostic Checking
- B) Identification -> Diagnostic Checking -> Estimation
- C) Identification -> Estimation -> Diagnostic Checking
- D) Diagnostic Checking -> Identification -> Estimation

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz15_answer = ''  # <-- Enter your answer here

# Check answer
if quiz15_answer.upper() == 'C':
    print("CORRECT!")
    print()
    print("Box-Jenkins Methodology (3 Stages):")
    print()
    print("1. IDENTIFICATION")
    print("   - Check stationarity (ADF test, plot)")
    print("   - Transform if needed (differencing, log)")
    print("   - Examine ACF/PACF to determine p and q")
    print("   - Select candidate models")
    print()
    print("2. ESTIMATION")
    print("   - Estimate parameters (MLE or CSS)")
    print("   - Compare models using AIC/BIC")
    print("   - Check parameter significance")
    print()
    print("3. DIAGNOSTIC CHECKING")
    print("   - Analyze residuals (ACF, Ljung-Box)")
    print("   - Check normality (Q-Q plot, Jarque-Bera)")
    print("   - If diagnostics fail, return to Stage 1")
    print()
    print("This is an ITERATIVE process until a satisfactory model is found!")
elif quiz15_answer:
    print("Incorrect. Try again!")
    print("Hint: You need to identify the model before you can estimate it.")

### Quiz 16: Characteristic Equation Roots

**Question:** For the AR(2) process $X_t = 1.2X_{t-1} - 0.35X_{t-2} + \varepsilon_t$, the characteristic equation $1 - 1.2z + 0.35z^2 = 0$ has roots $z_1 = 2$ and $z_2 = \frac{10}{7} \approx 1.43$. Is this process stationary?

- A) Yes, because both roots are positive
- B) Yes, because both roots are outside the unit circle (|z| > 1)
- C) No, because the roots are not equal
- D) No, because the sum of the coefficients exceeds 1

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz16_answer = ''  # <-- Enter your answer here

# Check answer
if quiz16_answer.upper() == 'B':
    print("CORRECT!")
    print()
    print("Stationarity Check via Characteristic Roots:")
    print()
    print("AR(p) is stationary iff ALL roots of phi(z) = 0 lie OUTSIDE the unit circle")
    print()
    print("Given roots: z1 = 2, z2 = 10/7 ≈ 1.43")
    print()
    print("Check:")
    print("  |z1| = |2| = 2 > 1  ✓")
    print("  |z2| = |10/7| = 1.43 > 1  ✓")
    print()
    print("Both roots are outside the unit circle -> STATIONARY")
    print()
    print("Verification using coefficient conditions for AR(2):")
    phi1, phi2 = 1.2, -0.35
    print(f"  phi1 + phi2 = {phi1 + phi2:.2f} < 1  ✓")
    print(f"  phi2 - phi1 = {phi2 - phi1:.2f} < 1  ✓")
    print(f"  |phi2| = {abs(phi2):.2f} < 1  ✓")
    print()
    print("All stationarity conditions satisfied!")
elif quiz16_answer:
    print("Incorrect. Try again!")
    print("Hint: What condition must characteristic roots satisfy for stationarity?")

### Quiz 17: AR Memory and Persistence

**Question:** Consider two AR(1) processes:
- Process A: $X_t = 0.95 X_{t-1} + \varepsilon_t$
- Process B: $X_t = 0.5 X_{t-1} + \varepsilon_t$

Both experience a unit shock at time t=0. After 10 periods, what fraction of the original shock remains in each process?

- A) Process A: 60%, Process B: 0.1%
- B) Process A: 95%, Process B: 50%
- C) Process A: 9.5%, Process B: 5%
- D) Process A: 0%, Process B: 0% (shocks have no lasting effect)

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz17_answer = ''  # <-- Enter your answer here

# Check answer
if quiz17_answer.upper() == 'A':
    print("CORRECT!")
    print()
    print("Shock Persistence in AR(1):")
    print()
    print("After h periods, fraction remaining = phi^h")
    print()
    print("Process A (phi = 0.95):")
    phi_A = 0.95
    remaining_A = phi_A ** 10
    print(f"  phi^10 = 0.95^10 = {remaining_A:.4f} = {remaining_A*100:.1f}%")
    print()
    print("Process B (phi = 0.5):")
    phi_B = 0.5
    remaining_B = phi_B ** 10
    print(f"  phi^10 = 0.5^10 = {remaining_B:.4f} = {remaining_B*100:.2f}%")
    print()
    print("Interpretation:")
    print("- Process A: HIGH persistence (slow mean reversion)")
    print("  Shocks take a long time to dissipate")
    print("- Process B: LOW persistence (fast mean reversion)")
    print("  Shocks dissipate quickly")
    print()
    print("Half-life (time for shock to decay by 50%):")
    import numpy as np
    half_life_A = np.log(0.5) / np.log(phi_A)
    half_life_B = np.log(0.5) / np.log(phi_B)
    print(f"  Process A: {half_life_A:.1f} periods")
    print(f"  Process B: {half_life_B:.1f} periods")
elif quiz17_answer:
    print("Incorrect. Try again!")
    print("Hint: In AR(1), the effect of a shock after h periods is phi^h.")

### Quiz 18: MA(q) ACF Cutoff

**Question:** For an MA(3) process $X_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \theta_2\varepsilon_{t-2} + \theta_3\varepsilon_{t-3}$, what can we say about the theoretical autocorrelation function (ACF)?

- A) ACF is non-zero for all lags
- B) ACF is exactly zero for lags 1, 2, and 3, but non-zero for higher lags
- C) ACF is potentially non-zero for lags 1, 2, and 3, but exactly zero for all lags > 3
- D) ACF decays exponentially to zero

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz18_answer = ''  # <-- Enter your answer here

# Check answer
if quiz18_answer.upper() == 'C':
    print("CORRECT!")
    print()
    print("MA(q) ACF Cutoff Property:")
    print()
    print("For MA(q), the ACF CUTS OFF after lag q:")
    print("  rho(h) ≠ 0 for h = 1, 2, ..., q  (may be non-zero)")
    print("  rho(h) = 0 for h > q  (exactly zero)")
    print()
    print("Why? MA(q) has FINITE MEMORY of length q.")
    print()
    print("For MA(3):")
    print("  X_t = eps_t + theta_1*eps_{t-1} + theta_2*eps_{t-2} + theta_3*eps_{t-3}")
    print()
    print("  Cov(X_t, X_{t-4}) = 0 because:")
    print("  X_t depends on: eps_t, eps_{t-1}, eps_{t-2}, eps_{t-3}")
    print("  X_{t-4} depends on: eps_{t-4}, eps_{t-5}, eps_{t-6}, eps_{t-7}")
    print("  NO overlap -> Covariance = 0")
    print()
    print("This is a KEY identifying feature:")
    print("  ACF cuts off at lag q -> suggests MA(q)")
elif quiz18_answer:
    print("Incorrect. Try again!")
    print("Hint: How far back in time does MA(3) 'remember' past shocks?")
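
The cutoff follows from the general MA(q) autocovariance formula $\gamma(h) = \sigma^2 \sum_{j=0}^{q-h} \theta_j \theta_{j+h}$ with $\theta_0 = 1$. The sketch below evaluates it for an MA(3). The coefficients $0.6, 0.3, 0.2$, the unit variance, and the helper name `ma_autocovariance` are hypothetical choices for illustration.

```python
def ma_autocovariance(thetas, sigma_sq, max_lag):
    """Theoretical autocovariances gamma(0..max_lag) of an MA(q) process.

    thetas holds theta_1..theta_q; theta_0 = 1 is added internally.
    """
    psi = [1.0] + list(thetas)
    q = len(thetas)
    gammas = []
    for h in range(max_lag + 1):
        if h > q:
            gammas.append(0.0)  # no shared shocks beyond lag q
        else:
            gammas.append(sigma_sq * sum(psi[j] * psi[j + h] for j in range(q - h + 1)))
    return gammas

# Hypothetical MA(3) coefficients, chosen only for illustration
gammas = ma_autocovariance([0.6, 0.3, 0.2], sigma_sq=1.0, max_lag=6)
acf_values = [g / gammas[0] for g in gammas]

for h, r in enumerate(acf_values):
    print(f"rho({h}) = {r:.4f}")
```

Lags 1 to 3 are non-zero here, and every lag from 4 onward is exactly zero.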

### Quiz 19: Model Selection

**Question:** You are building an ARMA model for stock returns. Your initial analysis shows:
- ACF: Small but significant at lag 1, then insignificant
- PACF: Small but significant at lag 1, then insignificant

You fit three models and get:
- AR(1): AIC = 1520.3
- MA(1): AIC = 1519.8
- ARMA(1,1): AIC = 1521.5

Which model should you choose and why?

- A) ARMA(1,1) because it's the most flexible
- B) AR(1) because AR models are easier to interpret
- C) MA(1) because it has the lowest AIC
- D) None of them; the data is clearly white noise

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz19_answer = ''  # <-- Enter your answer here

# Check answer
if quiz19_answer.upper() == 'C':
    print("CORRECT!")
    print()
    print("Model Selection Reasoning:")
    print()
    print("1. ACF/PACF Analysis:")
    print("   - Both ACF and PACF significant only at lag 1")
    print("   - Could be AR(1), MA(1), or ARMA(1,1)")
    print("   - Need information criteria to decide")
    print()
    print("2. AIC Comparison:")
    print("   - AR(1):     AIC = 1520.3")
    print("   - MA(1):     AIC = 1519.8  <-- LOWEST")
    print("   - ARMA(1,1): AIC = 1521.5")
    print()
    print("3. Why MA(1)?")
    print("   - Lowest AIC = best trade-off between fit and complexity")
    print("   - ARMA(1,1) is penalized for extra parameter")
    print("   - MA(1) provides adequate fit with fewer parameters")
    print()
    print("4. Important caveats:")
    print("   - Should also check residual diagnostics")
    print("   - Consider economic interpretation")
    print("   - Stock returns often show weak autocorrelation")
elif quiz19_answer:
    print("Incorrect. Try again!")
    print("Hint: AIC balances model fit against complexity. Lower AIC is better.")

### Quiz 20: Forecasting with ARMA

**Question:** You have fitted an ARMA(2,1) model: $X_t = 0.5X_{t-1} + 0.3X_{t-2} + \varepsilon_t + 0.4\varepsilon_{t-1}$

At time T, you observe $X_T = 10$, $X_{T-1} = 8$, and the last residual $\hat{\varepsilon}_T = 0.5$. What is the one-step-ahead forecast $\hat{X}_{T+1}$?

- A) $\hat{X}_{T+1} = 0.5(10) + 0.3(8) + 0.4(0.5) = 7.6$
- B) $\hat{X}_{T+1} = 0.5(10) + 0.3(8) = 7.4$
- C) $\hat{X}_{T+1} = 10 + 0.4(0.5) = 10.2$
- D) $\hat{X}_{T+1} = 0.5(10) + 0.3(8) + 0.5 + 0.4(0.5) = 8.1$

In [None]:
# Enter your answer: 'A', 'B', 'C', or 'D'
quiz20_answer = ''  # <-- Enter your answer here

# Check answer
if quiz20_answer.upper() == 'A':
    print("CORRECT!")
    print()
    print("ARMA Forecasting Formula:")
    print()
    print("Model: X_t = 0.5*X_{t-1} + 0.3*X_{t-2} + eps_t + 0.4*eps_{t-1}")
    print()
    print("One-step-ahead forecast at time T:")
    print("  X_{T+1} = 0.5*X_T + 0.3*X_{T-1} + eps_{T+1} + 0.4*eps_T")
    print()
    print("Taking conditional expectation E_T[...]:")
    print("  - E_T[X_T] = X_T = 10 (known)")
    print("  - E_T[X_{T-1}] = X_{T-1} = 8 (known)")
    print("  - E_T[eps_{T+1}] = 0 (future shock is unpredictable)")
    print("  - E_T[eps_T] = eps_T = 0.5 (known residual)")
    print()
    print("Forecast:")
    X_T, X_T1, eps_T = 10, 8, 0.5
    forecast = 0.5*X_T + 0.3*X_T1 + 0.4*eps_T
    print(f"  X_hat_{{T+1}} = 0.5({X_T}) + 0.3({X_T1}) + 0.4({eps_T})")
    print(f"              = {0.5*X_T:.1f} + {0.3*X_T1:.1f} + {0.4*eps_T:.1f}")
    print(f"              = {forecast:.1f}")
    print()
    print("Key insight: MA terms use KNOWN past residuals, not future ones!")
elif quiz20_answer:
    print("Incorrect. Try again!")
    print("Hint: E[eps_{T+1}] = 0, but eps_T is known from the fitted model.")

---
# Part 2: True/False Questions

In [None]:
# Answer each statement with True or False
tf_answers = {
    1: None,  # "The ACF of a stationary AR(1) decays exponentially."
    2: None,  # "An MA(q) process is always stationary."
    3: None,  # "The PACF of an MA(1) cuts off after lag 1."
    4: None,  # "ARMA(1,1) can produce both decaying ACF and PACF."
    5: None,  # "Lower AIC always means better out-of-sample prediction."
    6: None,  # "The Yule-Walker equations can estimate AR parameters."
}

# Enter your answers below (True or False)
tf_answers[1] = None  # ACF of AR(1) decays exponentially
tf_answers[2] = None  # MA(q) is always stationary
tf_answers[3] = None  # PACF of MA(1) cuts off at lag 1
tf_answers[4] = None  # ARMA(1,1) has decaying ACF and PACF
tf_answers[5] = None  # Lower AIC = better out-of-sample
tf_answers[6] = None  # Yule-Walker estimates AR parameters

In [None]:
# Check your answers
correct_answers = {1: True, 2: True, 3: False, 4: True, 5: False, 6: True}
explanations = {
    1: "TRUE: For AR(1), rho(h) = phi^h, which decays exponentially (|phi| < 1).",
    2: "TRUE: MA(q) is always stationary since it's a finite weighted sum of WN.",
    3: "FALSE: PACF of MA(q) DECAYS, it doesn't cut off. It's ACF that cuts off for MA.",
    4: "TRUE: ARMA processes have both decaying ACF and PACF due to mixed AR/MA components.",
    5: "FALSE: AIC is an IN-SAMPLE criterion. Lower AIC usually but not always means better prediction.",
    6: "TRUE: Yule-Walker equations relate ACF to AR parameters: phi = R^{-1} * rho."
}

score = 0
for q, correct in correct_answers.items():
    user_ans = tf_answers[q]
    if user_ans is None:
        status = "NOT ANSWERED"
    elif user_ans == correct:
        status = "CORRECT"
        score += 1
    else:
        status = "INCORRECT"
    print(f"Q{q}: {status}")
    if user_ans is not None:
        print(f"   {explanations[q]}")
    print()

print(f"\nScore: {score}/6")

---
# Part 3: Calculation Exercises

## Exercise 1: AR(1) Properties

Consider the AR(1) process: $X_t = 2 + 0.7 X_{t-1} + \varepsilon_t$ where $\varepsilon_t \sim WN(0, 9)$.

Calculate:
1. The unconditional mean $\mu = E[X_t]$
2. The variance $\gamma(0) = Var(X_t)$
3. The autocovariance $\gamma(1)$ and $\gamma(2)$
4. The autocorrelation $\rho(1)$ and $\rho(2)$

In [None]:
# Given values
c = 2        # constant
phi = 0.7    # AR coefficient
sigma_sq = 9 # Var(epsilon_t)

# YOUR TASK: Calculate the following
# Formula for AR(1):
# mu = c / (1 - phi)
# gamma(0) = sigma^2 / (1 - phi^2)
# gamma(h) = phi * gamma(h-1)
# rho(h) = phi^h

mu = None  # <-- Calculate E[X_t]
gamma_0 = None  # <-- Calculate Var(X_t)
gamma_1 = None  # <-- Calculate gamma(1)
gamma_2 = None  # <-- Calculate gamma(2)
rho_1 = None  # <-- Calculate rho(1)
rho_2 = None  # <-- Calculate rho(2)

print("Your answers:")
print(f"mu = E[X_t] = {mu}")
print(f"gamma(0) = Var(X_t) = {gamma_0}")
print(f"gamma(1) = {gamma_1}")
print(f"gamma(2) = {gamma_2}")
print(f"rho(1) = {rho_1}")
print(f"rho(2) = {rho_2}")

In [None]:
# SOLUTION
print("SOLUTION:")
print("=" * 50)

mu_sol = c / (1 - phi)
print(f"\nmu = c/(1-phi) = {c}/(1-{phi}) = {c}/{1-phi:.1f} = {mu_sol:.4f}")

gamma_0_sol = sigma_sq / (1 - phi**2)
print(f"\ngamma(0) = sigma^2/(1-phi^2) = {sigma_sq}/(1-{phi}^2) = {sigma_sq}/{1-phi**2:.2f} = {gamma_0_sol:.4f}")

gamma_1_sol = phi * gamma_0_sol
print(f"\ngamma(1) = phi * gamma(0) = {phi} * {gamma_0_sol:.4f} = {gamma_1_sol:.4f}")

gamma_2_sol = phi * gamma_1_sol
print(f"\ngamma(2) = phi * gamma(1) = {phi} * {gamma_1_sol:.4f} = {gamma_2_sol:.4f}")

rho_1_sol = phi
print(f"\nrho(1) = phi = {rho_1_sol}")

rho_2_sol = phi**2
print(f"\nrho(2) = phi^2 = {phi}^2 = {rho_2_sol:.2f}")
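
As a sanity check on the closed-form results for Exercise 1, one option is to simulate a long path of the same AR(1) and compare sample moments with the formulas. This is a minimal sketch: the seed, sample size, and burn-in length are arbitrary assumptions, and the sample values only approximate the theoretical ones.

```python
import numpy as np

# Parameters of Exercise 1: X_t = 2 + 0.7 X_{t-1} + eps_t, Var(eps_t) = 9
c, phi, sigma_sq = 2.0, 0.7, 9.0

rng = np.random.default_rng(0)          # arbitrary seed
n, burn = 200_000, 1_000                # arbitrary sample size and burn-in
eps = rng.normal(0.0, np.sqrt(sigma_sq), size=n + burn)

x = np.empty(n + burn)
x[0] = c / (1 - phi)                    # start at the theoretical mean
for t in range(1, n + burn):
    x[t] = c + phi * x[t - 1] + eps[t]
x = x[burn:]                            # drop the burn-in period

sample_mean = x.mean()
sample_var = x.var()
xc = x - sample_mean
sample_rho1 = np.dot(xc[1:], xc[:-1]) / np.dot(xc, xc)

print(f"mean:   sample {sample_mean:.3f} vs theory {c / (1 - phi):.3f}")
print(f"var:    sample {sample_var:.3f} vs theory {sigma_sq / (1 - phi**2):.3f}")
print(f"rho(1): sample {sample_rho1:.3f} vs theory {phi:.3f}")
```

Gaussian innovations are an extra assumption here; the exercise only requires white noise with variance 9.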

## Exercise 2: MA(1) Properties

Consider the MA(1) process: $X_t = 3 + \varepsilon_t + 0.5\varepsilon_{t-1}$ where $\varepsilon_t \sim WN(0, 4)$.

Calculate:
1. The mean $\mu = E[X_t]$
2. The variance $\gamma(0)$
3. The autocovariance $\gamma(1)$
4. The autocovariance $\gamma(2)$
5. The autocorrelation $\rho(1)$

In [None]:
# Given values
mu_ma = 3          # constant (mean)
theta = 0.5        # MA coefficient
sigma_sq_ma = 4    # Var(epsilon_t)

# YOUR TASK: Calculate the following
# Formulas for MA(1):
# E[X_t] = mu (the constant)
# gamma(0) = (1 + theta^2) * sigma^2
# gamma(1) = theta * sigma^2
# gamma(h) = 0 for h >= 2
# rho(1) = theta / (1 + theta^2)

E_Xt = None  # <-- Calculate E[X_t]
gamma_0_ma = None  # <-- Calculate gamma(0)
gamma_1_ma = None  # <-- Calculate gamma(1)
gamma_2_ma = None  # <-- Calculate gamma(2)
rho_1_ma = None  # <-- Calculate rho(1)

print("Your answers:")
print(f"E[X_t] = {E_Xt}")
print(f"gamma(0) = {gamma_0_ma}")
print(f"gamma(1) = {gamma_1_ma}")
print(f"gamma(2) = {gamma_2_ma}")
print(f"rho(1) = {rho_1_ma}")

In [None]:
# SOLUTION
print("SOLUTION:")
print("=" * 50)

E_Xt_sol = mu_ma
print(f"\nE[X_t] = mu = {E_Xt_sol}")
print(f"   (The constant term is the mean for MA processes)")

gamma_0_ma_sol = (1 + theta**2) * sigma_sq_ma
print(f"\ngamma(0) = (1 + theta^2) * sigma^2")
print(f"         = (1 + {theta}^2) * {sigma_sq_ma}")
print(f"         = (1 + {theta**2}) * {sigma_sq_ma}")
print(f"         = {1 + theta**2} * {sigma_sq_ma} = {gamma_0_ma_sol}")

gamma_1_ma_sol = theta * sigma_sq_ma
print(f"\ngamma(1) = theta * sigma^2 = {theta} * {sigma_sq_ma} = {gamma_1_ma_sol}")

gamma_2_ma_sol = 0
print(f"\ngamma(2) = 0")
print(f"   (MA(1) has autocovariance = 0 for all lags >= 2)")

rho_1_ma_sol = theta / (1 + theta**2)
print(f"\nrho(1) = theta / (1 + theta^2)")
print(f"       = {theta} / (1 + {theta**2})")
print(f"       = {theta} / {1 + theta**2}")
print(f"       = {rho_1_ma_sol:.4f}")
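
The formula $\rho(1) = \theta / (1 + \theta^2)$ has a consequence worth seeing numerically: $\theta$ and $1/\theta$ give the same lag-1 autocorrelation, which is why the invertibility condition is needed to pick one representation. The sketch below uses a hypothetical helper `ma1_rho1` and an arbitrary grid of $\theta$ values.

```python
import numpy as np

def ma1_rho1(theta):
    """Lag-1 autocorrelation of X_t = mu + eps_t + theta * eps_{t-1}."""
    return theta / (1 + theta**2)

rho_invertible = ma1_rho1(0.5)       # Exercise 2's theta, |theta| < 1
rho_noninvertible = ma1_rho1(2.0)    # its reciprocal, |theta| > 1
print(rho_invertible, rho_noninvertible)

# Over a grid of theta values, |rho(1)| never exceeds 0.5
thetas = np.linspace(-5, 5, 1001)
max_abs_rho1 = np.max(np.abs(ma1_rho1(thetas)))
print(max_abs_rho1)
```

Both calls return 0.4, and the maximum of $|\rho(1)|$ is reached at $\theta = \pm 1$. A sample lag-1 autocorrelation well above 0.5 in absolute value is therefore hard to reconcile with a pure MA(1).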

## Exercise 3: Characteristic Equation

For the AR(2) process: $X_t = 1.5X_{t-1} - 0.56X_{t-2} + \varepsilon_t$

1. Write the characteristic equation
2. Find the roots
3. Determine if the process is stationary

In [None]:
# Given values
phi1 = 1.5
phi2 = -0.56

# YOUR TASK:
# 1. The characteristic polynomial is: 1 - phi1*z - phi2*z^2 = 0
# 2. Rearrange to: phi2*z^2 + phi1*z - 1 = 0 and solve using quadratic formula
#    Or equivalently: z^2 - (phi1/(-phi2))*z + 1/(-phi2) = 0

# Calculate coefficients for numpy roots
# We solve: 1 - phi1*z - phi2*z^2 = 0
# Rewrite as: -phi2*z^2 - phi1*z + 1 = 0
# Coefficients for np.roots: [a, b, c] where az^2 + bz + c = 0

roots = np.roots([-phi2, -phi1, 1])  # This is already done for you

# Calculate |roots| and check if > 1
root_magnitudes = None  # <-- Calculate absolute values of roots
is_stationary = None  # <-- True if ALL |roots| > 1

print("Your answers:")
print(f"Roots: {roots}")
print(f"Magnitudes: {root_magnitudes}")
print(f"Is stationary: {is_stationary}")

In [None]:
# SOLUTION
print("SOLUTION:")
print("=" * 50)

print(f"\n1. Characteristic equation:")
print(f"   phi(z) = 1 - phi1*z - phi2*z^2 = 0")
print(f"   phi(z) = 1 - {phi1}z - ({phi2})z^2 = 0")
print(f"   phi(z) = 1 - {phi1}z + {-phi2}z^2 = 0")

# Solve characteristic equation
roots_sol = np.roots([-phi2, -phi1, 1])
print(f"\n2. Roots of characteristic equation:")
print(f"   z1 = {roots_sol[0]:.4f}")
print(f"   z2 = {roots_sol[1]:.4f}")

root_mags_sol = np.abs(roots_sol)
print(f"\n3. Checking stationarity (need |roots| > 1):")
print(f"   |z1| = {root_mags_sol[0]:.4f}")
print(f"   |z2| = {root_mags_sol[1]:.4f}")

is_stat_sol = all(root_mags_sol > 1)
print(f"\n   Both roots outside unit circle? {is_stat_sol}")
print(f"   Process is {'STATIONARY' if is_stat_sol else 'NON-STATIONARY'}")
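
For an AR(2), the root condition is equivalent to three inequalities on the coefficients: $\phi_1 + \phi_2 < 1$, $\phi_2 - \phi_1 < 1$ and $|\phi_2| < 1$. The sketch below, built around a hypothetical helper `ar2_is_stationary`, runs both checks so they can be compared; the second coefficient pair is an arbitrary non-stationary example.

```python
import numpy as np

def ar2_is_stationary(phi1, phi2):
    """Check AR(2) stationarity two ways; returns (by_roots, by_triangle)."""
    # Roots of 1 - phi1*z - phi2*z^2 = 0 must lie outside the unit circle
    roots = np.roots([-phi2, -phi1, 1.0])
    by_roots = bool(np.all(np.abs(roots) > 1))
    # Equivalent inequality conditions on the coefficients
    by_triangle = (phi1 + phi2 < 1) and (phi2 - phi1 < 1) and (abs(phi2) < 1)
    return by_roots, by_triangle

print(ar2_is_stationary(1.5, -0.56))  # Exercise 3 coefficients -> (True, True)
print(ar2_is_stationary(0.5, 0.6))    # phi1 + phi2 > 1         -> (False, False)
```

The inequality form is convenient for checking by hand, while the root form generalizes to any AR(p).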

---
# Part 4: Python Coding Exercises

## Exercise 4: Simulate and Visualize AR(1) Process

In [None]:
# TASK: Simulate AR(1) processes with different phi values and compare

n = 200
phi_values = [0.9, 0.5, -0.5, -0.9]

# Step 1: Simulate AR(1) for each phi value
# Use ArmaProcess from statsmodels
# ar = np.array([1, -phi])  # Note: negative sign!
# ma = np.array([1])
# process = ArmaProcess(ar, ma)
# simulated = process.generate_sample(nsample=n)

# YOUR CODE HERE - simulate and store in a dictionary


# Step 2: Plot all four time series in a 2x2 grid
# YOUR CODE HERE


In [None]:
# SOLUTION
print("SOLUTION:")
print("=" * 50)

np.random.seed(42)
n = 200
phi_values = [0.9, 0.5, -0.5, -0.9]

# Simulate
simulations = {}
for phi in phi_values:
    ar = np.array([1, -phi])  # AR polynomial: 1 - phi*L
    ma = np.array([1])        # MA polynomial: 1
    process = ArmaProcess(ar, ma)
    simulations[phi] = process.generate_sample(nsample=n)

# Plot
fig, axes = plt.subplots(2, 2, figsize=(14, 8))
axes = axes.flatten()

for idx, phi in enumerate(phi_values):
    axes[idx].plot(simulations[phi], color=BLUE, linewidth=0.8)
    axes[idx].axhline(y=0, color='gray', linestyle='--', alpha=0.5)
    axes[idx].set_title(f'AR(1) with $\\phi$ = {phi}', fontweight='bold')
    axes[idx].set_xlabel('Time')
    axes[idx].set_ylabel('$X_t$')

plt.tight_layout()
plt.show()

print("\nObservations:")
print("- phi = 0.9: High persistence, slow mean reversion")
print("- phi = 0.5: Moderate persistence")
print("- phi = -0.5: Moderate oscillation around mean")
print("- phi = -0.9: Strong oscillation (alternating behavior)")
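
One way to put a number on "persistence" is the half-life of a shock: the number of lags $h$ at which $|\phi|^h$ drops to 0.5. This measure is not used elsewhere in the seminar, and `ar1_half_life` is a hypothetical helper, so treat the sketch as an optional illustration.

```python
import numpy as np

def ar1_half_life(phi):
    """Lags needed for |phi|^h to fall to 0.5 (requires 0 < |phi| < 1)."""
    return np.log(0.5) / np.log(abs(phi))

for phi in [0.9, 0.5, -0.5, -0.9]:
    print(f"phi = {phi:+.1f}: half-life = {ar1_half_life(phi):.2f} lags")
```

With $|\phi| = 0.9$ a shock takes about 6.58 lags to halve, against exactly 1 lag for $|\phi| = 0.5$. The sign of $\phi$ does not change the half-life, only whether the decay alternates in sign.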

## Exercise 5: ACF and PACF Analysis

In [None]:
# TASK: Generate AR(2), MA(2), and ARMA(1,1) processes and compare ACF/PACF

np.random.seed(123)
n = 500

# Step 1: Generate AR(2) with phi1=0.5, phi2=0.3
# ar_ar2 = np.array([1, -0.5, -0.3])
# ma_ar2 = np.array([1])

# Step 2: Generate MA(2) with theta1=0.5, theta2=0.3
# ar_ma2 = np.array([1])
# ma_ma2 = np.array([1, 0.5, 0.3])

# Step 3: Generate ARMA(1,1) with phi=0.7, theta=0.4
# YOUR CODE HERE


# Step 4: Plot ACF and PACF for each (3x2 grid)
# YOUR CODE HERE


In [None]:
# SOLUTION
print("SOLUTION:")
print("=" * 50)

np.random.seed(123)
n = 500

# AR(2)
ar_ar2 = np.array([1, -0.5, -0.3])
ma_ar2 = np.array([1])
ar2_process = ArmaProcess(ar_ar2, ma_ar2)
ar2_data = ar2_process.generate_sample(nsample=n)

# MA(2)
ar_ma2 = np.array([1])
ma_ma2 = np.array([1, 0.5, 0.3])
ma2_process = ArmaProcess(ar_ma2, ma_ma2)
ma2_data = ma2_process.generate_sample(nsample=n)

# ARMA(1,1)
ar_arma = np.array([1, -0.7])
ma_arma = np.array([1, 0.4])
arma_process = ArmaProcess(ar_arma, ma_arma)
arma_data = arma_process.generate_sample(nsample=n)

# Plot
fig, axes = plt.subplots(3, 2, figsize=(14, 12))

models = [
    ('AR(2): $\\phi_1=0.5, \\phi_2=0.3$', ar2_data),
    ('MA(2): $\\theta_1=0.5, \\theta_2=0.3$', ma2_data),
    ('ARMA(1,1): $\\phi=0.7, \\theta=0.4$', arma_data)
]

for i, (title, data) in enumerate(models):
    # ACF
    acf_vals = acf(data, nlags=20)
    axes[i, 0].bar(range(len(acf_vals)), acf_vals, color=BLUE, width=0.3)
    axes[i, 0].axhline(y=0, color='black', linewidth=0.5)
    axes[i, 0].axhline(y=1.96/np.sqrt(n), color=RED, linestyle='--', alpha=0.7)
    axes[i, 0].axhline(y=-1.96/np.sqrt(n), color=RED, linestyle='--', alpha=0.7)
    axes[i, 0].set_title(f'{title} - ACF', fontweight='bold')
    axes[i, 0].set_xlabel('Lag')
    axes[i, 0].set_ylabel('ACF')
    
    # PACF
    pacf_vals = pacf(data, nlags=20)
    axes[i, 1].bar(range(len(pacf_vals)), pacf_vals, color=GREEN, width=0.3)
    axes[i, 1].axhline(y=0, color='black', linewidth=0.5)
    axes[i, 1].axhline(y=1.96/np.sqrt(n), color=RED, linestyle='--', alpha=0.7)
    axes[i, 1].axhline(y=-1.96/np.sqrt(n), color=RED, linestyle='--', alpha=0.7)
    axes[i, 1].set_title(f'{title} - PACF', fontweight='bold')
    axes[i, 1].set_xlabel('Lag')
    axes[i, 1].set_ylabel('PACF')

plt.tight_layout()
plt.show()

print("\nModel Identification Summary:")
print("-" * 50)
print("AR(2):    ACF decays, PACF cuts off after lag 2")
print("MA(2):    ACF cuts off after lag 2, PACF decays")
print("ARMA(1,1): Both ACF and PACF decay")
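
The sample ACFs plotted above are noisy versions of the theoretical ones. As a sketch, the theoretical ACF of the MA(2) and AR(2) from this exercise can be computed directly with NumPy; `ma_acf` and `ar2_acf` are hypothetical helpers written for this illustration.

```python
import numpy as np

def ma_acf(thetas, nlags):
    """Theoretical ACF of an MA(q) with coefficients [theta_1, ..., theta_q]."""
    psi = np.r_[1.0, thetas]                  # weights on eps_t, eps_{t-1}, ...
    # Autocovariances for sigma^2 = 1: gamma(h) = sum_j psi_j * psi_{j+h}
    gamma = [np.dot(psi[: len(psi) - h], psi[h:]) if h < len(psi) else 0.0
             for h in range(nlags + 1)]
    return np.array(gamma) / gamma[0]

def ar2_acf(phi1, phi2, nlags):
    """Theoretical ACF of a stationary AR(2); assumes nlags >= 1."""
    rho = np.empty(nlags + 1)
    rho[0] = 1.0
    rho[1] = phi1 / (1 - phi2)
    for h in range(2, nlags + 1):
        rho[h] = phi1 * rho[h - 1] + phi2 * rho[h - 2]
    return rho

ma2_rho = ma_acf([0.5, 0.3], nlags=5)
ar2_rho = ar2_acf(0.5, 0.3, nlags=5)
print("MA(2):", np.round(ma2_rho, 4))   # exactly zero from lag 3 onward
print("AR(2):", np.round(ar2_rho, 4))   # decays gradually, never exactly zero
```

The MA(2) autocorrelations are zero beyond lag 2, while the AR(2) values shrink gradually, which is the "cuts off" versus "decays" distinction in the summary table.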

## Exercise 6: Model Fitting and Selection

In [None]:
# TASK: Fit multiple ARMA models to the AR(2) data and select the best one using AIC/BIC

# We'll use the ar2_data from the previous exercise
data_to_fit = ar2_data

# Step 1: Fit ARMA(p,q) for p in [0,1,2,3] and q in [0,1,2,3]
# Store results in a DataFrame with columns: p, q, AIC, BIC

# YOUR CODE HERE
# results = []
# for p in range(4):
#     for q in range(4):
#         try:
#             model = ARIMA(data_to_fit, order=(p, 0, q))
#             fitted = model.fit()
#             results.append({'p': p, 'q': q, 'AIC': fitted.aic, 'BIC': fitted.bic})
#         except Exception:
#             pass


# Step 2: Find the best model according to AIC and BIC
# YOUR CODE HERE


# Step 3: Fit the best model and print summary
# YOUR CODE HERE


In [None]:
# SOLUTION
print("SOLUTION:")
print("=" * 50)

# Fit multiple models
results = []
for p in range(4):
    for q in range(4):
        if p == 0 and q == 0:
            continue  # Skip trivial model
        try:
            model = ARIMA(ar2_data, order=(p, 0, q))
            fitted = model.fit()
            results.append({
                'p': p, 'q': q, 
                'AIC': fitted.aic, 
                'BIC': fitted.bic,
                'Log-Lik': fitted.llf
            })
        except Exception:
            pass  # skip specifications that fail to estimate

results_df = pd.DataFrame(results)
print("\nModel Comparison:")
print(results_df.sort_values('AIC').head(10).to_string(index=False))

# Best models
best_aic = results_df.loc[results_df['AIC'].idxmin()]
best_bic = results_df.loc[results_df['BIC'].idxmin()]

print(f"\nBest by AIC: ARMA({int(best_aic['p'])},{int(best_aic['q'])}) - AIC = {best_aic['AIC']:.2f}")
print(f"Best by BIC: ARMA({int(best_bic['p'])},{int(best_bic['q'])}) - BIC = {best_bic['BIC']:.2f}")

# Fit best model
best_model = ARIMA(ar2_data, order=(int(best_bic['p']), 0, int(best_bic['q'])))
best_fit = best_model.fit()
print(f"\n" + "="*50)
print(f"Best Model Summary (BIC selection):")
print(best_fit.summary().tables[1])
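
AIC and BIC can disagree because they penalize extra parameters differently. The sketch below applies the two formulas to a pair of hypothetical models; the log-likelihoods and parameter counts are invented purely for illustration and are not output from the fits above.

```python
import numpy as np

def aic(llf, k):
    """AIC = -2 log(L) + 2k."""
    return -2 * llf + 2 * k

def bic(llf, k, n):
    """BIC = -2 log(L) + k log(n)."""
    return -2 * llf + k * np.log(n)

n = 500                              # sample size used in Exercise 5
# Hypothetical values, invented purely for illustration
small = {"k": 4, "llf": -710.0}      # fewer parameters, slightly worse fit
large = {"k": 6, "llf": -707.0}      # two extra parameters, slightly better fit

for name, m in [("small", small), ("large", large)]:
    print(f"{name}: AIC = {aic(m['llf'], m['k']):.2f}, "
          f"BIC = {bic(m['llf'], m['k'], n):.2f}")

print(f"Penalty per parameter: AIC = 2, BIC = log({n}) = {np.log(n):.2f}")
```

With these made-up numbers AIC prefers the larger model and BIC the smaller one, because at n = 500 each extra parameter costs about 6.21 under BIC but only 2 under AIC.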

## Exercise 7: Residual Diagnostics

In [None]:
# TASK: Perform residual diagnostics on the fitted model

# Step 1: Extract residuals from best_fit
# residuals = best_fit.resid

# Step 2: Create a 2x2 diagnostic plot:
#   - Residual time series
#   - Histogram with normal curve
#   - ACF of residuals
#   - Q-Q plot

# YOUR CODE HERE


# Step 3: Perform Ljung-Box test
# lb_test = acorr_ljungbox(residuals, lags=[10, 20], return_df=True)

# YOUR CODE HERE


In [None]:
# SOLUTION
print("SOLUTION:")
print("=" * 50)

residuals = best_fit.resid

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Residual time series
axes[0, 0].plot(residuals, color=BLUE, linewidth=0.8)
axes[0, 0].axhline(y=0, color='gray', linestyle='--', alpha=0.5)
axes[0, 0].set_title('Residuals over Time', fontweight='bold')
axes[0, 0].set_xlabel('Time')
axes[0, 0].set_ylabel('Residual')

# Histogram
axes[0, 1].hist(residuals, bins=30, density=True, color=BLUE, alpha=0.7, edgecolor='white')
x_range = np.linspace(residuals.min(), residuals.max(), 100)
axes[0, 1].plot(x_range, stats.norm.pdf(x_range, residuals.mean(), residuals.std()), 
               color=RED, linewidth=2, label='Normal')
axes[0, 1].set_title('Residual Distribution', fontweight='bold')
axes[0, 1].set_xlabel('Residual')
axes[0, 1].set_ylabel('Density')
axes[0, 1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=1, frameon=False)

# ACF of residuals
acf_resid = acf(residuals, nlags=20)
axes[1, 0].bar(range(len(acf_resid)), acf_resid, color=BLUE, width=0.3)
axes[1, 0].axhline(y=0, color='black', linewidth=0.5)
axes[1, 0].axhline(y=1.96/np.sqrt(len(residuals)), color=RED, linestyle='--', alpha=0.7)
axes[1, 0].axhline(y=-1.96/np.sqrt(len(residuals)), color=RED, linestyle='--', alpha=0.7)
axes[1, 0].set_title('ACF of Residuals', fontweight='bold')
axes[1, 0].set_xlabel('Lag')
axes[1, 0].set_ylabel('ACF')

# Q-Q plot with proper unpacking
(osm, osr), (slope, intercept, r) = stats.probplot(residuals, dist='norm', fit=True)

axes[1, 1].scatter(osm, osr, color=BLUE, alpha=0.6, s=20)
q_range = np.abs(osm).max() * 1.1
x_line = np.array([-q_range, q_range])
axes[1, 1].plot(x_line, slope * x_line + intercept, color=RED, linewidth=2, label='Reference line')
axes[1, 1].set_xlim(-q_range, q_range)
axes[1, 1].set_ylim(-q_range * slope + intercept - abs(intercept)*0.5, 
                    q_range * slope + intercept + abs(intercept)*0.5)
axes[1, 1].set_title('Q-Q Plot', fontweight='bold')
axes[1, 1].set_xlabel('Theoretical Quantiles')
axes[1, 1].set_ylabel('Sample Quantiles')
axes[1, 1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=1, frameon=False)

plt.tight_layout()
plt.show()

# Ljung-Box test
print("\nLjung-Box Test for Residual Autocorrelation:")
lb_test = acorr_ljungbox(residuals, lags=[10, 15, 20], return_df=True)
print(lb_test)

print("\nInterpretation:")
if all(lb_test['lb_pvalue'] > 0.05):
    print("All p-values > 0.05: no significant residual autocorrelation detected.")
    print("Residuals are consistent with white noise; the model appears ADEQUATE.")
else:
    print("Some p-values < 0.05: Residuals show autocorrelation.")
    print("Model may need improvement.")
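
To see what the Ljung-Box test measures, the statistic $Q = n(n+2)\sum_{k=1}^{m} \hat{r}_k^2/(n-k)$ can be computed by hand. The sketch below uses a hypothetical helper `ljung_box_q` on two simulated series (arbitrary seed): white noise and an AR(1) with $\phi = 0.7$ standing in for badly behaved residuals.

```python
import numpy as np

def ljung_box_q(x, m):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..m} r_k^2 / (n-k)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    # Sample autocorrelations r_1, ..., r_m
    r = np.array([np.dot(xc[k:], xc[:-k]) / denom for k in range(1, m + 1)])
    return n * (n + 2) * np.sum(r**2 / (n - np.arange(1, m + 1)))

rng = np.random.default_rng(0)       # arbitrary seed
noise = rng.normal(size=500)         # white noise: Q should be modest

ar1 = np.empty(500)                  # autocorrelated series: Q should be large
ar1[0] = noise[0]
for t in range(1, 500):
    ar1[t] = 0.7 * ar1[t - 1] + noise[t]

q_noise = ljung_box_q(noise, m=10)
q_ar1 = ljung_box_q(ar1, m=10)
print(f"Q(10) white noise: {q_noise:.1f}, AR(1): {q_ar1:.1f}")
```

Under the null, $Q$ is approximately chi-square with $m$ degrees of freedom (critical value about 18.31 for $m = 10$ at the 5% level). For residuals from a fitted ARMA(p, q), the degrees of freedom are usually reduced to $m - p - q$; recent statsmodels versions expose this through the `model_df` argument of `acorr_ljungbox`.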

## Exercise 8: Real Data Application - US Unemployment Rate

In this exercise, we'll apply the Box-Jenkins methodology to the US Unemployment Rate, which exhibits clear autoregressive dynamics and allows for meaningful forecasting.

In [None]:
# TASK: Apply Box-Jenkins methodology to US Unemployment Rate

# Step 1: Download Unemployment Rate data from FRED and examine the series
# YOUR CODE HERE


# Step 2: Test for stationarity
# YOUR CODE HERE


# Step 3: Plot ACF and PACF to identify model order
# YOUR CODE HERE


# Step 4: Fit several candidate models and compare AIC/BIC
# YOUR CODE HERE


# Step 5: Perform residual diagnostics on the best model
# YOUR CODE HERE


# Step 6: Generate forecasts and compare with actual data (train/test split)
# YOUR CODE HERE

In [None]:
# SOLUTION
print("SOLUTION: Box-Jenkins Methodology for US Unemployment Rate")
print("=" * 60)

# Step 1: Download data from FRED
print("\n1. DATA ACQUISITION")
try:
    import pandas_datareader as pdr
    unrate = pdr.get_data_fred('UNRATE', start='1990-01-01', end='2024-06-30')
    unemployment = unrate['UNRATE'].dropna()
except Exception:
    # Fallback (package missing or no network): synthetic data with similar properties
    print("Note: Using synthetic data (FRED connection unavailable)")
    np.random.seed(42)
    n = 414  # Monthly data from 1990-2024
    dates = pd.date_range(start='1990-01-01', periods=n, freq='MS')  # month-start dates, as FRED uses
    ar = np.array([1, -0.95, 0.05])  # AR(2) with high persistence
    ma = np.array([1])
    process = ArmaProcess(ar, ma)
    base = process.generate_sample(nsample=n)
    unemployment = pd.Series(5 + 0.5 * base + np.cumsum(np.random.normal(0, 0.02, n)), 
                            index=dates, name='UNRATE')
    unemployment = unemployment.clip(lower=3, upper=15)

print(f"Downloaded {len(unemployment)} monthly observations")
print(f"Period: {unemployment.index[0].strftime('%Y-%m')} to {unemployment.index[-1].strftime('%Y-%m')}")
print(f"Mean unemployment rate: {unemployment.mean():.2f}%")
print(f"Range: {unemployment.min():.1f}% to {unemployment.max():.1f}%")

# Plot the series
fig, ax = plt.subplots(figsize=(14, 5))
ax.plot(unemployment.index, unemployment.values, color=BLUE, linewidth=1)
ax.set_title('US Unemployment Rate (1990-2024)', fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Unemployment Rate (%)')
ax.axhline(y=unemployment.mean(), color=RED, linestyle='--', alpha=0.5, label=f'Mean = {unemployment.mean():.1f}%')
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=1, frameon=False)
plt.tight_layout()
plt.show()

# Step 2: Stationarity test
print("\n2. STATIONARITY TEST")
adf_result = adfuller(unemployment, autolag='AIC')
print(f"ADF Statistic: {adf_result[0]:.4f}")
print(f"p-value: {adf_result[1]:.4f}")
print(f"Critical values: {adf_result[4]}")
if adf_result[1] < 0.05:
    print("Conclusion: Reject the unit-root null at 5%; series appears STATIONARY")
else:
    print("Conclusion: Cannot reject a unit root at 5% (consider differencing)")
    print("For this ARMA exercise we still model the level as an approximation")

In [None]:
# Step 3: ACF/PACF
print("\n3. MODEL IDENTIFICATION (ACF/PACF)")

# If the series is non-stationary, the usual approach is to work with first differences.
# Here the unemployment rate is modeled as ARMA in levels for demonstration purposes.
data_for_acf = unemployment

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# ACF
acf_vals = acf(data_for_acf, nlags=24)
axes[0].bar(range(len(acf_vals)), acf_vals, color=BLUE, width=0.3)
axes[0].axhline(y=0, color='black', linewidth=0.5)
axes[0].axhline(y=1.96/np.sqrt(len(data_for_acf)), color=RED, linestyle='--', alpha=0.7, label='95% CI')
axes[0].axhline(y=-1.96/np.sqrt(len(data_for_acf)), color=RED, linestyle='--', alpha=0.7)
axes[0].set_title('ACF of US Unemployment Rate', fontweight='bold')
axes[0].set_xlabel('Lag (months)')
axes[0].set_ylabel('ACF')
axes[0].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=1, frameon=False)

# PACF
pacf_vals = pacf(data_for_acf, nlags=24)
axes[1].bar(range(len(pacf_vals)), pacf_vals, color=GREEN, width=0.3)
axes[1].axhline(y=0, color='black', linewidth=0.5)
axes[1].axhline(y=1.96/np.sqrt(len(data_for_acf)), color=RED, linestyle='--', alpha=0.7, label='95% CI')
axes[1].axhline(y=-1.96/np.sqrt(len(data_for_acf)), color=RED, linestyle='--', alpha=0.7)
axes[1].set_title('PACF of US Unemployment Rate', fontweight='bold')
axes[1].set_xlabel('Lag (months)')
axes[1].set_ylabel('PACF')
axes[1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=1, frameon=False)

plt.tight_layout()
plt.show()

print("\nObservations:")
print("- ACF: Very slow decay, indicating high persistence")
print("- PACF: Significant at lags 1-2, suggests AR(2) component")
print("- Strong autocorrelation structure suitable for ARMA modeling")

In [None]:
# Step 4: Model selection with train/test split
print("\n4. MODEL SELECTION")

# Split data: use last 24 months for testing
train = unemployment[:-24]
test = unemployment[-24:]

print(f"Training set: {len(train)} observations ({train.index[0].strftime('%Y-%m')} to {train.index[-1].strftime('%Y-%m')})")
print(f"Test set: {len(test)} observations ({test.index[0].strftime('%Y-%m')} to {test.index[-1].strftime('%Y-%m')})")

# Fit multiple ARMA models
model_results = []
for p in range(5):
    for q in range(4):
        try:
            model = ARIMA(train, order=(p, 0, q))
            fitted = model.fit()
            model_results.append({
                'p': p, 'q': q,
                'AIC': fitted.aic,
                'BIC': fitted.bic
            })
        except Exception:
            pass  # skip specifications that fail to estimate

model_df = pd.DataFrame(model_results)
print("\nTop 5 models by BIC:")
print(model_df.sort_values('BIC').head().to_string(index=False))

# Select best model by BIC
best_p = int(model_df.loc[model_df['BIC'].idxmin(), 'p'])
best_q = int(model_df.loc[model_df['BIC'].idxmin(), 'q'])
print(f"\nBest model: ARMA({best_p},{best_q})")

In [None]:
# Step 5: Fit best model and diagnostics
print("\n5. MODEL ESTIMATION AND DIAGNOSTICS")

final_model = ARIMA(train, order=(best_p, 0, best_q))
final_fit = final_model.fit()
print(f"\nARMA({best_p},{best_q}) Model Summary:")
print(final_fit.summary().tables[1])

resid = final_fit.resid

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Residuals
axes[0, 0].plot(resid.index, resid.values, color=BLUE, linewidth=0.8)
axes[0, 0].axhline(y=0, color='gray', linestyle='--', alpha=0.5)
axes[0, 0].set_title('Residuals over Time', fontweight='bold')
axes[0, 0].set_xlabel('Date')
axes[0, 0].set_ylabel('Residual')

# Histogram
axes[0, 1].hist(resid, bins=40, density=True, color=BLUE, alpha=0.7, edgecolor='white')
x_range = np.linspace(resid.min(), resid.max(), 100)
axes[0, 1].plot(x_range, stats.norm.pdf(x_range, resid.mean(), resid.std()), 
               color=RED, linewidth=2, label='Normal')
axes[0, 1].set_title('Residual Distribution', fontweight='bold')
axes[0, 1].set_xlabel('Residual')
axes[0, 1].set_ylabel('Density')
axes[0, 1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=1, frameon=False)

# ACF of residuals
acf_resid = acf(resid.dropna(), nlags=20)
axes[1, 0].bar(range(len(acf_resid)), acf_resid, color=BLUE, width=0.3)
axes[1, 0].axhline(y=0, color='black', linewidth=0.5)
axes[1, 0].axhline(y=1.96/np.sqrt(len(resid)), color=RED, linestyle='--', alpha=0.7)
axes[1, 0].axhline(y=-1.96/np.sqrt(len(resid)), color=RED, linestyle='--', alpha=0.7)
axes[1, 0].set_title('ACF of Residuals', fontweight='bold')
axes[1, 0].set_xlabel('Lag')
axes[1, 0].set_ylabel('ACF')

# Q-Q plot with proper unpacking
(osm, osr), (slope, intercept, r) = stats.probplot(resid.dropna(), dist='norm', fit=True)
axes[1, 1].scatter(osm, osr, color=BLUE, alpha=0.6, s=20)
q_range = np.abs(osm).max() * 1.1
x_line = np.array([-q_range, q_range])
axes[1, 1].plot(x_line, slope * x_line + intercept, color=RED, linewidth=2, label='Reference line')
axes[1, 1].set_xlim(-q_range, q_range)
axes[1, 1].set_title('Q-Q Plot', fontweight='bold')
axes[1, 1].set_xlabel('Theoretical Quantiles')
axes[1, 1].set_ylabel('Sample Quantiles')
axes[1, 1].legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=1, frameon=False)

plt.tight_layout()
plt.show()

# Ljung-Box test
print("\nLjung-Box Test:")
lb = acorr_ljungbox(resid.dropna(), lags=[10, 15, 20], return_df=True)
print(lb)

In [None]:
# Step 6: Forecasting and evaluation
print("\n6. FORECASTING AND EVALUATION")

# Generate forecasts for the test period
forecast = final_fit.get_forecast(steps=len(test))
forecast_mean = forecast.predicted_mean
conf_int = forecast.conf_int()

# Plot: Training data (last 60 obs) + Forecast vs Actual
fig, ax = plt.subplots(figsize=(14, 6))

# Training data (last 60 observations)
ax.plot(train.index[-60:], train.values[-60:], color=BLUE, linewidth=1.5, label='Training Data')

# Actual test data
ax.plot(test.index, test.values, color=GREEN, linewidth=2, label='Actual', marker='o', markersize=4)

# Forecast
ax.plot(test.index, forecast_mean.values, color=RED, linewidth=2, linestyle='--', label='Forecast')

# Confidence interval
ax.fill_between(test.index, conf_int.iloc[:, 0].values, conf_int.iloc[:, 1].values, 
                color=RED, alpha=0.2, label='95% CI')

# Vertical line separating train/test
ax.axvline(x=train.index[-1], color='gray', linestyle=':', alpha=0.7, linewidth=2)
ax.text(train.index[-1], ax.get_ylim()[1], '  Forecast Start', fontsize=10, color='gray')

ax.set_title(f'US Unemployment Rate: ARMA({best_p},{best_q}) Forecast vs Actual', fontweight='bold')
ax.set_xlabel('Date')
ax.set_ylabel('Unemployment Rate (%)')
ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.12), ncol=4, frameon=False)

plt.tight_layout()
plt.show()

# Forecast accuracy metrics
from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(test, forecast_mean)
rmse = np.sqrt(mean_squared_error(test, forecast_mean))
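# Use .values so the arithmetic is positional and cannot be broken by index misalignment
mape = np.mean(np.abs((test.values - forecast_mean.values) / test.values)) * 100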

print("\nForecast Accuracy Metrics (24-month horizon):")
print(f"  Mean Absolute Error (MAE): {mae:.3f} percentage points")
print(f"  Root Mean Squared Error (RMSE): {rmse:.3f} percentage points")
print(f"  Mean Absolute Percentage Error (MAPE): {mape:.2f}%")

# Multi-step forecast visualization
print("\n" + "="*60)
print("Forecast Detail (first 12 months):")
print("-"*60)
forecast_df = pd.DataFrame({
    'Date': test.index[:12],
    'Actual': test.values[:12],
    'Forecast': forecast_mean.values[:12],
    'Error': (test.values[:12] - forecast_mean.values[:12])
})
forecast_df['Date'] = forecast_df['Date'].dt.strftime('%Y-%m')
print(forecast_df.to_string(index=False))

print("\nKey Insights:")
print("- Forecasts show dynamic adjustment based on AR/MA structure")
print("- Unlike stock returns, unemployment has strong predictable patterns")
print("- Long-horizon forecasts converge to the unconditional mean (mean reversion)")
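
The mean-reversion point in the last insight can be made concrete with the closed-form AR(1) forecast: $\hat{X}_{T+h} = \mu + \phi^h (X_T - \mu)$ with forecast variance $\sigma^2 (1 - \phi^{2h})/(1 - \phi^2)$. The sketch below uses a hypothetical helper `ar1_forecast` and made-up parameter values; it is not the model fitted above.

```python
import numpy as np

def ar1_forecast(x_last, mu, phi, sigma_sq, horizons):
    """h-step forecast mean and variance for a stationary AR(1)."""
    h = np.asarray(horizons)
    mean = mu + phi**h * (x_last - mu)
    var = sigma_sq * (1 - phi**(2 * h)) / (1 - phi**2)
    return mean, var

# Hypothetical values, chosen only for illustration
mu, phi, sigma_sq = 5.0, 0.9, 0.04
horizons = [1, 6, 12, 24, 120]
mean, var = ar1_forecast(x_last=8.0, mu=mu, phi=phi, sigma_sq=sigma_sq,
                         horizons=horizons)
half_width = 1.96 * np.sqrt(var)     # half-width of a 95% interval

for h, m, w in zip(horizons, mean, half_width):
    print(f"h = {h:3d}: forecast = {m:.3f} +/- {w:.3f}")
```

The forecast moves from the last observation toward $\mu$, and the interval widens with the horizon but levels off at the unconditional variance $\sigma^2/(1-\phi^2)$ instead of growing without bound.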

---
# Part 5: Discussion Questions

Write your answers in the markdown cells below.

### Discussion 1

**Scenario:** You fit an ARMA(2,1) model to a financial return series. The estimated parameters are:
- $\hat{\phi}_1 = 0.3$, $\hat{\phi}_2 = 0.4$, $\hat{\theta}_1 = -0.2$

The Ljung-Box test on residuals gives p-value = 0.08.

**Questions:**
1. Check if the AR part is stationary (hint: check if $\phi_1 + \phi_2 < 1$ and $\phi_2 - \phi_1 < 1$ and $|\phi_2| < 1$).
2. Is the MA part invertible?
3. Based on the Ljung-Box test, is the model adequate at the 5% significance level?

**Your Answer:**

*Write your answer here...*

### Discussion 2

**Scenario:** Your colleague says: "I always use AIC to select models because it gives me better forecasts. BIC is too conservative."

**Questions:**
1. What is the key difference between AIC and BIC in terms of the penalty term?
2. Under what conditions might BIC be preferred over AIC?
3. Is your colleague's statement always correct? Why or why not?

**Your Answer:**

*Write your answer here...*

---
# Summary

## Key Takeaways from Today's Seminar

1. **Lag Operator** - $(1-L)X_t = X_t - X_{t-1}$ (first difference)

2. **Stationarity & Invertibility**:
   - AR stationarity: roots of $\phi(z)=0$ outside unit circle
   - MA invertibility: roots of $\theta(z)=0$ outside unit circle

3. **Model Identification**:
   - AR(p): ACF decays, PACF cuts off after lag p
   - MA(q): ACF cuts off after lag q, PACF decays
   - ARMA(p,q): Both ACF and PACF decay

4. **Model Selection**:
   - AIC = -2log(L) + 2k (lighter penalty; often preferred when the goal is prediction)
   - BIC = -2log(L) + k*log(n) (heavier penalty once n ≥ 8; tends to recover the true order in large samples)

5. **Diagnostics**:
   - Ljung-Box test: H0 = no residual autocorrelation up to the tested lag (consistent with white noise)
   - Check ACF of residuals (should be within confidence bands)
   - Q-Q plot for normality

6. **Forecasting**:
   - Stationary ARMA forecasts revert to unconditional mean
   - Confidence intervals widen with horizon
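
The lag-operator identities from takeaway 1 and Quiz 1 can be checked numerically. This is a minimal sketch on an arbitrary illustrative series, comparing the expanded forms of $(1-L)X_t$ and $(1-L)^2 X_t$ with `np.diff`.

```python
import numpy as np

x = np.array([3.0, 5.0, 4.0, 8.0, 7.0, 11.0])   # arbitrary illustrative series

first_diff = x[1:] - x[:-1]                      # (1 - L) X_t = X_t - X_{t-1}
second_diff = x[2:] - 2 * x[1:-1] + x[:-2]       # (1 - L)^2 X_t = X_t - 2X_{t-1} + X_{t-2}

# np.diff applies the same operators
assert np.array_equal(first_diff, np.diff(x))
assert np.array_equal(second_diff, np.diff(x, n=2))

print(first_diff)
print(second_diff)
```

Each application of $(1-L)$ shortens the series by one observation, which is why the second difference has two fewer values than the original.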

## Next Seminar
ARIMA models, seasonal ARIMA (SARIMA), and advanced forecasting techniques