# Module 4: Machine Learning for Finance - Time Series Analysis

This notebook covers time series analysis techniques, including ARIMA and GARCH models, which are widely used in finance.

## 1. Time Series Analysis

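**Time series data** is a sequence of data points recorded in time order. Financial data such as stock prices and returns are classic examples. A first diagnostic for any series is its autocorrelation function (ACF) and partial autocorrelation function (PACF), which show how strongly the series depends on its own lags.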

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Simulate 1,000 i.i.d. standard normal "returns" (white noise)
np.random.seed(42)
n_samples = 1000
returns = np.random.randn(n_samples)

# Plot ACF and PACF; for white noise, lags >= 1 should mostly stay inside the confidence band
fig, ax = plt.subplots(1, 2, figsize=(12, 4))
plot_acf(returns, ax=ax[0])
plot_pacf(returns, ax=ax[1])
plt.show()
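To make the ACF plot less of a black box, the sample autocorrelation can be computed by hand. The sketch below is a minimal NumPy version, not the statsmodels implementation; the helper `sample_acf`, the variable `noise`, and the ±1.96/√n band (a common large-sample approximation under white noise) are assumptions introduced here for illustration.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x at lags 0..max_lag (biased estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / denom for k in range(max_lag + 1)])

# Hypothetical white-noise sample, independent of the `returns` array above
rng = np.random.default_rng(42)
noise = rng.standard_normal(1000)

acf_vals = sample_acf(noise, max_lag=20)

# Approximate 95% band for the ACF of white noise (large-sample assumption)
band = 1.96 / np.sqrt(len(noise))

# Count how many of lags 1..20 fall outside the band
n_outside = int(np.sum(np.abs(acf_vals[1:]) > band))
print(f"ACF(0) = {acf_vals[0]:.1f}, band = +/-{band:.3f}")
print(f"Lags 1-20 outside the band: {n_outside}")
```

For genuinely uncorrelated data, roughly one lag in twenty is expected to fall outside the band by chance, so an occasional spike is not evidence of structure.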

## 2. ARIMA Models

**ARIMA (Autoregressive Integrated Moving Average)** models explain a time series using its own past values (the autoregressive lags) and past forecast errors (the moving-average terms), after differencing the series as many times as needed to make it stationary.

In [None]:
from statsmodels.tsa.arima.model import ARIMA

# Fit an ARIMA model with order=(p, d, q) = (1, 0, 1):
# one AR lag, no differencing, one MA lag
model = ARIMA(returns, order=(1, 0, 1))
results = model.fit()
print(results.summary())
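Because the `returns` array above is white noise, the fitted AR and MA coefficients should come out close to zero. To see what the two components actually do, the sketch below simulates an ARMA(1,1) series directly from its recursion and compares the sample lag-1 autocorrelation with the theoretical value. The function `simulate_arma11` and the parameter values `phi_true` and `theta_true` are hypothetical choices for illustration only.

```python
import numpy as np

def simulate_arma11(phi, theta, n, rng, burn_in=500):
    """Simulate x_t = phi * x_{t-1} + eps_t + theta * eps_{t-1} with N(0, 1) shocks."""
    total = n + burn_in
    eps = rng.standard_normal(total)
    x = np.zeros(total)
    for t in range(1, total):
        x[t] = phi * x[t - 1] + eps[t] + theta * eps[t - 1]
    # Drop the burn-in so the arbitrary start value x[0] = 0 has little effect
    return x[burn_in:]

rng = np.random.default_rng(0)
phi_true, theta_true = 0.6, 0.3  # hypothetical AR and MA coefficients
series = simulate_arma11(phi_true, theta_true, n=20_000, rng=rng)

# Sample lag-1 autocorrelation
xc = series - series.mean()
rho1_sample = np.dot(xc[:-1], xc[1:]) / np.dot(xc, xc)

# Theoretical lag-1 autocorrelation of a stationary ARMA(1,1) process
rho1_theory = (
    (1 + phi_true * theta_true) * (phi_true + theta_true)
    / (1 + 2 * phi_true * theta_true + theta_true**2)
)

print(f"sample rho(1) = {rho1_sample:.3f}, theoretical rho(1) = {rho1_theory:.3f}")
```

Passing `series` to `ARIMA(series, order=(1, 0, 1))` in place of `returns` should, under these assumptions, give coefficient estimates near the chosen values rather than near zero.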

## 3. GARCH Models

**GARCH (Generalized Autoregressive Conditional Heteroskedasticity)** models describe volatility clustering: periods of high volatility tend to be followed by more high volatility, and calm periods tend to be followed by more calm. They do this by letting the conditional variance of the series depend on past squared shocks and past conditional variances.

In [None]:
from arch import arch_model

# Fit a GARCH(1, 1) model (arch_model uses a constant mean by default)
model = arch_model(returns, vol="Garch", p=1, q=1)
results = model.fit()
print(results.summary())
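The `returns` array fitted above has constant variance, so it shows no volatility clustering. The sketch below simulates a GARCH(1,1) process with NumPy to show the effect the model is designed to capture: the returns themselves are nearly uncorrelated, while their squares are positively autocorrelated. The functions `simulate_garch11` and `lag1_autocorr` and the values of `omega`, `alpha`, and `beta` are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_garch11(omega, alpha, beta, n, rng, burn_in=1000):
    """Simulate r_t = sigma_t * z_t with
    sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2 and z_t ~ N(0, 1)."""
    total = n + burn_in
    z = rng.standard_normal(total)
    sigma2 = np.empty(total)
    r = np.empty(total)
    # Start from the unconditional variance (requires alpha + beta < 1)
    sigma2[0] = omega / (1 - alpha - beta)
    r[0] = np.sqrt(sigma2[0]) * z[0]
    for t in range(1, total):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
        r[t] = np.sqrt(sigma2[t]) * z[t]
    return r[burn_in:], sigma2[burn_in:]

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    xc = x - x.mean()
    return np.dot(xc[:-1], xc[1:]) / np.dot(xc, xc)

rng = np.random.default_rng(1)
omega, alpha, beta = 0.1, 0.1, 0.8  # hypothetical GARCH(1,1) parameters
garch_returns, cond_var = simulate_garch11(omega, alpha, beta, n=50_000, rng=rng)

uncond_var = omega / (1 - alpha - beta)
acf_returns = lag1_autocorr(garch_returns)
acf_squared = lag1_autocorr(garch_returns**2)

print(f"sample variance: {garch_returns.var():.3f} (unconditional: {uncond_var:.3f})")
print(f"lag-1 autocorrelation of returns:         {acf_returns:.3f}")
print(f"lag-1 autocorrelation of squared returns: {acf_squared:.3f}")
```

Plotting `garch_returns` next to `np.sqrt(cond_var)` makes the clustering visible: bursts of large moves line up with stretches of elevated conditional volatility.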

## 📝 Guided Exercises with Auto-Validation

Master time series analysis techniques!

### Exercise 1: AR(1) Process Properties (Intermediate)

Calculate properties of an AR(1) process: X_t = φX_{t-1} + ε_t.

In [None]:
# Exercise 1: AR(1) Process
import numpy as np

# Given: X_t = φ * X_{t-1} + ε_t
# where ε_t ~ N(0, σ²)
phi = 0.7      # AR coefficient
sigma_eps = 1.0  # Noise std dev

# TODO: Check if process is stationary
# Stationary if |φ| < 1
is_stationary = None

# TODO: Calculate unconditional variance (if stationary)
# Var[X_t] = σ²/(1 - φ²)
unconditional_variance = None

# TODO: Calculate autocorrelation at lag 1
# Corr(X_t, X_{t-1}) = φ
autocorr_lag1 = None

# TODO: Calculate autocorrelation at lag 2
# Corr(X_t, X_{t-2}) = φ²
autocorr_lag2 = None

# TODO: Half-life of shocks (how long for autocorr to decay to 0.5)
# Solve: φ^k = 0.5 → k = ln(0.5)/ln(φ)
half_life = None

# ============= AUTO-VALIDATION (DO NOT MODIFY) =============
assert is_stationary is not None, "❌ Check stationarity!"
assert unconditional_variance is not None, "❌ Calculate unconditional variance!"
assert autocorr_lag1 is not None, "❌ Calculate ACF(1)!"
assert autocorr_lag2 is not None, "❌ Calculate ACF(2)!"
assert half_life is not None, "❌ Calculate half-life!"
assert is_stationary == (abs(phi) < 1), f"❌ Stationary if |φ| < 1"
assert is_stationary, "❌ Process should be stationary!"
expected_var = sigma_eps**2 / (1 - phi**2)
assert np.isclose(unconditional_variance, expected_var, rtol=0.01), f"❌ Var = σ²/(1-φ²)"
assert autocorr_lag1 == phi, f"❌ ACF(1) = φ"
assert autocorr_lag2 == phi**2, f"❌ ACF(2) = φ²"
expected_half_life = np.log(0.5) / np.log(phi)
assert np.isclose(half_life, expected_half_life, rtol=0.01), f"❌ Half-life = ln(0.5)/ln(φ)"
print("✅ Exercise 1 Complete!")
print(f"   Stationary: {is_stationary}")
print(f"   Unconditional Variance: {unconditional_variance:.4f}")
print(f"   ACF(1): {autocorr_lag1:.3f}")
print(f"   ACF(2): {autocorr_lag2:.3f}")
print(f"   Half-life: {half_life:.2f} periods")
print(f"   Interpretation: AR(1) shocks decay exponentially at rate φ!")
# =========================================================
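As an optional sanity check, the closed-form AR(1) properties can be compared against a simulation. The sketch below deliberately uses different parameters from the exercise (`demo_phi` and `demo_sigma` are hypothetical values), and the helpers `simulate_ar1` and `sample_autocorr` are illustrative names rather than part of any library.

```python
import numpy as np

def simulate_ar1(phi, sigma_eps, n, rng, burn_in=1000):
    """Simulate x_t = phi * x_{t-1} + eps_t with eps_t ~ N(0, sigma_eps^2)."""
    total = n + burn_in
    eps = rng.normal(0.0, sigma_eps, size=total)
    x = np.zeros(total)
    for t in range(1, total):
        x[t] = phi * x[t - 1] + eps[t]
    # Drop the burn-in so the arbitrary start value has little effect
    return x[burn_in:]

def sample_autocorr(x, lag):
    """Sample autocorrelation at a positive lag."""
    xc = x - x.mean()
    return np.dot(xc[:-lag], xc[lag:]) / np.dot(xc, xc)

rng = np.random.default_rng(7)
demo_phi, demo_sigma = 0.5, 2.0  # hypothetical values, not the exercise's
path = simulate_ar1(demo_phi, demo_sigma, n=100_000, rng=rng)

# Closed-form values for a stationary AR(1) process
theory_var = demo_sigma**2 / (1 - demo_phi**2)
theory_acf1 = demo_phi
theory_acf2 = demo_phi**2

print(f"variance: sample {path.var():.3f} vs theory {theory_var:.3f}")
print(f"ACF(1):   sample {sample_autocorr(path, 1):.3f} vs theory {theory_acf1:.3f}")
print(f"ACF(2):   sample {sample_autocorr(path, 2):.3f} vs theory {theory_acf2:.3f}")
```

The sample values should agree with the formulas only approximately, and the agreement tightens as `n` grows.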