<div style="font-size:22px; margin-top:20px; margin-bottom:15px">
<h2>EGARCH Model for Volatility Dynamics in Digital Asset Markets</h2>
</div>

<div style="font-size:18px; line-height:1.6">

The **Exponential Generalized Autoregressive Conditional Heteroskedasticity (EGARCH)** model is a nonlinear conditional-volatility model proposed by Nelson (1991).  
It belongs to the **GARCH family**, but unlike the standard GARCH specification, it models the **logarithm of the conditional variance**, ensuring positivity without imposing explicit parameter constraints.

In the context of **crypto-asset markets**, where volatility clustering and asymmetric responses to shocks are prominent, EGARCH captures the **leverage effect**: the empirical observation that negative returns tend to increase future volatility more than positive returns of equal magnitude.

</div>

---

<div style="font-size:22px; margin-top:15px">
Mathematical Specification
</div>

<div style="font-size:20px; margin-top:10px; margin-bottom:10px">
The EGARCH(p, o, q) model can be expressed as:
</div>

<div style="font-size:22px; margin-top:15px">
$$
\begin{aligned}
r_t &= \mu + \epsilon_t, \quad \epsilon_t = \sigma_t z_t, \quad z_t \sim i.i.d.(0,1) \\\\
\ln(\sigma_t^2) &= \omega 
+ \sum_{i=1}^{q} \beta_i \ln(\sigma_{t-i}^2)
+ \sum_{j=1}^{p} \alpha_j \left( |z_{t-j}| - \mathbb{E}|z_{t-j}| \right)
+ \sum_{k=1}^{o} \gamma_k z_{t-k}
\end{aligned}
$$
</div>

<div style="font-size:18px; margin-top:10px; line-height:1.6">

**Where:**

\\[
\begin{array}{ll}
r_t &\text{: Asset return at time } t \; (\text{e.g. log-return of ETH/USDT}) \\\\
\mu &\text{: Unconditional mean of returns} \\\\
\sigma_t^2 &\text{: Conditional variance of the innovation term } \epsilon_t \\\\
z_t &\text{: Standardized innovations, assumed i.i.d. with zero mean and unit variance} \\\\
\omega &\text{: Constant term controlling the long-run volatility level} \\\\
\beta_i &\text{: Autoregressive coefficients describing the persistence of past volatility} \\\\
\alpha_j &\text{: Coefficients capturing the magnitude effect (response to absolute shocks)} \\\\
\gamma_k &\text{: Coefficients representing asymmetry (leverage effect), allowing negative shocks to have stronger impacts on volatility}
\end{array}
\\]
</div>
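
To make the recursion concrete, the sketch below filters conditional variances for the simplest case, EGARCH(1,1,1), in plain Python. It is an illustration under stated assumptions rather than the notebook's implementation: the function name `egarch_111_variance`, the parameter values, and the choice to start from the sample variance are all hypothetical, and the innovations are assumed standard normal so that E|z| = sqrt(2/π).

```python
import math

# E|z| for z ~ N(0, 1); a different innovation distribution needs a different constant.
E_ABS_Z_NORMAL = math.sqrt(2.0 / math.pi)


def egarch_111_variance(returns, mu, omega, alpha, gamma, beta, init_var=None):
    """Filter conditional variances from an EGARCH(1,1,1) recursion."""
    if init_var is None:
        # Assumption: initialise with the sample variance of the demeaned returns.
        init_var = sum((r - mu) ** 2 for r in returns) / len(returns)
    log_var = math.log(init_var)
    variances = []
    for r in returns:
        variances.append(math.exp(log_var))          # sigma_t^2, always positive
        z = (r - mu) / math.sqrt(variances[-1])      # standardized innovation z_t
        log_var = (
            omega
            + beta * log_var
            + alpha * (abs(z) - E_ABS_Z_NORMAL)      # magnitude effect
            + gamma * z                              # asymmetry (leverage) effect
        )
    return variances


# Two shocks of equal size and opposite sign, with illustrative parameters.
params = dict(mu=0.0, omega=-0.5, alpha=0.2, gamma=-0.1, beta=0.95, init_var=1e-4)
down = egarch_111_variance([-0.02, 0.0], **params)
up = egarch_111_variance([0.02, 0.0], **params)
print(down[1] > up[1])  # True: with gamma < 0 the negative shock raises variance more
```

Because the recursion is written for the log-variance, `math.exp` guarantees a positive variance for any parameter values, which is the property noted in the introduction.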

---

<div style="font-size:22px; margin-top:20px">
EGARCH in the Context of Digital Assets
</div>

<div style="font-size:18px; line-height:1.6; margin-top:10px">

The **EGARCH model** is particularly suited for **digital asset volatility modeling**, as cryptocurrencies exhibit:
- **High-frequency volatility clustering**, driven by microstructure noise and continuous trading.  
- **Strong asymmetry** in volatility response due to speculative deleveraging and liquidation cascades.  
- **Non-stationary periods** with volatility bursts during market stress events (e.g. exchange hacks, regulatory news).

By applying EGARCH to high-resolution OHLCV data (e.g. 5-minute ETH candles), we estimate the **conditional volatility process** that underlies return dynamics.  
This allows for better **risk forecasting**, **VaR estimation**, and **dynamic position sizing** within a quantitative trading or risk management framework.

</div>
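
As a rough illustration of how a volatility forecast feeds these uses, the sketch below computes log-returns from candle closes, a one-step parametric VaR, and a volatility-targeted position weight. The helper names, the sample prices, and the forecast value are hypothetical, and the VaR assumes normal innovations for simplicity; a fat-tailed distribution would need its own quantile.

```python
import math
from statistics import NormalDist


def log_returns(closes):
    """Log-returns between consecutive candle closes."""
    return [math.log(curr / prev) for prev, curr in zip(closes, closes[1:])]


def parametric_var(mu, sigma, level=0.99):
    """One-step value-at-risk as a positive loss, assuming normal innovations."""
    z_quantile = NormalDist().inv_cdf(1.0 - level)  # negative for level > 0.5
    return -(mu + sigma * z_quantile)


def vol_target_weight(sigma, target_sigma, max_weight=1.0):
    """Scale exposure inversely to forecast volatility, capped at max_weight."""
    return min(max_weight, target_sigma / sigma)


closes = [2300.0, 2305.0, 2298.0, 2310.0]  # hypothetical 5-minute closes
returns = log_returns(closes)
sigma_forecast = 0.002                      # hypothetical one-step volatility forecast
var_99 = parametric_var(mu=0.0, sigma=sigma_forecast, level=0.99)
weight = vol_target_weight(sigma_forecast, target_sigma=0.001)
print(f"99% VaR: {var_99:.4%}, weight: {weight:.2f}")
```

The weight halves when forecast volatility is twice the target, which is the basic mechanism behind dynamic position sizing.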

---

<div style="font-size:22px; margin-top:20px">
Interpretation of Parameters
</div>

<div style="font-size:18px; line-height:1.6; margin-top:10px">

\\[
\begin{array}{ll}
\text{Large } \beta_i &\text{: High volatility persistence (slow mean reversion)} \\\\
\text{Large } \alpha_j &\text{: Strong reaction to recent shocks (volatility-of-volatility)} \\\\
\text{Negative } \gamma_k &\text{: Leverage effect, i.e. volatility rises more after negative returns} \\\\
\text{Constant } \omega &\text{: Determines the long-run equilibrium level of log-variance}
\end{array}
\\]

</div>
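
These readings can be turned into numbers. Since the shock terms have zero mean, the long-run level of the log-variance is ω / (1 − Σβ) whenever Σβ < 1, and with a single β the half-life of a log-variance shock is ln(0.5) / ln(β). The sketch below evaluates both, plus the asymmetric response to a unit shock; the function names and parameter values are illustrative assumptions, and E|z| again assumes normal innovations.

```python
import math


def long_run_log_variance(omega, betas):
    """Unconditional mean of ln(sigma_t^2); only defined when sum(betas) < 1."""
    persistence = sum(betas)
    if not persistence < 1.0:
        raise ValueError("log-variance does not mean-revert when sum(betas) >= 1")
    return omega / (1.0 - persistence)


def half_life(beta):
    """Bars needed for a log-variance shock to decay by half (single-beta case)."""
    return math.log(0.5) / math.log(beta)


def shock_response(z, alpha, gamma, e_abs_z=math.sqrt(2.0 / math.pi)):
    """Contribution of a standardized shock z to next-period log-variance."""
    return alpha * (abs(z) - e_abs_z) + gamma * z


# Illustrative values, not estimates from the notebook.
print(round(long_run_log_variance(-0.3, [0.97]), 2))  # -10.0
print(round(half_life(0.97), 1))                      # 22.8
asymmetry = shock_response(-1.0, 0.2, -0.1) - shock_response(1.0, 0.2, -0.1)
print(round(asymmetry, 2))                            # 0.2, i.e. -2 * gamma
```

The last line shows why the sign of γ matters: the gap between the responses to a negative and a positive unit shock is exactly −2γ, so a negative γ makes bad news more volatility-inducing.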

<div style="font-size:18px; line-height:1.6; margin-top:15px">

When fitted on crypto returns, the EGARCH model captures the **stylized facts** of digital markets:  
fat-tailed return distributions, volatility clustering, and asymmetric feedback loops between returns and conditional variance.

</div>

---

<div style="font-size:18px; margin-top:15px; margin-bottom:25px">
In this notebook, we implement and estimate an EGARCH(p, o, q) model for Ethereum intraday returns (the run shown below uses p=1, o=2, q=2), visualize conditional volatility, and compare it against realized volatility measures derived from rolling standard deviations of log-returns.
</div>
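
The realized-volatility benchmark mentioned above can be sketched as a rolling sample standard deviation of log-returns. This is a minimal standard-library version with a hypothetical helper name, window length, and sample data; in a pandas workflow `Series.rolling(window).std()` would play the same role.

```python
import statistics


def rolling_std(values, window):
    """Rolling sample standard deviation; None until a full window is available."""
    out = [None] * (window - 1)
    for end in range(window, len(values) + 1):
        out.append(statistics.stdev(values[end - window:end]))
    return out


# Hypothetical 5-minute log-returns; a 12-bar window would span one hour.
sample_returns = [0.001, -0.002, 0.0015, -0.0005, 0.003, -0.001]
realized_vol = rolling_std(sample_returns, window=3)
print(realized_vol)
```

The leading `None` entries keep the output aligned with the input, so it can be compared bar by bar with the model's conditional volatility.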


In [1]:
import os
import sys

# set the project root directory as the base path
project_root = os.path.abspath(os.path.join(".."))  # one level above notebooks/
if project_root not in sys.path:
    sys.path.append(project_root)


In [2]:
# ===========================================
# 03_run_EGARCH.ipynb
# -------------------------------------------
# Notebook for training the EGARCH model
# with arbitrary hyperparameters.
# ===========================================

import pandas as pd
import numpy as np
from src.data_loader import load_train_test_data
from src.egarch_model import fit_egarch, save_model


# --- 1️⃣ Load data (train only) ---
train_df = load_train_test_data(load_test=False)
print("\n✅ Training data loaded.")
print(f"Train shape: {train_df.shape}")
print(f"Time range: {train_df['open_time'].min()} → {train_df['open_time'].max()}")


# --- 2️⃣ Hyperparameter configuration ---
p = 1    # symmetric shock (ARCH) lags -> alpha terms
o = 2    # asymmetry (leverage) lags -> gamma terms
q = 2    # lagged log-variance (GARCH) lags -> beta terms
dist = "t"        # 'normal', 't', 'skewt', 'ged'
mean = "constant" # 'Zero', 'Constant', 'AR', 'ARX'

print(f"\n🔧 Model configuration: EGARCH(p={p}, o={o}, q={q}), dist='{dist}', mean='{mean}'")


# --- 3️⃣ Train the model (on train_df) ---
model, scale = fit_egarch(
    train_df["log_return"],
    p=p,
    o=o,
    q=q,
    dist=dist,
    mean=mean
)


# --- 4️⃣ Save the trained model ---
save_model(model, scale, f"egarch_ETH_5m_p{p}_o{o}_q{q}_{dist}.pkl")


print("\n✅ Model saved successfully.")
print("\n🧩 Fit summary:")
print(model.summary())
print(f"\nScaling factor (train std): {scale:.2e}")


✅ Training data loaded successfully.
Train shape: (17279, 11)
Train range: 2024-01-01 00:05:00 → 2024-02-29 23:55:00

✅ Training data loaded.
Train shape: (17279, 11)
Time range: 2024-01-01 00:05:00 → 2024-02-29 23:55:00

🔧 Model configuration: EGARCH(p=1, o=2, q=2), dist='t', mean='constant'
🔧 Training EGARCH(p=1, o=2, q=2, dist=t) model...
✅ Training complete.

Model parameters:
mu          0.009133
omega       0.003978
alpha[1]    0.288801
gamma[1]   -0.015928
gamma[2]    0.018080
beta[1]     0.665637
beta[2]     0.310143
nu          5.355218
Name: params, dtype: float64
💾 Saved model and scale to file: C:\GitHubRepo\data-science\projects\ETH-volitality-fotecasting\logs\egarch_ETH_5m_p1_o2_q2_t.pkl

✅ Model saved successfully.

🧩 Fit summary:
                        Constant Mean - EGARCH Model Results                        
Dep. Variable:                   log_return   R-squared:                       0.000
Mean Model:                   Constant Mean   Ad