STANDARD PORTFOLIO OPTIMIZATION 

With cvxpy.

Portfolio selection via:
- mean-variance
- risk parity

Expected returns estimated via:
- zero assumption (all expected returns set to zero)
- sample mean
- trading signals (e.g. sentiment analysis)

Covariance matrix estimated via:
- sample covariance
- sample covariance de-noised via estimation within a PC factor model
- EWMA (on PCs and residuals)

Additional steps:
- consider modelling dynamic correlations and volatilities with DCC and GARCH models

## Setup: Risk Parity Portfolio

- **Returns covariance matrix**  
  $$
  \Sigma \in \mathbb{R}^{N \times N}
  $$

- **Portfolio weights**  
  $$
  w \in \mathbb{R}^N, \quad \sum_{i=1}^N w_i = 1, \quad w_i \geq 0
  $$

- **Portfolio variance**  
  $$
  \sigma_p^2 = w^\top \Sigma w
  $$

- **Marginal risk contribution (MRC) of asset $i$**  
  $$
  \text{MRC}_i = \frac{\partial \sigma_p}{\partial w_i} 
  = \frac{(\Sigma w)_i}{\sigma_p}
  $$

- **Risk contribution of asset $i$**  
  $$
  \text{RC}_i = w_i \cdot \text{MRC}_i 
  = \frac{w_i (\Sigma w)_i}{\sigma_p}
  $$

- **Risk parity condition**  
  $$
  \text{RC}_1 = \text{RC}_2 = \cdots = \text{RC}_N
  $$
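
The definitions above can be checked numerically. The following is a minimal numpy sketch, not necessarily the cvxpy formulation used elsewhere in this project. It assumes one common convex reformulation of long-only risk parity: minimise $\tfrac{1}{2} y^\top \Sigma y - \sum_i \log y_i$ over $y > 0$, then rescale $y$ to sum to one. The function names, the damped Newton solver, and the example covariance matrix are all illustrative assumptions.

```python
import numpy as np

def risk_contributions(w, Sigma):
    """RC_i = w_i * (Sigma w)_i / sigma_p, as defined above."""
    w = np.asarray(w, dtype=float)
    sigma_p = np.sqrt(w @ Sigma @ w)
    return w * (Sigma @ w) / sigma_p

def risk_parity_weights(Sigma, tol=1e-10, max_iter=200):
    """Long-only risk parity weights via a log-barrier problem (assumed formulation).

    Minimises 0.5 * y' Sigma y - sum(log y_i) over y > 0. At the optimum
    y_i * (Sigma y)_i = 1 for every i, so all risk contributions are equal;
    rescaling y to sum to one preserves that equality.
    """
    Sigma = np.asarray(Sigma, dtype=float)
    y = 1.0 / np.sqrt(np.diag(Sigma))  # inverse-volatility starting point
    for _ in range(max_iter):
        grad = Sigma @ y - 1.0 / y
        hess = Sigma + np.diag(1.0 / y**2)
        step = np.linalg.solve(hess, grad)
        decrement = np.sqrt(max(grad @ step, 0.0))  # Newton decrement
        if decrement < tol:
            break
        # Damped Newton step: the 1 / (1 + decrement) factor keeps y positive.
        y = y - step / (1.0 + decrement)
    return y / y.sum()

# Illustrative 3-asset covariance built from assumed vols and correlations.
vols = np.array([0.10, 0.20, 0.30])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.5],
                 [0.1, 0.5, 1.0]])
Sigma = np.outer(vols, vols) * corr

w = risk_parity_weights(Sigma)
rc = risk_contributions(w, Sigma)
print("weights:", w)
print("risk contributions:", rc)
```

The printed risk contributions should be equal across assets up to solver tolerance, and lower-volatility assets should receive larger weights.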




### Why the risk decomposition is exact

Volatility is

$$
\sigma_p(w) = \sqrt{w^\top \Sigma w}.
$$

This is a **homogeneous function of degree 1** in $w$:  
if you scale $w$ by $c$, volatility scales by $|c|$.

---

For any homogeneous degree-1 function $f$,  
**Euler’s homogeneous function theorem** says:

$$
f(w) = \sum_{i=1}^N w_i \, \frac{\partial f(w)}{\partial w_i}.
$$

---

Apply this to $f(w) = \sigma_p(w)$:

$$
\sigma_p = \sum_{i=1}^N w_i \cdot \text{MRC}_i.
$$

---

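Since $\text{RC}_i = w_i \cdot \text{MRC}_i$, the risk contributions sum exactly to total volatility. Dividing through by $\sigma_p$ gives the relative risk contributions, which sum to one:

$$
\sum_{i=1}^N \text{RC}_i = \sigma_p,
\qquad
\sum_{i=1}^N \frac{\text{RC}_i}{\sigma_p}
= \sum_{i=1}^N \frac{w_i \cdot \text{MRC}_i}{\sigma_p}
= 1.
$$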

---

So risk contributions (RCs) aren’t a linear approximation,  
they’re an **exact partition of total risk**.
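
A quick numerical check of the Euler decomposition, as a sketch under assumed inputs: the covariance matrix and weights below are randomly generated for illustration only. The analytic $\text{MRC}_i = (\Sigma w)_i / \sigma_p$ is compared against a central finite-difference gradient of $\sigma_p(w)$, and the contributions $w_i \cdot \text{MRC}_i$ are summed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random positive-definite covariance and long-only weights (illustrative only).
A = rng.normal(size=(4, 4))
Sigma = A @ A.T + 0.1 * np.eye(4)
w = rng.dirichlet(np.ones(4))

def vol(weights):
    """Portfolio volatility sigma_p(w) = sqrt(w' Sigma w)."""
    return np.sqrt(weights @ Sigma @ weights)

sigma_p = vol(w)

# Analytic marginal risk contributions.
mrc = Sigma @ w / sigma_p

# Central finite-difference gradient of sigma_p with respect to each weight.
eps = 1e-6
mrc_fd = np.array([(vol(w + eps * e) - vol(w - eps * e)) / (2 * eps)
                   for e in np.eye(4)])

rc = w * mrc
print("max |analytic - finite difference|:", np.abs(mrc - mrc_fd).max())
print("sum of w_i * MRC_i:", rc.sum(), " sigma_p:", sigma_p)
print("sum of relative contributions:", (rc / sigma_p).sum())
```

The sum of $w_i \cdot \text{MRC}_i$ matches $\sigma_p$ to floating-point precision, not just to first order, which is the content of Euler's theorem here.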


# EWMA on Returns vs. EWMA on Principal Components

## 1. EWMA directly on returns
We update the full covariance matrix directly:
$$
\Sigma_t = \lambda \Sigma_{t-1} + (1-\lambda) r_t r_t^\top,
$$
where $r_t \in \mathbb{R}^N$ is the vector of returns (demeaned, or assumed to have zero mean) and $\lambda \in (0,1)$ is the decay factor.

- Pros: simple, captures variances and covariances dynamically.  
- Cons: each step updates $N(N+1)/2$ distinct covariance entries from a single rank-one observation $r_t r_t^\top$, i.e. only $N$ numbers.  
  → Noisy and unstable in high dimension.
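
The recursion above can be sketched in a few lines of numpy. The function name `ewma_cov`, the warm-up initialisation, and the default $\lambda = 0.94$ are assumptions made for illustration; returns are treated as zero-mean, matching the $r_t r_t^\top$ update.

```python
import numpy as np

def ewma_cov(returns, lam=0.94, warmup=20):
    """EWMA covariance: Sigma_t = lam * Sigma_{t-1} + (1 - lam) * r_t r_t'.

    `returns` is a (T, N) array. The recursion is initialised (an assumption)
    with the zero-mean sample covariance of the first `warmup` rows.
    """
    returns = np.asarray(returns, dtype=float)
    head = returns[:warmup]
    Sigma = head.T @ head / len(head)
    for r in returns[warmup:]:
        Sigma = lam * Sigma + (1.0 - lam) * np.outer(r, r)
    return Sigma

# Usage on simulated daily returns (illustrative data only).
rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=(250, 4))
Sigma_t = ewma_cov(returns, lam=0.94)
print(np.sqrt(np.diag(Sigma_t)))  # EWMA volatilities per asset
```

Because every update adds a positive semi-definite rank-one term, the estimate stays positive semi-definite as long as the initial matrix is.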

---

## 2. EWMA on Principal Components (factorized approach)
Decompose returns via PCA:
$$
r_t = L f_t + \epsilon_t,
$$
where $f_t \in \mathbb{R}^k$ are the top $k$ principal components,  
$L \in \mathbb{R}^{N \times k}$ is the loading matrix, and $\epsilon_t$ are residuals.

- Apply EWMA only to factor returns $f_t$:
$$
\Sigma_{f,t} = \lambda \Sigma_{f,t-1} + (1-\lambda) f_t f_t^\top,
$$

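- Model residual variances with a diagonal EWMA (here with the same decay $\lambda$), $\sigma^2_{\epsilon,i,t} = \lambda \sigma^2_{\epsilon,i,t-1} + (1-\lambda)\,\epsilon_{i,t}^2$:
$$
\Sigma_{\epsilon,t} = \mathrm{diag}(\sigma^2_{\epsilon,1,t}, \dots, \sigma^2_{\epsilon,N,t}).
$$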

- Reconstruct total covariance:
$$
\Sigma_t = L \, \Sigma_{f,t} L^\top + \Sigma_{\epsilon,t}.
$$

- Pros: dimension reduction, more robust estimation, mimics industry risk models.  
- Cons: depends on factor specification, PCA loadings must be recomputed periodically.
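
The factorized approach can be sketched as follows. This is a simplified illustration under stated assumptions: loadings are estimated once from the full sample (in practice they would be re-estimated periodically on past data only), the same $\lambda$ is used for factors and residuals, and the recursions start from full-sample values. The function name `pca_ewma_cov` is hypothetical.

```python
import numpy as np

def pca_ewma_cov(returns, k=3, lam=0.94):
    """Covariance estimate L Sigma_f L' + diag(residual variances) with EWMA updates."""
    X = np.asarray(returns, dtype=float)
    X = X - X.mean(axis=0)                 # demean each asset
    T, N = X.shape

    # PCA via eigendecomposition of the sample covariance (eigh sorts ascending).
    eigval, eigvec = np.linalg.eigh(X.T @ X / T)
    L = eigvec[:, ::-1][:, :k]             # N x k loadings of the top-k PCs
    F = X @ L                              # T x k factor returns f_t
    E = X - F @ L.T                        # T x N residuals eps_t

    # Start the recursions at full-sample values (an in-sample simplification).
    Sigma_f = np.diag(eigval[::-1][:k])
    var_e = E.var(axis=0)

    for f, e in zip(F, E):
        Sigma_f = lam * Sigma_f + (1.0 - lam) * np.outer(f, f)  # full EWMA on factors
        var_e = lam * var_e + (1.0 - lam) * e**2                # diagonal EWMA on residuals

    return L @ Sigma_f @ L.T + np.diag(var_e)

# Usage on simulated returns (illustrative data only).
rng = np.random.default_rng(1)
returns = rng.normal(scale=0.01, size=(500, 10))
Sigma_t = pca_ewma_cov(returns, k=3)
print(Sigma_t.shape)
```

Only a $k \times k$ matrix and $N$ residual variances are updated per step, and the diagonal residual term keeps the reconstructed matrix positive definite whenever all residual variances are positive.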

---

## 3. Rule of Thumb
- Small universe ($N < 15$): use EWMA directly on returns.  
- Medium/large universe ($N > 30$): EWMA on PCs/factors is more stable.  
- Very large universes ($N \gg 100$): direct EWMA on returns is impractical; use a factor model.  


# Forecasting the One-Step Ahead Covariance Matrix

## 1. Baseline: DCC–GARCH
For a universe of assets with returns $r_{i,t}$, the **Dynamic Conditional Correlation (DCC)–GARCH** model is a standard choice.

### Step A: Univariate GARCH
For each asset $i$:
$$
r_{i,t} = \sigma_{i,t} z_{i,t}, \quad z_{i,t} \sim \text{i.i.d.}(0,1)
$$

$$
\sigma_{i,t}^2 = \omega_i + \alpha_i r_{i,t-1}^2 + \beta_i \sigma_{i,t-1}^2
$$

### Step B: DCC correlations
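Let $z_t = (z_{1,t}, \dots, z_{N,t})^\top$ be the standardized residuals, $z_{i,t} = r_{i,t} / \sigma_{i,t}$, and let $\bar{Q}$ be their sample covariance matrix. With scalars $a, b \geq 0$ and $a + b < 1$: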
$$
Q_t = (1-a-b)\bar{Q} + a z_{t-1}z_{t-1}^\top + b Q_{t-1},
$$

$$
R_t = \text{diag}(Q_t)^{-\tfrac{1}{2}} Q_t \text{diag}(Q_t)^{-\tfrac{1}{2}}
$$

### Step C: Combine
The 1-step-ahead covariance matrix is:
$$
\Sigma_{t+1} = D_{t+1} R_{t+1} D_{t+1},
$$

where
$$
D_{t+1} = \mathrm{diag}(\sigma_{1,t+1}, \dots, \sigma_{N,t+1})
$$
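
Steps A to C can be sketched in numpy as a pure filtering exercise. This sketch does not estimate anything: the GARCH parameters $(\omega_i, \alpha_i, \beta_i)$ and the DCC parameters $(a, b)$ are taken as given, and the values used below are illustrative assumptions. In practice they would be estimated, typically by (quasi-)maximum likelihood. The function names are hypothetical.

```python
import numpy as np

def garch_filter(r, omega, alpha, beta):
    """GARCH(1,1) variances; entry T is the one-step-ahead forecast sigma^2_{T+1}."""
    T = len(r)
    sigma2 = np.empty(T + 1)
    sigma2[0] = r.var()  # initialise at the sample variance (assumption)
    for t in range(T):
        sigma2[t + 1] = omega + alpha * r[t] ** 2 + beta * sigma2[t]
    return sigma2

def dcc_forecast(returns, garch_params, a=0.02, b=0.97):
    """One-step-ahead covariance Sigma_{T+1} = D_{T+1} R_{T+1} D_{T+1}."""
    R = np.asarray(returns, dtype=float)
    T, N = R.shape

    # Step A: univariate GARCH variances for each asset, shape (T + 1, N).
    sig2 = np.column_stack([garch_filter(R[:, i], *garch_params[i])
                            for i in range(N)])
    Z = R / np.sqrt(sig2[:T])            # standardized residuals z_t

    # Step B: DCC recursion, started at Q_bar; after the loop Q equals Q_{T+1}.
    Q_bar = Z.T @ Z / T
    Q = Q_bar.copy()
    for z in Z:
        Q = (1.0 - a - b) * Q_bar + a * np.outer(z, z) + b * Q
    d = 1.0 / np.sqrt(np.diag(Q))
    R_next = Q * np.outer(d, d)          # rescale Q to a correlation matrix

    # Step C: combine forecast volatilities and correlations.
    D = np.diag(np.sqrt(sig2[T]))
    return D @ R_next @ D

# Usage with simulated returns and assumed (not estimated) parameters.
rng = np.random.default_rng(2)
returns = rng.normal(scale=0.01, size=(300, 3))
params = [(1e-6, 0.05, 0.90)] * 3
Sigma_next = dcc_forecast(returns, params)
print(np.sqrt(np.diag(Sigma_next)))  # forecast volatilities
```

The diagonal of the result reproduces the univariate GARCH variance forecasts, so the DCC layer only affects the off-diagonal terms.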

---

## 2. Scaling Issues
- If $N \lesssim 30$: full DCC–GARCH is feasible.  
- If $N$ is large: estimation error dominates. Use **Factor-GARCH / PCA–DCC**:
  - Run PCA on returns, keep top $k$ factors.  
  - Fit GARCH/DCC on factor series.  
  - Model idiosyncratic risks with diagonal EWMA/GARCH.  

---

## 3. Alternatives
- **High-frequency data:** use realized volatility / HAR-RV + DCC.  
- **Simpler robust option:** EWMA on factors + diagonal EWMA for residuals.  
- **Shrinkage methods:** Dynamic Ledoit–Wolf or POET for large $N$.  

---

## 4. Evaluation Checklist
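- Forecast loss for target portfolios with weights $w$, where $\hat\sigma_{p,t+1}^2 = w^\top \hat\Sigma_{t+1} w$ and $r_{p,t+1} = w^\top r_{t+1}$:
$$
\mathcal{L}_{t+1} = \big(\hat\sigma_{p,t+1}^2 - r_{p,t+1}^2\big)^2
$$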

- Out-of-sample realized volatility of optimized portfolios.  
- Stability and turnover of portfolio weights.  
- Diebold–Mariano tests for forecast comparison.  
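
The forecast loss and the Diebold–Mariano comparison from the checklist can be sketched as below. This is a simplified version under stated assumptions: forecasts are one step ahead, loss differentials are treated as serially uncorrelated (so no HAC variance correction is applied), and the statistic is compared against a standard normal. The function names are hypothetical.

```python
import numpy as np
from math import erf, sqrt

def portfolio_loss(w, Sigma_hat, r_next):
    """Squared error between forecast portfolio variance and squared realized return."""
    var_hat = w @ Sigma_hat @ w
    return (var_hat - (w @ r_next) ** 2) ** 2

def diebold_mariano(loss_a, loss_b):
    """Simplified DM test on the loss differential d_t = loss_a_t - loss_b_t.

    Returns (statistic, two-sided p-value). A positive statistic means
    forecaster A has the higher average loss.
    """
    d = np.asarray(loss_a, dtype=float) - np.asarray(loss_b, dtype=float)
    stat = d.mean() / np.sqrt(d.var(ddof=1) / len(d))
    normal_cdf = 0.5 * (1.0 + erf(abs(stat) / sqrt(2.0)))
    return stat, 2.0 * (1.0 - normal_cdf)

# Usage: a forecast that matches the realized squared return has zero loss.
w = np.array([1.0, 0.0])
Sigma_hat = np.diag([0.04, 0.01])
r_next = np.array([0.2, 0.0])
print(portfolio_loss(w, Sigma_hat, r_next))

# Compare two illustrative loss series where A is worse than B on average.
rng = np.random.default_rng(3)
loss_b = rng.random(200)
loss_a = loss_b + 0.1 + 0.05 * rng.normal(size=200)
stat, p_value = diebold_mariano(loss_a, loss_b)
print(stat, p_value)
```

Since $r_{p,t+1}^2$ is a noisy proxy for the true variance, individual losses are dominated by noise; the comparison only becomes informative when averaged over many periods.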

---

**Summary:**  
- DCC–GARCH is a strong 1-step-ahead $\Sigma_{t+1}$ forecaster for moderate universes.  
- For large $N$, prefer **factor + DCC** or **shrinkage-based estimators**.  
- Realized volatility models typically outperform when high-frequency data is available.
