## Project Summary

Designed a novel implementation of a method that forecasts asset covariances (particularly volatility) using daily candlestick price data, and tested it on a portfolio optimization problem built from futures on the SP500, US 10-Year Treasuries, Gold, and WTI Crude Oil.

## Popular models for volatility and covariance forecasting

- ARCH and GARCH type models: Diagonal BEKK, Orthogonal GARCH, Constant Conditional Correlation (CCC), GARCH-Dynamic Conditional Correlation (DCC)

- MA type models: EWMA

- Others: Heterogeneous Autoregressive (HAR) and HAR DRD, Hybrid Implied Covariance, Random Walk Estimator


## The Heterogeneous Market Hypothesis

- Proposed by Müller et al. in 1997: the market is composed of heterogeneous groups of participants that differ in trading appetite and investment objectives.
- Each group uniquely contributes to market volatility across varying time horizons.

<p align="center">
  <img src="Figures/terms.png" alt="Image Description">
</p>


## The HAR DRD Framework

- The HAR volatility model is augmented to model covariances.
- Borrowing from the GARCH DCC approach, the covariance matrix is decomposed into correlations and volatilities:

<p align="center">
  <img src="Figures/DRD.png" alt="Image Description">
</p>

- Allows for modelling the two components separately, improving performance. 
- A single set of parameters is used for all asset volatilities and pairwise correlations, making this model very parsimonious. 
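
The decomposition can be sketched in a few lines of NumPy. This is a minimal illustration that assumes the standard form $\Sigma = D R D$, with $D$ the diagonal matrix of volatilities and $R$ the correlation matrix. The function names and the 2-asset covariance matrix are made up for the example and are not taken from the project code.

```python
import numpy as np

def drd_decompose(cov):
    """Split a covariance matrix into volatilities D and correlations R."""
    vols = np.sqrt(np.diag(cov))       # per-asset standard deviations
    d_inv = np.diag(1.0 / vols)
    corr = d_inv @ cov @ d_inv         # R = D^{-1} Sigma D^{-1}
    return np.diag(vols), corr

def drd_recompose(d, corr):
    """Rebuild the covariance matrix as Sigma = D R D."""
    return d @ corr @ d

# Illustrative 2-asset covariance matrix (made-up numbers).
sigma = np.array([[0.04, 0.006],
                  [0.006, 0.01]])
D, R = drd_decompose(sigma)
sigma_rebuilt = drd_recompose(D, R)
```

Forecasting `D` and `R` separately and recombining them afterwards is what lets the two components use different models.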

## A Typical HAR Volatility Model Specification

$$\sigma_{i,t+1}^{(d)} = c + \beta^{(d)}RV_{i,t}^{(d)} + \beta^{(w)}RV_{i,t}^{(w)} + \beta^{(m)}RV_{i,t}^{(m)} + \epsilon_{i,t+1}^{(d)},$$
$$  i = 1,2,3,4 $$

Where,

$$ RV_{i,t}^{(d)} = \sqrt{\frac{1}{N}\sum_{k=1}^N r_{k,t}^2} $$
$$RV_{i,t}^{(w)} = \frac{1}{5} \sum_{k=0}^4 RV_{i,t-k}^{(d)} $$
$$RV_{i,t}^{(m)} = \frac{1}{21} \sum_{k=0}^{20} RV_{i,t-k}^{(d)} $$
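
As a rough sketch of how this specification could be estimated, the snippet below builds the daily, weekly, and monthly regressors from a single daily RV series and fits the coefficients by ordinary least squares. The function names `har_design` and `fit_har` and the synthetic input series are assumptions for illustration. The project pools one parameter set across assets, which could be approximated by stacking the per-asset design matrices before the fit.

```python
import numpy as np

def har_design(rv_daily):
    """Build HAR regressors and next-day targets from a 1-D daily RV series."""
    rv = np.asarray(rv_daily, dtype=float)
    rows, targets = [], []
    # Start at t = 20 so the 21-day window is full; stop one day early
    # so that the target rv[t + 1] exists.
    for t in range(20, len(rv) - 1):
        rv_d = rv[t]                      # daily component
        rv_w = rv[t - 4:t + 1].mean()     # 5-day average
        rv_m = rv[t - 20:t + 1].mean()    # 21-day average
        rows.append([1.0, rv_d, rv_w, rv_m])
        targets.append(rv[t + 1])
    return np.array(rows), np.array(targets)

def fit_har(rv_daily):
    """Least-squares estimate of [c, beta_d, beta_w, beta_m]."""
    X, y = har_design(rv_daily)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic, strictly positive series standing in for realized volatility.
rng = np.random.default_rng(0)
fake_rv = 0.01 + 0.002 * np.abs(rng.standard_normal(500))
c, beta_d, beta_w, beta_m = fit_har(fake_rv)
```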

## Correlation Forecast Model in HAR-DRD

<div class="equation-container">
$$vech(R_{t+1}) = vech(\bar{R}_T)(1 - \alpha - \beta - \gamma) 
+ \alpha\cdot vech(R_{t}) + \beta\cdot vech(R_{t-4:t}) + 
\gamma\cdot vech(R_{t-20:t})+ vech(\epsilon_{t+1})$$
</div>

- where the $\textit{vech}$ operator returns the vectorized form of the lower triangular matrix
- similar to the HAR volatility model with daily, weekly, and monthly contributors
- Additional long-term average or intercept term
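
A minimal sketch of the vech operator and of the deterministic part of the forecast equation (the error term is omitted). The helper names and the example matrices are illustrative assumptions, and the coefficient values are arbitrary rather than estimated.

```python
import numpy as np

def vech(m):
    """Stack the lower triangle (including the diagonal) column by column."""
    m = np.asarray(m, dtype=float)
    rows, cols = np.triu_indices(m.shape[0])
    # Swapping the index arrays walks the lower triangle in column order.
    return m[cols, rows]

def forecast_corr_vech(r_bar, r_day, r_week, r_month, alpha, beta, gamma):
    """One-step correlation forecast in vech form, without the error term."""
    return (vech(r_bar) * (1.0 - alpha - beta - gamma)
            + alpha * vech(r_day)
            + beta * vech(r_week)
            + gamma * vech(r_month))

# Made-up 2x2 correlation matrices for the long-run, daily, weekly,
# and monthly inputs.
r_bar = np.array([[1.0, 0.2], [0.2, 1.0]])
r_day = np.array([[1.0, 0.5], [0.5, 1.0]])
r_week = np.array([[1.0, 0.4], [0.4, 1.0]])
r_month = np.array([[1.0, 0.3], [0.3, 1.0]])
forecast = forecast_corr_vech(r_bar, r_day, r_week, r_month, 0.2, 0.3, 0.4)
```

Because the weights on the four inputs sum to one, the diagonal entries of the forecast stay at 1.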

## A critique of the HAR model

- HAR models are typically implemented using high-frequency intraday returns data, which are expensive to obtain, computationally challenging to work with, and often exhibit significant noise.

- Intraday returns are subject to market microstructure effects, which result in measurement errors when they are used in daily volatility and correlation estimates.

- The average of squared returns, used as a variance estimator, assumes zero drift.


## Other Estimators for Daily Realized Volatility

- Daily log range: $\ln(\frac{High}{Low})$

- Parkinson's variance: $\frac{(\ln(\frac{High}{Low}))^2}{4\ln 2}$

While far less noisy and requiring less data, these estimators also assume zero drift, which is not always the case.
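
Both range-based estimators are one-liners. The sketch below uses hypothetical function names and takes a single day's high and low prices as plain floats.

```python
import math

def log_range(high, low):
    """Daily log range: ln(High / Low)."""
    return math.log(high / low)

def parkinson_variance(high, low):
    """Parkinson's variance: (ln(High / Low))^2 / (4 ln 2)."""
    return log_range(high, low) ** 2 / (4.0 * math.log(2.0))

# Illustrative prices for one trading day.
day_range = log_range(110.0, 100.0)
day_var = parkinson_variance(110.0, 100.0)
```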

## OHLC Daily Realized Volatility

Introduced an alternative estimator for daily realized volatility developed by Rogers and Satchell (1991):

$$ RV_{i,t}^{(d)} = \sqrt{u_{i,t}(u_{i,t} - c_{i,t}) + d_{i,t}(d_{i,t} - c_{i,t})},$$
$$u_{i,t} = \ln H_{i,t} - \ln O_{i,t}$$
$$d_{i,t} = \ln L_{i,t} - \ln O_{i,t}$$
$$c_{i,t} = \ln C_{i,t} - \ln O_{i,t},$$
$$  i = 1,2,3,4 $$

- Has the desirable property of being drift-independent.
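
A minimal sketch of the estimator for a single OHLC bar. The function name is hypothetical, and the clamp at zero is an added guard against floating-point round-off rather than part of the formula.

```python
import math

def rogers_satchell_vol(open_, high, low, close):
    """Rogers-Satchell daily volatility from one OHLC candlestick."""
    u = math.log(high / open_)    # normalized high
    d = math.log(low / open_)     # normalized low
    c = math.log(close / open_)   # normalized close
    variance = u * (u - c) + d * (d - c)
    return math.sqrt(max(variance, 0.0))

# Illustrative bar: open 100, high 110, low 95, close 105.
vol = rogers_satchell_vol(100.0, 110.0, 95.0, 105.0)

# A bar that moves straight from its low (the open) to its high (the close)
# is pure drift, and the estimator assigns it zero volatility.
trend_vol = rogers_satchell_vol(100.0, 110.0, 100.0, 110.0)
```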

## An Alternative Approach to Correlation Forecasting

Unlike volatility, we cannot estimate intraday correlation using just candlestick data because temporal information is lost. 

To retain daily portfolio rebalancing while smoothing correlations, we propose using rolling weekly correlations. On each day, the correlation forecast covers the following week.

$$vech(R_{t+1:t+5}) = vech(\bar{R}_T)(1 - \alpha - \beta) + \alpha\cdot vech(R_{t-4:t}) + \beta\cdot vech(R_{t-20:t}) + vech(\epsilon_{t+1:t+5})$$

- Enables us to capture the flexibility of daily rebalancing without introducing excessive noise to portfolio weights.
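
One way to build the rolling inputs is sketched below, assuming a `(days, assets)` array of daily returns. The helper name and the random returns are illustrative. Under this indexing, the weekly and monthly regressors on day `t` are `weekly[t]` and `monthly[t]`, and the target $R_{t+1:t+5}$ is `weekly[t + 5]`.

```python
import numpy as np

def rolling_corr(returns, window=5):
    """Correlation matrix of the last `window` daily returns for each day."""
    returns = np.asarray(returns, dtype=float)
    n_days, n_assets = returns.shape
    out = np.full((n_days, n_assets, n_assets), np.nan)
    # Days before the first full window keep NaN entries.
    for t in range(window - 1, n_days):
        out[t] = np.corrcoef(returns[t - window + 1:t + 1], rowvar=False)
    return out

# Random returns for 4 assets, standing in for the futures data.
rng = np.random.default_rng(1)
daily_returns = 0.01 * rng.standard_normal((60, 4))
weekly = rolling_corr(daily_returns, window=5)     # R_{t-4:t}
monthly = rolling_corr(daily_returns, window=21)   # R_{t-20:t}
```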


## HAR DRD model parameters

The model was trained on data from 2002 to 2017:

<p align="center">
  <img src="Figures/regression.png" alt="Image Description">
</p>


Coefficients for the volatility regression are more significant as the time horizon increases, exhibiting the long-memory and persistence behavior of volatility. Correlations also show a similar property.

## Out-of-sample forecast errors

Forecast errors are measured as the Euclidean distance ($L_2$ norm) between the forecasted and actual vectorized volatility and correlation matrices, respectively, observed during 2018-2022.

<p align="center">
  <img src="Figures/CorrelationLoss.png" alt="Image Description">
</p>
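
The error metric can be sketched as follows, assuming the forecast and the realized values are already vectorized. The function name and the example vectors are made up.

```python
import numpy as np

def l2_forecast_error(forecast, actual):
    """Euclidean distance between forecasted and realized vectors."""
    diff = np.asarray(forecast, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.linalg.norm(diff))

# Made-up off-diagonal correlations for three asset pairs.
error = l2_forecast_error([0.30, -0.10, 0.05], [0.25, -0.20, 0.05])
```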

## 

<p align="center">
  <img src="Figures/VolatilityLoss_PreOilCrash.png" alt="Image Description" style = "width: 40%; height: 20%">
</p>
<p align="center">
  <img src="Figures/VIxPreOil.png" alt="Image Description" style = "width: 35%; height: 50%">
</p>




## 

<p align="center">
  <img src="Figures/VolatilityLoss_PostOilCrash.png" alt="Image Description" style = "width: 40%; height: 20%">
</p>
<p align="center">
  <img src="Figures/VIxPostOil.png" alt="Image Description" style = "width: 35%; height: 50%">
</p>


## Performance in a Portfolio Optimization context

- Construct a minimum variance portfolio of SP500, US 10-Year Treasury, WTI Crude Oil, and Gold Futures:

$$ \min_{\omega} \,\, \omega^T \Sigma \omega $$
$$ \text{s.t.} \,\,\sum_{i=1}^n \omega_i = 1,\,\omega_i \ge 0 \,\,\forall i,$$


where $\omega$ is the vector of portfolio weights and $\Sigma$ is the covariance matrix of asset returns.

- Compare performance against a simple (t-1) lagged historical approach. 
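
The long-only minimum variance problem can be solved numerically. The sketch below uses SciPy's SLSQP solver, which is an assumption made for illustration and not necessarily the solver used in the project. The covariance matrix is made up.

```python
import numpy as np
from scipy.optimize import minimize

def min_variance_weights(cov):
    """Long-only, fully invested minimum variance weights."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    result = minimize(
        lambda w: w @ cov @ w,                # portfolio variance
        np.full(n, 1.0 / n),                  # start from equal weights
        jac=lambda w: 2.0 * cov @ w,          # gradient for symmetric cov
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,              # omega_i >= 0
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        options={"ftol": 1e-12},
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x

# Made-up diagonal covariance matrix for four uncorrelated assets.
example_cov = np.diag([0.04, 0.01, 0.09, 0.0625])
weights = min_variance_weights(example_cov)
```

With uncorrelated assets the weights are proportional to the inverse variances, which gives a quick sanity check on the solver output.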


## 

<p align="center">
  <img src="Figures/Portfolio_Value.png" alt="Image Description" style = "width: 50%; height: 50%">
</p>
<p align="center">
  <img src="Figures/PortVol.png" alt="Image Description" style = "width: 50%; height: 50%">
</p>


## Conclusion

- Demonstrated the viability of a HAR DRD model in the absence of high-frequency intraday returns data.
- The new approach offers additional advantages in terms of data noise and measurement errors.

Immediate improvements to the project can be made:

- Compare performance against the traditional HAR DRD model and/or GARCH DCC.
- Observe how model performance decays as the forecasting horizon increases to weekly and monthly rebalancing.

Other future innovations can include:

- Time-varying model parameters for improved performance across regime switches.


# Thank you
suran021@umn.edu