XDDDDDDDDDDDDDDDDDDDDDDD

In [1]:
import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from arch import arch_model
from scipy.stats import rankdata

step 1: implement volatility model (modified GARCH)  
  
$\sigma_{n}^{2} = \gamma V_{lt} +\alpha u_{n-1}^{2} + \beta\sigma_{n-1}^{2}$  
  
where:  
  
$\sigma_{n}$ is the volatility of the market variable for day $n$ (estimated at the end of day $n-1$). $\sigma_{n}^{2}$ is the variance of the market variable for day $n$  
$\gamma$ is the weight assigned to $V_{lt}$  
$V_{lt}$ is the long-run average variance rate  
$\alpha$ is the weight assigned to $u_{n-1}^{2}$   
$u_{n-1}$ is the percentage change in the market variable between the end of day $n-2$ and the end of day $n-1$ (the most recent daily percentage change in the market variable). $u_{n-1}^{2}$ is the squared return of the market variable between the end of day $n-2$ and the end of day $n-1$  

$u_{n-1} = \frac{S_{n-1} - S_{n-2}}{S_{n-2}}$  

$S_{n-1}$ is the value of the market variable at the end of day $n-1$  
$S_{n-2}$ is the value of the market variable at the end of day $n-2$  

$\beta$ is the weight assigned to $\sigma_{n-1}^{2}$  
$\sigma_{n-1}$ is the volatility of the market variable for day $n-1$ (estimated at the end of day $n-2$). $\sigma_{n-1}^{2}$ is the variance of the market variable for day $n-1$  
  
since the weights must sum to unity, it follows that  
  
$\gamma+\alpha+\beta=1$  
  
setting $\omega=\gamma V_{lt}$, the model can also be written as  
  
$\sigma_{n}^{2} = \omega +\alpha u_{n-1}^{2} + \beta\sigma_{n-1}^{2}$  
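to make the recursion concrete, here is a minimal numpy sketch of the GARCH(1,1) update above. the function name `garch_variance`, the parameter values, and the choice to start the recursion at $V_{lt}$ are all assumptions for illustration, nothing here is fitted to data. it also uses $V_{lt} = \omega / (1 - \alpha - \beta)$, which follows from $\omega=\gamma V_{lt}$ and $\gamma+\alpha+\beta=1$.

```python
import numpy as np

def garch_variance(u, omega, alpha, beta):
    """Run the GARCH(1,1) recursion over a series of returns u (hypothetical helper)."""
    # long-run variance: omega = gamma * V_lt and gamma = 1 - alpha - beta
    v_lt = omega / (1.0 - alpha - beta)
    sigma2 = np.empty(len(u) + 1)
    sigma2[0] = v_lt  # assumption: start the recursion at the long-run variance
    for n in range(1, len(u) + 1):
        sigma2[n] = omega + alpha * u[n - 1] ** 2 + beta * sigma2[n - 1]
    return sigma2, v_lt

# illustrative parameters only (not fitted to anything)
u = np.array([0.01, -0.02, 0.005])
sigma2, v_lt = garch_variance(u, omega=2e-6, alpha=0.08, beta=0.90)
print(v_lt, sigma2)
```

with these numbers the long-run variance is 1e-4 (1% daily vol), so a 1% move leaves the variance where it was and the 2% move on the second day pushes it up.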
  
we include one lag of an asymmetric shock which transforms a GARCH model into a GJR-GARCH model with variance dynamics given by  
  
$\sigma^2_n   =  \omega + \alpha u_{n-1}^2 + \gamma u_{n-1}^2 I_{[u_{n-1}<0]}+ \beta \sigma_{n-1}^2$

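where $I$ is an indicator function that takes the value 1 when its argument is true and 0 otherwise. note that $\gamma$ here is the asymmetry coefficient on negative returns, not the weight on $V_{lt}$ from the GARCH equation above. the log likelihood improves substantially with the introduction of an asymmetric term, and the parameter estimate is highly significant.  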

to improve the fit a little more, we model the volatility using absolute values rather than a variance process that evolves in squares. this process, known as a TARCH/ZARCH model, is given by  
  
$\sigma_n  =  \omega + \alpha \left|u_{n-1}\right| + \gamma \left|u_{n-1}\right| I_{[u_{n-1}<0]}+ \beta \sigma_{n-1}$
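a quick sketch of what the asymmetric term does in the TARCH update above: the same size shock raises next-day volatility more when it is negative. `tarch_step` and the parameter values are made up for illustration and are not estimates from any data.

```python
def tarch_step(u_prev, sigma_prev, omega, alpha, gamma, beta):
    """One step of the TARCH recursion: volatility (not variance) responds to |u|."""
    # the indicator I[u < 0] switches the extra gamma term on for negative returns only
    leverage = gamma * abs(u_prev) if u_prev < 0 else 0.0
    return omega + alpha * abs(u_prev) + leverage + beta * sigma_prev

# illustrative parameters only
params = dict(omega=0.02, alpha=0.05, gamma=0.10, beta=0.90)
up = tarch_step(u_prev=1.0, sigma_prev=1.0, **params)
down = tarch_step(u_prev=-1.0, sigma_prev=1.0, **params)
print(up, down)
```

here `down` exceeds `up` by exactly `gamma * |u|`, which is the whole point of the indicator term.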

one final adjustment: financial returns are often heavy-tailed, and a Student's t distribution is a simple way to capture this feature. the call to `arch_model` changes the error distribution from normal to Student's t (`dist="studentst"`).

In [None]:
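def volatility(df):
    # df: dataframe of price data. uses GJR-GARCH/TARCH (power=1) + t distribution.
    # returns are in percent (x100), the scale the optimizer handles best
    returns = 100 * df["Close"].pct_change().dropna()
    # keep only the rows that actually have a return (drops the first row)
    df = df.loc[returns.index].copy()

    am = arch_model(returns, p=1, o=1, q=1, power=1.0, dist="studentst")
    result = am.fit(disp="off")
    # conditional_volatility is the public accessor for the fitted sigma_n
    df["Volatility"] = result.conditional_volatility
    df["Rolling Volatility"] = df["Volatility"].rolling(window=10).mean()

    # sanity check, comparing like with like: mean conditional variance vs sample variance
    calculated = (df["Volatility"] ** 2).mean()
    expected = np.var(returns)
    print(f"percent error in volatility calculation: {(np.abs(calculated - expected) / expected) * 100:.2f}%")
    return df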

step 2: momentum  
pretty straightforward: absolute momentum is the price return over the past 84 trading days (about 4 months)

In [None]:
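def absolute_momentum(df):
    # 4 month momentum (84 trading days, roughly 4 x 21)
    df = df.copy()
    df["Momentum"] = (df["Close"] / df["Close"].shift(84)) - 1
    # dropna() returns a new frame, so the result has to be assigned back
    df = df.dropna(subset=["Momentum"])
    return df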
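tiny worked example of the same calculation on made-up prices, with a 2-day lookback instead of 84 so the numbers are easy to check by hand. `lookback_return` is a hypothetical helper, not part of the notebook's pipeline.

```python
import pandas as pd

def lookback_return(close, lookback=84):
    """Price return over the last `lookback` rows (hypothetical helper)."""
    return close / close.shift(lookback) - 1

close = pd.Series([100.0, 102.0, 101.0, 105.0, 110.0])
mom = lookback_return(close, lookback=2)
print(mom)
```

the first `lookback` values are NaN because there is no earlier price to compare against, which is why the momentum column needs a dropna before it gets ranked.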

step 3: ATR Trend/Breakout system  
produces a signal based on bands. if a given day's high is higher than the upper band, the model goes long the following day (signal = 2). if a given day's low is lower than the lower band, the model goes neutral/short the following day (signal = -2). the upper band is the 42-day rolling high plus the 42-day rolling average true range (ATR), and the lower band is the 42-day rolling low minus the ATR, with the true range given by  
  
$TR = \max\left[(H-L), \left|H-C_{n-1}\right|, \left|L-C_{n-1}\right|\right]$  
  
where $H$ is the day's high, $L$ is the day's low, and $C_{n-1}$ is the previous close
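worked example of the true range on three made-up bars, to show why the previous close matters: on a gap day the gap gets counted, so TR can be much larger than the day's own high-low range. the prices are invented for illustration.

```python
import pandas as pd

bars = pd.DataFrame({
    "High": [10.0, 12.5, 11.0],
    "Low": [9.0, 11.5, 10.2],
    "Close": [9.5, 12.0, 10.5],
})
prev_close = bars["Close"].shift(1)
# row-wise max of the three candidates; NaNs on the first row are skipped
tr = pd.concat([
    bars["High"] - bars["Low"],
    (bars["High"] - prev_close).abs(),
    (bars["Low"] - prev_close).abs(),
], axis=1).max(axis=1)
print(tr)
```

the first bar has no previous close, so its TR falls back to high minus low. the second bar gaps up from 9.5, so $\left|H-C_{n-1}\right|$ wins, and the third gaps down from 12.0, so $\left|L-C_{n-1}\right|$ wins.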

In [None]:
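def atr_trend_breakout(df):
    df = df.copy()
    high = df["High"]
    low = df["Low"]
    close = df["Close"]
    prev_close = close.shift(1)
    tr = pd.concat([high - low, (high - prev_close).abs(), (low - prev_close).abs()], axis=1).max(axis=1)
    atr = tr.rolling(42).mean()
    # bands use data through the previous day only: if today's high were part of the
    # rolling max, high > upper could never be true and the signal would stay at 0
    upper = (high.rolling(42).max() + atr).shift(1)  # upper = prior 42-day high + ATR
    lower = (low.rolling(42).min() - atr).shift(1)   # lower = prior 42-day low - ATR
    signal = pd.Series(0.0, index=df.index)
    long_mask = high > upper
    short_mask = low < lower
    signal.loc[long_mask] = 2
    signal.loc[short_mask] = -2
    # hold the last signal until the opposite band is hit, and act on it the following day
    df["ATR Trend/Breakout"] = signal.replace(0, np.nan).ffill().shift(1).fillna(0)
    return df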

step 4: average relative correlations  
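4-month (84 trading day) average correlation across the ETFs on daily returns: for each ticker, the mean of its pairwise correlations with every other ticker. requires a properly merged dataframe of returns, but that's for me to figure out later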

In [None]:
def correlation(returns_df):
    tickers = returns_df.columns.tolist()
    out = pd.DataFrame(index=returns_df.index, columns=tickers, dtype=float)
    for i in range(84, len(returns_df) + 1):
        chunk = returns_df.iloc[i - 84:i]
        corr = chunk.corr()
        idx = returns_df.index[i - 1]
        for col in tickers:
            others = [c for c in tickers if c != col]
            if others:
                out.loc[idx, col] = corr.loc[col, others].mean()
    return out
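one possible way to build the merged `returns_df` that `correlation` expects, as a rough sketch. it assumes each ticker's prices live in their own dataframe with a `Close` column and a date index. `merge_returns` and the input shape are assumptions, not something already in this notebook. dates where any ticker is missing get dropped, since tickers listed on different exchanges will not share the same holiday calendar.

```python
import pandas as pd

def merge_returns(price_frames):
    """price_frames: dict of ticker -> DataFrame with a "Close" column (hypothetical input shape)."""
    # one column of closes per ticker, aligned on the union of dates
    closes = pd.concat({t: f["Close"] for t, f in price_frames.items()}, axis=1)
    # keep only dates where every ticker has a price
    closes = closes.dropna(how="any")
    return closes.pct_change().dropna(how="any")

# made-up prices for two placeholder tickers with mismatched date ranges
idx_a = pd.date_range("2024-01-01", periods=4, freq="D")
idx_b = idx_a[1:]
frames = {
    "AAA": pd.DataFrame({"Close": [10.0, 11.0, 12.1, 12.1]}, index=idx_a),
    "BBB": pd.DataFrame({"Close": [20.0, 19.0, 19.0]}, index=idx_b),
}
returns_df = merge_returns(frames)
print(returns_df)
```

dropping unmatched dates is the simplest choice. forward-filling prices instead would keep more rows but adds artificial zero returns, which would bias the correlations.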

step 5: rank the motherfuckers on each metric  
most momentum --> rank 1, least --> rank 13  
least volatility --> rank 1, most --> rank 13  
least correlation --> rank 1, most --> rank 13  

In [None]:
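def rank(momentum_series, vol_series, corr_series):
    # rankdata ranks ascending (smallest value -> 1), so these raw ranks run opposite
    # to the convention above; ranked_allocation flips them with n + 1 - rank
    # Rank(M): lowest momentum -> 1
    rank_m = pd.Series(rankdata(momentum_series), index=momentum_series.index)
    # Rank(V): highest volatility -> 1
    rank_v = pd.Series(rankdata(-vol_series), index=vol_series.index)
    # Rank(C): highest correlation -> 1
    rank_c = pd.Series(rankdata(-corr_series), index=corr_series.index)
    return rank_m, rank_v, rank_c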
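for reference, the convention stated in the text (rank 1 = best) can be produced directly with pandas `Series.rank`. this is a swap-in for `rankdata`, shown on three placeholder tickers with made-up values. both default to average ranks for ties.

```python
import pandas as pd

# made-up metric values for three placeholder tickers
momentum = pd.Series({"AAA": 0.12, "BBB": -0.03, "CCC": 0.05})
vol = pd.Series({"AAA": 1.8, "BBB": 0.9, "CCC": 1.2})

rank_m_best_first = momentum.rank(ascending=False)  # most momentum -> 1
rank_v_best_first = vol.rank(ascending=True)        # least volatility -> 1
print(rank_m_best_first, rank_v_best_first, sep="\n")
```

correlation would be ranked the same way as volatility (`ascending=True`), since less of it is better.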

step 6: rank the motherfuckers part 2 as a whole (except for cash)

$Rank = w_{M}R_{M} + w_{V}R_{V} + w_{C}R_{C} - T$  
  
where $R_{M}$, $R_{V}$, and $R_{C}$ are the rankings of momentum, volatility, and correlation,  
$w_{M}$, $w_{V}$, and $w_{C}$ are the weights assigned to the rankings of momentum, volatility, and correlation (0.45, 0.4, and 0.15 in the code below),  
and $T$ is the signal from the ATR Trend/Breakout System. a lower $Rank$ is better  
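a small worked example of the composite score, assuming best-first ranks (1 = best) and the 0.45 / 0.4 / 0.15 weights used in `ranked_allocation`. the tickers, ranks, and signals are placeholders.

```python
import pandas as pd

# hypothetical best-first ranks (1 = best) for three placeholder tickers
r_m = pd.Series({"AAA": 1, "BBB": 3, "CCC": 2})
r_v = pd.Series({"AAA": 3, "BBB": 1, "CCC": 2})
r_c = pd.Series({"AAA": 2, "BBB": 1, "CCC": 3})
t = pd.Series({"AAA": 2, "BBB": 0, "CCC": -2})  # ATR Trend/Breakout signal

w_m, w_v, w_c = 0.45, 0.40, 0.15  # weights taken from ranked_allocation
score = w_m * r_m + w_v * r_v + w_c * r_c - t
best = score.idxmin()  # lowest composite score = best
print(score, best, sep="\n")
```

because $T$ is subtracted, a long signal (2) pulls a ticker's score down and a short signal (-2) pushes it up, so the trend signal can outweigh a full rank position in any single metric.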

In [None]:
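def ranked_allocation(rank_m, rank_v, rank_c, trend=None):
    # trend: optional series of ATR Trend/Breakout signals (T), indexed by ticker
    tickers = [x for x in rank_m.index if x != "SHY"]
    n = len(tickers)
    # n + 1 - rank flips the ascending ranks from rank() so that a low score = best
    total = 0.45 * (n + 1 - rank_m.reindex(tickers).fillna(0))
    total += 0.4 * (n + 1 - rank_v.reindex(tickers).fillna(0))
    total += 0.15 * (n + 1 - rank_c.reindex(tickers).fillna(0))
    if trend is not None:
        # subtract T as in the formula above: a long signal (2) improves the score
        total -= trend.reindex(tickers).fillna(0)
    # the five lowest composite scores are the picks
    top = total.nsmallest(5).index.tolist()
    return top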

step 7: weight portfolio  
classification (including the ranking above) is done on a monthly basis, taking the last value of the month. we reallocate the portfolio at this time, giving each of our top 5 tickers 20% of the portfolio. however, if any of the top 5 tickers has negative absolute momentum, we replace its weighting with cash

In [None]:
def raam_weights(momentum_series, rank_m, rank_v, rank_c):
    top = ranked_allocation(rank_m, rank_v, rank_c)
    weights = {}
    cash_weight = 0.0
    for t in top:
        if t in momentum_series and momentum_series[t] < 0:
            cash_weight += 0.2
        else:
            weights[t] = 0.2
    if cash_weight > 0:
        weights["SHY"] = weights.get("SHY", 0) + cash_weight
    return weights
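for the "last value of the month" part, one way to pull month-end rows out of a daily frame is to group by calendar month and take the last row of each group. a sketch on a made-up business-day index, where the `daily` frame and its values are placeholders:

```python
import pandas as pd

# placeholder daily data on business days spanning three calendar months
idx = pd.date_range("2024-01-29", "2024-03-04", freq="B")
daily = pd.DataFrame({"Momentum": range(len(idx))}, index=idx)

# last available row of each calendar month (keeps the actual trading date)
month_end = daily.groupby(daily.index.to_period("M")).tail(1)
print(month_end)
```

note that the final group here is a partial month, so its "last value" is just the most recent row. a backtest would probably want to skip an incomplete final month.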

tickers in our portfolio:  
  
**Brazilian Equities:**  
EWZ  
FLBR  
EWZS  
  
**International Equities:**  
EFA  
EEM  
  
**Brazilian Real Estate:**  
HGBS11  
BTLG11  
  
**Brazilian Natural Resources – Commodities:**  
BIAU39  
BSLV39  
BCOM39  
  
**Brazilian Bonds:**  
BLTN  
VWOB (Emerging Markets)  
  
**International Bonds:**  
IGOV  
  
**Cash:**  
SHY

In [None]:
TICKERS = ["EWZ", "FLBR", "EWZS", "EFA", "EEM", "HGBS11", "BTLG11", "BIAU39", "BSLV39", "BCOM39", "BLTN", "VWOB", "IGOV", "SHY"]
CASH_TICKER = "SHY"
RISKY_TICKERS = [t for t in TICKERS if t != CASH_TICKER]  # the 13 non-cash tickers

TODO:  
**build backtest**  
write paper lol  