# Extending Leveraged ETFs Back In Time

### Because sometimes ETFs just don't go back far enough

#### TL;DR: [Show Me The Money](#Example) or [download the data](#Download).


![3X S&P Back to the 80s](images/upro.png)


This takes a [leveraged ETF](https://www.investopedia.com/terms/l/leveraged-etf.asp) and extends it back into the past using a proxy fund.  

The basic idea is to multiply the daily returns of the proxy by the
leverage factor, adjusting for fees and other expenses.  Since some of those expenses are hard to obtain, 
it can also find the parameters that minimize the difference between the leveraged proxy and the actual leveraged ETF.
It plots a telltale chart with difference metrics and writes the simulated prices out to CSV.
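
As a minimal sketch of the basic idea (fees and borrowing costs left out, prices made up), the snippet below scales each daily return of a toy proxy by a leverage factor and compounds the result. The helper `leverage_simple` is hypothetical and is not the notebook's `leverage()` function.

```python
def leverage_simple(prices, factor):
    """Compound the daily returns of `prices`, each multiplied by `factor`.

    Hypothetical helper for illustration: ignores expenses and borrowing costs.
    """
    leveraged = [prices[0]]  # start at the same value as the proxy
    for prev, cur in zip(prices, prices[1:]):
        daily_return = cur / prev - 1
        leveraged.append(leveraged[-1] * (1 + factor * daily_return))
    return leveraged


proxy = [100.0, 110.0, 99.0]          # made-up prices: +10%, then -10%
lev2x = leverage_simple(proxy, factor=2)
print([round(p, 2) for p in lev2x])   # 2X compounds +20%, then -20%
```

Because returns compound daily, the 2X series ends down 4% while the proxy ends down 1%, rather than the 2% that doubling the cumulative return would suggest.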

If you are new to Jupyter Notebook, you can find [tutorials](https://learn.onemonth.com/jupyter-notebook-a-beginners-tutorial/) online.  If you are not already doing so, you can edit and run this notebook interactively on [Binder](https://mybinder.org/v2/gl/doctorj%2Fquantitative-investing/master?filepath=Leveraged%20ETFs.ipynb).

In [1]:
import sys
import warnings
import unittest     
from itertools import chain, combinations

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance
from scipy.optimize import minimize
from IPython.display import Image, display, HTML

from util import yget, read_fred, annret, annvol, cumret, TRADING_DAYS


assert sys.version_info >= (3, 6, 0), "Ordered dicts are where it's at"
display(HTML("<style>.container { width:90% !important; }</style>"))
# Set to plotly for interactive figures, matplotlib for static images
pd.options.plotting.backend = "plotly"  
if pd.options.plotting.backend == "plotly":
    import plotly.io
    plotly.io.templates.default = "plotly_white"
RASTER = True   # Rasterize complex images to save time/space    
    
%matplotlib inline

In [2]:
plt.rcParams["figure.figsize"] = (16, 9)    # Matplotlib likes this in a separate cell

### Get Borrowing Rates
Leveraged funds have to borrow money, and the borrowing cost is not included in the expense ratio, so we have to account for it.  We use the [Effective Federal Funds Rate](https://en.wikipedia.org/wiki/Federal_funds_rate) because its history goes back to 1954 and it gives slightly better fits than LIBOR.

In [3]:
# Source is in percent.
fedfunds = read_fred('DFF').rename('FEDFUNDS') / 100
#tbill = read_fred('DTB3').rename('TBILL') / 100
#libor1d  = read_fred('USDONTD156N').rename('LIBOR-1d') / 100
#libor1w  = read_fred('USD1WKD156N').rename('LIBOR-1w') / 100
#libor1m  = read_fred('USD1MTD156N').rename('LIBOR-1m') / 100     
#libor12m = read_fred('USD12MD156N').rename('LIBOR-12m') / 100

BORROW = fedfunds  
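
The percent-to-fraction conversion, and the forward-fill alignment that `leverage()` later applies to the borrowing rate, can be sketched in plain Python. The dates and rates below are made up, and `rate_on` is a hypothetical helper that mimics what `reindex(..., method='ffill')` does in pandas.

```python
from bisect import bisect_right
from datetime import date

# Made-up annual rates in percent (the notebook's source data is also in percent)
rates_pct = {
    date(2020, 1, 2): 1.55,
    date(2020, 1, 3): 1.60,
    date(2020, 1, 7): 1.50,
}
rate_dates = sorted(rates_pct)


def rate_on(day):
    """Most recent rate on or before `day`, as a fraction (forward fill)."""
    i = bisect_right(rate_dates, day)
    if i == 0:
        raise ValueError(f"No rate published on or before {day}")
    return rates_pct[rate_dates[i - 1]] / 100


print(rate_on(date(2020, 1, 6)))   # falls back to the 2020-01-03 rate
```

A price date with no matching rate simply reuses the last known rate, while a price date earlier than every rate is an error, which mirrors the check in `leverage()`.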

### Leverage

This function leverages a proxy price series using configurable leverage factor, expense ratio, and borrowing rate.

As a first pass, to leverage a daily return $ret$ by a leverage factor $factor$, we just scale the return and subtract the (daily) expense ratio $exp$:

$$ lev = factor * ret - exp $$

However, the fund's borrowing and trading costs are *not* included in the expense ratio, so we subtract borrowing costs from the leveraged return.  Broadly speaking, a 3X fund needs to borrow an additional 2X the principal; in general, a fund borrows `factor - 1` times the principal and pays interest on that amount.  Borrowing costs vary over time, so we use a short-term interest rate benchmark such as the Federal Funds Rate or LIBOR.

$$ lev = factor * ret - exp - (factor - 1) * borrow$$

Under the hood, leveraged ETFs hold some stock and some [swaps](https://learn.robinhood.com/articles/s3FYEQ0gYx0cNAoiG25du/what-is-a-swap/); the main unknowns in simulating an ETF are the fraction of assets in swaps and the swap rate ([some exploration here](https://www.bogleheads.org/forum/viewtopic.php?p=4884654#p4884654) and [here](https://www.bogleheads.org/forum/viewtopic.php?p=5729993#p5729993)).

To account for these unknowns, we add an adjustment $C$, to be determined for each ETF.

$$ lev = factor * ret - exp - (factor - 1) * borrow + C$$

$C$ is a constant found by curve-fitting, usually quite small.

Finally, the expense ratio and borrowing costs are annual figures, so to get daily values we divide by the periods per year.
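
As a worked example of the formula for a single day, here is a minimal sketch with made-up inputs. The function name `leveraged_daily_return` is hypothetical, and 252 periods per year is an assumption (the notebook imports its own `TRADING_DAYS` from `util`). Following the notebook's `leverage()` code, the fitted adjustment is folded into the expense term as `expense_incr`.

```python
TRADING_DAYS = 252   # assumed periods per year


def leveraged_daily_return(ret, factor, expense, borrow, expense_incr=0.0,
                           ann_periods=TRADING_DAYS):
    """One day's leveraged return: scaled return minus prorated annual costs."""
    annual_cost = expense + expense_incr + (factor - 1) * borrow
    return factor * ret - annual_cost / ann_periods


# Made-up inputs: proxy up 1% on the day, 3X fund, 0.95% expense ratio, 2% borrow rate
r = leveraged_daily_return(ret=0.01, factor=3, expense=0.0095, borrow=0.02)
print(f"{r:.6f}")
```

With these inputs the annual costs total 4.95%, or about 0.02% per trading day, so the 3% scaled return becomes roughly 2.98%.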

In [4]:
def leverage(prices, factor=2, expense=0.01, borrow_rate=BORROW, ann_periods=TRADING_DAYS, 
             factor_scale=1.0, factor_incr=0, borrow_scale=1.0, expense_incr=0):
    """:Return: a Series giving the daily leveraged value of `prices` at a given leverage `factor`.
    
    This is basically the per-period change in prices minus the expense ratio and borrow rate.
    The expense ratio and borrow rate are divided (evenly, arithmetically) by `ann_periods`.
    
    :param float factor: The leverage factor by which daily returns are multiplied.
    :param float expense: Net expense ratio per `ann_periods` as a fraction. Deducted proportionally from each period.  
      Example: 0.0095 for a 0.95% annual expense ratio.
    :param Series borrow_rate: The (annualized) interest rate used to finance short-term borrowing for leverage.  
      Deducted from daily returns.  Typically the daily Federal Funds Rate or LIBOR. 
    :param int ann_periods: The number of periods over which rates are given.  E.g., 252 for daily periods in a typical trading year.
    
    The `_scale` and `_incr` parameters are adjustments to the corresponding parameters found through curve-fitting.
    """
    
    # Align borrow rates with prices
    prices = pd.Series(prices, dtype=float)
    if prices.isna().any():
        raise ValueError('NaN in prices')
    if isinstance(borrow_rate, pd.Series):
        if borrow_rate.isna().any():
            raise ValueError('NaN in borrow_rate')
        if prices.index[0] < borrow_rate.index[0]:
            raise ValueError(f'Prices start {prices.index[0]}, before borrow_rate {borrow_rate.index[0]}')
        borrow_rate = borrow_rate.reindex(index=prices.index, method='ffill') 
    name = f'{prices.name or ""}:{round(factor, 3)}X'
    
    # Curve-fitting adjustments 
    borrow_rate *= borrow_scale
    expense += expense_incr  # This functions as an additive constant for the whole equation, since exp isn't scaled ("C" above)
    
    change = prices.pct_change() * factor * factor_scale + 1      # Period-to-period changes as ratios
    # Changes less expenses and borrowing costs, evenly distributed among periods
    net_change = change - (expense + borrow_rate * (factor + factor_incr - 1)) / ann_periods
    net_change.iat[0] = prices.iat[0]   # Start leveraged series at same value so it's easily comparable
    leveraged = net_change.cumprod()
    return leveraged.rename(name)


# All possible leverage() parameters for curve fitting and their ranges
ALL_LEV_PARAMS = {'factor_scale': (0, 3), 'factor_incr': (-2, 2), 'borrow_scale': (-5, 5), 'expense_incr': (-2, 2)}
# Actually used leverage() parameters, determined in the 'Model Selection' section
LEV_PARAMS = {p: ALL_LEV_PARAMS[p] for p in ('expense_incr',)}

A quick round-trip test: leveraging a series and then deleveraging it with the same parameters should recover the original prices.

In [5]:
def deleverage(prices, factor, expense, borrow, ann_periods=TRADING_DAYS):
    # Assumes no fudge factors
    rets = prices.pct_change() + (expense + borrow * (factor -1)) / ann_periods
    rets = rets / factor + 1
    rets.iat[0] = 1.0
    return rets.cumprod()

    
class LeverageTest(unittest.TestCase):
    def test_leverage(self):
        vecs = (
            [1.0] * 5,
            [1.01] * 5,
            np.arange(1.0, 1.1, 0.01),
            np.arange(1.0, 0.9, -0.01),
        )
        for rets in vecs:
            for factor in (1, 2, 3, 1.25):
                for expense in (0.0, 0.01, 0.001, -0.01):
                    for borrow in (0.0, 0.01, 0.02):
                        rets = pd.Series(rets)
                        prices = pd.concat([pd.Series(1.0), rets.cumprod()]).reset_index(drop=True)
                        lev = leverage(prices, factor, expense, borrow)
                        delev = deleverage(lev, factor, expense, borrow)
                        pd.testing.assert_series_equal(prices, delev, check_names=False)
        
unittest.TextTestRunner().run(unittest.TestLoader().loadTestsFromTestCase(LeverageTest));

.
----------------------------------------------------------------------
Ran 1 test in 0.548s

OK


#### Routines for aligning and plotting series

In [6]:
def norm(prices):
    """:Return: prices normalized to start at 1.0."""
    return prices / prices.iloc[0]


def cat(*dfs, dropna=True):
    """:Return: the column-wise concatenation of a sequence of Series or DataFrames.
    
    :param bool dropna: If True, remove rows with any NaN from the result.
    """
    result = pd.concat(dfs, axis=1)
    if dropna:
        result = result.dropna()
    return result


def align(*prices, dropna=True, norm=False):
    """:Return: The `prices` Series with only the dates in common to all of them, as a sequence.
    
    :param bool norm: If True, normalize each series of prices to start at 1.0.
    """
    aligned = cat(*prices, dropna=dropna)
    if norm:
        aligned = globals()['norm'](aligned)   # Calls norm() function, since we shadowed the name. Bit naughty.
    return tuple(col for _, col in aligned.items())


def telltale(reference, *dfs, **layout_kws):
    """Plot the growth of several dataframes or series `dfs` relative to a `reference` series.
    
    https://www.bogleheads.org/wiki/Telltale_chart
    """
    tell = norm(cat(reference, *dfs))
    if tell.columns.nunique() < len(tell.columns):
        raise ValueError('Column names must be unique.')
    tell = tell.apply(lambda c: c / tell.iloc[:, 0])   # straight division doesn't work for some reason
    fig = tell.plot(title=f'Telltale Chart: {", ".join(tell.columns)}')
    if pd.options.plotting.backend == "plotly":
        fig.update_layout(margin=dict(t=50), **layout_kws).show()
               
      
def plotret(*prices, title=None):
    """Nice Plotly cumulative returns plot."""
    return norm(cat(*prices)).sub(1).plot(title=title).update_layout(
        yaxis=dict(tickformat=".0%"), 
        margin=dict(t=50), 
        legend_title_text='',
        yaxis_title='Cumulative Return',
        width=950,
        height=450,
    )
    
    
def color_leverage(factor, alpha=1.0, max_factor=3):
    """:Return: a plotly color string for a given leverage `factor`, more green for more long, more red for more short."""
    intensity = int(abs(factor) / max_factor * 255)
    if factor >= 0:
        return f'rgba(0,{intensity},0,{alpha})'
    else:
        return f'rgba({intensity},0,0,{alpha})'
    
    
def rasterize(figure, raster=False, width=1100, height=600, filename=None):
    """Maybe render a plotly `figure` as a static image, to save space and time."""
    if raster:
        if filename:
            figure.write_image(filename, width=width, height=height, scale=1, engine="kaleido")
            return Image(url=f'{filename}?cache_bust={np.random.randint(100000)}')
        else:
            return Image(figure.to_image(format="png", width=width, height=height, scale=1, engine="kaleido"))
    else:
        return figure    

    
def splice(old, new):
    """Splice together `new` prices with `old` prices before them, adjusted so new prices don't change."""
    if old.index[-1] < new.index[0]:
        raise ValueError(f'Last old index {old.index[-1]} and first new index {new.index[0]} must overlap')
    if old.index[0] > new.index[0]:
        warnings.warn(f'Old has no data older than new; old starts {old.index[0]}, new starts {new.index[0]}')
        return new
    first = old.index.get_indexer([new.index[0]], method='ffill')[0]  # Find previous value if no exact match
    ratio = old.iloc[first] / new.iloc[0]
    return pd.concat((old.iloc[:first] / ratio, new), verify_integrity=True).rename(new.name)
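
The rescaling inside `splice()` can be illustrated with plain lists. This is a hypothetical list-based stand-in for the pandas `splice()` above, with made-up numbers; `overlap` is the position in the old series of the date on which the new series begins.

```python
def splice_lists(old, new, overlap):
    """Prepend `old` history to `new`, rescaled so both agree at the overlap.

    `old[overlap]` is the old-series value on the date where `new` begins.
    """
    ratio = old[overlap] / new[0]
    return [p / ratio for p in old[:overlap]] + list(new)


simulated = [1.0, 1.1, 1.21, 1.331]   # made-up simulated history
actual = [50.0, 55.0]                 # made-up fund prices, starting at simulated[2]
spliced = splice_lists(simulated, actual, overlap=2)
print([round(p, 2) for p in spliced])
```

The actual prices are left untouched, and the rescaled history keeps the simulated returns: the step into the first actual price is still the simulated 10% gain.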

#### Error metrics

In [7]:
def rmse(a, b):
    return np.sqrt(np.mean((a - b) ** 2))


def mae(a, b):
    return np.mean(np.abs(a - b))


def rel_rmse(a, b):
    """Relative RMSE between 1.0 and b / a, both aligned and normalized to start at 1.0."""
    a, b = align(a, b, norm=True)
    return rmse(1.0, b / a)
    

def rel_mae(a, b):
    """Relative MAE between 1.0 and b / a, both aligned and normalized to start at 1.0."""
    a, b = align(a, b, norm=True)
    return mae(1.0, b / a)
    
    
def return_rmse(a, b):
    """RMSE between simple periodic returns of price series `a` and `b`."""
    a, b = align(a.pct_change(), b.pct_change())
    return rmse(a, b)


def return_mae(a, b):
    """MAE between simple periodic returns of price series `a` and `b`."""
    a, b = align(a.pct_change(), b.pct_change())
    return mae(a, b)


# This is, as you can imagine, sensitive to the most recent price.
def cumret_diff(a, b):
    """Absolute difference between the cumulative return of `a` and `b`."""
    a, b = align(a, b)
    return abs(cumret(a) - cumret(b))


def errstats(reference, leveraged, ann_periods=TRADING_DAYS):
    """:Return: dict of error metrics between expected `reference` price series and actual `leveraged` series
    at the dates (indices) they have in common."""
    reference, leveraged = align(reference, leveraged)
    return {
        'RMSE': rel_rmse(reference, leveraged), 
        'MAE': rel_mae(reference, leveraged),
        'RETRMSE': return_rmse(reference, leveraged),
        'RETMAE': return_mae(reference, leveraged),
        'CAGR': annret(leveraged, ann_periods) - annret(reference, ann_periods), 
        'VOL': annvol(leveraged, ann_periods) - annvol(reference, ann_periods), 
        'P99': norm(leveraged).div(norm(reference)).sub(1).abs().quantile(.99),
    }


def roundvals(d, digits=4):
    """Round the values of dict `d` to `digits` digits."""
    return {k: round(v, digits) for k, v in d.items()}

#### Parameter Optimization

These functions find the leverage parameters that minimize the error between a reference Series and a leveraged proxy,
and plot the results along with error metrics.

There is the question of which error metric to optimize.  The relative RMSE, basically how well the telltale chart aligns,
seems to do the best job of minimizing all metrics (cumulative and simple, squared and absolute) across funds.  The relative RMSE takes the simulated prices divided by the actual prices and computes the RMSE between that ratio and 1.0, which is what the ratio would be if they matched perfectly.
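
A minimal sketch of the relative RMSE on plain lists, with made-up prices. The name `rel_rmse_lists` is hypothetical, and both inputs are assumed to cover the same dates already.

```python
import math


def rel_rmse_lists(actual, simulated):
    """RMSE between the telltale ratio simulated/actual and 1.0.

    Both lists cover the same dates; each is first normalized to start at 1.0.
    """
    a = [p / actual[0] for p in actual]
    s = [p / simulated[0] for p in simulated]
    ratios = [si / ai for si, ai in zip(s, a)]
    return math.sqrt(sum((1.0 - r) ** 2 for r in ratios) / len(ratios))


actual = [10.0, 11.0, 12.1]      # made-up fund prices
perfect = [5.0, 5.5, 6.05]       # same returns at a different price level
drifting = [5.0, 5.5, 6.2]       # last point about 2.5% too high

print(rel_rmse_lists(actual, perfect))
print(rel_rmse_lists(actual, drifting))
```

Normalizing first means the metric ignores the price level and only penalizes drift between the two curves.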

We can get away with local minimization here (as opposed to global) because the leverage function for a single day w.r.t. the leverage fit parameters is convex nonnegative increasing, the product of such functions (i.e. the cumulative return) is convex, and norms like RMSE are also convex.

In [8]:
def find_params(reference, proxy, factor=2, expense=0.01, borrow_rate=BORROW, ann_periods=TRADING_DAYS, 
                params=LEV_PARAMS, errfunc=rel_rmse):
    """Find `params` that minimize the error between a `reference` series and its leveraged `proxy`.
    
    :param dict params: Maps parameters to `leverage()` to the range to search for optimal values.
    :param func errfunc: Error function that will be minimized; takes two price series and returns a distance metric between them.
    """
    if not params:
        return {}   # Well that was easy
    
    reference, proxy = align(reference, proxy)
    def obj(x):
        param_dict = dict(zip(params.keys(), x))   # param name: value
        return errfunc(reference, leverage(proxy, factor, expense, borrow_rate=borrow_rate, ann_periods=ann_periods, **param_dict))
    
    # Find params x that minimize obj(x)
    x0 = tuple(map(np.mean, params.values()))  # Initial guess = midpoint of bounds
    res = minimize(obj, x0, bounds=list(params.values()))
    best = dict(zip(params.keys(), res.x))   # param name: optimal value
    return best
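
To see the fitting idea end to end without `scipy`, the sketch below recovers a known extra annual cost from a toy series using a hand-rolled golden-section search. That search is a swapped-in stand-in for `scipy.optimize.minimize`, not what `find_params` does internally. All names and numbers are made up, the borrow rate is held constant, and the error is the relative RMSE described above.

```python
import math

PERIODS = 252  # assumed trading days per year


def lev(prices, factor, expense, borrow, expense_incr=0.0):
    """Toy leveraged series from `prices` with a constant annual `borrow` rate."""
    cost = (expense + expense_incr + (factor - 1) * borrow) / PERIODS
    out = [prices[0]]
    for prev, cur in zip(prices, prices[1:]):
        out.append(out[-1] * (1 + factor * (cur / prev - 1) - cost))
    return out


def rel_rmse(a, b):
    """RMSE between 1.0 and the ratio of the normalized series b / a."""
    ratios = [(y / b[0]) / (x / a[0]) for x, y in zip(a, b)]
    return math.sqrt(sum((1 - r) ** 2 for r in ratios) / len(ratios))


def golden_min(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if f(c) < f(d):
            hi = d   # minimum lies in [lo, d]
        else:
            lo = c   # minimum lies in [c, hi]
    return (lo + hi) / 2


# Made-up proxy: alternating +1% / -0.5% days for two years
proxy = [100.0]
for i in range(2 * PERIODS):
    proxy.append(proxy[-1] * (1.01 if i % 2 == 0 else 0.995))

# "Actual" fund: 2X with a hidden extra cost of 0.5% per year
actual = lev(proxy, factor=2, expense=0.01, borrow=0.02, expense_incr=0.005)

best = golden_min(
    lambda x: rel_rmse(actual, lev(proxy, 2, 0.01, 0.02, expense_incr=x)),
    lo=-0.05, hi=0.05,
)
print(round(best, 4))
```

The search should land on the hidden 0.5% adjustment, since that value makes the simulated and "actual" series identical and the error zero.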

In [9]:
def plotbest(reference, proxy, factor=2, expense=0.01, borrow_rate=BORROW, ann_periods=TRADING_DAYS, plot=True, errfunc=rel_rmse, params=LEV_PARAMS):
    """Find leverage parameters that minimize error between `reference` and leveraged `proxy`, plot a telltale
    chart, and return the new leveraged series."""
    best = find_params(reference, proxy, factor=factor, expense=expense, borrow_rate=borrow_rate, ann_periods=ann_periods, errfunc=errfunc, params=params)
    print(reference.name + ':' + proxy.name, '\tparams:', ', '.join(f'{k}={v}' for k, v in roundvals(best, 4).items()))

    # Get leveraged series with best params
    leveraged = leverage(proxy, factor=factor, expense=expense, borrow_rate=borrow_rate, ann_periods=ann_periods, **best)
    ref, lev = align(reference, leveraged, norm=True)  # Might be superfluous
    error = errstats(ref, lev, ann_periods=ann_periods)
    print(', '.join(f'{k}: {v}' for k, v in roundvals(error).items()))
    sim = leveraged[:reference.index[0]]
    simret = cumret(sim) if not sim.empty else 0
    print(f'CUMRET: sim {simret:.4f} + actual {cumret(reference):.4f} = {(simret + 1) * (cumret(reference) + 1) - 1:.4f}')
    if plot:
        telltale(ref, lev)
    return leveraged

### Example

You can leverage your own ETF by changing the tickers below.  Change `UPRO` to the leveraged ETF you want to extend, and `^SP500TR` to the index or fund it leverages.  Change the factor and expense ratio to match the leveraged fund.  Check that the telltale chart looks reasonably flat and close to 1.0.  As a rule of thumb, the RMSE should be below roughly 0.03.
`leveraged` will be the simulated leveraged price series.

In [10]:
letf, proxy = yget('UPRO'), yget('^SP500TR')

In [11]:
leveraged = plotbest(letf, proxy, factor=3, expense=0.0095, plot=True);


UPRO:^SP500TR 	params: expense_incr=0.0091
RMSE: 0.0089, MAE: 0.008, RETRMSE: 0.0021, RETMAE: 0.0012, CAGR: 0.002, VOL: 0.0048, P99: 0.0176
CUMRET: sim 1.1122 + actual 29.4411 = 63.2963


In [12]:
fig = plotret(proxy.rename('S&P500'), leveraged, title='3X S&P (UPRO) Back to the 80s')
rasterize(fig, True, filename='images/upro.png')

## Leverage All The Things
Below we extend many popular LETFs in bulk.  You can add more to the list, run the notebook, and they will be included in the output.

In [13]:
# Fund: (benchmark, leverage factor, expense ratio, issuer, start year for good data (or None to use all))
FUNDS = {
    # Mutual Funds
    'RYNVX': ('^SP500TR', 1.5, .0138, 'Rydex', '2000'),
    'ULPIX': ('^SP500TR', 2, .016, 'ProFunds', '2003'),
    #'RYTPX': ('^SP500TR', -2, .0184, 'Rydex', None),  # Bad data
    #'UOPIX': ('QQQ', 2, .0159, 'ProFunds', None),  # Bad data
    #'RYVNX': ('QQQ', -2, .0187, 'Rydex', None),    # Bad data
    'UAPIX': ('IWM', 2, .0178, 'ProFunds', '2003'),  # Russell 2000
    #'RYIRX': ('IWM', -2, .0191, 'Rydex', None),    # Russell 2000; Bad data
    'UMPIX': ('MDY', 2, .0166, 'ProFunds', '2003'), # S&P MidCap 400
    #'UDPIX': ('DIA', 2, .0172, 'ProFunds', None), # Dow; bad data
    'UTPIX': ('IDU', 1.5, .0173, 'ProFunds', '2004'), # Utilities
    'REPIX': ('IYR', 1.5, .0178, 'ProFunds', '2010'), # Real Estate
    #'SRPIX': ('IYR', -1, .0178, 'ProFunds', None), # Real Estate; Bad Data
    #'RYEUX': ('FEZ', 1.25, .0182, 'Rydex', None),  # EuroSTOXX 50; no good benchmark (data)
    'DXKLX': ('IEF', 2, .0143, 'Direxion', '2013'), # 7-10 Yr Treasury
    'DXKSX': ('IEF', -2, .014, 'Direxion', '2013'), # 7-10 Yr Treasury
    'UNPIX': ('EFA', 2, .0178, 'ProFunds', None),  # MSCI EAFE (large - mid foreign)
    'UUPIX': ('ADRE', 2, .0178, 'ProFunds', '2009'), # Emerging Markets
    
    
    # ETFs
    'SSO': ('^SP500TR', 2, .0091, 'ProShares', '2009'), # S&P 500
    'UPRO': ('^SP500TR', 3, .0093, 'ProShares', None),
    'SPXL': ('^SP500TR', 3, .0101, 'Direxion', '2013'),
    'SH': ('^SP500TR', -1, .009, 'ProShares', '2009'),
    'SDS': ('^SP500TR', -2, .0091, 'ProShares', '2009'),
    'SPXS': ('^SP500TR', -3, .0107, 'Direxion', '2013'),
    
    'QLD': ('QQQ', 2, .0095, 'ProShares', '2009'),     # NASDAQ 100
    'TQQQ': ('QQQ', 3, .0095, 'ProShares', None),
    'PSQ': ('QQQ', -1, .0095, 'ProShares', '2009'),
    'QID': ('QQQ', -2, .0095, 'ProShares', '2009'),
    'SQQQ': ('QQQ', -3, .0095, 'ProShares', None),     # Maybe 2013?
    
    'MVV': ('MDY', 2, .0095, 'ProShares', '2010'),     # MidCap 400
    'MYY': ('MDY', -1, .0095, 'ProShares', '2010'),
    'MZZ': ('MDY', -2, .0095, 'ProShares', '2010'),
    
    'UWM': ('IWM', 2, .0095, 'ProShares', '2010'),     # Russell 2000
    'TNA': ('IWM', 3, .0112, 'Direxion', '2013'),
    'RWM': ('IWM', -1, .0095, 'ProShares', '2010'),
    'TWM': ('IWM', -2, .0095, 'ProShares', '2009'),
    'TZA': ('IWM', -3, .0107, 'Direxion', '2013'),
    
    'URE': ('IYR', 2, .0095, 'ProShares', '2010'),     # Real Estate
    'REK': ('IYR', -1, .0095, 'ProShares', '2011'),
    #'SRS': ('IYR', -2, .0095, 'ProShares', '2010'),     # Poor fit
    
    'UPW': ('IDU', 2, .0095, 'ProShares', '2009'),     # Utilities
    'SDP': ('IDU', -2, .0095, 'ProShares', '2009'),
    
    'SOXL': ('SMH', 3, .0090, 'Direxion', None),       # Semiconductors
    'SOXS': ('SMH', -3, .0101, 'Direxion', None),
    
    'EFO': ('EFA', 2, .0095, 'ProShares', None),       # MSCI EAFE
    'EFZ': ('EFA', -1, .0095, 'ProShares', '2010'),    
    'EFU': ('EFA', -2, .0095, 'ProShares', '2010'),
    
    'EET': ('EEM', 2, .0095, 'ProShares', None),       # Emerging Markets
    'EUM': ('EEM', -1, .0095, 'ProShares', '2009'),
    'EEV': ('EEM', -2, .0095, 'ProShares', '2011'),
    
    'UST': ('IEF', 2, .0095, 'ProShares', '2012'),     # 7-10 Yr Treasury
    'TYD': ('IEF', 3, .0109, 'Direxion', '2010'),
    'TBX': ('IEF', -1, .0095, 'ProShares', None),
    'PST': ('IEF', -2, .0095, 'ProShares', '2011'),    
    'TYO': ('IEF', -3, .0108, 'Direxion', '2011'),
    
    'UBT': ('TLT', 2, .0095, 'ProShares', None),       # 20+ Yr Treasury
    'TMF': ('TLT', 3, .0105, 'Direxion', '2011'),
    'TBF': ('TLT', -1, .0094, 'ProShares', '2011'),
    'TBT': ('TLT', -2, .0092, 'ProShares', '2011'),    
    'TMV': ('TLT', -3, .0104, 'Direxion', '2011'),

    'UGL': ('GLD', 2, .0095, 'ProShares', None),       # Gold
    'GLL': ('GLD', -2, .0132, 'ProShares', None),
    
    'UVXY': ('^VIX', 1.5, .0095, 'ProShares', None),    # VIX
    'VIXY': ('^VIX', 1, .0085, 'ProShares', None)
}

### Get Prices

In [14]:
extras = ('VUSTX', 'BSV', 'SHY')   # We'll use these later

In [15]:
tickers = frozenset(chain.from_iterable((fund, proxy) for fund, (proxy, *_) in FUNDS.items())) | frozenset(extras)
prices = yget(tickers)
prices

| Date | TNA | TYD | TMF | ^SP500TR | MVV | QLD | UNPIX | TWM | EFO | ^VIX | ... | TBX | TBF | UPRO | DXKLX | MZZ | REK | SPXL | EEM | EUM | ADRE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1986-05-19 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1986-05-20 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1986-05-21 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1986-05-22 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1986-05-23 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2022-10-25 | 35.279999 | 28.379999 | 6.80 | 8190.020020 | 46.849998 | 41.020000 | 11.63 | 16.40 | 28.719999 | 28.459999 | ... | 29.190001 | 23.980000 | 34.730000 | 24.650000 | 18.170000 | 21.200001 | 65.580002 | 34.209999 | 17.240000 | 32.430000 |
| 2022-10-26 | 35.840000 | 28.790001 | 7.07 | 8129.620117 | 47.119999 | 39.180000 | 11.84 | 16.26 | 29.270000 | 27.280001 | ... | 29.020000 | 23.629999 | 33.959999 | 24.870001 | 18.070000 | 21.160000 | 64.080002 | 34.770000 | 16.980000 | 32.950001 |
| 2022-10-27 | 35.830002 | 29.480000 | 7.32 | 8080.319824 | 47.049999 | 37.730000 | 11.62 | 16.25 | 28.709999 | 27.389999 | ... | 28.809999 | 23.379999 | 33.369999 | 25.180000 | 18.120001 | 21.129999 | 63.009998 | 34.500000 | 17.090000 | 32.820000 |
| 2022-10-28 | 38.259998 | 28.990000 | 7.17 | 8280.110352 | 48.759998 | 40.060001 | 11.84 | 15.52 | 29.270000 | 25.750000 | ... | 28.969999 | 23.540001 | 35.709999 | 24.940001 | 17.450001 | 20.620001 | 67.459999 | 34.290001 | 17.190001 | 32.540001 |


In [16]:
rasterize(pd.concat((norm(p.dropna()) for _, p in prices.items()), axis=1).plot(title='All funds cumulative return'), RASTER, filename='images/cumulative.png')



## Leverage

Now for each LETF, we find the best parameters and splice the old synthetically leveraged data with the new actual data (rescaling the old data so that the most recent prices match current quotes).

Error metrics:

* **RMSE**: Root mean squared error of telltale chart: RMSE(sim / actual, 1.0)
* **MAE**: Mean absolute error of telltale chart: MAE(sim / actual, 1.0)
* **RETRMSE**: RMSE of simple daily returns
* **RETMAE**: MAE of simple daily returns
* **CAGR**: simulated CAGR - actual CAGR
* **VOL**: simulated (daily) volatility - actual volatility
* **P99**: 99th percentile of (absolute value of) telltale chart deviation
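
A minimal sketch of two of the telltale-based metrics in the list above (MAE and P99), using plain Python and made-up numbers. The hypothetical `percentile` helper interpolates linearly between sorted values, which is the pandas default for `quantile`; the two lists are assumed to cover the same dates.

```python
def percentile(values, q):
    """q-th quantile (0..1) with linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


actual = [100.0, 102.0, 101.0, 105.0, 107.0]      # made-up fund prices
simulated = [50.0, 51.5, 50.0, 52.5, 54.0]        # made-up simulated prices

# Telltale: normalized simulated / normalized actual; 1.0 means a perfect match
telltale = [(s / simulated[0]) / (a / actual[0]) for s, a in zip(simulated, actual)]
deviation = [abs(t - 1) for t in telltale]

mae = sum(deviation) / len(deviation)
p99 = percentile(deviation, 0.99)
print(f"MAE={mae:.4f}  P99={p99:.4f}")
```

MAE summarizes the typical gap between the two curves, while P99 shows how far apart they get on nearly the worst days.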


In [17]:
# Collect results from fitting and leveraging each ETF
sim, err, tells = {}, {}, {}
for name, (proxy, factor, exp, _, start) in FUNDS.items():
    # Cut out initial bad data from leveraged fund, align with proxy
    lfund, proxy = prices.loc[start:, name].dropna(), prices[proxy].dropna()
    
    leveraged = plotbest(lfund, proxy, factor, exp, plot=False)
    #params = find_params(lfund.iloc[len(lfund) // 2:], proxy, factor, exp)
    #leveraged = leverage(proxy, factor, exp, **params)
    
    sim[name] = splice(leveraged, lfund)
    lfund, lev = align(prices[name].dropna(), leveraged, norm=True)   # Plot whole series including bad initial data
    tells[name] = lev.div(lfund).rename(name)
    err[name] = errstats(lfund, lev)
    print()
    
tells = pd.concat(tells, axis=1, verify_integrity=True)
err = pd.DataFrame.from_dict(err, orient='index')



RYNVX:^SP500TR 	params: expense_incr=0.0029
RMSE: 0.0109, MAE: 0.0087, RETRMSE: 0.0014, RETMAE: 0.0004, CAGR: 0.0009, VOL: -0.0009, P99: 0.0312
CUMRET: sim 10.5708 + actual 2.3429 = 37.6794

ULPIX:^SP500TR 	params: expense_incr=0.0087
RMSE: 0.0106, MAE: 0.0071, RETRMSE: 0.0051, RETMAE: 0.0007, CAGR: 0.0004, VOL: 0.0025, P99: 0.0322
CUMRET: sim 4.3050 + actual 8.0916 = 47.2308





UAPIX:IWM 	params: expense_incr=0.0047
RMSE: 0.013, MAE: 0.0107, RETRMSE: 0.006, RETMAE: 0.0025, CAGR: 0.001, VOL: -0.0059, P99: 0.0269
CUMRET: sim -0.4423 + actual 4.6615 = 2.1576





UMPIX:MDY 	params: expense_incr=0.0009
RMSE: 0.0213, MAE: 0.0196, RETRMSE: 0.0065, RETMAE: 0.0023, CAGR: 0.0012, VOL: 0.0052, P99: 0.0301
CUMRET: sim 1.5612 + actual 8.6822 = 23.7981

UTPIX:IDU 	params: expense_incr=0.0016
RMSE: 0.011, MAE: 0.0088, RETRMSE: 0.0045, RETMAE: 0.0013, CAGR: -0.0013, VOL: -0.0081, P99: 0.0291
CUMRET: sim -0.2358 + actual 4.9212 = 3.5249





REPIX:IYR 	params: expense_incr=-0.0024
RMSE: 0.0064, MAE: 0.0041, RETRMSE: 0.0051, RETMAE: 0.0014, CAGR: 0.0029, VOL: -0.0039, P99: 0.0133
CUMRET: sim 0.5066 + actual 2.0239 = 3.5558

DXKLX:IEF 	params: expense_incr=0.004
RMSE: 0.0051, MAE: 0.0042, RETRMSE: 0.0008, RETMAE: 0.0003, CAGR: -0.0018, VOL: 0.0005, P99: 0.011
CUMRET: sim 1.5137 + actual -0.1615 = 1.1077

DXKSX:IEF 	params: expense_incr=0.0112
RMSE: 0.0312, MAE: 0.0269, RETRMSE: 0.0016, RETMAE: 0.0004, CAGR: -0.0036, VOL: 0.0, P99: 0.0616
CUMRET: sim -0.7041 + actual -0.1806 = -0.7575





UNPIX:EFA 	params: expense_incr=0.0182
RMSE: 0.0083, MAE: 0.0057, RETRMSE: 0.0066, RETMAE: 0.0022, CAGR: 0.0002, VOL: -0.018, P99: 0.026
CUMRET: sim 0.8977 + actual -0.5914 = -0.2245





UUPIX:ADRE 	params: expense_incr=-0.0003
RMSE: 0.0074, MAE: 0.0053, RETRMSE: 0.0074, RETMAE: 0.0037, CAGR: 0.0003, VOL: 0.0008, P99: 0.0178
CUMRET: sim 1.1897 + actual -0.2458 = 0.6515

SSO:^SP500TR 	params: expense_incr=0.0038
RMSE: 0.008, MAE: 0.0075, RETRMSE: 0.0014, RETMAE: 0.0009, CAGR: 0.0013, VOL: 0.0037, P99: 0.0132
CUMRET: sim 4.0295 + actual 13.3681 = 71.2649

UPRO:^SP500TR 	params: expense_incr=0.0093
RMSE: 0.0089, MAE: 0.008, RETRMSE: 0.0021, RETMAE: 0.0012, CAGR: 0.002, VOL: 0.0048, P99: 0.0176
CUMRET: sim 1.1122 + actual 29.4411 = 63.2963





SPXL:^SP500TR 	params: expense_incr=0.0131
RMSE: 0.0022, MAE: 0.0017, RETRMSE: 0.0018, RETMAE: 0.0011, CAGR: -0.0, VOL: 0.007, P99: 0.0062
CUMRET: sim 5.1294 + actual 8.1346 = 54.9890

SH:^SP500TR 	params: expense_incr=-0.0
RMSE: 0.0047, MAE: 0.0044, RETRMSE: 0.0008, RETMAE: 0.0005, CAGR: -0.0003, VOL: 0.0008, P99: 0.0077
CUMRET: sim -0.4836 + actual -0.8836 = -0.9399





SDS:^SP500TR 	params: expense_incr=0.0
RMSE: 0.0104, MAE: 0.0096, RETRMSE: 0.0014, RETMAE: 0.0009, CAGR: -0.0005, VOL: 0.004, P99: 0.0175
CUMRET: sim -0.9382 + actual -0.9912 = -0.9995







SPXS:^SP500TR 	params: expense_incr=0.0037
RMSE: 0.013, MAE: 0.0114, RETRMSE: 0.002, RETMAE: 0.0012, CAGR: -0.001, VOL: 0.0059, P99: 0.0231
CUMRET: sim -0.9998 + actual -0.9942 = -1.0000

QLD:QQQ 	params: expense_incr=0.0022
RMSE: 0.0042, MAE: 0.0038, RETRMSE: 0.0013, RETMAE: 0.0007, CAGR: 0.0012, VOL: 0.0015, P99: 0.0111
CUMRET: sim -0.9262 + actual 43.2865 = 2.2691







TQQQ:QQQ 	params: expense_incr=0.0051
RMSE: 0.0115, MAE: 0.01, RETRMSE: 0.002, RETMAE: 0.001, CAGR: 0.0036, VOL: 0.0089, P99: 0.03
CUMRET: sim -0.9936 + actual 49.2931 = -0.6799

PSQ:QQQ 	params: expense_incr=0.0038
RMSE: 0.0063, MAE: 0.0058, RETRMSE: 0.0009, RETMAE: 0.0005, CAGR: -0.0003, VOL: 0.0003, P99: 0.0105
CUMRET: sim -0.1814 + actual -0.9484 = -0.9578







QID:QQQ 	params: expense_incr=0.0088
RMSE: 0.0162, MAE: 0.0144, RETRMSE: 0.0013, RETMAE: 0.0008, CAGR: -0.0004, VOL: 0.0018, P99: 0.0274
CUMRET: sim -0.8510 + actual -0.9985 = -0.9998







SQQQ:QQQ 	params: expense_incr=0.0098
RMSE: 0.014, MAE: 0.0115, RETRMSE: 0.0022, RETMAE: 0.0011, CAGR: 0.0004, VOL: 0.0074, P99: 0.0304
CUMRET: sim -0.9983 + actual -0.9999 = -1.0000

MVV:MDY 	params: expense_incr=0.0003
RMSE: 0.0031, MAE: 0.0026, RETRMSE: 0.0015, RETMAE: 0.001, CAGR: 0.0002, VOL: 0.0009, P99: 0.0065
CUMRET: sim 3.4626 + actual 5.9888 = 30.1888

MYY:MDY 	params: expense_incr=0.0049
RMSE: 0.0026, MAE: 0.0022, RETRMSE: 0.0017, RETMAE: 0.0011, CAGR: -0.0002, VOL: -0.0009, P99: 0.006
CUMRET: sim -0.7471 + actual -0.8544 = -0.9632







MZZ:MDY 	params: expense_incr=0.0096
RMSE: 0.0088, MAE: 0.0077, RETRMSE: 0.0041, RETMAE: 0.0023, CAGR: -0.0009, VOL: 0.0081, P99: 0.0194
CUMRET: sim -0.9804 + actual -0.9870 = -0.9997

UWM:IWM 	params: expense_incr=0.0008
RMSE: 0.0095, MAE: 0.0089, RETRMSE: 0.0013, RETMAE: 0.0009, CAGR: 0.0017, VOL: 0.0007, P99: 0.0174
CUMRET: sim -0.1334 + actual 3.7907 = 3.1514







TNA:IWM 	params: expense_incr=0.01
RMSE: 0.0217, MAE: 0.0198, RETRMSE: 0.0021, RETMAE: 0.0011, CAGR: 0.0024, VOL: 0.0079, P99: 0.0324
CUMRET: sim -0.6441 + actual 1.2746 = -0.1904

RWM:IWM 	params: expense_incr=0.0095
RMSE: 0.0073, MAE: 0.0061, RETRMSE: 0.0009, RETMAE: 0.0006, CAGR: -0.0004, VOL: -0.0007, P99: 0.0142
CUMRET: sim -0.5349 + actual -0.8594 = -0.9346







TWM:IWM 	params: expense_incr=0.027
RMSE: 0.0331, MAE: 0.0285, RETRMSE: 0.0017, RETMAE: 0.001, CAGR: -0.0015, VOL: 0.0026, P99: 0.0564
CUMRET: sim -0.7893 + actual -0.9960 = -0.9991

TZA:IWM 	params: expense_incr=0.0325
RMSE: 0.0221, MAE: 0.0179, RETRMSE: 0.0022, RETMAE: 0.0012, CAGR: -0.0012, VOL: 0.0032, P99: 0.0593
CUMRET: sim -0.9991 + actual -0.9958 = -1.0000







URE:IYR 	params: expense_incr=-0.001
RMSE: 0.0101, MAE: 0.0086, RETRMSE: 0.0019, RETMAE: 0.0012, CAGR: 0.0017, VOL: 0.0001, P99: 0.0204
CUMRET: sim 0.0087 + actual 2.8700 = 2.9036

REK:IYR 	params: expense_incr=0.0085
RMSE: 0.0064, MAE: 0.0054, RETRMSE: 0.0025, RETMAE: 0.0018, CAGR: -0.0006, VOL: -0.0002, P99: 0.0129
CUMRET: sim -0.8344 + actual -0.7329 = -0.9558







UPW:IDU 	params: expense_incr=-0.0008
RMSE: 0.0077, MAE: 0.0062, RETRMSE: 0.0068, RETMAE: 0.0042, CAGR: -0.0003, VOL: 0.005, P99: 0.0207
CUMRET: sim -0.1321 + actual 6.1269 = 5.1853

SDP:IDU 	params: expense_incr=0.015
RMSE: 0.0114, MAE: 0.0092, RETRMSE: 0.0071, RETMAE: 0.0047, CAGR: 0.0014, VOL: 0.0016, P99: 0.033
CUMRET: sim -0.6953 + actual -0.9825 = -0.9947







SOXL:SMH 	params: expense_incr=0.04
RMSE: 0.1505, MAE: 0.1227, RETRMSE: 0.0099, RETMAE: 0.0064, CAGR: -0.0071, VOL: -0.0249, P99: 0.3593
CUMRET: sim -1.0000 + actual 14.4159 = -0.9994

SOXS:SMH 	params: expense_incr=0.0724
RMSE: 0.0564, MAE: 0.0423, RETRMSE: 0.0097, RETMAE: 0.0065, CAGR: -0.0002, VOL: -0.0355, P99: 0.1559
CUMRET: sim -0.9990 + actual -1.0000 = -1.0000







EFO:EFA 	params: expense_incr=0.0064
RMSE: 0.0181, MAE: 0.0136, RETRMSE: 0.0158, RETMAE: 0.0077, CAGR: 0.0022, VOL: -0.0311, P99: 0.0496
CUMRET: sim -0.2541 + actual 0.5526 = 0.1581







EFZ:EFA 	params: expense_incr=0.0042
RMSE: 0.003, MAE: 0.0027, RETRMSE: 0.001, RETMAE: 0.0007, CAGR: -0.0001, VOL: -0.0001, P99: 0.0053
CUMRET: sim -0.4966 + actual -0.6033 = -0.8003

EFU:EFA 	params: expense_incr=0.0114
RMSE: 0.0125, MAE: 0.0103, RETRMSE: 0.0054, RETMAE: 0.0028, CAGR: -0.0017, VOL: 0.0042, P99: 0.0282
CUMRET: sim -0.8715 + actual -0.8978 = -0.9869







EET:EEM 	params: expense_incr=0.0029
RMSE: 0.0049, MAE: 0.0035, RETRMSE: 0.0046, RETMAE: 0.0029, CAGR: 0.0008, VOL: 0.0051, P99: 0.0158
CUMRET: sim 2.3851 + actual -0.3052 = 1.3521







EUM:EEM 	params: expense_incr=0.0101
RMSE: 0.0042, MAE: 0.0036, RETRMSE: 0.0012, RETMAE: 0.0007, CAGR: -0.0001, VOL: 0.0003, P99: 0.0082
CUMRET: sim -0.7728 + actual -0.7644 = -0.9465

EEV:EEM 	params: expense_incr=0.0193
RMSE: 0.0054, MAE: 0.0042, RETRMSE: 0.0016, RETMAE: 0.0012, CAGR: 0.0, VOL: 0.0006, P99: 0.0132
CUMRET: sim -0.9973 + actual -0.8036 = -0.9995

UST:IEF 	params: expense_incr=-0.0028
RMSE: 0.0028, MAE: 0.0024, RETRMSE: 0.0014, RETMAE: 0.0009, CAGR: 0.0002, VOL: 0.0026, P99: 0.0058
CUMRET: sim 1.6583 + actual -0.0176 = 1.6114







TYD:IEF 	params: expense_incr=-0.0041
RMSE: 0.0114, MAE: 0.0083, RETRMSE: 0.0073, RETMAE: 0.0037, CAGR: 0.0002, VOL: -0.0179, P99: 0.0281
CUMRET: sim 1.0243 + actual 0.7046 = 2.4508

TBX:IEF 	params: expense_incr=0.0056
RMSE: 0.0022, MAE: 0.0017, RETRMSE: 0.002, RETMAE: 0.0012, CAGR: 0.0002, VOL: -0.0052, P99: 0.0067
CUMRET: sim -0.2602 + actual -0.2614 = -0.4536







PST:IEF 	params: expense_incr=0.0106
RMSE: 0.007, MAE: 0.0061, RETRMSE: 0.0012, RETMAE: 0.0008, CAGR: -0.0003, VOL: 0.0014, P99: 0.012
CUMRET: sim -0.5238 + actual -0.4530 = -0.7395

TYO:IEF 	params: expense_incr=0.0362
RMSE: 0.017, MAE: 0.0139, RETRMSE: 0.0038, RETMAE: 0.0025, CAGR: -0.0033, VOL: -0.0, P99: 0.0399
CUMRET: sim -0.7573 + actual -0.6907 = -0.9249







UBT:TLT 	params: expense_incr=-0.0052
RMSE: 0.0107, MAE: 0.0092, RETRMSE: 0.0034, RETMAE: 0.0017, CAGR: 0.0013, VOL: 0.0047, P99: 0.0194
CUMRET: sim 0.8040 + actual 0.3944 = 1.5156

TMF:TLT 	params: expense_incr=-0.0005
RMSE: 0.0047, MAE: 0.0035, RETRMSE: 0.0019, RETMAE: 0.0012, CAGR: 0.0003, VOL: 0.0071, P99: 0.0116
CUMRET: sim 0.9318 + actual -0.0685 = 0.7994







TBF:TLT 	params: expense_incr=0.0047
RMSE: 0.0037, MAE: 0.003, RETRMSE: 0.0016, RETMAE: 0.0006, CAGR: -0.0004, VOL: 0.0035, P99: 0.0062
CUMRET: sim -0.3364 + actual -0.4540 = -0.6377







TBT:TLT 	params: expense_incr=0.0077
RMSE: 0.0061, MAE: 0.0045, RETRMSE: 0.0029, RETMAE: 0.001, CAGR: -0.0007, VOL: 0.0102, P99: 0.0106
CUMRET: sim -0.6524 + actual -0.7564 = -0.9153

TMV:TLT 	params: expense_incr=0.0233
RMSE: 0.0179, MAE: 0.0152, RETRMSE: 0.0015, RETMAE: 0.0011, CAGR: -0.0027, VOL: 0.0042, P99: 0.0369
CUMRET: sim -0.8603 + actual -0.9276 = -0.9899

UGL:GLD 	params: expense_incr=0.0135
RMSE: 0.0231, MAE: 0.0216, RETRMSE: 0.0018, RETMAE: 0.0011, CAGR: 0.0017, VOL: -0.0023, P99: 0.035
CUMRET: sim 0.8693 + actual 0.8409 = 2.4411







GLL:GLD 	params: expense_incr=0.0176
RMSE: 0.0441, MAE: 0.0397, RETRMSE: 0.0019, RETMAE: 0.0011, CAGR: -0.005, VOL: -0.002, P99: 0.0811
CUMRET: sim -0.7466 + actual -0.9255 = -0.9811







UVXY:^VIX 	params: expense_incr=1.7237
RMSE: 0.787, MAE: 0.7281, RETRMSE: 0.0675, RETMAE: 0.0455, CAGR: -0.0852, VOL: 0.7719, P99: 1.103
CUMRET: sim -1.0000 + actual -1.0000 = -1.0000







VIXY:^VIX 	params: expense_incr=0.6981
RMSE: 0.2415, MAE: 0.1985, RETRMSE: 0.0491, RETMAE: 0.0328, CAGR: -0.0129, VOL: 0.6402, P99: 0.5852
CUMRET: sim -1.0000 + actual -0.9995 = -1.0000







In [18]:
err.eval("CAGR = abs(CAGR)\nVOL = abs(VOL)").describe()

| | RMSE | MAE | RETRMSE | RETMAE | CAGR | VOL | P99 |
|---|---|---|---|---|---|---|---|
| count | 55.0 | 55.0 | 55.0 | 55.0 | 55.0 | 55.0 | 55.0 |
| mean | 0.074151 | 0.068493 | 0.006062 | 0.00339 | 0.004955 | 0.032623 | 0.112544 |
| std | 0.115692 | 0.107303 | 0.010765 | 0.007265 | 0.011492 | 0.132866 | 0.169713 |
| min | 0.002229 | 0.001736 | 0.001265 | 0.000652 | 4.3e-05 | 8.3e-05 | 0.006654 |
| 25% | 0.016831 | 0.013944 | 0.002202 | 0.001176 | 0.000981 | 0.001894 | 0.029912 |
| 50% | 0.03156 | 0.026359 | 0.002864 | 0.00141 | 0.002145 | 0.004828 | 0.064031 |
| 75% | 0.101058 | 0.098495 | 0.006234 | 0.002581 | 0.00537 | 0.009563 | 0.132151 |
| max | 0.787035 | 0.728121 | 0.067503 | 0.045454 | 0.085193 | 0.771907 | 1.102981 |
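
The `CAGR` and `VOL` columns are signed differences (several funds above show negative values), so they are converted to absolute values before summarizing. Otherwise overshoots and undershoots would cancel in the mean. A minimal sketch of the effect, using made-up numbers and plain Python rather than the notebook's pandas call:

```python
from statistics import mean

# Hypothetical signed CAGR errors for four funds (not taken from the table above)
cagr_err = [0.004, -0.004, 0.002, -0.002]

signed_mean = mean(cagr_err)                   # over- and undershoots cancel out
abs_mean = mean(abs(e) for e in cagr_err)      # typical size of the miss

print(f"signed mean:   {signed_mean:.4f}")
print(f"absolute mean: {abs_mean:.4f}")
```

The signed mean suggests a perfect fit even though every fund is off by 20 to 40 basis points; the absolute mean reports the miss.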


### Telltale Charts

In [19]:
lev_colors = {name: color_leverage(factor, alpha=0.2) for name, (_, factor, _, _, _) in FUNDS.items()}
fig = tells.plot(color_discrete_map=lev_colors, title='Telltale, Simulated vs. Actual Leveraged ETFs<br>In-sample fit', render_mode='webgl')\
    .update_layout(yaxis_title='Simulated / Actual', legend_title_text='Green = long<br>Red = short')
rasterize(fig, RASTER, filename='images/telltales.png')
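
A telltale chart plots the ratio of simulated to actual prices over time, which is what the y-axis title above says. A flat line at 1 means the simulation tracks the fund perfectly; a drifting line means the two are compounding at different rates. Here is a rough sketch of the calculation on plain lists. The `telltale` helper is hypothetical and assumes both series are already aligned on the same dates; the notebook does this with pandas.

```python
def telltale(simulated, actual):
    """Ratio of two price series after normalizing each to start at 1."""
    sim0, act0 = simulated[0], actual[0]
    return [(s / sim0) / (a / act0) for s, a in zip(simulated, actual)]

# Made-up prices: the simulation gains 10% on both days,
# the actual fund gains 10% and is then flat.
sim_prices = [10.0, 11.0, 12.1]
act_prices = [100.0, 110.0, 110.0]

ratio = telltale(sim_prices, act_prices)
print([round(r, 4) for r in ratio])
```

The ratio stays at 1 while the two series move together and rises to about 1.1 once the simulation pulls ahead.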

#### Patch up TMF / VUSTX
TMF's proxy TLT only goes back to 2002, so we use VUSTX before that.  It's not an exact proxy, but better than nothing.  ¯\_(ツ)_/¯

In [20]:
_, factor, exp, _, start = FUNDS['TMF']
sim_vustx = plotbest(prices.loc[start:, 'TMF'].dropna(), prices['VUSTX'].dropna(), factor, exp);





TMF:VUSTX 	params: expense_incr=-0.0023
RMSE: 0.0301, MAE: 0.0253, RETRMSE: 0.0047, RETMAE: 0.0027, CAGR: 0.0075, VOL: -0.035, P99: 0.0796
CUMRET: sim 12.0094 + actual -0.0685 = 11.1179


Splice VUSTX + TLT + TMF
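
Splicing here means rescaling the earlier series so it meets the later one at the join date, then using the later series from that point on. Rescaling changes the price level but leaves the daily returns of the earlier series intact. The sketch below is a hypothetical stand-in, not the notebook's `splice` function: it works on plain dicts and assumes the earlier series has a price on the exact date where the later one starts.

```python
def splice_sketch(early, late):
    """Rescale `early` to meet `late` at late's first date, then append `late`.

    Both arguments map ISO date strings to prices (an assumption of this sketch).
    """
    join = min(late)                      # first date covered by the later series
    scale = late[join] / early[join]      # assumes `early` has a price on that date
    out = {d: p * scale for d, p in early.items() if d < join}
    out.update(late)
    return dict(sorted(out.items()))

# Made-up prices
early = {"2002-07-29": 50.0, "2002-07-30": 55.0, "2002-07-31": 60.0}
late = {"2002-07-31": 120.0, "2002-08-01": 126.0}

spliced = splice_sketch(early, late)
for date, price in spliced.items():
    print(date, price)
```

The first day's 10% gain (50 to 55) survives as 100 to 110 in the spliced series.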

In [21]:
sim['TMF'] = splice(sim_vustx, sim['TMF'])
plotret(cat(splice(prices['VUSTX'].dropna(), prices['TLT'].dropna()), sim['TMF']), title='The Great Bond Bull Run in One Figure')





Save the reconstructed TLT as well: it isn't leveraged, but it might be useful.

In [22]:
sim["TLT"] = splice(prices['VUSTX'].dropna(), prices['TLT'].dropna())





### Reconstruct unleveraged ETFs

#### BSV, use SHY as proxy

In [25]:
# BSV, use SHY as proxy
sim_shy = plotbest(prices['BSV'].dropna(), prices['SHY'].dropna(), 1, .0004)





BSV:SHY 	params: expense_incr=-0.0095
RMSE: 0.0128, MAE: 0.0098, RETRMSE: 0.0016, RETMAE: 0.0007, CAGR: 0.0025, VOL: -0.0164, P99: 0.0397
CUMRET: sim 0.1747 + actual 0.3680 = 0.6069
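
A note on reading the `CUMRET` lines: the `+` does not appear to be plain addition. The printed totals are consistent with compounding the simulated-period return and the actual-period return, which is what you get when the two periods are chained end to end. A quick check against the BSV:SHY line above (the `compound` helper is just for illustration):

```python
def compound(*returns):
    """Chain cumulative returns: (1 + r1) * (1 + r2) * ... - 1."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0

combined = compound(0.1747, 0.3680)   # sim and actual values from the BSV:SHY output
print(f"{combined:.4f}")              # prints 0.6070
```

That is within rounding of the reported 0.6069; the inputs here are already rounded to four decimals.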


In [28]:
plotret(splice(prices['SHY'].dropna(), prices['BSV'].dropna()))





In [30]:
sim["BSV"] = splice(prices['SHY'].dropna(), prices['BSV'].dropna())





### Save Results to CSV

In [31]:
def is_mutual_fund(ticker):
    return len(ticker) == 5 and ticker.endswith('X')

In [32]:
filename = 'extended-leveraged-etfs.csv'
pd.concat((data for name, data in sim.items() if not is_mutual_fund(name)), axis=1, verify_integrity=True).to_csv(filename, float_format='%.5f')
!du -h $filename; echo
!head -3 $filename; echo; tail -2 $filename

3.0M	extended-leveraged-etfs.csv

Date,SSO,UPRO,SPXL,SH,SDS,SPXS,QLD,TQQQ,PSQ,QID,SQQQ,MVV,MYY,MZZ,UWM,TNA,RWM,TWM,TZA,URE,REK,UPW,SDP,SOXL,SOXS,EFO,EFZ,EFU,EET,EUM,EEV,UST,TYD,TBX,PST,TYO,UBT,TMF,TBF,TBT,TMV,UGL,GLL,UVXY,VIXY,TLT,BSV
1986-05-19,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.69264,,,,,,,,10.60536,
1986-05-20,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.68808,,,,,,,,10.58416,

2022-10-27,44.51000,33.37000,63.01000,16.20000,47.66000,23.88000,37.73000,19.90000,14.53000,25.82000,55.20000,47.05000,25.38000,18.12000,33.50000,35.83000,24.18000,16.25000,34.69000,56.13000,21.13000,60.90000,12.83000,8.52000,63.28000,28.71000,22.33000,15.06000,39.99000,17.09000,29.63000,46.80000,29.48000,28.81000,22.53000,13.82000,23.27000,7.32000,23.38000,34.87000,152.58000,46.71000,37.26000,10.39000,14.86000,97.47000,74.81000
2022-10-28,46.63000,35.71000,67.46000,15.83000,45.39000,22.18000,40.06000,21.70000,14.10000,24.25000,50.18000,48.76000,24.94000,17.45000,35.00000,38.26000,23.65000,15.52000,32.33000,58.
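
The empty fields in the first rows are expected: each fund starts on a different date, so joining them column-wise leaves blanks wherever a series has no data yet. A small sketch of the same layout using only the standard library, with made-up series and hypothetical ticker names:

```python
import csv
import io

# Two made-up series with different start dates
series = {
    "LONGHIST": {"1986-05-19": 0.69264, "1986-05-20": 0.68808},
    "SHORTHIST": {"1986-05-20": 1.5},
}

dates = sorted(set().union(*series.values()))    # union of all dates
buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(["Date", *series])
for d in dates:
    # Five decimals, like float_format='%.5f'; blank where a series has no value
    writer.writerow([d] + [f"{s[d]:.5f}" if d in s else "" for s in series.values()])

print(buf.getvalue())
```

When reading the CSV back, treat blank fields as missing values rather than zeros.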

#### Download

In [33]:
display(HTML(f'<h3><a href="{filename}" download>Download CSV</a></h3>'))

In [34]:
assert False, "The note-buck stops here"

AssertionError: The note-buck stops here

## Appendix: Model Selection
Experiments to find the (sub)set of curve-fitting parameters that minimizes the out-of-sample prediction error.

This isn't necessary to use the leveraging machinery above.

In [None]:
def oos_error(funds, prices, param_ranges):
    """Find the best leverage parameters for the last half of each fund in `funds`, then use them
    to leverage the first half of each fund and compute the error."""
    tells = {}
    params = {}
    err = {}
    for name, (proxy, factor, exp, _, start) in funds.items():
        # Cut out initial bad data from leveraged fund, align with proxy
        proxy = prices[proxy].dropna()
        lfund, _ = align(prices.loc[start:, name], proxy)  # We do *not* want to modify the proxy
        # Find best params for last half of data
        assert len(lfund) > 500, "That's not enough data!"
        mid = len(lfund) // 2
        params[name] = find_params(lfund.iloc[mid:], proxy, factor, exp, params=param_ranges)
        # Use params to leverage all data
        lev = leverage(proxy, factor, exp, **params[name])
        # Compute error on first half (out of sample)
        err[name] = errstats(lfund.iloc[:mid], lev)
        # Plot the whole thing
        lfund, lev = align(prices[name], lev, norm=True)   # Plot whole series including bad initial data
        tells[name] = lev.div(lfund).rename(name)
        
    return pd.DataFrame.from_dict(err, orient='index'), pd.DataFrame.from_dict(params, orient='index'), pd.concat(tells, axis=1, verify_integrity=True)
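
Fitting on the later half and scoring on the earlier half mirrors how the model is actually used: parameters are estimated on recent data and then applied to a past the leveraged fund never traded in. The toy below shows only that split. Everything in it is an assumption made for illustration: the `lever` model (scaled daily returns minus a flat cost), the grid search, the synthetic data, and the error computed on returns instead of prices. It does not reproduce `leverage`, `find_params`, or `errstats`.

```python
import random

TRADING_DAYS = 252  # assumed here; the notebook imports its own constant from util

def lever(returns, factor, annual_cost):
    """Toy model: scale daily returns by `factor`, subtract a flat daily cost."""
    return [factor * r - annual_cost / TRADING_DAYS for r in returns]

def rmse(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

random.seed(0)
proxy = [random.gauss(0.0004, 0.01) for _ in range(1000)]
# Pretend "actual" 2x fund: 2% annual drag plus a little tracking noise
actual = [r + random.gauss(0, 0.0005) for r in lever(proxy, 2, 0.02)]

mid = len(actual) // 2
grid = [i / 1000 for i in range(51)]     # candidate annual costs, 0% to 5%

# Fit on the most recent half only
best = min(grid, key=lambda c: rmse(lever(proxy[mid:], 2, c), actual[mid:]))
# Score on the earlier half, which the fit never saw
oos = rmse(lever(proxy[:mid], 2, best), actual[:mid])

print(f"fitted cost: {best:.3f}, out-of-sample RMSE: {oos:.5f}")
```

If adding a parameter lowers the in-sample error but raises this out-of-sample number, it is probably fitting noise.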

In [None]:
def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
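
For n candidate parameters the powerset has 2**n subsets, including the empty one, so the search below gets expensive quickly as parameters are added. A short usage sketch: `expense_incr` appears in the output above, while the other two names are placeholders, not real parameters.

```python
from itertools import chain, combinations

def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

names = ["expense_incr", "param_b", "param_c"]   # last two are hypothetical
labels = [" + ".join(subset) or "none" for subset in powerset(names)]

print(len(labels))        # 8 subsets for 3 names
print(labels[0])          # none
print(labels[-1])         # expense_incr + param_b + param_c
```

The empty subset is labeled `none` because joining an empty tuple gives an empty string, which is falsy.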

In [None]:
%%time
# Compute out of sample error for all subsets of fitting parameters, show median + IQR for RMSE and CAGR
# This gives (roughly) equal weight to each fund in the error
result = []
for subset in powerset(ALL_LEV_PARAMS):
    combo = ' + '.join(subset) or 'none'
    print(combo)
    err, params, tells = oos_error(FUNDS, prices, {p: ALL_LEV_PARAMS[p] for p in subset})
    summ = err.eval("CAGR = abs(CAGR)\nVOL = abs(VOL)").describe()
    summ.loc['iqr', :] = summ.loc['75%', :] - summ.loc['25%', :]
    result.append({'params': combo,
                   'RMSE': summ.loc['50%', 'RMSE'], 'RMSE_iqr': summ.loc['iqr', 'RMSE'],
                   'CAGR': summ.loc['50%', 'CAGR'], 'CAGR_iqr': summ.loc['iqr', 'CAGR'],
    })
    #display(summ)
    #display(params)
    
result = pd.DataFrame(result)
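
Median and IQR are used instead of mean and standard deviation because a few funds track very badly and would dominate a mean. In the in-sample summary earlier, the mean RMSE (0.074) is more than double the median (0.032), and the maximum (0.787) comes from UVXY. A sketch with made-up numbers, using the standard library in place of `describe()`:

```python
from statistics import mean, median, quantiles

# Hypothetical per-fund RMSE values; the last one mimics a badly tracked fund
rmse_by_fund = [0.002, 0.003, 0.004, 0.005, 0.8]

# Quartiles by linear interpolation between data points
q1, q2, q3 = quantiles(rmse_by_fund, n=4, method="inclusive")
iqr = q3 - q1

print(f"mean   {mean(rmse_by_fund):.4f}")
print(f"median {median(rmse_by_fund):.4f}  IQR {iqr:.4f}")
```

One outlier pushes the mean to roughly 40 times the median, while the median and IQR still describe the typical fund.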

In [None]:
result.set_index('params').style.background_gradient()

In [None]:
# Show error stats for best model / fit params
err, params, tells = oos_error(FUNDS, prices, LEV_PARAMS)

In [None]:
err.describe()

In [None]:
params.describe()

In [None]:
fig = tells.plot(color_discrete_map=lev_colors, title='Telltale, Simulated vs. Actual Leveraged ETFs<br>Out-of-sample fit')\
    .update_layout(yaxis_title='Simulated / Actual', legend_title_text='Green = long<br>Red = short')
rasterize(fig, RASTER, filename='images/telltales-oos.png')