# Algothon 2023 x Fyde Treasury Protocol
*Momentum Capture for Smart Beta ETF*

## Instructions
Fyde_Algothon_Instructions.pdf contains challenge objectives, grading, and submission instructions.
README.md contains technical and installation instructions.

We have provided you with a basic script that contains the following functionalities:
1.   Load the provided synthetic data
2.   Apply an initial selection filter to choose assets
3.   Create a market-weighted index benchmark
4.   Optimize the manager's ETF portfolio
5.   Calculate a rebalancer for the portfolio
6.   Evaluate performance using a variety of metrics

The main areas for improvement are sections 2, 4, and 5.

## Objective
Your job will be to build a mechanism to capture downward momentum for the portfolio. You can update any of the provided code or create new functions to accomplish your task.


## Note:
We encourage out-of-the-box thinking and the generation of new ideas. While the foundation is important, the sky's the limit when it comes to innovation. 

Good luck!

**Install Packages**

In [None]:
import sys
!{sys.executable} -m pip install -r requirements.txt

**Load Packages**

In [2]:
import pandas as pd
import numpy as np
import helper
import project_helper
import project_tests

## Section 1: Load Market Data

---

The data in the provided CSV file is drawn from the same underlying distribution as the comparison data that will be used for grading purposes.

In [3]:
df = pd.read_csv('Fyde_data.csv',parse_dates=['date'], dayfirst=True)
df['date'] = df['date'].dt.strftime('%Y-%m-%d-%H-%M-%S')

## Section 2: Apply selection filter

---


For this initial example, we select the assets that rank in the top 50% by dollar volume across all assets.
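
The implementation of `project_helper.large_dollar_volume_tokens` is not shown in this notebook. As a rough sketch of the idea, assuming the filter ranks tickers by total dollar volume (price times volume) and keeps the top fraction, something like the following could be used. The function name `top_dollar_volume_tickers` and the toy data are hypothetical and may differ from what the helper actually does.

```python
import pandas as pd

def top_dollar_volume_tickers(df, price_col, volume_col, top_fraction):
    """Hypothetical helper: tickers in the top `top_fraction` by total dollar volume."""
    # Dollar volume per row, summed over all dates for each ticker
    dollar_volume = (df[price_col] * df[volume_col]).groupby(df['ticker']).sum()
    n_keep = max(1, int(len(dollar_volume) * top_fraction))
    return dollar_volume.nlargest(n_keep).index.tolist()

# Toy data, made up for illustration only
toy = pd.DataFrame({
    'ticker': ['AAA', 'AAA', 'BBB', 'BBB', 'CCC', 'CCC', 'DDD', 'DDD'],
    'adj_close': [10.0, 11.0, 2.0, 2.0, 50.0, 55.0, 1.0, 1.0],
    'adj_volume': [100, 100, 1000, 1000, 10, 10, 50, 50],
})
selected = top_dollar_volume_tickers(toy, 'adj_close', 'adj_volume', 0.5)
print(selected)
```

Replacing this volume-only filter with one that also looks at recent price behaviour is one possible way to approach the improvements suggested for Section 2.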

In [4]:
percent_top_dollar = 0.5
high_volume_symbols = project_helper.large_dollar_volume_tokens(df, 'adj_close', 'adj_volume', percent_top_dollar)
df = df[df['ticker'].isin(high_volume_symbols)]

df = df.set_index(['date', 'ticker']).sort_index()

close = df.reset_index().pivot(index='date', columns='ticker', values='adj_close')
volume = df.reset_index().pivot(index='date', columns='ticker', values='adj_volume')
market_cap = df.reset_index().pivot(index='date', columns='ticker', values='adj_market_cap')

**View Data:** To see what one of these 2-d matrices looks like, let's take a look at the closing prices matrix.

In [None]:
project_helper.print_dataframe(close)

## Section 3: Market Weight Index

---


Here we will create the market weighted index that we will eventually use to compare against your ETF portfolio to see how well it performs.
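
To make the weighting concrete, here is a minimal sketch on made-up market caps (the tickers and numbers are illustrative assumptions, not challenge data). Each date's weights are the market caps divided by that date's total, so every row sums to one.

```python
import pandas as pd

# Toy market caps for three hypothetical tickers on two dates
toy_market_cap = pd.DataFrame(
    {'AAA': [60.0, 50.0], 'BBB': [30.0, 30.0], 'CCC': [10.0, 20.0]},
    index=['2023-01-01', '2023-01-02'])

# Divide each row by its total so that every date's weights sum to one
toy_weights = toy_market_cap.div(toy_market_cap.sum(axis=1), axis=0)
print(toy_weights)
```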

In [None]:
def generate_market_cap_weights(close, market_cap):
    """
    Generate market capitalization weights.

    Parameters
    ----------
    close : DataFrame
        Close price for each ticker and date
    market_cap : DataFrame
        Market capitalization for each ticker and date

    Returns
    -------
    market_cap_weights : DataFrame
        The market cap weights for each ticker and date
    """
    assert close.index.equals(market_cap.index)
    assert close.columns.equals(market_cap.columns)


    weights = market_cap
    weights = weights.apply(lambda row: row / np.sum(row), axis = 1)
    
    return weights

project_tests.test_generate_market_cap_weights(generate_market_cap_weights)

**View Data:**
Let's generate the index weights using `generate_market_cap_weights` and view them using a heatmap.

In [None]:
index_weights = generate_market_cap_weights(close, market_cap)
project_helper.plot_weights(index_weights, 'Index Weights')

**Returns:**
Implement `generate_returns` to generate returns data for all the assets and dates from price data.
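
The `generate_returns` implementation in this notebook computes log returns rather than simple returns. The sketch below, on a made-up price series, shows both side by side and illustrates one convenient property of log returns: they add up across periods.

```python
import numpy as np
import pandas as pd

# Toy prices for one hypothetical ticker
toy_prices = pd.DataFrame({'AAA': [100.0, 110.0, 99.0]})

simple_returns = toy_prices / toy_prices.shift(1) - 1
log_returns = np.log(toy_prices / toy_prices.shift(1))

# Log returns add up across periods: their sum equals log(final / initial)
total_log_return = log_returns['AAA'].sum()
print(simple_returns['AAA'].tolist())
print(total_log_return, np.log(99.0 / 100.0))
```

The first row is NaN in both cases because there is no previous price to compare against.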

In [None]:
def generate_returns(prices):
    """
    Generate returns for ticker and date.

    Parameters
    ----------
    prices : DataFrame
        Price for each ticker and date

    Returns
    -------
    returns : DataFrame
        The returns for each ticker and date
    """
    
    log_returns = np.log(prices / prices.shift(1))

    return log_returns

project_tests.test_generate_returns(generate_returns)

**View Data:**
Let's generate the closing returns using `generate_returns` and view them using a heatmap.

In [None]:
returns = generate_returns(close)
project_helper.plot_returns(returns, 'Close Returns')

**Weighted Returns:** With the returns of assets computed, we can use it to compute the returns for an index or ETF. Implement `generate_weighted_returns` to create weighted returns using the returns and weights.

In [None]:
def generate_weighted_returns(returns, weights):
    """
    Generate weighted returns.

    Parameters
    ----------
    returns : DataFrame
        Returns for each ticker and date
    weights : DataFrame
        Weights for each ticker and date

    Returns
    -------
    weighted_returns : DataFrame
        Weighted returns for each ticker and date
    """
    assert returns.index.equals(weights.index)
    assert returns.columns.equals(weights.columns)

    return returns * weights

project_tests.test_generate_weighted_returns(generate_weighted_returns)

**View Data:** Let's generate the index weighted returns using `generate_weighted_returns` and view them using a heatmap.

In [None]:
index_weighted_returns = generate_weighted_returns(returns, index_weights)
project_helper.plot_returns(index_weighted_returns, 'Index Returns')

## Section 4: Portfolio Optimization

---



*Note*: We want to maintain a hard rule that the **weights sum to one and the portfolio is long only, so no negative weights.**

Otherwise you are free to change any of these calculations, objectives or methodology.



Now, let's create our own ETF portfolio that we will compare against the market cap weighted index. Initially, we minimize the portfolio variance plus a penalty on the distance between the weights of our portfolio and the weights of the index.

$Minimize \left [ \sigma^2_p + \lambda \sqrt{\sum_{i=1}^{m}(weight_i - indexWeight_i)^2} \right  ]$ where $\sigma^2_p$ is the portfolio variance, $m$ is the number of assets in the portfolio, and $\lambda$ is a scaling factor that you can choose.

Why are we doing this? One way that investors evaluate a fund is by how well it tracks its index. The fund is still expected to deviate from the index within a certain range in order to improve fund performance.  
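
To get a feel for how the two terms of the objective trade off, the sketch below evaluates it by hand for two candidate weight vectors. The covariance matrix, the weights, and the helper name `objective_value` are illustrative assumptions, not part of the provided code.

```python
import numpy as np

def objective_value(weights, index_weights, covariance, lam):
    """Portfolio variance plus lam times the Euclidean distance to the index weights."""
    variance = weights @ covariance @ weights
    distance = np.sqrt(np.sum((weights - index_weights) ** 2))
    return variance + lam * distance

# Toy two-asset setup (made-up numbers)
covariance = np.array([[0.04, 0.00],
                       [0.00, 0.01]])
index_w = np.array([0.5, 0.5])
candidate_w = np.array([0.2, 0.8])   # lower variance, but far from the index

at_index = objective_value(index_w, index_w, covariance, lam=2.0)
at_candidate = objective_value(candidate_w, index_w, covariance, lam=2.0)
print(at_index, at_candidate)
```

In this toy case the candidate has the lower variance, yet its objective value is higher because the distance penalty dominates. With a large $\lambda$ the optimizer will tend to stay close to the index, so the scaling factor is worth tuning.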


##### Covariance
Implement `get_covariance_returns` to calculate the covariance of the `returns`. We'll use this to calculate the portfolio variance.

In [None]:
def get_covariance_returns(returns):
    """
    Calculate covariance matrices.

    Parameters
    ----------
    returns : DataFrame
        Returns for each ticker and date

    Returns
    -------
    returns_covariance  : 2 dimensional Ndarray
        The covariance of the returns
    """
    return np.cov( returns.fillna(0).T )


project_tests.test_get_covariance_returns(get_covariance_returns)

**View Data:**
Let's look at the covariance generated from `get_covariance_returns`.

In [None]:
covariance_returns = get_covariance_returns(returns)
covariance_returns = pd.DataFrame(covariance_returns, returns.columns, returns.columns)

covariance_returns_correlation = np.linalg.inv(np.diag(np.sqrt(np.diag(covariance_returns))))
covariance_returns_correlation = pd.DataFrame(
    covariance_returns_correlation.dot(covariance_returns).dot(covariance_returns_correlation),
    covariance_returns.index,
    covariance_returns.columns)

project_helper.plot_covariance_returns_correlation(
    covariance_returns_correlation,
    'Covariance Returns Correlation Matrix')

**Optimization:**
cvxpy has the constructor `Problem(objective, constraints)`, which returns a `Problem` object.

The `Problem` object has a method `solve()`, which returns the optimal (minimum) value of the objective. In this case, that is the portfolio variance plus the scaled distance to the index weights.

It also updates the variable vector $\mathbf{x}$.

We can read the weights that achieved this minimum from `x.value`.

In [None]:
import cvxpy as cvx

def get_optimal_weights(covariance_returns, index_weights, scale=2.0):
    """
    Find the optimal weights.

    Parameters
    ----------
    covariance_returns : 2 dimensional Ndarray
        The covariance of the returns
    index_weights : Pandas Series
        Index weights for all tickers at a period in time
    scale : float
        The penalty factor for weights that deviate from the index
    Returns
    -------
    x : 1 dimensional Ndarray
        The solution for x
    """
    assert len(covariance_returns.shape) == 2
    assert len(index_weights.shape) == 1
    assert covariance_returns.shape[0] == covariance_returns.shape[1]  == index_weights.shape[0]

    m = len(covariance_returns)
    x = cvx.Variable(m)
    portfolio_variance = cvx.quad_form(x, covariance_returns)
    distance_to_index = cvx.norm(x - index_weights)
    objectives = cvx.Minimize(portfolio_variance + scale * distance_to_index)
    constraints = [x >= 0, sum(x) == 1]
    result = cvx.Problem(objectives, constraints).solve()

    return x.value

project_tests.test_get_optimal_weights(get_optimal_weights)

**Optimized Portfolio**
Using the `get_optimal_weights` function, let's generate the optimal ETF weights without rebalancing. We can do this by feeding in the covariance of the entire history of data. We also need to feed in a set of index weights. We'll go with the most recent index weights, i.e. the last row of `index_weights`.

In [15]:
raw_optimal_single_rebalance_etf_weights = get_optimal_weights(covariance_returns.values, index_weights.iloc[-1])
optimal_single_rebalance_etf_weights = pd.DataFrame(
    np.tile(raw_optimal_single_rebalance_etf_weights, (len(returns.index), 1)),
    returns.index,
    returns.columns)

## Section 5: Rebalance Portfolio Over Time

---


The single optimized ETF portfolio used the same weights for the entire history. These might not be the optimal weights for the entire period. Let's rebalance the portfolio over the same period instead of using the same weights. Implement `rebalance_portfolio` to rebalance a portfolio.

Rebalance the portfolio every n days, where n is given as `shift_size`. When rebalancing, you should look back over a certain number of days of past data, denoted as `chunk_size`. Using this data, compute the optimal weights using `get_optimal_weights` and `get_covariance_returns`.
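
The interaction between `shift_size` and `chunk_size` can be sketched without any market data. The hypothetical helper `rebalance_windows` below lists the row positions of the look-back window used at each rebalance, assuming the first rebalance happens once `chunk_size` rows are available and that windows are end-exclusive, as with Python slices.

```python
def rebalance_windows(n_rows, shift_size, chunk_size):
    """Return (start, end) row positions of the look-back window at each rebalance."""
    return [(i - chunk_size, i) for i in range(chunk_size, n_rows, shift_size)]

# 12 rows of data, rebalance every 3 rows, look back 5 rows each time
windows = rebalance_windows(n_rows=12, shift_size=3, chunk_size=5)
print(windows)  # [(0, 5), (3, 8), (6, 11)]
```

Consecutive windows overlap whenever `chunk_size` is larger than `shift_size`, which is the case for the default settings used later in this section.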

In [None]:
def rebalance_portfolio(returns, index_weights, shift_size, chunk_size):
    """
    Get weights for each rebalancing of the portfolio.

    Parameters
    ----------
    returns : DataFrame
        Returns for each ticker and date
    index_weights : DataFrame
        Index weight for each ticker and date
    shift_size : int
        The number of days between each rebalance
    chunk_size : int
        The number of days to look in the past for rebalancing

    Returns
    -------
    all_rebalance_weights  : list of Ndarrays
        The ETF weights for each point they are rebalanced
    """
    assert returns.index.equals(index_weights.index)
    assert returns.columns.equals(index_weights.columns)
    assert shift_size > 0
    assert chunk_size >= 0

    weights = []
    for i in range(chunk_size, len(returns), shift_size):
        covariance_returns = get_covariance_returns(returns[i-chunk_size : i])
        weights.append(get_optimal_weights(covariance_returns, index_weights.iloc[i-1]) )
    return weights

project_tests.test_rebalance_portfolio(rebalance_portfolio)

Run the following cell to rebalance the portfolio using `rebalance_portfolio`.

In [None]:
chunk_size = 250
shift_size = 5
all_rebalance_weights = rebalance_portfolio(returns, index_weights, shift_size, chunk_size)

**Portfolio Turnover:**
With the portfolio rebalanced, we need a metric to measure the cost of rebalancing. Implement `get_portfolio_turnover` to calculate the annualized portfolio turnover.

$ AnnualizedTurnover = \frac{SumTotalTurnover}{NumberOfRebalanceEvents} \times NumberOfRebalanceEventsPerYear $

$ SumTotalTurnover = \sum_{t,n}{\left | x_{t,n} - x_{t+1,n} \right |} $, where $ x_{t,n} $ is the weight at rebalance $ t $ for asset $ n $.

In other words, $ SumTotalTurnover $ adds up the absolute weight changes between each pair of consecutive rebalances, $ \sum \left | x_{t_1,n} - x_{t_2,n} \right | $.
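
As a worked example of the formulas above, the sketch below computes the turnover for three made-up rebalance weight vectors. The numbers are illustrative assumptions; 252 days per year matches the default argument of `get_portfolio_turnover`.

```python
import numpy as np

# Toy weights at three consecutive rebalances of a two-asset portfolio
toy_weights = [np.array([0.5, 0.5]),
               np.array([0.6, 0.4]),
               np.array([0.3, 0.7])]

# Sum of absolute weight changes between consecutive rebalances
sum_total_turnover = sum(
    np.sum(np.abs(toy_weights[t] - toy_weights[t + 1]))
    for t in range(len(toy_weights) - 1))

shift_size = 5                              # days between rebalances
rebalance_count = len(toy_weights) - 1      # number of weight changes
rebalances_per_year = 252 / shift_size
annualized_turnover = sum_total_turnover / rebalance_count * rebalances_per_year
print(sum_total_turnover, annualized_turnover)
```

Here the weight changes sum to 0.2 + 0.6 = 0.8 over two rebalance events, which annualizes to roughly 20.16.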

In [None]:
def get_portfolio_turnover(all_rebalance_weights, shift_size, rebalance_count, n_trading_days_in_year=252):
    """
    Calculate portfolio turnover.

    Parameters
    ----------
    all_rebalance_weights : list of Ndarrays
        The ETF weights for each point they are rebalanced
    shift_size : int
        The number of days between each rebalance
    rebalance_count : int
        Number of times the portfolio was rebalanced
    n_trading_days_in_year: int
        Number of trading days in a year

    Returns
    -------
    portfolio_turnover  : float
        The portfolio turnover
    """
    assert shift_size > 0
    assert rebalance_count > 0

    sumTurnOver = 0
    for i in range( len(all_rebalance_weights)-1 ):
        sumTurnOver += np.sum( np.abs( all_rebalance_weights[i] - all_rebalance_weights[i+1] ) )
    year_count = n_trading_days_in_year / shift_size
    return sumTurnOver / rebalance_count * year_count

project_tests.test_get_portfolio_turnover(get_portfolio_turnover)

Run the following cell to get the portfolio turnover from `get_portfolio_turnover`.

In [None]:
print(get_portfolio_turnover(all_rebalance_weights, shift_size, len(all_rebalance_weights) - 1))

## Section 6: Results Comparison
With our ETF weights built, let's compare it to the index. Run the next cell to calculate the ETF returns and compare it to the index returns.

**Cumulative Returns:**
To compare performance between the ETF and the index, we're going to calculate the tracking error. Before we do that, we first need to calculate the index and ETF cumulative returns. Implement `calculate_cumulative_returns` to calculate the cumulative returns over time given the returns.
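
The idea can be sketched on a tiny made-up table of weighted returns: sum across tickers to get one portfolio return per date, then compound over time. The values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Toy weighted returns for two hypothetical tickers; the first row has no return yet
toy_weighted_returns = pd.DataFrame(
    {'AAA': [np.nan, 0.01, -0.02], 'BBB': [np.nan, 0.01, 0.03]})

# One portfolio return per date (pandas treats the all-NaN row as a sum of 0)
portfolio_returns = toy_weighted_returns.sum(axis=1)

# Compound the per-date returns into a growth-of-one series
cumulative = (1 + portfolio_returns).cumprod()
print(cumulative.tolist())
```

The notebook's implementation additionally sets the first entry to NaN, since no return exists for the first date.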

In [None]:
def calculate_cumulative_returns(returns):
    """
    Calculate cumulative returns.

    Parameters
    ----------
    returns : DataFrame
        Returns for each ticker and date

    Returns
    -------
    cumulative_returns : Pandas Series
        Cumulative returns for each date
    """
    total_returns = returns.apply(lambda row: np.sum(row), axis = 1) + 1
    cumulative_returns = total_returns.cumprod()
    cumulative_returns.iloc[0] = np.nan
    return cumulative_returns

project_tests.test_calculate_cumulative_returns(calculate_cumulative_returns)

In [None]:
index_weighted_cumulative_returns = calculate_cumulative_returns(index_weighted_returns)
optim_etf_returns = generate_weighted_returns(returns, optimal_single_rebalance_etf_weights)
optim_etf_cumulative_returns = calculate_cumulative_returns(optim_etf_returns)

project_helper.plot_benchmark_returns(index_weighted_cumulative_returns, optim_etf_cumulative_returns, 'Optimized ETF vs Index')

In [None]:
print(index_weighted_cumulative_returns)
print(optim_etf_cumulative_returns)

**Tracking Error:**
In order to check the performance of the smart beta portfolio, we can calculate the annualized tracking error against the index. Implement `tracking_error` to return the tracking error between the ETF and benchmark.
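
As a small illustration, tracking error is the standard deviation of the difference between benchmark and ETF returns, scaled to a yearly figure. The return series below are made up, and the factor of 365 simply mirrors the annualization used in this notebook's code.

```python
import numpy as np
import pandas as pd

# Toy daily returns for a benchmark and an ETF (illustrative values)
benchmark = pd.Series([0.010, -0.005, 0.002, 0.004])
etf = pd.Series([0.012, -0.004, 0.001, 0.004])

# Daily difference between the two return series
active_return = benchmark - etf

# pandas .std() uses the sample standard deviation (ddof=1)
annualized_te = np.sqrt(365) * active_return.std()
print(annualized_te)
```

A portfolio that matched the benchmark every day would have a tracking error of zero.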

In [None]:
def tracking_error(benchmark_returns_by_date, etf_returns_by_date):
    """
    Calculate the tracking error.

    Parameters
    ----------
    benchmark_returns_by_date : Pandas Series
        The benchmark returns for each date
    etf_returns_by_date : Pandas Series
        The ETF returns for each date

    Returns
    -------
    tracking_error : float
        The tracking error
    """
    assert benchmark_returns_by_date.index.equals(etf_returns_by_date.index)


    return np.sqrt(365) * (benchmark_returns_by_date - etf_returns_by_date).std()

project_tests.test_tracking_error(tracking_error)

**View Data:**
Let's generate the tracking error using `tracking_error`.

In [None]:
optim_etf_tracking_error = tracking_error(np.sum(index_weighted_returns, 1), np.sum(optim_etf_returns, 1))
print('Optimized ETF Tracking Error: {}'.format(optim_etf_tracking_error))

**Sharpe Ratio:**
Now we will calculate the Sharpe ratio, a measure of risk-adjusted returns.
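
The calculation has two steps: convert the annual risk-free rate to a daily rate, then divide the mean daily excess return by the daily standard deviation and annualize. The sketch below walks through both on made-up daily returns. The 365-day convention follows this notebook's code.

```python
import numpy as np

risk_free_rate_annual = 0.03
# Compound-equivalent daily rate: (1 + r_daily) ** 365 == 1 + r_annual
risk_free_rate_daily = (1 + risk_free_rate_annual) ** (1 / 365) - 1

# Toy daily returns (illustrative values)
daily_returns = np.array([0.004, -0.002, 0.003, 0.001, -0.001])

excess = daily_returns.mean() - risk_free_rate_daily
# ddof=1 matches the sample standard deviation that pandas uses by default
sharpe = np.sqrt(365) * excess / daily_returns.std(ddof=1)
print(risk_free_rate_daily, sharpe)
```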

In [None]:
def sharpe_ratio(returns_by_date, risk_free_rate_annual=0.03):
    """
    Calculate the annualized Sharpe Ratio.

    Parameters
    ----------
    returns_by_date : Pandas Series
        The asset returns for each date
    risk_free_rate_annual : float, optional
        The annual risk-free rate. Default is 0.03 (or 3%)

    Returns
    -------
    sharpe_ratio : float
        The annualized Sharpe Ratio
    """
    # Convert annual risk free rate to daily
    risk_free_rate_daily = (1 + risk_free_rate_annual) ** (1/365) - 1
    
    # Calculate the average daily return and daily standard deviation
    avg_daily_return = returns_by_date.mean()
    daily_std_dev = returns_by_date.std()

    # Calculate the annualized Sharpe Ratio
    return np.sqrt(365) * (avg_daily_return - risk_free_rate_daily) / daily_std_dev

project_tests.test_sharpe_ratio(sharpe_ratio)


In [26]:
# Aggregate the per-ticker weighted returns into one portfolio return per date
index_sharpe = sharpe_ratio(index_weighted_returns.sum(axis=1), risk_free_rate_annual=0.03)
optim_etf_sharpe = sharpe_ratio(optim_etf_returns.sum(axis=1), risk_free_rate_annual=0.03)

**View Data:**
Let's view the Sharpe ratios generated using `sharpe_ratio`.

In [None]:
print(index_sharpe)
print(optim_etf_sharpe)

**Sortino Ratio:**
The Sortino ratio is a variation of the Sharpe ratio that penalizes only downside risk: instead of the standard deviation of all returns, it uses the downside deviation. In the implementation below, this is the standard deviation of the returns that fall below the daily risk-free rate.
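
The sketch below isolates the downside-deviation step on made-up daily returns, following the approach used in this notebook (standard deviation of the returns below the daily risk-free rate). Note that other definitions exist, for example the root mean square of shortfalls relative to a target, so treat this as one possible convention rather than the only one.

```python
import numpy as np
import pandas as pd

# Toy daily returns (illustrative values)
daily_returns = pd.Series([0.004, -0.002, 0.003, -0.006, 0.001, -0.001])
risk_free_rate_daily = (1 + 0.03) ** (1 / 365) - 1

# Keep only the days that fell short of the daily risk-free rate
shortfall_days = daily_returns[daily_returns < risk_free_rate_daily]

# Annualize the sample standard deviation of those days
downside_deviation = shortfall_days.std() * np.sqrt(365)
print(len(shortfall_days), downside_deviation)
```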

In [None]:
def sortino_ratio(returns_by_date, risk_free_rate_annual):
    """
    Calculate the Sortino ratio.

    Parameters
    ----------
    returns_by_date : Pandas Series
        The portfolio returns for each date
    risk_free_rate_annual : float
        The annual risk-free rate

    Returns
    -------
    sortino_ratio : float
        The Sortino ratio
    """
    # Calculate the annualized portfolio return
    annualized_return = (1 + returns_by_date.mean()) ** 365 - 1
    
    # Convert the annual risk-free rate to a per-period (daily) rate
    risk_free_rate_per_period = (1 + risk_free_rate_annual) ** (1/365) - 1
    
    # Calculate the excess return
    excess_return = annualized_return - risk_free_rate_annual
    
    # Calculate the downside deviation
    negative_returns = returns_by_date[returns_by_date < risk_free_rate_per_period]
    downside_deviation = negative_returns.std() * np.sqrt(365)
    
    # Calculate the Sortino ratio
    sortino = excess_return / downside_deviation
    
    return sortino


project_tests.test_sortino_ratio(sortino_ratio)

In [29]:
# Aggregate the per-ticker weighted returns into one portfolio return per date
index_sortino = sortino_ratio(index_weighted_returns.sum(axis=1), risk_free_rate_annual=0.03)
optim_etf_sortino = sortino_ratio(optim_etf_returns.sum(axis=1), risk_free_rate_annual=0.03)

**View Data:**
Let's view the Sortino ratios generated using `sortino_ratio`.

In [None]:
print(index_sortino)
print(optim_etf_sortino)