# Risk Balancing Asset Allocation Algorithm for Portfolio Management

Python-based project that aims to optimize portfolio allocation by implementing a risk management strategy while targeting a specific volatility level. The algorithm distributes investments across different assets to achieve a balanced risk exposure, resulting in a more stable and resilient portfolio with a desired volatility target.

Integrated weight scaling by a factor using technical analysis involving price and volume data of individual assets to identify patterns, trends, and indicators that can help predict future price movements. 

Moving averages, stochastic oscillators, and various chart patterns, trends and estimations are incorporated in these indicators,adjusting the weights of assets in portfolio based on the signals they generate.

## Allocation Features

* Risk parity alogrithm
* Targeted Risk Contribution & Volatility Scaling
* Rebalanced weights on frequency
* Factor scaling

## Financial Models & Analysis
* Distance vector estimation & factoring
* Kalman Filter
* SLSQP


In [21]:
import numpy as np
import pandas as pd
import yfinance as yf

import math
from scipy.optimize import minimize
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
from dateutil.relativedelta import relativedelta
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from datetime import datetime

import dataframe_image as dfi
from xhtml2pdf import pisa
import base64
import io

import warnings
warnings.filterwarnings('ignore')

## Market data

Pulled market time series from Yahoo Finance, we will be using S&P500 (`^SPX`), 7-10Y Treasury Bonds(`IEF`), Gold (`GLD`) and Coporate Bonds (`LQD`). Dynamic allocation window will be from 2005 to 2023, rebalanced monthly with a lookback window of 6 months.


In [22]:
tickers = ['^SPX', 'IEF', 'GLD','LQD']
start_date = '2005-01-01'
end_date = datetime.now().strftime('%Y-%m-%d')
look_back_time = 6 #6months
frequency = 'monthly' # monthly

data = yf.download(tickers, start=start_date, end=end_date)['Close'].dropna()

[*********************100%***********************]  4 of 4 completed


## Data

Ensure cleaned data, NaN values and non zero values are cleaned from dataset

In [23]:
data.describe()

Unnamed: 0,GLD,IEF,LQD,^SPX
count,4696.0,4696.0,4696.0,4696.0
mean,121.735656,100.490266,114.599142,2151.107267
std,38.23405,10.754113,9.677588,1053.974213
min,41.259998,79.589996,81.699997,676.530029
25%,97.43,91.6075,107.269997,1295.925049
50%,121.735001,102.514999,114.475002,1873.860046
75%,155.842499,107.542501,120.230003,2797.857544
max,193.889999,123.059998,139.149994,4796.560059


In [24]:
(data < 0 ).any()

GLD     False
IEF     False
LQD     False
^SPX    False
dtype: bool

## Basic Portfolio Functions

- `portfolio_return(weights, rets)`: This function calculates the portfolio return by taking the dot product of the transpose of the `weights` array a e mean retur ) arra The result is multiplied by 252 to annualize the return.

- `portfolio_variance(weights, rets)`: This function computes the portfolio variance by taking the dot product of the transpose of the `weights` array, the covariance matrix of returns (`rets.cov()`), and the `weights` array. The result is multiplied by to annualize the volatility252.

- `portfolio_volatility(weights, rets)`: This function calculates the portfolio volatility by taking the square root of the `portfolio_variance` function's output using the `math.sqrt()` function.

- `rel_risk_contributions(weights, rets)`: This function determines the relative risk contributions of each asset in the portfolio. It computes the portfolio volatility (`vol`), the covariance matrix (`cov`), and multiplies it by the `weights` array divided by `vol` (`np.dot(cov, weights) / vol`). The result is stored in `mvols`. The relative risk contributions (`rc`) are calculated by multiplying `mvols` with `weights`. Finally, `rrc` (relative risk contributions scaled to sum up to 1) is obtained by dividing `rc` by the sum of `rc`.

- `mse_risk_contributions(weights, target, rets)`: This function calculates the mean squared error (MSE) of the relative risk contributions (`rc`) compared to a target array (`target`). It first computes `rc` by calling `rel_risk_contributions(weights, rets)`. Then, it calculates the MSE by squaring the difference between `rc` and `target`, taking the mean, and multiplying it by 100.

These functions can be used for portfolio analysis and risk attribution tasks.
k attribution tasks.


In [25]:
def portfolio_return(weights, rets):
    return np.dot(weights.T, rets.mean()) * 252


def portfolio_variance(weights, rets):
    return np.dot(weights.T, np.dot(rets.cov(), weights)) * 252


def portfolio_volatility(weights, rets):
    return math.sqrt(portfolio_variance(weights, rets))


def rel_risk_contributions(weights, rets):
    vol = portfolio_volatility(weights, rets)
    cov = rets.cov()
    mvols = np.dot(cov, weights) / vol
    rc = mvols * weights
    rrc = rc / rc.sum()
    return rrc


def mse_risk_contributions(weights, target, rets):
    rc = rel_risk_contributions(weights, rets)
    mse = ((rc - target) ** 2).mean()
    return mse * 100



## Scaling Factor with SMA & EMA

### generate_ma_table
This function computes the percentage difference between the 200-day simple moving average (SMA) and the 20-day exponential moving average (EMA) for each ticker in the provided dataset. The results are then resampled to a monthly frequency, returning the last data point for each month.

Parameters:
- `data`: DataFrame containing the price data for each ticker.
- `start_date`: Starting date for the analysis.
- `end_date`: Ending date for the analysis.

Returns:
- A DataFrame containing the monthly percentage differences between the 200-day SMA and 20-day EMA for each
### generate_signal_table
This function generates a binary signal based on the difference between the 200-day SMA and the 20-day EMA for each ticker. A `True` signal indicates that the 20-day EMA is above the 200-day SMA, while a `False` signal indicates the opposite. The results are then resampled to a monthly frequency, returning the last data point for each month.

Parameters:
- `data`: DataFrame containing the price data for each ticker.
- `start_date`: Starting date for the analysis.
- `end_date`: Ending date for the analysis.

Returns:
- A DataFrame containing the monthly binary signals for each

### scale_weight_factor
This function scales the weights of assets based on a provided scaling array. The scaling is determined by specific thresholds in the scaling array. For instance, if a value in the scaling array is greater than 0.002, its corresponding weight is multiplied by 2.5. After scaling, the weights are normalized to ensure they sum up to 1.

Parameters:
- `weight_arr`: Array containing the weights of the assets.
- `scale_arr`: Array containing the scaling values for each asset.

Returns:
- A scaled and normalized weight array.
 ticker.
 ticker.


In [26]:
def generate_ma_table(data, start_date, end_date,):
    # Create an empty DataFrame to store the moving average differences
    ma_diff_table = pd.DataFrame()

    for ticker in data.columns:
        # Calculate the 200-day moving average for each ticker
        ma200 = ta.trend.sma_indicator(data[ticker], window=200)

        # Calculate the 20-day exponential moving average for each ticker
        ema20 = ta.trend.ema_indicator(data[ticker], window=20)

        # Calculate the percentage difference between the 200-day MA and the 20-day EMA
        ma_diff_percentage = ((ema20 - ma200) / ma200)

        # Add the moving average difference to the ma_diff_table DataFrame
        ma_diff_table[ticker] = ma_diff_percentage
        ma_diff_table = ma_diff_table.dropna()

        monthly_data = ma_diff_table.resample('M').last()
        
    return monthly_data[tickers]

def generate_signal_table(data, start_date, end_date,):
    # Create an empty DataFrame to store the moving average differences
    ma_diff_table = pd.DataFrame()

    for ticker in data.columns:
        # Calculate the 200-day moving average for each ticker
        ma200 = ta.trend.sma_indicator(data[ticker], window=200)

        # Calculate the 20-day exponential moving average for each ticker
        ema20 = ta.trend.ema_indicator(data[ticker], window=20)

        # Calculate the percentage difference between the 200-day MA and the 20-day EMA
        ma_diff_percentage = ((ema20 - ma200) / ma200)

        # Add the moving average difference to the ma_diff_table DataFrame
        ma_diff_table[ticker] = ma_diff_percentage > 0
        ma_diff_table = ma_diff_table.dropna()

        monthly_data = ma_diff_table.resample('M').last()
        
    return monthly_data[tickers]

def scale_weight_factor (weight_arr, scale_arr):
    # Reshape weight_arr and scale_arr to be 1-dimensional
    weight_arr = weight_arr.reshape(-1)
    scale_arr = scale_arr.reshape(-1)
    
    # Initialize an array of ones for the scaling factors
    scale_factors = np.ones_like(scale_arr)
    # Compute the scale factors for values above 0.005 and below or equal to 0.05
    indices = (scale_arr > 0.002) 
    scale_factors[indices] = 2.5
    
    indices = (scale_arr < 0.001)
    scale_factors[indices] = 0.5
    
#     indices = (scale_arr > 0.06) 
#     scale_factors[indices] = 2 - 5 * (scale_arr[indices])
    
#     # Keep the scale factors at 1.1 for values above 0.05 and below or equal to 0.07
#     indices = (scale_arr > 0.05) & (scale_arr <= 0.07)
#     scale_factors[indices] = 2
    
    #Compute the scale factors for values above 0.07
#     indices = scale_arr >= 0.10
#     scale_factors[indices] = 2 - * (scale_arr[indices] - 0.005)
    
    # Scale the elements in weight_arr by their corresponding scale factor
    weight_arr *= scale_factors
    # Normalize the weights so they sum up to 1

    
    if np.sum(weight_arr) > 1:
        return weight_arr / np.sum(weight_arr)
    
    return weight_arr


## Targeted Risk Contribution Function

### risk_parity
This function determines the optimal asset weights for a portfolio based on the risk parity approach, ensuring each asset contributes equally to the overall portfolio risk. The function uses the SLSQP optimization algorithm to minimize the mean squared error between the relative risk contributions of the assets and a given target. If any asset's relative risk contribution falls below a threshold, a penalty-based approach is applied, and the optimization is repeated using the 'trust-constr' algorithm. The optimal weights are then scaled to achieve a desired portfolio volatility. If the sum of the scaled weights exceeds 1, they are normalized.

Parameters:
- `target`: Array of target risk contributions for each asset.
- `target_vol`: Desired portfolio volatility.
- `returns`: DataFrame containing historical returns for each asset.
- `tickers`: List of asset tickers.

Returns:
- An array of optimal asset weights that achieve risk parity while targeting the specified portfolio v

### Note
Some parts of the code are commented out as they were intended to provide additional checks for risk contribution thresholds. If needed, these lines can be uncommented.olatility.


In [27]:
def risk_parity(target, target_vol, returns,tickers):
    
    noa = len(tickers)
    weight = np.ones(noa) / noa
    
    bnds = noa * [(0.10, None),]  # Lower bound set to 0
    # The constraint that weights must sum to 1
    cons = ({'type': 'eq', 'fun': lambda w:  np.sum(w) - 1.0})

    opt = minimize(lambda w: mse_risk_contributions(w, target=target,rets=returns),
                   weight,
                   bounds = bnds,  # added constraints here
                   method='SLSQP')  # 'trust-constr' algorithm

    optimal_weights = opt['x']
    
#     ###if weights cant be found with SLSQP -> deploy trust-constr for more stricter check
#     if not all(i > 0.09 for i in optimal_weights):
#         opt = minimize(lambda w: mse_risk_contributions(w, target=target,rets=returns),
#                        optimal_weights,
#                        bounds = bnds,
#                        constraints = cons,
#                        method='trust-constr')  # 'trust-constr' algorithm
#         optimal_weights = opt['x']
#         print('optimal_weights below level')
    
    if any(i < 0.05 for i in rel_risk_contributions(optimal_weights,returns)):
        threshold = 0.05
        def mse_risk_contributions2(weights, target, rets, threshold, penalty_factor):
            rc = rel_risk_contributions(weights, rets)
            mse = ((rc - target) ** 2).mean()

            # Compute the penalty for constraint violation
            penalty = penalty_factor * np.sum((threshold - rc[rc < threshold]) ** 2)

            return (mse + penalty) * 100
        
        penalty_factor = 10000 # Adjust this to make the constraint more or less strict

        opt = minimize(lambda w: mse_risk_contributions2(w, target=target, rets=returns, threshold=threshold, penalty_factor=penalty_factor),
                       weight,
                       bounds=bnds,
                       method='trust-constr')

        optimal_weights = opt['x']

    
    unscaled_vol = portfolio_volatility(optimal_weights, returns)
    scale_factor = target_vol / unscaled_vol
    optimal_weights_scaled = optimal_weights * scale_factor
    sum_weights = np.sum(optimal_weights_scaled)

        
    
    if sum_weights > 1:
        normalized_weights = optimal_weights_scaled / sum_weights
        return normalized_weights
    return optimal_weights_scaled

## Dynamic Asset Allocation Strategy

### risk_balance
This function determines the optimal asset weights for a portfolio based on the risk parity approach over a specified time period. The portfolio is rebalanced at a specified frequency (either weekly or monthly). For each rebalancing period, the function calculates the risk contributions, portfolio volatility, and leverage. The function starts by downloading historical data for the specified tickers and calculates the logarithmic returns. It then uses the `risk_parity` function to determine the optimal weights for each rebalancing period, considering the risk contributions and the desired portfolio volatility. The weights are further adjusted based on moving average differences using the `scale_weight_factor` function. The function constructs tables for weights, risk contributions, portfolio volatility, and leverage for each rebalancing period and returns a consolidated table containing all this information.

Parameters:
- `tickers`: List of asset tickers.
- `start_date`: Starting date for the analysis.
- `end_date`: Ending date for the analysis.
- `look_back_time`: Number of months to look back for calculating returns.
- `frequency`: Rebalancing frequency, either 'weekly' or 'monthly'.

Returns:
- A DataFrame containing the optimal asset weights, risk contributions, portfolio volatility, and leverage for each rebalancing period.


In [28]:
def risk_balance(tickers,start_date,end_date,look_back_time, frequency):
    
    ###### Extra for Distance
    
    
    ######
    
    # Download historical data as dataframe
    data = yf.download(tickers, start=start_date, end=end_date)

    # Use only Adjusted Close prices
    data = data['Close']

    # Calculate returns
    data = data.dropna()
    
    ma_diff_table = generate_ma_table(data,start_date,end_date) # 2005-10-31	0.072032	-0.003128	-0.005133	-0.003325

    rets = np.log(data / data.shift(1))

    noa = len(tickers) # number of assets

    weight = np.ones(noa) / noa

    # Rebalancing monthly
    start_date = pd.to_datetime(start_date)
    end_date = pd.to_datetime(end_date)
    current_date = start_date + relativedelta(months=look_back_time) # start after 6 months

    weights_table = []
    risk_table = []
    volatility_table = []


    # Initialize the tables
    weights_table = []
    risk_table = []
    volatility_table = []
    leverage_table = []

    # Now start the loop
    while current_date <= end_date:
        # Select the relevant six months of returns
        six_month_data = rets[(rets.index < current_date) & (rets.index >= current_date - relativedelta(months=look_back_time))]
        
        scale_arr = ma_diff_table[(ma_diff_table.index.year == current_date.year) & (ma_diff_table.index.month == current_date.month)].values
        
        target = np.ones(noa) / noa
        optimal_weights = risk_parity([0.40,0.40,0.10,0.10],0.10, six_month_data,tickers)
        
        if scale_arr.size != 0:
            optimal_weights = scale_weight_factor(optimal_weights,scale_arr)

        # Weight table
        weights_table.append(pd.Series(optimal_weights, index=tickers, name=current_date))

        # Calculate and add risk contributions to risk_table
        risk_contributions = rel_risk_contributions(optimal_weights, six_month_data)
        risk_table.append(pd.Series(risk_contributions, index=tickers, name=current_date))

        # Calculate and add volatility to vol_table
        # Calculate and add annualized portfolio volatility to volatility_table
        port_volatility = portfolio_volatility(optimal_weights, six_month_data)
        volatility_table.append(pd.Series(port_volatility, index=['Volatility'], name=current_date))

        #Leverage
        total_leverage = sum(optimal_weights)
        leverage_table.append(pd.Series(total_leverage, index=['Leverage'],name=current_date))
        
        # Update current_date based on specified frequency
        if frequency == 'weekly':
            current_date = current_date + pd.DateOffset(weeks=1)
        elif frequency == 'monthly':
            current_date = current_date + relativedelta(months=1)
        else:
            print("Invalid frequency input. Please choose either 'weekly' or 'monthly'.")
            return None
        
#         if any(i < 0.04 for i in rel_risk_contributions(optimal_weights,six_month_data)):
#             print(current_date)
#             print(rel_risk_contributions(optimal_weights,six_month_data))


        
    weights_table = pd.concat(weights_table, axis=1).T
    risk_table = pd.concat(risk_table, axis=1).T
    volatility_table = pd.concat(volatility_table, axis=1).T
    leverage_table = pd.concat(leverage_table,axis=1).T

    final_table = pd.concat([weights_table, risk_table.add_suffix('_risk'), volatility_table, leverage_table], axis=1)


#     final_table = final_table.applymap(lambda x: "{:.0f}%".format(x*100))

    final_table = final_table.applymap(lambda x: x*100)
    final_table = final_table.round(2)
    pd.set_option('display.max_rows', None)
#     pd.set_option('display.float_format', lambda x: '%.2f' % x)
    col_order = list(tickers) + [ticker + '_risk' for ticker in tickers] + ['Volatility'] + ['Leverage']
    
    return final_table[col_order]

## Creating Dataframe

### df2
The provided code segment executes the `risk_balance` function, which determines the optimal asset allocation for a portfolio based on the risk parity approach over a specified time frame. The function's output, a DataFrame named `df2`, contains details about the optimal asset weights, risk contributions, portfolio volatility, and leverage for each rebalancing period. The index of this DataFrame, originally in DateTime format, is then converted to a simple date format for clarity. Finally, the last five rows of the DataFrame are displayed, providing a recent snapshot of the portfolio's risk-balanced allocations and associated metrics.

Returns:
- A Snapshot of the most recent 5 months


In [29]:
df2 = risk_balance(tickers,start_date,end_date,look_back_time, 'monthly')
df2.index = df2.index.date
df2.tail()

[*********************100%***********************]  4 of 4 completed


Unnamed: 0,^SPX,IEF,GLD,LQD,^SPX_risk,IEF_risk,GLD_risk,LQD_risk,Volatility,Leverage
2023-04-01,30.95,47.68,11.01,10.36,40.0,40.0,10.0,10.0,10.98,100.0
2023-05-01,30.23,46.17,11.2,12.39,40.0,40.0,10.0,10.0,10.25,100.0
2023-06-01,42.87,12.46,16.35,28.33,57.77,8.36,11.45,22.41,9.41,100.0
2023-07-01,40.66,11.7,15.47,32.17,55.16,8.16,11.58,25.09,8.58,100.0
2023-08-01,54.97,15.5,20.42,9.11,74.55,11.88,13.27,0.31,10.04,100.0
