## Revtsov Midterm Project (Part 2)

### Discussion

The goal of this project is to compare how different portfolio optimization methods (mean-variance, mean-MAD, and mean-CVaR) and data frequencies (daily, weekly, monthly, quarterly, yearly) impact asset allocation within a universe of ETFs (IWB, IWM, EFA, EEM, VNQ, LQD, SHY).

Across all three optimization methods, capital is consistently allocated to the same three assets: IWB, LQD, and SHY. The methods differ, however, in how they split capital among them. Mean-Var, on average, allocates more to IWB than Mean-MAD and Mean-CVaR do, while Mean-CVaR allocates the least to IWB. Mean-CVaR shows a distinct preference for LQD over SHY, particularly as the portfolio return target increases, whereas Mean-Var favors a combination of IWB and SHY over LQD. This makes sense: LQD has a low CVaR, which the Mean-CVaR objective rewards directly, while the Mean-Var approach focuses on the risk/return ratio.

Despite these allocation differences, the resulting portfolio statistics (standard deviation, MAD, VaR, and CVaR) remain remarkably similar across the optimization methods. This echoes the findings of Mark Kritzman's paper "Are Optimizers Error Maximizers?". Although Kritzman discussed variations in inputs, whereas we are comparing different risk metrics, the point that different allocations can yield similar portfolio characteristics still stands.
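To make the four risk statistics compared above concrete, here is a minimal sketch of how they can be computed for a single return series. The function name `risk_summary` and the sample returns are hypothetical and purely illustrative. Unlike the notebook's `var` and `cvar` helpers, this sketch uses NumPy's default percentile interpolation and includes observations equal to the cutoff in the tail, so its numbers would not match the tables exactly.

```python
import numpy as np

def risk_summary(returns, alpha=10):
    """Return StDev, MAD, VaR, and CVaR for a 1-D array of periodic returns.

    alpha is the tail percentile (10 corresponds to 90% VaR/CVaR).
    VaR and CVaR are reported as positive losses.
    """
    r = np.asarray(returns, dtype=float)
    cutoff = np.percentile(r, alpha)   # return at the alpha-th percentile
    tail = r[r <= cutoff]              # observations at or below the cutoff
    return {
        "StDev": r.std(ddof=1),                      # sample standard deviation
        "MAD": np.mean(np.abs(r - r.mean())),        # mean absolute deviation
        "VaR": -cutoff,
        "CVaR": -tail.mean(),
    }

# Hypothetical periodic returns, for illustration only
sample = np.array([-0.04, -0.02, -0.01, 0.00, 0.00, 0.01, 0.01, 0.02, 0.02, 0.03])
summary = risk_summary(sample, alpha=10)
print(summary)
```

Because CVaR averages the returns beyond the VaR cutoff, it can never be smaller than VaR at the same level, which is consistent with the CVaR90 column exceeding the VaR90 column in every row of the statistics table.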

The analysis also sheds light on the influence of periodicity on portfolio allocation. Notably, the balance between LQD and SHY fluctuates with the frequency of returns, with lower allocations to LQD observed at monthly and quarterly frequencies. Additionally, monthly and quarterly frequencies exhibit a slightly higher allocation to IWB on average, indicating a subtle periodicity-driven shift in asset allocation preferences, potentially because some frequencies smooth out short-term volatility, making equities appear more attractive.

As expected, the target return significantly impacts asset allocation. Higher return targets lead to a greater allocation to equities and a decrease in defensive fixed income. Investors prioritize potentially higher returns from equities when aiming for ambitious targets. Interestingly, at a 6% target return, the impact of data frequency is more pronounced, with a larger dispersion of weights across frequencies. This suggests that the choice of frequency becomes more crucial when targeting higher returns.

The initial prediction that IWB would be the primary equity choice due to its favorable risk-return profile (higher return per unit of risk) was confirmed by the results. The meaningful allocations to SHY also align with earlier predictions. This low-risk diversifying asset played a role in most resulting portfolios. However, the significant allocation to LQD, especially with the mean-CVaR method, was contrary to expectations. This highlights the importance of LQD's low correlation with other assets and its low standalone CVaR: it offers valuable diversification and can potentially improve the portfolio's CVaR.

In conclusion, this analysis reinforces the importance of considering different risk measures and data frequencies when constructing optimal portfolios. While the overall portfolio characteristics may be similar, specific asset allocations can vary. However, if the primary focus is on the overall portfolio risk-return profile, the choice of optimization method is not as critical. Further analysis could explore the impact of transaction costs and incorporate additional asset classes to see if the observed patterns persist.

### Results
It's worth noting that although the mean, standard deviation, and MAD measures are annualized, VaR and CVaR are not. Annualizing the latter two measures implies that the periodic tail events are observed for the duration of the year, which I am not sure is intuitive. For example, annualizing the daily 90% VaR assumes that a return in the worst 10th percentile happens every day for a year.
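The scaling argument can be illustrated with a short sketch. The helper names `annualize_mean` and `annualize_vol` are hypothetical; they mirror the `252 / freq` and `sqrt(252 / freq)` factors used in the notebook code. The daily VaR figure is the 0.084% reported in the statistics table for the 2% target Mean-Var portfolio.

```python
import math

TRADING_DAYS = 252  # day count used throughout the notebook

def annualize_mean(periodic_mean: float, freq: int) -> float:
    """Scale a per-period mean linearly with the number of periods per year."""
    return periodic_mean * (TRADING_DAYS / freq)

def annualize_vol(periodic_vol: float, freq: int) -> float:
    """Scale a per-period dispersion measure with the square root of periods per year."""
    return periodic_vol * math.sqrt(TRADING_DAYS / freq)

# Daily 90% VaR of the 2% target Mean-Var portfolio, taken from the statistics table
daily_var90 = 0.00084

# Linear scaling treats the 10th-percentile daily loss as if it occurred on all 252 days
var90_linear = annualize_mean(daily_var90, freq=1)
print(f"Linearly scaled daily VaR90: {var90_linear:.2%}")
```

Scaled this way, the daily VaR becomes a loss of roughly 21% for a portfolio whose annualized standard deviation is only 1.438%, which is why the VaR and CVaR columns are left at the native return frequency.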

#### Portfolio Statistics

In [31]:
stats.T.style.format('{:,.3%}')

| Frequency | Return Target | Method | Mean | StDev | MAD | VaR90 | CVaR90 |
|---|---|---|---|---|---|---|---|
| days_001 | 2.0% | Mean-Var | 2.000% | 1.438% | 0.960% | 0.084% | 0.155% |
| days_001 | 2.0% | Mean-MAD | 2.000% | 1.438% | 0.960% | 0.084% | 0.155% |
| days_001 | 2.0% | Mean-CVaR | 2.000% | 1.438% | 0.960% | 0.084% | 0.155% |
| days_001 | 4.0% | Mean-Var | 4.000% | 5.158% | 3.291% | 0.303% | 0.576% |
| days_001 | 4.0% | Mean-MAD | 4.000% | 5.190% | 3.271% | 0.298% | 0.571% |
| days_001 | 4.0% | Mean-CVaR | 4.000% | 5.206% | 3.272% | 0.298% | 0.571% |
| days_001 | 6.0% | Mean-Var | 6.000% | 9.641% | 6.024% | 0.556% | 1.075% |
| days_001 | 6.0% | Mean-MAD | 6.000% | 9.721% | 5.983% | 0.539% | 1.062% |
| days_001 | 6.0% | Mean-CVaR | 6.000% | 9.797% | 5.990% | 0.538% | 1.060% |
| days_005 | 2.0% | Mean-Var | 2.000% | 1.343% | 0.899% | 0.152% | 0.295% |


#### Portfolio Weights

In [32]:
wts.T.style.format('{:,.3%}')

| Frequency | Return Target | Method | IWB | IWM | EFA | EEM | VNQ | LQD | SHY |
|---|---|---|---|---|---|---|---|---|---|
| days_001 | 2.0% | Mean-Var | 3.757% | 0.000% | 0.000% | 0.000% | 0.000% | 0.000% | 96.243% |
| days_001 | 2.0% | Mean-MAD | 3.757% | -0.000% | -0.000% | -0.000% | -0.000% | 0.000% | 96.243% |
| days_001 | 2.0% | Mean-CVaR | 3.757% | 0.000% | 0.000% | 0.000% | 0.000% | 0.000% | 96.243% |
| days_001 | 4.0% | Mean-Var | 23.892% | 0.000% | 0.000% | 0.000% | 0.000% | 19.575% | 56.534% |
| days_001 | 4.0% | Mean-MAD | 22.150% | -0.000% | -0.000% | -0.000% | -0.000% | 26.076% | 51.773% |
| days_001 | 4.0% | Mean-CVaR | 21.756% | 0.000% | 0.000% | 0.000% | 0.000% | 27.550% | 50.694% |
| days_001 | 6.0% | Mean-Var | 43.206% | 0.000% | 0.000% | 0.000% | 0.000% | 42.211% | 14.582% |
| days_001 | 6.0% | Mean-MAD | 39.430% | -0.000% | -0.000% | -0.000% | -0.000% | 56.312% | 4.259% |
| days_001 | 6.0% | Mean-CVaR | 37.912% | 0.000% | 0.000% | 0.000% | 0.000% | 61.979% | 0.109% |
| days_005 | 2.0% | Mean-Var | 3.712% | 0.000% | 0.000% | 0.000% | 0.000% | 0.000% | 96.287% |


### Code

In [33]:
import numpy as np
import pandas as pd
import cvxpy as cp
import itertools
from scipy.optimize import minimize

def solve_cvxpy_problem(obj: cp.Minimize, constraints: list, solver=cp.ECOS) -> cp.Problem:
    """
    Solve problem in CVXPY given an objective and constraints
    """
    prob = cp.Problem(
        objective=obj,
        constraints=constraints,
    )
    
    prob.solve(solver=solver)
    assert prob.status == 'optimal'
    return prob

def solve_mean_var(rts: pd.DataFrame, target: float) -> np.ndarray:
    """
    Solve mean-variance objective given a DataFrame of returns and a return target
    """
    mean = rts.mean().values
    cov = rts.cov().values
    n_assets = rts.shape[1]
    # define the vector we're solving
    w = cp.Variable(n_assets)
    
    constraints = [
        # sum of all weights is one
        cp.sum(w) == 1,
        # all weights non-negative
        w >= 0,
        # require at least the target expected return
        w @ mean >= target,
    ]

    # minimize variance of portfolio
    obj = cp.Minimize(cp.quad_form(w, cov))
    prob = solve_cvxpy_problem(obj, constraints)
    return np.round(w.value, 6)

def solve_mean_mad(rts: pd.DataFrame, target: float) -> np.ndarray:
    """
    Solve mean-MAD objective given a DataFrame of returns and a return target
    """
    mean = rts.mean().values
    n_assets = rts.shape[1]
    w = cp.Variable(n_assets)
    
    constraints = [
        # sum of all weights is one
        cp.sum(w) == 1,
        # all weights non-negative
        w >= 0,
        # require at least the target expected return
        w @ mean >= target,
    ]
    
    # minimize the sum of absolute deviations from the portfolio mean (proportional to MAD)
    obj = cp.Minimize(cp.sum(cp.abs((rts.values @ w) - (mean @ w))))
    prob = solve_cvxpy_problem(obj, constraints)
    return np.round(w.value, 6)

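def var(xs, alpha):
    """
    Calculate VaR for a pandas Series as the alpha-th percentile of returns (alpha given in percent)
    """
    return np.percentile(xs, alpha, method='interpolated_inverted_cdf')

def cvar(xs, alpha):
    """
    Calculate CVaR for a pandas Series as the mean of returns strictly below the VaR cutoff
    """
    return xs[xs < var(xs, alpha)].mean()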
    
def solve_mean_cvar(rts: pd.DataFrame, target: float, alpha: int | float) -> np.ndarray:
    """
    Solve mean-CVaR objective given a DataFrame of returns, a return target, and a CVaR alpha, where alpha is given in percent
    """
    def objective(weights, pars):
        alpha = pars[0]
        portfolio_rts = (rts @ weights)
        return cvar(portfolio_rts, alpha) * -1
    
    # function to be used for total weight constraint
    def total_constraint(x, total_weight):
        return np.sum(x) - total_weight

    # function to be used for target return constraint
    def target_return_constraint(x, mean, target_return):
        return (x @ mean) - target_return

    mean = rts.mean().values
    n_assets = rts.shape[1]
    
    # Initial guess for the weights
    w0 = np.ones(n_assets) / n_assets
    
    # Define bounds for the weights (weights should be between 0 and 1)
    bounds = [(0, 1) for _ in range(n_assets)]
    
    cons = (
        # sum of weights = 1
        {'type': 'eq', 'fun': total_constraint, 'args': [1]},
        # target return
        {'type': 'eq', 'fun': target_return_constraint, 'args': [mean, target]},
    )
    
    # Minimize the CVaR
    result = minimize(
        objective, 
        w0, 
        constraints=cons, 
        args=[alpha], 
        bounds=bounds
    )
    
    # Return the optimized weights
    return np.round(result.x, 6)


def run_opt_partition(rts: pd.DataFrame, target: float, freq: int, alpha: int = 10) -> tuple[pd.DataFrame, pd.DataFrame]:
    """
    Run mean-var, mean-MAD, and mean-CVaR optimization for a given set of returns and portfolio expected return target
    Output the resulting weights and basic statistics about the portfolio.
    """
    # convert the annual return target to the frequency of the returns
    tgt = target * (freq / 252)
    sheet_name = f'days_{freq:03d}'
    # run the different optimizations
    w_mv = solve_mean_var(rts, tgt)
    w_mad = solve_mean_mad(rts, tgt)
    w_cvar = solve_mean_cvar(rts, tgt, alpha=alpha)
    
    methods = ['Mean-Var', 'Mean-MAD', 'Mean-CVaR']
    ix = pd.MultiIndex.from_product(
        [[sheet_name], [str.format('{:,.1%}', target)], methods],
        names=['Frequency', 'Return Target', 'Method'])
    # construct a frame of weights for all opt methods
    wts = pd.DataFrame(
        index=list(rts.columns),
        columns=ix,
        data=np.array([w_mv,  w_mad, w_cvar]).T)
    # calculate portfolio returns
    rts_p = pd.DataFrame(
        index=list(rts.index),
        columns=methods,
        data=np.array([rts.values @ w_mv,  rts.values @ w_mad, rts.values @ w_cvar]).T)

    # calc stats before returning
    
    stats = pd.DataFrame(
        columns=methods,
    )
    stats.loc['Mean', :] = rts_p.mean() * (252 / freq)
    stats.loc['StDev', :] = rts_p.std() * np.sqrt(252 / freq)
    stats.loc['MAD', :] = np.sum(
        np.abs((rts.values @ wts.values) - (rts.mean().values @ wts.values)), axis=0) \
        / rts_p.shape[0] * np.sqrt(252 / freq)
    # VaR and CVaR are reported as positive losses and are intentionally not annualized
    stats.loc[f'VaR{100-alpha}', :] = rts_p.apply(var, axis=0, alpha=alpha) * -1
    stats.loc[f'CVaR{100-alpha}', :] = rts_p.apply(cvar, axis=0, alpha=alpha) * -1
    stats.columns = ix
    
    return (wts, stats)
    

def run_opts(asset_returns: dict, portfolio_return_targets: list) -> tuple[pd.DataFrame, pd.DataFrame]:
    """
    Main entry point of the code. Run a series of optimizations given:
        * dictionary of DataFrames, each frame containing returns at different frequencies
        * list of portfolio return targets
    For each frequency/return target, run a mean-variance, mean-MAD, and mean-CVaR optimization. For each optimization
        we are assuming fully funded non-negative weights.
    """
    wts_all = []
    stats_all = []
    for sheet_name, tgt in itertools.product(asset_returns.keys(), portfolio_return_targets):
        # the suffix after the last underscore in the sheet name is the return frequency in days
        freq = int(sheet_name.split('_')[-1])
        # work on a copy so the caller's DataFrame is not modified between iterations
        rts = asset_returns[sheet_name].copy()
        rts['Date'] = pd.to_datetime(rts.Date).dt.date
        rts = rts.set_index('Date')
        wts_xs, stats_xs = run_opt_partition(rts, tgt, freq)
        wts_all.append(wts_xs)
        stats_all.append(stats_xs)

    wts = pd.concat(wts_all, axis=1)
    stats = pd.concat(stats_all, axis=1)
    
    return wts, stats

# run the code - load data and set return targets
all_data = pd.read_excel('Project1Data.xlsx', sheet_name=None)
portfolio_return_targets = [0.02, 0.04, 0.06]

wts, stats = run_opts(all_data, portfolio_return_targets)