# Portfolio Optimization

Portfolio optimization is the process of choosing the best combination of assets to achieve the highest possible return for a given level of risk, or alternatively, the lowest risk for a desired return.  The goal is to build a well-diversified portfolio that performs efficiently under various market conditions.

## Modern Portfolio Theory


Modern Portfolio Theory (MPT), also known as Mean-Variance Portfolio Theory, represents a major breakthrough in finance. It assumes that asset returns are normally distributed, so their behavior can be described using just the mean (expected return) and the variance (risk or volatility).

The core idea of MPT is to achieve diversification by constructing a portfolio that either minimizes risk for a given level of expected return or maximizes expected return for a given level of risk.

The Efficient Frontier represents the set of optimal portfolios along the risk-return spectrum. Portfolios that lie below the Efficient Frontier are considered sub-optimal, as they offer lower returns for a given level of risk or higher risk for a given level of return.

**Portfolios on the Efficient Frontier provide:**

a. The highest expected return for a given level of risk, or

b. The lowest level of risk for a given level of expected return

In essence, an investor's goal is to determine the level of risk they are comfortable with, and then select a portfolio on the Efficient Frontier that offers the best possible return for that risk level. 
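
The diversification effect behind MPT can be illustrated with a minimal sketch. The volatilities, weights, and correlations below are hypothetical numbers chosen only for illustration, not values from the dataset used in this notebook. The sketch shows that the volatility of a two-asset portfolio equals the weighted average of the individual volatilities only when the correlation is 1, and is lower otherwise.

```python
import numpy as np

# Hypothetical annualized volatilities for two assets (illustrative numbers only)
sigma_a, sigma_b = 0.60, 0.80
w_a, w_b = 0.5, 0.5  # equal-weight portfolio


def portfolio_volatility(rho):
    """Volatility of the two-asset portfolio for a given correlation rho."""
    variance = (
        (w_a * sigma_a) ** 2
        + (w_b * sigma_b) ** 2
        + 2 * w_a * w_b * rho * sigma_a * sigma_b
    )
    return np.sqrt(variance)


# Benchmark: what the volatility would be with no diversification benefit
weighted_average_vol = w_a * sigma_a + w_b * sigma_b

for rho in (1.0, 0.5, 0.0, -0.5):
    print(f"rho={rho:+.1f}  portfolio vol={portfolio_volatility(rho):.4f}")
print(f"weighted average vol={weighted_average_vol:.4f}")
```

The lower the correlation between the two assets, the larger the reduction in portfolio volatility relative to the weighted average.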

**Import Libraries**

In [None]:
# Import client from TradingStrategy
from tradingstrategy.client import Client

from tradingstrategy.chain import ChainId
from tradingstrategy.exchange import ExchangeUniverse
from tradingstrategy.timebucket import TimeBucket
from tradingstrategy.pair import PandasPairUniverse, HumanReadableTradingPairDescription

# Create client for data import
client = Client.create_jupyter_client()

from pyarrow import Table
import pandas as pd

## Pair Universe Retrieval

In [None]:
# Fetch exchange universe
exchange_universe: ExchangeUniverse = client.fetch_exchange_universe()

# Fetch pair dataset, decompress raw PyArrow data to Python map
columnar_pair_table: Table = client.fetch_pair_universe()
print(f"Total pairs {len(columnar_pair_table)}, total exchanges {len(exchange_universe.exchanges)}")

# Wrap the data in a helper class with indexes for easier access
pair_universe = PandasPairUniverse(columnar_pair_table.to_pandas(), exchange_universe=exchange_universe)

## Retrieve trading pairs with specific human description

Human descriptions for Uniswap v3 are retrieved from https://tradingstrategy.ai/trading-view/ethereum/uniswap-v3

In [None]:

# Get BTC-USD pair description on Uniswap v3 [fewer pairs than v2]
btc_usd_desc: HumanReadableTradingPairDescription = (ChainId.ethereum, "uniswap-v3", "WBTC", "USDC", 0.003) # Wrapped BTC-USD Coin https://tradingstrategy.ai/trading-view/ethereum/uniswap-v3/wbtc-usdc-fee-30
display(btc_usd_desc)

# Get ETH-USD pair description on Uniswap v3 [fewer pairs than v2]
eth_usd_desc: HumanReadableTradingPairDescription = (ChainId.ethereum, "uniswap-v3", "WETH", "USDC", 0.0005) # Ether-USD Coin https://tradingstrategy.ai/trading-view/ethereum/uniswap-v3/eth-usdc-fee-5
display(eth_usd_desc)

# Retrieve pair using description as index
btc_usd_pair = pair_universe.get_pair_by_human_description(btc_usd_desc)
eth_usd_pair = pair_universe.get_pair_by_human_description(eth_usd_desc)
display(btc_usd_pair)
display(eth_usd_pair)

## Retrieve all candles (sub-optimal unless caching for later use) 

In [None]:
# Download all 24h candles as Parquet columnar data
all_candles = client.fetch_all_candles(TimeBucket.d1)

# Convert PyArrow table to Pandas
all_candles_dataframe = all_candles.to_pandas()

# Retrieve candles for each pair; copy so that new columns can be added without a SettingWithCopyWarning
btc_usd_candles: pd.DataFrame = all_candles_dataframe.loc[all_candles_dataframe['pair_id'] == btc_usd_pair.pair_id].copy()
eth_usd_candles: pd.DataFrame = all_candles_dataframe.loc[all_candles_dataframe['pair_id'] == eth_usd_pair.pair_id].copy()

print(f"Uniswap v3 WBTC-USDC has {len(btc_usd_candles)} daily candles: {btc_usd_candles}")
print(f"Uniswap v3 WETH-USDC has {len(eth_usd_candles)} daily candles: {eth_usd_candles}")

# Swap exchanges separate buy volume from sell volume so we'll append a new total volume column
btc_usd_candles["volume"] = btc_usd_candles["buy_volume"] + btc_usd_candles["sell_volume"]
eth_usd_candles["volume"] = eth_usd_candles["buy_volume"] + eth_usd_candles["sell_volume"]


## Assemble Price Table

In [None]:
# build price table for covariance analysis from common timestamps
merge_table = pd.merge(btc_usd_candles, eth_usd_candles, on='timestamp', how='inner', suffixes=('_btc', '_eth'))

# represent price as price at close
btc_price = merge_table['close_btc']
eth_price = merge_table['close_eth']                      

display(btc_price)
display(eth_price)

# Price table with one column per asset and one row per timestamp
price_table = pd.DataFrame({'close_btc': btc_price, 'close_eth': eth_price})
display(price_table)


## Alternative Price Table Representation

In [None]:
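# Equivalent construction with pd.concat; note that this table holds prices, not returns
price_table_alt = pd.concat([btc_price, eth_price], axis=1, join='inner')
display(price_table_alt)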

## Assemble Returns Matrix

This notebook uses simple (percentage) returns. Log returns are an alternative and can be calculated using `np.log(prices / prices.shift(1))`.

In [None]:
btc_returns = btc_price.pct_change()
eth_returns = eth_price.pct_change()
# One column per asset, one row per timestamp, so that mean() and cov() work per asset
returns_table = pd.concat([btc_returns, eth_returns], axis=1)

# Drop the first row, where pct_change is NaN
returns_table = returns_table.dropna()
display(returns_table)
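
The log-return formula mentioned above can be sketched as follows. The price series is made up for illustration and is not taken from the candle data. The sketch shows how log returns relate to simple returns and that log returns add up over time.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices (illustrative numbers, not market data)
prices = pd.Series([100.0, 110.0, 99.0, 108.9])

# Simple returns: (p_t / p_{t-1}) - 1
simple_returns = prices.pct_change().dropna()

# Log returns: log(p_t / p_{t-1}), equal to log(1 + simple return)
log_returns = np.log(prices / prices.shift(1)).dropna()

# Log returns are additive: their sum is the log of the total price ratio
total_log_return = log_returns.sum()

print(simple_returns.values)
print(log_returns.values)
print(total_log_return, np.log(prices.iloc[-1] / prices.iloc[0]))
```

Additivity is the usual reason for preferring log returns when aggregating over time, while simple returns aggregate linearly across assets, which is what the portfolio formulas in this notebook rely on.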

## Calculate Volatility

In [None]:
# Calculate daily standard deviation (volatility)
btc_volatility = btc_returns.std()
eth_volatility = eth_returns.std()

display(btc_volatility)
display(eth_volatility)

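# Calculate daily covariance between the two return series
btc_eth_covariance = btc_returns.cov(eth_returns)
display(btc_eth_covariance)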

## Annualized Volatility

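To annualize daily volatility, multiply it by the square root of the number of trading days per year (e.g., 252 for equities, 365 for cryptocurrencies, which trade every day). This scaling assumes that daily returns are uncorrelated over time.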

In [None]:
import numpy as np

btc_annualized_volatility = btc_volatility * np.sqrt(365)
eth_annualized_volatility = eth_volatility * np.sqrt(365)

display(btc_annualized_volatility)
display(eth_annualized_volatility)

## Annual Returns and Volatility, alternative formulation 

In [None]:
# Annualized figures in percent, using 365 periods per year because crypto markets trade every day
btc_annual_returns = round(btc_returns.mean() * 365 * 100, 2)
btc_annual_stdev = round(btc_returns.std() * np.sqrt(365) * 100, 2)

display(btc_annual_returns)
display(btc_annual_stdev)

eth_annual_returns = round(eth_returns.mean() * 365 * 100, 2)
eth_annual_stdev = round(eth_returns.std() * np.sqrt(365) * 100, 2)

display(eth_annual_returns)
display(eth_annual_stdev)



## Calculate Covariance

In [None]:
# Daily covariance matrix of the two return series (one column per asset)
daily_cov_matrix = pd.concat([btc_returns, eth_returns], axis=1).cov()
display(daily_cov_matrix)

## Portfolio Statistics

Consider a portfolio which is fully invested in risky assets. Let $w$ and $\mu$ be the vectors of weights and mean returns of the $n$ assets. <br><br>

$$
w = \left(
\begin{array}{c}
w_1 \\
w_2 \\
\vdots \\
w_n \\
\end{array}
\right);
\quad
\mu = \left(
\begin{array}{c}
\mu_1 \\
\mu_2 \\
\vdots \\
\mu_n \\
\end{array}
\right)
$$

where $\sum_{i=1}^{n}w_i=1$.

**`Expected Portfolio Return`** is then the dot product of the expected returns and their weights. <br><br>

$$\mu_\pi = w^T\cdot\mu$$

which is equivalent to $\sum_{i=1}^{n}w_i\mu_i$.


**`Expected Portfolio Variance`** is then the quadratic form of the weights with the covariance matrix. <br><br>

$$\sigma^2_\pi = w^T\cdot\Sigma\cdot w $$

where $\Sigma$ is the covariance matrix of asset returns:

$$
{\Sigma=}\left( 
\begin{array}{ccc}
\Sigma_{1,1} & \dots & \Sigma_{1,n} \\ 
\vdots & \ddots & \vdots  \\ 
\Sigma_{n,1} & \dots & \Sigma_{n,n} \\ %
\end{array}%
\right)
$$
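
As a minimal numerical sketch of the two formulas above, the following computes the expected return and variance of a two-asset portfolio with NumPy. The weights, mean returns, and covariance matrix are hypothetical values chosen for illustration only.

```python
import numpy as np

# Hypothetical annualized inputs for two assets (illustrative numbers only)
w = np.array([0.6, 0.4])             # weights, summing to 1
mu = np.array([0.10, 0.20])          # expected returns
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])     # covariance matrix

# Expected portfolio return: w^T mu, identical to the sum of w_i * mu_i
port_return = w @ mu
port_return_sum = sum(w_i * mu_i for w_i, mu_i in zip(w, mu))

# Expected portfolio variance: w^T Sigma w, and the corresponding volatility
port_variance = w @ Sigma @ w
port_volatility = np.sqrt(port_variance)

print(port_return, port_return_sum)
print(port_variance, port_volatility)
```

With these inputs the expected return is 0.14 and the variance is 0.0336, which can be verified by expanding the sums by hand.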

In [None]:
# returns_table must have one column per asset so that mean() and cov() are computed per asset

# Compute annualized statistics (365 daily periods per year for crypto)
mean_returns = returns_table.mean() * 365
cov_matrix = returns_table.cov() * 365
display(mean_returns)
display(mean_returns.mean())

# Display the covariance matrix
display(cov_matrix)

# Gather length of mean returns for subsequent notebook cells
n = len(mean_returns) 


**Maximum Sharpe Ratio Portfolio**

CVXPY is designed specifically for convex optimization, while the original Sharpe ratio maximization problem is non-convex due to its fractional (ratio) form.

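To address this, we apply a mathematical transformation: we fix the portfolio excess return at 1 and find the weights that achieve exactly 1 unit of excess return with minimum risk. This works because the Sharpe ratio is homogeneous of degree zero in the weights: if a set of portfolio weights $w^{*}$ maximizes the Sharpe ratio, then any scaled version $k \cdot w^{*}$ with $k > 0$ achieves the same Sharpe ratio. The solution of the transformed problem is therefore rescaled afterwards so that the weights sum to 1. The transformation requires at least one asset with a positive excess return; otherwise the constraint cannot be met with non-negative weights.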

In [None]:
# ---- 1. Maximum Sharpe Ratio Portfolio ----

import cvxpy as cp

def optimize_max_sharpe(mean_returns, cov_matrix, risk_free_rate=0.0):

    # Excess return over the risk-free rate, as a NumPy array
    excess_return = np.asarray(mean_returns) - risk_free_rate

    # Number of assets
    n = len(mean_returns)

    # Decision variable: unnormalized portfolio weights
    w = cp.Variable(n)

    # Wrap the covariance matrix so that CVXPY treats it as PSD without re-checking it
    cov_matrix_np = cov_matrix.values
    cov_matrix_psd = cp.psd_wrap(cov_matrix_np)

    # Portfolio excess return and variance
    port_return = w @ excess_return
    port_risk = cp.quad_form(w, cov_matrix_psd)
    
    # Minimize risk for 1 unit of excess return
    objective = cp.Minimize(port_risk) 
    constraints = [w >= 0, port_return == 1]
    
    prob = cp.Problem(objective, constraints)
    prob.solve()            
    #print(prob.solver_stats.solver_name)
    
    if w.value is not None:
        # Normalize weights to sum to 1
        w_normalized = w.value / np.sum(w.value)
        return w_normalized
    else:
        return None
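
The transformation described above can be checked numerically with a minimal NumPy sketch. The excess returns and covariance matrix are hypothetical, and the closed-form step ignores the non-negativity constraint, which happens to be inactive for these numbers; this is an illustration of the idea, not a replacement for the CVXPY formulation.

```python
import numpy as np

# Hypothetical annualized excess returns and covariance (illustrative numbers only)
excess = np.array([0.10, 0.20])
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])


def sharpe(w):
    """Sharpe ratio of weights w, computed from excess returns."""
    return (w @ excess) / np.sqrt(w @ Sigma @ w)


# Step 1: minimize y' Sigma y subject to y' excess = 1.
# Without the y >= 0 constraint the solution is Sigma^{-1} excess, rescaled.
sigma_inv_excess = np.linalg.solve(Sigma, excess)
y = sigma_inv_excess / (excess @ sigma_inv_excess)

# Step 2: rescale so the weights sum to 1; the Sharpe ratio is unchanged
w_closed = y / y.sum()

# Brute-force check over long-only, fully invested two-asset portfolios
grid = np.linspace(0.0, 1.0, 1001)
sharpes = np.array([sharpe(np.array([a, 1.0 - a])) for a in grid])
w_grid = grid[np.argmax(sharpes)]

print("closed-form weights:", w_closed)
print("grid-search weight of asset 1:", w_grid)
print("Sharpe before and after rescaling:", sharpe(y), sharpe(w_closed))
```

Both approaches agree on the weights, and the Sharpe ratio is identical before and after normalization, which is the homogeneity property the transformation relies on.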

**Display available solvers**

In [None]:
# List of the solvers CVXPY supports
from cvxpy import installed_solvers
print(installed_solvers())

**Calculate and Display Sharpe Optimization**

In [None]:
# ---- Run Optimizations ----
msr_weights = optimize_max_sharpe(mean_returns, cov_matrix) 
msr_weights

**Minimum Variance Portfolio**

The Minimum Variance Portfolio aims to find asset weights that minimize overall portfolio risk, regardless of expected returns. In this formulation, we minimize the portfolio's variance using the covariance matrix of asset returns. The optimization is subject to two constraints: the weights must sum to 1 (fully invested portfolio), and short-selling is not allowed (weights ≥ 0). This problem is convex and efficiently solved using CVXPY.

In [16]:
# ---- 2. Minimum Variance Portfolio ----

def optimize_min_variance(cov_matrix):
    
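    # Number of assets, taken from the covariance matrix rather than a global variable
    n = cov_matrix.shape[0]
    w = cp.Variable(n)

    # Wrap the covariance matrix so that CVXPY treats it as PSD without re-checking it
    cov_matrix_np = cov_matrix.values
    cov_matrix_psd = cp.psd_wrap(cov_matrix_np)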

    objective = cp.Minimize(cp.quad_form(w, cov_matrix_psd))
    constraints = [cp.sum(w) == 1, w >= 0]
    
    prob = cp.Problem(objective, constraints)
    prob.solve()
    
    return w.value
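
For intuition, the minimum variance weights have a closed form when the non-negativity constraint is not binding. The sketch below uses a hypothetical two-asset covariance matrix (illustrative numbers only) and is meant as a cross-check of the idea rather than a substitute for the solver.

```python
import numpy as np

# Hypothetical annualized covariance matrix (illustrative numbers only)
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
ones = np.ones(2)

# Minimum-variance weights without the w >= 0 constraint:
# w = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
raw = np.linalg.solve(Sigma, ones)
w_min_var = raw / raw.sum()

# Resulting portfolio variance
min_variance = w_min_var @ Sigma @ w_min_var

print("weights:", w_min_var)
print("variance:", min_variance)
```

Here both weights are positive, so the long-only solution is the same, and the portfolio variance is lower than the variance of either asset alone.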

**Calculate and Display Minimum Variance Portfolio**

In [None]:
display(cov_matrix)

# ---- Run Optimizations ----
mv_weights = optimize_min_variance(cov_matrix) 
mv_weights

**Maximum Return Portfolio**

The Maximum Return Portfolio focuses solely on maximizing expected returns, without considering portfolio risk. Given the vector of mean asset returns, this optimization finds the portfolio weights that yield the highest expected return, subject to two constraints: full investment (weights sum to 1) and no short-selling (weights ≥ 0). This is a linear program and is efficiently solvable using CVXPY. Because the objective is linear, the solution places the entire weight on the asset with the highest mean return.

In [18]:
# ---- 3. Maximum Return Portfolio ----

def optimize_max_return(mean_returns):
    
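    # Number of assets, taken from the input rather than a global variable
    n = len(mean_returns)
    w = cp.Variable(n)

    objective = cp.Maximize(w @ np.asarray(mean_returns))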
    constraints = [cp.sum(w) == 1, w >= 0]
    
    prob = cp.Problem(objective, constraints)
    prob.solve()
    
    return w.value
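
Since the objective is linear and the feasible set is the simplex of long-only, fully invested weights, the optimum sits at a vertex. The sketch below illustrates this with hypothetical mean returns; the three-asset vector is an assumption made only for this example.

```python
import numpy as np

# Hypothetical expected returns for three assets (illustrative numbers only)
mu = np.array([0.10, 0.20, 0.15])

# A linear objective over {w >= 0, sum(w) = 1} is maximized at a vertex:
# all weight goes to the asset with the highest expected return
w_max_return = np.zeros_like(mu)
w_max_return[np.argmax(mu)] = 1.0
best_return = w_max_return @ mu

# Sanity check against random long-only, fully invested portfolios
rng = np.random.default_rng(0)
random_weights = rng.dirichlet(np.ones(len(mu)), size=1000)
random_returns = random_weights @ mu

print("weights:", w_max_return)
print("best return:", best_return, "max of random portfolios:", random_returns.max())
```

No randomly drawn portfolio exceeds the return of the single best asset, which is why this portfolio is fully concentrated and typically the riskiest point on the frontier.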

**Calculate and Display Maximum Return Portfolio**

In [None]:
# ---- Run Optimizations ----
mr_weights = optimize_max_return(mean_returns)
mr_weights

**Portfolio Composition after Optimization**

In [20]:
# MV Weights
#assets = ['BTC-USD', 'ETH-USD']
#mvwtdf = pd.DataFrame(100*mv_weights, index=assets, columns=['wts'])
#mvwtdf.iplot(kind='pie', showlegend=True, title='MV Weights')

# Efficient Frontier

The Efficient Frontier is formed by a set of portfolios offering the highest expected portfolio return for a certain volatility or offering the lowest volatility for a certain level of expected returns. 

**`Minimize Portfolio Risk for a Target Return`:** 

* Risk objective and Return constraint

$$\underset{w_1,w_2,\dots,w_n}{\text{minimize}} \quad \sigma^2_{p}(w_1,w_2,\dots,w_n)$$

subject to,

$$E[R_p] = m$$


**`Maximize Portfolio Return for a Target Risk`**:
* Return objective and Risk constraint

$$\underset{w_1,w_2,\dots,w_n}{\text{maximize}} \quad E[R_p(w_1,w_2,\dots,w_n)]$$

subject to,

$$\sigma^2_{p}(w_1,w_2,\dots,w_n)=v^2$$

where, $\sum_{i=1}^{n}w_i=1$, 

$m$ is the target return, and 

$v$ is the target volatility for the above objectives. 

We can use numerical optimization techniques such as quadratic programming to solve these problems. The goal is to find the optimal portfolio weights that minimize or maximize the objective function, while satisfying the specified constraints. These techniques allow us to compute the full efficient frontier by iterating across a range of return or risk levels.

In [31]:
# ---- 4. Efficient Frontier ----

def efficient_frontier(mean_returns, cov_matrix, points=100):
    
    # Wrap the covariance matrix so that CVXPY treats it as PSD without re-checking it
    cov_matrix_np = cov_matrix.values
    cov_matrix_psd = cp.psd_wrap(cov_matrix_np)

    # Work with a NumPy array of mean returns; n is the number of assets
    mean_returns_np = np.asarray(mean_returns)
    n = len(mean_returns_np)

    target_returns = np.linspace(mean_returns_np.min(), mean_returns_np.max(), points)
    frontier = []

    for target in target_returns:

        w = cp.Variable(n)

        port_risk = cp.quad_form(w, cov_matrix_psd)

        objective = cp.Minimize(port_risk)
        constraints = [cp.sum(w) == 1, w >= 0, w @ mean_returns_np == target]

        prob = cp.Problem(objective, constraints)
        # verbose=False keeps the notebook output readable across the loop
        prob.solve(solver=cp.SCS, verbose=False)

        if w.value is not None:
            vol = np.sqrt(w.value @ cov_matrix_np @ w.value)
            frontier.append((vol, target))
            
    return np.array(frontier)
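
With only two assets, the target-return iteration described above needs no solver, because the full-investment and target-return constraints determine the weights uniquely. The sketch below traces a few frontier points this way; the mean returns and covariance matrix are hypothetical numbers used for illustration.

```python
import numpy as np

# Hypothetical annualized inputs for two assets (illustrative numbers only)
mu = np.array([0.10, 0.20])
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])

targets = np.linspace(mu.min(), mu.max(), 5)
frontier = []

for m in targets:
    # sum(w) = 1 and w @ mu = m give w1 = (mu2 - m) / (mu2 - mu1)
    w1 = (mu[1] - m) / (mu[1] - mu[0])
    w = np.array([w1, 1.0 - w1])
    vol = np.sqrt(w @ Sigma @ w)
    frontier.append((vol, m))

frontier = np.array(frontier)  # columns: volatility, target return

for vol, m in frontier:
    print(f"target return={m:.3f}  volatility={vol:.4f}")
```

The curve bends to the left between the two endpoints: intermediate targets reach a volatility below that of the less volatile asset. With more than two assets the weights are no longer pinned down by the constraints, which is where the quadratic program becomes necessary.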

**Plot Efficient Frontier**

In [None]:
# ---- Get Optimized Portfolio Statistics ----
def get_stats(w):
    ret = mean_returns @ w
    vol = np.sqrt(w.T @ cov_matrix @ w)
    return 100*ret, 100*vol

msr_ret,msr_vol = get_stats(msr_weights)
mv_ret, mv_vol = get_stats(mv_weights)
mr_ret, mr_vol = get_stats(mr_weights)

In [None]:
import plotly.graph_objects as go

# ---- 5. Plot Efficient Frontier ----

ef_curve = efficient_frontier(mean_returns, cov_matrix)
ef_port = 100 * pd.DataFrame(ef_curve, columns=['Volatility', 'Return'])

# Build the figure with plain Plotly; DataFrame.iplot requires cufflinks, which is not imported here
fig = go.Figure()

# Frontier curve
fig.add_trace(go.Scatter(x=ef_port['Volatility'], y=ef_port['Return'],
    mode='lines', name='Efficient Frontier'))

# Optimized portfolios as single markers
fig.add_trace(go.Scatter(x=[msr_vol], y=[msr_ret], mode='markers',
    marker=dict(size=10, color='green'), text=["Max Sharpe"], name='Max Sharpe'))

fig.add_trace(go.Scatter(x=[mv_vol], y=[mv_ret], mode='markers',
    marker=dict(size=10, color='blue'), text=["Min Variance"], name='Min Variance'))

fig.add_trace(go.Scatter(x=[mr_vol], y=[mr_ret], mode='markers',
    marker=dict(size=10, color='red'), text=["Max Return"], name='Max Return'))

fig.update_layout(
    title='Efficient Frontier Portfolio',
    xaxis_title="Volatility (Risk)",
    yaxis_title="Expected Return",
    showlegend=True
    )

fig.show()