<h1>Portfolio Optimization</h1>

<p>We have learned how to calculate the main metrics used to analyze and evaluate a portfolio of stocks.</p>

<p>Now we can use the power of Python to optimize a portfolio!</p>
    
<p>Portfolio optimization is the technique of allocating assets so that the portfolio achieves the highest return for the lowest risk.<br/>
One common way to do this is to find the allocation with the maximum Sharpe ratio, which measures return per unit of risk.</p>
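<p>As a small illustration of the metric being maximized, the sketch below computes the annualized Sharpe ratio of one fixed allocation. The returns are synthetic, and the helper name <code>annualized_sharpe</code>, the 252 trading days per year, and the zero risk-free rate are assumptions chosen for this example.</p>

```python
import numpy as np

def annualized_sharpe(daily_returns, weights, periods_per_year=252):
    """Annualized Sharpe ratio of a weighted portfolio, assuming a zero risk-free rate."""
    portfolio_returns = np.asarray(daily_returns) @ np.asarray(weights)
    return portfolio_returns.mean() / portfolio_returns.std() * np.sqrt(periods_per_year)

# Synthetic daily returns for two hypothetical assets (not real market data)
rng = np.random.default_rng(0)
daily_returns = rng.normal(loc=[0.0005, 0.001], scale=[0.01, 0.02], size=(1000, 2))

equal_weights = np.array([0.5, 0.5])
print(annualized_sharpe(daily_returns, equal_weights))
```

<p>Scaling all weights by the same positive factor leaves the ratio unchanged, which is why only the relative allocation matters.</p>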

<p>The simplest way to find the best allocation is to check many random allocations and find the one that has the best Sharpe ratio.</p>

<p>This process of random guessing is known as a Monte Carlo simulation: it tries many random combinations of stock weights and keeps the one with the best Sharpe ratio.</p>
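<p>The random weights themselves can be drawn in more than one way. The sketch below shows two options: normalizing uniform draws, and sampling from a flat Dirichlet distribution. The Dirichlet variant is an alternative offered for comparison, and the seed and array sizes are arbitrary choices.</p>

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, only for repeatability
n_assets, n_portfolios = 4, 5

# Option 1: draw uniform numbers and normalize each row to sum to 1
raw = rng.uniform(size=(n_portfolios, n_assets))
normalized_weights = raw / raw.sum(axis=1, keepdims=True)

# Option 2 (an alternative, assumed here for illustration): a flat Dirichlet
# distribution samples evenly over all non-negative weight vectors summing to 1
dirichlet_weights = rng.dirichlet(np.ones(n_assets), size=n_portfolios)

print(normalized_weights.sum(axis=1))
print(dirichlet_weights.sum(axis=1))
```

<p>Normalized uniform draws tend to cluster around equal weights, so the Dirichlet option may explore concentrated allocations more often.</p>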

<h4>To get started, let's define the initial stocks, download their price data, and calculate the daily returns.</h4>

In [1]:
from IPython.display import display
import matplotlib.pyplot as plt
import yfinance as yf
import numpy as np
import pandas as pd

In [2]:
stocks = ['AAPL', 'AMZN', 'MSFT', 'TSLA']

In [3]:
# Download data
data_df = yf.download(stocks, start='2018-01-01')

# Convert the index to datetime
data_df.index = pd.to_datetime(data_df.index)

YF.download() has changed argument auto_adjust default to True


[*********************100%***********************]  4 of 4 completed


In [4]:
display(data_df)

Price,Close,Close,Close,Close,High,High,High,High,Low,Low,Low,Low,Open,Open,Open,Open,Volume,Volume,Volume,Volume
Ticker,AAPL,AMZN,MSFT,TSLA,AAPL,AMZN,MSFT,TSLA,AAPL,AMZN,MSFT,TSLA,AAPL,AMZN,MSFT,TSLA,AAPL,AMZN,MSFT,TSLA
Date,,,,,,,,,,,,,,,,,,,,
2018-01-02,40.479832,59.450500,79.474144,21.368668,40.489233,59.500000,79.807021,21.474001,39.774854,58.525501,79.058052,20.733334,39.986349,58.599998,79.640582,20.799999,102223600,53890000,22483800,65283000
2018-01-03,40.472794,60.209999,79.844032,21.150000,41.017978,60.274502,79.991981,21.683332,40.409348,59.415001,79.492666,21.036667,40.543292,59.415001,79.575881,21.400000,118071600,62176000,26061400,67822500
2018-01-04,40.660782,60.479500,80.546753,20.974667,40.764179,60.793499,81.055316,21.236668,40.437540,60.233002,80.047438,20.378668,40.545634,60.250000,80.065928,20.858000,89738400,60442000,21912000,149194500
2018-01-05,41.123711,61.457001,81.545395,21.105333,41.210657,61.457001,81.748820,21.149332,40.665476,60.500000,80.842655,20.799999,40.757123,60.875500,81.055328,21.108000,94640000,70894000,23407100,68868000
2018-01-08,40.970970,62.343498,81.628624,22.427334,41.267060,62.653999,81.906024,22.468000,40.872270,61.601501,80.999858,21.033333,40.970970,61.799999,81.554650,21.066668,82271200,85590000,22113000,147891000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-04-01,223.190002,192.169998,382.190002,268.459991,223.679993,193.929993,382.850006,277.450012,218.899994,187.199997,373.230011,259.250000,219.809998,187.860001,374.649994,263.799988,36412700,41267300,19689500,146486900
2025-04-02,223.889999,196.009995,382.140015,282.760010,225.190002,198.339996,385.079987,284.989990,221.020004,187.660004,376.619995,251.270004,221.320007,187.660004,377.970001,254.600006,35905900,53679200,16092600,212787800
2025-04-03,203.190002,178.410004,373.109985,267.279999,207.490005,184.130005,377.480011,276.299988,201.250000,176.919998,369.350006,261.510010,205.539993,183.000000,374.790009,265.290009,103419000,95553600,30198000,136174300
2025-04-04,188.380005,171.000000,359.839996,239.429993,199.880005,178.139999,374.589996,261.000000,187.339996,166.000000,359.480011,236.000000,193.889999,167.149994,364.130005,255.380005,125569000,122951300,49138700,180324400
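
<p>The downloaded frame has two column levels, price field and ticker. The sketch below builds a tiny hand-made frame with the same layout (the numbers are invented, not market data) to show how selecting <code>"Close"</code> yields one column per ticker, ready for <code>pct_change()</code>.</p>

```python
import pandas as pd

# Toy frame mimicking the (Price, Ticker) column layout; values are made up
columns = pd.MultiIndex.from_product(
    [["Close", "Open"], ["AAPL", "MSFT"]], names=["Price", "Ticker"]
)
index = pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"])
toy_df = pd.DataFrame(
    [[100.0, 50.0, 99.0, 51.0],
     [110.0, 49.0, 101.0, 50.0],
     [121.0, 49.0, 111.0, 49.0]],
    index=index,
    columns=columns,
)

close = toy_df["Close"]                       # drops the outer level, keeps tickers
daily_returns = close.pct_change().dropna()   # first row is NaN, so drop it
print(daily_returns)
```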


In [5]:
def get_data_before(data_df, end: str):
    filtered_df = data_df[data_df.index <= end]
    return filtered_df
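
<p>The filter in <code>get_data_before</code> is inclusive of the end date. As a quick check on a toy series (invented values), the boolean mask behaves the same as a label slice on a sorted index, and a cutoff that falls on a weekend simply returns everything up to the last available day.</p>

```python
import pandas as pd

prices = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.to_datetime(["2025-04-01", "2025-04-02", "2025-04-03", "2025-04-04"]),
)

by_mask = prices[prices.index <= "2025-04-03"]   # boolean mask, as in get_data_before
by_slice = prices.loc[:"2025-04-03"]             # label slice, end point included

print(by_mask.equals(by_slice))

# 2025-04-06 is a Sunday and is not in the index
print(prices.loc[:"2025-04-06"].index[-1])
```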

In [6]:
def optimize_portfolio(data_df, stocks: list, end: str):
    current_data_df = get_data_before(data_df, end)
    
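    # Calculating daily returns from closing prices. Select the columns in the
    # same order as `stocks` so each weight lines up with its ticker, and drop
    # the first row, which pct_change() leaves as NaN.
    current_data_df = current_data_df['Close'][stocks]
    x = current_data_df.pct_change().dropna()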

    # Storing the weights, returns, risk and Sharpe ratio of each portfolio
    p_weights, p_returns, p_risk, p_sharpe = [], [], [], []

    # Generate random weights in a loop and calculate the return, volatility and Sharpe ratio of each portfolio.
    count = 5000
    for k in range(0, count):
        # Randomly assign a weight to each stock in our portfolio, and then calculate the metrics for that portfolio, including the Sharpe ratio.
        wts = np.random.uniform(size = len(stocks))
        wts = wts/np.sum(wts)
        p_weights.append(wts)
    
        # Returns
        mean_ret = (x.mean() * wts).sum()*252
        p_returns.append(mean_ret)
        
        # Volatility
        ret = (x * wts).sum(axis = 1)
        annual_std = np.std(ret) * np.sqrt(252)
        p_risk.append(annual_std)
            
        # Sharpe ratio (annualized, assuming a zero risk-free rate)
        sharpe = (np.mean(ret) / np.std(ret))*np.sqrt(252)
        p_sharpe.append(sharpe)

    # Finding the optimal index
    max_ind = np.argmax(p_sharpe)

    # Finding the max sharpe ratio
    max_sharpe_ratio = p_sharpe[max_ind]
    
    # Finding the optimal stock weights
    optimal_stock_weights = p_weights[max_ind]

    return float(max_sharpe_ratio), optimal_stock_weights.tolist()
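
<p>The loop above recomputes the portfolio return series for every candidate. As one possible alternative, the sketch below evaluates all candidates at once from the mean vector and covariance matrix of the daily returns. The function name <code>monte_carlo_max_sharpe</code>, the <code>seed</code> parameter, and the synthetic returns are illustrative assumptions, not part of the original notebook.</p>

```python
import numpy as np

def monte_carlo_max_sharpe(daily_returns, n_portfolios=5000, seed=None):
    """Vectorized random search for max-Sharpe weights (zero risk-free rate assumed)."""
    returns = np.asarray(daily_returns, dtype=float)
    rng = np.random.default_rng(seed)

    # One row of weights per candidate portfolio, each row summing to 1
    weights = rng.uniform(size=(n_portfolios, returns.shape[1]))
    weights /= weights.sum(axis=1, keepdims=True)

    mean_daily = returns.mean(axis=0)                   # per-asset mean daily return
    cov_daily = np.cov(returns, rowvar=False, ddof=0)   # population covariance, like np.std

    port_mean = weights @ mean_daily
    # Portfolio variance w' C w for every row of weights at once
    port_var = np.einsum("ij,jk,ik->i", weights, cov_daily, weights)
    sharpe = port_mean / np.sqrt(port_var) * np.sqrt(252)

    best = int(np.argmax(sharpe))
    return float(sharpe[best]), weights[best]

# Synthetic returns for three hypothetical assets (not real market data)
rng = np.random.default_rng(1)
fake_returns = rng.normal(
    loc=[0.0004, 0.0008, 0.0002], scale=[0.01, 0.02, 0.015], size=(750, 3)
)

best_sharpe, best_weights = monte_carlo_max_sharpe(fake_returns, seed=7)
print(best_sharpe, best_weights)
```

<p>Passing a fixed <code>seed</code> makes a run repeatable, which helps when comparing results across dates.</p>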

In [7]:
max_sharpe_ratio, optimal_stock_weights = optimize_portfolio(
    data_df=data_df,
    stocks=stocks, 
    end='2025-04-07'
)

# Max Sharpe ratio
print("The maximum Sharpe ratio: ", max_sharpe_ratio)
# Stocks
print("The stocks: ", stocks)
# Stock Weights
print("The optimal stock weights that give the maximum Sharpe ratio: ", optimal_stock_weights)

The maximum Sharpe ratio:  1.004389654867879
The stocks:  ['AAPL', 'AMZN', 'MSFT', 'TSLA']
The optimal stock weights that give the maximum Sharpe ratio:  [0.2915239034904829, 0.013332612198246089, 0.4395670541360039, 0.25557643017526704]


In [8]:
# Stability check: rerun the simulation several times and flag any stock whose
# weight moves by more than 0.1 from the baseline found above. The baseline is
# copied first so that the reruns do not overwrite it.
baseline_weights = list(optimal_stock_weights)
for test_index in range(len(stocks)):
    test_value = baseline_weights[test_index]
    print(f"{test_index = }")
    for i in range(10):
        _, rerun_weights = optimize_portfolio(
            data_df=data_df,
            stocks=stocks,
            end='2025-04-07'
        )
        if abs(rerun_weights[test_index] - test_value) > 0.1:
            print("diff bigger than 0.1", test_value, rerun_weights[test_index])
    print("---------------------------------------")

test_index = 0
---------------------------------------
test_index = 1
---------------------------------------
test_index = 2
---------------------------------------
test_index = 3
---------------------------------------


In [9]:
from datetime import datetime, timedelta

simulation_optimal_weights = {}

# Start and end dates
start_date = datetime(2025, 4, 1)
end_date = datetime(2025, 4, 7)

# Loop through each date
current_date = start_date
while current_date <= end_date:
    current_date_str = current_date.strftime('%Y-%m-%d')
    
    max_sharpe_ratio, optimal_stock_weights = optimize_portfolio(
        data_df=data_df,
        stocks=stocks, 
        end=current_date_str
    )

    simulation_optimal_weights[current_date_str] = {
        "stocks": stocks,
        "max_sharpe_ratio": max_sharpe_ratio,
        "optimal_stock_weights": optimal_stock_weights   
    }

    print(f"{current_date_str = }")
    print(f"{stocks = }")
    print(f"{max_sharpe_ratio = }")
    print(f"{optimal_stock_weights = }")
    print("------------------------")

    current_date += timedelta(days=1)

current_date_str = '2025-04-01'
stocks = ['AAPL', 'AMZN', 'MSFT', 'TSLA']
max_sharpe_ratio = 1.0846771049694763
optimal_stock_weights = [0.411854551990074, 0.0025385743884358703, 0.34631867801048904, 0.23928819561100118]
------------------------
current_date_str = '2025-04-02'
stocks = ['AAPL', 'AMZN', 'MSFT', 'TSLA']
max_sharpe_ratio = 1.0897382957143258
optimal_stock_weights = [0.4634072435616382, 0.0007288481337383953, 0.30611682774091553, 0.2297470805637078]
------------------------
current_date_str = '2025-04-03'
stocks = ['AAPL', 'AMZN', 'MSFT', 'TSLA']
max_sharpe_ratio = 1.061344686194766
optimal_stock_weights = [0.3288778063109424, 0.012376146734767232, 0.4196578666659874, 0.23908818028830295]
------------------------
current_date_str = '2025-04-04'
stocks = ['AAPL', 'AMZN', 'MSFT', 'TSLA']
max_sharpe_ratio = 1.030506123725664
optimal_stock_weights = [0.34311083681594057, 0.0005889818362135035, 0.4184297840143378, 0.237870397333508]
------------------------
current_date_str = '
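
<p>The loop above advances one calendar day at a time, so 2025-04-05 and 2025-04-06 (a weekend) rerun the optimization on the same data as 2025-04-04. One way to visit only weekdays is <code>pd.bdate_range</code>, sketched below. It does not know about exchange holidays, so treat it as an approximation of trading days.</p>

```python
import pandas as pd

# Business days only (Monday to Friday); exchange holidays are NOT excluded
business_days = [
    day.strftime("%Y-%m-%d") for day in pd.bdate_range("2025-04-01", "2025-04-07")
]
for day_str in business_days:
    print(day_str)
```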

In [10]:
for v in simulation_optimal_weights.values():
    print(sum(v["optimal_stock_weights"]))

1.0
0.9999999999999999
1.0
0.9999999999999999
1.0
1.0
1.0
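
<p>Some sums print as 0.9999999999999999 because of floating-point rounding, not because the weights are wrong. A tolerance-based check such as <code>math.isclose</code> is a safer way to confirm that weights sum to 1. The sketch below uses the weights printed for 2025-04-02 above.</p>

```python
import math

# Weights copied from the 2025-04-02 output above
weights = [
    0.4634072435616382,
    0.0007288481337383953,
    0.30611682774091553,
    0.2297470805637078,
]

total = sum(weights)
sums_to_one = math.isclose(total, 1.0)  # default relative tolerance is 1e-09
print(total, sums_to_one)
```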
