## Constructing a Minimum-Variance Portfolio

First, we need to import all of the libraries that we'll be using:

* NumPy for linear algebra
* pandas for dataframes and variance/covariance calculations
* yfinance for financial data
* scipy.optimize for solving for the optimal stock weights

In [1]:
# Imports
import numpy as np
import pandas as pd
import yfinance as yf
from scipy import optimize

### Collecting and Preparing the Data

In this example, we'll use the 10 largest stocks by market capitalization to build our portfolio. This can easily be changed by adding or removing entries from the tickers list. The other parameters that can be tuned are:

* period: how far back we will go when calculating returns, variances, and covariances.
* interval: the frequency that returns will be calculated over (i.e. daily returns, monthly returns, etc.).
* allow_short: whether short selling is allowed. When it is allowed, we'll assume that it carries no cost.

The following code will download a dataframe with the closing price for each of the selected stocks with the given parameters. Then, it will print out the first five rows to verify that the correct data has been downloaded.

In [2]:
### USER PARAMETERS ###

tickers = [
    'AAPL',
    '2222.SR',
    'MSFT',
    'GOOG',
    'AMZN',
    'BRK-B',
    'V',
    'TSM',
    'TCEHY',
    'XOM'
]

period = '1y' # Options are 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, and max

interval = '1d' # Options are 1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, and 3mo

allow_short = True

#######################

# Build a list of (ticker, 'Close') tuples, e.g. ('AAPL', 'Close'), so that we can request only the closing prices
fields = [(ticker, 'Close') for ticker in tickers]

# Download the data with the user parameters
data = yf.download(
    tickers = tickers,
    period = period,
    interval = interval,
    group_by = 'ticker'
)[fields]

# Print the first five rows
print(data.head())

[*********************100%***********************]  10 of 10 completed
                  AAPL    2222.SR        MSFT        GOOG        AMZN  \
                 Close      Close       Close       Close       Close   
Date                                                                    
2022-01-18         NaN  33.454544         NaN         NaN         NaN   
2022-01-19  166.229996  33.363636  303.329987  135.651993  156.298996   
2022-01-20  164.509995  33.500000  301.600006  133.506500  151.667496   
2022-01-21  162.410004        NaN  296.029999  130.091995  142.643005   
2022-01-23         NaN  33.181816         NaN         NaN         NaN   

                 BRK-B           V         TSM      TCEHY        XOM  
                 Close       Close       Close      Close      Close  
Date                                                                  
2022-01-18         NaN         NaN         NaN        NaN        NaN  
2022-01-19  314.750000  214.679993  131.009995  57.599998  7

Since we're interested in the stocks' returns, not their closing prices, we will create a new dataframe that instead holds the percent change from the previous row for each row. The first row will be dropped, since it has no previous row to compare against.

In [3]:
# Find the percent returns for each row (except the first one) and store them in a new dataframe
returns = data.pct_change().dropna() * 100

# Print the first five rows to verify
print(returns.head())

                AAPL   2222.SR      MSFT      GOOG      AMZN     BRK-B  \
               Close     Close     Close     Close     Close     Close   
Date                                                                     
2022-01-20 -1.034712  0.408720 -0.570330 -1.581615 -2.963231 -1.188242   
2022-01-21 -1.276513  0.000000 -1.846819 -2.557557 -5.950181 -1.861679   
2022-01-23  0.000000 -0.949803  0.000000  0.000000  0.000000  0.000000   
2022-01-24 -0.486429 -0.547939  0.114852  0.215231  1.332698 -0.488169   
2022-01-25 -1.138471  0.826448 -2.658840 -2.789324 -3.153375  1.139167   

                   V       TSM     TCEHY       XOM  
               Close     Close     Close     Close  
Date                                                
2022-01-20 -0.153711 -2.045640  5.034725  0.218843  
2022-01-21 -3.928161 -2.961118 -1.388430 -1.501295  
2022-01-23  0.000000  0.000000  0.000000  0.000000  
2022-01-24 -1.981249  1.148318 -0.771034  0.859087  
2022-01-25  0.074310 -2.762778  1.52
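
To see why a row is lost, here is a minimal sketch on made-up prices. The tickers `AAA` and `BBB` and their values are hypothetical, not real market data. It only illustrates how `pct_change()` followed by `dropna()` behaves.

```python
import pandas as pd

# Hypothetical closing prices for two made-up tickers (not real market data)
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0], "BBB": [50.0, 50.0, 55.0]},
    index=pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05"]),
)

# pct_change() compares each row with the previous one, so the first row is NaN
raw = prices.pct_change()

# dropna() removes that first row; multiplying by 100 expresses returns in percent
toy_returns = raw.dropna() * 100
print(toy_returns)
```

Three price rows produce two return rows, which is the same one-row loss seen in the real data above.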

### Covariance Matrix

Before we can calculate and optimize the portfolio variance, we need the covariance matrix of the portfolio. The following code will calculate it and store it in a NumPy array for future linear algebra.

In [4]:
# Create covariance matrix and convert from pandas dataframe to NumPy array
covariance = returns.cov().to_numpy()

# Print to verify
print(covariance)

[[ 4.12351406  0.03879593  3.29913064  3.47974593  3.99424053  1.81212763
   2.53185515  2.98937576  2.16429601  1.16159124]
 [ 0.03879593  1.65193044  0.16416702  0.03248357 -0.12920911  0.04554862
  -0.05566208 -0.19061545  0.5147812   0.35156811]
 [ 3.29913064  0.16416702  4.02474411  3.70580828  4.26966785  1.67330161
   2.33214066  2.6559232   1.95940384  0.91130484]
 [ 3.47974593  0.03248357  3.70580828  4.81335377  4.60322023  1.77725008
   2.28815193  3.06079214  2.38061389  0.88781615]
 [ 3.99424053 -0.12920911  4.26966785  4.60322023  8.26843634  2.10296784
   2.77754649  3.52813678  2.93093419  1.36153556]
 [ 1.81212763  0.04554862  1.67330161  1.77725008  2.10296784  1.62610561
   1.44481129  1.40267777  0.62224619  1.07399141]
 [ 2.53185515 -0.05566208  2.33214066  2.28815193  2.77754649  1.44481129
   3.11633483  2.1577444   1.97836396  0.81804077]
 [ 2.98937576 -0.19061545  2.6559232   3.06079214  3.52813678  1.40267777
   2.1577444   4.90384524  3.32417384  1.11364756]


### Optimization

Now it is time to optimize our portfolio using this formulation:

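$$\min_w \; w \Sigma w^T$$
$$\text{s.t.} \; \sum_i w_i = 1$$

Here $w$ is the row vector of portfolio weights and $\Sigma$ is the covariance matrix of the returns. We will also bound the weights to [0, 1] when shorting is not allowed.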
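
For intuition: when shorting is allowed and the budget constraint is the only restriction, this problem has a well-known closed-form solution in which the weights are proportional to $\Sigma^{-1}\mathbf{1}$ and rescaled to sum to 1. The sketch below assumes an invertible covariance matrix. The helper name `closed_form_min_variance` and the toy matrix are illustrative and not part of the notebook.

```python
import numpy as np

def closed_form_min_variance(cov):
    """Hypothetical helper: weights proportional to inv(cov) @ 1, rescaled to sum to 1."""
    cov = np.asarray(cov, dtype=float)
    ones = np.ones(cov.shape[0])
    # Solve cov @ z = 1 rather than forming the inverse explicitly
    z = np.linalg.solve(cov, ones)
    return z / z.sum()

# Toy example: two uncorrelated assets with variances 1 and 4
toy_cov = np.array([[1.0, 0.0], [0.0, 4.0]])
w = closed_form_min_variance(toy_cov)
print(w)  # [0.8 0.2]
```

The lower-variance asset receives the larger weight. This shortcut does not apply once the [0, 1] bounds are active, which is one reason to use a numerical solver.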

In [5]:
# Objective function
def portfolioVar(w, cov):
    w = np.asarray(w) # Convert w and cov into arrays for easy linear algebra
    cov = np.asarray(cov)
    result = w @ cov @ w # Calculate the objective function as a scalar
    return result

# Constraint
constraint = ({'type': 'eq', 'fun': lambda x:  np.sum(x) - 1}) # The sum of the array weights is equal to 1

# Bounds
if allow_short:
    bounds = tuple((-np.inf, np.inf) for _ in tickers)
else:
    bounds = tuple((0, 1) for _ in tickers)

Now that we have our objective function, constraint, and bounds, the last thing we need is an initial guess. The guess doesn't need to be good, only feasible, so we will use equal weights for all of the stocks: they sum to 1 and each lies within [0, 1].
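
A minimal sketch of what "feasible" means here: the weights must sum to 1 and each must lie within its bounds. The helper `is_feasible` and its tolerance are assumptions made for illustration only.

```python
import numpy as np

def is_feasible(w, bounds, tol=1e-9):
    """Hypothetical helper: True if w sums to 1 and every weight lies within its bounds."""
    w = np.asarray(w, dtype=float)
    sums_to_one = abs(w.sum() - 1.0) <= tol
    within_bounds = all(lo <= wi <= hi for wi, (lo, hi) in zip(w, bounds))
    return bool(sums_to_one and within_bounds)

n = 10
equal_weights = [1 / n] * n
long_only_bounds = [(0, 1)] * n

print(is_feasible(equal_weights, long_only_bounds))  # True
print(is_feasible([0.5] * n, long_only_bounds))      # False: weights sum to 5
```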

The following code creates that initial guess, then solves for the optimal stock weights. Finally, it prints out the weights.

In [6]:

# Initial guess, equal weights
w0 = [1 / len(tickers)] * len(tickers)

result = optimize.minimize(
    fun = portfolioVar, # Objective function
    x0 = w0, # Initial guess
    args = (covariance,), # Extra arguments for the objective function (the covariance matrix), passed as a tuple
    method = 'SLSQP', # Sequential least squares programming method
    bounds = bounds, # Bounds
    constraints = constraint # Constraint
)

# Round the optimal result
result.x = result.x.round(5)

# Print the optimal weight of each stock as a percentage of the portfolio
for ticker, weight in zip(tickers, result.x):
    print(f'{ticker} = {weight:.3%}')

AAPL = -11.261%
2222.SR = 46.779%
MSFT = -2.431%
GOOG = 0.222%
AMZN = -3.192%
BRK-B = 46.165%
V = 10.619%
TSM = 8.869%
TCEHY = 0.947%
XOM = 3.285%
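
To illustrate how the `allow_short` setting changes the answer, here is a small self-contained sketch on a made-up two-asset covariance matrix (the numbers are hypothetical). It uses the same SLSQP setup as above, with `(None, None)` standing in for unbounded weights. The function names `portfolio_variance` and `solve` are illustrative.

```python
import numpy as np
from scipy import optimize

def portfolio_variance(w, cov):
    """Objective: w @ cov @ w, returned as a plain float."""
    w = np.asarray(w, dtype=float)
    return float(w @ cov @ w)

# Hypothetical covariance: variances 1 and 4, covariance 1.8 (highly correlated assets)
toy_cov = np.array([[1.0, 1.8], [1.8, 4.0]])
budget = {"type": "eq", "fun": lambda w: np.sum(w) - 1}
w0 = [0.5, 0.5]  # Equal-weight initial guess

def solve(bounds):
    """Hypothetical wrapper around optimize.minimize for this toy problem."""
    res = optimize.minimize(
        portfolio_variance, w0, args=(toy_cov,),
        method="SLSQP", bounds=bounds, constraints=budget,
    )
    return res.x

w_short = solve([(None, None)] * 2)  # Shorting allowed
w_long = solve([(0, 1)] * 2)         # Long only
print(w_short.round(4))
print(w_long.round(4))
```

With shorting allowed, the solver should short the riskier asset to hedge the safer one (analytically, the first weight is 2.2 / 1.4, about 1.57). With long-only bounds, the whole budget goes to the lower-variance asset. The negative weights for AAPL, MSFT, and AMZN in the result above arise the same way.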
