## Constructing a Minimum-Variance Portfolio

First, we need to import all of the libraries that we'll be using:

* NumPy for linear algebra
* pandas for dataframes and variance/covariance calculations
* yfinance for financial data
* scipy.optimize for solving for the optimal stock weights

In [1]:
# Imports
import numpy as np
import pandas as pd
import yfinance as yf
from scipy import optimize

### Collecting and Preparing the Data

In this example, we'll build our portfolio from the 10 largest US stocks by market capitalization. To change the selection, add or remove entries in the tickers list. The other parameters that can be tuned are:

* period: how far back we will go when calculating returns, variances, and covariances.
* interval: the frequency at which returns are calculated (e.g. daily returns, monthly returns, etc.).
* allow_short: whether short selling is allowed. When it is, we'll assume that shorting is free.

The following code will download a dataframe with the closing price for each of the selected stocks with the given parameters. Then, it will print out the first five rows to verify that the correct data has been downloaded.

In [2]:
### USER PARAMETERS ###

tickers = [
    'AAPL',
    'MSFT',
    'GOOG',
    'AMZN',
    'BRK-B',
    'V',
    'XOM',
    'UNH',
    'JNJ',
    'NVDA',
]

period = '1y' # Options are 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, and max

interval = '1d' # Options are 1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, and 3mo

allow_short = True

#######################

# Build one (ticker, 'Close') column key per stock, e.g. ('AAPL', 'Close'), so that we can select only the closing prices
fields = [(ticker, 'Close') for ticker in tickers]

# Download the data with the user parameters
data = yf.download(
    tickers = tickers,
    period = period,
    interval = interval,
    group_by = 'ticker'
)[fields]

# Print the first five rows
print(data.head())

[*********************100%***********************]  10 of 10 completed
                  AAPL        MSFT        GOOG        AMZN       BRK-B  \
                 Close       Close       Close       Close       Close   
Date                                                                     
2022-01-19  166.229996  303.329987  135.651993  156.298996  314.750000   
2022-01-20  164.509995  301.600006  133.506500  151.667496  311.010010   
2022-01-21  162.410004  296.029999  130.091995  142.643005  305.220001   
2022-01-24  161.619995  296.369995  130.371994  144.544006  303.730011   
2022-01-25  159.779999  288.489990  126.735497  139.985992  307.190002   

                     V        XOM         UNH         JNJ        NVDA  
                 Close      Close       Close       Close       Close  
Date                                                                   
2022-01-19  214.679993  73.110001  462.519989  166.580002  250.669998  
2022-01-20  214.350006  73.269997  463.000000  1

Since we're interested in the stocks' returns, not their closing prices, we will create a new dataframe that holds the percent change of each row relative to the previous one. The first row will be dropped, since it has no previous row to compare against.

In [3]:
# Find the percent return for each row (except the first one, which has no previous price) and store it in a new dataframe
returns = data.pct_change().dropna() * 100

# Print the first five rows to verify
print(returns.head())

                AAPL      MSFT      GOOG      AMZN     BRK-B         V  \
               Close     Close     Close     Close     Close     Close   
Date                                                                     
2022-01-20 -1.034712 -0.570330 -1.581615 -2.963231 -1.188242 -0.153711   
2022-01-21 -1.276513 -1.846819 -2.557557 -5.950181 -1.861679 -3.928161   
2022-01-24 -0.486429  0.114852  0.215231  1.332698 -0.488169 -1.981249   
2022-01-25 -1.138471 -2.658840 -2.789324 -3.153375  1.139167  0.074310   
2022-01-26 -0.056325  2.849319  1.976170 -0.795433  0.673852  1.915839   

                 XOM       UNH       JNJ      NVDA  
               Close     Close     Close     Close  
Date                                                
2022-01-20  0.218843  0.103782 -0.798416 -3.658195  
2022-01-21 -1.501295 -0.395245 -0.229958 -3.213248  
2022-01-24  0.859087  0.238518 -1.152419 -0.008558  
2022-01-25  2.939963 -1.174637  2.859424 -4.483996  
2022-01-26 -1.014283  0.348042  0.44
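
As a quick illustration of what `pct_change()` and `dropna()` do in the cell above, the sketch below uses a made-up one-column price table (the `TOY` ticker and its prices are hypothetical, not downloaded data). The first row has no previous price, so its return is undefined and gets dropped.

```python
import pandas as pd

# Hypothetical closing prices for a single made-up ticker
prices = pd.DataFrame({"TOY": [100.0, 102.0, 99.96]})

# pct_change compares each row with the previous one,
# so the first row has nothing to compare against and becomes NaN
raw = prices.pct_change()

# Drop the undefined first row and scale to percent, as in the cell above
toy_returns = raw.dropna() * 100

print(raw)
print(toy_returns)
```

The two remaining rows should come out at roughly +2 and -2 percent, one row fewer than the price table.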

### Covariance Matrix

Before we can calculate and optimize the portfolio variance, we need the covariance matrix of the stocks' returns. The following code will calculate it and store it in a NumPy array for the linear algebra that follows.

In [4]:
# Create covariance matrix and convert from pandas dataframe to NumPy array
covariance = returns.cov().to_numpy()

# Print to verify
print(covariance)

[[ 5.09994322  4.08001946  4.30287533  4.93906319  2.24152603  3.13207822
   1.4389114   1.65839801  0.90711647  6.84604703]
 [ 4.08001946  4.97729435  4.58205068  5.27923042  2.0698116   2.88514128
   1.12999834  1.6034463   0.77201349  6.87473508]
 [ 4.30287533  4.58205068  5.95079833  5.69039874  2.19840703  2.8309998
   1.10278706  1.45627268  0.75248208  7.35408776]
 [ 4.93906319  5.27923042  5.69039874 10.22353071  2.60130932  3.43645993
   1.6894665   1.53717236  0.80132641  8.90690433]
 [ 2.24152603  2.0698116   2.19840703  2.60130932  2.01140718  1.78714701
   1.32840908  1.14955998  0.78742188  3.34340322]
 [ 3.13207822  2.88514128  2.8309998   3.43645993  1.78714701  3.85459487
   1.01086043  1.23674981  0.71136204  4.8838375 ]
 [ 1.4389114   1.12999834  1.10278706  1.6894665   1.32840908  1.01086043
   4.82044765  0.8850761   0.25158663  2.00515245]
 [ 1.65839801  1.6034463   1.45627268  1.53717236  1.14955998  1.23674981
   0.8850761   2.35102423  0.94799552  2.09878012]
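
To see why the covariance matrix is all the optimizer needs, here is a minimal sketch on randomly generated returns (the columns `A`, `B`, `C` and the weights are hypothetical). It checks that the quadratic form built from the covariance matrix matches the sample variance of the weighted portfolio return series.

```python
import numpy as np
import pandas as pd

# Hypothetical returns: 250 periods for three made-up assets
rng = np.random.default_rng(0)
toy_returns = pd.DataFrame(rng.normal(size=(250, 3)), columns=["A", "B", "C"])

# Hypothetical portfolio weights that sum to 1
w = np.array([0.5, 0.3, 0.2])

# Portfolio variance from the covariance matrix (pandas uses ddof=1)
cov = toy_returns.cov().to_numpy()
var_from_cov = w @ cov @ w

# Portfolio variance computed directly from the weighted return series
portfolio_returns = toy_returns.to_numpy() @ w
var_direct = portfolio_returns.var(ddof=1)

print(var_from_cov, var_direct)
print(np.isclose(var_from_cov, var_direct))
```

Because the two numbers agree for any weight vector, the optimizer never has to touch the raw return series again.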
 

### Optimization

Now it is time to optimize our portfolio using this formulation:

$$\min_{w} \; w \Sigma w^T$$
$$\text{s.t.} \; \sum_i w_i = 1$$

Here, $w$ is the row vector of stock weights and $\Sigma$ is the covariance matrix of the returns. We will also bound the weights to [0, 1] when shorting is not allowed.

In [5]:
# Objective function
def portfolioVar(w, cov):
    w = np.asarray(w) # Treat the weights as a 1-D array (np.matrix is no longer recommended)
    cov = np.asarray(cov)
    return w @ cov @ w # Portfolio variance as a scalar, which is what the optimizer expects

# Constraint
constraint = ({'type': 'eq', 'fun': lambda x:  np.sum(x) - 1}) # The sum of the array weights is equal to 1

# Bounds: unbounded weights when shorting is allowed, otherwise each weight stays in [0, 1]
if allow_short:
    bounds = tuple((-np.inf, np.inf) for _ in tickers)
else:
    bounds = tuple((0, 1) for _ in tickers)
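
When shorting is allowed and the only constraint is that the weights sum to 1, this problem also has a well-known closed-form solution, $w^* = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^T \Sigma^{-1} \mathbf{1})$, which can serve as a sanity check on a numerical solver. The sketch below evaluates it on a small made-up covariance matrix (`toy_cov` is hypothetical, not the matrix computed above).

```python
import numpy as np

# Hypothetical 3-asset covariance matrix (symmetric, positive definite)
toy_cov = np.array([
    [4.0, 1.0, 0.5],
    [1.0, 2.0, 0.3],
    [0.5, 0.3, 1.0],
])
ones = np.ones(len(toy_cov))

# Solve Sigma x = 1 instead of forming the inverse explicitly,
# then rescale so that the weights sum to 1
x = np.linalg.solve(toy_cov, ones)
w_closed = x / x.sum()

# Compare against an equal-weight portfolio
w_equal = ones / len(ones)
var_closed = w_closed @ toy_cov @ w_closed
var_equal = w_equal @ toy_cov @ w_equal

print('closed-form weights:', w_closed.round(4))
print('variance (closed form):', round(var_closed, 4))
print('variance (equal weights):', round(var_equal, 4))
```

The closed form does not apply once the [0, 1] bounds are active, which is one reason to use a general-purpose solver for both cases.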

Now that we have our objective function, constraint, and bounds, the last thing we need is an initial guess. It doesn't need to be a good guess, only a feasible one, so we will use equal weights for all of the stocks.

The following code creates that initial guess, then solves for the optimal stock weights. Finally, it prints out the weights.

In [6]:

# Initial guess, equal weights
w0 = [1 / len(tickers)] * len(tickers)

result = optimize.minimize(
    fun = portfolioVar, # Objective function
    x0 = w0, # Initial guess
    args = (covariance,), # Extra arguments for the objective function (the covariance matrix), passed as a tuple
    method = 'SLSQP', # Sequential least squares programming method
    bounds = bounds, # Bounds
    constraints = constraint # Constraint
)

# Round the optimal weights
weights = result.x.round(5)

# Print the optimal weight of each stock as a percentage of the portfolio
for ticker, weight in zip(tickers, weights):
    print(f'{ticker} = {weight:.3%}')

AAPL = -11.603%
MSFT = 5.872%
GOOG = 6.806%
AMZN = -2.426%
BRK-B = 12.957%
V = 10.708%
XOM = 13.723%
UNH = 4.627%
JNJ = 63.966%
NVDA = -4.630%
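
To get a feel for how the allow_short switch changes the answer, the following sketch runs the same SLSQP setup on a made-up two-asset covariance matrix, once with unbounded weights and once with weights restricted to [0, 1]. The names `toy_cov`, `toy_var`, and `solve` are hypothetical helpers for this illustration only.

```python
import numpy as np
from scipy import optimize

# Hypothetical covariance matrix for two highly correlated assets
toy_cov = np.array([
    [4.0, 3.0],
    [3.0, 2.5],
])

def toy_var(w, cov):
    # Portfolio variance as a scalar
    return w @ cov @ w

# Weights must sum to 1
sum_to_one = {'type': 'eq', 'fun': lambda w: np.sum(w) - 1}

# Equal-weight initial guess
w0 = np.full(2, 0.5)

def solve(bounds):
    # Same solver and constraint each time; only the bounds differ
    return optimize.minimize(
        toy_var, w0, args=(toy_cov,), method='SLSQP',
        bounds=bounds, constraints=sum_to_one,
    )

short_ok = solve([(None, None)] * 2) # Shorting allowed: no bounds
long_only = solve([(0, 1)] * 2) # Long only: each weight in [0, 1]

for label, res in [('short allowed', short_ok), ('long only', long_only)]:
    print(label, res.success, res.x.round(4), round(res.fun, 4))
```

For this particular matrix, the analytic optimum with shorting is (-1, 2) with variance 2, while the long-only optimum puts everything in the second asset (variance 2.5), so the solver output should land close to those values. Checking `res.success` before trusting the weights is a cheap safeguard in either case.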
