### Group Assignment
### Team Number: 3
### Team Member Names: Derek Tan, Jeff Peng, Yuqian Lin
### Team Strategy Chosen: SAFE

### Abstract

Our portfolio optimization strategy applies Modern Portfolio Theory (MPT) and an analysis of Efficient Frontier graphs. The objective is to maximize the portfolio return while keeping portfolio risk to a minimum.

Modern Portfolio Theory assumes that all investors are risk-averse: when comparing possible portfolio allocations, an investor will prefer the portfolio that maximizes the expected return for a given amount of risk.

The Efficient Frontier (EF), the core of our strategy, was introduced by Nobel Laureate Harry Markowitz and is fundamental to MPT. The EF is a graph that illustrates all possible portfolio allocations. The x-axis represents the volatility/risk of the portfolio, while the y-axis represents the expected return of the portfolio.

The Efficient Frontier shows the optimized portfolios that offer the highest expected return for a given level of risk and the lowest level of risk for a given level of expected return.

An example of an Efficient Frontier graph is shown below:

![EF Graph](ef.png)

As seen in the graph, the light blue dot is the portfolio with the highest level of risk and the highest return. Conversely, the left-most purple dot is the portfolio with the lowest level of risk and the lowest return. Typically, risk-seeking investors select portfolios toward the right end, as they offer a higher return for a higher level of risk. Because our group chose the "safe" strategy, we will select the portfolio at the left end of the graph, as it offers a lower return for a lower level of risk.

Below, we discuss how we plotted each portfolio on the EF graph.

In [2]:
from IPython.display import display, Math, Latex
from datetime import datetime

import pandas as pd
import numpy as np
import numpy_financial as npf
import yfinance as yf
import matplotlib.pyplot as plt

In [3]:
# Import Financial Data

tickers = pd.read_csv("Tickers.csv", index_col=False)

start_date = "2018-01-01"
end_date = "2021-10-31"

# Hard-coded ticker list; this replaces the value read from Tickers.csv above
tickers = ["CSCO", "TGT", "BK", "MRK", "PFE", "COP", "LLY", "CL", "GOOG", "COF"]

data = yf.download(tickers, start=start_date, end=end_date)

closing_prices = data["Adj Close"]

closing_prices.head()

[*********************100%***********************]  10 of 10 completed


Date,BK,CL,COF,COP,CSCO,GOOG,LLY,MRK,PFE,TGT
2018-01-02,48.540565,68.314125,93.266144,49.569435,34.448799,1065.0,78.010117,47.838715,29.757999,61.324654
2018-01-03,48.838196,68.050461,93.106972,50.483414,34.723602,1082.47998,78.433876,47.770645,29.978493,60.907539
2018-01-04,49.325222,68.486862,94.960876,51.065853,34.821842,1086.400024,78.783958,48.544983,30.043821,59.710602
2018-01-05,49.451504,68.568687,94.97023,50.967274,35.304111,1102.22998,79.751236,48.493931,30.10099,60.345341
2018-01-08,49.658932,68.668709,94.380363,51.486992,35.670284,1106.939941,79.345901,48.213123,29.766172,60.916607


To compare price fluctuations across stocks, we calculate the daily percent change in the price of each stock. Expressing each move relative to the stock's own price removes the price level from the comparison, so only the relative magnitude of each movement is considered. The code below then converts each percent change $x$ into a log return, $\ln(1+x)$.
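
As a small illustration of why relative changes are easier to compare than raw prices, the sketch below uses two made-up price series (the tickers `AAA` and `BBB` and their prices are hypothetical, not market data). Both move by the same percentage each day even though their price levels differ by a factor of 100.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for two made-up tickers (not real market data)
toy_prices = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.1], "BBB": [1000.0, 1100.0, 1210.0]}
)

# Simple daily percent change: (P_t - P_{t-1}) / P_{t-1}
toy_pct = toy_prices.pct_change()

# Log return: ln(1 + percent change), which equals ln(P_t / P_{t-1})
toy_log = np.log(1 + toy_pct)

print(toy_pct)
print(toy_log)
```

The first row is `NaN` in both tables because the first day has no previous price to compare against.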

In [4]:
# Calculate daily percent change, then convert it to a log return

percent_change = closing_prices.pct_change().apply(lambda x: np.log(1+x))

percent_change.head()

Date,BK,CL,COF,COP,CSCO,GOOG,LLY,MRK,PFE,TGT
2018-01-02,,,,,,,,,,
2018-01-03,0.006113,-0.003867,-0.001708,0.01827,0.007945,0.01628,0.005417,-0.001424,0.007382,-0.006825
2018-01-04,0.009923,0.006392,0.019716,0.011471,0.002825,0.003615,0.004453,0.01608,0.002177,-0.019847
2018-01-05,0.002557,0.001194,9.8e-05,-0.001932,0.013755,0.014466,0.012203,-0.001052,0.001901,0.010574
2018-01-08,0.004186,0.001458,-0.00623,0.010145,0.010319,0.004264,-0.005095,-0.005807,-0.011185,0.009422


### Constructing the Efficient Frontier Graph
To construct an Efficient Frontier graph, we require three inputs:
- Covariance of the securities in the portfolio
- Standard deviation, also known as risk
- The expected return of the portfolio

Below, we calculate all three.

#### Covariance

We will now analyze the covariance of each stock in relation to one another. The covariance of two stocks (stock X, stock Y) is calculated using the following equation:

\begin{align*}
COV(X,Y)=\frac{\sum(x_i-\overline{X})\times(y_i-\overline{Y})}{N}
\end{align*}

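We will store the results of the covariance calculations in `cov_matrix`. Note that the pandas `cov()` method divides by $N-1$ rather than $N$; with this many daily observations the difference is negligible.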
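
The relationship between the two divisors can be checked on a small example. The sketch below uses hypothetical returns for two made-up securities `X` and `Y`; it computes the covariance by hand with the equation above (dividing by $N$) and compares it with the pandas default, which divides by $N-1$.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns for two made-up securities
toy_returns = pd.DataFrame(
    {"X": [0.01, -0.02, 0.03, 0.00], "Y": [0.02, -0.01, 0.01, -0.02]}
)

n = len(toy_returns)
dev_x = toy_returns["X"] - toy_returns["X"].mean()
dev_y = toy_returns["Y"] - toy_returns["Y"].mean()

# Population covariance: divide by N, as in the equation above
cov_population = (dev_x * dev_y).sum() / n

# Sample covariance: pandas divides by N - 1 by default
cov_sample = toy_returns.cov().loc["X", "Y"]

print(cov_population, cov_sample)
```

The two values differ by a factor of $N/(N-1)$, which approaches 1 as the number of observations grows.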

In [5]:
cov_matrix = percent_change.cov()

cov_matrix

Ticker,BK,CL,COF,COP,CSCO,GOOG,LLY,MRK,PFE,TGT
BK,0.000417,9.5e-05,0.000404,0.000355,0.0002,0.000171,0.000116,0.000127,0.000138,0.000133
CL,9.5e-05,0.000193,0.000122,0.000104,0.000119,0.000104,0.00011,0.0001,9.5e-05,9.6e-05
COF,0.000404,0.000122,0.000764,0.000514,0.000259,0.000241,0.000127,0.000157,0.000168,0.000174
COP,0.000355,0.000104,0.000514,0.00084,0.000255,0.000238,0.000149,0.000151,0.000148,0.000152
CSCO,0.0002,0.000119,0.000259,0.000255,0.000345,0.000206,0.000155,0.000134,0.000151,0.000151
GOOG,0.000171,0.000104,0.000241,0.000238,0.000206,0.000345,0.00013,0.000113,0.00012,0.00012
LLY,0.000116,0.00011,0.000127,0.000149,0.000155,0.00013,0.00036,0.00015,0.00016,0.000113
MRK,0.000127,0.0001,0.000157,0.000151,0.000134,0.000113,0.00015,0.000219,0.00014,9.2e-05
PFE,0.000138,9.5e-05,0.000168,0.000148,0.000151,0.00012,0.00016,0.00014,0.000242,9.4e-05
TGT,0.000133,9.6e-05,0.000174,0.000152,0.000151,0.00012,0.000113,9.2e-05,9.4e-05,0.000378


#### Standard Deviation

The standard deviation (risk) of a portfolio depends on how its stocks move relative to one another, so we first measure the correlation between each pair of stocks.

To do this, we will use a correlation matrix.

The correlation of two stocks (stock X, stock Y) is calculated using the following equation:

\begin{align*}
\rho(X,Y)=\frac{COV(X,Y)}{\sigma_X \sigma_Y}
\end{align*}

Where $\rho(X,Y)$ is the correlation between the two stocks, $COV(X,Y)$ is the covariance of the returns of X and Y, and $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y respectively.

Note that each stock has a correlation of 1 with itself, a perfect positive correlation.

There exists a positive correlation between stocks X and Y if $0 < \rho(X,Y) \le 1$.

There exists a negative (inverse) correlation between stocks X and Y if $-1 \le \rho(X,Y) < 0$.

There exists no (zero) correlation between stocks X and Y if $\rho(X,Y) = 0$. In reality, it is almost impossible for two stocks to have exactly zero correlation with each other.
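
To connect the correlation equation to the covariance matrix, the sketch below rebuilds a correlation matrix by dividing each covariance by the product of the two standard deviations, then compares it with the result of pandas' `corr()`. The returns for `X` and `Y` are hypothetical values chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical daily returns for two made-up securities
toy_returns = pd.DataFrame(
    {"X": [0.01, -0.02, 0.03, 0.00], "Y": [0.02, -0.01, 0.01, -0.02]}
)

toy_cov = toy_returns.cov()
toy_std = toy_returns.std()

# Entry (i, j) of the outer product is sigma_i * sigma_j
toy_corr_manual = toy_cov / np.outer(toy_std, toy_std)

# pandas computes the same matrix directly
toy_corr_pandas = toy_returns.corr()

print(toy_corr_manual)
print(toy_corr_pandas)
```

The diagonal entries equal 1 because the covariance of a stock with itself is its variance, $\sigma_X^2$.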

We will store the results of the correlation calculations in 'corr_matrix'.

In [6]:
corr_matrix = percent_change.corr()

corr_matrix

Ticker,BK,CL,COF,COP,CSCO,GOOG,LLY,MRK,PFE,TGT
BK,1.0,0.33369,0.715997,0.59989,0.528991,0.452034,0.299603,0.421078,0.435655,0.336288
CL,0.33369,1.0,0.318425,0.259415,0.460421,0.404164,0.41931,0.484883,0.438185,0.356295
COF,0.715997,0.318425,1.0,0.641595,0.50397,0.468597,0.242755,0.385147,0.390804,0.322902
COP,0.59989,0.259415,0.641595,1.0,0.474678,0.441667,0.27116,0.35271,0.329132,0.269761
CSCO,0.528991,0.460421,0.50397,0.474678,1.0,0.598118,0.441088,0.488685,0.520996,0.416975
GOOG,0.452034,0.404164,0.468597,0.441667,0.598118,1.0,0.368269,0.410805,0.414551,0.331866
LLY,0.299603,0.41931,0.242755,0.27116,0.441088,0.368269,1.0,0.534484,0.541516,0.306423
MRK,0.421078,0.484883,0.385147,0.35271,0.488685,0.410805,0.534484,1.0,0.607066,0.31852
PFE,0.435655,0.438185,0.390804,0.329132,0.520996,0.414551,0.541516,0.607066,1.0,0.311788
TGT,0.336288,0.356295,0.322902,0.269761,0.416975,0.331866,0.306423,0.31852,0.311788,1.0


#### Expected Return
Finally, we will calculate the expected return of each security, which is later weighted to give the expected return of each portfolio. The expected return of a security is calculated by the equation below:

\begin{align*}
E(X)=\overline{X}=\frac{\sum x_i}{N}
\end{align*}

where $x_i$ are the individual returns of some security $X$ and $N$ is the total number of observations (time periods, in our case).
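
As a worked instance of this equation, the sketch below uses hypothetical year-opening prices for one made-up security. The yearly returns are +10%, -10% and +20%, so the expected return is their arithmetic mean.

```python
import pandas as pd

# Hypothetical first-trading-day prices for a made-up security, one per year
toy_year_open = pd.Series(
    [100.0, 110.0, 99.0, 118.8],
    index=[2018, 2019, 2020, 2021],
)

# Year-over-year returns x_i: +10%, -10%, +20%
toy_yearly_returns = toy_year_open.pct_change().dropna()

# Expected return: sum of the observed returns divided by their count N
toy_expected_return = toy_yearly_returns.sum() / len(toy_yearly_returns)

print(toy_expected_return)
```

Four prices yield only three returns, so $N = 3$ here.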

In [101]:
# Calculate Yearly Expected Returns (Returns)

individual_expected_returns = closing_prices.resample('Y').first().pct_change().mean()

yearly_stats = pd.DataFrame(individual_expected_returns, columns=['Returns'])

# Calculate Annual Standard Deviation (Volatility)

trading_days = 250

annual_standard_deviation = percent_change.std().apply(lambda x: x * np.sqrt(trading_days))

yearly_stats['Volatility'] = annual_standard_deviation

yearly_stats

Ticker,Returns,Volatility
BK,-0.051882,0.322761
CL,0.084679,0.219588
COF,0.037188,0.437067
COP,-0.049693,0.458348
CSCO,0.082123,0.293571
GOOG,0.184453,0.293741
LLY,0.281487,0.299857
MRK,0.181363,0.233781
PFE,0.066535,0.246056
TGT,0.470211,0.30739
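
The portfolio-level quantities computed in `generate_portfolios` can be written out explicitly. The notation here is introduced for illustration and is not taken from the course notes: $w_i$ is the weight of stock $i$ (with $\sum_i w_i = 1$), $E(R_i)$ is its expected yearly return, $COV(R_i,R_j)$ is the daily covariance between stocks $i$ and $j$, and $T$ is the number of trading days per year (250 in this notebook).

\begin{align*}
E(R_p) &= \sum_{i=1}^{n} w_i \, E(R_i) \\
\sigma_p^2 &= \sum_{i=1}^{n} \sum_{j=1}^{n} w_i \, w_j \, COV(R_i,R_j) \\
\sigma_{p,\text{annual}} &= \sigma_p \sqrt{T}
\end{align*}

The $\sqrt{T}$ scaling rests on the usual simplifying assumption that daily returns are uncorrelated from one day to the next.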


In [170]:
# Change this number to change the number of randomly generated portfolios
# The more random portfolios are generated, the closer the best one found
#   is likely to be to the true optimal portfolio

number_of_portfolios = 10

In [179]:
# Generate portfolios with random weights


def generate_portfolios(tickers, number_of_portfolios):
    """
    Generates a collection of [number_of_portfolios] portfolios with
    random weights from the list of [tickers].

    Params:
        tickers (listof Str): List of stock tickers to choose from
        number_of_portfolios (Nat): Number of portfolios to generate

    Returns:
        (weights, returns, volatility): three lists with one entry
        per generated portfolio
    """
    weights = []
    returns = []
    volatility = []

    for i in range(0, number_of_portfolios):
        individual_weights = np.random.random(len(tickers))
        individual_weights = individual_weights / np.sum(individual_weights)
        weights.append(individual_weights)

        individual_returns = np.dot(individual_weights, yearly_stats.Returns)
        returns.append(individual_returns)

        portfolio_variance = (
            cov_matrix.mul(individual_weights, axis=0)
            .mul(individual_weights, axis=1)
            .sum()
            .sum()
        )
        standard_deviation = np.sqrt(portfolio_variance)
        individual_volatility = standard_deviation * np.sqrt(trading_days)
        volatility.append(individual_volatility)

    return weights, returns, volatility


generate_portfolios(tickers, number_of_portfolios)

([array([0.172325  , 0.16624943, 0.12115664, 0.15779976, 0.02394793,
         0.0897988 , 0.05208857, 0.16993522, 0.01299221, 0.03370644]),
  array([0.14281504, 0.0212789 , 0.11969673, 0.04257229, 0.12813155,
         0.11423564, 0.02817855, 0.17132836, 0.20466243, 0.02710051]),
  array([0.05123614, 0.14917657, 0.16332987, 0.02969551, 0.12991149,
         0.08201924, 0.16726944, 0.14183297, 0.08478805, 0.00074071]),
  array([0.04626095, 0.00135806, 0.14740552, 0.12170638, 0.01263634,
         0.15165199, 0.20184607, 0.04825034, 0.06623792, 0.20264644]),
  array([0.07248578, 0.07555458, 0.06327082, 0.14239242, 0.02501378,
         0.09706651, 0.08846755, 0.15859535, 0.16549601, 0.1116572 ]),
  array([0.11831994, 0.01193679, 0.10354302, 0.20492493, 0.07134795,
         0.17383441, 0.00198978, 0.07826754, 0.20499572, 0.03083991]),
  array([0.17337184, 0.04307985, 0.0597589 , 0.10158822, 0.13526352,
         0.1616948 , 0.06605498, 0.06407644, 0.11393107, 0.08118038]),
  array([0.01026205,
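
One possible way to pick the "safe" portfolio from a set of randomly generated ones is to take the one with the lowest annualized volatility. The sketch below is a vectorized NumPy variant of the same idea, not the notebook's own function; the three assets, their expected returns, and the covariance matrix are hypothetical values chosen for illustration.

```python
import numpy as np

# Seeded generator so the example is reproducible
rng = np.random.default_rng(0)

# Hypothetical inputs for three made-up assets (not the notebook's data)
toy_expected_returns = np.array([0.05, 0.10, 0.15])
toy_daily_cov = np.array([
    [0.00010, 0.00002, 0.00001],
    [0.00002, 0.00020, 0.00003],
    [0.00001, 0.00003, 0.00040],
])
toy_trading_days = 250

# Draw random weights and normalize each row so it sums to 1
toy_weights = rng.random((500, 3))
toy_weights /= toy_weights.sum(axis=1, keepdims=True)

# Portfolio return: weighted sum of the expected returns
toy_port_returns = toy_weights @ toy_expected_returns

# Portfolio variance w^T C w for every row, then annualized volatility
toy_port_var = np.einsum("ij,jk,ik->i", toy_weights, toy_daily_cov, toy_weights)
toy_port_vol = np.sqrt(toy_port_var * toy_trading_days)

# "Safe" choice: the generated portfolio with the lowest volatility
safest = int(np.argmin(toy_port_vol))
print(toy_weights[safest], toy_port_returns[safest], toy_port_vol[safest])
```

Because the weights are sampled at random, the selected portfolio only approximates the true minimum-volatility allocation; generating more portfolios tightens the approximation.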

## Contribution Declaration

The following team members made a meaningful contribution to this assignment:

Derek, Yuqian, Jeff

### Sources

- Image: https://www.cryptimi.com/guides/is-diversification-the-right-strategy-for-your-cryptocurrency-portfolio
- Equations: Professor Thompson's notes
- Definition of EF: https://www.investopedia.com/terms/e/efficientfrontier.asp
- Definition of MPT: https://www.investopedia.com/terms/m/modernportfoliotheory.asp
