# 1. Lecture overview

We start by thinking about future returns. We model them as random variables, and we are interested in estimating the expectation and volatility of their probability distributions. We will use the sample mean to estimate the expectation (i.e. the expected return) and the sample standard deviation to estimate the volatility (i.e. total risk). We then introduce the concept of a risk premium and show how investments should be assessed based on the risk-return tradeoff they offer.

- Using sample means and standard deviations to estimate expected return and risk
- Risk-free rate, excess returns, and risk premia
- Mean variance analysis and the Sharpe Ratio

# 2. Sample mean and standard deviation

Investments are about **future** rewards. We hope that we have found an investment that will give us high rewards, but because we can never perfectly predict the future, there will always be some amount of uncertainty associated with any investment. Hence, to give ourselves a chance at some **expected return**, we must expose ourselves to some amount of **risk** that we may lose money on that investment. The goal is to find investments which we *believe* (we cannot know) will provide us with a good risk-return tradeoff: a large expected return per unit of risk. We will see below that this is called the **Sharpe ratio** of that investment.

So, to make good investment decisions, we must find a way to measure (estimate) the risk-return tradeoff they offer. As explained above, this entails estimating the risk and expected return of that investment. We do so by thinking about future returns on that investment as random variables: the future return is an unknown (random) quantity which takes possible values according to some probability distribution. The mean of that probability distribution is the expected return on the investment, and the standard deviation of that distribution is a good measure of the risk of the investment because it quantifies how uncertain we would be if we were to predict future returns. 

The mean and standard deviation of future returns are unknown quantities because future returns are unknown quantities. To estimate them, we must assume that *past* returns came from the same probability distribution as future returns. If that is a good assumption, then we can use data on past returns to estimate the mean and standard deviation of future returns.

Specifically, if we have $N$ observations of past returns on a given investment (call these returns $R_{1}$, $R_{2}$, ... , $R_{N}$), then: 

- The **sample mean** is given by:

$$\mu = \frac{R_1 + R_2 + ... + R_N}{N}$$

- The **sample standard deviation** is given by:

$$\sigma = \sqrt{\frac{(R_1 - \mu)^2 + (R_2 - \mu)^2 + ... + (R_N - \mu)^2}{N-1}}$$

Another variable that we often encounter is the **sample variance** which is just the square of the sample standard deviation: $Variance = \sigma^2$
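
As a minimal sketch of the formulas above, Python's standard `statistics` module computes the same quantities: `mean` is the sample mean, while `stdev` and `variance` divide by $N-1$ as in the formula for $\sigma$. The return values below are made up purely for illustration. The sketch also shows `pstdev`, which divides by $N$ instead and is therefore not the estimator defined above.

```python
import statistics

# Hypothetical past returns (illustrative values only, not real data)
past_returns = [0.02, -0.01, 0.03, 0.00]

mu = statistics.mean(past_returns)        # sample mean
sigma = statistics.stdev(past_returns)    # sample standard deviation (divides by N - 1)
var = statistics.variance(past_returns)   # sample variance (divides by N - 1)

# For comparison: the population version divides by N and gives a smaller number
sigma_pop = statistics.pstdev(past_returns)

print(f"mean = {mu:.4f}, sd = {sigma:.4f}, variance = {var:.6f}")
print(f"population sd (divides by N) = {sigma_pop:.4f}")
```

The distinction matters in practice because NumPy's `np.std` divides by $N$ unless `ddof=1` is passed, whereas the pandas `.std()` method divides by $N-1$ by default.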

### 2.1. Example

Suppose that over the last 4 months, TSLA had returns of 10%, 5%, -5%, and 14% respectively. Using this data alone, what is your estimate for the expected return and risk associated with TSLA's return next month?

In [1]:
# Import packages
import numpy as np

# Inputs
r1 = 0.1
r2 = 0.05
r3 = -0.05
r4 = 0.14
N  = 4

# Mean
m = (r1 + r2 + r3 + r4)/N

# Standard deviation
sd = (((r1 - m)**2 + (r2 - m)**2 + (r3 - m)**2 + (r4 - m)**2)/(N - 1))**(1/2)

# Print results
print(f"Solution 1: \n  Estimated expected return = {m: .3f} \n  Estimated risk = {sd: .3f}")

# Solution 2: use a numpy array with the "average" and "std" functions
returns = np.array([r1, r2, r3, r4])
m2  = np.average(returns)
sd2 = np.std(returns, ddof=1)  # ddof=1 divides by N - 1 (sample standard deviation); the default divides by N
print(f"Solution 2: \n  Estimated expected return = {m2: .3f} \n  Estimated risk = {sd2: .3f}")

Solution 1: 
  Estimated expected return =  0.060 
  Estimated risk =  0.082
Solution 2: 
  Estimated expected return =  0.060 
  Estimated risk =  0.082


# 3. The risk-free rate, excess returns, and risk premia

To quantify the reward we obtain from an investment in exchange for taking on its risk, we generally use not its expected return, but its expected return in excess of the return on a risk-free investment.

The only **risk-free investment** is a short-term (1-3 month) U.S. Treasury bill (Tbill). It is assumed that the U.S. government will not default on its debt over the next 1-3 months. For this class we will use the 1-month Tbill as the risk-free asset, but the 3-month Tbill is also often used.
- The risk-free rate is the yield on the 1-month Tbill
- Because the asset is riskless (there is no uncertainty about its future payoff):
    - Its expected return is also equal to its yield:    
$$E[R_f] = R_f$$
    - Its risk (standard deviation of future returns) is zero:    
$$\sigma(R_f) = 0$$


The **excess return** on any investment $i$ equals its return minus the risk-free rate:

$$ExcessReturn_i = R_i - R_f$$

The **risk premium** on any investment $i$ equals its expected return minus the risk-free rate:

$$RiskPremium_i = E[R_i] - R_f$$ 

### 3.1. Example

Suppose returns on TSLA over the past 3 months were 10% (in the most recent month), 5%, and -3% respectively, and that the yields on a 1-month Tbill were 0.1% (in the most recent month), 0.2%, and 0.3% respectively. What were the excess returns on TSLA over the past 3 months? What is your estimate for the risk premium on TSLA?

In [2]:
# Inputs
r1 = 0.1
r2 = 0.05
r3 = -0.03

rf1 = 0.001
rf2 = 0.002
rf3 = 0.003

# Excess returns
er1 = r1 - rf1
er2 = r2 - rf2
er3 = r3 - rf3
print(f"Excess returns over the past 3 months were: \n {er1} \n {er2} \n {er3}")

# Risk premium
exp_return = (r1 + r2 + r3)/3
risk_premium = exp_return - rf1 #note that we are using the most recent risk-free rate
print(f"Risk premium estimate: \n {risk_premium: .3f}")

Excess returns over the past 3 months were: 
 0.099 
 0.048 
 -0.033
Risk premium estimate: 
  0.039
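

The calculation in Example 3.1 can also be written with lists, which scales to any number of months. As a variation, the sketch below also reports the sample mean of past excess returns. That second estimator is an assumption of this sketch and is not the one used in the example; it is included only to show that the two approaches can give slightly different numbers.

```python
# Monthly returns and Tbill yields from Example 3.1 (most recent month first)
returns = [0.10, 0.05, -0.03]
rf_rates = [0.001, 0.002, 0.003]

# Excess return in each month: R_i - R_f
excess_returns = [r - rf for r, rf in zip(returns, rf_rates)]

# Estimator used in the example: sample mean return minus the most recent risk-free rate
rp_latest_rf = sum(returns) / len(returns) - rf_rates[0]

# Alternative (an assumption of this sketch): sample mean of past excess returns
rp_mean_excess = sum(excess_returns) / len(excess_returns)

print(f"Mean return minus latest Rf: {rp_latest_rf: .3f}")
print(f"Mean of excess returns:      {rp_mean_excess: .3f}")
```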


# 4. Mean-variance analysis and the Sharpe ratio

Mean-variance analysis is the process of assessing investments based on their expected returns and total risk. Investors are said to be mean-variance optimizers if they only care about these two aspects of their investments (one could imagine other potentially important aspects such as the environmental impact of the firm or treatment of its employees). In this course we will assume that the marginal investor is a mean-variance optimizer (loosely speaking, you can think of the marginal investor as the "average" investor in the market).

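If investors are mean-variance optimizers, they will prefer investments with the highest possible ratio of expected reward to risk. More precisely, they prefer the investment with the highest **Sharpe ratio**, which is the risk premium per unit of total risk:

$$SharpeRatio_i = \frac{E[R_i] - R_f}{\sigma(R_i)}$$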

### 4.1. Example

Assume that you are only allowed to invest in TSLA or AAPL. You estimate that TSLA has an expected return of 10% and a standard deviation of 12% and AAPL has an expected return of 7% and a standard deviation of 6%. The yield on a 1-month Tbill is 1%. If you are a mean-variance optimizer, which stock should you invest in?

In [3]:
# Inputs
tsla_er = 0.10
tsla_sd = 0.12
aapl_er = 0.07
aapl_sd = 0.06
rf = 0.01

# Sharpe ratios
tsla_sr = (tsla_er - rf) / tsla_sd
aapl_sr = (aapl_er - rf) / aapl_sd

# Print results
print(f" TSLA Sharpe Ratio = {tsla_sr: .3f} \n AAPL Sharpe Ratio = {aapl_sr: .3f}")

 TSLA Sharpe Ratio =  0.750 
 AAPL Sharpe Ratio =  1.000


### 4.2. Discussion

We will soon see that the answer to the above question is based on the Sharpe ratio only if the investor is allowed to borrow and lend at the risk-free rate. If that is not the case, and the investor is restricted to investing in one of those two assets alone (which is very unrealistic), investors with lower risk aversion should choose the riskier asset and investors with higher risk aversion should choose the less risky asset.
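
One way to make the role of risk aversion concrete is to score each asset with a quadratic utility $U = E[R] - \frac{1}{2} A \sigma^2$, where $A$ is a risk-aversion coefficient. This utility function is not part of the discussion above; it is a common textbook assumption, and the function names and the values of $A$ below are hypothetical. Under that assumption, the sketch picks between the two assets of Example 4.1 when no borrowing or lending is allowed.

```python
def mv_utility(expected_return, sd, risk_aversion):
    """Quadratic mean-variance utility: E[R] - 0.5 * A * sigma^2 (assumed form)."""
    return expected_return - 0.5 * risk_aversion * sd**2

# (expected return, standard deviation) estimates from Example 4.1
assets = {"TSLA": (0.10, 0.12), "AAPL": (0.07, 0.06)}

def preferred_asset(risk_aversion):
    """Return the ticker with the higher utility for a given risk aversion."""
    return max(assets, key=lambda t: mv_utility(*assets[t], risk_aversion))

for A in (2, 10):  # hypothetical risk-aversion coefficients
    print(f"A = {A}: preferred asset is {preferred_asset(A)}")
```

With these inputs the less risk-averse investor ($A = 2$) picks TSLA and the more risk-averse investor ($A = 10$) picks AAPL, even though AAPL has the higher Sharpe ratio in both cases.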

# 5. Application

Download monthly price data on TSLA and AAPL from Yahoo Finance over the 2016-2020 period. Estimate which one of them is a better investment from a mean-variance standpoint. Assume the risk-free rate is 0.1% per month.

In [4]:
# Import the needed libraries
import yfinance as yf
import pandas as pd

In [5]:
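# Download monthly data on both TSLA and AAPL and drop rows with missing values
# auto_adjust = False keeps the 'Adj Close' column used below (newer yfinance versions drop it by default)
price_dta = yf.download(tickers = ['TSLA', 'AAPL'],
                      start = '2016-01-01',
                      end = '2020-12-31',
                      interval = '1mo',
                      auto_adjust = False,
                      progress = False).dropna()
price_dta.head()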

| Date | Adj Close AAPL | Adj Close TSLA | Close AAPL | Close TSLA | High AAPL | High TSLA | Low AAPL | Low TSLA | Open AAPL | Open TSLA | Volume AAPL | Volume TSLA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2016-01-01 | 22.512041 | 38.240002 | 24.334999 | 38.240002 | 26.4625 | 46.276001 | 23.0975 | 36.481998 | 25.6525 | 46.144001 | 5087392000.0 | 396236000.0 |
| 2016-02-01 | 22.361717 | 38.386002 | 24.172501 | 38.386002 | 24.7225 | 39.903999 | 23.147499 | 28.209999 | 24.1175 | 37.751999 | 3243450000.0 | 668529000.0 |
| 2016-03-01 | 25.343138 | 45.953999 | 27.247499 | 45.953999 | 27.605 | 47.976002 | 24.355 | 36.299999 | 24.4125 | 38.849998 | 2984198000.0 | 514610000.0 |
| 2016-04-01 | 21.7971 | 48.152 | 23.434999 | 48.152 | 28.0975 | 53.868 | 23.127501 | 46.650002 | 27.195 | 48.966 | 3489535000.0 | 677536500.0 |
| 2016-05-01 | 23.220165 | 44.646 | 24.965 | 44.646 | 25.182501 | 48.638 | 22.3675 | 40.731998 | 23.4925 | 48.299999 | 3602686000.0 | 516537500.0 |


In [6]:
# Calculate monthly returns
tsla_ret = price_dta[('Adj Close', 'TSLA')].pct_change()
aapl_ret = price_dta[('Adj Close', 'AAPL')].pct_change()
tsla_ret.head()

Date
2016-01-01         NaN
2016-02-01    0.003818
2016-03-01    0.197155
2016-04-01    0.047830
2016-05-01   -0.072811
Name: (Adj Close, TSLA), dtype: float64
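

To see what `pct_change` does, the sketch below recomputes the first few TSLA returns by hand from the adjusted closing prices shown in the output of cell 5. Each return is the current price divided by the previous price, minus one; the first month has no previous price, which is why `pct_change` reports `NaN` there. The variable names are illustrative only.

```python
# First five TSLA adjusted closing prices, copied from the price data output above
prices = [38.240002, 38.386002, 45.953999, 48.152, 44.646]

# Simple return for month t: P_t / P_{t-1} - 1 (no return exists for the first month)
monthly_returns = [p_now / p_prev - 1 for p_prev, p_now in zip(prices, prices[1:])]

# These should be close to the non-NaN values printed by pct_change above
for r in monthly_returns:
    print(f"{r: .6f}")
```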

In [7]:
# Estimate expected returns and standard deviations
tsla_er = tsla_ret.mean()
tsla_sd = tsla_ret.std()
aapl_er = aapl_ret.mean()
aapl_sd = aapl_ret.std()

# Print them out
print(f"TSLA: \n  E[R] = {tsla_er: .3f} \n  SD = {tsla_sd: .3f}")
print(f"AAPL: \n  E[R] = {aapl_er: .3f} \n  SD = {aapl_sd: .3f}")

TSLA: 
  E[R] =  0.066 
  SD =  0.192
AAPL: 
  E[R] =  0.034 
  SD =  0.086


In [8]:
# Estimate Sharpe ratios
rf = 0.001
tsla_sr = (tsla_er - rf) / tsla_sd
aapl_sr = (aapl_er - rf) / aapl_sd

# Print them out
print(f"TSLA Sharpe Ratio = {tsla_sr: .3f}")
print(f"AAPL Sharpe Ratio = {aapl_sr: .3f}")

TSLA Sharpe Ratio =  0.339
AAPL Sharpe Ratio =  0.384
