# AF3214
# Week 7. Introduction to Risk Measures and Measuring Algorithms Performance

# Part 1: Risk Measures

### Obtain data from APIs 

### e.g., Alpha Vantage, https://github.com/RomelTorres/alpha_vantage 

In [None]:
!pip install alpha_vantage

Let's start by loading data that we will use. We need two firms. Here we pick Amazon and Apple.

In [1]:
# Import pandas and numpy
import pandas as pd
import numpy as np

In [2]:
# Create an empty dictionary. We name it as "stock_data"
stock_data = {}

In [3]:
# API: alpha_vantage (an API by Alpha Vantage)
from alpha_vantage.timeseries import TimeSeries
import time

# Replace the placeholder with your own Alpha Vantage API key before downloading the data
ts = TimeSeries(key='YOUR_API_KEY', output_format='pandas')

tickers = ['AAPL','AMZN']

for ticker in tickers: 
    filename = ticker + '.csv'
    data, meta_data = ts.get_daily(symbol=ticker, outputsize='full')
    stock_data[ticker] = data
    data.to_csv(filename)
    time.sleep(5)  

# Metadata: data that describes and gives information about other data
# The API response has two parts: "Meta Data" and "Time Series"
# The library returns "Time Series" as data and "Meta Data" as meta_data
# After the loop, meta_data holds the response for the last ticker only
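Because the loop saves each download with `to_csv`, later sessions can reload the prices from disk instead of calling the API again. The sketch below is illustrative only: the file name `SAMPLE.csv`, the temporary folder, and the three-row table are assumptions standing in for a real download.

```python
import os
import tempfile

import pandas as pd

# Small stand-in for a downloaded table (three rows, one price column)
sample = pd.DataFrame(
    {"4. close": [220.84, 227.48, 239.07]},
    index=pd.to_datetime(["2025-03-11", "2025-03-10", "2025-03-07"]),
)
sample.index.name = "date"

# Save once (hypothetical path), as the download loop does with to_csv
csv_path = os.path.join(tempfile.mkdtemp(), "SAMPLE.csv")
sample.to_csv(csv_path)

# Reload later: use the 'date' column as the index and parse it as dates
reloaded = pd.read_csv(csv_path, index_col="date", parse_dates=True)
print(reloaded)
```

Passing `parse_dates=True` keeps the index as dates rather than plain strings, which matters later when sorting and slicing by date.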

In [4]:
meta_data

{'1. Information': 'Daily Prices (open, high, low, close) and Volumes',
 '2. Symbol': 'AMZN',
 '3. Last Refreshed': '2025-03-11',
 '4. Output Size': 'Full size',
 '5. Time Zone': 'US/Eastern'}

In [5]:
stock_data

{'AAPL':             1. open   2. high    3. low  4. close   5. volume
 date                                                         
 2025-03-11  223.805  225.8399  217.4500    220.84  73971209.0
 2025-03-10  235.540  236.1600  224.2200    227.48  71451281.0
 2025-03-07  235.105  241.3700  234.7600    239.07  46273565.0
 2025-03-06  234.435  237.8600  233.1581    235.33  45170419.0
 2025-03-05  235.420  236.5500  229.2300    235.74  47227643.0
 ...             ...       ...       ...       ...         ...
 1999-11-05   84.620   88.3700   84.0000     88.31   3721500.0
 1999-11-04   82.060   85.3700   80.6200     83.62   3384700.0
 1999-11-03   81.620   83.2500   81.0000     81.50   2932700.0
 1999-11-02   78.000   81.6900   77.3100     80.25   3564600.0
 1999-11-01   80.000   80.6900   77.3700     77.62   2487300.0
 
 [6378 rows x 5 columns],
 'AMZN':             1. open   2. high    3. low  4. close   5. volume
 date                                                         
 2025-03-11

In [6]:
type(stock_data['AAPL'])

pandas.core.frame.DataFrame

In [7]:
stock_data['AAPL']

date,1. open,2. high,3. low,4. close,5. volume
2025-03-11,223.805,225.8399,217.4500,220.84,73971209.0
2025-03-10,235.540,236.1600,224.2200,227.48,71451281.0
2025-03-07,235.105,241.3700,234.7600,239.07,46273565.0
2025-03-06,234.435,237.8600,233.1581,235.33,45170419.0
2025-03-05,235.420,236.5500,229.2300,235.74,47227643.0
...,...,...,...,...,...
1999-11-05,84.620,88.3700,84.0000,88.31,3721500.0
1999-11-04,82.060,85.3700,80.6200,83.62,3384700.0
1999-11-03,81.620,83.2500,81.0000,81.50,2932700.0
1999-11-02,78.000,81.6900,77.3100,80.25,3564600.0


In [8]:
stock_data['AMZN']

date,1. open,2. high,3. low,4. close,5. volume
2025-03-11,193.90,200.1800,193.4000,196.59,52302854.0
2025-03-10,195.60,196.7300,190.8500,194.54,61829231.0
2025-03-07,199.49,202.2653,192.5300,199.25,59802821.0
2025-03-06,204.40,205.7700,198.3015,200.70,49863755.0
2025-03-05,204.80,209.9800,203.2600,208.36,38610085.0
...,...,...,...,...,...
1999-11-05,64.75,65.5000,62.2500,64.94,11091400.0
1999-11-04,67.19,67.1900,61.0000,63.06,16759200.0
1999-11-03,68.19,68.5000,65.0000,65.81,10772100.0
1999-11-02,69.75,70.0000,65.0600,66.44,13243200.0


In [9]:
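# Collect the closing prices of each ticker into one DataFrame
stock_final_data = pd.DataFrame()
for ticker in tickers:
    stock_final_data[ticker] = stock_data[ticker]['4. close']
# Sorted copy for display only; stock_final_data itself stays newest-first
idx_sort = stock_final_data.sort_values(by="date")
print(idx_sort)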

              AAPL    AMZN
date                      
1999-11-01   77.62   69.13
1999-11-02   80.25   66.44
1999-11-03   81.50   65.81
1999-11-04   83.62   63.06
1999-11-05   88.31   64.94
...            ...     ...
2025-03-05  235.74  208.36
2025-03-06  235.33  200.70
2025-03-07  239.07  199.25
2025-03-10  227.48  194.54
2025-03-11  220.84  196.59

[6378 rows x 2 columns]


In [10]:
type(stock_final_data)

pandas.core.frame.DataFrame

In [11]:
stock_final_data['AAPL']

date
2025-03-11    220.84
2025-03-10    227.48
2025-03-07    239.07
2025-03-06    235.33
2025-03-05    235.74
               ...  
1999-11-05     88.31
1999-11-04     83.62
1999-11-03     81.50
1999-11-02     80.25
1999-11-01     77.62
Name: AAPL, Length: 6378, dtype: float64

In [12]:
stock_final_data['AMZN']

date
2025-03-11    196.59
2025-03-10    194.54
2025-03-07    199.25
2025-03-06    200.70
2025-03-05    208.36
               ...  
1999-11-05     64.94
1999-11-04     63.06
1999-11-03     65.81
1999-11-02     66.44
1999-11-01     69.13
Name: AMZN, Length: 6378, dtype: float64

In [20]:
stock_final_data.head()

date,AAPL,AMZN
2025-03-11,220.84,196.59
2025-03-10,227.48,194.54
2025-03-07,239.07,199.25
2025-03-06,235.33,200.7
2025-03-05,235.74,208.36


In [18]:
stock_final_data.tail()
# how to print rows in between?

date,AAPL,AMZN
1999-11-05,88.31,64.94
1999-11-04,83.62,63.06
1999-11-03,81.5,65.81
1999-11-02,80.25,66.44
1999-11-01,77.62,69.13
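One way to print rows in between is to slice the DataFrame, either by position with `.iloc` or by date label with `.loc`. The sketch below uses five rows copied from the output above; the variable names are illustrative.

```python
import pandas as pd

# Five rows copied from the output above, newest first like the download
dates = pd.to_datetime(
    ["2025-03-11", "2025-03-10", "2025-03-07", "2025-03-06", "2025-03-05"]
)
prices = pd.DataFrame(
    {"AAPL": [220.84, 227.48, 239.07, 235.33, 235.74],
     "AMZN": [196.59, 194.54, 199.25, 200.70, 208.36]},
    index=dates,
)
prices.index.name = "date"

# By position: rows 1, 2 and 3 (the end position is excluded)
middle_by_position = prices.iloc[1:4]

# By date label: sort oldest-first, then slice (both end dates are included)
middle_by_date = prices.sort_index().loc["2025-03-06":"2025-03-10"]

print(middle_by_position)
print(middle_by_date)
```

Position slicing follows the usual Python rule of excluding the end, while label slicing with `.loc` includes both endpoints.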


In [None]:
# Sort the data by 'date' (oldest first) so that shift(1) refers to the previous trading day
stock_final_data = stock_final_data.sort_values(by='date')

In [None]:
stock_final_data.head()

### Calculating the Log Returns:

### Log Return:
$$
   Log\_Return_t = Log(Price_t) - Log(Price_{t-1})
$$

Python code (assuming `df` is sorted oldest-first, so that `shift(1)` gives the previous day's price):
log_ret = np.log(df) - np.log(df.shift(1))

For more details, please refer to: https://www.allquant.co/post/magic-of-log-returns-concept-part-1 and https://www.allquant.co/post/magic-of-log-returns-practical-part-2

In [27]:
# Method 1:
stock_log_ret = np.log(stock_final_data) - np.log(stock_final_data.shift(1))
stock_log_ret

date,AAPL,AMZN
2025-03-11,,
2025-03-10,0.029624,-0.010483
2025-03-07,0.049694,0.023923
2025-03-06,-0.015768,0.007251
2025-03-05,0.001741,0.037456
...,...,...
1999-11-05,-0.087342,-0.183245
1999-11-04,-0.054571,-0.029377
1999-11-03,-0.025680,0.042685
1999-11-02,-0.015456,0.009527


$$
   Log\_Return_t = Log(Price_t) - Log(Price_{t-1}) = Log(Price_t/Price_{t-1})
$$
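The identity above can be checked numerically. The sketch below uses three AAPL closes copied from the earlier output, sorted oldest-first; it also illustrates why row order matters, since on newest-first data `shift(1)` points at the following day and the signs flip.

```python
import numpy as np
import pandas as pd

# Three AAPL closes from the output above, sorted oldest-first
prices = pd.Series(
    [239.07, 227.48, 220.84],
    index=pd.to_datetime(["2025-03-07", "2025-03-10", "2025-03-11"]),
)

diff_of_logs = np.log(prices) - np.log(prices.shift(1))
log_of_ratio = np.log(prices / prices.shift(1))

# Both expressions agree (the first row is NaN in each, so drop it)
same = np.allclose(diff_of_logs.dropna(), log_of_ratio.dropna())

# Log returns add up over time: their sum is log(last price / first price)
total = diff_of_logs.sum()
additive = np.isclose(total, np.log(prices.iloc[-1] / prices.iloc[0]))

# Same calculation on newest-first data: the total has the opposite sign
newest_first = prices.iloc[::-1]
reversed_total = np.log(newest_first / newest_first.shift(1)).sum()

print(same, additive, total, reversed_total)
```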

In [28]:
# Method 2:
stock_log_ret = np.log(stock_final_data/stock_final_data.shift(1))

In [29]:
stock_log_ret

date,AAPL,AMZN
2025-03-11,,
2025-03-10,0.029624,-0.010483
2025-03-07,0.049694,0.023923
2025-03-06,-0.015768,0.007251
2025-03-05,0.001741,0.037456
...,...,...
1999-11-05,-0.087342,-0.183245
1999-11-04,-0.054571,-0.029377
1999-11-03,-0.025680,0.042685
1999-11-02,-0.015456,0.009527


### Calculating Expected Return 

Realized returns are often used as a proxy for expected returns. Using average realized returns as a proxy for expected returns relies on the belief that information surprises tend to cancel out over the study period, so that realized returns are an unbiased estimate of expected returns.

In [23]:
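# Daily Expected Return (i.e., mean of returns)
# The sign is only meaningful if prices were sorted oldest-first before computing returns;
# on newest-first data the mean comes out with the opposite sign.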
aapl_er = stock_log_ret['AAPL'].mean()
print("The daily Expected Return is "+ str(aapl_er*100) + '%')

The daily Expected Return is -0.01639663370668159%


In [None]:
# Daily Expected Return (i.e., mean of returns)
amzn_er = stock_log_ret['AMZN'].mean()
print (str(amzn_er * 100) +'%')

### Annualized Expected Return:

### What is annualized return?

Annualized return: Yearly rate of return inferred from any time period.

(1) The annualized return is the return that an investment earns each year for a given period.

(2) It is useful when comparing investments with different lengths of time.

### <font color='red'> Since we are using log returns, we do not need to compound them, because log returns are already continuously compounded. We only need to multiply the mean daily return by the number of trading days (assuming 252 trading days per year). </font>

https://www.nyse.com/publicdocs/Trading_Days.pdf
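A small numeric sketch of this rule follows. The constant daily log return of 0.05% is a made-up value, and `TRADING_DAYS` is just a name for the 252-day assumption above. It also shows how to convert the annual log return into a simple percentage return if one is needed.

```python
import numpy as np

TRADING_DAYS = 252  # assumed number of trading days per year

# Hypothetical daily log returns: a constant 0.05% per day
daily_log_ret = np.full(TRADING_DAYS, 0.0005)

# Log returns are additive, so the annual figure is the daily mean times 252
annual_log_ret = daily_log_ret.mean() * TRADING_DAYS

# Summing the 252 daily values gives the same number
assert np.isclose(annual_log_ret, daily_log_ret.sum())

# Convert to a simple (percentage) annual return if needed
annual_simple_ret = np.exp(annual_log_ret) - 1
print(annual_log_ret, annual_simple_ret)
```

With simple (non-log) returns, the annual figure would instead require compounding the daily returns rather than adding them.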

In [None]:
# Annualized return for Apple
aapl_ann_ret = aapl_er * 252
print ('Annualized return is ' + str(aapl_ann_ret*100)+' %')

In [None]:
# Annualized return for Amazon
amzn_ann_ret = amzn_er * 252
print('Annualized return is ' + str(amzn_ann_ret*100)+' %')

## Now let's work on portfolio

### Learn a new Python function that we will use: "np.array"

#### Know more about the "array" function in Numpy:
np.array: Create an array

What is an array?
https://2.bp.blogspot.com/-TUYyIovFJXc/VhU8CxS68tI/AAAAAAAAD6o/EblM_W5YdPs/w1200-h630-p-k-no-nu/What%2Bis%2Bin%2Barray.jpg

In [None]:
# Example of np.array
import numpy as np
np.array([0, 1, 2])

### Now let's use "np.array" to calculate expected return of a portfolio

In [None]:
# Calculate the expected return of a portfolio
# Assuming an equally weighted portfolio
portfolio_weights = np.array([0.5, 0.5])

### Expected return of a portfolio


$ Expected \ Return \ of \ Portfolio = \sum Expected \ Return \ of \ Stock_i*Weight_i $
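As a quick illustration of the formula, the weighted sum can be written either element by element or as a dot product. The mean returns below are hypothetical numbers, not estimates from the data.

```python
import numpy as np
import pandas as pd

# Hypothetical mean daily log returns for the two stocks
mean_returns = pd.Series({"AAPL": 0.0004, "AMZN": 0.0006})
weights = np.array([0.5, 0.5])  # equally weighted portfolio

# Sum of (expected return x weight) over the stocks
by_sum = np.sum(mean_returns * weights)

# The same quantity written as a dot product
by_dot = np.dot(weights, mean_returns)

print(by_sum, by_dot)
```

The order of the weights must match the order of the columns, here AAPL first and AMZN second.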


In [None]:
expected_return = np.sum( (stock_log_ret.mean() * portfolio_weights))
expected_return

In [None]:
# Annualized return
ann_return = expected_return*252
ann_return

### Calculate daily return of a portfolio

In [None]:
stock_port_ret = (stock_log_ret*portfolio_weights)

In [None]:
stock_port_ret

In [None]:
# df.loc[:,'New_Column'] = 'value' - You can use '.loc' with ':' to add a specified value for all rows.
# https://www.re-thought.com/blog/how-to-add-new-columns-in-a-dataframe-in-pandas 

stock_port_ret.loc[:,'Portfolio']= stock_port_ret.sum(axis=1)

"""
pandas.DataFrame.sum(axis=1):
adds the values across the columns of each row, returning one total per row;
here that total is the portfolio return for each date.
"""

In [None]:
stock_port_ret
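To make the `axis` argument concrete, the sketch below sums a tiny table of hypothetical weighted returns both ways. It also shows a detail that affects the first row of the table above: by default `sum` skips missing values, so a row containing only NaN sums to 0 unless `min_count=1` is passed.

```python
import pandas as pd

# Hypothetical weighted daily returns for two days
weighted = pd.DataFrame(
    {"AAPL": [0.010, -0.020], "AMZN": [0.005, 0.015]},
    index=["day 1", "day 2"],
)

per_day = weighted.sum(axis=1)    # one total per row: portfolio return each day
per_stock = weighted.sum(axis=0)  # one total per column: each stock over all days

# A row with only missing values, like the first row of a returns table
nan_row = pd.DataFrame({"AAPL": [float("nan")], "AMZN": [float("nan")]})
default_total = nan_row.sum(axis=1).iloc[0]               # NaN skipped, total is 0.0
strict_total = nan_row.sum(axis=1, min_count=1).iloc[0]   # stays NaN

print(per_day)
print(per_stock)
print(default_total, strict_total)
```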

In [None]:
# Download daily prices for the market proxy (DIA), matching the daily stock data above
ticker ='DIA'
filename = ticker + '.csv'
data, meta_data = ts.get_daily(symbol=ticker, outputsize='full')
data.to_csv(filename)
data

In [None]:
market_return = data.sort_values(by='date')
market_return = market_return['4. close']
market_return = np.log(market_return/market_return.shift(1))
market_return

In [None]:
market_return = market_return.drop(market_return.index[0])
market_return

In [None]:
# Add the market return into stock_port_ret
stock_port_ret['DIA'] = market_return

In [None]:
stock_port_ret = stock_port_ret.drop(stock_port_ret.index[0])

In [None]:
stock_port_ret

---------------------------------------

## Risk Measure: Standard Deviation

### Variance

In [None]:
# Variance of a Single Stock
aapl_variance = stock_log_ret['AAPL'].var()
print(aapl_variance)

### Standard Deviation


In [None]:
# Method 1:
np.sqrt(aapl_variance) 

In [None]:
# Method 2:
stock_log_ret['AAPL'].std()

## Variance of the Portfolio of Stocks

$$
Variance = (Weight_1)^2*Var_1 + (Weight_2)^2*Var_2 + 2*Weight_1*Weight_2*cov
$$

#### Method 1:

In [None]:
portfolio_weights

In [None]:
stock_log_ret.cov()

In [None]:
#df.loc['row_label', 'column_label']
stock_log_ret.cov().loc['AAPL','AMZN']

In [None]:
# variance of portfolio of 2 assets 
# = (weight_1)^2*var_1 + (weight_2)^2*var_2 + 2*weight_1*weight_2*cov_12

port_var = (
    portfolio_weights[0]**2 * stock_log_ret['AAPL'].var()
    + portfolio_weights[1]**2 * stock_log_ret['AMZN'].var()
    + 2 * portfolio_weights[0] * portfolio_weights[1] * stock_log_ret.cov().loc['AAPL', 'AMZN']
)
# loc: Access a group of rows and columns by label.

In [None]:
port_var

In [None]:
# Annual variance
print (port_var*252)

In [None]:
# Annualized standard deviation
port_std = np.sqrt(port_var*252)
print(port_std)

#### (Optional) Method 2:
Calculating with matrices, so that the same code works for any number of assets.

https://community.wolfram.com/c/portal/getImageAttachment?filename=var_covar_formula.gif&userId=196586

portfolio variance = weight_vector' * cov matrix * weight_vector
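The following sketch checks that the matrix form gives the same number as the two-asset formula. The variances and covariance are hypothetical values chosen for easy arithmetic.

```python
import numpy as np

# Hypothetical daily variances and covariance for two assets
var_1, var_2, cov_12 = 0.0004, 0.0009, 0.0002
cov_matrix = np.array([[var_1, cov_12],
                       [cov_12, var_2]])
w = np.array([0.5, 0.5])

# Two-asset formula written out term by term
explicit = w[0]**2 * var_1 + w[1]**2 * var_2 + 2 * w[0] * w[1] * cov_12

# Matrix form: weight_vector' * cov matrix * weight_vector
matrix_form = w @ cov_matrix @ w

print(explicit, matrix_form)
```

The matrix form needs no changes when more assets are added: only `w` and `cov_matrix` grow.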

### Learn a new Python function that we will use: "np.dot"

#### Explain np.dot:

What is dot product in Math: https://en.wikipedia.org/wiki/Dot_product

In Python, `np.dot` computes the dot product of two arrays: for two 1-D arrays it multiplies matching elements and adds up the results.

For more information, please see https://numpy.org/doc/stable/reference/generated/numpy.dot.html

In [None]:
# Example of np.dot

import numpy as np

# Create two arrays
A = np.array([2, 1, 5, 4])
B = np.array([3, 4, 7, 8])

# dot product
output = np.dot(A, B)

print(output)

In [None]:
"""
output = [2, 1, 5, 4].[3, 4, 7, 8]
       = 2*3 + 1*4 + 5*7 + 4*8
       = 77
"""

### Now let's use "np.dot" to calculate variance of a portfolio

$ Variance = (Weight_1)^2*Var_1 + (Weight_2)^2*Var_2 + 2*Weight_1*Weight_2*cov $

In [None]:
stock_log_ret.cov()

In [None]:
portfolio_weights

In [None]:
# General pattern: port_var_mat = np.dot(np.dot(w, cov), w.T)
# ndarray.T returns the transposed array; for a 1-D array such as portfolio_weights it changes nothing.
# https://numpy.org/doc/stable/reference/generated/numpy.ndarray.T.html
port_var_mat = np.dot(np.dot(portfolio_weights, stock_log_ret.cov()), portfolio_weights.T)
print (port_var_mat)

In [None]:
# Alternative:
port_var_mat = np.dot(portfolio_weights.T, np.dot(stock_log_ret.cov(), portfolio_weights))
print (port_var_mat)

In [None]:
# Annual variance
print (port_var_mat*252)

In [None]:
# Annualized standard deviation
np.sqrt(port_var_mat*252)

## Risk Measure: Beta (optional, for advanced students, not to cover in class)

### Beta

We have a portfolio called "p", which includes Apple and Amazon. We have a market index called "m". $ B_p $ is Beta for Portfolio p. Using CAPM, we have the following

$
E (R_p) = R_f + B_p [ E (R_m) - R_f]
$

That is, Expected Return of Portfolio = Risk-free Rate + Beta*(Expected Return of Market - Risk-free Rate)

Therefore we can calculate Beta $ B_p $ using

$
B_p = [E (R_p) - R_f] / [ E (R_m) - R_f]
$

## Using regressions to obtain Beta

For simplicity, let's assume $ R_f = 0 $ for now:   
$
E (R_p)  = R_f + B_p [ E (R_m) - R_f] =  B_p*E (R_m)
$


Now let's use a simple linear regression: $ Y = \alpha + \beta X + \epsilon $, where $ Y $ is $ R_p $, $ X $ is $ R_m $, and $ \beta $ is $ B_p $.

So we will run a regression below:

$
R_p = \alpha + B_p * R_m + \epsilon
$
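In a regression with one explanatory variable, the OLS slope equals $Cov(R_p, R_m) / Var(R_m)$, so Beta can be cross-checked without a regression package. The sketch below uses simulated returns, not market data; the "true" beta of 1.2, the noise levels, and the random seed are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated daily returns: the portfolio is built with a true beta of 1.2
true_beta = 1.2
r_m = rng.normal(0.0004, 0.01, size=2000)
r_p = 0.0001 + true_beta * r_m + rng.normal(0.0, 0.005, size=2000)

# OLS slope of R_p on R_m equals Cov(R_p, R_m) / Var(R_m)
beta_cov = np.cov(r_p, r_m, ddof=1)[0, 1] / np.var(r_m, ddof=1)

# Cross-check with a degree-1 least-squares fit (slope first, then intercept)
slope, intercept = np.polyfit(r_m, r_p, 1)

print(beta_cov, slope, intercept)
```

Both estimates should agree with each other and land near the beta used to generate the data.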

### Run regressions in Python: use statsmodels module

statsmodels is a Python module that provides classes and functions for the estimation of many different statistical models, as well as for conducting statistical tests, and statistical data exploration.

https://www.statsmodels.org/stable/install.html

In [None]:
# Install statsmodels
!pip install statsmodels

In [None]:
# import statsmodels API
import statsmodels.api as sm

"""
statsmodels.api: Cross-sectional models and methods.
The API focuses on models and the most frequently used statistical test, 
and tools.
Canonically imported using import statsmodels.api as sm.
"""

### Regression Code:

In [None]:
# X is the market index return (DIA)
X = stock_port_ret['DIA']

# y is the portfolio return
y = stock_port_ret['Portfolio']

# Add a constant in the regression model
# An intercept is not included by default and should be added by the user
X1 = sm.add_constant(X)

# Regression model
# OLS: Ordinary Least Squares
model = sm.OLS(y,X1)

# Fit the model and print results
results = model.fit()
print(results.summary())

In [None]:
print(results.params)

In [None]:
# params holds the constant first and the slope on X second; select the slope by position
beta = results.params.iloc[1]
print('Beta is ' + str(beta))

In [None]:
print('R2: ', results.rsquared)

## Sharpe Ratio 

Assume the risk-free rate is 0%

Formula and Calculation for Sharpe Ratio
$$
\begin{aligned} &\textit{Sharpe Ratio} = \frac{R_p - R_f}{\sigma_p}\\ &\textbf{where:}\\ &R_{p}=\text{return of portfolio}\\ &R_{f} = \text{risk-free rate}\\ &\sigma_p = \text{standard deviation of the portfolio's excess return} \end{aligned}
$$
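A minimal sketch of the calculation from daily returns follows, assuming 252 trading days per year and a risk-free rate of zero. The six daily returns are made-up numbers. Return and risk must be on the same time scale, so the mean is multiplied by 252 and the standard deviation by the square root of 252.

```python
import numpy as np

TRADING_DAYS = 252  # assumed trading days per year

# Hypothetical daily portfolio log returns
daily_ret = np.array([0.010, -0.004, 0.006, 0.002, -0.003, 0.007])
risk_free_annual = 0.0  # assumed risk-free rate of 0%

ann_ret = daily_ret.mean() * TRADING_DAYS
# Variance scales with time, so the standard deviation scales with sqrt(time)
ann_std = daily_ret.std(ddof=1) * np.sqrt(TRADING_DAYS)

sharpe_ratio = (ann_ret - risk_free_annual) / ann_std
print(sharpe_ratio)
```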

In [None]:
sharpe = (ann_return - 0)/port_std
print(sharpe)

### Treynor Ratio

The Formula for the Treynor Ratio is:
$$
\begin{aligned} &\text{Treynor Ratio}=\frac{R_p - R_f}{\beta_p}\\ &\textbf{where:}\\ &R_p = \text{Portfolio return}\\ &R_f = \text{Risk-free rate}\\ &\beta_p = \text{Beta of the portfolio} \end{aligned}
$$
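As a worked example with made-up inputs: a portfolio with a 12% annualized return, a risk-free rate of 0%, and a Beta of 1.2 has a Treynor Ratio of 0.10. The helper name `treynor_ratio` is hypothetical.

```python
def treynor_ratio(portfolio_return, risk_free_rate, beta):
    """Excess return earned per unit of market (systematic) risk."""
    return (portfolio_return - risk_free_rate) / beta


# Hypothetical inputs: 12% annualized return, 0% risk-free rate, beta of 1.2
example_treynor = treynor_ratio(0.12, 0.0, 1.2)
print(example_treynor)
```

Unlike the Sharpe Ratio, which divides by total volatility, this measure divides by Beta, so it rewards return relative to market risk only.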

In [None]:
treynor = (ann_return - 0)/beta
print(treynor)