# Measures of Risk and Reward

### Motivation: Volatility of Returns

Two return series can have exactly the same average yet differ in how spiky they are. That difference in spikiness can be _more_ crucial than the average return itself.

- First step: **Demeaning**: take the return series and subtract its mean.
- To measure deviations from the mean, square them so that positive and negative deviations do not cancel.
- **Variance of a set of returns**: the average of the squared deviations from the mean.
    - Variance is in squared units of return.
    - Therefore, take the square root of the variance => Standard Deviation => the measure of volatility.
    - Volatility from daily data can't be compared directly with volatility from monthly data; annualize both first.
        - There are approximately 252 trading days in a year.
        - Annualized volatility from daily data => daily volatility multiplied by the square root of 252.
        - Annualized volatility from monthly data => monthly volatility multiplied by the square root of 12.
        
- We need some way to compare returns that carry different amounts of risk.
    - **Motivation**: US small caps (small stocks) have annualized returns of 17.2%, compared with 9.5% for large caps. But the volatility is different, at 36.8% versus 18.7%. How do you compare them?
    - One way: *Return on Risk Ratio*: Return / Risk.
    - Another way: measure the excess return, i.e. the additional return you earn for taking on volatility compared with the return you could get without any.
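
As a rough illustration of the first approach, the return-on-risk ratio can be worked out directly from the figures quoted above. This is only a minimal sketch; the function name `return_on_risk` is made up for this example.

```python
def return_on_risk(annual_return, annual_vol):
    """Return earned per unit of volatility (no risk-free adjustment)."""
    return annual_return / annual_vol

# Annualized return and volatility figures quoted in the notes above
small_cap = return_on_risk(0.172, 0.368)
large_cap = return_on_risk(0.095, 0.187)

print(f"Small caps: {small_cap:.3f}")
print(f"Large caps: {large_cap:.3f}")
```

On this measure alone, large caps come out slightly ahead (about 0.51 versus 0.47), even though small caps have the higher raw return.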
    
**Risk Free Rate**: The return you would get with virtually no risk. Typically, a very short-term US Treasury Bill (30 days or less) is used as a proxy.

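**Sharpe Ratio**: One of the most widely used risk-adjusted return measures. It is the excess return over the risk-free rate per unit of volatility.

$\text{Sharpe Ratio} = \frac{\text{Return} - \text{Risk Free Rate}}{\text{Volatility}}$

Or, in symbols, for a portfolio $P$ with return $R_p$, volatility $\sigma_p$, and risk-free rate $R_f$:

$\text{Sharpe Ratio}(P) = \frac{R_p - R_f}{\sigma_p}$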

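*Potential fact*: Once the risk-free rate is subtracted, it looks like small caps give a slightly better risk-adjusted return than large caps, even though their raw return-on-risk ratio (17.2 / 36.8) is lower than that of large caps (9.5 / 18.7).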
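
To make the comparison concrete, here is a small sketch of the Sharpe ratio for the small-cap and large-cap figures quoted earlier. The notes do not state a risk-free rate, so the 3% used here is purely an assumption for illustration, and `sharpe_ratio` is a hypothetical helper name.

```python
def sharpe_ratio(annual_return, annual_vol, risk_free_rate):
    """Excess return over the risk-free rate, per unit of volatility."""
    return (annual_return - risk_free_rate) / annual_vol

# ASSUMPTION: 3% annual risk-free rate (not given in the notes)
RISK_FREE = 0.03

small_cap = sharpe_ratio(0.172, 0.368, RISK_FREE)
large_cap = sharpe_ratio(0.095, 0.187, RISK_FREE)

print(f"Small caps: {small_cap:.3f}")
print(f"Large caps: {large_cap:.3f}")
```

Under that assumed rate the ratios come to roughly 0.39 for small caps and 0.35 for large caps. A different risk-free rate would change both numbers.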


In [1]:
import pandas as pd
prices = pd.read_csv("data/sample_prices.csv")
returns = prices.pct_change()
returns

|   | BLUE | ORANGE |
|---|------|--------|
| 0 | NaN | NaN |
| 1 | 0.023621 | 0.039662 |
| 2 | -0.021807 | -0.033638 |
| 3 | -0.031763 | 0.082232 |
| 4 | 0.034477 | 0.044544 |
| 5 | 0.037786 | -0.026381 |
| 6 | -0.011452 | -0.049187 |
| 7 | 0.032676 | 0.117008 |
| 8 | -0.012581 | 0.067353 |
| 9 | 0.029581 | 0.078249 |


The first row is NaN because there is no previous period to compare with.
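
For reference, `pct_change()` computes each value relative to the one before it, which is why the first row has nothing to compare against. The sketch below checks this against an explicit `shift`. The prices and column names are hypothetical, for illustration only; they are not the contents of `sample_prices.csv`.

```python
import pandas as pd

# Hypothetical prices, for illustration only
prices = pd.DataFrame({
    "A": [100.0, 102.0, 99.96],
    "B": [50.0, 49.0, 51.45],
})

# Built-in percentage change
via_pct_change = prices.pct_change()

# Same calculation written out: price divided by previous price, minus 1
via_shift = prices / prices.shift(1) - 1

print(via_pct_change)
```

Both versions leave the first row as NaN, since `shift(1)` has no earlier price to supply.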

In [2]:
returns = returns.dropna()
returns

|   | BLUE | ORANGE |
|---|------|--------|
| 1 | 0.023621 | 0.039662 |
| 2 | -0.021807 | -0.033638 |
| 3 | -0.031763 | 0.082232 |
| 4 | 0.034477 | 0.044544 |
| 5 | 0.037786 | -0.026381 |
| 6 | -0.011452 | -0.049187 |
| 7 | 0.032676 | 0.117008 |
| 8 | -0.012581 | 0.067353 |
| 9 | 0.029581 | 0.078249 |
| 10 | 0.006151 | -0.168261 |


Pandas' built-in `std()` method gives the standard deviation of each column, which we can use as the measure of risk.

In [3]:
returns.std()

BLUE      0.023977
ORANGE    0.079601
dtype: float64

To check this, compute the same thing step by step from the definitions above.

In [5]:
import numpy as np

deviations = returns - returns.mean()   # demeaning
squared_deviations = deviations ** 2    # squared deviations
variance = squared_deviations.mean()    # .mean() divides by n
volatility = np.sqrt(variance)
volatility

BLUE      0.022957
ORANGE    0.076212
dtype: float64

The numbers don't match because `std()` divides by n-1 (the sample standard deviation, pandas' default `ddof=1`), whereas `.mean()` divides by n (the population standard deviation).
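
One way to see the difference is to ask pandas for the population version with `ddof=0` and compare. The sketch below uses only the ten rows of returns displayed earlier as illustrative data, so its numbers are not expected to reproduce the notebook's outputs exactly.

```python
import numpy as np
import pandas as pd

# Illustrative data: only the rows of returns displayed earlier
returns = pd.DataFrame({
    "BLUE": [0.023621, -0.021807, -0.031763, 0.034477, 0.037786,
             -0.011452, 0.032676, -0.012581, 0.029581, 0.006151],
    "ORANGE": [0.039662, -0.033638, 0.082232, 0.044544, -0.026381,
               -0.049187, 0.117008, 0.067353, 0.078249, -0.168261],
})
n = returns.shape[0]

# Manual population volatility: mean of squared deviations, then square root
manual_population = np.sqrt(((returns - returns.mean()) ** 2).mean())

pandas_population = returns.std(ddof=0)  # divides by n
pandas_sample = returns.std()            # default ddof=1, divides by n - 1

# The two differ by a factor of sqrt(n / (n - 1))
ratio = pandas_sample / pandas_population
print(ratio, np.sqrt(n / (n - 1)))
```

The gap shrinks as the number of observations grows, so the choice matters most for short series like this one.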

In [7]:
# Reuse deviations and squared_deviations from above; divide by n - 1 instead of n
number_of_obs = returns.shape[0]
variance = squared_deviations.sum() / (number_of_obs - 1)
volatility = variance ** 0.5
volatility

BLUE      0.023977
ORANGE    0.079601
dtype: float64

In [8]:
returns.std()

BLUE      0.023977
ORANGE    0.079601
dtype: float64

The manual calculation now matches `returns.std()`. But the data is monthly, so we need to scale the volatility up to annual terms by multiplying by the square root of 12.

In [10]:
returns.std()*np.sqrt(12)

BLUE      0.083060
ORANGE    0.275747
dtype: float64
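
The same scaling can be wrapped in a small helper so that daily and monthly data are handled alike. This is a sketch only: `annualize_vol` and its `periods_per_year` parameter are names invented for this example, and the 1% daily volatility is a made-up input.

```python
import math

def annualize_vol(period_vol, periods_per_year):
    """Scale per-period volatility to annual by sqrt(periods per year)."""
    return period_vol * math.sqrt(periods_per_year)

# Monthly volatilities reported above
blue_annual = annualize_vol(0.023977, 12)
orange_annual = annualize_vol(0.079601, 12)

# Hypothetical 1% daily volatility, using ~252 trading days per year
daily_example = annualize_vol(0.01, 252)

print(blue_annual, orange_annual, daily_example)
```

Because the helper only multiplies, it should also accept the Series returned by `returns.std()` directly.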

Now with real data.

The CSV we will use is segmented by market cap into the lowest 30%, middle 40%, and highest 30%. Another grouping: lowest 20%, quintiles 2 to 4, and highest 20%.