# Security's risk

Risk and return are the two most important dimensions of investment decision making.
But how can we measure and forecast risk?

## What is Risk?

For example, suppose we bought a security for \\$1000(USD) and we know that the stock's average return is 15\%.
How was this average reached? Either:
* The historical returns are 14\% the first year, 16\% the next year, 13\% the next year and 17\% the last year,
* Or the historical returns are +50\%, -20\%, -20\% and +50\%.

In the first case we have a stable return over time (between 13\% and 17\%).
But in the second case things change a lot because there is a huge amount of variability between the observations. The average rate of return is the same, but we can't have an idea of what comes next because of that variability.

### Hence variability plays an important role and is a natural measure of risk.

A volatile market is more prone to deviate from the historical rate of returns.

Most people want to have a clear idea of the returns of a security or a portfolio of securities, and they try to reduce the risk that they are exposed to.

### The statistical measures variance ($S^2$) and standard deviation ($S$) are a great help in quantifying risk.

* The variance of a security measures the dispersion of its returns around their mean.
$$ S^2 = \frac{\sum (X-\bar{X})^2}{N-1} $$

In the previous example the mean is 15\%, so we subtract the mean from each observation, square each difference, and add the results:
$ (14\%-15\%)^2 + (16\% -15\%)^2 + (13\%-15\%)^2 + (17\% -15\%)^2 $. These are the four squared deviations from the mean.
We then divide the sum by the number of observations minus one, in this case $4-1$.
This gives the variance (with the returns written as decimals):
$$S^2=0.000333$$

Now, if we take the square root of $S^2$ we find the standard deviation:
$$ S=\sqrt{S^2}\approx 1.8\%$$

Let's do the same thing for the second set of observations.
We find $ S^2\approx 0.163 $ and $S=\sqrt{S^2}\approx 40\%$

The conclusion is that the second set has higher dispersion, so it is riskier than the first one.
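The hand calculation above can be double-checked with a short sketch. It assumes the two sets of yearly returns from the example, written as decimals, and uses `ddof=1` so that NumPy divides by $N-1$ as in the formula. The variable names are illustrative only.

```python
import numpy as np

# Yearly returns from the two examples above, written as decimals
stable = np.array([0.14, 0.16, 0.13, 0.17])
volatile = np.array([0.50, -0.20, -0.20, 0.50])

for name, returns in [("stable", stable), ("volatile", volatile)]:
    mean = returns.mean()
    # ddof=1 divides by N-1 (sample variance), matching the formula above
    variance = returns.var(ddof=1)
    std = returns.std(ddof=1)
    print(f"{name}: mean={mean:.2%}, variance={variance:.6f}, std={std:.2%}")
```

Both sets share the same mean, but the standard deviation of the second set is more than twenty times larger.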

# Let's calculate the risk of a security!!

In [1]:
# Import the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf

In [3]:
# Create tickers
tickers=['PG','TSLA']
# create the DataFrame
data= pd.DataFrame()
#Load the data
for t in tickers:
    data[t]=yf.download(t,start='2012-1-1')['Adj Close']
# See the data
data.head()

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed


Date,PG,TSLA
2012-01-03,51.261551,28.08
2012-01-04,51.238533,27.709999
2012-01-05,51.023746,27.120001
2012-01-06,50.901016,26.91
2012-01-09,51.115807,27.25


In [4]:
#Let's save the data
data.to_csv("data-for-risk-measure.csv")


In [7]:
#Let's import the data
data=pd.read_csv("data-for-risk-measure.csv",index_col=0)
#See the data
data.head()

Date,PG,TSLA
2012-01-03,51.261551,28.08
2012-01-04,51.238533,27.709999
2012-01-05,51.023746,27.120001
2012-01-06,50.901016,26.91
2012-01-09,51.115807,27.25


In [8]:
# Let's calculate the log rate of return
data_return=np.log(data/data.shift(1))
# See the first five rows
data_return.head()

Date,PG,TSLA
2012-01-03,NaN,NaN
2012-01-04,-0.000449,-0.013264
2012-01-05,-0.004201,-0.021522
2012-01-06,-0.002408,-0.007774
2012-01-09,0.004211,0.012556
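To see what the log return does, here is a minimal sketch on the first five PG prices from the table above. It compares the log return with the simple return and checks one convenient property: log returns add up over time, so their sum equals the log of the total price change. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# First five PG adjusted closes, copied from the table above
prices = pd.Series([51.261551, 51.238533, 51.023746, 50.901016, 51.115807])

simple_ret = prices / prices.shift(1) - 1    # (P_t / P_{t-1}) - 1
log_ret = np.log(prices / prices.shift(1))   # ln(P_t / P_{t-1})

# Log returns are additive: their sum is the log of the total growth factor
total_log = log_ret.sum()                    # NaN in the first row is skipped
direct_log = np.log(prices.iloc[-1] / prices.iloc[0])

print(pd.DataFrame({"simple": simple_ret, "log": log_ret}))
print(np.isclose(total_log, direct_log))
```

For small daily moves the two kinds of return are almost identical, which is why either is commonly used for daily data.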


### To calculate the variance and the standard deviation we first have to find the mean.

In [10]:
# Find the mean of PG
data_return['PG'].mean()

0.00037626531986121263

In [11]:
# Let's make it annual
data_return['PG'].mean()*250

0.09406632996530316

In [12]:
#Let's find the standard deviation
data_return['PG'].std()

0.011252554548964053

In [13]:
# Let's make it annual
data_return["PG"].std()*250**0.5 # Don't forget the square root of 250

0.177919
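Why the square root? A short sketch, assuming daily returns are uncorrelated from one day to the next and using the 250 trading days convention of this notebook: variances add up over time, so the annual variance is 250 times the daily variance, and the standard deviation therefore grows with the square root of 250.

```python
daily_std = 0.011252554548964053   # PG daily standard deviation from above
trading_days = 250                 # convention used in this notebook

daily_var = daily_std ** 2
annual_var = daily_var * trading_days   # variance scales linearly with time
annual_std = annual_var ** 0.5          # so std scales with sqrt(time)

print(round(annual_std, 6))
print(round(daily_std * trading_days ** 0.5, 6))   # same number, computed directly
```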

### Let's do the same thing in Tesla

In [16]:
#Calculate the mean
data_return['TSLA'].mean()

0.0015889990780202118

In [17]:
#Make it annual
data_return['TSLA'].mean()*250

0.397249769505053

In [18]:
# Let's calculate the standard deviation
data_return['TSLA'].std()

0.033802114072277345

In [19]:
#Let's make it annual
data_return['TSLA'].std()*250**0.5

0.5344583509861293

It would be easier to draw conclusions if we had the two means and the two standard deviations side by side.
* The first way to achieve this is to print the annualized means one after the other. Let's do it.

In [21]:
print(data_return['PG'].mean()*250)
print(data_return['TSLA'].mean()*250)

0.09406632996530316
0.397249769505053


* The second way is the one I recommend. Let's do it!


In [24]:
data_return.mean()*250

PG      0.094066
TSLA    0.397250
dtype: float64

### Now let's put the std together.

In [27]:
data_return.std()*250**0.5

PG      0.177919
TSLA    0.534458
dtype: float64

We can see that Tesla has a higher standard deviation, so it is riskier than Procter and Gamble.

# The relationship between securities

It is reasonable to think that common factors influence the prices of shares on a stock exchange. The most common factor is the development of the economy.
* Favorable macroeconomic conditions facilitate the business of all companies.
* In times of recession, consumer spending decreases and businesses suffer.

Hence, one relationship could be: the state of the economy influences stock prices.

However, different industries are influenced in different ways.

* For example, in a recession the car industry suffers more than supermarkets, because people can't stop buying food and groceries but they can postpone buying a new car.

So the state of the economy impacts different industries in different ways.

## How important is this to an investor?

It could be good for an investor to build a portfolio with stocks from different industries, e.g. 1 share of Facebook and 1 share of Walmart, because it gives protection against a downturn in a single industry, for example in the internet industry (Facebook) or in the retail sector (Walmart). If we buy stocks of different industries, we are less exposed to a recession in any one of them.

### This relationship between the prices of companies will help us to optimize an investment portfolio.


# Measuring the relationship between stocks

Let's use an example to help us understand the relationship between stocks.

#### House pricing.

* House size is one of the key variables determining house prices. Typically, larger houses tend to be more expensive.

So there is a relationship between the size of a house and its price. Statisticians use correlation to measure the relationship between two variables; it takes values between -1 and 1.

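* The correlation coefficient is:
$$ ρ_{xy} = \frac{σ_{xy}}{σ_xσ_y} = \frac{\sum (x-\bar{x})(y-\bar{y})}{(n-1)σ_xσ_y}$$

It measures the strength and direction of the linear relationship between two variables; $σ_{xy}$ is the covariance, defined next.
* The covariance is:
$$ σ_{xy}=\frac{\sum (x-\bar{x})(y-\bar{y})}{n-1}$$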

Covariance gives us an idea of how the two variables move together.
* If covar > 0, the two variables move in the same direction
* If covar < 0, the two variables move in opposite directions
* If covar = 0, the two variables have no linear relationship

In our example x is house size and y is house price
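The two formulas can be tried out on a handful of houses. The sizes and prices below are hypothetical numbers invented for illustration, not real market data; the sketch computes covariance and correlation by hand and compares them with NumPy's `np.cov` and `np.corrcoef`.

```python
import numpy as np

# Hypothetical data: house sizes (square feet) and prices (USD)
size = np.array([650, 785, 1200, 720, 975], dtype=float)
price = np.array([772_000, 998_000, 1_200_000, 800_000, 895_000], dtype=float)

n = len(size)
dx = size - size.mean()     # deviations of x from its mean
dy = price - price.mean()   # deviations of y from its mean

# Covariance: sum of the products of deviations, divided by n - 1
cov_xy = (dx * dy).sum() / (n - 1)
# Correlation: covariance divided by the two sample standard deviations
corr_xy = cov_xy / (size.std(ddof=1) * price.std(ddof=1))

print(cov_xy, np.cov(size, price)[0, 1])
print(corr_xy, np.corrcoef(size, price)[0, 1])
```

The covariance is a large number in mixed units (square feet times dollars), while the correlation is unitless and always lies between -1 and 1, which makes it much easier to interpret.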

# Correlation adjusts covariance
Interpreting Correlation
## Perfect correlation
* House prices are directly proportional to house size. For every square foot of house the price increases by \\$1000(USD).

But in reality several variables have an impact on house prices, and perfect correlation is not so common. It is more likely to find imperfect correlation.

# In the same way, several variables determine share price:
* Industry growth
* Revenue growth 
* Profitability 
* Regulatory environment

The more similar the context in which two companies operate, the more correlation there will be between their share prices.

## No correlation
* Variables with 0 correlation have no linear relationship with each other.
* For example the price of coffee in Brazil and the price of a house in London. They have nothing in common.

## Negative correlation
* The two variables move in opposite directions.
* Perfect negative correlation: -1
* Imperfect negative correlation: between -1 and 0

For example, a business that produces ice cream and a business that produces umbrellas.
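The three cases can be reproduced with tiny made-up series. This is only an illustrative sketch: the numbers are artificial. The last case is a reminder that a correlation of zero only rules out a *linear* relationship, since `x ** 2` clearly depends on `x`.

```python
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# y rises in lockstep with x: perfect positive correlation
perfect_pos = np.corrcoef(x, 1000 * x)[0, 1]
# y falls in lockstep with x: perfect negative correlation
perfect_neg = np.corrcoef(x, -3 * x + 5)[0, 1]
# y depends on x, but not linearly: correlation is zero
zero_corr = np.corrcoef(x, x ** 2)[0, 1]

print(round(perfect_pos, 6), round(perfect_neg, 6), round(abs(zero_corr), 6))
```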

# Let's calculate with Python covariance and correlation coefficient

In [32]:
# In Python we don't have to do the calculation by hand to find the variance. We can use the pandas method var()
PG_var=data_return['PG'].var()
PG_var

0.0001266199838774116

In [33]:
# Calculate the variance of Tesla
TS_var=data_return['TSLA'].var()
TS_var

0.0011425829157552503

In [34]:
#Let's annualize them
PG_var_a=PG_var*250
TS_var_a=TS_var*250

In [36]:
# Let's now use the pandas method cov() to calculate the covariance.
cov_matrix=data_return.cov()
cov_matrix
# We can see that the diagonal holds the variances.

,PG,TSLA
PG,0.000127,5.4e-05
TSLA,5.4e-05,0.001143


In [37]:
#Let's annualize
cov_matrix_a=cov_matrix*250
cov_matrix_a

,PG,TSLA
PG,0.031655,0.013481
TSLA,0.013481,0.285646


In [40]:
# Now let's calculate the correlation with the corr() method
corr_matrix=data_return.corr()
corr_matrix

,PG,TSLA
PG,1.0,0.141767
TSLA,0.141767,1.0


Along the main diagonal every value is exactly one (perfect correlation), because each entry there is the correlation of a stock with itself. The two off-diagonal numbers are equal because both measure the correlation between the same pair of stocks. A value of about 0.14 shows that the returns of the two stocks are only weakly correlated.

### THIS IS NOT THE CORRELATION BETWEEN THE PRICES OF THE TWO EQUITIES
* This is the correlation between returns. The correlation between returns and the correlation between prices are different numbers.
* Corr(returns): reflects the dependence between price changes over time and focuses on the returns of our portfolio.
* Corr(price): focuses on stock price levels.

### Finally

Don't fall into the trap of annualizing the correlation table: unlike the mean and the covariance, it does not contain daily values, so there is nothing to scale.
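A small sketch shows why. It uses randomly generated returns (synthetic data, not PG or TSLA) and a hypothetical helper `cov_to_corr`. Multiplying the covariance matrix by 250 scales the covariance and both standard deviations by matching amounts, so the ratio that defines correlation does not change.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns for two made-up assets (assumption: for illustration only)
rng = np.random.default_rng(0)
returns = pd.DataFrame(rng.normal(0, 0.01, size=(500, 2)), columns=["A", "B"])

daily_cov = returns.cov()
annual_cov = daily_cov * 250

def cov_to_corr(cov):
    """Convert a covariance matrix to a correlation matrix (hypothetical helper)."""
    cov = np.asarray(cov)
    std = np.sqrt(np.diag(cov))          # standard deviations from the diagonal
    return cov / np.outer(std, std)      # divide each entry by sigma_i * sigma_j

same_after_scaling = np.allclose(cov_to_corr(daily_cov), cov_to_corr(annual_cov))
matches_pandas = np.allclose(cov_to_corr(daily_cov), returns.corr())
print(same_after_scaling, matches_pandas)
```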

# Portfolio variance

Now we are going to calculate the variance of a portfolio.
* For example, a portfolio with 1 share of Facebook and 1 share of LinkedIn has a different risk from a portfolio with 1 share of Facebook and 1 share of Walmart

### Let's see some algebra
$$ (a +b)^2= a^2 + 2ab + b^2$$

### Let's do the same thing with a portfolio containing two stocks

Portfolio variance (2 stocks):
* The first stock has weight = $w_1$ and the second= $w_2$ and $ w_1 + w_2 =1$
* $(w_1σ_1 + w_2σ_2)^2=?$

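### The portfolio's variance is given by:
$σ_p^2=w_1^2σ_1^2+2w_1w_2σ_1σ_2ρ_{12}+w_2^2σ_2^2$

The cross term is scaled by the correlation $ρ_{12}$ between the two stocks, so the result equals $(w_1σ_1 + w_2σ_2)^2$ only when $ρ_{12}=1$.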

# Calculating a portfolio's risk

### Equal weighting scheme:

In [41]:
# I will use numpy to create the weights
weights=np.array([0.5,0.5])

### Portfolio variance:
We are going to use the numpy.dot() function because we want to multiply matrices.
* With plain numbers, $(ab)^2= a^2b^2$
* With a vector $a$ and a matrix $B$, the analogous expression is $a^TBa$. In our case $a$ holds the weights and $B$ is the covariance matrix.
* To transpose a matrix, we use the .T attribute after the matrix that we want to transpose.
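As a sanity check, the two-stock formula and the matrix form should give the same number. The sketch below assumes the rounded annualized standard deviations (0.177919 and 0.534458) and the correlation (0.141767) reported earlier in this notebook, so the result only matches the exact portfolio variance to a few decimals.

```python
import numpy as np

w = np.array([0.5, 0.5])                  # equal weights
sigma = np.array([0.177919, 0.534458])    # annualized std of PG and TSLA (rounded)
rho = 0.141767                            # correlation of their returns (rounded)

# Two-stock formula: w1^2 s1^2 + 2 w1 w2 s1 s2 rho + w2^2 s2^2
two_stock = (w[0] ** 2 * sigma[0] ** 2
             + 2 * w[0] * w[1] * sigma[0] * sigma[1] * rho
             + w[1] ** 2 * sigma[1] ** 2)

# Matrix form: build the covariance matrix, then compute w^T * Cov * w
cov = np.array([[sigma[0] ** 2,               rho * sigma[0] * sigma[1]],
                [rho * sigma[0] * sigma[1],   sigma[1] ** 2]])
matrix_form = np.dot(w.T, np.dot(cov, w))

print(round(two_stock, 5), round(matrix_form, 5))
```

The matrix form is the one worth remembering because it works unchanged for any number of stocks.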

In [42]:
porf_var=np.dot(weights.T,np.dot(data_return.cov()*250,weights))
porf_var

0.08606549741680392

#### Portfolios volatility

In [43]:
porf_vol=(np.dot(weights.T,np.dot(data_return.cov()*250,weights)))**0.5
porf_vol

0.29336921688685047

In [48]:
print(f'{porf_vol*100:.3f}%')

29.337%


# Portfolio variance
is the sum of:
* 1) Variance of the securities
* 2) Correlation (and covariance) between them

### Two types of investment risk
1) Un-diversifiable (This component depends on the variance of each individual security. It is also known as systematic risk)
* Systematic risk cannot be eliminated. It is made up of the day-to-day stock changes and is caused by events that affect every company, such as:
    * Recession of the economy
    * Low consumer spending
    * Wars
    * Forces of nature


2) Diversifiable (Un-systematic)
* Idiosyncratic risk (also known as company-specific risk) is driven by company-specific events. Diversifiable risk can be eliminated if we invest in uncorrelated assets, for example across industries such as:
    * Automotive
    * Construction
    * Energy
    * Technology

For example, the S&P 500 is well diversified, and many investors try to hold a similar mix of shares.
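To get a feel for how diversification works, here is a stylized sketch. It assumes an idealized world in which every asset has the same variance (a made-up 0.04, i.e. 20% volatility) and all assets are completely uncorrelated; real stocks are never like this. Under those assumptions an equally weighted portfolio of `n` assets has variance `asset_var / n`. The function name is hypothetical.

```python
import numpy as np

asset_var = 0.04  # hypothetical annual variance of each asset (20% volatility)

def equal_weight_variance(n, asset_var=asset_var):
    """Variance of an equally weighted portfolio of n uncorrelated assets."""
    cov = np.eye(n) * asset_var   # zeros off the diagonal = no correlation
    w = np.full(n, 1 / n)         # equal weights that sum to 1
    return w @ cov @ w

for n in [1, 2, 5, 10, 50]:
    print(n, round(equal_weight_variance(n), 5))
```

The risk shrinks as more uncorrelated assets are added. With real, positively correlated stocks the decline flattens out, and what remains is the part that cannot be diversified away.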

# Calculating Diversifiable and Un-diversifiable Risk of a portfolio

In [51]:
# Let's see the weights again
weights[0], weights[1]

(0.5, 0.5)

#### Diversifiable Risk

$ Diversifiable\ Risk=\ portfolio\ variance\ -\ weighted\ annual\ variances $

In [53]:
# Calculating Diversifiable Risk
df=porf_var - (weights[0]**2*PG_var_a)- (weights[1]**2*TS_var_a)
df

0.006740316189762549

In [60]:
# Let's see it as a percentage
print(f'{df*100:.3f} %')

0.674 %


### Non-Diversifiable Risk:
* Two ways to calculate it:

In [67]:
# First way
n_dr_1=porf_var-df
n_dr_1

0.07932518122704137

In [68]:
# Second way
n_dr_2=(weights[0]**2*PG_var_a)+(weights[1]**2*TS_var_a)
n_dr_2

0.07932518122704137

In [66]:
n_dr_1==n_dr_2

True
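A word of caution about the `==` check: it happened to return `True` here, but two floating-point results that are mathematically equal can differ in their last digits. A more robust habit, sketched below with generic numbers rather than the notebook's data, is to compare with a tolerance using `math.isclose` (or `np.isclose` for arrays).

```python
import math

a = 0.1 + 0.2
b = 0.3

print(a == b)              # False: the two floats differ in the last digits
print(math.isclose(a, b))  # True: equal within a small relative tolerance
```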