<p><img alt="Colaboratory logo" height="45px" src="https://www.quantreo.com/wp-content/uploads/2021/10/Original-on-Transparent.png" align="left" hspace="10px" vspace="0px"></p>

# Descriptive statistics
This chapter explains the most important statistics for describing a financial dataset. These measures are used throughout portfolio management, financial analysis, and trading.

<br>

### After this chapter you will be able to:
* Compute and interpret the mean
* Compute and interpret the median
* Compute and interpret the mode
* Compute and interpret the variance
* Compute and interpret the standard deviation
* Compute and interpret the covariance
* Compute and interpret the variance-covariance matrix
* Compute and interpret the skewness
* Compute and interpret the kurtosis

<br>
<br>
<br>

### Exercises (Trading / Portfolio management):
* Compute the risk/return of a financial asset
* Compute the correlation between assets properly





<br>
<br>

💰Join our community: https://discord.gg/wXjNPAc5BH

📚Read our book: https://www.amazon.com/gp/product/B09HG18CYL 

🖥️Quantreo's YouTube channel: https://www.youtube.com/channel/UCp7jckfiEglNf_Gj62VR0pw

In [2]:
!pip install yfinance



In [3]:
import numpy as np
import pandas as pd
import yfinance as yf
from scipy.stats import skew, kurtosis

In [None]:
# Import some data
df = yf.download("GOOG")["Adj Close"].pct_change(1).dropna()
df

[*********************100%***********************]  1 of 1 completed


Date
2014-03-28    0.002740
2014-03-31   -0.005393
2014-04-01    0.018295
2014-04-02   -0.000282
2014-04-03    0.004833
                ...   
2021-12-06    0.008953
2021-12-07    0.029486
2021-12-08    0.004620
2021-12-09   -0.004132
2021-12-10    0.003842
Name: Adj Close, Length: 1942, dtype: float64

# Central tendency measure

### Mean

In [None]:
# -------- Mean with numpy ------------
mean = np.mean(df, axis=0)*100
print(f"Daily Mean: {'%.2f' % mean} %")

# Annualization of the mean return (252 trading days per year)
annual_mean = mean * 252
print(f"Annual Mean: {'%.2f' % annual_mean} % ")

# Daily mean return --> monthly mean return (21 trading days per month)
monthly_mean = mean * 21
print(f"Monthly Mean: {'%.2f' % monthly_mean} %")

Daily Mean: 0.10 %
Annual Mean: 25.10 % 
Monthly Mean: 2.09 %
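
Multiplying the daily mean by 252 is an arithmetic annualization: it ignores compounding. As a minimal sketch of the difference, the example below compares it with a compounded figure. The daily mean of 0.10 % is a hypothetical input chosen to match the order of magnitude above, not a value read from the data.

```python
# Hypothetical daily mean return (0.10 %), used for illustration only
daily_mean = 0.0010

# Arithmetic annualization: the same operation as `mean * 252`
arithmetic_annual = daily_mean * 252

# Compounded annualization: each day's gain is reinvested
compounded_annual = (1 + daily_mean) ** 252 - 1

print(f"Arithmetic: {arithmetic_annual * 100:.2f} %")
print(f"Compounded: {compounded_annual * 100:.2f} %")
```

For a positive daily mean the compounded figure is always the larger of the two, and the gap widens as the daily mean grows.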


### Median

In [None]:
# -------- Median with numpy ------------
median = np.median(df, axis=0)*100
print(f"Daily Median: {'%.2f' % median} %")

# Scale the daily median to one year (rough approximation:
# unlike the mean, the median does not scale linearly with the horizon)
annual_median = median * 252
print(f"Yearly Median: {'%.2f' % annual_median} % ")

# Daily median return --> monthly median return (same caveat)
monthly_median = median * 21
print(f"Monthly Median: {'%.2f' % monthly_median} %")

### Centile

In [None]:
# -------- Centile with numpy ------------
centile_10 = np.quantile(df, 0.1, axis=0)*100
print(f"Centile 10%: {'%.2f' % centile_10} %")

centile_50 = np.quantile(df, 0.5, axis=0)*100
print(f"Centile 50%: {'%.2f' % centile_50} %")

centile_99 = np.quantile(df, 0.99, axis=0)*100
print(f"Centile 99%: {'%.2f' % centile_99} %")

Centile 10%: -1.63 %
Centile 50%: 0.10 %
Centile 99%: 4.27 %
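
The mode is listed in the chapter goals but is less natural for returns, because continuous values almost never repeat exactly. One possible approach, sketched here under that assumption, is to round the returns into buckets first and then take the most frequent bucket. The `returns` values are made up for illustration and are not market data.

```python
import pandas as pd

# Hypothetical daily returns (illustrative values, not market data)
returns = pd.Series([0.0010, 0.0012, -0.0200, 0.0011, 0.0300])

# Round to 3 decimals so that nearby returns fall into the same bucket
rounded = returns.round(3)

# Series.mode() returns every most-frequent value; keep the first one
mode = rounded.mode().iloc[0]
print(f"Mode: {mode * 100:.2f} %")
```

The result depends on the bucket width, so the rounding precision should be reported alongside the mode.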


# Standard dispersion measurement

### Variance

In [None]:
# -------- Variance with numpy ------------
# Note: variance is expressed in squared return units,
# so the "%" sign below is only a display convention
var = np.var(df, axis=0)*100
print(f"Daily Variance: {'%.2f' % var} %")

# Annualization of the variance (variance scales linearly with time)
annual_var = var * 252
print(f"Annual Variance: {'%.2f' % annual_var} % ")

# Daily variance --> monthly variance
monthly_var = var * 21
print(f"Monthly Variance: {'%.2f' % monthly_var} %")

Daily Variance: 0.03 %
Annual Variance: 6.75 % 
Monthly Variance: 0.56 %
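
One detail worth knowing: `np.var` divides by `n` by default (population variance), whereas `np.cov` and the pandas `.var()` method divide by `n - 1` (sample variance). The sketch below shows the two conventions on a small hypothetical array; the numbers are illustrative only.

```python
import numpy as np

# Hypothetical daily returns (illustrative values)
x = np.array([0.010, -0.020, 0.015, 0.005, -0.010])
n = len(x)

# Population variance: divides by n (numpy default, ddof=0)
pop_var = np.var(x)

# Sample variance: divides by n - 1 (ddof=1)
sample_var = np.var(x, ddof=1)

print(f"Population variance: {pop_var:.6f}")
print(f"Sample variance:     {sample_var:.6f}")
```

With thousands of daily observations the two values are nearly identical, but the difference matters on short samples.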


### Standard deviation

In [None]:
# -------- Standard deviation with numpy ------------
std = np.std(df, axis=0)*100
print(f"Daily Volatility: {'%.2f' % std} %")

# Annualization of the volatility (scales with the square root of time)
annual_std = std * np.sqrt(252)
print(f"Annual Volatility: {'%.2f' % annual_std} % ")

# Daily volatility --> monthly volatility
monthly_std = std * np.sqrt(21)
print(f"Monthly Volatility: {'%.2f' % monthly_std} %")

Daily Volatility: 1.64 %
Annual Volatility: 25.98 % 
Monthly Volatility: 7.50 %
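
The square-root scaling used above holds when daily returns are independent with constant variance. As a rough check under that assumption, the sketch below simulates normally distributed daily returns (synthetic data, with a hypothetical 1.6 % daily volatility) and compares the volatility of 21-day sums with the rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# 20,000 simulated "months" of 21 independent daily returns each (synthetic data)
daily = rng.normal(loc=0.0, scale=0.016, size=(20_000, 21))

# Approximate each monthly return as the sum of its daily returns
monthly = daily.sum(axis=1)

empirical_monthly_vol = monthly.std()
rule_monthly_vol = daily.std() * np.sqrt(21)

print(f"Empirical monthly volatility: {empirical_monthly_vol * 100:.2f} %")
print(f"Square-root rule:             {rule_monthly_vol * 100:.2f} %")
```

Real returns show volatility clustering and some autocorrelation, so the rule is an approximation rather than an identity.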


### Skewness

In [None]:
# -------- Skewness with scipy ------------
skw = skew(df)
print(f"Skewness: {'%.2f' % skw} ")

Skewness: 0.47 
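
Skewness is the third standardized moment: the average cubed deviation from the mean, divided by the cubed standard deviation. The sketch below computes it by hand on synthetic right-skewed data and compares it with `scipy.stats.skew`, whose default (`bias=True`) uses the same population moments.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)

# Synthetic right-skewed sample (exponential draws, not market data)
sample = rng.exponential(size=10_000)

# Third standardized moment computed by hand (population moments)
centered = sample - sample.mean()
manual_skew = np.mean(centered**3) / np.std(sample) ** 3

scipy_skew = skew(sample)

print(f"Manual skewness: {manual_skew:.4f}")
print(f"scipy skewness:  {scipy_skew:.4f}")
```

A positive value means the right tail is longer: large gains are more extreme than large losses.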


### Kurtosis

In [None]:
# -------- Kurtosis with scipy (default: excess kurtosis, normal distribution = 0) ------------
kurto = kurtosis(df)
print(f"Kurtosis: {'%.2f' % kurto}")

Kurtosis: 9.68
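
When reading a kurtosis figure, it helps to know which definition is in use. `scipy.stats.kurtosis` returns the excess (Fisher) kurtosis by default, for which a normal distribution scores 0; passing `fisher=False` gives the Pearson definition, for which a normal distribution scores 3. The sketch below illustrates both on a synthetic normal sample.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)

# Synthetic normally distributed sample (not market data)
sample = rng.normal(size=100_000)

# Excess (Fisher) kurtosis: the scipy default
excess_kurtosis = kurtosis(sample)

# Pearson kurtosis: excess kurtosis + 3
pearson_kurtosis = kurtosis(sample, fisher=False)

print(f"Excess kurtosis:  {excess_kurtosis:.3f}")
print(f"Pearson kurtosis: {pearson_kurtosis:.3f}")
```

An excess kurtosis well above 0 indicates fatter tails than the normal distribution, meaning extreme returns occur more often than a normal model would predict.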


# Relationship measurement

### Variance-covariance matrix

In [None]:
# Import several assets
df = yf.download(["GOOG","EURUSD=X"])["Adj Close"].pct_change(1).dropna()

[*********************100%***********************]  2 of 2 completed


In [None]:
# Variance Covariance matrix
mat = np.cov(df, rowvar=False)
mat

array([[ 2.40234701e-05, -1.13876470e-06],
       [-1.13876470e-06,  2.58833065e-04]])

### Covariance

In [None]:
# Covariance between the two assets (off-diagonal element of the matrix)
mat[0, 1]

-1.1387646967552557e-06

### Correlation

In [None]:
# Correlation matrix
df.corr()

|          | EURUSD=X  | GOOG      |
|----------|-----------|-----------|
| EURUSD=X | 1.0       | -0.014441 |
| GOOG     | -0.014441 | 1.0       |
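
Correlation is covariance rescaled by the two standard deviations, which makes it unit-free and bounded between -1 and 1. The sketch below rebuilds a correlation matrix from a covariance matrix on synthetic two-asset returns (hypothetical data) and checks it against `np.corrcoef`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic returns for two hypothetical assets; the second depends partly on the first
asset_a = rng.normal(scale=0.01, size=1_000)
asset_b = 0.5 * asset_a + rng.normal(scale=0.01, size=1_000)
returns = np.column_stack([asset_a, asset_b])

# Variance-covariance matrix (columns are assets)
cov = np.cov(returns, rowvar=False)

# Divide each covariance by the product of the two standard deviations
vol = np.sqrt(np.diag(cov))
corr = cov / np.outer(vol, vol)

print(corr)
```

The diagonal is exactly 1 because each asset is perfectly correlated with itself.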


# EXERCISES

### Exercise 1: Compute the **annualized risk/return** pair for Microsoft stock (Yahoo symbol: MSFT). Don't forget to use the price variations (returns) rather than the raw prices

In [None]:
# Import the prices
df = yf.download("MSFT")["Adj Close"].pct_change(1).dropna()

# Compute risk return
mean = np.mean(df) * 252 * 100
vol = np.std(df) * np.sqrt(252) * 100

print(f"MSFT | \t returns: {'%.2f' % mean} % \t volatility: {'%.2f' % vol} %")

[*********************100%***********************]  1 of 1 completed
MSFT | 	 returns: 29.85 % 	 volatility: 33.81 %


### Exercise 2: Compute the covariance and the correlation matrix for the following assets: ["AMZN", "MSFT", "GOOG", "EURUSD=X", "BTC-USD"]

In [None]:
df = yf.download(["AMZN", "MSFT", "GOOG", "EURUSD=X", "BTC-USD"])["Adj Close"].pct_change(1).dropna()
df.cov()

[*********************100%***********************]  5 of 5 completed


|          | AMZN          | BTC-USD      | EURUSD=X      | GOOG          | MSFT     |
|----------|---------------|--------------|---------------|---------------|----------|
| AMZN     | 0.0002480388  | 4.2e-05      | -2.160787e-07 | 0.0001383901  | 0.00014  |
| BTC-USD  | 4.191048e-05  | 0.001521     | -1.706122e-06 | 5.484372e-05  | 6.5e-05  |
| EURUSD=X | -2.160787e-07 | -2e-06       | 1.793829e-05  | -8.180438e-07 | -1e-06   |
| GOOG     | 0.0001383901  | 5.5e-05      | -8.180438e-07 | 0.0001888534  | 0.000138 |
| MSFT     | 0.0001400978  | 6.5e-05      | -1.026322e-06 | 0.000138332   | 0.000196 |


In [None]:
df.corr()

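|          | AMZN      | BTC-USD  | EURUSD=X  | GOOG      | MSFT      |
|----------|-----------|----------|-----------|-----------|-----------|
| AMZN     | 1.0       | 0.068243 | -0.003239 | 0.639415  | 0.635379  |
| BTC-USD  | 0.068243  | 1.0      | -0.01033  | 0.102344  | 0.119769  |
| EURUSD=X | -0.003239 | -0.01033 | 1.0       | -0.014055 | -0.017308 |
| GOOG     | 0.639415  | 0.102344 | -0.014055 | 1.0       | 0.718987  |
| MSFT     | 0.635379  | 0.119769 | -0.017308 | 0.718987  | 1.0       |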
