<a href="https://colab.research.google.com/github/joanby/trading-algoritmico-estadistica-probabilidad/blob/main/ES_EyP_Cap%C3%ADtulo_01_Estad%C3%ADstica_Descriptiva.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<p><img alt="Colaboratory logo" height="45px" src="https://www.quantreo.com/wp-content/uploads/2021/10/Original-on-Transparent.png" align="left" hspace="10px" vspace="0px">
<img alt="Colaboratory logo" height="45px" src="https://static-881c.kxcdn.com/wp-content/uploads/2016/03/frogamesLogoFull4.png" align="left" hspace="10px" vspace="0px"></p>


# Descriptive Statistics
This chapter covers the most important statistics for describing a financial data set. These methods are used in portfolio management, financial analysis, and trading.

<br>

### After this chapter you will be able to:
* Compute and interpret the mean
* Compute and interpret the median
* Compute and interpret the mode
* Compute and interpret the variance
* Compute and interpret the standard deviation
* Compute and interpret the covariance
* Compute and interpret the variance-covariance matrix
* Compute and interpret the skewness
* Compute and interpret the kurtosis

<br>
<br>
<br>

### Exercises (Trading / Portfolio management):
* Compute the risk/return of a financial asset
* Correctly compute the correlation between assets


<br>
<br>

💰 Join the [Discord](https://discord.gg/wXjNPAc5BH) community

📚 You can read our book on [Amazon](https://www.amazon.com/gp/product/B09HG18CYL)

🖥️ The [Quantreo YouTube channel](https://www.youtube.com/channel/UCp7jckfiEglNf_Gj62VR0pw) (in English) and the [Frogames](https://www.youtube.com/channel/UCMUxXNYrVCv6-bQakhomvBg) channel (in Spanish)




In [17]:
!pip install yfinance

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [2]:
import numpy as np
import pandas as pd
import yfinance as yf
from scipy.stats import skew, kurtosis

In [2]:
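# Download GOOG prices and convert them to daily percentage returns
# Note: recent yfinance releases may adjust prices by default and omit the
# "Adj Close" column; if so, passing auto_adjust=False should restore it
df = yf.download("GOOG")["Adj Close"].pct_change(1).dropna()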
df

[*********************100%***********************]  1 of 1 completed


Date
2004-08-20    0.079430
2004-08-23    0.010064
2004-08-24   -0.041408
2004-08-25    0.010775
2004-08-26    0.018019
                ...   
2022-08-19   -0.022671
2022-08-22   -0.025821
2022-08-23   -0.002607
2022-08-24   -0.000610
2022-08-25    0.026155
Name: Adj Close, Length: 4536, dtype: float64

# Measures of central tendency

### Mean

In [5]:
# -------- Mean with numpy ------------
mean = np.mean(df, axis=0)*100
print(f"Daily mean: {'%.2f' % mean} %")

# Annualize the mean return (252 trading days per year)
annual_mean = mean * 252
print(f"Annual mean: {'%.2f' % annual_mean} % ")

# Daily mean return --> monthly mean return (21 trading days per month)
monthly_mean = mean * 21
print(f"Monthly mean: {'%.2f' % monthly_mean} %")

Daily mean: 0.10 %
Annual mean: 25.96 % 
Monthly mean: 2.16 %
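
Multiplying the daily mean by 252 is an arithmetic annualization that ignores compounding. As a minimal sketch of the difference, the block below compares it with a geometric (compounded) annualization. It uses synthetic returns, not the GOOG series, and the names `daily_returns`, `annual_arithmetic` and `annual_geometric` are illustrative assumptions.

```python
import numpy as np

# Synthetic daily returns (hypothetical data, not the GOOG series)
rng = np.random.default_rng(0)
daily_returns = rng.normal(loc=0.001, scale=0.02, size=2520)

mean_daily = daily_returns.mean()

# Arithmetic annualization: scale the daily mean by 252 trading days
annual_arithmetic = mean_daily * 252

# Geometric annualization: average daily growth factor compounded over 252 days
mean_growth = np.prod(1 + daily_returns) ** (1 / len(daily_returns))
annual_geometric = mean_growth ** 252 - 1

print(f"Arithmetic annual mean:  {annual_arithmetic * 100:.2f} %")
print(f"Geometric annual return: {annual_geometric * 100:.2f} %")
```

The arithmetic figure is the usual input for risk/return comparisons, while the geometric figure is closer to what a buy-and-hold investor would actually have earned per year.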


### Median

In [6]:
# -------- Median with numpy ------------
median = np.median(df, axis=0)*100
print(f"Daily median: {'%.2f' % median} %")

# Annualize the median return
# (rough convention: unlike the mean, the median of a sum is not the sum of the medians)
annual_median = median * 252
print(f"Annual median: {'%.2f' % annual_median} % ")

# Daily median return --> monthly median return
monthly_median = median * 21
print(f"Monthly median: {'%.2f' % monthly_median} %")

Daily median: 0.08 %
Annual median: 19.14 % 
Monthly median: 1.60 %
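
The median is lower than the mean here, which is what one would expect when a few large positive days pull the mean up. The following toy example, with made-up returns, sketches how a single extreme observation moves the mean far more than the median.

```python
import numpy as np

# Five ordinary daily returns (made-up values for illustration)
returns = np.array([0.01, -0.005, 0.002, 0.004, -0.003])

# Same sample plus one extreme +50 % day
with_outlier = np.append(returns, 0.50)

print(f"Without outlier: mean={np.mean(returns):.4f}, median={np.median(returns):.4f}")
print(f"With outlier:    mean={np.mean(with_outlier):.4f}, median={np.median(with_outlier):.4f}")
```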


### Percentiles

In [6]:
# -------- Percentiles with numpy ------------
centile_10 = np.quantile(df, 0.1, axis=0)*100
print(f"10% percentile: {'%.2f' % centile_10} %")

centile_50 = np.quantile(df, 0.5, axis=0)*100
print(f"50% percentile: {'%.2f' % centile_50} %")

centile_99 = np.quantile(df, 0.99, axis=0)*100
print(f"99% percentile: {'%.2f' % centile_99} %")

10% percentile: -1.90 %
50% percentile: 0.08 %
99% percentile: 5.68 %
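
The 10% percentile is the return below which roughly 10% of the observations fall, and the 50% percentile coincides with the median. As a small check on synthetic data (the sample and variable names are assumptions for illustration), `np.quantile` with a fraction and `np.percentile` with a percentage give the same value:

```python
import numpy as np

# Synthetic daily returns (hypothetical data)
rng = np.random.default_rng(1)
returns = rng.normal(loc=0.0, scale=0.02, size=1000)

q10 = np.quantile(returns, 0.10)   # quantile takes a fraction in [0, 1]
p10 = np.percentile(returns, 10)   # percentile takes a value in [0, 100]
q50 = np.quantile(returns, 0.50)   # the 50% quantile is the median

# Share of observations at or below the 10% quantile
share_below = np.mean(returns <= q10)

print(f"10% quantile: {q10 * 100:.2f} %  (share of days at or below: {share_below:.1%})")
print(f"50% quantile: {q50 * 100:.2f} %")
```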


# Measures of dispersion

### Variance

In [5]:
# -------- Variance with numpy ------------
# Variance is expressed in squared-return units, not in percent;
# the x100 factor is only a display scale
var = np.var(df, axis=0)*100
print(f"Daily variance (x100): {'%.2f' % var}")

# Annualize the variance (variance scales linearly with time)
annual_var = var * 252
print(f"Annual variance (x100): {'%.2f' % annual_var}")

# Daily variance --> monthly variance
monthly_var = var * 21
print(f"Monthly variance (x100): {'%.2f' % monthly_var}")

Daily variance (x100): 0.04
Annual variance (x100): 9.32
Monthly variance (x100): 0.78
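
`np.var` divides by the number of observations by default (`ddof=0`, the population variance), whereas the pandas `.var()` method and `np.cov` divide by n - 1 (the sample variance). The sketch below, on synthetic data with illustrative variable names, spells out both formulas so the difference is explicit.

```python
import numpy as np

# Synthetic daily returns (hypothetical data)
rng = np.random.default_rng(2)
returns = rng.normal(loc=0.0005, scale=0.02, size=500)

# Squared deviations from the mean
deviations = returns - returns.mean()
sum_sq = np.sum(deviations ** 2)

var_population = sum_sq / len(returns)        # same as np.var(returns)
var_sample = sum_sq / (len(returns) - 1)      # same as np.var(returns, ddof=1)

print(f"Population variance (ddof=0): {var_population:.8f}")
print(f"Sample variance (ddof=1):     {var_sample:.8f}")
```

With thousands of daily observations the two values are nearly identical, but the choice matters for short samples.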


### Standard Deviation

In [8]:
# -------- Standard deviation with numpy ------------
std = np.std(df, axis=0)*100
print(f"Daily volatility: {'%.2f' % std} %")

# Annualize the standard deviation (scales with the square root of time)
annual_std = std * np.sqrt(252)
print(f"Annual volatility: {'%.2f' % annual_std} % ")

# Daily standard deviation --> monthly standard deviation
monthly_std = std * np.sqrt(21)
print(f"Monthly volatility: {'%.2f' % monthly_std} %")

Daily volatility: 1.92 %
Annual volatility: 30.45 % 
Monthly volatility: 8.79 %
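
The variance is scaled by 252 while the standard deviation is scaled by the square root of 252. Both follow from the same assumption, namely that daily returns are independent with a constant variance, so that variances add up over time. A minimal sketch, using the rounded daily volatility from the output above as an illustrative input:

```python
import numpy as np

# Illustrative input: daily volatility of 1.92 % (rounded figure from the output above)
daily_std = 0.0192
daily_var = daily_std ** 2

# Under independent daily returns, variance adds up over 252 trading days
annual_var = daily_var * 252

# The standard deviation is the square root of the variance,
# so it scales with sqrt(252) instead of 252
annual_std = daily_std * np.sqrt(252)

print(f"Annual volatility: {annual_std * 100:.2f} %")
print(f"sqrt(annual variance): {np.sqrt(annual_var) * 100:.2f} %")
```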


### Skewness

In [9]:
# -------- Skewness with scipy ------------
skw = skew(df)
print(f"Skewness: {'%.2f' % skw} ")

Skewness: 0.74 
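
A positive skewness means the right tail of the return distribution is longer than the left one. With its default arguments, `scipy.stats.skew` returns the third standardized moment. The sketch below reproduces that formula by hand on synthetic data (the sample and names are assumptions for illustration).

```python
import numpy as np
from scipy.stats import skew

# Synthetic daily returns (hypothetical data)
rng = np.random.default_rng(3)
returns = rng.normal(loc=0.0, scale=0.02, size=1000)

# Standardize with the population standard deviation (ddof=0)
z = (returns - returns.mean()) / returns.std()

# Skewness is the mean of the cubed standardized returns
manual_skew = np.mean(z ** 3)

print(f"Manual skewness: {manual_skew:.4f}")
print(f"scipy skewness:  {skew(returns):.4f}")
```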


### Kurtosis

In [10]:
# -------- Kurtosis with scipy ------------
# scipy returns the excess (Fisher) kurtosis by default: 0 for a normal distribution
kurto = kurtosis(df)
print(f"Kurtosis: {'%.2f' % kurto}")

Kurtosis: 10.01
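
By default `scipy.stats.kurtosis` uses Fisher's definition, which subtracts 3 so that a normal distribution scores 0. A value near 10 therefore points to much fatter tails than a normal distribution. The following sketch, on synthetic fat-tailed data with illustrative names, shows both conventions side by side.

```python
import numpy as np
from scipy.stats import kurtosis

# Synthetic fat-tailed returns drawn from a Student-t distribution (hypothetical data)
rng = np.random.default_rng(7)
returns = rng.standard_t(df=5, size=5000) * 0.01

# Standardize with the population standard deviation (ddof=0)
z = (returns - returns.mean()) / returns.std()

pearson_kurtosis = np.mean(z ** 4)       # fourth standardized moment (normal = 3)
excess_kurtosis = pearson_kurtosis - 3   # Fisher definition (normal = 0)

print(f"Pearson kurtosis: {pearson_kurtosis:.2f}")
print(f"Excess kurtosis:  {excess_kurtosis:.2f}")
```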


# Measures of Relationship

### Variance-Covariance Matrix

In [9]:
# Download the assets and convert prices to daily percentage returns
df = yf.download(["GOOG","EURUSD=X", "MSFT", "AMZN", "TSLA"])["Adj Close"].pct_change(1).dropna()
df

[*********************100%***********************]  5 of 5 completed


| Date | AMZN | EURUSD=X | GOOG | MSFT | TSLA |
|---|---|---|---|---|---|
| 2010-06-30 | 0.005985 | 0.003584 | -0.020495 | -0.012870 | -0.002512 |
| 2010-07-01 | 0.015559 | 0.022689 | -0.012271 | 0.006519 | -0.078472 |
| 2010-07-02 | -0.016402 | 0.005079 | -0.006690 | 0.004750 | -0.125683 |
| 2010-07-05 | 0.000000 | -0.002708 | 0.000000 | 0.000000 | 0.000000 |
| 2010-07-06 | 0.008430 | 0.006943 | -0.001100 | 0.023636 | -0.160937 |
| ... | ... | ... | ... | ... | ... |
| 2022-08-19 | -0.028602 | -0.008869 | -0.022671 | -0.013854 | -0.020482 |
| 2022-08-22 | -0.036244 | -0.005419 | -0.025821 | -0.029355 | -0.022764 |
| 2022-08-23 | 0.003003 | -0.009542 | -0.002607 | -0.004716 | 0.022558 |
| 2022-08-24 | 0.001347 | 0.002761 | -0.000610 | -0.002351 | 0.002170 |


In [10]:
# Variance-covariance matrix (rowvar=False: each column is one asset)
mat = np.cov(df, rowvar=False)
mat

array([[ 3.97417617e-04,  3.41505354e-06,  1.91317670e-04,
         1.72788366e-04,  2.39209623e-04],
       [ 3.41505354e-06,  2.89706178e-05, -5.28828784e-07,
        -8.31587586e-07, -2.36092719e-06],
       [ 1.91317670e-04, -5.28828784e-07,  2.66107855e-04,
         1.63834408e-04,  1.89156466e-04],
       [ 1.72788366e-04, -8.31587586e-07,  1.63834408e-04,
         2.50931981e-04,  1.95463812e-04],
       [ 2.39209623e-04, -2.36092719e-06,  1.89156466e-04,
         1.95463812e-04,  1.24268123e-03]])

In [12]:
pd.DataFrame(mat*100, columns = df.columns, index = df.columns)

|  | AMZN | EURUSD=X | GOOG | MSFT | TSLA |
|---|---|---|---|---|---|
| AMZN | 0.039742 | 0.000342 | 0.019132 | 0.017279 | 0.023921 |
| EURUSD=X | 0.000342 | 0.002897 | -5.3e-05 | -8.3e-05 | -0.000236 |
| GOOG | 0.019132 | -5.3e-05 | 0.026611 | 0.016383 | 0.018916 |
| MSFT | 0.017279 | -8.3e-05 | 0.016383 | 0.025093 | 0.019546 |
| TSLA | 0.023921 | -0.000236 | 0.018916 | 0.019546 | 0.124268 |


### Covariance

In [16]:
# Covariance between AMZN (row 0) and GOOG (column 2)
mat[0][2]

0.0001913176703722284
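
Each off-diagonal entry of the matrix is the covariance between two assets, and each diagonal entry is the variance of one asset. As a sketch of what `np.cov` computes, the block below reproduces one entry by hand on two synthetic return series (`x` and `y` are hypothetical, not the AMZN and GOOG data). Note that `np.cov` divides by n - 1 by default.

```python
import numpy as np

# Two synthetic, positively related return series (hypothetical data)
rng = np.random.default_rng(4)
x = rng.normal(loc=0.0, scale=0.02, size=250)
y = 0.5 * x + rng.normal(loc=0.0, scale=0.01, size=250)

# Sample covariance: average product of deviations, dividing by n - 1
manual_cov = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

# Same quantity from the variance-covariance matrix (columns = assets)
mat_xy = np.cov(np.column_stack([x, y]), rowvar=False)

print(f"Manual covariance: {manual_cov:.8f}")
print(f"np.cov entry:      {mat_xy[0, 1]:.8f}")
```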

### Correlation

In [15]:
# Correlation matrix
df.corr()

|  | AMZN | EURUSD=X | GOOG | MSFT | TSLA |
|---|---|---|---|---|---|
| AMZN | 1.0 | 0.031827 | 0.588305 | 0.547158 | 0.340389 |
| EURUSD=X | 0.031827 | 1.0 | -0.006023 | -0.009753 | -0.012443 |
| GOOG | 0.588305 | -0.006023 | 1.0 | 0.634013 | 0.328937 |
| MSFT | 0.547158 | -0.009753 | 0.634013 | 1.0 | 0.350033 |
| TSLA | 0.340389 | -0.012443 | 0.328937 | 0.350033 | 1.0 |
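
Correlation is covariance rescaled by the two standard deviations, which is why it always lies between -1 and 1 and is easier to compare across assets than raw covariance. A minimal sketch on a synthetic DataFrame (the column names `A`, `B`, `C` are placeholders) that rebuilds the correlation matrix from the covariance matrix:

```python
import numpy as np
import pandas as pd

# Synthetic returns for three hypothetical assets
rng = np.random.default_rng(5)
data = pd.DataFrame(rng.normal(loc=0.0, scale=0.02, size=(300, 3)),
                    columns=["A", "B", "C"])

cov = data.cov()

# Standard deviations are the square roots of the diagonal variances
std = np.sqrt(np.diag(cov))

# corr[i, j] = cov[i, j] / (std[i] * std[j])
corr_from_cov = cov / np.outer(std, std)

print(corr_from_cov.round(4))
```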


# EXERCISES

### Exercise 1: Compute the **annualized risk (volatility) and return** pair for Microsoft's stock price (Yahoo symbol: MSFT). Remember to use the percentage changes of the price

In [12]:
# Warm-up with a different asset: BTC-USD trades every day,
# so the annualization uses 365 days instead of 252
df = yf.download("BTC-USD")["Adj Close"].pct_change(1).dropna()
print(df)
# Compute the annualized return and risk
mean = np.mean(df) * 365 * 100
vol = np.std(df) * np.sqrt(365) * 100

print(f"BTC-USD | \t return: {'%.2f' % mean} % \t volatility: {'%.2f' % vol} %")

[*********************100%***********************]  1 of 1 completed
Date
2014-09-18   -0.071926
2014-09-19   -0.069843
2014-09-20    0.035735
2014-09-21   -0.024659
2014-09-22    0.008352
                ...   
2022-08-21    0.017389
2022-08-22   -0.006279
2022-08-23    0.006037
2022-08-24   -0.006181
2022-08-25    0.009623
Name: Adj Close, Length: 2899, dtype: float64
BTC-USD | 	 return: 76.16 % 	 volatility: 73.92 %


In [13]:
# Download the prices and convert them to daily percentage returns
df = yf.download("MSFT")["Adj Close"].pct_change(1).dropna()
# Compute the annualized return and risk (252 trading days per year)
mean = np.mean(df) * 252 * 100
vol = np.std(df) * np.sqrt(252) * 100

print(f"MSFT | \t return: {'%.2f' % mean} % \t volatility: {'%.2f' % vol} %")

[*********************100%***********************]  1 of 1 completed
MSFT | 	 return: 28.87 % 	 volatility: 33.83 %
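
The two cells above repeat the same calculation with a different number of periods per year: 365 for an asset that trades every day and 252 for a stock. One way to avoid the duplication is a small helper. The function below is a hypothetical sketch, not part of the original notebook, and it is demonstrated on synthetic returns rather than downloaded prices.

```python
import numpy as np

def annualized_risk_return(returns, periods_per_year=252):
    """Return (annual mean return in %, annual volatility in %).

    Hypothetical helper: the mean scales with the number of periods,
    the volatility with its square root.
    """
    returns = np.asarray(returns, dtype=float)
    annual_return = returns.mean() * periods_per_year * 100
    annual_vol = returns.std() * np.sqrt(periods_per_year) * 100
    return annual_return, annual_vol

# Synthetic daily returns (hypothetical data)
rng = np.random.default_rng(6)
sample = rng.normal(loc=0.001, scale=0.02, size=1000)

ret_252, vol_252 = annualized_risk_return(sample, periods_per_year=252)
ret_365, vol_365 = annualized_risk_return(sample, periods_per_year=365)

print(f"252 periods | return: {ret_252:.2f} % | volatility: {vol_252:.2f} %")
print(f"365 periods | return: {ret_365:.2f} % | volatility: {vol_365:.2f} %")
```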


### Exercise 2: Compute the covariance matrix and the correlation matrix for the following assets: ["AMZN", "MSFT", "GOOG", "EURUSD=X", "BTC-USD"]

In [20]:
df = yf.download(["AMZN", "MSFT", "GOOG", "EURUSD=X", "BTC-USD"])["Adj Close"].pct_change(1).dropna()
df.cov()

[*********************100%***********************]  5 of 5 completed


|  | AMZN | BTC-USD | EURUSD=X | GOOG | MSFT |
|---|---|---|---|---|---|
| AMZN | 0.0002809726 | 7.3e-05 | 8.003295e-07 | 0.0001547634 | 0.0001548825 |
| BTC-USD | 7.335965e-05 | 0.00151 | -1.096928e-06 | 7.266364e-05 | 8.624681e-05 |
| EURUSD=X | 8.003295e-07 | -1e-06 | 1.808402e-05 | -9.039355e-07 | -9.750464e-07 |
| GOOG | 0.0001547634 | 7.3e-05 | -9.039355e-07 | 0.0002017117 | 0.0001490245 |
| MSFT | 0.0001548825 | 8.6e-05 | -9.750464e-07 | 0.0001490245 | 0.0002052922 |


In [21]:
df.corr()

|  | AMZN | BTC-USD | EURUSD=X | GOOG | MSFT |
|---|---|---|---|---|---|
| AMZN | 1.0 | 0.112608 | 0.011228 | 0.650086 | 0.644888 |
| BTC-USD | 0.112608 | 1.0 | -0.006637 | 0.131642 | 0.154881 |
| EURUSD=X | 0.011228 | -0.006637 | 1.0 | -0.014967 | -0.016003 |
| GOOG | 0.650086 | 0.131642 | -0.014967 | 1.0 | 0.732328 |
| MSFT | 0.644888 | 0.154881 | -0.016003 | 0.732328 | 1.0 |
