# Deviations from Normality

In [2]:
%load_ext autoreload
%autoreload 2

import pandas as pd
import edhec_risk_kit as erk



In [4]:
hfi = erk.get_hfi_returns()
hfi.head()

date,Convertible Arbitrage,CTA Global,Distressed Securities,Emerging Markets,Equity Market Neutral,Event Driven,Fixed Income Arbitrage,Global Macro,Long/Short Equity,Merger Arbitrage,Relative Value,Short Selling,Funds Of Funds
1997-01,0.0119,0.0393,0.0178,0.0791,0.0189,0.0213,0.0191,0.0573,0.0281,0.015,0.018,-0.0166,0.0317
1997-02,0.0123,0.0298,0.0122,0.0525,0.0101,0.0084,0.0122,0.0175,-0.0006,0.0034,0.0118,0.0426,0.0106
1997-03,0.0078,-0.0021,-0.0012,-0.012,0.0016,-0.0023,0.0109,-0.0119,-0.0084,0.006,0.001,0.0778,-0.0077
1997-04,0.0086,-0.017,0.003,0.0119,0.0119,-0.0005,0.013,0.0172,0.0084,-0.0001,0.0122,-0.0129,0.0009
1997-05,0.0156,-0.0015,0.0233,0.0315,0.0189,0.0346,0.0118,0.0108,0.0394,0.0197,0.0173,-0.0737,0.0275


In [6]:
(pd.concat([hfi.mean(), hfi.median(), hfi.mean()>hfi.median()], axis="columns")
 .rename(columns={0:"mean",
                  1:"median",
                  2:"mean > median"}))

,mean,median,mean > median
Convertible Arbitrage,0.005508,0.0065,False
CTA Global,0.004074,0.0014,True
Distressed Securities,0.006946,0.0089,False
Emerging Markets,0.006253,0.0096,False
Equity Market Neutral,0.004498,0.0051,False
Event Driven,0.006344,0.0084,False
Fixed Income Arbitrage,0.004365,0.0055,False
Global Macro,0.005403,0.0038,True
Long/Short Equity,0.006331,0.0079,False
Merger Arbitrage,0.005356,0.006,False


### Skewness

$$ S(R) = \frac{E [ (R-E(R))^3 ]}{\sigma_R^3}$$

1. De-mean the returns: subtract $E(R)$ from each return.
2. Cube the de-meaned returns.
3. Take the mean of the cubes (the $E$ in the numerator).
4. Divide by the volatility cubed, $\sigma_R^3$.
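The four steps above can be sketched in a few lines. This is a minimal illustration, not the source of `erk.skewness`, which is not shown here. It assumes the population standard deviation (`ddof=0`), and the two small Series are made up for the example.

```python
import pandas as pd

def skewness(r):
    """Skewness of a Series, or of each column of a DataFrame.

    Assumption: volatility is the population standard deviation (ddof=0).
    """
    demeaned = r - r.mean()                 # step 1: de-mean
    sigma = r.std(ddof=0)                   # volatility
    third_moment = (demeaned ** 3).mean()   # steps 2 and 3: cube, then average
    return third_moment / sigma ** 3        # step 4: divide by volatility cubed

# Made-up samples: one symmetric, one with a long right tail.
symmetric = pd.Series([-2.0, -1.0, 0.0, 1.0, 2.0])
right_tailed = pd.Series([0.0, 0.0, 0.0, 0.0, 10.0])

print(skewness(symmetric))     # 0.0
print(skewness(right_tailed))  # 1.5
```

The symmetric sample has zero skewness because the positive and negative cubes cancel. The single large positive value in the second sample dominates the cubed terms and makes the skewness positive.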

In [7]:
erk.skewness(hfi).sort_values()

Fixed Income Arbitrage   -3.940320
Convertible Arbitrage    -2.639592
Equity Market Neutral    -2.124435
Relative Value           -1.815470
Event Driven             -1.409154
Merger Arbitrage         -1.320083
Distressed Securities    -1.300842
Emerging Markets         -1.167067
Long/Short Equity        -0.390227
Funds Of Funds           -0.361783
CTA Global                0.173699
Short Selling             0.767975
Global Macro              0.982922
dtype: float64

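* A normal distribution has a skewness of 0.
* Negative skewness usually goes with median > mean, as it does for the negatively skewed indices in the table above.
    * The left tail is longer: large losses are more likely than large gains of the same size.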
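A small made-up sample illustrates the points above: a single large loss pulls the mean below the median and makes the skewness negative. The return values are invented for the example, and the skewness is computed with the population standard deviation.

```python
import numpy as np

# Made-up monthly returns: four small gains and one large loss.
returns = np.array([0.01, 0.01, 0.02, 0.02, -0.20])

mean = returns.mean()
median = np.median(returns)

# Skewness: mean of the cubed de-meaned returns over the volatility cubed.
demeaned = returns - mean
skew = (demeaned ** 3).mean() / returns.std() ** 3  # np.std defaults to ddof=0

print(f"mean={mean:.3f}, median={median:.3f}, skewness={skew:.2f}")
```

The relationship between the sign of the skewness and the ordering of mean and median is a tendency rather than a rule, so it is worth checking both numbers on real data.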

In [10]:
import scipy.stats
pd.Series(scipy.stats.skew(hfi)).sort_values()

6    -3.940320
0    -2.639592
4    -2.124435
10   -1.815470
5    -1.409154
9    -1.320083
2    -1.300842
3    -1.167067
8    -0.390227
12   -0.361783
1     0.173699
11    0.767975
7     0.982922
dtype: float64

In [12]:
hfi.shape

(263, 13)

In [11]:
import numpy as np

In [15]:
normal_rets = np.random.normal(0, 0.15, size=(263,1))

In [16]:
erk.skewness(normal_rets)

0.2937114582576718

### Kurtosis

$$ K(R) = \frac{E[ (R-E(R))^4 ]}{\sigma_R^4} $$

* Computed exactly like skewness, except that the de-meaned returns and the volatility are raised to the fourth power instead of the third.
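A minimal sketch of that computation follows. It is an illustration under the same assumption as before (population standard deviation, `ddof=0`), not the source of `erk.kurtosis`. Both samples are synthetic.

```python
import numpy as np
import pandas as pd

def kurtosis(r):
    """Kurtosis of a Series, or of each column of a DataFrame.

    Assumption: volatility is the population standard deviation (ddof=0).
    This is the raw kurtosis, so a normal distribution gives about 3.
    """
    demeaned = r - r.mean()
    sigma = r.std(ddof=0)
    fourth_moment = (demeaned ** 4).mean()
    return fourth_moment / sigma ** 4

# A large normal sample should land close to 3.
rng = np.random.default_rng(0)
normal_sample = pd.Series(rng.normal(0, 0.15, size=100_000))

# Made-up fat-tailed sample: mostly zeros with two extreme values.
fat_tailed = pd.Series([-10.0] + [0.0] * 98 + [10.0])

print(kurtosis(normal_sample))
print(kurtosis(fat_tailed))  # 50.0
```

Raising to the fourth power makes every deviation positive and weights extreme observations heavily, which is why two outliers among a hundred points are enough to push the kurtosis to 50.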

In [17]:
erk.kurtosis(normal_rets)

2.804183727574043

In [18]:
erk.kurtosis(hfi)

Convertible Arbitrage     23.280834
CTA Global                 2.952960
Distressed Securities      7.889983
Emerging Markets           9.250788
Equity Market Neutral     17.218555
Event Driven               8.035828
Fixed Income Arbitrage    29.842199
Global Macro               5.741679
Long/Short Equity          4.523893
Merger Arbitrage           8.738950
Relative Value            12.121208
Short Selling              6.117772
Funds Of Funds             7.070153
dtype: float64

Most of these values are far above 3, the kurtosis you would expect from normally distributed returns. CTA Global, at 2.95, is the only index close to it.

In [20]:
scipy.stats.kurtosis(normal_rets)

array([-0.19581627])

By default, `scipy.stats.kurtosis` returns the excess kurtosis: the kurtosis minus 3, the value expected for a normal distribution. Here, 2.804 - 3 = -0.196.
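The difference between the two conventions can be checked directly with the `fisher` parameter of `scipy.stats.kurtosis`. The sample below is synthetic and only serves to compare the two calls.

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(0)
sample = rng.normal(0, 0.15, size=1000)

# Default (fisher=True): excess kurtosis, i.e. 3 is subtracted.
excess = scipy.stats.kurtosis(sample)

# fisher=False: raw (Pearson) kurtosis, where a normal distribution gives 3.
raw = scipy.stats.kurtosis(sample, fisher=False)

print(excess, raw, raw - excess)
```

When comparing numbers from different libraries, it helps to confirm which convention each one uses before concluding that two results disagree.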

Are our returns normal or not? That is what the Jarque-Bera test is for.

In [21]:
scipy.stats.jarque_bera(normal_rets)

(4.201531245361322, 0.12236270871559851)

What do these numbers tell us?
* jb_value: the test statistic.
* p-value of the hypothesis test: the null hypothesis is that the sample we gave it has the skewness and kurtosis of a normal distribution.
    * If the p-value is above the chosen significance level (1% here), we cannot reject the hypothesis of normality.
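As a worked check, the test statistic can be reproduced by hand from the skewness and kurtosis computed earlier for `normal_rets`. The sketch uses the standard Jarque-Bera formula, $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$, and the fact that the statistic is compared against a chi-squared distribution with 2 degrees of freedom, whose upper-tail probability is $e^{-x/2}$. The three inputs are copied from the outputs above.

```python
import math

# Values taken from the notebook outputs above.
n = 263                      # number of observations in normal_rets
skew = 0.2937114582576718    # erk.skewness(normal_rets)
kurt = 2.804183727574043     # erk.kurtosis(normal_rets), raw (not excess)

# Jarque-Bera statistic.
jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)

# Upper-tail probability of a chi-squared distribution with 2 degrees of freedom.
p_value = math.exp(-jb / 2)

print(jb, p_value)  # approximately 4.2015 and 0.1224
```

Both numbers agree with the SciPy result above. Since the p-value of about 12% is well above 1%, normality is not rejected for this sample.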

In [23]:
scipy.stats.jarque_bera(hfi)  # not what we want: all 13 columns are pooled into one sample

(25656.585999171326, 0.0)
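The single result above suggests the whole table was treated as one sample. A sketch of the alternative, running the test once per column, is shown below on a synthetic two-column table. The column names and the helper `jb_pvalue` are made up for the example.

```python
import numpy as np
import pandas as pd
import scipy.stats

rng = np.random.default_rng(0)

# Synthetic stand-in for a returns table (not the hedge fund data).
df = pd.DataFrame({
    "normal_like": rng.normal(0, 0.02, size=500),
    "skewed": rng.exponential(0.02, size=500),
})

# Pooled: every value in the table is treated as one sample.
pooled_stat, pooled_p = scipy.stats.jarque_bera(df.to_numpy().ravel())

def jb_pvalue(col):
    """Hypothetical helper: Jarque-Bera p-value of a single column."""
    _, p = scipy.stats.jarque_bera(col.to_numpy())
    return p

# Per column: one p-value for each series.
per_column_p = df.apply(jb_pvalue)

print(pooled_p)
print(per_column_p)
```

The pooled test answers a different question (is the mixture of all columns normal?), which is rarely what is wanted when each column is a separate return series.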

In [24]:
erk.is_normal(normal_rets)

True

In [25]:
erk.is_normal(hfi)

False

In [26]:
hfi.aggregate(erk.is_normal)  # aggregate applies the function to each column

Convertible Arbitrage     False
CTA Global                 True
Distressed Securities     False
Emerging Markets          False
Equity Market Neutral     False
Event Driven              False
Fixed Income Arbitrage    False
Global Macro              False
Long/Short Equity         False
Merger Arbitrage          False
Relative Value            False
Short Selling             False
Funds Of Funds            False
dtype: bool
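The source of `erk.is_normal` is not shown here. A plausible minimal sketch, assuming it wraps `scipy.stats.jarque_bera` and compares the p-value with a 1% level, could look like the following. The function body, the `level` parameter, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import scipy.stats

def is_normal(r, level=0.01):
    """Return True if the Jarque-Bera test does not reject normality at `level`.

    Assumption: a sketch of what a helper like erk.is_normal might do.
    """
    _, p_value = scipy.stats.jarque_bera(np.asarray(r))
    return bool(p_value > level)

rng = np.random.default_rng(0)

# Synthetic stand-in for a returns table (not the hedge fund data).
df = pd.DataFrame({
    "normal_like": rng.normal(0, 0.02, size=500),
    "skewed": rng.exponential(0.02, size=500),
})

# aggregate calls is_normal once per column and returns one boolean per column.
result = df.aggregate(is_normal)
print(result)
```

A `True` here only means the test found no evidence against normality at the chosen level. It does not prove that the returns are normally distributed.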

In [27]:
ffme = erk.get_ffme_returns()
erk.skewness(ffme)

small_cap    4.410739
large_cap    0.233445
dtype: float64

In [28]:
erk.kurtosis(ffme)

small_cap    46.845008
large_cap    10.694654
dtype: float64

In [30]:
ffme.aggregate(erk.is_normal)

small_cap    False
large_cap    False
dtype: bool