<img src="http://hilpisch.com/tpq_logo.png" alt="The Python Quants" width="35%" align="right" border="0"><br>

# Python for Asset Management

### Debunking the Normality Assumption

&copy; Dr. Yves J. Hilpisch | The Python Quants GmbH

http://tpq.io | [training@tpq.io](mailto:training@tpq.io) | [@dyjh](http://twitter.com/dyjh)

## Normal Distribution

Topics of interest include:

* benchmark distribution
* return distributions
* normality tests

## Real Financial Data

**_Historical end-of-day financial time series data._**

See *Artificial Intelligence in Finance* (ch. 04) and the data set at `http://hilpisch.com/aiif_eikon_eod_data.csv`.

## Imports

In [None]:
import math
import numpy as np
import pandas as pd
import scipy.stats as scs
import statsmodels.api as sm
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8')  # style name requires Matplotlib >= 3.6

## Benchmark Data

In [None]:
N = 10000  # sample size

In [None]:
snrn = np.random.standard_normal(N)
snrn -= snrn.mean()  # moment matching
snrn /= snrn.std()  # moment matching

In [None]:
round(snrn.mean(), 4)

In [None]:
round(snrn.std(), 4)
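
Moment matching, as applied above, shifts and rescales the draws so that the sample mean and sample standard deviation equal 0 and 1 up to floating-point error. The following minimal sketch contrasts raw and moment-matched draws. The seeded `default_rng` generator and the names `raw_draws` and `matched` are assumptions made for reproducibility and are not part of the original notebook.

```python
import numpy as np

rng = np.random.default_rng(100)  # assumption: fixed seed for reproducibility
raw_draws = rng.standard_normal(10000)

# Raw draws only approximate the theoretical moments (0 and 1).
print(raw_draws.mean(), raw_draws.std())

# Moment matching: subtract the sample mean, divide by the sample std.
matched = (raw_draws - raw_draws.mean()) / raw_draws.std()
print(round(matched.mean(), 4), round(matched.std(), 4))
```

Only the first two moments are matched this way; skewness and kurtosis of the sample still deviate slightly from their theoretical values.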

In [None]:
plt.figure(figsize=(10, 6))
plt.hist(snrn, bins=35);

In [None]:
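numbers = np.ones(N) * 1.5  # start with all values at 1.5
split = int(0.25 * N)  # a quarter of the sample
numbers[split:3 * split] = -1  # middle half set to -1
numbers[3 * split:4 * split] = 0  # last quarter set to 0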

In [None]:
numbers -= numbers.mean()  # moment matching
numbers /= numbers.std()  # moment matching

In [None]:
round(numbers.mean(), 4)

In [None]:
round(numbers.std(), 4)
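
The `numbers` sample takes only three distinct values, yet after standardization its mean and standard deviation coincide with those of the normal benchmark. A short sketch, rebuilding the array so that the block runs on its own, makes this explicit with `np.unique`. The variable names `values` and `counts` are illustrative.

```python
import numpy as np

N = 10000
numbers = np.ones(N) * 1.5          # first quarter stays at 1.5
split = int(0.25 * N)
numbers[split:3 * split] = -1       # middle half
numbers[3 * split:4 * split] = 0    # last quarter

# Standardize to mean 0 and standard deviation 1.
numbers -= numbers.mean()
numbers /= numbers.std()

# Distinct values (sorted ascending) and how often each occurs.
values, counts = np.unique(numbers, return_counts=True)
for v, c in zip(values, counts):
    print(f'{v:8.4f} occurs {c / N:.0%} of the time')
```

Matching mean and standard deviation therefore says nothing about whether a sample is normally distributed.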

In [None]:
plt.figure(figsize=(10, 6))
plt.hist(numbers, bins=35);

In [None]:
def dN(x, mu, sigma):
    ''' Probability density function of a normal random variable x.
    '''
    z = (x - mu) / sigma
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi * sigma ** 2)
    return pdf
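
As a sanity check, the hand-written density can be compared with `scipy.stats.norm.pdf`, which takes the mean as `loc` and the standard deviation as `scale`. The sketch repeats `dN` so that it runs on its own. The grid and the parameter values `mu = 0.5` and `sigma = 2.0` are arbitrary choices for illustration.

```python
import math
import numpy as np
import scipy.stats as scs

def dN(x, mu, sigma):
    ''' Probability density function of a normal random variable x.
    '''
    z = (x - mu) / sigma
    return np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi * sigma ** 2)

x = np.linspace(-5, 5, 11)
mu, sigma = 0.5, 2.0  # arbitrary illustration values

# Both implementations should agree up to floating-point error.
same = np.allclose(dN(x, mu, sigma), scs.norm.pdf(x, loc=mu, scale=sigma))
print(same)
```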

In [None]:
def return_histogram(rets, title=''):
    ''' Plots a histogram of the returns.
    '''
    plt.figure(figsize=(10, 6))
    x = np.linspace(min(rets), max(rets), 100)
    plt.hist(np.array(rets), bins=50,
             density=True, label='frequency')
    y = dN(x, np.mean(rets), np.std(rets))
    plt.plot(x, y, linewidth=2, label='PDF')
    plt.xlabel('log returns')
    plt.ylabel('frequency/probability')
    plt.title(title)
    plt.legend()

In [None]:
return_histogram(snrn)

In [None]:
return_histogram(numbers)

## Normality Tests

In [None]:
def return_qqplot(rets, title=''):
    ''' Generates a Q-Q plot of the returns.
    '''
    fig = sm.qqplot(rets, line='s', alpha=0.5)
    fig.set_size_inches(10, 6)
    plt.title(title)
    plt.xlabel('theoretical quantiles')
    plt.ylabel('sample quantiles')

In [None]:
return_qqplot(snrn)

In [None]:
return_qqplot(numbers)
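
A Q-Q plot compares the sorted sample values with the quantiles a normal distribution would produce; for normally distributed data the points lie close to a straight line. A numerical counterpart, sketched here with `scipy.stats.probplot` instead of the `statsmodels` plot used above, is the correlation coefficient `r` of the least-squares line through those points. The seeded generator and the sample names are assumptions for illustration.

```python
import numpy as np
import scipy.stats as scs

rng = np.random.default_rng(100)  # assumption: fixed seed
normal_sample = rng.standard_normal(10000)

# Three-valued sample in the same proportions as `numbers`.
discrete_sample = np.repeat([1.5, -1.0, 0.0], [2500, 5000, 2500])

r_values = {}
for name, sample in [('normal', normal_sample),
                     ('discrete', discrete_sample)]:
    # probplot returns the quantile pairs and a least-squares fit.
    (theo_q, sample_q), (slope, intercept, r) = scs.probplot(sample, dist='norm')
    r_values[name] = r
    print(f'{name:>8}: r = {r:.4f}')
```

A value of `r` very close to 1 is consistent with normality, while the step-like discrete sample yields a visibly lower value.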

In [None]:
def print_statistics(rets):
    ''' Prints skewness, excess kurtosis, and normality-test p-values
    for the returns.
    '''
    print('RETURN SAMPLE STATISTICS')
    print('---------------------------------------------')
    print('Skew of Sample Log Returns {:9.6f}'.format(
                scs.skew(rets)))
    print('Skew Normal Test p-value   {:9.6f}'.format(
                scs.skewtest(rets)[1]))
    print('---------------------------------------------')
    print('Kurt of Sample Log Returns {:9.6f}'.format(
                scs.kurtosis(rets)))
    print('Kurt Normal Test p-value   {:9.6f}'.format(
                scs.kurtosistest(rets)[1]))
    print('---------------------------------------------')
    print('Normal Test p-value        {:9.6f}'.format(
                scs.normaltest(rets)[1]))
    print('---------------------------------------------')

In [None]:
print_statistics(snrn)

In [None]:
print_statistics(numbers)
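
The printed p-values are read against a significance level: a p-value below, say, 5% leads to rejecting the null hypothesis that the sample comes from a normal distribution (for the skew and kurtosis tests, that the respective moment matches the normal one). Note that `scs.kurtosis` reports excess kurtosis by default, so the normal benchmark is 0, not 3. A hedged sketch that returns the same statistics as a dictionary follows. The helper `normality_summary`, the 5% level `alpha`, and the uniform test sample are assumptions for illustration.

```python
import numpy as np
import scipy.stats as scs

def normality_summary(rets, alpha=0.05):
    ''' Hypothetical helper: returns the statistics of print_statistics
    as a dict plus a rejection flag at significance level alpha.
    '''
    p_normal = scs.normaltest(rets)[1]
    return {
        'skew': scs.skew(rets),
        'skew_p': scs.skewtest(rets)[1],
        'kurt': scs.kurtosis(rets),  # excess kurtosis (normal: 0)
        'kurt_p': scs.kurtosistest(rets)[1],
        'normal_p': p_normal,
        'reject_normality': bool(p_normal < alpha),
    }

rng = np.random.default_rng(100)  # assumption: fixed seed
uniform_sample = rng.uniform(-1, 1, 10000)  # flat density, thin tails
summary = normality_summary(uniform_sample)
print(summary)
```

Failing to reject is not proof of normality; it only means the sample does not contradict the assumption at the chosen level.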

## Real Financial Returns

In [None]:
raw = pd.read_csv('http://hilpisch.com/aiif_eikon_eod_data.csv',
                  index_col=0, parse_dates=True).dropna()

In [None]:
rets = np.log(raw / raw.shift(1)).dropna()  # log returns
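
Log returns are computed as the logarithm of the ratio of consecutive prices. One convenient property is that they add up over time, so the sum of the daily log returns equals the log return over the whole period. A minimal sketch with a made-up price series (the values and the column name `'X'` are hypothetical) illustrates this.

```python
import numpy as np
import pandas as pd

# Hypothetical prices, purely for illustration.
prices = pd.DataFrame({'X': [100.0, 102.0, 99.0, 101.0, 105.0]})

# Same formula as in the cell above; the first row is NaN and dropped.
log_rets = np.log(prices / prices.shift(1)).dropna()

# Log return over the full period, computed from first and last price.
total = np.log(prices['X'].iloc[-1] / prices['X'].iloc[0])

print(np.isclose(log_rets['X'].sum(), total))
```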

In [None]:
# rets = raw.pct_change().dropna()  # simple returns

In [None]:
# rets.head()

In [None]:
symbol = '.SPX'
# symbol = 'AAPL.O'
# symbol = 'GLD'

In [None]:
return_histogram(rets[symbol].values, symbol)

In [None]:
return_qqplot(rets[symbol].values, symbol)

In [None]:
symbols = ['.SPX', 'AMZN.O', 'EUR=', 'GLD']

In [None]:
for sym in symbols:
    print('\n{}'.format(sym))
    print(45 * '=')
    print_statistics(rets[sym].values)
