*This notebook is intellectual property of Auquan and is distributed under the [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License](https://creativecommons.org/licenses/by-nc-nd/4.0/legalcode). Any modification or distribution of this notebook without express permission of Auquan is prohibited and will result in legal prosecution.*

# Bonus Content

In [None]:
# Install yahoo finance to obtain historical market data
!pip install yfinance

## Log Normal Distribution

Let's go over one final distribution before we end this section, the log-normal distribution.
Consider a positive random variable $X$ such that the natural logarithm of $X$ is normally distributed. That is:

$\ln(X) \sim N(\mu, \sigma^2)$

Then $X$ follows a log normal distribution.
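
For reference, the density and first two moments that follow from this definition are given here. These are standard results stated without derivation, and they are not part of the original exercise. Note that $\mu$ and $\sigma$ are the mean and standard deviation of $\ln(X)$, not of $X$ itself.

$$f_X(x) = \frac{1}{x\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right), \quad x > 0$$

$$E[X] = e^{\mu + \sigma^2/2}, \qquad \mathrm{Var}(X) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}$$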

**Ex1: Create a class `LogNormalRandomVariable` that draws samples from a log-normal distribution whose underlying normal distribution has (mean, std) = (mu, sigma).**

You can again use the `np.random.normal()` function and apply `np.exp()` to draw your samples.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

class LogNormalRandomVariable():
    def __init__(self, mu = 0, sigma = 1):
        self.variableType = "LogNormal"
        self.mu = mu
        self.sigma = sigma
        return
    def draw(self, numberOfSamples):
        # Write your code here #
        samples = np.random.normal(self.mu, self.sigma, numberOfSamples)
        return np.exp(samples)
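
One way to sanity-check a sampler like this is to compare sample statistics with the values the definition predicts. The sketch below is only illustrative: `draw_lognormal` is a hypothetical stand-alone helper (not part of the notebook) that mirrors the exponentiate-a-normal approach, and the parameter values and seed are arbitrary assumptions.

```python
import numpy as np

# Hypothetical helper, not part of the notebook: draws lognormal samples the
# same way the exercise does (exponentiating normal draws), using a seeded
# generator so the check is reproducible.
def draw_lognormal(mu, sigma, n, seed=0):
    rng = np.random.default_rng(seed)
    return np.exp(rng.normal(mu, sigma, n))

mu, sigma = 0.5, 0.25  # arbitrary illustrative parameters
samples = draw_lognormal(mu, sigma, 200_000)

# Taking the log of the samples should recover the normal parameters.
log_mean = np.log(samples).mean()
log_std = np.log(samples).std()

# The mean of the samples themselves should be close to exp(mu + sigma^2 / 2).
theoretical_mean = np.exp(mu + sigma**2 / 2)

print("mean of log(samples):", log_mean)
print("std of log(samples): ", log_std)
print("sample mean:", samples.mean(), "| theoretical mean:", theoretical_mean)
```

The sample mean lands near $e^{\mu + \sigma^2/2}$ rather than $e^{\mu}$, which is a reminder that `mu` and `sigma` describe $\ln(X)$ and not $X$.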

**Ex2: Now use this to plot the pdf for a lognormal distribution with $\mu = 0$ and $\sigma = 1$. Overlay a normal distribution with $\mu = 0$ and $\sigma = 1$ on this.**


In [None]:
mu, sigma = 0, 1
n = 10000
normal_samples = np.random.normal(mu, sigma, n)

# Create log normal random variable
X = LogNormalRandomVariable(mu, sigma)

# Draw from the lognormal distribution
lognormal_samples = X.draw(n)

plt.figure(figsize=(7,7))
plt.hist(normal_samples, density=True, bins = 500, histtype='step', label = 'Normal')
plt.hist(lognormal_samples, density=True, bins=500, histtype='step', label = 'Lognormal')
plt.xlim(-5, 5)
plt.ylim(0, 0.8)
plt.legend()
plt.show();

What do you notice? Your chart should look similar to the plot below.


In [None]:
mu_1 = 0
sigma_1 = 1
x = np.linspace(-5, 5, 200)
# The lognormal density is only defined for x > 0
x_pos = x[x > 0]
y = (1 / (sigma_1 * np.sqrt(2 * np.pi))) * np.exp(-(x - mu_1)**2 / (2 * sigma_1**2))
z = (1 / (x_pos * sigma_1 * np.sqrt(2 * np.pi))) * np.exp(-(np.log(x_pos) - mu_1)**2 / (2 * sigma_1**2))
plt.figure(figsize=(7,7))
plt.plot(x, y, label='Normal')
plt.plot(x_pos, z, label='Lognormal')
plt.xlim(-5, 5)
plt.ylim(0, 0.8)
plt.xlabel('Value')
plt.ylabel('Probability density')
plt.legend()
plt.show()

The lognormal distribution has a higher peak than the normal distribution, and it is skewed to the right: most of its mass sits near zero, with a long tail toward large values. Often, we find that the log of changes in stock prices is closer to normal than the returns themselves, because of the skew in prices.
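
The shape can also be checked numerically. The following is a minimal sketch, assuming `scipy.stats.skew` as the skewness measure (positive values indicate a longer right tail); the sample size and seed are arbitrary choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only
normal_samples = rng.normal(0, 1, 100_000)
lognormal_samples = np.exp(rng.normal(0, 1, 100_000))

# Sample skewness: near 0 for a symmetric distribution,
# positive when the right tail is longer than the left.
normal_skew = stats.skew(normal_samples)
lognormal_skew = stats.skew(lognormal_samples)

print("normal skewness:   ", normal_skew)
print("lognormal skewness:", lognormal_skew)

# With a long right tail, the mean is pulled above the median.
print("lognormal median:", np.median(lognormal_samples))
print("lognormal mean:  ", lognormal_samples.mean())
```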

Let's see if we can fit a lognormal distribution to Apple prices. Try the Jarque-Bera test on the log of the change in stock prices. Since the log is undefined for negative values, we use the log of the price ratio, $\log(P(t+1)/P(t))$, rather than the log of the price difference.
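
To make the quantity concrete, here is a small sketch on made-up prices (the numbers are hypothetical, not market data). It shows that the log return equals $\log(1 + r)$ for a simple return $r$, and that log returns add up across periods.

```python
import numpy as np
import pandas as pd

# Hypothetical prices for illustration only (not real market data).
prices = pd.Series([100.0, 102.0, 99.0, 105.0])

simple_returns = (prices / prices.shift(1) - 1).dropna()
log_returns = np.log(prices / prices.shift(1)).dropna()

# A log return is log(1 + simple return), so it stays defined
# even when the price falls and the simple return is negative.
matches = np.allclose(log_returns, np.log1p(simple_returns))
print("log return == log(1 + simple return):", matches)

# Log returns are additive: their sum is the log of the overall price ratio.
total_log_return = log_returns.sum()
print(total_log_return, np.log(prices.iloc[-1] / prices.iloc[0]))
```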

In [None]:
import yfinance as yf

startDateStr = '2007-12-01'
endDateStr = '2017-12-01'
data = yf.download('AAPL', startDateStr, endDateStr)
prices = data.Close

In [None]:
log_returns = np.log( prices/prices.shift(1) )
log_returns = log_returns.dropna()
plt.hist(log_returns, bins = 100)
plt.xlabel('Value')
plt.ylabel('Occurrences')
plt.show()

In [None]:
returns = prices / prices.shift(1) - 1

In [None]:
import scipy
import scipy.stats

# Set a significance cutoff
cutoff = 0.01
# Get the p-value of the Jarque-Bera normality test
# (np.ravel flattens the values in case `prices` is a single-column DataFrame)
jb_stat, p_value = scipy.stats.jarque_bera(np.ravel(log_returns.values))
print("The JB test p-value is: ", p_value)
print("We reject the hypothesis that the data are normally distributed: ", p_value < cutoff)

Still no luck! You can see why fitting a distribution to market prices is no easy task.
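
To get a feel for how the Jarque-Bera test behaves, it can help to run it on synthetic data with a known distribution. The sketch below compares a normal sample with a heavy-tailed Student's t sample of similar scale. The choice of a t distribution with 3 degrees of freedom, the scale of 0.02, and the sample size are all illustrative assumptions; the notebook does not establish that heavy tails are what drives the rejection for AAPL.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 2500  # assumed sample size, roughly ten years of daily observations

synthetic = {
    "normal": rng.normal(0, 0.02, n),
    # Student's t with 3 degrees of freedom: heavier tails than a normal
    "heavy-tailed": 0.02 * rng.standard_t(df=3, size=n),
}

results = {}
for label, sample in synthetic.items():
    jb_stat, p_value = stats.jarque_bera(sample)
    excess_kurtosis = stats.kurtosis(sample)  # 0 for an exact normal
    results[label] = (p_value, excess_kurtosis)
    print(f"{label}: JB p-value = {p_value:.4g}, excess kurtosis = {excess_kurtosis:.2f}")
```

The test combines sample skewness and kurtosis, so a sample with large excess kurtosis will typically be rejected even when its histogram looks bell-shaped.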