In [1]:
from scipy import stats
import numpy as np

Please refer to class notes for the details of the calculations here.

### Discrete Random Variables

In [None]:
# What's the probability that BTC goes up in exactly 2 out of 12 months?
# Assume the probability of going up in each month is 0.5
stats.binom.pmf(k = 2, n = 12, p = 0.5)

0.016113281249999997

What's the probability that BTC goes up in 0, 1, or 2 months out of 12?

In [None]:
stats.binom.pmf(k = [0,1,2], n = 12, p = 0.5).sum()

0.019287109374999997

We can also answer the above questions using the CDF.

In [None]:
# The CDF adds the probabilities up to and including the specified point: P(X <= k)
stats.binom.cdf(k = 2, n = 12, p = 0.5)

0.019287109375
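The agreement between the summed PMF and the CDF can be checked directly. The sketch below reuses the same binomial parameters and also uses the survival function `sf` for the complementary event (3 or more up months). The variable names are illustrative and not from the class notes.

```python
import numpy as np
from scipy import stats

n, p = 12, 0.5

# Sum of the PMF over k = 0, 1, 2
pmf_sum = stats.binom.pmf(k=[0, 1, 2], n=n, p=p).sum()

# CDF evaluated at k = 2, i.e. P(X <= 2)
cdf_at_2 = stats.binom.cdf(k=2, n=n, p=p)

# Survival function: P(X > 2) = 1 - P(X <= 2)
prob_3_or_more = stats.binom.sf(k=2, n=n, p=p)

# Both comparisons hold up to floating-point rounding
print(np.isclose(pmf_sum, cdf_at_2))
print(np.isclose(cdf_at_2 + prob_3_or_more, 1.0))
```

The two results differ only in the last few digits, which is ordinary floating-point rounding and not a disagreement between the methods.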

### Continuous Random Variables

For a continuous random variable, `pdf` returns the *density* at a point, not a probability. Probabilities come from areas under the density.

In [2]:
# PDF of a uniform random variable between -1 and 1
# PDF of X ~ Unif(-1, 1)
stats.uniform.pdf(0, loc = -1, scale = 2)

0.5

However, the cdf has the same meaning as before.

In [None]:
# CDF of a uniform random variable between -1 and 1
# CDF of X ~ Unif(-1, 1)
stats.uniform.cdf(-0.5, loc = -1, scale = 2)

0.25

In [None]:
# What's the probability that 0 <= X <= 0.25
stats.uniform.cdf(0.25, loc = -1, scale = 2)-stats.uniform.cdf(0, loc = -1, scale = 2)

0.125

In [None]:
# What's the probability that 0 < X < 0.25
stats.uniform.cdf(0.25, loc = -1, scale = 2)-stats.uniform.cdf(0, loc = -1, scale = 2)

0.125

NOTE: Whether the endpoints are included makes no difference to the probability for a continuous random variable. (It does matter for a discrete random variable, where a single point can carry positive probability.)
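To see why endpoints matter in the discrete case, here is a small sketch using the binomial example from earlier (n = 12, p = 0.5). It compares an interval that includes 0 with one that excludes it. The variable names are hypothetical.

```python
from scipy import stats

n, p = 12, 0.5

# P(0 <= X <= 2): includes the endpoint 0
closed = stats.binom.cdf(2, n, p) - stats.binom.cdf(-1, n, p)

# P(0 < X <= 2): excludes the endpoint 0
half_open = stats.binom.cdf(2, n, p) - stats.binom.cdf(0, n, p)

# The difference is exactly the probability mass sitting at the excluded point
gap = closed - half_open
mass_at_0 = stats.binom.pmf(0, n, p)

print(closed, half_open, gap, mass_at_0)
```

For a continuous distribution, the analogous mass at a single point is zero, which is why the two uniform calculations above give the same answer.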

In [None]:
# Suppose X follows a standard normal distribution.
# What is the probability that -1 <= X <= 1?
stats.norm.cdf(1) - stats.norm.cdf(-1)

0.6826894921370859

In [None]:
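# What is the probability that -1 <= X <= 1 or 2 <= X <= 3?
# Answer: the two windows are disjoint, so add up their individual probabilities
(stats.norm.cdf(1) - stats.norm.cdf(-1)) + (stats.norm.cdf(3) - stats.norm.cdf(2))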

In [3]:
# Quantiles for the standard normal distribution
# Here's the median
stats.norm.ppf(0.5)

0.0

And here's the 97.5% quantile.

In [None]:
stats.norm.ppf(0.975)

1.959963984540054

And the 2.5% quantile.

In [5]:
stats.norm.ppf(0.025)

-1.9599639845400545
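The quantile function `ppf` is the inverse of the CDF, so feeding a quantile back into `cdf` should recover the original probability. The sketch below checks this and also confirms that the 2.5% and 97.5% quantiles bracket the central 95% of the standard normal. Variable names are illustrative.

```python
from scipy import stats

lower = stats.norm.ppf(0.025)
upper = stats.norm.ppf(0.975)

# Applying the CDF to a quantile recovers the probability we started from
round_trip = stats.norm.cdf(upper)

# Probability mass between the 2.5% and 97.5% quantiles
central_mass = stats.norm.cdf(upper) - stats.norm.cdf(lower)

print(round_trip, central_mass)
```

Because the standard normal is symmetric about 0, the two quantiles are mirror images of each other up to rounding.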

#### Expected Value Calculation

These calculations only work for discrete random variables. Recall why we can't do this for continuous random variables.

In [29]:
x_values = np.array([-0.3, 0, 0.1, 0.2, 0.5])
x_prob = np.array([0.05, 0.20, 0.50, 0.20, 0.05])

We generally label expected values using $\mu$.

In [30]:
mu_x = (x_values * x_prob).sum()
mu_x

0.1

In [9]:
# Expected value of a binomial random variable
# Y: values from 0 to 12, p = 0.5
y_values = np.arange(0, 13)
y_prob = stats.binom.pmf(k = y_values, n = 12, p = 0.5)

In [11]:
mu_y = (y_values * y_prob).sum()
mu_y

6.000000000000002
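As a cross-check on the weighted sum, the binomial mean has the closed form $n p$, and SciPy exposes it through `stats.binom.mean`. The sketch below compares all three. It redefines the arrays so it can run on its own.

```python
import numpy as np
from scipy import stats

n, p = 12, 0.5
y_values = np.arange(0, n + 1)
y_prob = stats.binom.pmf(k=y_values, n=n, p=p)

# Expected value as a probability-weighted sum over all outcomes
mu_from_sum = (y_values * y_prob).sum()

# Closed-form mean of a binomial random variable
mu_closed_form = n * p

# The same quantity as reported by SciPy
mu_from_scipy = stats.binom.mean(n, p)

print(mu_from_sum, mu_closed_form, mu_from_scipy)
```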

Below, we show the expected value of $X^2$, but the procedure works for *any* function of a random variable.

In [12]:
# Expected Value of X^2
(x_values**2 * x_prob).sum()

0.030000000000000002

In [13]:
# Expected Value of |X|
(np.abs(x_values) * x_prob).sum()

0.13
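For a continuous random variable, the weighted sum is replaced by an integral of the value times the density. The sketch below does this numerically for $X \sim \text{Unif}(-1, 1)$ from earlier, using `scipy.integrate.quad`. This is one possible approach and is not taken from the class notes; the helper name `density` is hypothetical.

```python
from scipy import integrate, stats

# X ~ Unif(-1, 1), parameterized in SciPy as loc = -1, scale = 2
def density(x):
    return stats.uniform.pdf(x, loc=-1, scale=2)

# E[X] = integral of x * f(x) over the support [-1, 1]
mean_x, _ = integrate.quad(lambda x: x * density(x), -1, 1)

# E[X^2] = integral of x^2 * f(x) over the support [-1, 1]
mean_x_sq, _ = integrate.quad(lambda x: x**2 * density(x), -1, 1)

# Variance via E[X^2] - (E[X])^2
var_x = mean_x_sq - mean_x**2

print(mean_x, mean_x_sq, var_x)
```

`quad` returns a pair (estimate, error bound), and only the estimate is kept here.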

#### Variance and Standard Deviation

In [14]:
((x_values - mu_x)**2 * x_prob).sum()

0.020000000000000004

We generally label the standard deviation using $\sigma$.

In [32]:
sigma_x = np.sqrt(((x_values - mu_x)**2 * x_prob).sum())
sigma_x

0.14142135623730953

Here's an alternate way to calculate the variance.

In [17]:
(x_values**2 * x_prob).sum() - (x_values * x_prob).sum()**2

0.02

Suppose $X$ follows a uniform distribution with endpoints 3 and 7.

By default, the `stats` method returns only the mean and variance. Pass the `moments` argument (for example `moments='mvsk'`) to also get skewness and kurtosis. Check the documentation for details.

In [18]:
stats.uniform.stats(loc=3, scale=4)

(5.0, 1.3333333333333333)
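Here is a short sketch of requesting all four moments at once with `moments="mvsk"`. Note that SciPy reports *excess* kurtosis (kurtosis minus 3), which for any uniform distribution is $-6/5$. The unpacked variable names are illustrative.

```python
from scipy import stats

# 'm' = mean, 'v' = variance, 's' = skewness, 'k' = excess kurtosis
mean, var, skew, excess_kurt = stats.uniform.stats(loc=3, scale=4, moments="mvsk")

# Closed forms for Unif(a, b) with a = 3, b = 7, for comparison
a, b = 3, 7
mean_closed = (a + b) / 2
var_closed = (b - a) ** 2 / 12

print(float(mean), float(var), float(skew), float(excess_kurt))
print(mean_closed, var_closed)
```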

If $X \sim N(1, 4)$ (mean 1 and variance 4, so the standard deviation passed as `scale` is 2), what is $P(-1.5 \le X \le 2)$?

In [19]:
stats.norm.cdf(2, loc = 1, scale = 2) - stats.norm.cdf(-1.5, loc = 1, scale = 2)

0.5858126876071578
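The same probability can be obtained by standardizing: subtract the mean and divide by the standard deviation, then use the standard normal CDF. The sketch below shows both routes side by side. Variable names are illustrative.

```python
from scipy import stats

mu, sigma = 1, 2  # N(1, 4): variance 4, so standard deviation 2

# Convert the interval endpoints to z-scores
z_low = (-1.5 - mu) / sigma   # -1.25
z_high = (2 - mu) / sigma     # 0.5

# Route 1: standard normal CDF evaluated at the z-scores
prob_standardized = stats.norm.cdf(z_high) - stats.norm.cdf(z_low)

# Route 2: let SciPy handle the shift and scale via loc and scale
prob_direct = (stats.norm.cdf(2, loc=mu, scale=sigma)
               - stats.norm.cdf(-1.5, loc=mu, scale=sigma))

print(prob_standardized, prob_direct)
```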

$R_A \sim N(0.02, 0.10^2)$ and $R_B \sim N(0.01, 0.05^2)$

What is the probability of losing at least 10\% in each stock?

In [20]:
stats.norm.cdf(-0.10, loc=0.02, scale=0.10)

0.11506967022170822

In [21]:
stats.norm.cdf(-0.10, loc=0.01, scale=0.05)

0.013903447513498616
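The two cells above treat each stock separately. A related question is the probability that *at least one* of the two stocks loses 10% or more. Answering it needs the joint behavior of $R_A$ and $R_B$, which is not given here. The sketch below assumes the two returns are independent, which is an assumption made purely for illustration.

```python
from scipy import stats

# Marginal probabilities of losing at least 10%
p_a = stats.norm.cdf(-0.10, loc=0.02, scale=0.10)
p_b = stats.norm.cdf(-0.10, loc=0.01, scale=0.05)

# ASSUMPTION (not from the notes): R_A and R_B are independent,
# so the probability that both lose at least 10% is the product
p_both = p_a * p_b

# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)
p_at_least_one = p_a + p_b - p_both

print(p_at_least_one)
```

If the returns were positively correlated, the overlap term would be larger than the product and the probability of at least one loss would be smaller than this figure.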

What's the probability of losing at least 100% in a stock if the simple return is normally distributed with a mean of 1% and standard deviation of 15%?

In [23]:
stats.norm.cdf(-1, loc=0.01, scale=0.15)

8.290979087206774e-12

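Calculating skewness and kurtosis for discrete random variables is similar to how we calculated variance. The cells below compute the third and fourth central moments; dividing them by $\sigma^3$ and $\sigma^4$ respectively gives the skewness and kurtosis. The bottom line is: if you know how to compute expected value, you can do the others.

In [33]:
# Third central moment of X (divide by sigma_x**3 to get the skewness):
((x_values - mu_x)**3 * x_prob).sum()

0.0

In [34]:
# Fourth central moment of X (divide by sigma_x**4 to get the kurtosis):
((x_values - mu_x)**4 * x_prob).sum()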

0.0026000000000000007
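The standardized versions divide the central moments by the matching power of the standard deviation. The sketch below does this for the same discrete distribution. It redefines the arrays so it runs on its own, and the names `m3`, `m4`, and `excess_kurtosis` are illustrative.

```python
import numpy as np

x_values = np.array([-0.3, 0, 0.1, 0.2, 0.5])
x_prob = np.array([0.05, 0.20, 0.50, 0.20, 0.05])

mu_x = (x_values * x_prob).sum()
sigma_x = np.sqrt(((x_values - mu_x)**2 * x_prob).sum())

# Third and fourth central moments
m3 = ((x_values - mu_x)**3 * x_prob).sum()
m4 = ((x_values - mu_x)**4 * x_prob).sum()

# Standardized moments
skewness = m3 / sigma_x**3
kurtosis = m4 / sigma_x**4

# Excess kurtosis compares against the normal distribution's kurtosis of 3
excess_kurtosis = kurtosis - 3

print(skewness, kurtosis, excess_kurtosis)
```

A kurtosis above 3 indicates heavier tails than a normal distribution with the same variance.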

In [35]:
# Comparing normal & t distributions:
stats.norm.cdf(-2)

0.022750131948179195

In [40]:
stats.t.cdf(-2, df=100)

0.02410608936556682
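The t distribution has heavier tails than the normal, and the gap shrinks as the degrees of freedom grow. The sketch below evaluates the same left-tail probability for several degrees of freedom. The particular values in `dfs` are arbitrary choices for illustration.

```python
import numpy as np
from scipy import stats

# Arbitrary degrees of freedom chosen for illustration
dfs = np.array([5, 30, 100, 1000])

# P(T <= -2) for each choice of degrees of freedom
t_tail = stats.t.cdf(-2, df=dfs)

# P(Z <= -2) for the standard normal
normal_tail = stats.norm.cdf(-2)

# Extra left-tail probability of the t distribution relative to the normal
gap = t_tail - normal_tail

for df, tail, g in zip(dfs, t_tail, gap):
    print(df, tail, g)
```

The gap stays positive but shrinks toward zero as the degrees of freedom increase.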