## In-class notebook: 2024-01-22

In this notebook, we will look at some common uses of classical statistical inference. We first look at empirical estimates of error bars, then look at hypothesis testing and ways to compare distributions.

This notebook is intended to support Sections 4.6-4.9 of the textbook, and material is taken from the following astroML notebooks:

* https://github.com/astroML/astroML-notebooks/blob/main/chapter4/astroml_chapter4_Hypothesis_testing.ipynb
* https://github.com/astroML/astroML-notebooks/blob/main/chapter4/astroml_chapter4_Comparison_of_distributions.ipynb

In [None]:
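import numpy as np
from matplotlib import pyplot as plt
from scipy import stats        # used below as stats.norm, stats.kstest, stats.ks_2samp
from scipy.stats import norm   # used below as norm(mu, sigma).pdf(x)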

## Hypothesis testing

### Rejecting a null hypothesis

We flip a coin eight times and get six tails; should we reject the hypothesis that the coin is fair? We take as the null hypothesis that the coin is indeed fair. Recall that the probability of getting $k$ tails in $N$ flips, when each flip gives tails with probability $b$, follows the binomial distribution,

$$ p(k|b,N) = \frac{N!}{k!(N-k)!} b^k (1-b)^{N-k}. $$

Since p-values are defined as the probability that something *at least* as extreme as your data could have occurred (assuming the null hypothesis is correct), we can find the p-value by adding the probability of 6/8, 7/8, and 8/8 tails.

$$ \frac{8!}{6!2!}\left(\frac{1}{2}\right)^6 \left(\frac{1}{2}\right)^2 + \frac{8!}{7!1!}\left(\frac{1}{2}\right)^7 \left(\frac{1}{2}\right)^1 + \frac{8!}{8!0!}\left(\frac{1}{2}\right)^8 \left(\frac{1}{2}\right)^0 $$

The sum is $37/256 \approx 0.145$. Since this p-value is larger than 0.05, we cannot reject the null hypothesis at the 0.05 significance level.
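
As a quick check of the arithmetic for the coin-flip p-value, the sum can be evaluated numerically. This is a minimal sketch that assumes `scipy` is installed; the variable names (`k_obs`, `p_sum`, `p_sf`) are illustrative and not part of the original notebook.

```python
from math import comb
from scipy.stats import binom

N = 8       # number of flips
k_obs = 6   # observed number of tails
b = 0.5     # probability of tails under the null hypothesis (fair coin)

# Direct sum of the binomial probabilities for k = 6, 7, 8
p_sum = sum(comb(N, k) * b**k * (1 - b)**(N - k) for k in range(k_obs, N + 1))

# Same quantity from the survival function: P(K > k_obs - 1) = P(K >= k_obs)
p_sf = binom.sf(k_obs - 1, N, b)

print(f"p-value (direct sum): {p_sum:.4f}")
print(f"p-value (binom.sf):   {p_sf:.4f}")
```

Both lines give the one-sided p-value. A two-sided test, which also counts outcomes with two or fewer tails, would give twice this value for a fair coin.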


### Hypothesis testing and classification

This is the example from class.

Assume that the background counts follow $h_B(x) = \mathcal{N} (\mu = 100, \sigma = 10)$ and the source counts follow $h_S(x) = \mathcal{N} (\mu = 150, \sigma = 12)$, with a source fraction $a = 0.1$ and $N = 10^6$ (an image with 1000 x 1000 resolution elements; the $x$ values correspond to the sum of background and source counts). We plot the two distributions, weighted by $1 - a$ and $a$ respectively, below.

In [None]:
# Generate and draw the curves
x = np.linspace(50, 200, 1000)
p1 = 0.9 * norm(100, 10).pdf(x)
p2 = 0.1 * norm(150, 12).pdf(x)

# plot the distributions
fig, ax = plt.subplots(figsize=(14, 8))
ax.fill(x, p1, ec='k', fc='#AAAAAA', alpha=0.5)
ax.fill(x, p2, ec='k', fc='#AAAAAA', alpha=0.5)

# plot x_c = 120
ax.plot([120, 120], [0.0, 0.04], '--k')

ax.text(100, 0.036, r'$h_B(x)$', ha='center', va='bottom', fontsize = 14)
ax.text(150, 0.0035, r'$h_S(x)$', ha='center', va='bottom', fontsize = 14)
ax.text(122, 0.039, r'$x_c=120$', ha='left', va='top', fontsize = 14)
ax.text(125, 0.01, r'$(x > x_c\ {\rm classified\ as\ sources})$', fontsize = 14)

ax.set_xlim(50, 200)
ax.set_ylim(0, 0.04)

ax.set_xlabel('$x$', fontsize = 14)
ax.set_ylabel('$p(x)$', fontsize = 14)
plt.show()

If we naively choose $x_c = 120$ (a "$2\sigma$ cut" above the mean of $h_B$, corresponding to a Type I error probability of $\alpha = 0.024$, roughly the one-sided tail $(1 - 95.45\%)/2$), then $\alpha (1-a) N = 0.024 \times 900{,}000$, so **21,600 background values will be incorrectly classified as sources!** The sample completeness for this value of $x_c$ is 0.994, so **99,400 of the 100,000 source values are correctly classified as sources.** Although the Type I error rate is only 0.024, the sample contamination is 21,600/(21,600+99,400) = 0.179, which is over 7 times higher than $\alpha$!
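
The numbers for the $x_c = 120$ cut can be reproduced from the two Gaussians directly. The sketch below is one way to do it, assuming `scipy` is available; the variable names are hypothetical. Because it uses the exact Gaussian tail probabilities, the results come out slightly lower than the rounded values quoted in the text (0.024 and 21,600).

```python
from scipy.stats import norm

N = 10**6    # number of resolution elements
a = 0.1      # fraction of elements that contain a source
x_c = 120    # classification threshold

h_B = norm(100, 10)   # background distribution
h_S = norm(150, 12)   # source distribution

# Type I error rate: fraction of background values above the cut
alpha = h_B.sf(x_c)
# Completeness: fraction of source values above the cut
completeness = h_S.sf(x_c)

n_false = alpha * (1 - a) * N      # background classified as source
n_true = completeness * a * N      # sources classified as source
contamination = n_false / (n_false + n_true)

print(f"alpha         = {alpha:.4f}")
print(f"completeness  = {completeness:.4f}")
print(f"false sources = {n_false:.0f}, true sources = {n_true:.0f}")
print(f"contamination = {contamination:.3f}")
```

Changing `x_c` in this sketch shows the trade-off: raising the threshold lowers the contamination but also lowers the completeness.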

## Comparison of distributions: KS-tests


In [None]:
np.random.seed(4)
plt.figure(figsize=(12, 8))

plt.step(np.sort(stats.norm.rvs(0,3,25)), np.linspace(0, 1, 25) ,lw = 3)
plt.plot(np.sort(stats.norm.rvs(0,3,1000)), np.linspace(0, 1, 1000), lw=3)

plt.annotate("", xy=(2.3, 0.965), xytext=(2.3, 0.77),
            arrowprops=dict(arrowstyle="<->",lw=2))

plt.text(2.6,0.86, "D", fontsize = 20)

plt.legend(['CDF 1', 'CDF 2'])
plt.title('Comparing CDFs for K-S test')
plt.show()
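
The quantity labeled D in the figure is the maximum vertical distance between the two cumulative distributions. As a rough illustration, the sketch below computes D by hand for two samples and compares it with the statistic returned by `stats.ks_2samp`. The samples and variable names are assumptions made for this example (they are not the same random draws as in the figure).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
s1 = rng.normal(0, 3, 25)
s2 = rng.normal(0, 3, 1000)

# Evaluate both empirical CDFs at every point of the pooled sample
pooled = np.sort(np.concatenate([s1, s2]))
cdf1 = np.searchsorted(np.sort(s1), pooled, side='right') / s1.size
cdf2 = np.searchsorted(np.sort(s2), pooled, side='right') / s2.size

# D is the largest absolute difference between the two empirical CDFs
D_manual = np.max(np.abs(cdf1 - cdf2))
D_scipy = stats.ks_2samp(s1, s2).statistic

print(f"D (by hand): {D_manual:.4f}")
print(f"D (scipy):   {D_scipy:.4f}")
```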

In [None]:
np.random.seed(0)
vals = np.random.normal(loc=0, scale=1, size= 1000)

print(f'Normal: {stats.kstest(vals, "norm")}')
print(f'Uniform: {stats.kstest(vals, "uniform")}')


In [None]:
np.random.seed(0)
sample1 = np.random.uniform(low=0.0, high=1.0,size=100)
sample2 = np.random.normal(loc=0.0, scale=1.0,size=110)
sample3 = np.random.normal(loc=0.0, scale=1.0,size=95)

print(f'Uniform vs. Normal: {stats.ks_2samp(sample1, sample2)}')
print(f'Normal vs. Normal: {stats.ks_2samp(sample2, sample3)}')

Lastly, we'll show an example using `scipy.stats.kstwo`, the distribution of the two-sided KS test statistic for a sample of size `n`. As with other `scipy.stats` distributions, we can calculate the first four moments using `kstwo.stats`. We can also compare a histogram of random samples generated with `kstwo.rvs` to the PDF from `kstwo.pdf`.

In [None]:
from scipy.stats import kstwo

# Calculate the first four moments for a given n
n = 500
mean, var, skew, kurt = kstwo.stats(n, moments='mvsk')

#Generate random values
r = kstwo.rvs(n, size=1000)

# Plot the kstwo pdf and histogram

plt.figure(figsize=(12,6))
x = np.linspace(kstwo.ppf(0.01, n), kstwo.ppf(0.99, n), 100)
plt.hist(r, density=True, bins='auto', histtype='stepfilled', alpha = 0.5, label = 'kstwo hist')
plt.plot(x, kstwo.pdf(x, n),label='kstwo pdf')
plt.xlim([x[0], x[-1]])
plt.legend(loc='best')
plt.show()
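
To connect `kstwo` back to the tests above: for a two-sided one-sample test, the p-value is the probability, under the null hypothesis, of a statistic at least as large as the observed D, which is the survival function of `kstwo`. The sketch below checks this, assuming that `stats.kstest` with its default settings uses the exact calculation for a sample of this size; the sample and variable names are made up for the illustration.

```python
import numpy as np
from scipy import stats
from scipy.stats import kstwo

rng = np.random.default_rng(0)
n = 200
sample = rng.normal(loc=0, scale=1, size=n)

# Two-sided one-sample KS test against the standard normal
res = stats.kstest(sample, "norm")

# Tail probability of the observed statistic under the kstwo distribution
p_from_kstwo = kstwo.sf(res.statistic, n)

print(f"D = {res.statistic:.4f}")
print(f"p-value from kstest:   {res.pvalue:.4f}")
print(f"p-value from kstwo.sf: {p_from_kstwo:.4f}")
```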

In [None]:
## check out many of the other tests...