
Suppose we wish to test whether a small sample has been drawn from a normal
distribution. We decide that we will use the skew of the sample as a
test statistic, and we will consider p-values below 0.05 to be statistically
significant.


In [None]:
import numpy as np
from scipy import stats
def statistic(x, axis):
    return stats.skew(x, axis)

After collecting our data, we calculate the observed value of the test
statistic.


In [None]:
rng = np.random.default_rng()
x = stats.skewnorm.rvs(a=1, size=50, random_state=rng)
statistic(x, axis=0)

0.12457412450240658
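Because `statistic` accepts an `axis` argument, it can score a whole batch of samples in a single call rather than one sample at a time. The following is a minimal sketch of that behavior; the names `rng_demo`, `batch`, `batch_skew`, and `loop_skew`, the seed, and the batch shape are illustrative assumptions, not part of the original example.

```python
import numpy as np
from scipy import stats

def statistic(x, axis):
    return stats.skew(x, axis)

# Arbitrary seed, used only so the sketch is reproducible.
rng_demo = np.random.default_rng(0)
# 1000 hypothetical samples, each with 50 observations.
batch = rng_demo.normal(size=(1000, 50))

# One call computes the skewness of every row.
batch_skew = statistic(batch, axis=-1)

# The same values, computed one sample at a time.
loop_skew = np.array([statistic(row, axis=0) for row in batch])

print(batch_skew.shape)
print(np.allclose(batch_skew, loop_skew))
```

The batched call and the explicit loop should agree up to floating-point error.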

To determine the probability of observing such an extreme value of the
skewness by chance if the sample were drawn from the normal distribution,
we can perform a Monte Carlo hypothesis test. The test will draw many
samples at random from a normal distribution, calculate the skewness
of each sample, and compare our original skewness against this
distribution to determine an approximate p-value.


In [None]:
from scipy.stats import monte_carlo_test
# because our statistic is vectorized, we pass `vectorized=True`
rvs = lambda size: stats.norm.rvs(size=size, random_state=rng)
res = monte_carlo_test(x, rvs, statistic, vectorized=True)
print(res.statistic)

0.12457412450240658

In [None]:
print(res.pvalue)

0.7012

The probability of obtaining a test statistic at least as extreme as the
observed value under the null hypothesis is ~70%. This is greater than
our chosen threshold of 5%, so we cannot consider this to be significant
evidence against the null hypothesis.
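To make the procedure concrete, a two-sided Monte Carlo p-value can be approximated by hand with NumPy. This is only a sketch under stated assumptions: the seed, the stand-in data `x_demo`, the choice of 9999 resamples, and the "+1" adjustment in the p-value are illustrative, and the internals of `monte_carlo_test` may differ in details such as numerical tolerances.

```python
import numpy as np
from scipy import stats

# Arbitrary seed and stand-in data (assumptions for this sketch).
rng_demo = np.random.default_rng(12345)
x_demo = stats.skewnorm.rvs(a=1, size=50, random_state=rng_demo)
observed = stats.skew(x_demo)

# Draw many samples of the same size under the null hypothesis
# (standard normal) and compute the skewness of each one.
n_resamples = 9999
null_samples = rng_demo.normal(size=(n_resamples, x_demo.size))
null_skew = stats.skew(null_samples, axis=-1)

# One-sided p-values; the +1 counts the observed value itself so that
# the estimate can never be exactly zero.
p_less = (np.count_nonzero(null_skew <= observed) + 1) / (n_resamples + 1)
p_greater = (np.count_nonzero(null_skew >= observed) + 1) / (n_resamples + 1)

# Two-sided p-value: twice the smaller tail, capped at 1.
p_two_sided = min(1.0, 2 * min(p_less, p_greater))
print(p_two_sided)
```

Because the null samples are random, the printed value changes with the seed and with the number of resamples.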

Note that this p-value essentially matches that of
`scipy.stats.skewtest`, which relies on an asymptotic distribution of a
test statistic based on the sample skewness.


In [None]:
stats.skewtest(x).pvalue

0.6892046027110614

This asymptotic approximation is not valid for small sample sizes, but
`monte_carlo_test` can be used with samples of any size.


In [None]:
x = stats.skewnorm.rvs(a=1, size=7, random_state=rng)
# stats.skewtest requires at least 8 observations, so it cannot be used here
res = monte_carlo_test(x, rvs, statistic, vectorized=True)

The Monte Carlo distribution of the test statistic is provided for
further investigation.


In [None]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.hist(res.null_distribution, bins=50)
ax.set_title("Monte Carlo distribution of test statistic")
ax.set_xlabel("Value of Statistic")
ax.set_ylabel("Frequency")
plt.show()
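Beyond plotting, the null distribution can be summarized numerically. The sketch below computes the central 95% range of the simulated skewness values so the observed statistic can be compared against it. The names `rng_demo`, `x_demo`, `res_demo`, and `null`, the seed, and the explicit `n_resamples=2000` are assumptions made for illustration.

```python
import numpy as np
from scipy import stats
from scipy.stats import monte_carlo_test

# Arbitrary seed, used only so the sketch is reproducible.
rng_demo = np.random.default_rng(2024)
x_demo = stats.skewnorm.rvs(a=1, size=7, random_state=rng_demo)

def statistic(x, axis):
    return stats.skew(x, axis)

def rvs(size):
    # Draw samples under the null hypothesis (standard normal).
    return stats.norm.rvs(size=size, random_state=rng_demo)

res_demo = monte_carlo_test(x_demo, rvs, statistic,
                            vectorized=True, n_resamples=2000)
null = np.asarray(res_demo.null_distribution)

# Central 95% range of the statistic under the null hypothesis.
lo, hi = np.quantile(null, [0.025, 0.975])
print(null.size, lo, hi)
print(res_demo.statistic)
```

An observed statistic that falls well inside this range corresponds to a large two-sided p-value.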