Here are some data comparing the time to relief of three brands of
headache medicine, reported in minutes. Data adapted from [3].


In [None]:
import numpy as np
from scipy.stats import tukey_hsd
group0 = [24.5, 23.5, 26.4, 27.1, 29.9]
group1 = [28.4, 34.2, 29.5, 32.2, 30.1]
group2 = [26.1, 28.3, 24.3, 26.2, 27.8]

We would like to see whether the means of any two groups differ
significantly. First, visually examine a box and whisker plot.


In [None]:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1)
ax.boxplot([group0, group1, group2])
ax.set_xticklabels(["group0", "group1", "group2"])
# the boxes show the raw observations, so label the axis with the measured quantity
ax.set_ylabel("time to relief (minutes)")
plt.show()

From the box and whisker plot, we can see that the values of ``group0``
overlap with those of ``group1`` and ``group2``, but we can apply the
``tukey_hsd`` test to determine whether the differences between means are
significant. We set a significance level of .05 to reject the null hypothesis.


In [None]:
res = tukey_hsd(group0, group1, group2)
print(res)

Tukey's HSD Pairwise Group Comparisons (95.0% Confidence Interval)
Comparison  Statistic  p-value   Lower CI   Upper CI
(0 - 1)     -4.600      0.014     -8.249     -0.951
(0 - 2)     -0.260      0.980     -3.909      3.389
(1 - 0)      4.600      0.014      0.951      8.249
(1 - 2)      4.340      0.020      0.691      7.989
(2 - 0)      0.260      0.980     -3.389      3.909
(2 - 1)     -4.340      0.020     -7.989     -0.691
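
As a quick sanity check on the table above, the ``Statistic`` column can be reproduced as the pairwise differences of the sample means. The sketch below assumes that ``res.statistic`` is a square array whose ``[i, j]`` entry corresponds to the ``(i - j)`` comparison; the names ``means`` and ``diffs`` are introduced here only for illustration.

```python
import numpy as np
from scipy.stats import tukey_hsd

group0 = [24.5, 23.5, 26.4, 27.1, 29.9]
group1 = [28.4, 34.2, 29.5, 32.2, 30.1]
group2 = [26.1, 28.3, 24.3, 26.2, 27.8]
res = tukey_hsd(group0, group1, group2)

# sample mean of each group
means = np.array([np.mean(g) for g in (group0, group1, group2)])

# entry [i, j] is mean_i - mean_j (illustrative helper array)
diffs = means[:, np.newaxis] - means[np.newaxis, :]
print(np.round(diffs, 3))

# compare with the statistic reported by tukey_hsd
print(np.allclose(res.statistic, diffs))
```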

For each pairwise comparison, the null hypothesis is that the two groups
have the same mean. The p-values for the comparisons between ``group0`` and
``group1`` and between ``group1`` and ``group2`` do not exceed .05, so we
reject the null hypothesis that they have the same means. The p-value of the
comparison between ``group0`` and ``group2`` exceeds .05, so we fail to
reject the null hypothesis; the data do not show a significant difference
between their means.
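
The same decisions can be read off programmatically from the ``pvalue`` attribute of the result. This is a minimal sketch rather than a prescribed workflow; ``alpha`` and ``significant`` are illustrative names, and each unordered pair is visited once because the p-value matrix is symmetric.

```python
import numpy as np
from scipy.stats import tukey_hsd

group0 = [24.5, 23.5, 26.4, 27.1, 29.9]
group1 = [28.4, 34.2, 29.5, 32.2, 30.1]
group2 = [26.1, 28.3, 24.3, 26.2, 27.8]
res = tukey_hsd(group0, group1, group2)

alpha = 0.05      # significance level chosen in the text
significant = []  # pairs for which the null hypothesis is rejected

for (i, j), p in np.ndenumerate(res.pvalue):
    # skip self comparisons and mirrored pairs
    if i < j:
        if p <= alpha:
            significant.append((i, j))
            decision = "reject"
        else:
            decision = "fail to reject"
        print(f"group{i} vs group{j}: p = {p:.3f} -> {decision}")
```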

We can also compute the confidence intervals for a chosen confidence
level, here 99%.


In [None]:
group0 = [24.5, 23.5, 26.4, 27.1, 29.9]
group1 = [28.4, 34.2, 29.5, 32.2, 30.1]
group2 = [26.1, 28.3, 24.3, 26.2, 27.8]
result = tukey_hsd(group0, group1, group2)
conf = result.confidence_interval(confidence_level=.99)
for ((i, j), l) in np.ndenumerate(conf.low):
    # filter out self comparisons
    if i != j:
        h = conf.high[i,j]
        print(f"({i} - {j}) {l:>6.3f} {h:>6.3f}")

(0 - 1) -9.480  0.280
(0 - 2) -5.140  4.620
(1 - 0) -0.280  9.480
(1 - 2) -0.540  9.220
(2 - 0) -4.620  5.140
(2 - 1) -9.220  0.540
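
A confidence interval that excludes zero corresponds to a difference that is significant at the matching level. The sketch below uses a hypothetical helper, ``pairs_excluding_zero``, to compare the 95% and 99% levels for these data; it relies only on ``confidence_interval`` and the ``low`` and ``high`` attributes used above.

```python
import numpy as np
from scipy.stats import tukey_hsd

group0 = [24.5, 23.5, 26.4, 27.1, 29.9]
group1 = [28.4, 34.2, 29.5, 32.2, 30.1]
group2 = [26.1, 28.3, 24.3, 26.2, 27.8]
result = tukey_hsd(group0, group1, group2)


def pairs_excluding_zero(result, confidence_level):
    """Return the pairs (i, j), i < j, whose interval does not contain zero.

    Hypothetical helper written for this example; not part of SciPy.
    """
    ci = result.confidence_interval(confidence_level=confidence_level)
    pairs = []
    for (i, j), low in np.ndenumerate(ci.low):
        high = ci.high[i, j]
        # visit each unordered pair once and test whether 0 lies outside [low, high]
        if i < j and (low > 0 or high < 0):
            pairs.append((i, j))
    return pairs


pairs_95 = pairs_excluding_zero(result, 0.95)
pairs_99 = pairs_excluding_zero(result, 0.99)
print("95%:", pairs_95)
print("99%:", pairs_99)
```

At the 99% level every interval contains zero, which is consistent with the smallest p-values in the table (0.014 and 0.020) being larger than .01.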