
In [4], the use of aspirin to prevent cardiovascular events in women
and men was investigated. The study notably concluded:

    ...aspirin therapy reduced the risk of a composite of
    cardiovascular events due to its effect on reducing the risk of
    ischemic stroke in women [...]

The article lists studies of various cardiovascular events. Let's
focus on ischemic stroke in women.

The following table summarizes the results of the experiment in which
participants took aspirin or a placebo on a regular basis for several
years. Cases of ischemic stroke were recorded:

```
                  Aspirin   Control/Placebo
Ischemic stroke     176           230
No stroke         21035         21018
```

Is there evidence that aspirin reduces the risk of ischemic stroke?
We begin by formulating a null hypothesis $H_0$:

    The effect of aspirin is equivalent to that of placebo.

Let's assess the plausibility of this hypothesis with
a chi-square test.


In [None]:
import numpy as np
from scipy.stats import chi2_contingency
table = np.array([[176, 230], [21035, 21018]])
res = chi2_contingency(table)
res.statistic

6.892569132546561

In [None]:
res.pvalue

0.008655478161175739
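
As a sanity check, the statistic and p-value above can be reproduced by hand. The following is a minimal sketch, not SciPy's internal implementation. It assumes the default behavior of `chi2_contingency`, which applies Yates' continuity correction when the table has one degree of freedom (as a 2 x 2 table does). The variable names are ours.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

table = np.array([[176, 230], [21035, 21018]])

# Expected counts under independence: (row total * column total) / grand total.
row_totals = table.sum(axis=1, keepdims=True)
col_totals = table.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / table.sum()

# Yates' continuity correction: shrink each |observed - expected| by 0.5,
# but never past zero.
abs_diff = np.maximum(np.abs(table - expected) - 0.5, 0.0)

# Pearson's chi-squared statistic and its degrees of freedom.
statistic = np.sum(abs_diff**2 / expected)
dof = (table.shape[0] - 1) * (table.shape[1] - 1)

# Upper-tail probability of the chi-squared distribution.
pvalue = chi2.sf(statistic, dof)

res = chi2_contingency(table)
print(np.isclose(statistic, res.statistic), np.isclose(pvalue, res.pvalue))
```

Passing `correction=False` to `chi2_contingency` skips the correction step, which gives a slightly larger statistic for this table.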

Using a significance level of 5%, we would reject the null hypothesis in
favor of the alternative hypothesis: "the effect of aspirin
is not equivalent to the effect of placebo".
Because `scipy.stats.contingency.chi2_contingency` performs a two-sided
test, the alternative hypothesis does not indicate the direction of the
effect. We can use `scipy.stats.contingency.odds_ratio` to support the
conclusion that aspirin *reduces* the risk of ischemic stroke.
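
A hedged sketch of that follow-up: the snippet below estimates the odds ratio for the same table and a 95% confidence interval. The choice of `kind="sample"` (the unconditional estimate `a*d / (b*c)`) and the confidence level are our assumptions for illustration. The default `kind="conditional"` is an equally reasonable choice.

```python
import numpy as np
from scipy.stats.contingency import odds_ratio

# Rows: ischemic stroke / no stroke. Columns: aspirin / placebo.
table = np.array([[176, 230], [21035, 21018]])

# Sample odds ratio: odds of stroke with aspirin divided by odds with placebo.
res = odds_ratio(table, kind="sample")

# Two-sided confidence interval for the odds ratio.
ci = res.confidence_interval(confidence_level=0.95)

print(res.statistic)
print(ci.low, ci.high)
```

The sample odds ratio is roughly 0.76. A value below 1, with a confidence interval that excludes 1, indicates lower odds of ischemic stroke in the aspirin group.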

Below are further examples showing how larger contingency tables can be
tested.

A two-way example (2 x 3):


In [None]:
obs = np.array([[10, 10, 20], [20, 20, 20]])
res = chi2_contingency(obs)
res.statistic

2.7777777777777777

In [None]:
res.pvalue

0.24935220877729619

In [None]:
res.dof

2

In [None]:
res.expected_freq

array([[ 12.,  12.,  16.],
       [ 18.,  18.,  24.]])

Perform the test using the log-likelihood ratio (i.e. the "G-test")
instead of Pearson's chi-squared statistic.


In [None]:
res = chi2_contingency(obs, lambda_="log-likelihood")
res.statistic

2.7688587616781319

In [None]:
res.pvalue

0.25046668010954165
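
To make the difference from Pearson's statistic concrete, the G statistic can be computed directly as `2 * sum(observed * log(observed / expected))`. This is a minimal sketch under the assumption that all observed counts are positive. It uses `scipy.stats.contingency.expected_freq` for the expected counts, and no continuity correction applies because this table has two degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency
from scipy.stats.contingency import expected_freq

obs = np.array([[10, 10, 20], [20, 20, 20]])

# Expected counts under independence of rows and columns.
expected = expected_freq(obs)

# Log-likelihood ratio (G) statistic; assumes every observed count is > 0.
g_stat = 2 * np.sum(obs * np.log(obs / expected))

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
dof = (obs.shape[0] - 1) * (obs.shape[1] - 1)
pvalue = chi2.sf(g_stat, dof)

res = chi2_contingency(obs, lambda_="log-likelihood")
print(np.isclose(g_stat, res.statistic), np.isclose(pvalue, res.pvalue))
```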

A four-way example (2 x 2 x 2 x 2):


In [None]:
obs = np.array(
    [[[[12, 17],
       [11, 16]],
      [[11, 12],
       [15, 16]]],
     [[[23, 15],
       [30, 22]],
      [[14, 17],
       [15, 16]]]])
res = chi2_contingency(obs)
res.statistic

8.7584514426741897

In [None]:
res.pvalue

0.64417725029295503
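
For tables with more than two dimensions, the null hypothesis is mutual independence of all factors. The sketch below illustrates, under that assumption, how the expected counts and degrees of freedom can be derived from the marginal sums. It uses `scipy.stats.contingency.margins`, and the variable names are ours.

```python
from functools import reduce

import numpy as np
from scipy.stats import chi2, chi2_contingency
from scipy.stats.contingency import margins

obs = np.array(
    [[[[12, 17],
       [11, 16]],
      [[11, 12],
       [15, 16]]],
     [[[23, 15],
       [30, 22]],
      [[14, 17],
       [15, 16]]]])

# One marginal-sum array per axis, each shaped so that they broadcast together.
marginal_sums = margins(obs)

# Expected counts: product of the marginals divided by n**(d - 1).
n = obs.sum()
expected = reduce(np.multiply, marginal_sums) / n ** (obs.ndim - 1)

# Degrees of freedom: number of cells minus the number of free marginal parameters.
dof = obs.size - sum(obs.shape) + obs.ndim - 1

# Pearson's statistic; no continuity correction applies because dof != 1.
statistic = np.sum((obs - expected) ** 2 / expected)
pvalue = chi2.sf(statistic, dof)

res = chi2_contingency(obs)
print(dof, np.isclose(statistic, res.statistic), np.isclose(pvalue, res.pvalue))
```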