# Biased Die
Your task is to check whether a die is biased. The file *eyes.csv* contains the outcomes of a large number of rolls of the die. Use a $\chi^2$ test to test the null hypothesis that the die is unbiased.

In [None]:
import numpy as np
from scipy.stats import chisquare

# simulate results
# rng = np.random.default_rng()
# N = 12_000
# high = 61
# eyes = rng.integers(low=0, high=high, size=N) % 6 + 1

# read results from file
eyes = np.loadtxt('eyes.csv')

### Calculate expected and observed counts
Calculate the expected counts (for a fair die) and the observed counts (based on the data).

In [None]:
N = len(eyes) # number of rolls

# calculate expected counts
# use true division so that the expected counts sum to N even if N is not a multiple of 6
expected_counts = np.full(6, N / 6)

# calculate observed counts
values, observed_counts = np.unique(eyes, return_counts=True)

### Hypothesis test
Using the function *chisquare* from scipy.stats, perform a $\chi^2$ test on the data and decide whether the die can be assumed to be fair.

In [None]:
statistic, pvalue = chisquare(observed_counts, expected_counts)

print(f'The p-value is {pvalue:.6f}')

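If $p<0.05$, we reject the hypothesis of a fair die at the 5% significance level: deviations at least as large as the observed ones would be very unlikely for an unbiased die. In that case we have to assume that the die is biased.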
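
To make the test less of a black box, the statistic $\chi^2 = \sum_i (O_i - E_i)^2 / E_i$ and its p-value can be computed by hand and compared with the result of *chisquare*. The following is a minimal sketch, not part of the original exercise. It assumes simulated rolls (`eyes_demo`, with a fixed seed) as a stand-in for *eyes.csv*, and the names `chi2_manual` and `p_manual` are introduced here for illustration only.

```python
import numpy as np
from scipy.stats import chi2, chisquare

# Assumption: simulated rolls stand in for eyes.csv (face 1 is slightly favoured)
rng = np.random.default_rng(seed=1)
eyes_demo = rng.integers(low=0, high=61, size=12_000) % 6 + 1

# observed counts for the faces 1..6 and expected counts for a fair die
observed = np.bincount(eyes_demo, minlength=7)[1:]
expected = np.full(6, observed.sum() / 6)

# chi^2 statistic: sum over faces of (observed - expected)^2 / expected
chi2_manual = np.sum((observed - expected) ** 2 / expected)

# p-value: probability of a chi^2 value at least this large for 6 - 1 = 5 degrees of freedom
p_manual = chi2.sf(chi2_manual, df=len(observed) - 1)

# cross-check against scipy
statistic, pvalue = chisquare(observed, expected)
print(f'manual: chi2 = {chi2_manual:.3f}, p = {p_manual:.6f}')
print(f'scipy:  chi2 = {statistic:.3f}, p = {pvalue:.6f}')
```

Both lines should print the same numbers, since *chisquare* uses the same statistic and, by default, one degree of freedom fewer than the number of categories.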

### Graphical representation
Make a graph for the observed counts vs the number of eyes including the statistical errors and show that one outcome is significantly higher than the expected value.

In [None]:
import matplotlib.pyplot as plt

# statistical (Poisson) errors: the standard deviation of a count n is approximately sqrt(n)
errors = np.sqrt(observed_counts)

# average count of the outcomes 2 to 6
average = np.mean(observed_counts[values>1])

fig, ax = plt.subplots()
ax.bar(values, observed_counts, label='observed')
ax.errorbar(values, observed_counts, yerr=errors, fmt='r.', capsize=2)
ax.hlines(N/6, xmin=0.5, xmax=6.5, colors='orange', label='expected')
ax.hlines(average, xmin=0.5, xmax=6.5, colors='green', label='average 2 to 6')
ax.set_title('Counts for biased die')
ax.set_xlabel('# eyes')
ax.set_ylabel('counts')
ax.legend()
plt.show()

It is evident from the bar chart that the outcome *1 eye* occurs significantly more often than expected. The average of the remaining outcomes is lower than the expected value and seems to be compatible with equal probability for the other five faces of the die. This can be checked with a second $\chi^2$ test restricted to the outcomes 2 to 6.

In [None]:
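# drop the first outcome (1 eye); assumes all six faces occur in the data
counts_2to6 = observed_counts[1:]
# without explicit expected counts, chisquare assumes equal frequencies for all categories
_, pvalue = chisquare(counts_2to6)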

print(f'The p-value for the hypothesis that 2 to 6 eyes have the same probability is {pvalue:.6f}.')
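
To put a number on how strongly a single outcome deviates from the fair-die expectation, the difference can be expressed in units of the binomial standard deviation $\sqrt{N p (1-p)}$ with $p = 1/6$. The helper `deviation_in_sigma` below is a hypothetical name introduced for this sketch, and the counts in the example call are illustrative only, not taken from *eyes.csv*.

```python
import numpy as np

def deviation_in_sigma(count, n_rolls, p=1/6):
    """Deviation of an observed count from its expectation n_rolls * p,
    in units of the binomial standard deviation."""
    expected = n_rolls * p
    sigma = np.sqrt(n_rolls * p * (1 - p))
    return (count - expected) / sigma

# Illustrative numbers only (not the contents of eyes.csv)
z = deviation_in_sigma(count=2100, n_rolls=12_000)
print(f'The count deviates by {z:.2f} standard deviations from the expectation.')
```

With the data of this notebook, the corresponding call would be `deviation_in_sigma(observed_counts[0], N)` for the outcome *1 eye*.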