## Chi-Squared ($\chi ^ 2$) Test for Uniformity

The Chi-Squared test compares the observed frequencies of events with the frequencies expected under a hypothesis, here that the numbers follow a uniform distribution. If the observed frequencies are close to the expected ones, the data give no evidence against uniformity.
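
For reference, the test statistic behind this comparison can be written as follows. The symbols are introduced here for illustration and are not used elsewhere in the notebook: $O_i$ is the observed count in bin $i$, $E_i$ the expected count, $k$ the number of bins, and $N$ the total number of values.

$$
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = \frac{N}{k} \text{ under uniformity.}
$$

When the hypothesis holds and the expected counts are not too small, this statistic approximately follows a Chi-Squared distribution with $k - 1$ degrees of freedom.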

In [7]:
#| message: FALSE
# tidyverse attaches dplyr, so a separate library(dplyr) call is not needed
library(tidyverse)

### Loading the Data

We load the random numbers from a CSV file to analyze their distribution.

In [8]:
randoms <- read.csv("../Data/randoms2.csv")$n

max_val <- 10
min_val <- 0

### Creating Bins

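The numbers are grouped into bins for comparison. We use `cut()` to divide the range [`min_val`, `max_val`] into 10 equal-width intervals, which requires 11 break points. Setting `include.lowest = TRUE` ensures that values exactly equal to `min_val` fall into the first bin instead of becoming `NA`. We can then compare the observed and expected frequency in each bin.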

In [9]:
bins <- cut(randoms,
            breaks = seq(min_val, max_val,
                         length.out = 11),
            include.lowest = TRUE)
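
To see how the binning treats boundary values, here is a small sketch on made-up numbers (the vector `x` and the name `toy_bins` are illustrative and not part of the dataset). It uses the same breaks as above.

```r
# Toy values chosen for illustration, including both endpoints of [0, 10]
x <- c(0, 0.5, 1, 9.99, 10)

toy_bins <- cut(x,
                breaks = seq(0, 10, length.out = 11),
                include.lowest = TRUE)

# Intervals are closed on the right, so 1 lands in the first bin [0,1]
toy_counts <- table(toy_bins)
print(toy_counts)

# Values outside [0, 10] are not assigned to any bin and become NA
n_outside <- sum(is.na(cut(c(-1, 11),
                           breaks = seq(0, 10, length.out = 11),
                           include.lowest = TRUE)))
print(n_outside)
```

If the real data contained values outside the assumed range, they would be silently dropped from the frequency table, so it may be worth checking `sum(is.na(bins))` before running the test.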

### Frequency Calculation

We count how many numbers fall into each bin. This gives us the observed frequencies for the test.

In [10]:
freq <- table(bins)

### Expected Frequency

Under a uniform distribution, we expect the numbers to be evenly distributed across the bins. The expected frequency for each bin is the total number of random numbers divided by the number of bins.

In [11]:
expected <- rep(length(randoms) / length(freq),
                length(freq))
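
As a sanity check on the expected counts, the sketch below uses a hypothetical sample size and bin count (not taken from the data) to show the arithmetic. It also checks the common rule of thumb that every expected count should be at least 5 for the Chi-Squared approximation to be reliable; `chisq.test()` issues a warning when this is violated.

```r
# Hypothetical sample size and number of bins, for illustration only
n_total <- 100
n_bins <- 10

# Each bin is expected to receive an equal share of the values
expected_toy <- rep(n_total / n_bins, n_bins)
print(expected_toy)

# TRUE when every bin meets the "at least 5 expected" rule of thumb
counts_ok <- all(expected_toy >= 5)
print(counts_ok)
```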

### Running the Chi-Squared Test

We now apply the Chi-Squared test to check whether the observed frequencies significantly deviate from the expected frequencies for a uniform distribution.

In [12]:
chi_test <- chisq.test(freq,
                       p = expected / sum(expected))

print(chi_test)


	Chi-squared test for given probabilities

data:  freq
X-squared = 12.8, df = 9, p-value = 0.1719
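
As a cross-check, the reported p-value can be recomputed from the printed statistic and degrees of freedom. This sketch assumes the rounded values shown in the output, so the result may differ slightly in the last digits; the variable names are illustrative.

```r
# Values copied from the printed output (rounded as displayed)
stat <- 12.8
dof <- 9   # number of bins minus 1

# Upper-tail probability of the Chi-Squared distribution
p_value <- pchisq(stat, df = dof, lower.tail = FALSE)
print(round(p_value, 4))
```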



## Interpreting the Results

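The Chi-Squared test yields a p-value: the probability, assuming the numbers really are uniformly distributed, of seeing a discrepancy between observed and expected frequencies at least as large as the one in our data. A low p-value (< 0.05) suggests that the random numbers are not uniformly distributed. A high p-value means we found no significant difference, which is consistent with uniformity but does not prove it. Here the p-value is 0.1719, so at the 0.05 level we do not reject the hypothesis of uniformity.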

- **$\text{p-value} < 0.05$**: There is significant evidence that the random numbers are not uniformly distributed.
- **$\text{p-value} \ge 0.05$**: There is insufficient evidence to reject the hypothesis that the random numbers are uniformly distributed.
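
To illustrate both outcomes, the following sketch runs the same procedure on simulated data. Everything in it is an assumption made for illustration: the helper `uniformity_test()`, the seed, the sample size, and the choice of a scaled Beta distribution as a clearly non-uniform example.

```r
set.seed(42)  # arbitrary seed, only to make the sketch reproducible

# Hypothetical helper (not part of the original analysis): bins values on
# [lo, hi] into n_bins equal-width intervals and runs the Chi-Squared test.
uniformity_test <- function(x, lo = 0, hi = 10, n_bins = 10) {
  bins <- cut(x,
              breaks = seq(lo, hi, length.out = n_bins + 1),
              include.lowest = TRUE)
  # chisq.test() assumes equal bin probabilities when p is not given
  chisq.test(table(bins))
}

# One sample drawn uniformly on [0, 10], one strongly skewed toward 0
uniform_sample <- runif(1000, min = 0, max = 10)
skewed_sample <- 10 * rbeta(1000, shape1 = 2, shape2 = 5)

res_uniform <- uniformity_test(uniform_sample)
res_skewed <- uniformity_test(skewed_sample)

print(res_uniform$p.value)
print(res_skewed$p.value)
```

The skewed sample should produce a very small p-value. The uniform sample usually will not, although about 5% of truly uniform samples are expected to fall below 0.05 purely by chance.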