## Frequency Test for Uniformity

The **frequency test (or bin test)** is a simple way to check whether a dataset follows a uniform distribution. It works by:
1. Dividing the range of numbers into equal-sized bins.
2. Counting how many numbers fall into each bin.
3. Comparing the observed frequencies with the expected frequencies under uniformity.

We use the Chi-Squared test to determine if the observed distribution significantly deviates from uniformity.
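
For reference, the statistic behind this comparison can be written as below. The symbols are notation introduced here, not taken from the notebook: $O_i$ is the observed count in bin $i$, $E_i$ is the expected count in bin $i$, and $k$ is the number of bins.

$$
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
$$

If the data really are uniform and the expected counts are not too small, this statistic approximately follows a Chi-Squared distribution with $k - 1$ degrees of freedom. Large values of $\chi^2$ indicate that the observed counts are far from the expected ones.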

In [2]:
library(tidyverse)  # attaches dplyr along with the other core tidyverse packages

### Loading the Data

We load the dataset containing the random numbers, which will be tested for uniformity.

In [None]:
randoms <- read.csv("../../Data/randoms2.csv")$n

max_val <- 10
min_val <- 0

### Defining Bins

We divide the observed range of the data, from `min(randoms)` to `max(randoms)`, into `num_bins` equal-width intervals. This lets us compare the actual count in each bin with the count expected under uniformity.

In [4]:
num_bins <- 10
breaks <- seq(min(randoms),
              max(randoms),
              length.out = num_bins + 1)

### Counting Frequencies

We count how many values fall into each bin to determine the observed frequency distribution.

In [5]:
freq <- table(cut(randoms,
                  breaks = breaks,
                  include.lowest = TRUE))
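
To make the binning step concrete, here is a minimal sketch on a made-up vector. The names `x`, `toy_breaks`, and `toy_freq` are hypothetical and the values are not taken from `randoms2.csv`. It shows why `include.lowest = TRUE` matters: `cut()` builds right-closed intervals by default, so a value equal to the smallest break would otherwise fall outside every bin.

```r
# Hypothetical toy data, for illustration only
x <- c(0, 1.5, 2, 2.5, 4, 6, 6, 9.9, 10)
toy_breaks <- seq(0, 10, length.out = 6)   # 5 bins of width 2

# Intervals are right-closed by default: (0,2], (2,4], ...
# include.lowest = TRUE turns the first bin into [0,2]
toy_freq <- table(cut(x, breaks = toy_breaks, include.lowest = TRUE))
print(toy_freq)

# Without include.lowest, the value 0 matches no interval and becomes NA
n_dropped <- sum(is.na(cut(x, breaks = toy_breaks)))
print(n_dropped)
```

Note that `table()` silently ignores `NA` values, so a dropped observation would reduce the total count without any warning.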

### Computing Expected Frequencies

Under uniformity, we expect each bin to contain approximately the same number of values. The expected frequency is calculated as:

$$
\text{Expected Frequency} = \frac{\text{Total Count of Numbers}}{\text{Number of Bins}}
$$

In [6]:
expected <- rep(length(randoms) / num_bins, num_bins)
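
The Chi-Squared approximation is usually considered reliable only when the expected counts are not too small; a common rule of thumb asks for at least 5 per bin. The sketch below checks this for a hypothetical sample size and bin count (`n_values`, `n_bins`, and `expected_toy` are illustrative names, not values from the dataset).

```r
# Hypothetical sample size and bin count, for illustration only
n_values <- 200
n_bins   <- 10

# Equal expected count in every bin under uniformity
expected_toy <- rep(n_values / n_bins, n_bins)   # 20 per bin

# Rule of thumb: every expected count should be at least 5
rule_ok <- all(expected_toy >= 5)
print(rule_ok)

# Largest number of bins that keeps every expected count at 5 or more
max_bins <- floor(n_values / 5)
print(max_bins)
```

If the check fails, using fewer bins is one simple way to raise the expected count per bin.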

### Performing the Chi-Squared Test

We pass the observed counts to `chisq.test()` together with the expected proportions `p`, which are equal for every bin under uniformity.

In [7]:
chi_test <- chisq.test(freq,
                       p = expected / sum(expected))
print(chi_test)


	Chi-squared test for given probabilities

data:  freq
X-squared = 12.8, df = 9, p-value = 0.1719
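
To see where the three numbers in this output come from, the following sketch recomputes the statistic, the degrees of freedom, and the p-value by hand and cross-checks them against `chisq.test()`. The counts in `observed_toy` are made up for illustration and are not the notebook's data, so the values will differ from the output above.

```r
# Hypothetical observed counts for 10 bins (n = 100), for illustration only
observed_toy <- c(12, 8, 11, 9, 10, 13, 7, 10, 9, 11)
expected_toy <- rep(sum(observed_toy) / length(observed_toy),
                    length(observed_toy))

# Chi-squared statistic: sum over bins of (O - E)^2 / E
stat <- sum((observed_toy - expected_toy)^2 / expected_toy)

# Degrees of freedom: number of bins minus one
deg_free <- length(observed_toy) - 1

# p-value: upper-tail probability of the chi-squared distribution
p_value <- pchisq(stat, df = deg_free, lower.tail = FALSE)

# Cross-check: chisq.test() assumes equal probabilities by default
check <- chisq.test(observed_toy)
print(all.equal(stat, unname(check$statistic)))
print(all.equal(p_value, check$p.value))
```

The degrees of freedom equal the number of bins minus one, which matches `df = 9` for the 10 bins used in this notebook.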



## Interpretation of the Frequency Test Results

The Chi-Squared test provides a p-value, which helps us determine if the numbers are uniformly distributed:

- **$\text{p-value} < 0.05$**: The difference between observed and expected frequencies is statistically significant, so we *reject* the hypothesis that the numbers are uniformly distributed.
- **$\text{p-value} \ge 0.05$**: There is no significant deviation, so we *fail to reject* the hypothesis that the numbers are uniformly distributed.

Here the p-value is 0.1719, which is above 0.05, so at the 5% significance level we fail to reject the hypothesis of uniformity.