## Equidistribution Test for Uniformity

The **equidistribution test** evaluates uniformity by checking whether the **empirical mean** of a dataset matches the expected value of a uniform distribution, that is, the integral of $x$ weighted by the uniform density.

For a truly uniform distribution over $[n,m]$, the density is $\frac{1}{m-n}$, so the expected mean is given by:

$$
E[X] = \int_n^m \frac{x}{m-n} \, dx = \frac{m^2 - n^2}{2(m-n)} = \frac{n+m}{2}
$$

If we investigate over $[0,1]$, the expected mean is $E[X] = \frac{1}{2}$.

If the empirical mean of the dataset is close to this expected value, the data are consistent with uniformity.
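
As a quick sanity check of this idea, the sketch below draws simulated uniform values with R's `runif` and compares their sample mean to 0.5. The seed, the sample size, and the names `sim` and `sim_mean` are assumptions made for illustration; they are not part of the original analysis.

```r
# Illustrative simulation (assumed seed and sample size, not the notebook's data)
set.seed(42)
sim <- runif(100000, min = 0, max = 1)

# Sample mean of the simulated draws; for uniform data it should land near 0.5
sim_mean <- mean(sim)
cat("Simulated mean:", sim_mean, "\n")
cat("Absolute deviation from 0.5:", abs(sim_mean - 0.5), "\n")
```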

In [1]:
library(dplyr)
library(tidyverse)



### Loading the Data

We load the random numbers from a CSV file to analyze their distribution.

In [2]:
randoms <- read.csv("../../Data/randoms2.csv")$n

max_val <- 10
min_val <- 0

### Scaling the Data

Since the equidistribution test assumes values in the range $[0,1]$, we rescale the data to this range using the observed minimum and maximum (min-max scaling).

In [3]:
randoms_scaled <- (randoms - min(randoms)) / (max(randoms) - min(randoms))
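
Scaling by the sample minimum and maximum always maps the smallest observation to exactly 0 and the largest to exactly 1. If the nominal range of the generator is known, one alternative is to scale by those bounds instead. The sketch below assumes that range is 0 to 10, matching the `min_val` and `max_val` defined earlier; the helper name `scale_to_unit` and the toy vector are hypothetical.

```r
# Hypothetical helper: rescale x from a known range [lo, hi] to [0, 1]
scale_to_unit <- function(x, lo, hi) {
  stopifnot(hi > lo)
  (x - lo) / (hi - lo)
}

# Toy values standing in for the real data (assumed to lie in [0, 10])
toy <- c(0.5, 2.5, 5.0, 7.5, 9.5)

by_bounds <- scale_to_unit(toy, lo = 0, hi = 10)       # uses the nominal bounds
by_sample <- (toy - min(toy)) / (max(toy) - min(toy))  # uses the observed extremes

print(by_bounds)
print(by_sample)
```

With the nominal bounds the extremes stay strictly inside $[0,1]$, whereas sample-based scaling pins them to the endpoints.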

### Computing the Empirical Mean

We compute the empirical mean of the dataset:

$$
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
$$

In [4]:
empirical_mean <- function(x){
  return(mean(x))
}

empirical_mean_val <- empirical_mean(randoms_scaled)
cat("Empirical mean for randoms (scaled to [0,1]):", empirical_mean_val)

Empirical mean for randoms (scaled to [0,1]): 0.504

### Expected Mean from the Uniform Distribution

For a uniform distribution $U(0,1)$, the expected mean is:

$$
E[X] = \int_0^1 x \, dx = 0.5
$$

We define this as a function.

In [5]:
integral_f <- function() {
  return(0.5)
}

cat("Expected integral value of f(x) = x over [0,1]:", integral_f())

Expected integral value of f(x) = x over [0,1]: 0.5
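
The constant 0.5 can also be confirmed numerically instead of being hard-coded. A minimal sketch using `integrate` from R's built-in stats package; the name `numeric_integral` is introduced here for illustration only.

```r
# Numerically integrate f(x) = x over [0, 1]
numeric_integral <- integrate(function(x) x, lower = 0, upper = 1)$value
cat("Numerical integral of f(x) = x over [0,1]:", numeric_integral, "\n")
```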

### Computing the Difference

We calculate the absolute difference between the empirical mean and the expected mean.

In [6]:
# Named abs_diff to avoid masking base R's diff() function
abs_diff <- abs(empirical_mean_val - integral_f())

cat("Difference for randoms:", abs_diff)

Difference for randoms: 0.004

## Interpretation of the Equidistribution Test Results

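- If the empirical mean is **close to 0.5**, the data are consistent with a uniform distribution. This is a necessary condition, not a sufficient one: any distribution on $[0,1]$ with mean 0.5 passes as well.
- If the empirical mean **deviates significantly** from 0.5, the data are unlikely to be uniformly distributed.
- The **absolute difference** quantifies how far the empirical mean lies from its expected value. A smaller value indicates closer agreement with the uniform expectation; the value observed here is 0.004.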
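
One way to make "close" and "deviates significantly" concrete is to compare the deviation with the sampling variability of the mean. Under $U(0,1)$ the variance is $1/12$, so the standard error of the mean of $n$ independent values is $\sqrt{1/(12n)}$. The sketch below computes an approximate z-statistic and a two-sided p-value from that. The function name `mean_z_test`, the seed, and the simulated inputs are assumptions for illustration, and the observations are assumed to be independent.

```r
# Hypothetical helper: approximate z-test of the sample mean against 0.5
mean_z_test <- function(x) {
  n  <- length(x)
  se <- sqrt(1 / (12 * n))   # standard error of the mean under U(0,1)
  z  <- (mean(x) - 0.5) / se
  p  <- 2 * pnorm(-abs(z))   # two-sided p-value (normal approximation)
  list(n = n, z = z, p_value = p)
}

# Simulated stand-in data (assumed seed; replace with the scaled dataset)
set.seed(1)
res_unif   <- mean_z_test(runif(10000))
res_skewed <- mean_z_test(rbeta(10000, 2, 5))  # Beta(2, 5) has mean 2/7, not 0.5

str(res_unif)
str(res_skewed)
```

A small p-value indicates that the mean differs from 0.5 by more than sampling noise would explain; a large one only means the mean gives no evidence against uniformity.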