The R package ewens provides functions for the Ewens sampling distribution, a probability distribution over partitions of an integer.
The distribution originated in population genetics, where it gives the probability of the configuration of alleles in a sample. Sampling from the Ewens distribution amounts to running the Chinese Restaurant Process.
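To make the link to the Chinese Restaurant Process concrete, here is a minimal base R sketch of the process. The function name crp_sample is hypothetical and is not part of the ewens package; the package's own sampler may be implemented differently.

```r
# Hypothetical helper, not part of the ewens package: seat n customers
# one at a time according to the Chinese Restaurant Process.
crp_sample <- function(n, theta = 1) {
  stopifnot(n >= 1, theta > 0)
  tables <- integer(n)   # tables[i] is the table (group) of customer i
  counts <- integer(0)   # counts[j] is the number of customers at table j
  for (i in seq_len(n)) {
    # Customer i joins existing table j with probability
    # counts[j] / (i - 1 + theta) and opens a new table with
    # probability theta / (i - 1 + theta). sample.int normalises prob.
    weights <- c(counts, theta)
    choice <- sample.int(length(weights), size = 1, prob = weights)
    if (choice > length(counts)) {
      counts <- c(counts, 1L)                  # open a new table
    } else {
      counts[choice] <- counts[choice] + 1L    # join an existing table
    }
    tables[i] <- choice
  }
  tables
}

set.seed(1)
seating <- crp_sample(24, theta = 1)
table(seating)   # group sizes, i.e. a partition of 24
```

Larger values of theta make new tables more likely, so the resulting partition tends to have more, smaller groups.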
You can install the development version of ewens like so:
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("chrishanretty/ewens")

Suppose I wish to divide a class of 24 into groups of different sizes. I can draw a random allocation with rewens:
set.seed(2923)
library(ewens)
allocation <- rewens(24, theta = 1)

The vector allocation gives the group to which each student has been assigned. The sizes of the groups can be recovered by tabulating the result.
table(allocation)

Suppose a student challenges their allocation into a particular group. What can we say about the probability of this particular allocation, given theta = 1?
pr <- dewens(allocation, theta = 1)
pr

The probability of drawing this particular allocation is very low, but given that there are 1575 different partitions of twenty-four, we should not be too surprised.
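The figure of 1575 can be checked directly, and the probability of a given set of group sizes can be worked out by hand. The base R sketch below does both. The function names are hypothetical, and the second function uses the standard Ewens sampling formula; it is an assumption that dewens evaluates the same quantity.

```r
# Count the integer partitions of n with a standard dynamic programme:
# ways[m + 1] holds the number of partitions of m using the parts seen so far.
count_partitions <- function(n) {
  ways <- c(1, numeric(n))
  for (part in seq_len(n)) {
    for (m in part:n) {
      ways[m + 1] <- ways[m + 1] + ways[m - part + 1]
    }
  }
  ways[n + 1]
}

count_partitions(24)   # 1575

# Ewens sampling formula for a partition given by its group sizes:
# n! / (theta (theta + 1) ... (theta + n - 1)) *
#   prod_j theta^a_j / (j^a_j * a_j!),
# where a_j is the number of groups of size j. Computed on the log scale.
ewens_partition_prob <- function(sizes, theta = 1) {
  n <- sum(sizes)
  a <- tabulate(sizes, nbins = n)   # a[j] = number of groups of size j
  j <- seq_len(n)
  log_p <- lfactorial(n) - sum(log(theta + 0:(n - 1))) +
    sum(a * log(theta) - a * log(j) - lfactorial(a))
  exp(log_p)
}

# The three partitions of 3 have probabilities that sum to one.
ewens_partition_prob(c(3)) +
  ewens_partition_prob(c(2, 1)) +
  ewens_partition_prob(c(1, 1, 1))
```

The partitions are not equally likely under the Ewens distribution, so 1/1575 is only a rough yardstick for how small an individual probability can be expected to be.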
It is also possible to calculate the probability that there are k groups, using dewens_k. Thus, the probability of obtaining the number of groups seen above is given by
dewens_k(k = length(unique(allocation)),
         n = 24,
         theta = 1)
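For reference, the distribution of the number of groups has a closed form: P(K = k) = |s(n, k)| theta^k / (theta (theta + 1) ... (theta + n - 1)), where |s(n, k)| is an unsigned Stirling number of the first kind. The base R sketch below evaluates it; groups_pmf is a hypothetical name, and it is an assumption that dewens_k returns the same values.

```r
# Hypothetical helper, not part of the ewens package: probability mass
# function of the number of groups K for a sample of size n.
groups_pmf <- function(n, theta = 1) {
  # stirling[k + 1] holds |s(m, k)| for the current m, with k = 0..n.
  stirling <- c(1, numeric(n))
  for (m in seq_len(n)) {
    # Recurrence: |s(m, k)| = (m - 1) * |s(m - 1, k)| + |s(m - 1, k - 1)|
    stirling <- (m - 1) * stirling + c(0, stirling[-(n + 1)])
  }
  k <- seq_len(n)
  rising <- prod(theta + 0:(n - 1))   # theta (theta + 1) ... (theta + n - 1)
  stirling[k + 1] * theta^k / rising  # element k is P(K = k)
}

pmf <- groups_pmf(24, theta = 1)
sum(pmf)          # the probabilities sum to one (up to rounding)
which.max(pmf)    # the most likely number of groups
```

With theta = 1, the probability of a single group is 1/n, which gives a quick sanity check on the output.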