Statistical assessment of spatial distributions #12835

Open · mweatherley wants to merge 4 commits into main

Conversation

mweatherley

Objective

In #12484 the question arose of how we would actually go about testing the point-sampling methods introduced. In this PR, we introduce statistical tools for assessing the quality of spatial distributions in general and, in particular, of the ShapeSample implementations that presently exist.

Background and approach

A uniform probability distribution on a region is one where probability is proportional to area: for any given subregion, the probability of a sample being drawn from that subregion is equal to the proportion of the total area that subregion occupies.

It follows from this that, if one discretizes the sample space by partitioning it into labeled regions and assigning to each sample the label of the region it falls into, the discrete probability distribution sampled from the labels is a multinomial distribution with probabilities given by the proportions of the total area taken by each region of the partition.

Given, then, some probability distribution which is supposed to be uniform on some region, we can attempt to assess its uniformity by discretizing — as described above — and then performing statistical analysis of the resulting discrete distribution using Pearson's chi-squared test. The point is that, if the distribution exhibits some bias, it might be detected in the discrete distribution, which will fail to conform adequately to the associated multinomial density.
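
For reference, the statistic computed in this comparison is the standard Pearson one: with k bins, observed counts O_i, and bin probabilities p_i given by the area proportions,

$$
\chi^2 \;=\; \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = n\,p_i,
$$

where n is the total number of samples; under the null hypothesis this statistic is approximately chi-squared distributed with k - 1 degrees of freedom.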

Solution

This branch contains a small library that supports this process with a few parts:

use rand::{distributions::Distribution, Rng};

/// A trait implemented by a type which discretizes the sample space of a [`Distribution`] simultaneously
/// in `N` dimensions. To sample an implementing type as a [`Distribution`], use the [`BinSampler`] wrapper
/// type.
pub trait Binned<const N: usize> {
    /// The type defining the sample space discretized by this type.
    type IntermediateValue;

    /// The inner distribution type whose samples are to be discretized.
    type InnerDistribution: Distribution<Self::IntermediateValue>;

    /// The concrete inner distribution of this distribution, used to sample into an `N`-dimensional histogram.
    fn inner_dist(&self) -> Self::InnerDistribution;

    /// A function that takes output from the inner distribution and maps it to `N` bins. This allows
    /// any implementor of `Binned` to be a [`Distribution`] — the output of the distribution is `Option<[usize; N]>`
    /// because the mapping to bins is generally fallible, resulting in an error state when a sample misses every bin.
    fn bin(&self, value: Self::IntermediateValue) -> Option<[usize; N]>;

    /// Bin-sample the discretized distribution.
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> Option<[usize; N]> {
        let v = self.inner_dist().sample(rng);
        self.bin(v)
    }
}
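
For concreteness, here is an illustrative sketch (not code from this branch) of what a Binned<2> implementation could look like for the interior of the unit disk, binning simultaneously by angle (equal-angle sectors) and by radius (equal-area rings); the guard in bin is where the fallible Option case comes from:

use std::f64::consts::TAU;
use rand::{distributions::Distribution, Rng};

/// Uniform distribution on the interior of the unit disk (rejection sampling).
struct UnitDiskInterior;

impl Distribution<[f64; 2]> for UnitDiskInterior {
    fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> [f64; 2] {
        loop {
            let x: f64 = rng.gen_range(-1.0..=1.0);
            let y: f64 = rng.gen_range(-1.0..=1.0);
            if x * x + y * y <= 1.0 {
                return [x, y];
            }
        }
    }
}

const SECTORS: usize = 8;
const RINGS: usize = 4;

/// Discretizes disk samples into `SECTORS` equal-angle sectors and `RINGS` equal-area rings.
struct BinnedDisk;

impl Binned<2> for BinnedDisk {
    type IntermediateValue = [f64; 2];
    type InnerDistribution = UnitDiskInterior;

    fn inner_dist(&self) -> Self::InnerDistribution {
        UnitDiskInterior
    }

    fn bin(&self, [x, y]: [f64; 2]) -> Option<[usize; 2]> {
        let r_squared = x * x + y * y;
        let theta = y.atan2(x).rem_euclid(TAU);
        let sector = ((theta / TAU) * SECTORS as f64) as usize;
        // Binning by r^2 makes the rings equal-area.
        let ring = (r_squared * RINGS as f64) as usize;
        (sector < SECTORS && ring < RINGS).then_some([sector, ring])
    }
}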

The `Binned` trait models the discretization process for arbitrary spatial distributions, but provides no metadata about what the associated multinomial densities should be; that is supported by the following additional trait:

/// A discretized ([`Binned`]) probability distribution that also has extrinsic weights associated to its bins;
/// primarily intended for use in chi-squared analysis of spatial distributions.
pub trait WithBinDistributions<const N: usize>: Binned<N> {
    /// Get the bin weights to compare with actual samples.
    fn get_bins(&self) -> [BinDistribution; N];

    /// Get the degrees of freedom of each set of bins.
    fn dfs(&self) -> [usize; N] {
        self.get_bins().map(|b| b.bins.len().saturating_sub(1))
    }
}
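
Continuing the illustrative disk sketch, the expected weights for both bin families are uniform, since the sectors and rings were chosen to have equal area. The constructor for BinDistribution is not shown in the excerpt above, so BinDistribution::uniform(n) below is a hypothetical stand-in for "n equally weighted bins":

impl WithBinDistributions<2> for BinnedDisk {
    fn get_bins(&self) -> [BinDistribution; 2] {
        // Equal-angle sectors and equal-area rings should each be hit with equal probability.
        // `BinDistribution::uniform` is hypothetical; substitute the real constructor.
        [
            BinDistribution::uniform(SECTORS),
            BinDistribution::uniform(RINGS),
        ]
    }
}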

Next, an N-dimensional histogram type is used to actually aggregate samples for the purposes of comparison:

use std::collections::BTreeMap;

/// An `N`-dimensional histogram, holding data simultaneously assessed to lie in
/// `N` different families of bins.
///
/// Constructed via its [`FromIterator`] implementation, hence by calling [`Iterator::collect`]
/// on an iterator whose items are of type `Option<[usize; N]>`. Most notably, the sample iterator
/// of [`BinSampler<T>`](super::traits::BinSampler) where `T` implements [`Binned`](super::traits::Binned)
/// produces values of this type.
pub struct Histogram<const N: usize> {
    /// The actual histogram, with the invalid items diverted to `invalid`
    pub(crate) inner: BTreeMap<[usize; N], usize>,

    /// The total samples present in the histogram — i.e., excluding invalid items.
    pub total: usize,

    /// Count of invalid items, separate from the actual histogram.
    pub invalid_count: usize,
}
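
As a usage sketch (continuing the illustrative BinnedDisk above, and assuming BinSampler simply wraps its argument as a tuple struct, which the excerpt does not confirm), a histogram is collected straight from the sample iterator:

#[test]
fn collect_disk_histogram() {
    use rand::distributions::Distribution;

    // `BinSampler(BinnedDisk)` assumes a tuple-struct wrapper; adjust to the actual API.
    let histogram: Histogram<2> = BinSampler(BinnedDisk)
        .sample_iter(rand::thread_rng())
        .take(100_000)
        .collect();

    // Samples that missed every bin are tracked separately from the valid ones.
    assert_eq!(histogram.total + histogram.invalid_count, 100_000);
}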

Finally, chi-squared analysis functions take these histograms (or their projections) as input and produce actual chi-squared values:

/// Compute the chi-squared goodness-of-fit test statistic for the `histogram` relative to the ideal
/// distribution described by `ideal`. Note that this is distinct from the p-value, which must be
/// assessed separately.
pub fn chi_squared_fit(histogram: &Histogram<1>, ideal: &BinDistribution) -> f64 { /* ... */ }
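
For intuition, the statistic itself can be computed generically from observed counts and expected proportions; the following is a sketch of that computation, not the branch's actual implementation:

/// Pearson's chi-squared statistic for observed counts against expected proportions.
/// Generic illustration only; the real function operates on `Histogram` and `BinDistribution`.
fn chi_squared_statistic(observed: &[usize], expected_proportions: &[f64]) -> f64 {
    let total: usize = observed.iter().sum();
    observed
        .iter()
        .zip(expected_proportions)
        .map(|(&obs, &p)| {
            let expected = p * total as f64;
            let diff = obs as f64 - expected;
            diff * diff / expected
        })
        .sum()
}

The p-value is then obtained from the chi-squared distribution with the appropriate degrees of freedom, which is the separate assessment mentioned in the doc comment.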

Presently, the actual testing implemented by this branch includes Binned implementations for the interiors and boundaries of Circle and Sphere. Two wrapper types, InteriorOf<T> and BoundaryOf<T>, have been introduced for implementors of ShapeSample, allowing the constituent sampling methods to be used directly as Distributions. This adds modularity, since the library itself also operates at the level of Distributions.
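
A rough usage sketch of these wrappers (construction as plain tuple structs is an assumption here, not confirmed by the excerpt):

use bevy_math::{primitives::Circle, Vec2};
use rand::distributions::Distribution;

fn circle_samples() -> (Vec2, Vec2) {
    let mut rng = rand::thread_rng();
    // A point uniformly distributed over the interior of a unit circle...
    let inside = InteriorOf(Circle { radius: 1.0 }).sample(&mut rng);
    // ...and one uniformly distributed over its perimeter.
    let on_boundary = BoundaryOf(Circle { radius: 1.0 }).sample(&mut rng);
    (inside, on_boundary)
}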


Changelog

  • Moved shape_sampling.rs into a new sampling submodule of bevy_math that holds all of the rand dependencies.
  • New wrapper structs InteriorOf<T> and BoundaryOf<T> allow conversion of ShapeSample implementors into Distributions.

Discussion

Caveat emptor

The statistical tests in sampling/statistical_tests/impls.rs are marked #[ignore] so that they do not run in CI testing. They must never, ever, ever run in CI testing. The purpose of these statistical tests is that they reliably fail when something is wrong — not that they always succeed when everything is fine.

Presently, the alpha-level of each individual test is .001, meaning that each constituent check fails spuriously about 0.1% of the time even when the sampled distribution is correct; with the current volume of tests, this means that about 1% of the time, some failure would occur even if everything were perfect.
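
Concretely, with k independent checks each run at level α, the probability of at least one spurious failure is

$$
1 - (1 - \alpha)^k \;\approx\; k\alpha \quad \text{(for small } \alpha\text{)},
$$

which at α = 0.001 and the current number of checks works out to roughly the 1% quoted above.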

On the other hand, the chi-squared statistic grows with sample size when the sampled distribution does not match the expected one, while staying bounded (near the degrees of freedom) when it does. That is to say: statistical biases in the output should cause the tests to fail quite reliably, so they do not need to be run particularly often. We can use very large sample sizes to ensure this if need be.

Personally, I am not sure what the best way of using these tests would be other than running them manually. Presently, this can be done as follows:

cargo test -p bevy_math -- --ignored

What?

I'm sure this looks like building a death ray to kill an ant. In a sense, it is. Frankly, I didn't build this because I wanted to (not that I didn't enjoy myself), but because I couldn't think of any other way to externally assess the quality of our sampling code that was actually meaningful. For example, using a fixed-seed RNG and comparing output to some known values doesn't really demonstrate anything (and, in fact, breaks spuriously when the code is refactored).

@NthTensor NthTensor added C-Testing A change that impacts how we test Bevy or how users test their apps A-Math Fundamental domain-agnostic mathematical operations labels Apr 1, 2024
@NthTensor

Chi-squared, Nice! Looks pretty good at a glance, I will have time to review later in the week.
