Using `std::normal_distribution` rather than inverse transform sampling? #196

matt-graham · 2023-07-27T15:01:45Z

The lines

NESO/solvers/Electrostatic2D3V/ParticleSystems/charged_particles.hpp

Lines 255 to 257 in 9423bff

    
           auto rvx = boost::math::erf_inv(2 * uniform01(rng_phasespace) - 1); 
        
           auto rvy = boost::math::erf_inv(2 * uniform01(rng_phasespace) - 1); 
        
           auto rvz = boost::math::erf_inv(2 * uniform01(rng_phasespace) - 1);

are I think sampling Cartesian components of a velocity vector from independent standard normal distributions, using an inverse transform sampling approach. While it probably has a negligible performance impact overall, for a standard normal distribution inverse transform sampling is generally a relatively inefficient way to generate random variates, and the implementation of random variate generation for std::normal_distribution is likely to be quicker (*). Probably more importantly, inverse transform sampling can also be less numerically stable than alternatives when sampling values in the tails of the distribution, depending on how accurate the inverse cumulative distribution function / (or here inverse error function) approximation is, and also I would say is less readable compared to something like

std::normal_distribution<double> standard_normal(0, 1);
...
auto rvx = standard_normal(rng_phasespace); 
auto rvy = standard_normal(rng_phasespace);
auto rvz = standard_normal(rng_phasespace);

Is there some other reason for preferring the current inverse transform sampling approach that I'm missing though?

(*) Sampling with std::normal_distribution appears to be about 2× quicker than the current inverse transform sampling method in the example below on my laptop.

Microbenchmark

#include <chrono>
#include <iostream>
#include <random>
#include <boost/math/special_functions/erf.hpp>

int main() {
    const long int seed = 278342987407;
    const int num_draws = 100000000;
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> standard_uniform(0, 1);
    std::normal_distribution<double> standard_normal(0, 1);
    double draw;
    
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < num_draws; i++) {
        draw = standard_normal(rng);
    }
    auto stop = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> time_taken = stop - start;
    
    std::cout << "Time taken for " << num_draws
              << " std::normal_distribution draws: "
              << time_taken.count() << " seconds" << std::endl;
         
    start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < num_draws; i++) {
        draw = boost::math::erf_inv(2 * standard_uniform(rng) - 1);
    }
    stop = std::chrono::high_resolution_clock::now();
    time_taken = stop - start;
    
    std::cout << "Time taken for " << num_draws
              << " boost::math::erf_inv inverse transform sampling draws: "
              << time_taken.count() << " seconds" << std::endl;
    
    return 0;
}

Output

Time taken for 100000000 std::normal_distribution draws: 2.45805 seconds
Time taken for 100000000 boost::math::erf_inv inverse transform sampling draws: 5.36911 seconds

The text was updated successfully, but these errors were encountered:

jwscook · 2023-07-27T19:25:15Z

Hello Matt. Thanks for your interest!

inverse transform sampling can also be less numerically stable than alternatives when sampling values in the tails of the distribution

I didn't know this. Can you provide a source (not that I don't believe you, but I'd like to read more)?

I'd be more than happy to review a PR if you'd like to contribute? Also, out of paranoia, are there any cheeky differing factors of sqrt(2) between these functions?

matt-graham · 2023-07-27T21:28:53Z

inverse transform sampling can also be less numerically stable than alternatives when sampling values in the tails of the distribution

I didn't know this. Can you provide a source (not that I don't believe you, but I'd like to read more)?

In looking for a reference for this I think I've instead found that it might have been an outdated belief on my part about the accuracy of numerical approximations to the inverse error function 😅. There is a discussion of the relative precision of the the inverse transform and Box-Muller methods for generating normal random variates in this blog post by Christian Robert which found that counter to his expectations the inverse transform method remained accurate in the tails and I've just a quick investigation with std::normal_distribution and the boost::math::erf_inv based inverse transform sampling approach which suggest both give similarly decent tail probability estimates even quite far in to the tails (both give reasonable estimates for $\mathrm{Pr}(|X| > 6)$ with $X\sim\mathcal{N}(0,1)$). So on the accuracy side it doesn't seem there is any benefit actually.

I'd be more than happy to review a PR if you'd like to contribute?

I'd be happy to submit a PR though equally if you think in the context of above its not worthwhile that's also fine.

Also, out of paranoia, are there any cheeky differing factors of sqrt(2) between these functions?

Good catch - there is indeed a missing factor of sqrt(2) in what I had above as the relationship between the inverse standard normal CDF and inverse error function is $\Phi^{-1}(u) = \sqrt{2} \mathrm{erf}^{-1}(2 u - 1)$ so would need something more like

std::normal_distribution<double> normal_half_var(0, 1 / sqrt(2));
...
auto rvx = normal_half_var(rng_phasespace); 
auto rvy = normal_half_var(rng_phasespace);
auto rvz = normal_half_var(rng_phasespace);

to match current distribution.

jwscook · 2023-07-28T10:22:08Z

Great, thanks for this, and I'm happy to hear that I have unknowingly not fallen into a trap by avoiding the extra couple of lines required to write a Box-Muller transform.

I think using a normal distribution is less surprising than the inv_erf function, and more people will understand the intent. If you don't mind pushing a PR, I'd be happy to review + merge.

matt-graham linked a pull request Aug 2, 2023 that will close this issue

[WIP] Using std::normal_distribution in place of inverse transform sampling #203

Draft

14 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Using `std::normal_distribution` rather than inverse transform sampling? #196

Using `std::normal_distribution` rather than inverse transform sampling? #196

matt-graham commented Jul 27, 2023

jwscook commented Jul 27, 2023

matt-graham commented Jul 27, 2023

jwscook commented Jul 28, 2023

Using std::normal_distribution rather than inverse transform sampling? #196

Using std::normal_distribution rather than inverse transform sampling? #196

Comments

matt-graham commented Jul 27, 2023

jwscook commented Jul 27, 2023

matt-graham commented Jul 27, 2023

jwscook commented Jul 28, 2023

Using `std::normal_distribution` rather than inverse transform sampling? #196

Using `std::normal_distribution` rather than inverse transform sampling? #196