Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using std::normal_distribution rather than inverse transform sampling? #196

Open
matt-graham opened this issue Jul 27, 2023 · 3 comments · May be fixed by #203
Open

Using std::normal_distribution rather than inverse transform sampling? #196

matt-graham opened this issue Jul 27, 2023 · 3 comments · May be fixed by #203

Comments

@matt-graham
Copy link

The lines

auto rvx = boost::math::erf_inv(2 * uniform01(rng_phasespace) - 1);
auto rvy = boost::math::erf_inv(2 * uniform01(rng_phasespace) - 1);
auto rvz = boost::math::erf_inv(2 * uniform01(rng_phasespace) - 1);

are I think sampling Cartesian components of a velocity vector from independent standard normal distributions, using an inverse transform sampling approach. While it probably has a negligible performance impact overall, for a standard normal distribution inverse transform sampling is generally a relatively inefficient way to generate random variates, and the implementation of random variate generation for std::normal_distribution is likely to be quicker (*). Probably more importantly, inverse transform sampling can also be less numerically stable than alternatives when sampling values in the tails of the distribution, depending on how accurate the inverse cumulative distribution function / (or here inverse error function) approximation is, and also I would say is less readable compared to something like

std::normal_distribution<double> standard_normal(0, 1);
...
auto rvx = standard_normal(rng_phasespace); 
auto rvy = standard_normal(rng_phasespace);
auto rvz = standard_normal(rng_phasespace);

Is there some other reason for preferring the current inverse transform sampling approach that I'm missing though?

(*) Sampling with std::normal_distribution appears to be about 2× quicker than the current inverse transform sampling method in the example below on my laptop.

Microbenchmark
#include <chrono>
#include <iostream>
#include <random>
#include <boost/math/special_functions/erf.hpp>

int main() {
    const long int seed = 278342987407;
    const int num_draws = 100000000;
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> standard_uniform(0, 1);
    std::normal_distribution<double> standard_normal(0, 1);
    double draw;
    
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < num_draws; i++) {
        draw = standard_normal(rng);
    }
    auto stop = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> time_taken = stop - start;
    
    std::cout << "Time taken for " << num_draws
              << " std::normal_distribution draws: "
              << time_taken.count() << " seconds" << std::endl;
         
    start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < num_draws; i++) {
        draw = boost::math::erf_inv(2 * standard_uniform(rng) - 1);
    }
    stop = std::chrono::high_resolution_clock::now();
    time_taken = stop - start;
    
    std::cout << "Time taken for " << num_draws
              << " boost::math::erf_inv inverse transform sampling draws: "
              << time_taken.count() << " seconds" << std::endl;
    
    return 0;
}

Output

Time taken for 100000000 std::normal_distribution draws: 2.45805 seconds
Time taken for 100000000 boost::math::erf_inv inverse transform sampling draws: 5.36911 seconds
@jwscook
Copy link
Member

jwscook commented Jul 27, 2023

Hello Matt. Thanks for your interest!

inverse transform sampling can also be less numerically stable than alternatives when sampling values in the tails of the distribution

I didn't know this. Can you provide a source (not that I don't believe you, but I'd like to read more)?

I'd be more than happy to review a PR if you'd like to contribute? Also, out of paranoia, are there any cheeky differing factors of sqrt(2) between these functions?

@matt-graham
Copy link
Author

inverse transform sampling can also be less numerically stable than alternatives when sampling values in the tails of the distribution

I didn't know this. Can you provide a source (not that I don't believe you, but I'd like to read more)?

In looking for a reference for this I think I've instead found that it might have been an outdated belief on my part about the accuracy of numerical approximations to the inverse error function 😅. There is a discussion of the relative precision of the the inverse transform and Box-Muller methods for generating normal random variates in this blog post by Christian Robert which found that counter to his expectations the inverse transform method remained accurate in the tails and I've just a quick investigation with std::normal_distribution and the boost::math::erf_inv based inverse transform sampling approach which suggest both give similarly decent tail probability estimates even quite far in to the tails (both give reasonable estimates for $\mathrm{Pr}(|X| &gt; 6)$ with $X\sim\mathcal{N}(0,1)$). So on the accuracy side it doesn't seem there is any benefit actually.

I'd be more than happy to review a PR if you'd like to contribute?

I'd be happy to submit a PR though equally if you think in the context of above its not worthwhile that's also fine.

Also, out of paranoia, are there any cheeky differing factors of sqrt(2) between these functions?

Good catch - there is indeed a missing factor of sqrt(2) in what I had above as the relationship between the inverse standard normal CDF and inverse error function is $\Phi^{-1}(u) = \sqrt{2} \mathrm{erf}^{-1}(2 u - 1)$ so would need something more like

std::normal_distribution<double> normal_half_var(0, 1 / sqrt(2));
...
auto rvx = normal_half_var(rng_phasespace); 
auto rvy = normal_half_var(rng_phasespace);
auto rvz = normal_half_var(rng_phasespace);

to match current distribution.

@jwscook
Copy link
Member

jwscook commented Jul 28, 2023

Great, thanks for this, and I'm happy to hear that I have unknowingly not fallen into a trap by avoiding the extra couple of lines required to write a Box-Muller transform.

I think using a normal distribution is less surprising than the inv_erf function, and more people will understand the intent. If you don't mind pushing a PR, I'd be happy to review + merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants