boost::compute::normal_distribution variates have wrong standard deviation, and can produce infinite/nan values

The normal_distribution implementation in boost::compute has some problems.

First, it generates values with standard deviation about 1.2 times what was requested.  The following code demonstrates:

```
#include <boost/compute/system.hpp>
#include <boost/compute/command_queue.hpp>
#include <boost/compute/algorithm/accumulate.hpp>
#include <boost/compute/algorithm/inner_product.hpp>
#include <boost/compute/container/vector.hpp>
#include <boost/compute/random/default_random_engine.hpp>
#include <boost/compute/random/normal_distribution.hpp>
#include <boost/compute/lambda.hpp>

int main() {
    boost::compute::device device(boost::compute::system::default_device());
    boost::compute::context context(boost::compute::system::default_context());
    boost::compute::command_queue queue(boost::compute::system::default_queue());
    boost::compute::vector<float> vec(1000000, context);
    boost::compute::default_random_engine engine(queue);
    boost::compute::normal_distribution<float> distribution(5.0f, 1.0f);
    distribution.generate(vec.begin(), vec.end(), engine, queue);
    float mean = boost::compute::accumulate(vec.begin(), vec.end(), 0.f) / vec.size();
    float mean_sq = boost::compute::inner_product(vec.begin(), vec.end(), vec.begin(), 0.f) / vec.size();
    float stdev = sqrt(mean_sq - mean*mean);
    std::cout << "mean: " << mean << " (expect 5); stdev: " << stdev << " (expect 1)\n";
}
```

which outputs:

mean: 5.00037 (expect 5); stdev: 1.20036 (expect 1)

This is caused by these lines in boost/compute/random/normal_distribution.hpp:

```
            const RealType z1 = sqrt(-2.f * log2(x1)) * cos(2.f * M_PI_F * x2);
            const RealType z2 = sqrt(-2.f * log2(x1)) * sin(2.f * M_PI_F * x2);
```

both of those log2 calls have to be log (i.e. using the natural logarithm).  The ~1.2 scale in the standard deviation comes out of this mathematically: log2(x) = ln(x) / ln(2), and the square root of 1/ln(2) equals 1.2011224.  i.e. the standardized variates are too large by a factor of sqrt(1/ln(2)).

I _think_ (but have not verified) that there is another problem here: the generated x1 and x2 values can take on values of either 0 or a value larger than 1:

```
            const RealType x1 = x.x / (RealType) (UINT_MAX - 1);
            const RealType x2 = x.y / (RealType) (UINT_MAX - 1);
```

When x.x equals 0, x1 = 0, which results in both z1 and z2 being infinite, which is bad.

When x.x equals UINT_MAX, you can get a value larger than 1 for x1, which gives a positive value for the log, which then results in taking the sqrt of a negative, which is nan.  Note, however, that this will only shows up when RealType has sufficient precision: when RealType is float, UINT_MAX-1 (=4294967294) actually rounds to UINT_MAX+1 (=4294967296).  If RealType is double, however, you can run into the problem.

Basically, x1 needs to be in (0,1], and it currently violates the strictness of the left of that range, and can, depending on RealType, violate the right end of the range.

See boost/random/normal_distribution.hpp from a pre-1.56.0 version of boost (https://github.com/boostorg/random/blob/8c30753bf49504b5f677b8b410a3787907c86bd0/include/boost/random/normal_distribution.hpp) for a properly-working Box-Muller implementation (pre-1.56 because, in 1.56.0, the Box-Muller implementation was replaced with a ziggurat implementation).
## 


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

boost::compute::normal_distribution variates have wrong standard deviation, and can produce infinite/nan values #587

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

boost::compute::normal_distribution variates have wrong standard deviation, and can produce infinite/nan values #587

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions