mir.random.nonuniform: Add Ziggurat method for Normal & Exponential #261

Closed · wants to merge 6 commits

Conversation

5 participants
Member

wilzbach commented Jul 19, 2016 • edited

 Adds the Ziggurat sampling algorithm for the Normal & Exponential distributions.

Marsaglia, George, and Wai Wan Tsang. "The ziggurat method for generating random variables." Journal of Statistical Software 5.8 (2000): 1-7.

Thoughts:

• there are two other important papers: "Design flaws of the Ziggurat algorithm" (the main criticism is that the same variable is used to pick both the block & the value) and "An Improved Ziggurat Method to Generate Normal Random Samples"
• it doesn't look perfect yet, the mean is -0.02 :/
• Ziggurat works for all distributions that can be reduced to a monotone decreasing distribution (-> will assemble a list tomorrow)

Distribution plots (dub ./examples/nonuniform_plot.d):
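For reference, the core of the method can be sketched in Python (illustrative only — the PR itself is D). The table construction below uses the exponential constants from this PR's factory function (k = 256, rightEnd = 7.697117470131487, block area v = 3.949659822581572e-3); the sampling loop draws the layer index independently, sidestepping the bit-reuse flaw discussed below:

```python
import math
import random

# Ziggurat for Exp(1), following Marsaglia & Tsang (2000).
# K layers of equal area V; R is the right edge of the base layer.
K = 256
R = 7.697117470131487
V = 3.949659822581572e-3

# Layer right edges x_1 = R > x_2 > ... > x_K = 0, from
# V = x_i * (f(x_{i+1}) - f(x_i)) with f(x) = exp(-x).
xs = [R]
for _ in range(K - 1):
    xs.append(-math.log(math.exp(-xs[-1]) + V / xs[-1]))
xs[-1] = 0.0  # closes exactly at the mode in exact arithmetic

def sample_exp(rng):
    while True:
        i = rng.randrange(K)      # layer index, drawn independently
        u = rng.random()
        if i == 0:
            # base layer: rectangle [0, R] plus the tail, total area V
            x = u * V / math.exp(-R)
            if x < R:
                return x
            # tail: the exponential is memoryless beyond R
            return R - math.log(1.0 - rng.random())
        x = u * xs[i - 1]         # xs[i-1] is x_i, the layer's right edge
        if x < xs[i]:
            return x              # strictly below the curve, no pdf evaluation
        # wedge: exact accept/reject against the density
        f_lo = math.exp(-xs[i - 1])
        f_hi = math.exp(-xs[i])
        if f_lo + rng.random() * (f_hi - f_lo) < math.exp(-x):
            return x

rng = random.Random(42)
samples = [sample_exp(rng) for _ in range(200_000)]
print(sum(samples) / len(samples))  # should be close to 1
```

The common case (x strictly inside the rectangle) returns without evaluating the pdf at all, which is where the method's speed comes from.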
 mir.random.nonuniform: Add Ziggurat method for Normal & Exponential 
 6a6565f 


wilzbach reviewed Jul 19, 2016

    }
};
return Ziggurat!(T, fallback, R, true)(pdf, invPdf, 128, rightEnd, T(9.91256303526217e-3));

wilzbach Jul 19, 2016

Member

I actually would like to run the initialization in CTFE, as it will never change, but exp uses inline assembler, which isn't supported in CTFE :/
Does anyone have an idea?

joseph-wakeling-sociomantic Jul 19, 2016

Well, start by filing an issue against phobos asking for a CTFE'able exp.

BTW where on earth does this magic constant 9.91256...e-3 come from? I would suggest making it a named manifest constant.

wilzbach Jul 19, 2016

Member

Well, start by filing an issue against phobos asking for a CTFE'able exp.

Ok thanks - done. I will test whether copying the non-inline version from Phobos works.

BTW where on earth does this magic constant 9.91256...e-3 come from? I would suggest making it a named manifest constant.

It also comes from [Marsaglia00] - there it's called v.
It's the area of every block and thus depends on k and the distribution.

-> I will declare it more explicitly.
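For the record, v can be reproduced directly from its definition in [Marsaglia00]: it is the common area of each of the k blocks, i.e. v = r·f(r) + tail(r), where r is the right end and tail(r) the area under the (unnormalised) density beyond r. A quick Python check — note the normal r below (3.442619855899, the paper's value for k = 128) does not appear in this diff hunk, so it is an assumption here:

```python
import math

# Exponential: f(x) = exp(-x), tail(r) = exp(-r)
r_exp = 7.697117470131487
v_exp = r_exp * math.exp(-r_exp) + math.exp(-r_exp)

# Normal: f(x) = exp(-x^2/2) unnormalised,
# tail(r) = sqrt(pi/2) * erfc(r / sqrt(2))
r_norm = 3.442619855899  # assumed: standard value from the paper for k = 128
v_norm = (r_norm * math.exp(-0.5 * r_norm ** 2)
          + math.sqrt(math.pi / 2) * math.erfc(r_norm / math.sqrt(2)))

print(v_exp)   # ~3.949659822581572e-3
print(v_norm)  # ~9.91256303526217e-3
```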

wilzbach reviewed Jul 19, 2016

/// precalculate scaling to R.max for x_i
T[] xScaled;
/// precalculate pdf value for x_i
T[] fs;

wilzbach Jul 19, 2016 • edited

Member

"An Improved Ziggurat Method to Generate Normal Random Samples" claims that saving the fs isn't necessary, which I couldn't follow yet:

DRanU() returns a uniform random number, U (0, 1), and IRanU() returns 32-bit
unsigned random integer. DRanNormalTail is implemented as a separate function:
it gets called only rarely, so that efficiency does not matter. For the same reason, it
is not necessary to avoid a call to exp() when checking for the wedges (this could
be achieved by precomputing the function values f(x_i)). (page 7)

IMHO saving them doesn't waste much space and saves time.

wilzbach reviewed Jul 19, 2016

T function(T x) invPdf;
/// precalculate difference x_i / x_{i+1}
T[] xDiv;

wilzbach Jul 19, 2016

Member

As the array size is known at compile-time, should we use T[k]?

joseph-wakeling-sociomantic Jul 19, 2016

Don't see why not, but right now k is not provided at compile time AFAICS?

wilzbach Jul 20, 2016

Member

Don't see why not, but right now k is not provided at compile time AFAICS?

Well, it's not strictly needed at compile time (in comparison to the other compile-time parameters), but I don't see any use case where one would want to choose the block size at runtime, as rightEnd and averageArea need to be precomputed by hand anyway.

joseph-wakeling-sociomantic reviewed Jul 19, 2016

Marsaglia, George, and Wai Wan Tsang. "The ziggurat method for generating random variables." Journal of statistical software 5.8 (2000): 1-7.
*/
struct Ziggurat(T, string _fallback, R = uint, bool bothSides)

joseph-wakeling-sociomantic Jul 19, 2016

Instead of R I would use UIntType (it matches the typical template-parameter name in both phobos std.random and C++11 <random> for the word-type of the uniform RNG).

However, in this case, is there any possibility to avoid needing the word-type to be known at compile time?

joseph-wakeling-sociomantic Jul 19, 2016

Also, note that while a dependency on the word size is potentially more flexible, it's worth considering also just templating the ziggurat on the actual RNG type.

wilzbach Jul 20, 2016

Member

However, in this case, is there any possibility to avoid needing the word-type to be known at compile time?

Also, note that while a dependency on the word size is potentially more flexible, it's worth considering also just templating the ziggurat on the actual RNG type.

Hmm, while I understand the motivation of providing a simple API for the user, this would create even more template bloat (a Ziggurat instantiation for every RNG) and limit the API more than needed. Hence I think making opCall generic over the RNG is probably the better way to go.

joseph-wakeling-sociomantic reviewed Jul 19, 2016

}

/// samples a value from the discrete distribution using a custom random generator
T opCall(RNG)(ref RNG gen) const

joseph-wakeling-sociomantic Jul 19, 2016

Assuming you do need to know the word-type (your template parameter R) at compile time, you might want to validate that typeof(gen.front) matches it.

Note that in principle at least it's possible to handle different R and generator return type: if R's size is a multiple of the generator's word size, then you can populate it by several calls to the generator; conversely, if R is smaller than the generator's word size, you can use a single RNG variate to provide several of the needed values. But this may be adding excessive complexity.
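The "multiple of the word size" case is cheap to sketch (Python, purely illustrative; `gen32` is an assumed callable returning uniform 32-bit integers):

```python
def draw64(gen32):
    # Build one 64-bit word from two 32-bit draws;
    # putting the first draw in the low half is an arbitrary convention.
    lo = gen32()
    hi = gen32()
    return lo | (hi << 32)

# deterministic stand-in generator for demonstration
vals = iter([1, 2])
w = draw64(lambda: next(vals))
print(hex(w))  # 0x200000001
```

The converse direction (serving several small words from one large draw) needs a one-word cache, which is where the "excessive complexity" concern comes in.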

wilzbach Jul 20, 2016

Member

Assuming you do need to know the word-type (your template parameter R) at compile time, you might want to validate that typeof(gen.front) matches it.

Nice idea (done).

Note that in principle at least it's possible to handle different R and generator return type: if R's size is a multiple of the generator's word size, then you can populate it by several calls to the generator; conversely, if R is smaller than the generator's word size, you can use a single RNG variate to provide several of the needed values. But this may be adding excessive complexity.

Good point - how common is it to have something different than uint or ulong?

joseph-wakeling-sociomantic Jul 20, 2016

Good point - how common is it to have something different than uint or ulong?

Not common, I think, at least not these days.

joseph-wakeling-sociomantic reviewed Jul 19, 2016

import mir.internal.math : exp, log;
auto pdf = (T x) => cast(T) exp(-x);
auto invPdf = (T x) => cast(T) -log(x);

joseph-wakeling-sociomantic Jul 19, 2016

I'm not sure I like the explicit casts. Surely it's possible to get the same result here without them?

wilzbach Jul 20, 2016

Member

(Idk why my comment wasn't saved.) The problem is that exp only accepts and returns real.

joseph-wakeling-sociomantic Aug 5, 2016

Just as the normal implementation looks to be missing mean + variance, here the exponential implementation is missing its \lambda control parameter.
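Both missing parameters are affine transforms of the standard variates, so they could be layered on top of the Ziggurat output rather than baked into the tables (Python, illustrative; function names are placeholders, not the PR's API):

```python
# If Z ~ Exp(1), then Z / lam ~ Exp(lam)      (mean 1/lam).
# If Z ~ N(0, 1), then mu + sigma * Z ~ N(mu, sigma^2).

def scale_exponential(z, lam):
    return z / lam

def scale_normal(z, mu, sigma):
    return mu + sigma * z

print(scale_exponential(2.0, 4.0))   # 0.5
print(scale_normal(1.5, 10.0, 2.0))  # 13.0
```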

joseph-wakeling-sociomantic reviewed Jul 19, 2016

    }
};
return Ziggurat!(T, fallback, R, false)(pdf, invPdf, 256, T(7.697117470131487), T(3.949659822581572e-3));

joseph-wakeling-sociomantic Jul 19, 2016

While calculations using pdf and invPdf can only be done at runtime for now (because of the exp implementation issues), is there any reason why the actual lambdas can't be provided as template parameters? It would make for a more logical design, I think (and also be future-proof against the point when you get a CTFE'able exp).

joseph-wakeling-sociomantic Jul 19, 2016

I would also suggest defining explicit templates for Normal(T), Exponential(T) etc. to wrap these Ziggurat instantiations.

joseph-wakeling-sociomantic Jul 19, 2016

An alternative would be to define actual Normal(T) and Exponential(T) wrapper structs that just use a Ziggurat instance internally; that might also give you a more future-proof API should you ever wish to rework the internals.

That might be better, because it'll probably be easier for users to debug stuff if they see a type called Normal!double instead of Ziggurat!(LOTS, OF, DIFFERENT, STUFF).

wilzbach Jul 19, 2016

Member

I put everything that is absolutely needed at compile time as a template argument.
Does it make a huge difference? - I thought in the end we can just use enum z = Ziggurat!..


joseph-wakeling-sociomantic Jul 19, 2016

It makes less of a difference if the Ziggurat instance is wrapped away inside a type that's explicitly Normal or Exponential or whatever. But if you're set on returning a raw Ziggurat instantiation from your normal and exponential factory functions, it might be useful to have the PDF and inverse PDF clearly there in the template parameters.

wilzbach Jul 20, 2016

Member

An alternative would be to define actual Normal(T) and Exponential(T) wrapper structs that just use a Ziggurat instance internally; that might also give you a more future-proof API should you ever wish to rework the internals.

I really like the idea of having a broad Normal(T, UIntType) API, but with my current attempt (see below - for some reason the PR didn't go through yesterday) it is yet another function call.

Should I build a Ziggurat struct with mixins?

 R -> UIntType 
 9897526 

Current coverage is 96.74% (diff: 97.77%)

Merging #261 into master will increase coverage by 0.21%

@@             master       #261   diff @@
==========================================
Files            19         21     +2
Lines          3262       3874   +612
Methods           0          0
Messages          0          0
Branches          0          0
==========================================
+ Hits           3149       3748   +599
- Misses          113        126    +13
Partials          0          0          

 use fixed-size arrays 
 2b8263a 
Member

wilzbach commented Jul 20, 2016 • edited

(will post more summaries here soon, here's a brief overview)

Design flaws of the Ziggurat algorithm

two main criticisms:

• the same variable is used to pick both the block & the value (the last 7 or 8 bits). However, 2^50 values are needed to detect this
• SHR3 is not uniform (shouldn't affect us)

An Improved Ziggurat Method to Generate Normal Random Samples

• use double
• two random variables (one separate 7bit one to pick the block) -> optimization uses 1 + 1/4 random variables
• no precomputation of f(x_i)
• idea: use two random integers for higher-precisions values
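The first flaw is easy to see in code: in the classic implementation, one 32-bit draw serves both purposes (Python, illustrative):

```python
w = 0x12345678  # one 32-bit draw from the generator

# classic ziggurat: the low bits pick the layer ...
i = w & 0x7F    # 7-bit layer index (128 layers)

# ... and the *same* word, reused, becomes the coordinate, so the
# coordinate's low bits are correlated with the layer index
u = w / 2.0 ** 32

print(i, u)
```

The improved variant spends a separate 7-bit draw on the layer index; amortised over batches, that costs roughly 1 + 1/4 uniform words per sample, which matches the bullet above.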

wilzbach added some commits Jul 20, 2016

 add more ddoc comments to Ziggurat 
 2660915 
 fix style 
 894f364 
Member

wilzbach commented Jul 20, 2016 • edited

 btw an interesting overview paper is "Gaussian Random Number Generators" by Thomas et al. Summary: the Wallace method is the fastest, but doesn't provide good statistical quality. Ziggurat was the second-fastest method in the huge benchmark while passing the chi-squared test and achieving good scores on the high-sigma test (see tables 3 and 4).

joseph-wakeling-sociomantic reviewed Aug 5, 2016

Authors: Sebastian Wilzbach
*/
module mir.random.nonuniform;

joseph-wakeling-sociomantic Aug 5, 2016

Minor, but didn't we agree to create a mir.random.distribution package to contain everything ... ?

Member

See: #262

joseph-wakeling-sociomantic reviewed Aug 5, 2016

gen.popFront();
// TODO: this is a bit biased
size_t i = u & kMask;

joseph-wakeling-sociomantic Aug 5, 2016

Given your TODO here, I assume this is the first place you're looking to understand why the plotted distributions don't look quite right .... ?

wilzbach Aug 5, 2016

Member

Yes. Resolving whether we can reuse the random bits or need to generate another 1/4 random variable (one of the optimizations described in the later papers) shouldn't matter so much :/

joseph-wakeling-sociomantic reviewed Aug 5, 2016

T x, y, u;
do
{
    u = uniform!("[]", T, T)(0, 1, gen);

joseph-wakeling-sociomantic Aug 5, 2016

Marsaglia & Tsang talk about UNI (in their paper) being a generator of "uniform (0, 1) variates". That would suggest the open rather than the closed interval, i.e. uniform!"()" rather than uniform!"[]". Does that make a difference to your distribution results?
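The distinction matters here because the uniform ends up inside a logarithm: -log(u) (exponential) and sqrt(-2 log u) (normal invPdf) are both undefined at u == 0, which uniform!"[]" can return. One standard mapping to the open interval, sketched in Python (illustrative):

```python
import math

def open_unit(w, bits=32):
    # Map a `bits`-bit integer w to the open interval (0, 1):
    # (w + 0.5) / 2^bits can never be exactly 0 or 1.
    return (w + 0.5) / float(1 << bits)

print(open_unit(0))          # smallest value, still > 0
print(open_unit(2**32 - 1))  # largest value, still < 1
print(-math.log(open_unit(0)))  # finite, no domain error
```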

joseph-wakeling-sociomantic reviewed Aug 5, 2016

// TODO: this is a bit biased
size_t i = u & kMask;
//size_t i = uniform!("[)", size_t, size_t)(0, kMask, gen);

joseph-wakeling-sociomantic Aug 5, 2016

Minor: note that uniform!T should give uniform distribution across all possible values of an integral type T. But in this case arguably unnecessary.

joseph-wakeling-sociomantic reviewed Aug 5, 2016

import mir.internal.math : exp, log, sqrt;
auto pdf = (T x) => cast(T) exp(T(-0.5) * x * x);
auto invPdf = (T x) => cast(T) sqrt(T(-2) * log(x));

joseph-wakeling-sociomantic Aug 5, 2016

Aren't these definitions missing the mean and variance parameters of the normal distribution? I recognize that one can generate any normal distribution once one has variates from N(0, 1), but surely that should be baked into the implementation?

Also: even allowing for N(0, 1), isn't the PDF missing the divisor \sqrt{2 \pi} ... ? And presumably this means the inverse PDF is also missing something correspondingly?
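Regarding the 1/\sqrt{2 \pi}: the ziggurat only ever compares density values against each other (in the table construction and the wedge test), so a constant normalising factor cancels — presumably why the unnormalised density is used, with invPdf = sqrt(-2 log x) being its exact inverse. A quick check of the scale invariance of the wedge test (Python, illustrative):

```python
import math

f = lambda x: math.exp(-0.5 * x * x)  # unnormalised N(0,1) density
c = 1.0 / math.sqrt(2.0 * math.pi)    # normalisation constant

# wedge test: y uniform between f(x_i) and f(x_{i+1}), accept if y < f(x)
x_i, x_next = 1.2, 0.9                # layer edges, x_next < x < x_i
for x, u in [(1.0, 0.1), (1.1, 0.9), (0.95, 0.5)]:
    plain = f(x_i) + u * (f(x_next) - f(x_i)) < f(x)
    scaled = c * f(x_i) + u * (c * f(x_next) - c * f(x_i)) < c * f(x)
    assert plain == scaled            # the constant c cancels
print("normalisation cancels")
```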

Member

9il commented Aug 5, 2016

 Does the Ziggurat method yield better results for the normal distribution than other methods in Atmosphere or dstats?
Member

wilzbach commented Aug 5, 2016

 Does the Ziggurat method yield better results for the normal distribution than other methods in Atmosphere or dstats?

What's the best way to compare?
Member

wilzbach commented Aug 5, 2016

 @joseph-wakeling-sociomantic The NormalDist CDF looks good from what I can judge; Exp doesn't.
Member

9il commented Aug 5, 2016

 Does the Ziggurat method yield better results for the normal distribution than other methods in Atmosphere or dstats? What's the best way to compare?

If you don't know whether Ziggurat yields better results, then we do not need Ziggurat.
Member

9il commented Aug 5, 2016

 What's the best way to compare?

There is no single best way. The best option is to read a couple of articles. We need to understand whether we need Ziggurat or not first, before spending time on it.

joseph-wakeling-sociomantic commented Aug 5, 2016

 If you don't know whether Ziggurat yields better results, then we do not need Ziggurat.

The existing academic literature would suggest that Ziggurat is a very effective method; the paper by Thomas et al. offers a variety of tests of statistical quality that could be used, IIRC. The question is: given the timelines, is it worth pushing on with Ziggurat, or would it be better to implement more basic implementations of the various distributions, and return to Ziggurat as longer-term work?
Member

9il commented Aug 5, 2016

 Question is, given the timelines, is it worth pushing on with Ziggurat, or would it be better to implement more basic implementations of the various distributions, and return to Ziggurat as longer-term work?

The first question is: is Ziggurat better for basic distributions than basic implementations of them? If there is no strong Yes, then the next step after Tinflex is the basic implementations.
Member

wilzbach commented Aug 5, 2016

 The existing academic literature would suggest that Ziggurat is a very effective method; the paper by Thomas et al. offers a variety of tests of statistical quality that could be used, IIRC.

Yes, sorry, I should have been more precise. @9il what would be needed to convince you that Ziggurat is better than the algorithms in Atmosphere or dstats? Is a X^2 test (that's what they use to evaluate the goodness of fit) ok, or should I also do the high-sigma test (more complex)?

Question is, given the timelines, is it worth pushing on with Ziggurat, or would it be better to implement more basic implementations of the various distributions, and return to Ziggurat as longer-term work?

We have time at least until October. I would prefer to go with the "better" algorithm and tune it. We already have the basic implementations in dstats against which we can benchmark and compare.
Member

wilzbach commented Aug 5, 2016

 The first question is: is Ziggurat better for basic distributions than basic implementations of them? If there is no strong Yes, then the next step after Tinflex is the basic implementations.

Quote from my comment above: btw an interesting overview paper is "Gaussian Random Number Generators" by Thomas et al. Summary: the Wallace method is the fastest, but doesn't provide good statistical quality. Ziggurat was the second-fastest method in the huge benchmark while passing the chi-squared test and achieving good scores on the high-sigma test (see tables 3 and 4).
 Add cumulative histogram plotting 
 b7541d9 
Member

9il commented Aug 5, 2016

 btw an interesting overview paper is "Gaussian Random Number Generators" by Thomas et al. Summary: the Wallace method is the fastest, but doesn't provide good statistical quality. Ziggurat was the second-fastest method in the huge benchmark while passing the chi-squared test and achieving good scores on the high-sigma test (see tables 3 and 4).

The Wallace method is not specialised for Normal, if I am not wrong.
Member

wilzbach commented Aug 5, 2016

 The Wallace method is not specialised for Normal, if I am not wrong.

AFAIK Ziggurat isn't either. In the literature it's just commonly used only for the Normal and exponential distributions; however, it's a general method that works for all monotone decreasing distributions (or distributions whose symmetric half is monotone decreasing).

joseph-wakeling-sociomantic commented Aug 5, 2016

 First question is Ziggurat better for basic distributions than basic implementations of them?

I would say, "Yes, but." The "but" is because Ziggurat is more complicated to implement correctly (as we're learning). So, in terms of the current project, I would say there's a tradeoff between getting Ziggurat correct versus getting a good variety of basic distributions in place with simpler (but more limited) algorithms.

joseph-wakeling-sociomantic commented Aug 5, 2016

 however it's a general method that works for all monotone decreasing distributions (or distributions whose symmetric half is monotone decreasing)

Yes, exactly.


WebDrake commented Aug 10, 2016

 We have time at least until October.

The emails I'm getting from GSoC suggest that we're supposed to be finished by the end of August, with 23 August as your own deadline for finalizing code and 29 August as the deadline for Ilya and me to submit our final evaluation report? https://developers.google.com/open-source/gsoc/timeline
Member

wilzbach commented Aug 10, 2016

 The emails I'm getting from GSoC suggest that we're supposed to be finished by the end of August, with 23 August as your own deadline for finalizing code and 29 August as the deadline for Ilya and me to submit our final evaluation report? https://developers.google.com/open-source/gsoc/timeline

Yep, but that doesn't stop me from continuing to work (I know that I wasted quite a lot of time). My submission will only include the (Tin)flex algorithm, but I am still on the mission to write (the building blocks for) a new & fast std.random for D. Hence I suggested doing it properly, and as the benchmark in #286 suggests, it's worth it.

WebDrake commented Aug 10, 2016

 Yep, but that doesn't stop me from continuing to work (I know that I wasted quite a lot of time).

It's great that you want to keep working, but I was concerned about the expectations raised in the description of your GSoC project and what the people responsible might expect to see (which is why I earlier raised the possibility of doing some basic implementations of a variety of non-uniform distributions). @9il you're the primary mentor here, so what are your thoughts?
Member

9il commented Aug 10, 2016

 @9il you're the primary mentor here, so what are your thoughts?

We already have two general-purpose discrete RNG implementations, and Tinflex will be ready soon. Tinflex is a hard numeric project without obvious workforce requirements. The R version contains a lot of numeric bugs, many of which have been fixed in this project. We could stamp out / copy-paste the Boost RNGs, and that is not a problem. But copy-pasting is not the same as producing proper RNG numbers. First, we need a proper shell over std.random and fixed uniform generators. @wilzbach expected that he would add them during GSoC; at the same time, I didn't expect it. If @wilzbach had implemented a variety of non-uniform distributions but not Tinflex, then I would not be able to consider this GSoC project finished.

WebDrake commented Aug 10, 2016

 If @wilzbach had implemented a variety of non-uniform distributions but not Tinflex, then I would not be able to consider this GSoC project finished.

I agree with that; I'm asking whether we should implement a variety of non-uniform distributions instead of (short-term) focusing on Ziggurat. Tinflex obviously takes primacy, as it is the most significant part of the promised work.
Member

9il commented Aug 10, 2016

 I agree with that; I'm asking whether we should implement a variety of non-uniform distributions instead of (short-term) focusing on Ziggurat. Tinflex obviously takes primacy, as it is the most significant part of the promised work.

Yes, I prefer to add a variety of non-uniform distributions instead of Ziggurat.