
PI example bug? #1

Closed
simendsjo opened this Issue April 15, 2011 · 9 comments

simendsjo

I tried the first PI example in your documentation, and I get different results each time I run it. It seems the first 12 digits are the same each time. I added the following at the end:
writefln("%1.19f", pi);

I've tried without -O and -inline, but I see the same behavior each time. This is using dmd 2.052 on Windows on an Intel Xeon with 4 cores and HT.

simendsjo

FYI, running sequential returns the same number each time.

simendsjo closed this April 15, 2011
simendsjo reopened this April 15, 2011
David Simcha
Owner

Thanks for your report. These results are probably correct. As I note in the documentation for parallel reduce, the reduction operator must be associative. Addition in exact arithmetic is associative. Floating point addition is not associative, but it is approximately associative in the well-behaved cases. Therefore, there will always be some non-determinism in the low-order bits, at least when work unit size, etc. is varied. As far as non-determinism across runs on the same hardware with the same settings, I'll look into it tonight, but I suspect it's still some weird floating point issue, not a bug in std.parallelism.
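For readers unfamiliar with the associativity point, here is a minimal sketch in D. The function names and the values are made up for illustration and are not part of std.parallelism; the values are deliberately extreme so the effect is obvious, whereas in the pi example the same mechanism should only disturb the last few digits.

```d
import std.stdio : writeln;

/// Adds left to right: (a + b) + c.
double sumLeftFirst(double a, double b, double c)
{
    return (a + b) + c;
}

/// Adds right to left: a + (b + c).
double sumRightFirst(double a, double b, double c)
{
    return a + (b + c);
}

void main()
{
    // Hypothetical values: two huge terms that cancel, plus one small term.
    immutable a = 1e20, b = -1e20, c = 1.0;

    // a and b cancel first, so c survives.
    writeln(sumLeftFirst(a, b, c));

    // c is too small to change b, so it is lost before the cancellation.
    writeln(sumRightFirst(a, b, c));
}
```

A parallel reduce groups the terms differently depending on the number of threads and the work unit size, which is the same regrouping on a larger scale.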

David Simcha dsimcha closed this April 15, 2011
David Simcha dsimcha reopened this April 15, 2011
David Simcha
Owner

I can't reproduce this on my dual core Athlon 64 X2. The results are slightly different depending on how many threads I use, which is expected since floating point addition is only approximately associative. However, across identical runs the results are consistent.

If you can still reproduce this bug, it may be that different cores are in different floating point rounding modes or something, and the answer depends on what core things get scheduled on. Based on rereading the code, I can't see how only the low-order bits could be affected by any kind of concurrency bug.

David Simcha dsimcha closed this April 15, 2011
simendsjo

What do you mean by "identical runs"?
I'm at another computer now, but the results remain the same. I changed the example a bit:

void calcPi() {
    immutable pi = 4.0 * taskPool.reduce!"a + b"(
        std.algorithm.map!getTerm(iota(n))
    );
    writefln("%1.19f", pi);
}

calcPi();
calcPi();
calcPi();

C:\temp>dmd -inline -O -release -ofpi pi

C:\temp>pi
3.1415926555897901559
3.1415926555897906796
3.1415926555897890834

C:\temp>pi
3.1415926555897859763
3.1415926555897986311
3.1415926555897861944

Not sure that's what you meant by identical though..
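For anyone who wants to try this, a self-contained version of the snippet above might look like the following. This is a sketch under assumptions: the post does not show `getTerm` or `n`, so both are reconstructed here along the lines of the pi example in the std.parallelism documentation, and `calcPi` is changed to take `n` and return the value instead of printing it.

```d
import std.algorithm : map;
import std.parallelism : taskPool;
import std.range : iota;
import std.stdio : writefln;

/// Approximates pi by summing n terms with a parallel reduce.
real calcPi(int n)
{
    immutable delta = 1.0 / n;

    // Assumed definition of getTerm; it is not shown in the post.
    real getTerm(int i)
    {
        immutable x = (i - 0.5) * delta;
        return delta / (1.0 + x * x);
    }

    return 4.0 * taskPool.reduce!"a + b"(map!getTerm(iota(n)));
}

void main()
{
    enum n = 1_000_000_000; // assumed term count; not stated in the post

    // Same computation three times in one process, as in the post.
    foreach (run; 0 .. 3)
        writefln("%1.19f", calcPi(n));
}
```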

simendsjo

Oh, and by using --nCpu=2 I get 13/14 decimals the same each run (more than when running with 4/8)... I don't know much about floating point, so I don't understand much of what you are saying.
Running on multiple cores does what..? "in different floating point rounding modes"? Is parallel computation of floating point somehow dangerous..?

David Simcha
Owner

My apologies. I can reproduce this. I actually have two different pi examples and for some reason I thought you meant the other one. Reopening. I still think this is probably related to some obscure floating point minutiae and not a "real" bug, but I'd like to find out for sure before closing it.

As far as rounding modes, Wikipedia has a good description (http://en.wikipedia.org/wiki/Floating_point#Rounding_modes). I don't understand much more. All I know is that rounding modes are set per-CPU and I can't think of any other reason for such strange behavior.

Generally I don't even look at the low order bits of my floating point results because there are so many details (such as compiler optimizations and rounding modes) that can change them and the answer will still be right for all practical purposes. In this case I think looking into it further is justified because I can't think of any reason why the answers wouldn't be exactly the same.
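To make the rounding-mode idea above concrete, here is a rough sketch using `FloatingPointControl` from std.math. The helper names, the term 0.1 and the count are made up for illustration, and whether the two printed values differ depends on the compiler and hardware honoring the mode change. It only shows that the same additions can give different low-order digits under different rounding modes; it is not a claim about what causes the behavior reported in this issue.

```d
import std.math : FloatingPointControl;
import std.stdio : writefln;

/// Adds term to a running total count times, one addition at a time.
double repeatedSum(double term, size_t count)
{
    double total = 0.0;
    foreach (i; 0 .. count)
        total += term;
    return total;
}

/// Runs repeatedSum with the hardware rounding mode forced up or down.
double sumWithRounding(bool upward, double term, size_t count)
{
    // The previous mode is restored when ctrl goes out of scope.
    FloatingPointControl ctrl;
    ctrl.rounding = upward ? FloatingPointControl.roundUp
                           : FloatingPointControl.roundDown;
    return repeatedSum(term, count);
}

void main()
{
    // Hypothetical inputs: one million additions of 0.1.
    writefln("%1.19f", sumWithRounding(false, 0.1, 1_000_000));
    writefln("%1.19f", sumWithRounding(true, 0.1, 1_000_000));
}
```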

David Simcha dsimcha reopened this April 15, 2011
David Simcha
Owner

Another weird observation: The smallest element of the range we're summing is about 5e-10. Therefore, differences that only show up in the 12th decimal place have to be somehow rounding-related or something. We're definitely not skipping any terms, etc.

David Simcha
Owner

Thanks for your report. It was extremely interesting to track down. It turns out that it's not a bug in std.parallelism, it's a bug in the way druntime and Windows handle floating point state when creating new threads. See http://d.puremagic.com/issues/show_bug.cgi?id=5847 .

David Simcha dsimcha closed this April 16, 2011