Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stackplot_test_baseline has different results on 32-bit and 64-bit platforms #1726

Merged
merged 1 commit into from Feb 4, 2013

Conversation

mdboom
Copy link
Member

@mdboom mdboom commented Feb 4, 2013

I haven't quite gotten down to the bottom of this one yet, but this is the cause of the current failures on Travis.

On a 64-bit Linux, I get the following for stackplot_test_baseline (and I suspect this is correct):

stackplot_test_baseline-expected

And on 32-bit:

stackplot_test_baseline

The problem appears to be somewhere in the calculation of the data limits. On 64-bit, the xmin is exactly 0, on 32-bit it's a very small negative number which causes the locator to add an extra step on the left.

@dmcdougall
Copy link
Member

I feel like we've had an issue exactly like this in the past with bounding boxes. A cheap-and-dirty hack would be to update the the [xy]lim in the test so that no bad rounding happens in the 32-bit case.

@mdboom
Copy link
Member Author

mdboom commented Jan 31, 2013

That's not a bad idea for a quick fix to get the tests passing -- but I'd still like to get to the bottom of this, of course.

@mdboom
Copy link
Member Author

mdboom commented Feb 1, 2013

Okay -- I'm posting this mostly out of desperation to see if anyone else has any ideas.

The difference in these platforms comes down to the affine transformation. At one point, the collection is mapped through a transformation to determine the bounds of the data. While the inputs to _path.cpp::affine_transform are bit-for-bit exactly the same, the outputs are not. It is all double arithmetic throughout, so I am surprised there is a platform difference (as one might expect from long doubles which are different on these platforms).

I've boiled it down to a simple C program:

#include <stdio.h>

int main() {
  double b = 0.0d;
  double x = 80.0d;
  double d = 0x1.5555555555555p-9d;
  double y = 48.0d;
  double f = -0x1.ffffffffffffdp-4d;

  double t = b * x + d * y + f;

  printf("Output %g %a\n", t, t);
}

In the Travis environment (which is a recent i686 Ubuntu with gcc 4.6), I get:

Output 3.46945e-17 0x1.4p-55

On my personal machine, F18 x86_64 with gcc 4.7, I get:

Output 4.16334e-17 0x1.8p-55

It is this small difference that is causing the limit detection to fall over on this test.

We have some ways to solve this: 1) find out why there is this inconsistency and work around it if possible. 2) make the locators to ignore very small crossings of the boundaries -- this of course needs to be some dynamically determined fraction of the overall limits. (That is, if the true range is (-1e-45 to 100), it should just round to (0, 100)).

In the long run, we're probably going to have to do something like 2) in any event to be more robust against floating-point error anyway. Any other thoughts?

@WeatherGod
Copy link
Member

Additional datapoint: CentOS6, 64-bit, gcc version 4.4.6
Output 4.16334e-17 0x1.8p-55

So, it seems to be dependent on 32/64 bits and not gcc versions. If I can
get my old laptop running tonight, I can confirm with a 32-bit data point.

@mdboom
Copy link
Member Author

mdboom commented Feb 4, 2013

Alright. This was a fun one.

This article was helpful in decoding some of the quirks of x86 floating-point. It may actually be the case that to trigger this you have to be both x86 (not x86-64) and non-SSE (which is true for the Travis VMs).

http://www.yosefk.com/blog/consistency-how-to-defeat-the-purpose-of-ieee-floating-point.html

A fix is attached, which (hopefully) should pass on Travis.

@mdboom
Copy link
Member Author

mdboom commented Feb 4, 2013

The Travis tests are better than master (only one failure now), so I'm ready to count this as a success ;)

mdboom added a commit that referenced this pull request Feb 4, 2013
stackplot_test_baseline has different results on 32-bit and 64-bit platforms
@mdboom mdboom merged commit a4e4c68 into matplotlib:master Feb 4, 2013
@juliantaylor
Copy link
Contributor

the code kind of looks like its relevant for performance, would it make sense to only use the unoptimized version on the platform its needed via ifdefs?
Also doesn't the line right below the fixed one have the same problem?

maybe one could also use long double instead of double, those still have decent performance on i386

@mdboom
Copy link
Member Author

mdboom commented Feb 20, 2013

@juliantaylor: This fix isn't really about optimization -- it's able consistent floating point results across platforms. Can you clarify what you mean by "Also doesn't the line right below the fixed one have the same problem?"

Using long double instead of double would actually exacerbate the problem, since they are of different sizes on i386 and x86_64.

@juliantaylor
Copy link
Contributor

the else clause below the fixed one has the same expression with lots of temporaries which are not spilled to the stack explicitly, it may have the same inconsistency problem.

long double has the same size on i386 and amd64, 80 bit the size of the fpu stack (with gcc aligned to 16 byte). Using them means you don't lose your excess precision when the variable spills on the stack making it more consistent (but you lose vectorization on amd64)
It may not exist on some non x86 platforms, but this issue is x86 specific.

a drastic alternative would be to ditch support for non SSE x86 cpu's and change the fpu mode to sse globally (-mfpmath=sse with gcc), but it would make all float calculations in matplotlib more consistent.

@mdboom mdboom deleted the affine_transform_math branch August 7, 2014 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants