stackplot_test_baseline has different results on 32-bit and 64-bit platforms #1726

Merged 1 commit into matplotlib:master from mdboom:affine_transform_math on Feb 4, 2013


4 participants

mdboom commented Feb 4, 2013

I haven't quite gotten down to the bottom of this one yet, but this is the cause of the current failures on Travis.

On a 64-bit Linux, I get the following for stackplot_test_baseline (and I suspect this is correct):


And on 32-bit:


The problem appears to be somewhere in the calculation of the data limits. On 64-bit, the xmin is exactly 0, on 32-bit it's a very small negative number which causes the locator to add an extra step on the left.


dmcdougall commented Jan 31, 2013

I feel like we've had an issue exactly like this in the past with bounding boxes. A cheap-and-dirty hack would be to update the [xy]lim in the test so that no bad rounding happens in the 32-bit case.


mdboom commented Jan 31, 2013

That's not a bad idea for a quick fix to get the tests passing -- but I'd still like to get to the bottom of this, of course.


mdboom commented Feb 1, 2013

Okay -- I'm posting this mostly out of desperation to see if anyone else has any ideas.

The difference between these platforms comes down to the affine transformation. At one point, the collection is mapped through a transformation to determine the bounds of the data. While the inputs to _path.cpp::affine_transform are bit-for-bit exactly the same, the outputs are not. It is all double arithmetic throughout, so I am surprised there is a platform difference (of the sort one might expect from long doubles, which do differ between these platforms).

I've boiled it down to a simple C program:

#include <stdio.h>

int main(void) {
  double b = 0.0;
  double x = 80.0;
  double d = 0x1.5555555555555p-9;
  double y = 48.0;
  double f = -0x1.ffffffffffffdp-4;

  double t = b * x + d * y + f;

  printf("Output %g %a\n", t, t);
  return 0;
}

In the Travis environment (which is a recent i686 Ubuntu with gcc 4.6), I get:

Output 3.46945e-17 0x1.4p-55

On my personal machine, F18 x86_64 with gcc 4.7, I get:

Output 4.16334e-17 0x1.8p-55

It is this small difference that is causing the limit detection to fall over on this test.

We have a couple of ways to solve this: 1) find out why there is this inconsistency and work around it if possible; 2) make the locators ignore very small crossings of the boundaries -- this of course needs to be some dynamically determined fraction of the overall limits. (That is, if the true range is (-1e-45, 100), it should just round to (0, 100).)

In the long run, we're probably going to have to do something like 2) in any event to be more robust against floating-point error anyway. Any other thoughts?


WeatherGod commented Feb 1, 2013

Additional data point: CentOS 6, 64-bit, gcc 4.4.6:
Output 4.16334e-17 0x1.8p-55

So, it seems to depend on 32-bit vs. 64-bit rather than on the gcc version. If I can get my old laptop running tonight, I can confirm with a 32-bit data point.


mdboom commented Feb 4, 2013

Alright. This was a fun one.

This article was helpful in decoding some of the quirks of x86 floating-point. It may actually be the case that to trigger this you have to be both x86 (not x86-64) and non-SSE (which is true for the Travis VMs).


A fix is attached, which (hopefully) should pass on Travis.


mdboom commented Feb 4, 2013

The Travis tests are better than master (only one failure now), so I'm ready to count this as a success ;)

mdboom added a commit that referenced this pull request on Feb 4, 2013:

Merge pull request #1726 from mdboom/affine_transform_math
stackplot_test_baseline has different results on 32-bit and 64-bit platforms

mdboom merged commit a4e4c68 into matplotlib:master on Feb 4, 2013

1 check failed: the Travis build failed.

juliantaylor commented

The code looks like it's relevant for performance; would it make sense to only use the unoptimized version, via ifdefs, on the platform where it's needed? Also, doesn't the line right below the fixed one have the same problem?

Maybe one could also use long double instead of double; those still have decent performance on i386.


mdboom commented Feb 20, 2013

@juliantaylor: This fix isn't really about optimization -- it's about consistent floating-point results across platforms. Can you clarify what you mean by "Also doesn't the line right below the fixed one have the same problem?"

Using long double instead of double would actually exacerbate the problem, since they are of different sizes on i386 and x86_64.

juliantaylor commented

The else clause below the fixed one has the same expression with lots of temporaries that are not spilled to the stack explicitly; it may have the same inconsistency problem.

long double has the same size on i386 and amd64: 80 bits, the width of the x87 FPU stack entries (with gcc, aligned to 16 bytes). Using it means you don't lose the excess precision when a variable spills to the stack, which makes results more consistent (but you lose vectorization on amd64). It may not exist on some non-x86 platforms, but this issue is x86-specific.

A drastic alternative would be to drop support for non-SSE x86 CPUs and switch the FPU mode to SSE globally (-mfpmath=sse with gcc); that would make all float calculations in matplotlib more consistent.

mdboom deleted the mdboom:affine_transform_math branch on Aug 7, 2014
