Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

masked arrays broken in py3k + gcc 5.1 on arch linux #4525

Closed
tacaswell opened this issue Jun 14, 2015 · 11 comments
Closed

masked arrays broken in py3k + gcc 5.1 on arch linux #4525

tacaswell opened this issue Jun 14, 2015 · 11 comments
Assignees
Labels
Build Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.
Milestone

Comments

@tacaswell
Copy link
Member

This demo http://matplotlib.org/devdocs/examples/pylab_examples/color_by_yvalue.html is broken on python 3.

Only the left most plotted region is shown and happens either with ma arrays or np.nan filled arrays (have not started digging yet, I assume the eventually follow the same code path). Filling with 0 works as expected.

I think bisect on this goes back to the cxx removal but many of the commits in there don't compile on python3 due to a numpy related gcc error so that isn't very helpful.

This smells like a subtle clipping bug.

attn @mdboom @efiring

The most common compilation error is:

gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DPY_ARRAY_UNIQUE_SYMBOL=MPL_matplotlib_ft2font_ARRAY_API -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -I/home/tcaswell/.virtualenvs/dd_3k/lib/python3.4/site-packages/numpy/core/include -I/home/tcaswell/.virtualenvs/dd_3k/include/freetype2 -I/usr/local/include -I/usr/include -I. -I/home/tcaswell/.virtualenvs/dd_3k/include/python3.4m -c src/ft2font_wrapper.cpp -o build/temp.linux-x86_64-3.4/src/ft2font_wrapper.o
In file included from src/file_compat.h:7:0,
                 from src/ft2font_wrapper.cpp:3:
src/ft2font_wrapper.cpp: In functionPyObject* PyFT2Image_as_array(PyFT2Image*, PyObject*, PyObject*)’:
src/ft2font_wrapper.cpp:142:47: error: ‘PyArray_UBYTEwas not declared in this scope
     return PyArray_SimpleNewFromData(2, dims, PyArray_UBYTE, self->x->get_buffer());
                                               ^
/home/tcaswell/.virtualenvs/dd_3k/lib/python3.4/site-packages/numpy/core/include/numpy/ndarrayobject.h:129:46: note: in definition of macroPyArray_SimpleNewFromDataPyArray_New(&PyArray_Type, nd, dims, typenum, NULL, \
                                              ^

Going to try rolling back to older versions of numpy te see if that fixes it.

For future reference, the output of bisect:

The first bad commit could be any of:

4a56cda
cfbf8d5
339537d
8dae818
6e19dac
a225968
697b3ef
c5ab747
5204635
af578f5
0624664
6d07179
ba40160
d862c47
d8c1d0d

We cannot bisect more!

Bisect Log:
git bisect start 'master' 'v1.4.3'
a165735 bad Merge pull request #4521 from jkseppan/issue-4475
21f46ff good REL : version 1.4.3

git bisect bad 056ba7a
056ba7a bad MNT : Remove NavigationToolbar2QTAgg

git bisect bad 112f199
112f199 bad Merge pull request #3752 from astrofrog/rc_context_exceptions

git bisect good 34ed45a
34ed45a good Merge pull request #3291 from joferkington/lightsource-enhancements

git bisect bad c6f24d3
c6f24d3 bad Merge pull request #3665 from jkseppan/fix-gdk-strides

git bisect good 0f4a68d
0f4a68d good Merge pull request #3547 from jkseppan/numpy-deprecation

git bisect bad 7891790
7891790 bad Remove reference to PyCXX

git bisect skip 697b3ef
697b3ef skip Restore parallel testing

git bisect skip 6d07179
6d07179 skip Change how array converter functions are defined

git bisect good 7b206eb
7b206eb good Skip shading of masked array test on numpy 1.6

git bisect skip 8dae818
8dae818 skip Remove some obsolete TODOs

git bisect skip af578f5
af578f5 skip Add .clang-format to repository

git bisect good ea8617d
ea8617d good fix E201 and E202 pep8 errors

git bisect skip 4a56cda
4a56cda skip Fix compilation

git bisect skip d862c47
d862c47 skip Fix Mac compile again

git bisect skip cfbf8d5
cfbf8d5 skip Remove debugging prints

git bisect skip ba40160
ba40160 skip Remove use of PyCXX in core C++ extensions

git bisect skip 339537d
339537d skip Fix curve simplification

git bisect good dbeed94
dbeed94 good Merge pull request #3661 from jenshnielsen/numpy_16_fixes

git bisect skip 6e19dac
6e19dac skip Fix segfault on Python 3

git bisect skip 0624664
0624664 skip Fix test_hexbin_pickable test

git bisect skip c5ab747
c5ab747 skip More Python 3 fixes

git bisect skip a225968
a225968 skip Documentation

git bisect skip 5204635
5204635 skip Windows-related fixes, thanks to @cgohlke

git bisect bad d8c1d0d
d8c1d0d bad Remove usage of old Numpy API

@tacaswell tacaswell added the Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions. label Jun 14, 2015
@tacaswell tacaswell added this to the next point release milestone Jun 14, 2015
@tacaswell
Copy link
Member Author

but one of my co-workers can not reproduce this

@ericdill
Copy link
Contributor

I am not seeing the same bug. I am running current master

$ git describe
v1.4.3-1752-g62bef78

with python 3

$ python --version
Python 3.4.3 :: Continuum Analytics, Inc.

$ ipython -c "import matplotlib; print('version = %s' % matplotlib.__version__); print('file = %s' % matplotlib.__file__)"
version = 1.5.dev1

file = /home/edill/dev/python/matplotlib/lib/matplotlib/__init__.py

and gcc 4.9.2

$ gcc --version
gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

and numpy 1.9.2

$ ipython -c "import numpy; print('version = %s' %  numpy.__version__)"
version = 1.9.2

@ericdill
Copy link
Contributor

Oh, right. And the output from running

# use masked arrays to plot a line with different colors by y-value
from numpy import logical_or, arange, sin, pi
from numpy import ma
from matplotlib.pyplot import plot, show

t = arange(0.0, 2.0, 0.01)
s = sin(2*pi*t)

upper = 0.77
lower = -0.77


supper = ma.masked_where(s < upper, s)
slower = ma.masked_where(s > lower, s)
smiddle = ma.masked_where(logical_or(s < lower, s > upper), s)

plot(t, slower, 'r', t, smiddle, 'b', t, supper, 'g')
show()

is:

screenshot from 2015-06-14 19 03 30

@efiring
Copy link
Member

efiring commented Jun 14, 2015

I'm seeing the same thing as @ericdill.

@tacaswell tacaswell changed the title masked arrays broken in py3k masked arrays broken in py3k + gcc 5.1 on arch linux Jun 15, 2015
@tacaswell
Copy link
Member Author

19:10 $ python --version
Python 3.4.3 :: Continuum Analytics, Inc.
19:13 $ gcc --version
gcc (GCC) 5.1.0
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I get this behavior with two diffenent Arch Linux boxes with both numpy 1.9.2 and 1.8.2

On a ubuntu box I get the expected behavior (gcc 4.9.2).

It does not work with system python + pip-installed almost everything on arch.

@tacaswell
Copy link
Member Author

The other fun thing about this is that as I pan what ever the left-most line segment is, is what is shown.

I am seeing warnings like:

gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DPY_ARRAY_UNIQUE_SYMBOL=MPL_matplotlib_backends__backend_agg_ARRAY_API -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -I/home/tcaswell/.virtualenvs/dd_3k/lib/python3.4/site-packages/numpy/core/include -I/home/tcaswell/.virtualenvs/dd_3k/include/freetype2 -I/usr/local/include -I/usr/include -I. -Iextern/agg24-svn/include -I/home/tcaswell/.virtualenvs/dd_3k/include/python3.4m -c src/_backend_agg_wrapper.cpp -o build/temp.linux-x86_64-3.4/src/_backend_agg_wrapper.o
In file included from src/path_converters.h:8:0,
                 from src/_backend_agg_basic_types.h:11,
                 from src/py_converters.h:18,
                 from src/_backend_agg_wrapper.cpp:2:
src/_backend_agg.h: In instantiation of ‘void RendererAgg::draw_markers(GCAgg&, PathIterator&, agg::trans_affine&, PathIterator&, agg::trans_affine&, agg::rgba) [with PathIterator = py::PathIterator]’:
src/_backend_agg_wrapper.cpp:275:5:   required from here
src/MPL_isnan.h:62:19: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
   ( (( MPL_U64(u) & 0x7ff0000000000000LL)  == 0x7ff0000000000000LL)) ? 1:0
                   ^
src/_backend_agg.h:600:21: note: in expansion of macro ‘MPL_notisfinite64’
                 if (MPL_notisfinite64(x) || MPL_notisfinite64(y)) {
                     ^
src/MPL_isnan.h:62:19: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
   ( (( MPL_U64(u) & 0x7ff0000000000000LL)  == 0x7ff0000000000000LL)) ? 1:0
                   ^
src/_backend_agg.h:600:45: note: in expansion of macro ‘MPL_notisfinite64’
                 if (MPL_notisfinite64(x) || MPL_notisfinite64(y)) {
                                             ^
src/MPL_isnan.h:62:19: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
   ( (( MPL_U64(u) & 0x7ff0000000000000LL)  == 0x7ff0000000000000LL)) ? 1:0
                   ^
src/_backend_agg.h:632:21: note: in expansion of macro ‘MPL_notisfinite64’
                 if (MPL_notisfinite64(x) || MPL_notisfinite64(y)) {
                     ^
src/MPL_isnan.h:62:19: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
   ( (( MPL_U64(u) & 0x7ff0000000000000LL)  == 0x7ff0000000000000LL)) ? 1:0
                   ^
src/_backend_agg.h:632:45: note: in expansion of macro ‘MPL_notisfinite64’
                 if (MPL_notisfinite64(x) || MPL_notisfinite64(y)) {
                                             ^

but I see them on the ubuntu box too

@tacaswell
Copy link
Member Author

This is almost certianly a complier/optimization if I throw in some couts in the nan-removal loop it works, remove them and it does not

22:55 $ git diff
diff --git a/src/path_converters.h b/src/path_converters.h
index 7755772..1db4f21 100644
--- a/src/path_converters.h
+++ b/src/path_converters.h
@@ -2,7 +2,7 @@

 #ifndef __PATH_CONVERTERS_H__
 #define __PATH_CONVERTERS_H__
-
+#include <iostream>
 #include "agg_path_storage.h"
 #include "agg_clip_liang_barsky.h"
 #include "MPL_isnan.h"
@@ -213,15 +213,18 @@ class PathNanRemover : protected EmbeddedQueue<4>
         {
             /* This is the fast path for when we know we have no curves */
             code = m_source->vertex(x, y);
+           std::cout << "in fast path" << std::endl;

             if (code == agg::path_cmd_stop ||
                 code == (agg::path_cmd_end_poly | agg::path_flags_close)) {
+               std::cout << "found end" << std::endl;
                 return code;
             }
-
+           std::cout << "x: " << *x << "y: "<< *y << std::endl;
             if (MPL_notisfinite64(*x) || MPL_notisfinite64(*y)) {
                 do {
                     code = m_source->vertex(x, y);
+                   std::cout << "** x: " << *x << "y: "<< *y << std::endl;
                     if (code == agg::path_cmd_stop ||
                         code == (agg::path_cmd_end_poly | agg::path_flags_close)) {
                         return code;

I will have a PR in soon that fixes this is a quieter way.

tacaswell added a commit to tacaswell/matplotlib that referenced this issue Jun 15, 2015
With gcc 5.1.0 it seems that the inner loop of the fast-path in
PathNanRemover gets optimized out of existence and the first non-finite
entry in the path prevents any further drawing.

This PR (rather hackishly) adds explicit de-references to x and y
to the loop to make sure the compiler does not decide it is unneeded.

Closes matplotlib#4525
@mdboom mdboom self-assigned this Jun 15, 2015
@mdboom
Copy link
Member

mdboom commented Jun 15, 2015

Thanks for doing all of the legwork/research on this. A possible fix is in #4527.

@jenshnielsen
Copy link
Member

The fix in #4527 makes sense to me. However, perhaps a test based on the example should be added since this seems not to be covered by any tests presently

@mdboom
Copy link
Member

mdboom commented Jun 15, 2015

It is covered by tests -- in fact, building with a gcc 5.1.0 compiler results in test failures on master. We don't have gcc 5.1.0 on Travis, of course.

@jenshnielsen
Copy link
Member

Ok great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Release critical For bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions.
Projects
None yet
Development

No branches or pull requests

5 participants