Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy all array_view members in copy constructor #3811

Merged
merged 3 commits into from Nov 18, 2014

Conversation

jbmohler
Copy link
Contributor

On my windows 7 32bit machine, the following script gives a traceback at the scatter call. While I've determined that this code operates with-out traceback and seemingly correct on my linux machine, the proposed changes are OS independent (frankly, I'm baffled why it works on Linux).

The copy ctor as written before this patch seemingly ignores the other members of array_view. This is not compatible with implementation of the operator[] returning a ND-1 dimension array but pointing to the same PyArray!

I suppose we'll need a comment from @mdboom since this was introduced in the removal of PyCXX.

import numpy
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas
from matplotlib.figure import Figure

POINTS = 500

figure = Figure(figsize=(6, 6), dpi=72)
ax = figure.add_subplot(1, 1, 1, projection=None)
scat = ax.scatter(numpy.arange(POINTS), numpy.sin(numpy.arange(POINTS)))
Traceback (most recent call last):
  File "C:\work\mpl_scatter_example.py", line 9, in <module>
    scat = ax.scatter(numpy.array([float(i) for i in range(POINTS)]), numpy.sin(numpy.arange(POINTS)))
  File "C:\Python27\lib\site-packages\matplotlib\axes\_axes.py", line 3690, in scatter
    self.add_collection(collection)
  File "C:\Python27\lib\site-packages\matplotlib\axes\_base.py", line 1459, in add_collection
    self.update_datalim(collection.get_datalim(self.transData))
  File "C:\Python27\lib\site-packages\matplotlib\collections.py", line 198, in get_datalim
    offsets, transOffset.frozen())
  File "C:\Python27\lib\site-packages\matplotlib\path.py", line 977, in get_path_collection_extents
    master_transform, paths, transforms, offsets, offset_transform))
ValueError: object too deep for desired array

@WeatherGod
Copy link
Member

I would speculate that this might even be the source of issues I have been seeing with tkagg...

Py_INCREF(m_arr);
m_data = other.m_data;
m_shape = other.m_shape;
m_strides = other.m_strides;
}

array_view(PyArrayObject *arr, char *data, npy_intp *shape, npy_intp *strides)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this ctor also need to specify the member variables?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Been awhile since I did heavy C++, but couldn't the copy ctor call the other ctor below for good code reuse?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling ctor's from other ctor's isn't supported since a ctor is special (sets up vtables & stuff). it's perhaps a good idea to call a helper, but that's arguably no better since it would still reference all the actual names of members.

whatever the case about this, travis isn't happy (in ways that really confuse me).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I could have sworn I was able to do something like this using gcc, but maybe I am thinking of subclassing?

In any case, it appears that C++11 added "delegating constructors": http://www-01.ibm.com/support/knowledgecenter/SSGH3R_11.1.0/com.ibm.xlcpp111.aix.doc/language_ref/delegating_ctors.html

Don't know what the compiler support is, but it is nice to know it has a name.

@mdboom
Copy link
Member

mdboom commented Nov 17, 2014

It's not that the current constructor ignores the other members of the view. It's that it uses set to initialize them. IIRC, the reason for doing that was so that the copy can request contiguous memory if it needs to... but in the end it turned out that we didn't need that and the contiguous parameter wasn't getting passed along correctly anyway.

So this seems fine -- but let's take the contiguous default parameter off of the copy constructor just so that no one expects it to work in the future.

@jbmohler
Copy link
Contributor Author

True, @mdboom does look superfluous, but I'm speaking of other.m_data & friends. The set call only sees the m_arr member of other, but the line

typename TransformArray::sub_t subtrans = transforms[i % Ntransforms];

in _path.h sets up a new array_view in operator[] which points to the same PyArray but with different m_data and friends.

@mdboom
Copy link
Member

mdboom commented Nov 17, 2014

Right -- I see that the copy constructor needs to be less smart, so I agree your change is correct. It must be being called in more cases with the MS compiler, though, otherwise I'd expect far less to be working for us on Linux.

TL;DR -- this looks good -- let's just additionally take the contiguous argument off.

Also, Travis-CI doesn't seem to like it. When you push your additional change, that will give it a chance to re-run the tests, and we'll see if they are consistently reproducible.

@jbmohler
Copy link
Contributor Author

travis is not happy, but I'm quite confused. The tests are passing on both my linux & windows box, but hanging in travis. travis also seems slower than usual, I'm going to push the next change sometime in the wee hours for less competition.

@jbmohler
Copy link
Contributor Author

(for those interested in (fairly important) c++ pedantry)

I think I've decoded the gcc/MSVC difference. In the following example program, the printf prints only in MSVC -- that is, MSVC takes the extra bool and still claims that constructor as a proper copy constructor. Applying this to our problem, I believe this means that gcc sends our array_view through a default copy constructor which just duplicates the entire structure. That would make the appearance of working, but I wonder about the reference counts.

Well, I'm not actually sure what gcc does, but the printf doesn't get called. It may (from reading stackoverflow) be a standards approved eliding of the copy constructor from a temporary.

#include <stdio.h>

class array_view
{
public:
    array_view(){};
    array_view(const array_view &other, bool contiguous=false)
    {
        printf("this prints when only compiled with MSVC\n");
    };
};

int main()
{
    array_view i = array_view();
    return 0;
}

@WeatherGod
Copy link
Member

Wow, this is fun! If I take out the "const" in the ctor's call signature, I get the following compiler error:

[centos6] [broot@rd22 ~]$ make testy
g++     testy.C   -o testy
testy.C: In function ‘int main()’:
testy.C:15: error: no matching function for call to ‘array_view::array_view(array_view)’
testy.C:7: note: candidates are: array_view::array_view(array_view&)
testy.C:6: note:                 array_view::array_view()
make: *** [testy] Error 1

Ok, so the compiler doesn't want to deal with mutable arguments. So, if I take out the ampersand, making it pass-by-value, I get this compiler error:

[centos6] [broot@rd22 ~]$ make testy
g++     testy.C   -o testy
testy.C:7: error: invalid constructor; you probably meant ‘array_view (const array_view&)’
make: *** [testy] Error 1

which means that gcc is looking for the const by-reference version of the ctor, but for whatever reason is then dropping it (indeed, the original version does not print that message).

@mdboom
Copy link
Member

mdboom commented Nov 17, 2014

@WeatherGod: My understanding is that the compiler is free to drop the copy constructor in this case and convert

array_view i = array_view();

to the equivalent:

array_view i();

It seems that gcc does this optimization, but maybe MSVC is not. Which also suggests that MSVC is calling the copy constructor in some context in practice where gcc does not, hence the likely source of this bug...

@mdboom
Copy link
Member

mdboom commented Nov 17, 2014

I'm going to fire up my VirtualBox image that simulates Travis and see if I can reproduce the Travis issues there.

@mdboom
Copy link
Member

mdboom commented Nov 17, 2014

The fix for Travis (I think) is to use Py_XINCREF in the copy constructor, since arr can legitimately be NULL when the array is zero-length or was instantiated with None. That seems to get things passing for me. I made a PR against this one: jbmohler#1

@ianthomas23
Copy link
Member

As pedantry is encouraged in this thread, I cannot resist 'helping'. Mike's comment that

array_view i = array_view();

is treated as

array_view i;

by g++ is correct. It is called Return Value Optimisation, and the first section of http://en.wikipedia.org/wiki/Return_value_optimization explains it well. To anyone still following this thread, I would not recommend reading that link unless you are already comfortable with C++ as it may give you sleepless nights! Evidently g++ uses RVO in this case, but MSVC does not.

@jbmohler: Your conclusion that g++ sends your array_view through a default (i.e. compiler-generated) copy constructor isn't correct. The user-defined copy constructor with the default argument is a valid copy constructor, and under these conditions the compiler will not generate another copy constructor. Hence provided the array_view code already in numpy_cpp.h has the correct reference counts, no extra compiler-generated code will be added and we should not have a problem under g++.

For completeness, ff you want to fully test the use of the copy constructor under g++ you need to change your test from

array_view i = array_view();

to

array_view a;
array_view i(a);

so that there is no RVO.

@jbmohler
Copy link
Contributor Author

@ianthomas23 -- thanks for that link to RVO. I deduced about as much by putting a printf in a ~array_view as well. I also learned by experimentation that MSVC does do the RVO when there is not the extra defaulted bool parameter. Well, actually, I don't think my simple example above is RVO -- I think that it is http://en.wikipedia.org/wiki/Copy_elision has implied by @mdboom .

It's been an educational day.

@jbmohler
Copy link
Contributor Author

I've removed the unused parameter contiguous. Thanks @mdboom for the PY_XINCREF (and for patiently working around my disbelief of travis). I believe this addresses everything above.

(Ignore my flubbed git push which briefly included more things than I intended in my git branch for this PR -- I rebased and the branch is now correct.)

@mdboom
Copy link
Member

mdboom commented Nov 18, 2014

Great -- Travis is now passing, and I'll take your word that this is resolving the original issue on Windows. Merging. Thanks for spotting this and tracking down a solution!

mdboom added a commit that referenced this pull request Nov 18, 2014
copy all array_view members in copy constructor
@mdboom mdboom merged commit d045fef into matplotlib:master Nov 18, 2014
@jbmohler jbmohler deleted the array_view_ctor branch November 18, 2014 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants