Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix SystemError when pickling datetime64 array with pickle5 #12748

Merged
merged 5 commits into from Jan 23, 2019

Conversation

pitrou
Copy link
Member

@pitrou pitrou commented Jan 15, 2019

Check for error when calling the PickleBuffer constructor in ndarray.__reduce_ex__,
and fall back on the regular __reduce__ implementation for arrays which don't
support taking a buffer.

Also fix some reference leaks.

Fixes gh-12745, gh-12793

@charris charris added this to the 1.16.1 release milestone Jan 15, 2019
@pitrou
Copy link
Member Author

pitrou commented Jan 15, 2019

@pierreglaser What are you doing on _pickle.c?

@pierreglaser
Copy link
Contributor

Dynamic/nested scope objects pickling (See this python-dev thread, been working on this for a little while already), feel free to reach out if you are interested!

@pitrou
Copy link
Member Author

pitrou commented Jan 19, 2019

I'll note that I get this error after running numpy/core/tests/test_multiarray.py:

*** Reference count error detected: 
an attempt was made to deallocate 12 (d) ***

The error is still there if I disable all pickling tests, though, so it's probably not pickling-related.

#if PY_VERSION_HEX >= 0x03080000
pickle_module = PyImport_ImportModule("pickle");
pickle_module = PyImport_ImportModule("pickle");
#elif PY_VERSION_HEX < 0x03080000 && PY_VERSION_HEX >= 0x03060000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit: you could drop the PY_VERSION_HEX < 0x03080000 here, it's implied by the condition above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right.

((PyObject*)self)->ob_type != &PyArray_Type) ||
PyDataType_ISUNSIZED(descr)) {
/* The PickleBuffer class from version 5 of the pickle protocol
* can only be used for arrays backed by a contiguous data buffer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PEP says "PickleBuffer can wrap any kind of buffer, including non-contiguous buffers.". Am I missing something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only contiguous buffers are correctly handled by the pickle module, though.
(this is something that should probably be improved in the future, but is non-trivial)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that mentioned in the PEP, pickle docs, or cpython bug tracker?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I have to update the PEP.

del a, DATA, carray
# check for reference leaks (gh-12793)
for ref in refs:
assert ref() is None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattip, this type of test will fail on pypy, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes, I should call gc.collect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could skip for pypy. These sorts of tests do tend to be fragile.

@seberg
Copy link
Member

seberg commented Jan 19, 2019

@pitrou I am curious, what do I need to reproduce that error when running the tests? It seems plausible that is new, or was hidden behind some reference leaks before?

@pitrou
Copy link
Member Author

pitrou commented Jan 19, 2019

python runtests.py -t numpy/core/tests/test_multiarray.py reproduces it here.

@seberg
Copy link
Member

seberg commented Jan 19, 2019

Ah, was wondering if I need a debug build at the least or so, because I cannot reproduce it (am on python 3.7.2).

@pitrou
Copy link
Member Author

pitrou commented Jan 19, 2019

I have Python 3.7.2 from Anaconda. Weird.

@seberg
Copy link
Member

seberg commented Jan 19, 2019

Am on arch linux right now, but not quite sure why it should make a difference... I guess that error message does mean the actual integer 12. Maybe there are just too many things potentially touching it to make it easy to reproduce :/.

@pitrou
Copy link
Member Author

pitrou commented Jan 19, 2019

Ok, I bisected the tests manually. The following test is sufficient to reproduce here:

def test_array_interface():
    # Test scalar coercion within the array interface
    class Foo(object):
        def __init__(self, value):
            self.value = value
            self.iface = {'typestr': '=f8'}

        def __float__(self):
            return float(self.value)

        @property
        def __array_interface__(self):
            return self.iface

    f = Foo(0.5)
    f.iface['shape'] = (2,)
    assert_raises(ValueError, np.array, f)

@seberg
Copy link
Member

seberg commented Jan 19, 2019

@pitrou thanks a lot! I can see the refcount decrase, I guess I just don't reach 0 on my system.

EDIT: Or dunno if its that, but I see the error if I run the test often enough anyway.

@pitrou
Copy link
Member Author

pitrou commented Jan 19, 2019

The following change seems to solve the issue:

diff --git a/numpy/core/src/multiarray/ctors.c b/numpy/core/src/multiarray/ctors.c
index 63bf5377a..4724b8472 100644
--- a/numpy/core/src/multiarray/ctors.c
+++ b/numpy/core/src/multiarray/ctors.c
@@ -2503,6 +2503,7 @@ PyArray_FromInterface(PyObject *origin)
     if (ret == NULL) {
         goto fail;
     }
+    dtype = NULL;  /* ref was stolen by PyArray_NewFromDescrAndBase */
     if (data == NULL) {
         if (PyArray_SIZE(ret) > 1) {
             PyErr_SetString(PyExc_ValueError,

I'll let someone else handle it, as I don't really know what the surrounding code is doing.

@seberg
Copy link
Member

seberg commented Jan 19, 2019

Dang, should have checked here earlier, just found that as well :).

@seberg
Copy link
Member

seberg commented Jan 19, 2019

Thanks @pitrou. I doubt many know that code well, but I figured the same fix, so that seems like a good sign.

@charris
Copy link
Member

charris commented Jan 22, 2019

Is this PR waiting on completion?

@pitrou
Copy link
Member Author

pitrou commented Jan 22, 2019

No, it should be ready.

@mattip
Copy link
Member

mattip commented Jan 22, 2019

one very minor nit. We can check after merge if the new test is flaky on PyPy and skip it in a new PR. LGTM.

@pitrou pitrou force-pushed the gh-12745-pickle5-systemerror branch from 9884804 to 7801e07 Compare January 22, 2019 09:17
@charris charris merged commit a5cace4 into numpy:master Jan 23, 2019
@charris
Copy link
Member

charris commented Jan 23, 2019

Thanks @pitrou .

@pitrou pitrou deleted the gh-12745-pickle5-systemerror branch January 23, 2019 19:44
charris pushed a commit to charris/numpy that referenced this pull request Jan 24, 2019
…py#12748)

* BUG: Fix SystemError when pickling datetime64 array with pickle5

Fixes numpygh-12745

* Fix reference and error handling

* Add a test for the fixed reference leak

* Fix for PyPy + simplify #if condition

* Fix comment
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Jan 24, 2019
@charris charris removed this from the 1.16.1 release milestone Jan 24, 2019
@pierreglaser
Copy link
Contributor

thanks for working on that @pitrou!

}
else {
PyErr_Format(PyExc_ValueError,
"__reduce_ex__ called with protocol > 5");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we expect this not to work with a later Python pickle protocol?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ndarray.__reduce_ex__ doesn't check return value of PickleBuffer.__new__
7 participants