BUG: Segmentation fault when unpickling an empty ndarray with a non-zero dimension #21009

Closed
wuciting opened this issue Feb 7, 2022 · 6 comments


wuciting commented Feb 7, 2022

Describe the issue:

Unpickling an empty ndarray that has a non-zero dimension causes a segmentation fault.
The first dimension needed to trigger the crash (93049 here) may depend on the platform. If you cannot reproduce the error, please increase the first dimension.
I think it has something to do with page allocation.

Reproduce the code example:

import pickle

import numpy as np

a = np.array([]).reshape(93049, 0)
print(a.shape)
s = pickle.dumps(a)
print(s)
b = pickle.loads(s)
print(b)

Error message:

Segmentation fault (core dumped)

NumPy/Python version information:

1.22.1 3.8.11 (default, Aug 3 2021, 15:09:35)
[GCC 7.5.0]
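
Because the failure is a crash rather than an exception, it can be convenient to run the reproduction in a child interpreter so the calling process survives. The following is a minimal sketch, not part of the original report: `CHILD_SCRIPT` and `run_isolated` are hypothetical names, and the child script is the reproduction above without the prints.

```python
import subprocess
import sys
import textwrap

# Hypothetical child script: the reproduction from the report, minus the prints.
CHILD_SCRIPT = textwrap.dedent("""
    import pickle
    import numpy as np

    a = np.array([]).reshape(93049, 0)
    b = pickle.loads(pickle.dumps(a))
    assert b.shape == a.shape
""")


def run_isolated(script=CHILD_SCRIPT):
    """Run `script` in a fresh interpreter and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
    )
    return result.returncode


if __name__ == "__main__":
    code = run_isolated()
    if code == 0:
        print("round trip succeeded")
    elif code < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        print(f"child killed by signal {-code}")
    else:
        print(f"child exited with status {code}")
```

On an affected build the child would be expected to die from a signal on POSIX systems; on a fixed build it should exit with status 0. A non-zero positive status can also simply mean that NumPy is not importable in the child.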

Member

seberg commented Feb 7, 2022

I am very confused why our tests do not seem to catch this, at least when running under valgrind (I assume they should). I am pretty sure I know what the issue is, and it should be a fairly straightforward fix if anyone wants to go bug hunting with gdb or valgrind. (A test that triggers this reliably would be good though, ideally one that also covers a few other branches.)

EDIT: silly me, you need the empty array, and we likely don't have a test with empty arrays.
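
A round-trip check over zero-size arrays might look roughly like the sketch below. It is an illustration only, not the test that was later added to NumPy: `SHAPES`, `HAVE_NUMPY`, and `empty_roundtrip_ok` are hypothetical names, and the shapes are arbitrary apart from the one taken from the report. On an affected build this would crash the interpreter instead of returning `False`.

```python
import importlib.util
import pickle

# Hypothetical guard so the helper can be skipped where NumPy is not installed.
HAVE_NUMPY = importlib.util.find_spec("numpy") is not None

# Zero-size shapes with at least one non-zero dimension, plus a plain empty 1-D case.
SHAPES = [(0,), (100, 0), (0, 100), (3, 0, 5), (93049, 0)]


def empty_roundtrip_ok(shape):
    """Pickle and unpickle a zero-size array; report whether shape and dtype survive."""
    import numpy as np  # imported lazily, see HAVE_NUMPY above

    original = np.zeros(shape)
    restored = pickle.loads(pickle.dumps(original))
    return (
        restored.shape == original.shape
        and restored.dtype == original.dtype
        and restored.size == 0
    )


if __name__ == "__main__":
    if HAVE_NUMPY:
        for shape in SHAPES:
            print(shape, empty_roundtrip_ok(shape))
    else:
        print("NumPy is not installed; nothing to check")
```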


doctormartin67 commented Feb 15, 2022

I traced the problem with gdb, and the issue does come from NumPy. The segfault occurs
at numpy/core/src/multiarray/methods.c:2208:
memcpy(PyArray_DATA(self), datastr, num);
PyArray_DATA casts self to a PyArrayObject_fields struct pointer to access its data field.
The comments above the declaration of this struct say that accessing its fields directly is deprecated and that NPY_NO_DEPRECATED_API should be defined to avoid this. I'm not sure why it isn't defined here. This is as far as I could get without studying NumPy in more depth.

Member

seberg commented Feb 15, 2022

Yes, that is all correct. The "deprecated" stuff isn't a worry, it only applies to users outside of NumPy anyway (and even then isn't problematic).
The code works for non-empty arrays, so the problem is just the num.
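
The thread does not state what value num held before the fix. One possible explanation, offered here as an assumption and not confirmed above, is that num was a byte count that ignores zero-length dimensions, while the pickled byte string for an empty array has length zero. The arithmetic can be sketched in plain Python; both function names are hypothetical and nothing here calls into NumPy.

```python
from math import prod


def logical_nbytes(shape, itemsize):
    """Bytes of actual data: product of all dimensions times the item size."""
    return prod(shape) * itemsize


def nbytes_ignoring_zero_dims(shape, itemsize):
    """Assumed alternative count that skips zero-length dimensions."""
    return prod(d for d in shape if d != 0) * itemsize


if __name__ == "__main__":
    shape = (93049, 0)  # shape from the report
    itemsize = 8        # float64, the default dtype of np.array([])
    print(logical_nbytes(shape, itemsize))
    print(nbytes_ignoring_zero_dims(shape, itemsize))
```

Under that assumption, a memcpy with the second count would read several hundred kilobytes from a source buffer that holds no data, which would fit a crash that only appears once the first dimension is large enough. For non-empty arrays the two counts agree, consistent with the code working in that case.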

Member

seberg commented Feb 15, 2022

@doctormartin67 thanks for looking into this. @alexdesiqueira had just started on this yesterday and has opened a PR now; sorry about the bad timing.

@doctormartin67

@seberg No worries, always have fun taking a look at seg faults in gdb

seberg pushed a commit that referenced this issue Feb 16, 2022
Changing num to the number of bytes in the input array, PyArray_NBYTES(self). Solves #21009.

* Fixing nbyte size in methods.c:memcpy

* Adding a test

* Re-adding removed newline

* Shrinking the test array to save memory
charris pushed a commit to charris/numpy that referenced this issue Mar 2, 2022
…y#21067)
lithomas1 pushed a commit to lithomas1/numpy that referenced this issue Mar 6, 2022
…y#21067)
Member

charris commented Mar 7, 2022

@seberg This is fixed, correct?

@seberg seberg closed this as completed Mar 7, 2022
melissawm pushed a commit to melissawm/numpy that referenced this issue Apr 12, 2022
…y#21067)
seberg pushed a commit to seberg/numpy that referenced this issue Apr 24, 2022
…y#21067)