numpy 1.11. Segfault: numpy.random.permutation on list of long strings #7710

asanakoy · 2016-06-06T18:42:07Z

I'm getting a segmentation fault when running numpy.random.permutation on a list that contains long strings. With short strings it works fine.

import numpy as np
a = ['a', 'a' * 100]
z = np.random.permutation(np.array(a))

NumPy version: 1.11.0
Python 2.7
OS: Ubuntu 14.04.4 LTS

GDB trace:

(gdb) run
Starting program: /usr/bin/python test.py

Program received signal SIGSEGV, Segmentation fault.
visit_decref.48915 (op=<unknown at remote 0x6161616161616161>, data=0x0) at ../Modules/gcmodule.c:360
360 ../Modules/gcmodule.c: No such file or directory.
(gdb) bt

#0 visit_decref.48915 (op=<unknown at remote 0x6161616161616161>, data=0x0) at ../Modules/gcmodule.c:360
#1 0x000000000057392b in dict_traverse.18526 (
op={<unknown at remote 0x6161616161616161>: <unknown at remote 0x6161616161616161>, <unknown at remote 0x6161616161616161>: <unknown at remote 0x6161616161616161>, <unknown at remote 0x6161616161616161>: <unknown at remote 0x61616161>, '__builtins__': {'bytearray': <type at remote 0x910680>, 'IndexError': <type at remote 0x913740>, 'all': <built-in function all>, 'help': <_Helper at remote 0x7ffff7e86210>, 'vars': <built-in function vars>, 'SyntaxError': <type at remote 0x915820>, 'unicode': <type at remote 0x9199c0>, 'UnicodeDecodeError': <type at remote 0x914760>, 'memoryview': <type at remote 0x907e00>, 'isinstance': <built-in function isinstance>, 'copyright': <_Printer(_Printer__data='Copyright (c) 2001-2014 Python Software Foundation.\nAll Rights Reserved.\n\nCopyright (c) 2000 BeOpen.com.\nAll Rights Reserved.\n\nCopyright (c) 1995-2001 Corporation for National Research Initiatives.\nAll Rights Reserved.\n\nCopyright (c) 1991-1995 Stichting Mathematisch Centrum, Amsterdam.\nAll Rights Reserved.', _Printer...(truncated), visit=0x54eee0 <visit_decref.48915>, 
arg=0x0) at ../Objects/dictobject.c:2113
#2 0x0000000000536476 in subtract_refs (containers=0x9186e0 <generations+96>) at ../Modules/gcmodule.c:385
#3 collect.49008 (generation=) at ../Modules/gcmodule.c:925
#4 0x000000000042749e in PyGC_Collect () at ../Modules/gcmodule.c:1440
#5 0x0000000000437d47 in Py_Finalize () at ../Python/pythonrun.c:449
#6 0x000000000044f993 in Py_Main (argc=, argv=0x7fffffffdc18) at ../Modules/main.c:665
#7 0x00007ffff7818f45 in __libc_start_main (main=0x44f9c2 , argc=2, argv=0x7fffffffdc18, init=,
fini=<optimised out>, rtld_fini=<optimised out>, stack_end=0x7fffffffdc08) at libc-start.c:287
#8 0x0000000000578c4e in _start ()

Reproducibility: ~ 50%


charris · 2016-06-06T18:49:08Z

Hmm, I don't see this on 64 bit fedora with 16 GiB memory.

In [1]: np.__version__
Out[1]: '1.11.0'

In [2]: a = ['a', 'a' * 100]

In [3]: z = np.random.permutation(np.array(a))

In [4]:

charris · 2016-06-06T18:49:54Z

What does the 50% reproducibility mean?

charris · 2016-06-06T18:51:35Z

And what is the full python version? Python 2.7.11 here.

asanakoy · 2016-06-06T18:54:01Z

I have 64-bit Ubuntu with 32 GB of memory.
Python 2.7.6
50% reproducibility means it does not crash every time; you need to run it at least 10 times.

And if you run it from an interactive session (ipython, for example), you will see the segfault only after you quit the session.

asanakoy · 2016-06-06T19:11:50Z

With Python 2.7.11 it worked for me too. So I don't know if it's an issue with numpy or with Python.

charris · 2016-06-06T19:42:46Z

OK, I got a segfault, but very unreliably. Ran once, happened. Ran in loop 1,000,000 times, nada. Not sure what is going on.

njsmith · 2016-06-06T20:56:34Z

The segfault is happening at interpreter shutdown... did you start and then shut down Python 1,000,000 times?

charris · 2016-06-06T21:11:31Z

IIRC, I got a segfault before shutting down.

njsmith · 2016-06-06T22:45:56Z

This is clearly some sort of memory/GC corruption, so some nondeterminism is to be expected. It also makes sense that interpreter shutdown would be a particularly likely time to hit the corruption: when shutting down, the interpreter tries to tear down and garbage-collect all objects, and the traceback at the top of this thread shows the crash happening during that process, in Py_Finalize.
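Because the crash tends to show up at interpreter shutdown, looping inside a single process may never trigger it; each attempt needs a fresh interpreter. A minimal sketch of such a harness follows. The helper name `count_signal_exits` is hypothetical, and the negative-return-code convention it relies on applies to POSIX systems only.

```python
import subprocess
import sys

# Reproduction snippet from the report; the child interpreter needs NumPy.
REPRO = (
    "import numpy as np\n"
    "a = ['a', 'a' * 100]\n"
    "z = np.random.permutation(np.array(a))\n"
)


def count_signal_exits(snippet, runs, python=sys.executable):
    """Run `snippet` in `runs` fresh interpreters and count signal deaths."""
    killed = 0
    for _ in range(runs):
        proc = subprocess.run(
            [python, "-c", snippet],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a negative return code means the child was terminated
        # by a signal (for example -11 for SIGSEGV).
        if proc.returncode < 0:
            killed += 1
    return killed
```

Calling `count_signal_exits(REPRO, 100)` against an affected interpreter and NumPy build would give a rough crash rate; with a fixed NumPy it should return 0.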

simongibbons · 2016-06-09T11:32:27Z

Ok so I think I've got this one figured out.

The issue arises when shuffle allocates the buffer it uses for swapping elements (on this line).

The buffer's dtype picks up the string length of the first element instead of the array's:

In [2]: a = np.array(['a', 'a' * 100])

In [3]: a.dtype
Out[3]: dtype('<U100')

In [4]: buf = np.empty_like(a[0])

In [5]: buf.dtype
Out[5]: dtype('<U1')

Now when we swap a longer element into that buffer, it will overflow, almost certainly overwriting something important, which then causes the segfault when garbage collection runs.
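To put numbers on the overflow, the sketch below compares the item sizes of the two dtypes and decodes the bogus pointer value from the GDB trace. It assumes Python 3, where these strings get a unicode dtype with 4 bytes per character. Under Python 2, as in the original report, plain `str` values produce a byte-string dtype with 1 byte per character, which fits a pointer filled entirely with 0x61 (the ASCII code of 'a').

```python
import numpy as np

a = np.array(['a', 'a' * 100])

# Bytes reserved by a buffer sized from the first element alone.
first_item_bytes = a[0].dtype.itemsize
# Bytes that every slot of the array actually occupies.
slot_bytes = a.dtype.itemsize
# Bytes written past the end of the buffer when a full slot is copied in.
overflow_bytes = slot_bytes - first_item_bytes
print(first_item_bytes, slot_bytes, overflow_bytes)

# The object "address" reported in the backtrace, read as raw bytes.
bogus_pointer = 0x6161616161616161
pointer_bytes = bogus_pointer.to_bytes(8, "little")
print(pointer_bytes)
```

The pointer decodes to eight 'a' characters, which suggests that string data from the array was written over interpreter memory.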

A simple fix would be to explicitly set the buffer's dtype to that of the array:

In [6]: buf = np.empty_like(a[0], dtype=a.dtype)

In [7]: buf.dtype
Out[7]: dtype('<U100')

np.random.shuffle will allocate a buffer based on the size of the first element of an array of strings. If the first element is smaller than another element in the array, this buffer will overflow, causing a segfault when garbage is collected. Additionally, if the array contains objects, one would be left in the buffer and have its refcount erroneously decremented on function exit, causing that object to be deallocated too early. To fix this, we change the buffer to be an array of int8 of the size of the array's dtype, which sidesteps both issues. Fixes numpy#7710
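The byte-buffer idea described above can be sketched in pure Python. This is only an illustration under stated assumptions, not NumPy's actual implementation: the function name `shuffle_with_byte_buffer` is hypothetical, it uses `uint8` rather than `int8` for the raw view, it accepts only 1-D C-contiguous arrays without an object dtype, and it draws indices from the standard-library `random.Random`.

```python
import random

import numpy as np


def shuffle_with_byte_buffer(arr, rng):
    """In-place Fisher-Yates shuffle that swaps elements as raw bytes."""
    if arr.ndim != 1 or arr.dtype.hasobject or not arr.flags.c_contiguous:
        raise ValueError("expected a 1-D, C-contiguous, non-object array")
    n = arr.shape[0]
    itemsize = arr.dtype.itemsize
    # One row of raw bytes per element; this is a view, so writes reach arr.
    raw = arr.view(np.uint8).reshape(n, itemsize)
    # The swap buffer is sized from the array's dtype, not from arr[0].
    buf = np.empty(itemsize, dtype=np.uint8)
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        buf[:] = raw[j]
        raw[j] = raw[i]
        raw[i] = buf


a = np.array(['a', 'a' * 100, 'bb', 'ccc'])
shuffle_with_byte_buffer(a, random.Random(0))
```

Since the buffer holds opaque bytes of exactly one array slot, it cannot be too small for any element, and no Python object is ever stored in it.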

asanakoy changed the title ~~Segfault: numpy.random.permutation on list of long strings~~ numpy 1.11. Segfault: numpy.random.permutation on list of long strings Jun 6, 2016

charris added 00 - Bug component: numpy.random labels Jun 6, 2016

charris mentioned this issue Jun 7, 2016

Memory Leak #7714

Closed

simongibbons mentioned this issue Jun 9, 2016

BUG: Fix segfault in np.random.shuffle for arrays of different length strings #7719

Merged

charris added this to the 1.11.1 release milestone Jun 9, 2016

charris closed this as completed in #7719 Jun 10, 2016

charris mentioned this issue Jun 10, 2016

Backport 7719, BUG: Fix segfaults in np.random.shuffle #7724

Merged
