BUG: [IPC] Refcount corruption and crash in boolean memmap reads due to double evaluation in PyArrayScalar_RETURN_BOOL_FROM_LONG #30389

@y7070

Describe the issue:

I am using numpy.memmap on a local file (located in /dev/shm) as an inter-process communication mechanism.

In the example setup, I create a (400, 8000) memmapped array at /dev/shm/test.npy.
The workflow is:

  1. Run writer.py, which writes elements to the memmap one by one.
  2. Run reader.py, which reads elements sequentially, waiting for each slot to be written before proceeding to the next.

When running the reader process, it crashes with the following error:

Reading round 0
Reading round 1
free(): invalid pointer
[1] 11203 abort (core dumped)

Reproduce the code example:

# There are two Python scripts, writer.py and reader.py:
# writer.py
import time
import numpy as np

arr = np.memmap("/dev/shm/test.npy", dtype=np.bool_, mode="w+", shape=(400, 8000))

for i in range(arr.shape[0]):
    print(f"Writing round {i}")
    for j in range(arr.shape[1]):
        arr[i, j] = True
        time.sleep(0.0001)

# reader.py
import numpy as np

arr = np.memmap("/dev/shm/test.npy", dtype=np.bool_, mode="r", shape=(400, 8000))

for i in range(arr.shape[0]):
    print(f"Reading round {i}")
    for j in range(arr.shape[1]):
        while not arr[i, j]: pass

Error message:

Reading round 0
Reading round 1
Reading round 2
Reading round 3
Reading round 4
Reading round 5
Reading round 6
Reading round 7
Reading round 8
Reading round 9
free(): invalid pointer
--Type <RET> for more, q to quit, c to continue without paging--

Thread 1 "python" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7c5d859 in __GI_abort () at abort.c:79
#2  0x00007ffff7cc8266 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7df2298 "%s\n") at ../sysdeps/posix/libc_fatal.c:156
#3  0x00007ffff7cd02fc in malloc_printerr (str=str@entry=0x7ffff7df04c1 "free(): invalid pointer") at malloc.c:5347
#4  0x00007ffff7cd1b2c in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4173
#5  0x000055555572ff74 in _Py_DECREF () at /tmp/build/80754af9/python-split_1634043551344/work/Include/object.h:478
#6  _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at /tmp/build/80754af9/python-split_1634043551344/work/Python/ceval.c:3079
#7  0x00005555557201f0 in PyEval_EvalFrameEx (throwflag=0, f=0x7ffff72d5440) at /tmp/build/80754af9/python-split_1634043551344/work/Python/ceval.c:741
#8  _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=0x0, kwargs=0x0, 
    kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at /tmp/build/80754af9/python-split_1634043551344/work/Python/ceval.c:4298
#9  0x0000555555721aa3 in PyEval_EvalCodeEx (closure=0x0, kwdefs=0x0, defcount=0, defs=0x0, kwcount=0, kws=0x0, argcount=0, args=0x0, locals=<optimized out>, 
    globals=<optimized out>, _co=<optimized out>) at /tmp/build/80754af9/python-split_1634043551344/work/Python/ceval.c:4327
#10 PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at /tmp/build/80754af9/python-split_1634043551344/work/Python/ceval.c:718
#11 0x0000555555795382 in run_eval_code_obj (co=0x7ffff723b030, globals=0x7ffff7314e40, locals=0x7ffff7314e40)
    at /tmp/build/80754af9/python-split_1634043551344/work/Python/pythonrun.c:1166
#12 0x00005555557a6202 in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7ffff7314e40, locals=0x7ffff7314e40, flags=<optimized out>, arena=<optimized out>)
    at /tmp/build/80754af9/python-split_1634043551344/work/Python/pythonrun.c:1188
#13 0x00005555557a93ab in pyrun_file (fp=0x555555927d10, filename=0x7ffff71cdbf0, start=<optimized out>, globals=0x7ffff7314e40, locals=0x7ffff7314e40, closeit=1, 
    flags=0x7fffffffd8f8) at /tmp/build/80754af9/python-split_1634043551344/work/Python/pythonrun.c:1085
#14 0x00005555557a958f in pyrun_simple_file (flags=0x7fffffffd8f8, closeit=1, filename=0x7ffff71cdbf0, fp=0x555555927d10)
    at /tmp/build/80754af9/python-split_1634043551344/work/Python/pythonrun.c:439
#15 PyRun_SimpleFileExFlags (fp=0x555555927d10, filename=<optimized out>, closeit=1, flags=0x7fffffffd8f8)
    at /tmp/build/80754af9/python-split_1634043551344/work/Python/pythonrun.c:472
#16 0x00005555557a9a69 in pymain_run_file (cf=0x7fffffffd8f8, config=0x5555558f1720) at /tmp/build/80754af9/python-split_1634043551344/work/Modules/main.c:391
#17 pymain_run_python (exitcode=0x7fffffffd8f0) at /tmp/build/80754af9/python-split_1634043551344/work/Modules/main.c:616
#18 Py_RunMain () at /tmp/build/80754af9/python-split_1634043551344/work/Modules/main.c:695
#19 0x00005555557a9c69 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /tmp/build/80754af9/python-split_1634043551344/work/Modules/main.c:1127
#20 0x00007ffff7c5f083 in __libc_start_main (main=0x55555565c7d0 <main>, argc=2, argv=0x7fffffffdaf8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7fffffffdae8) at ../csu/libc-start.c:308
#21 0x000055555574b427 in _start ()

Python and NumPy Versions:

import sys, numpy; print(numpy.__version__); print(sys.version) shows:

1.24.4
3.8.12 (default, Oct 12 2021, 13:49:34) 
[GCC 7.5.0]

Runtime Environment:

import numpy; numpy.show_runtime() shows:

[{'simd_extensions': {'baseline': ['SSE', 'SSE2', 'SSE3'],
                      'found': ['SSSE3',
                                'SSE41',
                                'POPCNT',
                                'SSE42',
                                'AVX',
                                'F16C',
                                'FMA3',
                                'AVX2'],
                      'not_found': ['AVX512F',
                                    'AVX512CD',
                                    'AVX512_KNL',
                                    'AVX512_KNM',
                                    'AVX512_SKX',
                                    'AVX512_CLX',
                                    'AVX512_CNL',
                                    'AVX512_ICL']}},
 {'architecture': 'Prescott',
  'filepath': '/home/yang/miniconda3/lib/python3.8/site-packages/numpy.libs/libopenblas64_p-r0-15028c96.3.21.so',
  'internal_api': 'openblas',
  'num_threads': 24,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.21'}]

Context for the issue:

After reviewing NumPy’s source code, I believe I have identified the root cause of the crash.

Root Cause Analysis

When the reader process performs arr[i, j], NumPy executes PyArray_Scalar, implemented in
numpy/_core/src/multiarray/scalarapi.c (see line 506):

    if (type_num == NPY_BOOL) {
        PyArrayScalar_RETURN_BOOL_FROM_LONG(*(npy_bool*)data);
    }

Here, data points into the shared memmap region. Because the writer process may modify this region concurrently, two consecutive reads of the same address can return different values.

Expanding the macro reveals the issue.
PyArrayScalar_RETURN_BOOL_FROM_LONG is defined as:

#define PyArrayScalar_RETURN_BOOL_FROM_LONG(i)                  \
        return Py_INCREF(PyArrayScalar_FromLong(i)), \
                PyArrayScalar_FromLong(i)

And PyArrayScalar_FromLong returns one of NumPy’s two internal singleton objects for boolean scalars:

#define PyArrayScalar_FromLong(i) \
        ((PyObject *)(&(_PyArrayScalar_BoolValues[((i)!=0)])))

Thus, for a given boolean value, NumPy:

  1. Reads *(npy_bool*)data
  2. Maps it to one of the internal objects _PyArrayScalar_BoolValues[False] or [True]
  3. Increments the reference count on the selected object
  4. Reads *(npy_bool*)data again
  5. Returns (possibly) a different singleton object, depending on whether the value changed between steps (1) and (4)

Because the shared memory is being modified concurrently by the writer process, the following race can occur:

  • The first dereference sees False, so the macro increments the refcount of _PyArrayScalar_BoolValues[False].
  • Before the return expression runs, the writer changes the value to True.
  • The macro returns _PyArrayScalar_BoolValues[True] instead.
  • Later, Py_DECREF is called on the returned object (True), not the one whose refcount was incremented (False).

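Each time this race occurs, one reference is leaked on _PyArrayScalar_BoolValues[False] and one reference is wrongly removed from _PyArrayScalar_BoolValues[True]. After enough occurrences, the reference count of the True singleton reaches zero and the interpreter tries to deallocate an object that was never heap-allocated, leading to a crash such as: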

free(): invalid pointer
Aborted (core dumped)
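
The imbalance can be illustrated with a toy model in plain Python. This is only a sketch of the reasoning above, not NumPy code: the refcounts list, the starting counts, and the helper names are all hypothetical, and the "writer" is simulated by a reader callable that returns a different value on its second call.

```python
# Toy model of the double-evaluation race; this is NOT NumPy code.
# refcounts[0] and refcounts[1] stand in for the reference counts of the
# two boolean singletons (False and True). All names are hypothetical.


def return_bool_double_read(read, refcounts):
    """Mimic the macro: evaluate the argument once for the INCREF
    and a second time for the returned object."""
    refcounts[int(read() != 0)] += 1  # first evaluation: Py_INCREF
    return int(read() != 0)           # second evaluation: returned object


def make_flipping_reader(values):
    """Return a callable that yields successive values, like a memory
    slot that a concurrent writer changes between reads."""
    it = iter(values)
    return lambda: next(it)


refcounts = [1, 1]  # hypothetical starting refcounts for False and True

# The simulated writer flips the slot from 0 to 1 between the two reads.
returned = return_bool_double_read(make_flipping_reader([0, 1]), refcounts)

# The caller later releases the object it actually received.
refcounts[returned] -= 1

print(refcounts)  # [2, 0]: False gained a reference, True lost one
```

In this model a single occurrence already drives the True count to zero because the starting count is 1. In the real process the counts start higher, so the reader may run for a while before crashing.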

Proposed Solution

The core issue is that PyArrayScalar_RETURN_BOOL_FROM_LONG is a macro that expands its argument twice, so the expression *(npy_bool*)data is evaluated twice and the shared memory is read twice. Rewriting the macro as a function might be too invasive, but a minimal fix can be applied directly in scalarapi.c.

Specifically, the boolean value can be safely stored before passing it to the macro:

    if (type_num == NPY_BOOL) {
        npy_bool val = *(npy_bool*)data;
        PyArrayScalar_RETURN_BOOL_FROM_LONG(val);
    }
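
The effect of taking a snapshot first can be sketched with the same kind of toy model in plain Python. Again this is an illustration with hypothetical names and starting counts, not NumPy code: the value is read once, so the incremented and returned singleton always match even if the slot changes afterwards.

```python
# Toy model of the proposed fix; this is NOT NumPy code.
# refcounts[0] and refcounts[1] stand in for the reference counts of the
# two boolean singletons (False and True). All names are hypothetical.


def return_bool_single_read(read, refcounts):
    """Read the value once, then use the snapshot for both the INCREF
    and the returned object."""
    val = int(read() != 0)  # snapshot, like `npy_bool val = *(npy_bool*)data;`
    refcounts[val] += 1     # Py_INCREF on the selected singleton
    return val              # the same singleton is returned


# A slot that would read 0 first and 1 on any later read.
reads = iter([0, 1])
refcounts = [1, 1]  # hypothetical starting refcounts for False and True

returned = return_bool_single_read(lambda: next(reads), refcounts)

# The caller later releases the object it received.
refcounts[returned] -= 1

print(refcounts)  # [1, 1]: the counts stay balanced
```

The reader may still observe a stale value for one iteration, but that is harmless here because the polling loop simply reads the slot again.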

Remaining Question

np.bool_ is implemented using two internal singleton objects.
My remaining question is:

Is np.bool_ the only dtype whose scalar-returning path relies on internal singleton objects and therefore is vulnerable to this kind of race?

If other dtypes also use shared singleton scalar objects, similar issues may exist.
