Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Structured arrays do not clean up properly when freed, causing memory leaks #25949

Closed
chrisb-fico opened this issue Mar 6, 2024 · 7 comments · Fixed by #25950
Closed

Comments

@chrisb-fico
Copy link
Contributor

Describe the issue:

Given a structured array with a field of type object, when the array is freed, NumPy 1.24 and earlier would decref the objects in the array allowing them to be freed. Since 1.25 this no longer happens, so the objects are leaked. This was introduced in #23620 and persists in the latest release (1.26.4).

Summary of behaviour in 1.24:

  • PyArray_ClearBuffer is called when the array is freed
  • PyArray_GetClearFunction returns func=traverse_fields_function with auxdata set to a fields_clear_data struct with fields[0] pointing to clear_object_strided_loop
  • `clear_object_strided_loop decrefs every item in the array before clearing it

Summary of behaviour in 1.25.0:

  • PyArray_ClearBuffer is called when the array is freed
  • PyArray_GetClearFunction returns func=zerofill_fields_function with auxdata set to a fields_clear_data struct with fields[0] pointing to fill_zero_object_strided_loop
  • zerofill_fields_function memsets the array item to zero, then calls fill_zero_object_strided_loop, which copies a PyObject representing zero into the array and increfs it

That last point should makes it clear that this is a bug: the new item is increfed, but the previous item is zeroed out without first decrefing.

Note that my dtype implements fill and fillwithscalar: it would be natural to call these when the array is being zeroed, but they are not called.

Reproduce the code example:

# custom.Custom is an object which prints a message when it is freed. With NumPy 1.24.2 the message is printed twice, with 1.25.0 the message is printed only once.
import custom
import numpy as np
x = np.array([custom.Custom()])
del x
y = np.array([custom.Custom()], dtype=custom.custom_dtype)
del y

# C code with the implementation of the Custom object and the structured dtype is below.

#include <Python.h>
#include <numpy/arrayobject.h>

typedef struct {
    PyObject_HEAD
} CustomObject;

void custom_dealloc (PyObject *self) {
    printf("Deallocating Custom object\n");
    Py_TYPE (self) -> tp_free ((PyObject *) self);
}

static PyTypeObject CustomType = {
    .ob_base = PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "custom.Custom",
    .tp_doc = PyDoc_STR("Custom objects"),
    .tp_basicsize = sizeof(CustomObject),
    .tp_itemsize = 0,
    .tp_flags = Py_TPFLAGS_DEFAULT,
    .tp_new = PyType_GenericNew,
    .tp_dealloc = custom_dealloc,
};

static PyModuleDef custommodule = {
    .m_base = PyModuleDef_HEAD_INIT,
    .m_name = "custom",
    .m_doc = "Example module that creates an extension type.",
    .m_size = -1,
};

static PyObject *custom_getitem(void *data, void *arr);
static int custom_setitem(PyObject *item, void *data, void *arr);
static void custom_copyswapn(void *dest, npy_intp dstride, void *src, npy_intp sstride, npy_intp n, int swap, void *arr);
static void custom_copyswap (void *dest, void *src, int swap, void *arr);
static npy_bool custom_nonzero (void* data, void* arr);
static int custom_fill (void *data, npy_intp length, void *arr);
static int custom_fillwithscalar (void *buffer, npy_intp length, void *value, void *arr);

PyArray_ArrFuncs arrayFuncs;

PyMODINIT_FUNC
PyInit_custom(void)
{
    PyObject *retval = NULL;
    PyObject *module = NULL;
    PyArray_Descr *objDescr = NULL;
    PyArray_Descr *descrProto = NULL;
    PyArray_Descr *descr = NULL;
    PyObject *fields = NULL;
    int typeCode;

    import_array ();

    PyArray_InitArrFuncs(&arrayFuncs);
    arrayFuncs.getitem   = custom_getitem;
    arrayFuncs.setitem   = custom_setitem;
    arrayFuncs.copyswapn = custom_copyswapn;
    arrayFuncs.copyswap  = custom_copyswap;
    arrayFuncs.nonzero   = custom_nonzero;
    arrayFuncs.fill      = custom_fill;
    arrayFuncs.fillwithscalar = custom_fillwithscalar;

    objDescr = PyArray_DescrFromType (NPY_OBJECT);
    descrProto = PyArray_DescrNew(objDescr);
    fields = PyDict_New();
    if (PyDict_SetItemString (fields, "custom", Py_BuildValue ("(Oi)", (PyObject *) objDescr, 0))) {
        goto exit_cleanup;
    }

    descrProto->type    = 'C';
    descrProto->typeobj = &PyBaseObject_Type;
    descrProto->f       = &arrayFuncs;
    descrProto->fields  = fields;
    descrProto->names   = Py_BuildValue ("(s)", "custom");
    fields = NULL;

    typeCode = PyArray_RegisterDataType (descrProto);
    if (typeCode == -1) {
        goto exit_cleanup;
    }

    descr = PyArray_DescrFromType (typeCode);
    if (descr == NULL) {
        goto exit_cleanup;
    }

    module = PyModule_Create(&custommodule);
    if (module == NULL) {
        goto exit_cleanup;
    }

    if (PyModule_AddObjectRef(module, "custom_dtype", (PyObject *) descr) < 0) {
        goto exit_cleanup;
    }

    if (PyType_Ready(&CustomType) < 0) {
        goto exit_cleanup;
    }

    if (PyModule_AddObjectRef(module, "Custom", (PyObject *) &CustomType) < 0) {
        goto exit_cleanup;
    }

    if (PyType_Ready(&CustomType) < 0) {
        goto exit_cleanup;
    }

    retval = module;
    module = NULL;

exit_cleanup:
    Py_XDECREF(module);
    Py_XDECREF(objDescr);
    Py_XDECREF(descrProto);
    Py_XDECREF(descr);
    Py_XDECREF(fields);
    return retval;
}

static PyObject *custom_getitem(void *data, void *arr) {
    PyObject *ret = (PyObject *) (*(PyObject **) data);
    Py_XINCREF (ret);
    return ret;
}

static int custom_setitem(PyObject *item, void *data, void *arr) {
    PyObject *orig = *((PyObject**) data);
    *(PyObject **) data = item;
    Py_XINCREF(item);
    Py_XDECREF(orig);
    return 0;
}

static NPY_INLINE void bswap (PyObject **val) {
    char *bytes = (char *) val;

    size_t fwd = 0;
    size_t bwd = sizeof (*val) - 1;

    for (; fwd < bwd; ++fwd, --bwd) {
        const char tmp = bytes[fwd];
        bytes[fwd] = bytes[bwd];
        bytes[bwd] = tmp;
    }
}

static void custom_copyswapn(void *dest, npy_intp dstride, void *src, npy_intp sstride, npy_intp n, int swap, void *arr) {
    char *Dest = (char *) dest;
    char *Src  = (char *) src;

    npy_intp i;

    if (Src == NULL && !swap) {
        return;
    }

    for (i = 0; i < n; i++) {
        PyObject **d = (PyObject **) (Dest + dstride * i);

        if (Src != NULL) {
            PyObject **s = (PyObject **) (Src + sstride * i);
            PyObject *orig = *d;
            *d = *s;
            Py_XINCREF (*d);
            Py_XDECREF (orig);
        }

        if (swap) {
            bswap (d);
        }
    }
}

static void custom_copyswap (void *dest, void *src, int swap, void *arr) {
  PyObject **s = (PyObject **) src;
  PyObject **d = (PyObject **) dest;

  if (src != NULL) {
    PyObject *orig = *d;
    *d = *s;
    Py_XINCREF (*d);
    Py_XDECREF (orig);
  }

  if (swap) {
    bswap (d);
  }
}

static npy_bool custom_nonzero (void* data, void* arr) {
  return NPY_TRUE;
}

static int custom_fill (void *data, npy_intp length, void *arr) {
    printf("Called fill\n");
    return 0;
}

static int custom_fillwithscalar (void *buffer, npy_intp length, void *value, void *arr) {
    printf("Called fillwithscalar\n");
    return 0;
}

Error message:

No response

Python and NumPy Versions:

1.26.4
3.10.11 | packaged by conda-forge | (main, May 10 2023, 19:01:19) [Clang 14.0.6 ]

Runtime Environment:

[{'numpy_version': '1.26.4',
  'python': '3.10.11 | packaged by conda-forge | (main, May 10 2023, 19:01:19) '
            '[Clang 14.0.6 ]',
  'uname': uname_result(system='Darwin', node='HN49DX6RGC', release='23.2.0', version='Darwin Kernel Version 23.2.0: Wed Nov 15 21:55:06 PST 2023; root:xnu-10002.61.3~2/RELEASE_ARM64_T6020', machine='arm64')},
 {'simd_extensions': {'baseline': ['NEON', 'NEON_FP16', 'NEON_VFPV4', 'ASIMD'],
                      'found': ['ASIMDHP'],
                      'not_found': ['ASIMDFHM']}},
 {'architecture': 'armv8',
  'filepath': '/Users/cbrown/mambaforge/envs/xpress-test2/lib/python3.10/site-packages/numpy/.dylibs/libopenblas64_.0.dylib',
  'internal_api': 'openblas',
  'num_threads': 12,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.23.dev'}]

Context for the issue:

We use structured dtypes in our commercial Python extension: https://pypi.org/project/xpress. The memory leaks are causing issues in customer environments.

@mattip mattip added this to the 2.0.0 release milestone Mar 6, 2024
seberg added a commit to seberg/numpy that referenced this issue Mar 6, 2024
This was just a typo, add a test for the regression (checked that it
triggered it).

Closes numpygh-25949
seberg added a commit to seberg/numpy that referenced this issue Mar 6, 2024
This was just a typo, add a test for the regression (checked that it
triggered it).

Closes numpygh-25949
@seberg
Copy link
Member

seberg commented Mar 6, 2024

Thanks for finding this and the (very!) detailed report. The annoying thing is it's just a simple typo, but not sure we will have another 1.26 release to backport it :(.

A fix is in gh-25950.

@chrisb-fico
Copy link
Contributor Author

Thank you for the amazingly fast fix! As long as the fix appears in some 1.x release then I am happy. We can just tell our customers to upgrade their NumPy.

@charris
Copy link
Member

charris commented Mar 6, 2024

Looks like another 1.26 release, which might be a pain at this point :)

@seberg
Copy link
Member

seberg commented Mar 6, 2024

I am almost wondering if I should create "this is utterly terrible but works" hack to reach into the NumPy internals and fix things up for you? It seems like it might actually be the bettre solution here.

@chrisb-fico
Copy link
Contributor Author

If a hack is the only way then I am ok with that. The downside is that I will need to backport it to several supported releases of our extension...

@seberg
Copy link
Member

seberg commented Mar 6, 2024

Here is the hack you can add. I guess the question is what's easier NumPy or your release, and at this point I am not sure:

typedef struct {
    PyHeapTypeObject super;
    PyArray_Descr *singleton;
    int type_num;
    PyTypeObject *scalar_type;
    npy_uint64 flags;
    void *dt_slots;
} PyArray_DTypeMeta_actual;  /* full struct was not exposed */


static inline void fix_user_dtype_slots(PyArray_Descr *descr) {
    int version = PyArray_GetNDArrayCFeatureVersion();
    if (version != 0x00000011) {
        /* Not NumPy version 1.25 or 1.26, so nothing to do. */
        return;
    }
    /* 
     * Note: The below code reaches into NumPy ABI to hardcode things,
     *       it only works on 1.25 and 1.26.
     */
    void **my_slots = ((PyArray_DTypeMeta_actual *)Py_TYPE(descr))->dt_slots;

    PyArray_Descr *void_dt = PyArray_DescrFromType(NPY_VOID);
    void **void_slots = ((PyArray_DTypeMeta_actual *)Py_TYPE(void_dt))->dt_slots;
    Py_DECREF(void_dt);

    /* Copy slots 8 and 9 from void (zerofill and clear loop) */
    my_slots[8] = void_slots[8];
    my_slots[9] = void_slots[9];
}

EDIT: The function needs to be called after registration on the actual dtype instance of course.

@chrisb-fico
Copy link
Contributor Author

Thanks for that, it has resolved the issue

charris pushed a commit to charris/numpy that referenced this issue Mar 14, 2024
This was just a typo, add a test for the regression (checked that it
triggered it).

Closes numpygh-25949
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants