array.tolist() speed enhancement (Trac #1779) #2372

@numpy-gitbot

Description

Original ticket http://projects.scipy.org/numpy/ticket/1779 on 2011-03-23 by trac user Han, assigned to unknown.

Hi,

For a while, a small issue has been bugging me, where array.tolist() takes a huge amount of time, compared to the speed of Python.

To illustrate, here are some timings (on Windows XP):

>>> timeit.timeit('a.tolist()', 'from numpy import arange; a = arange(1e5)', number=500)
19.07303038231646

The conversion of 500 x 100000 elements takes nearly 20(!) seconds.

>>> timeit.timeit('a = arange(1e5)', 'from numpy import arange', number=500)
0.4194663546483639

While creating a NumPy array with the same number of items takes about 0.5 seconds.

>>> timeit.timeit('a = range(int(1e5))', number=500)
0.97656364422391562

And creating a Python list takes no more than 1 second, which is still 20x faster than array.tolist(). So where does this discrepancy come from?
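For reference, the three timings above can be reproduced with a small script (a sketch only; absolute numbers will differ by machine, and the repetition count is lowered here to keep the run short):

```python
import timeit

# Reproduce the three comparisons: tolist(), array creation, list creation.
# number is reduced from 500 to 50 so the script finishes quickly.
n = 50
t_tolist = timeit.timeit("a.tolist()",
                         "from numpy import arange; a = arange(1e5)",
                         number=n)
t_arange = timeit.timeit("a = arange(1e5)",
                         "from numpy import arange",
                         number=n)
t_range = timeit.timeit("a = list(range(int(1e5)))", number=n)
print(t_tolist, t_arange, t_range)
```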

To find out, I did some runs with valgrind on NumPy 1.4.1 dbg (Debian), and used kcachegrind to produce a few calling graphs.

The first thing I noticed was the number of array_alloc and subsequent array_dealloc calls: it equals the number of elements in the array! PyArray_NewFromDescr is called once per array element, which creates a lot of overhead.

In NumPy 1.6.1b1, this overhead still exists; it stems from the PyArray_ToList function in convert.c:

NPY_NO_EXPORT PyObject *
PyArray_ToList(PyArrayObject *self)
{
    PyObject *lp;
    PyArrayObject *v;
    intp sz, i;

    if (!PyArray_Check(self)) {
        return (PyObject *)self;
    }
    if (self->nd == 0) {
        /* 0-d array: return the element as a Python scalar */
        return self->descr->f->getitem(self->data,self);
    }

    sz = self->dimensions[0];
    lp = PyList_New(sz);
    for (i = 0; i < sz; i++) {
        /* allocates a temporary sub-array for every element */
        v = (PyArrayObject *)array_big_item(self, i);
        if (PyArray_Check(v) && (v->nd >= self->nd)) {
            PyErr_SetString(PyExc_RuntimeError,
                            "array_item not returning smaller-" \
                            "dimensional array");
            Py_DECREF(v);
            Py_DECREF(lp);
            return NULL;
        }
        /* recurse on the sub-array, then discard it */
        PyList_SetItem(lp, i, PyArray_ToList(v));
        Py_DECREF(v);
    }
    return lp;
}

For every element in the array, an array_big_item call creates a new array holding just that element, which is then passed recursively to PyArray_ToList to extract the actual Python item.
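That recursion can be modeled in pure Python (the function name is mine, not NumPy's). Indexing a[i] allocates a fresh sub-array or scalar for every element, which is what the per-element PyArray_NewFromDescr calls in the valgrind graphs correspond to:

```python
import numpy as np

def to_list_model(a):
    """Illustrative pure-Python model of PyArray_ToList's recursion."""
    if a.ndim == 0:
        # the getitem branch: return the element as a Python scalar
        return a.item()
    # a[i] creates a new (smaller-dimensional) array per element,
    # which is then handed back to the recursion
    return [to_list_model(a[i]) for i in range(a.shape[0])]
```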

I added an extra clause to the function to account for 1-dimensional arrays:

    if (self->nd == 1) {
        /* 1-d fast path: fetch each element as a Python scalar directly,
           without allocating intermediate arrays */
        sz = self->dimensions[0];
        lp = PyList_New(sz);
        for (i = 0; i < sz; i++) {
            PyList_SetItem(lp, i, self->descr->f->getitem(index2ptr(self, i),self));
        }
        return lp;
    }

This gets the time down to 2 seconds on Windows.
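The effect of the fast path can be sketched in Python as well: ndarray.item(i) pulls element i straight out of the buffer as a Python scalar, analogous to calling descr->f->getitem on the raw data, so no intermediate arrays are allocated (illustrative only, not the actual C patch):

```python
import numpy as np

def tolist_1d(a):
    """Pure-Python sketch of the 1-d fast path added in the patch."""
    assert a.ndim == 1
    # a.item(i) returns element i as a plain Python scalar,
    # skipping the per-element sub-array allocation entirely
    return [a.item(i) for i in range(a.shape[0])]
```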

I'm not sure about the patch, though, because it does not handle errors and could probably be optimized and streamlined further.

Anyway, I hope it can go into 1.6.0 in some way or another, because it really helps NumPy-to-Python conversions!

[attached: calling graphs]
