Description
The Python C API has an efficient API to access tuple/list items:
seq = PySequence_Fast(obj, "error message");
size = PySequence_Fast_GET_SIZE(seq);
item = PySequence_Fast_GET_ITEM(seq, index);
items = PySequence_Fast_ITEMS(seq); -- then you can use items[0], items[1], ...
Py_DECREF(seq); -- release the "view" on the tuple/list
Problem: if obj is not a tuple or a list, the function is inefficient because it creates a temporary list. It is also not possible to implement an "object array view" protocol in 3rd party C extension types.
The other problem is that the &PyTuple_GET_ITEM(tuple, 0) and &PyList_GET_ITEM(list, 0) idioms, used to get direct access to an object array, give no clear control over how long the array remains valid. The returned pointer becomes a dangling pointer if the tuple/list is destroyed in the meantime.
I propose designing a new more generic API:
PyAPI_FUNC(int) PySequence_AsObjectArray(
PyObject *,
PyResource *res,
PyObject ***parray,
Py_ssize_t *psize);
The API gives a PyObject** array and its Py_ssize_t size, and relies on a new PyResource API to "release the resource" (the view on this sequence).
The PyResource API is proposed separately: see issue #106592.
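To make the "release the resource" idea concrete, here is a minimal standalone sketch of how such a handle could work when the view is backed by reference counting. Everything in it is a hypothetical stand-in and not the proposed API: DemoResource plays the role of PyResource, DemoSeq is a toy reference-counted sequence of strings, and the close_func/data layout is only an assumption about the handle's shape.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyResource: a callback plus its argument. */
typedef struct {
    void (*close_func)(void *data);
    void *data;
} DemoResource;

/* Run the release callback once, then disarm the handle. */
static void demo_resource_release(DemoResource *res)
{
    if (res->close_func != NULL) {
        res->close_func(res->data);
        res->close_func = NULL;
        res->data = NULL;
    }
}

/* Toy reference-counted sequence standing in for a tuple or a list. */
typedef struct {
    int refcnt;
    size_t size;
    const char **items;
} DemoSeq;

/* Release callback: drop the reference owned by the view. */
static void demo_seq_decref(void *data)
{
    DemoSeq *seq = data;
    seq->refcnt--;
}

/* Expose the item array and keep the sequence alive until release. */
static int demo_seq_as_array(DemoSeq *seq, DemoResource *res,
                             const char ***parray, size_t *psize)
{
    seq->refcnt++;                      /* the view owns one reference */
    res->close_func = demo_seq_decref;
    res->data = seq;
    *parray = seq->items;
    *psize = seq->size;
    return 0;
}

/* Usage: returns the reference count left after the view is released. */
int demo_usage(void)
{
    const char *storage[] = {"key", "value"};
    DemoSeq seq = {1, 2, storage};
    DemoResource res;
    const char **items;
    size_t nitem;

    if (demo_seq_as_array(&seq, &res, &items, &nitem) < 0) {
        return -1;
    }
    assert(nitem == 2 && seq.refcnt == 2);  /* view holds a reference */
    demo_resource_release(&res);
    return seq.refcnt;
}
```

In this sketch the caller never touches the reference count directly: it only calls the release function, so the provider is free to change how the view is kept alive.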
Example of usage of PySequence_AsObjectArray():
int func(PyObject *seq)
{
    PyResource res;
    PyObject **items;
    Py_ssize_t nitem;

    if (PySequence_AsObjectArray(seq, &res, &items, &nitem) < 0) {
        if (PyErr_ExceptionMatches(PyExc_TypeError)) {
            PyErr_SetString(PyExc_TypeError, "items() returned non-iterable");
        }
        return -1;
    }
    if (nitem != 2) {
        PyErr_SetString(PyExc_TypeError,
                        "items() returned item which size is not 2");
        PyResource_Release(&res);
        return -1;
    }

    // items[] holds borrowed references, and the sequence may be cleared
    // while accessing __abstractmethod__, so keep strong references
    // to key and value.
    PyObject *key = Py_NewRef(items[0]);
    PyObject *value = Py_NewRef(items[1]);
    PyResource_Release(&res);

    // ... use key and value ...

    Py_DECREF(key);
    Py_DECREF(value);
    return 0;
}
This design is more generic: later, we can add a protocol to let a custom type implement its own "object array view" and implement the "release function" with arbitrary code, so it doesn't have to rely on PyObject reference counting. For example, a view can be a memory block allocated on the heap, and the release function would just free that memory.
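The heap-allocated case can be sketched in standalone C as follows. This is only an illustration under stated assumptions: DemoResource is a hypothetical stand-in for PyResource, DemoRange is a made-up custom type that stores no item array, and plain long values stand in for PyObject* items. The provider materializes the items into a malloc'ed block and registers free() as the release function.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for PyResource: a callback plus its argument. */
typedef struct {
    void (*close_func)(void *data);
    void *data;
} DemoResource;

/* Run the release callback once, then disarm the handle. */
static void demo_resource_release(DemoResource *res)
{
    if (res->close_func != NULL) {
        res->close_func(res->data);
        res->close_func = NULL;
        res->data = NULL;
    }
}

/* Toy custom type: an arithmetic range that stores no item array. */
typedef struct {
    long start;
    long step;
    size_t count;
} DemoRange;

/* Build the array view on the heap; free() is the release function. */
static int demo_range_as_array(const DemoRange *r, DemoResource *res,
                               long **parray, size_t *psize)
{
    /* Allocate at least one element so an empty range still succeeds. */
    long *block = malloc((r->count ? r->count : 1) * sizeof(long));
    if (block == NULL) {
        return -1;
    }
    for (size_t i = 0; i < r->count; i++) {
        block[i] = r->start + (long)i * r->step;
    }
    res->close_func = free;     /* no reference counting involved */
    res->data = block;
    *parray = block;
    *psize = r->count;
    return 0;
}

/* Consumer: sums the items through the view, then releases it. */
long demo_range_sum(const DemoRange *r)
{
    DemoResource res;
    long *items;
    size_t nitem;
    long total = 0;

    if (demo_range_as_array(r, &res, &items, &nitem) < 0) {
        return -1;
    }
    for (size_t i = 0; i < nitem; i++) {
        total += items[i];
    }
    demo_resource_release(&res);    /* frees the heap block */
    return total;
}
```

The consumer code is the same whether the view is kept alive by a reference or by a heap block, which is the point of routing the cleanup through a release function.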
Providing such a protocol is out of the scope of this issue. Maybe we can reuse the Py_buffer protocol for that.