Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C API: Add internal C API to build a tuple: _PyTupleBuilder #107137

Closed
vstinner opened this issue Jul 23, 2023 · 10 comments
Closed

C API: Add internal C API to build a tuple: _PyTupleBuilder #107137

vstinner opened this issue Jul 23, 2023 · 10 comments
Labels
topic-C-API type-feature A feature request or enhancement

Comments

@vstinner
Copy link
Member

vstinner commented Jul 23, 2023

The current Python C API PyTuple_New() to build a tuple has multiple flaws:

  • The Python tuple object is immediately tracked by the GC, whereas its items are initialized to NULL (inconsistent tuple)
  • The PyTuple_SET_ITEM() function must only used on a tuple while is being created

I propose adding a new internal C API to build a tuple: _PyTupleBuilder.

  • User of the API doesn't access the tuple directly but uses higher API
  • Over-allocate the tuple (by 25%) to reduce the number of _PyTuple_Resize() calls
  • _PyTupleBuild_Append() grows the tuple if needed
  • _PyTupleBuild_AppendUnsafe() is a thin wrapper to PyTuple_SET_ITEM()
  • Only track the tuple once its fully initialized: correct size, all items are set
  • Less error-prone API

Later, once we will consider this API stable, we can add it to the public C API and deprecate the old API to create an incomplete tuple:

  • PyTuple_New()
  • PyTuple_SetItem()
  • PyTuple_SET_ITEM()
  • _PyTuple_Resize()

See capi-workgroup/problems#56 for details on problems caused by API which let creating incomplete/inconsistent objects.

Linked PRs

@vstinner vstinner added type-feature A feature request or enhancement topic-C-API labels Jul 23, 2023
vstinner added a commit to vstinner/cpython that referenced this issue Jul 23, 2023
Add _PyTupleBuilder structure and functions:

* _PyTupleBuilder_Init()
* _PyTupleBuilder_Alloc()
* _PyTupleBuilder_Append()
* _PyTupleBuilder_AppendUnsafe()
* _PyTupleBuilder_Finish()
* _PyTupleBuilder_Dealloc()

The builder tracks the size of the tuple and resize it in
_PyTupleBuilder_Finish() if needed. Don't allocate empty tuple.

_PyTupleBuilder_Append() overallocates the tuple by 25% to reduce the
number of _PyTuple_Resize() calls.
vstinner added a commit to vstinner/cpython that referenced this issue Jul 23, 2023
Add _PyTupleBuilder structure and functions:

* _PyTupleBuilder_Init()
* _PyTupleBuilder_Alloc()
* _PyTupleBuilder_Append()
* _PyTupleBuilder_AppendUnsafe()
* _PyTupleBuilder_Finish()
* _PyTupleBuilder_Dealloc()

The builder tracks the size of the tuple and resize it in
_PyTupleBuilder_Finish() if needed. Don't allocate empty tuple.

_PyTupleBuilder_Append() overallocates the tuple by 25% to reduce the
number of _PyTuple_Resize() calls.

Use _PyTupleBuilder API in itertools batched_traverse(),
PySequence_Tuple() and initialize_structseq_dict().

Add also helper functions:

* _PyTuple_ResizeNoTrack()
* _PyTuple_NewNoTrack()
vstinner added a commit to vstinner/cpython that referenced this issue Jul 23, 2023
Add _PyTupleBuilder structure and functions:

* _PyTupleBuilder_Init()
* _PyTupleBuilder_Alloc()
* _PyTupleBuilder_Append()
* _PyTupleBuilder_AppendUnsafe()
* _PyTupleBuilder_Finish()
* _PyTupleBuilder_Dealloc()

The builder tracks the size of the tuple and resize it in
_PyTupleBuilder_Finish() if needed. Don't allocate empty tuple.
Allocate an array of 16 objects on the stack to avoid allocating
small tuple. _PyTupleBuilder_Append() overallocates the tuple by 25%
to reduce the number of _PyTuple_Resize() calls.

Do no track the temporary internal tuple by the GC before
_PyTupleBuilder_Finish() creates the final complete and consistent
tuple object.

Use _PyTupleBuilder API in itertools batched_traverse(),
PySequence_Tuple() and initialize_structseq_dict().

Add also helper functions:

* _PyTuple_ResizeNoTrack()
* _PyTuple_NewNoTrack()
@vstinner
Copy link
Member Author

This API is for creating tuples when the final size is not known in advance. For known size, use existing APIs like PyTuple_Pack() or Py_BuildValue().

@vstinner
Copy link
Member Author

This API is a fix for this old issue: #59313 "Incomplete tuple created by PyTuple_New() and accessed via the GC can trigged a crash" which was closed as "not a bug" in 2021 by @pablogsal.

vstinner added a commit to vstinner/cpython that referenced this issue Jul 23, 2023
Add _PyTupleBuilder structure and functions:

* _PyTupleBuilder_Init()
* _PyTupleBuilder_Alloc()
* _PyTupleBuilder_Append()
* _PyTupleBuilder_AppendUnsafe()
* _PyTupleBuilder_Finish()
* _PyTupleBuilder_Dealloc()

The builder tracks the size of the tuple and resize it in
_PyTupleBuilder_Finish() if needed. Don't allocate empty tuple.
Allocate an array of 16 objects on the stack to avoid allocating
small tuple. _PyTupleBuilder_Append() overallocates the tuple by 25%
to reduce the number of _PyTuple_Resize() calls.

Do no track the temporary internal tuple by the GC before
_PyTupleBuilder_Finish() creates the final complete and consistent
tuple object.

Use _PyTupleBuilder API in itertools batched_traverse(),
PySequence_Tuple() and initialize_structseq_dict().

Add also helper functions:

* _PyTuple_ResizeNoTrack()
* _PyTuple_NewNoTrack()
vstinner added a commit to vstinner/cpython that referenced this issue Jul 24, 2023
Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to
the internal C API: similar to PyTuple_New() and _PyTuple_Resize(),
but don't track the tuple by the GC.

Modify PySequence_Tuple() to use these functions. It prevents leaking
the tuple which is being initialized in the GC, via functions like
gc.get_referrers(). Previously, accessing to such partially
initialized tuple could crash Python.

Now PySequence_Tuple() only tracks the tuple once it's fully
initialized.
vstinner added a commit to vstinner/cpython that referenced this issue Jul 24, 2023
Fix a crash in PySequence_Tuple() when the tuple which is being
initialized was accessed via GC introspection, like the
gc.get_referrers() function. Now the function only tracks the tuple
by the GC once it's fully initialized.

Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to
the internal C API: similar to PyTuple_New() and _PyTuple_Resize(),
but don't track the tuple by the GC.

Modify PySequence_Tuple() to use these functions.
vstinner added a commit to vstinner/cpython that referenced this issue Jul 24, 2023
Fix a crash in PySequence_Tuple() when the tuple which is being
initialized was accessed via GC introspection, like the
gc.get_referrers() function. Now the function only tracks the tuple
by the GC once it's fully initialized.

Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to
the internal C API: similar to PyTuple_New() and _PyTuple_Resize(),
but don't track the tuple by the GC.

Modify PySequence_Tuple() to use these functions.
vstinner added a commit to vstinner/cpython that referenced this issue Jul 24, 2023
Fix a crash in PySequence_Tuple() when the tuple which is being
initialized was accessed via GC introspection, like the
gc.get_referrers() function. Now the function only tracks the tuple
by the GC once it's fully initialized.

Add _PyTuple_NewNoTrack() and _PyTuple_ResizeNoTrack() functions to
the internal C API: similar to PyTuple_New() and _PyTuple_Resize(),
but don't track the tuple by the GC.

Modify PySequence_Tuple() to use these functions.
@markshannon
Copy link
Member

What's the benefit of this over building a list then calling PyList_AsTuple()?

@vstinner
Copy link
Member Author

You avoid creating a temporary list and then destroy it. It may be faster.

@markshannon
Copy link
Member

How is creating a temporary tuple builder better than creating a temporary list?

If building a list is slow, we should fix that. It would be far more beneficial than adding a new API.

@vstinner
Copy link
Member Author

How is creating a temporary tuple builder better than creating a temporary list?

Code using PyList_AsTuple() requires to creates a temporary list object and then create the final tuple object: two objects.

My proposed API only creates a single Python object: less memory allocations, no conversion of list items to tuple items (copy + INCREF).

If building a list is slow, we should fix that. It would be far more beneficial than adding a new API.

How do you plan to optimize list creation?

@erlend-aasland
Copy link
Contributor

This API is a fix for this old issue: #59313 "Incomplete tuple created by PyTuple_New() and accessed via the GC can trigged a crash" which was closed as "not a bug" in 2021 by @pablogsal.

FTR, we can fix the sqlite3 issue by using the internal _PyTuple_FromArray API. I've been discouraged from using internal API in the sqlite3 module earlier, but lately, a lot of private API snuck into that extension module, so perhaps we can reconsider _PyTuple_FromArray for that particular crasher.

@erlend-aasland
Copy link
Contributor

Hm, PyList_AsTuple is just a thin wrapper for _PyTuple_FromArray. @markshannon's proposal of building a list should be performant enough.

@erlend-aasland
Copy link
Contributor

Here's a quick and non-scientific benchmark:

$ ./python.exe -m timeit "buildtuple(10)"
2000000 loops, best of 5: 140 nsec per loop
$ ./python.exe -m timeit "buildtuple(100)"
500000 loops, best of 5: 864 nsec per loop
$ ./python.exe -m timeit "buildtuple(1000)"
5000 loops, best of 5: 57.8 usec per loop
$ ./python.exe -m timeit "buildtuple(10000)"
500 loops, best of 5: 715 usec per loop

$ ./python.exe -m timeit "buildlist(10)"
1000000 loops, best of 5: 246 nsec per loop
$ ./python.exe -m timeit "buildlist(100)"
200000 loops, best of 5: 1.16 usec per loop
$ ./python.exe -m timeit "buildlist(1000)"
5000 loops, best of 5: 59.6 usec per loop
$ ./python.exe -m timeit "buildlist(10000)"
500 loops, best of 5: 764 usec per loop
Code
diff --git a/Python/bltinmodule.c b/Python/bltinmodule.c
index 787f53fae3..638556c8e5 100644
--- a/Python/bltinmodule.c
+++ b/Python/bltinmodule.c
@@ -2745,6 +2745,66 @@ builtin_issubclass_impl(PyObject *module, PyObject *cls,
     return PyBool_FromLong(retval);
 }
 
+
+/*[clinic input]
+buildtuple as builtin_buildtuple
+    n: int
+    /
+[clinic start generated code]*/
+
+static PyObject *
+builtin_buildtuple_impl(PyObject *module, int n)
+/*[clinic end generated code: output=eaed9c56019027a6 input=1c9950e2c501d16a]*/
+{
+    PyObject *tuple = PyTuple_New(n);
+    if (tuple == NULL) {
+        goto error;
+    }
+    for (int i = 0; i < n; i++) {
+        PyObject *item = PyLong_FromLong(i);
+        if (item == NULL) {
+            goto error;
+        }
+        PyTuple_SET_ITEM(tuple, i, item);
+    }
+    return tuple;
+
+error:
+    Py_XDECREF(tuple);
+    return NULL;
+}
+
+/*[clinic input]
+buildlist as builtin_buildlist = buildtuple
+[clinic start generated code]*/
+
+static PyObject *
+builtin_buildlist_impl(PyObject *module, int n)
+/*[clinic end generated code: output=227ae994f7278b23 input=f8333bb7e2c7f2ce]*/
+{
+    PyObject *list = PyList_New(n);
+    if (list == NULL) {
+        goto error;
+    }
+    for (int i = 0; i < n; i++) {
+        PyObject *item = PyLong_FromLong(i);
+        if (item == NULL) {
+            goto error;
+        }
+        PyList_SET_ITEM(list, i, item);
+    }
+    PyObject *tuple = PyList_AsTuple(list);
+    if (tuple == NULL) {
+        goto error;
+    }
+    Py_DECREF(list);
+    return tuple;
+
+error:
+    Py_XDECREF(list);
+    return NULL;
+}
+
 typedef struct {
     PyObject_HEAD
     Py_ssize_t tuplesize;
@@ -3060,6 +3120,8 @@ static PyMethodDef builtin_methods[] = {
     BUILTIN_SETATTR_METHODDEF
     BUILTIN_SORTED_METHODDEF
     BUILTIN_SUM_METHODDEF
+    BUILTIN_BUILDTUPLE_METHODDEF
+    BUILTIN_BUILDLIST_METHODDEF
     {"vars",            builtin_vars,       METH_VARARGS, vars_doc},
     {NULL,              NULL},
 };
diff --git a/Python/clinic/bltinmodule.c.h b/Python/clinic/bltinmodule.c.h
index 7540de6828..9da527ed3a 100644
--- a/Python/clinic/bltinmodule.c.h
+++ b/Python/clinic/bltinmodule.c.h
@@ -1212,4 +1212,58 @@ builtin_issubclass(PyObject *module, PyObject *const *args, Py_ssize_t nargs)
 exit:
     return return_value;
 }
-/*[clinic end generated code: output=daeee81b018824f4 input=a9049054013a1b77]*/
+
+PyDoc_STRVAR(builtin_buildtuple__doc__,
+"buildtuple($module, n, /)\n"
+"--\n"
+"\n");
+
+#define BUILTIN_BUILDTUPLE_METHODDEF    \
+    {"buildtuple", (PyCFunction)builtin_buildtuple, METH_O, builtin_buildtuple__doc__},
+
+static PyObject *
+builtin_buildtuple_impl(PyObject *module, int n);
+
+static PyObject *
+builtin_buildtuple(PyObject *module, PyObject *arg)
+{
+    PyObject *return_value = NULL;
+    int n;
+
+    n = _PyLong_AsInt(arg);
+    if (n == -1 && PyErr_Occurred()) {
+        goto exit;
+    }
+    return_value = builtin_buildtuple_impl(module, n);
+
+exit:
+    return return_value;
+}
+
+PyDoc_STRVAR(builtin_buildlist__doc__,
+"buildlist($module, n, /)\n"
+"--\n"
+"\n");
+
+#define BUILTIN_BUILDLIST_METHODDEF    \
+    {"buildlist", (PyCFunction)builtin_buildlist, METH_O, builtin_buildlist__doc__},
+
+static PyObject *
+builtin_buildlist_impl(PyObject *module, int n);
+
+static PyObject *
+builtin_buildlist(PyObject *module, PyObject *arg)
+{
+    PyObject *return_value = NULL;
+    int n;
+
+    n = _PyLong_AsInt(arg);
+    if (n == -1 && PyErr_Occurred()) {
+        goto exit;
+    }
+    return_value = builtin_buildlist_impl(module, n);
+
+exit:
+    return return_value;
+}
+/*[clinic end generated code: output=a3a6068cbca21e95 input=a9049054013a1b77]*/

@vstinner
Copy link
Member Author

I close the issue, see: #107139 (comment)

@erlend-aasland erlend-aasland closed this as not planned Won't fix, can't repro, duplicate, stale Aug 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-C-API type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants