Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disallow mutating immutable objects #20

Open
encukou opened this issue Oct 10, 2023 · 10 comments
Open

Disallow mutating immutable objects #20

encukou opened this issue Oct 10, 2023 · 10 comments
Labels
guideline To be included in guidelines PEP

Comments

@encukou
Copy link

encukou commented Oct 10, 2023

From Mark's proposal

New API should not mutate immutable objects. The biggest offenders are currently:

  • tuples (for those we need better creation options -- from array&size, from list)
  • strings/bytes (for these we might need a string builder API – a list-of-strings has overhead over a list of char*)
  • type objects (for these we might want a "set IMMUTABLE flag" function to be called after the type is set up)
@gvanrossum
Copy link
Contributor

I still think this is setting the bar too high for tuple creation. PyTuple_New(int n) is fine, users of the GC "all objects" API should beware (that API is unsafe).

@zooba
Copy link

zooba commented Oct 10, 2023

Maybe we can make "this object is now eligible for GC" a more explicit step? We likely don't need a formal pattern everywhere, but it would be nice if native code could assume that it holds the only reference to a new object up until it returns or shares it.

@encukou
Copy link
Author

encukou commented Oct 11, 2023

In the guidelines for future API, I'd like to say that the PyTuple_New(int n) shouldn't get in today. It's not a precedent to follow.

@gvanrossum
Copy link
Contributor

In the guidelines for future API, I'd like to say that the PyTuple_New(int n) shouldn't get in today. It's not a precedent to follow.

Then you should also show how you would design a tuple builder today. It should have all the features of the existing tuple construction API (including resizing). Hopefully it would be more ergonomic than the current approach.

@encukou
Copy link
Author

encukou commented Oct 11, 2023

Probably like this:

  • PyObject **PyTuple_Prealloc(ssize_t n) preallocates necessary space and returns a pointer to the n-sized array of NULL, which the user should fill with new references.
  • PyTupleObject *PyTuple_FromPrealloc_Consume(PyObject **pre) fills in the tuple header & returns a pointer to the new tuple, and registers with GC if there are any GC objects. Asserts that no NULLs are left.
    Conceptually, the passed-in array (and references in it) is consumed/stolen.
  • PyObject **PyTuple_Prealloc_Resize(PyObject **old_pre, ssize_t n) resizes the space.
  • int PyTuple_Prealloc_Free(PyObject **pre) decrefs non-NULL contents and frees the space.

(In today's API we'd use PyObject* rather than PyTupleObject*.)

There is an explicit "this object is now eligible for GC" step, which creates a PyTuple.

@gvanrossum
Copy link
Contributor

What API should be used to actually set item i to value x? It can't be PyTuple_SetItem(pre, i, x), can it? Because the prealloc'ed thing isn't a tuple yet (from the description you give for the FromPrealloc API it seems this way).

There are two exit paths, mutually exclusive, right? You end with either FromPrealloc or Prealloc_Free.

The naming could use some work. This pattern might be reused widely if we get it right.

@encukou
Copy link
Author

encukou commented Oct 11, 2023

It'd be a PyObject * array. Sorry, I left out the extra star :(
So, pre[i] = Py_NewRef(x)

@encukou encukou added the guideline To be included in guidelines PEP label Oct 11, 2023
@pitrou
Copy link

pitrou commented Oct 14, 2023

In most cases PyTuple_Pack or Py_BuildValue are sufficient, no?

It's only in the rare cases that you are building tuples with runtime-varying sizes that you need a way to populate the contents iteratively.

And you could also expose a new API such as:

PyObject *PyTuple_PackArray(Py_ssize_t n, PyObject** items);

@zooba
Copy link

zooba commented Oct 16, 2023

PyTuple_PackArray will likely come about anyway as we add APIs without varargs, so apart from the convenient cleanup for a partially filled PyObject** array, it ought to be sufficient.

@encukou
Copy link
Author

encukou commented Oct 17, 2023

My draft avoids a memcpy; with inlining it should be as fast as the current approach. I assume CPython core and Cython will want that.

But yes, PyTuple_Pack/PyTuple_PackArray should be sufficient for nearly all cases. Memcpy is fast, and tuples are generally small.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
guideline To be included in guidelines PEP
Projects
None yet
Development

No branches or pull requests

4 participants