gh-150865: Allow immortalizing objects as long as thread owns it#150956
gh-150865: Allow immortalizing objects as long as thread owns it#150956seberg wants to merge 1 commit into
Conversation
As discussed in pythongh-150865, it seems unnecessary to enforce a unique reference. Rather, an object can be immortalized (even accidentally by overflowing the local reference count) unless it is owned by the current thread. Allowing immortalizing more generally simplifies immortalization of objects with complex ceration (in the example, new types may be created with multiple references from the start). This may mean that the `ref_shared` reference count can be nonzero after immortalization, but that should be safe (as it also happens for accidental immortalization). I made the function idempotent (even for unicode on this one) just for the sake of it.
Documentation build overview
|
ZeroIntensity
left a comment
There was a problem hiding this comment.
Thanks for the PR. I'm going to temporarily apply DO-NOT-MERGE while I dig through my old implementation and figure out the contract more concretely.
| This function is intended to be used for reducing reference counting contention | ||
| in free-threaded builds for objects which are shared across threads. |
There was a problem hiding this comment.
We should note that it's intended for highly-specific cases where either:
- The object is not tracked by the GC (so DRC doesn't work).
- The contention occurs in native code calling
Py_INCREF, where DRC does not apply.
| Unicode objects are not immortalized and :c:func:`PyUnicode_InternInPlace` should | ||
| be used instead. |
There was a problem hiding this comment.
We should document that some objects may not support immortalization in the future and that some objects will break when immortalized. (The second part should be in a .. warning note.)
| object.ob_gc_bits = 0; | ||
| object.ob_ref_local = 1; | ||
| object.ob_ref_shared = 0; | ||
| object.ob_ref_local = 3; |
There was a problem hiding this comment.
This seems kind of arbitrary, can you add a comment explaining why 3 is used?
| object.ob_ref_local = 1; | ||
| object.ob_ref_shared = 0; | ||
| object.ob_ref_local = 3; | ||
| object.ob_ref_shared = 128; |
There was a problem hiding this comment.
Why is the shared refcount being changed?
| // Only the owning thread can immortalize the object safely | ||
| if (!_Py_IsOwnedByCurrentThread(op)) { |
There was a problem hiding this comment.
Don't we need to also check that ob_ref_shared == 0?
There was a problem hiding this comment.
Let's expose PyUnstable_SetImmortal as a helper and add tests using some more exotic objects. I'm still not very keen on the idea that "owned by current thread" implies "safe to immortalize".
Here's an outline:
static PyObject *
pyobject_set_immortal(PyObject *self, PyObject *op)
{
return PyLong_FromLong(PyUnstable_SetImmortal(op));
}import _testcapi
class TestWhatever(TestCase):
def test_set_immortal(self):
for obj, expected_result in [(module_object, 1), (shared_module_object, 0), ...]:
self.assertEqual(_testcapi.pyobject_enable_deferred_refcount(module), expected_result)| @@ -2885,7 +2885,17 @@ int | |||
| PyUnstable_SetImmortal(PyObject *op) | |||
There was a problem hiding this comment.
Ok, looking at my version, we need to do a few other things here:
- The interpreter stack might have some live references to the object, and it currently thinks that the object is mortal. We need to undo that (see this function).
- I think we need to disable deferred reference counting on the object in case there are some queued references that will break later. I achieved this by exposing this function from
gc_free_threading.cand calling_PyObject_MergePerThreadRefcounts. - It's probably a good idea to lower the reftotal so
-Xshowrefcountdoesn't break. (See this loop.) - If you want to support calling
PyUnstable_SetImmortalon objects that have a non-zeroob_ref_sharedas well, then you also need to make a stop-the-world pause (_PyEval_StopTheWorld/_PyEval_StartTheWorld) to ensure that the object is in a consistent state when immortalizing it.
This also furthers my suspicion that accidental immortalization does not work correctly.
Note that my implementation was also designed to eventually deallocate the immortal object, so there are some things in my PyUnstable_Immortalize that we don't need to do in PyUnstable_SetImmortal (namely the _Py_immortal stuff), but the above list makes sense to me even if the object isn't ever deallocated.
There was a problem hiding this comment.
-
4: Well, I'll assume stop the world likely not worth the effort. And I expect without that, there could be arbitrary increfs/decrefs in-flight on
ob_ref_shared.
So, unless we assume it's OK if (maybe except the flags space, I'll guess that only gets mutated from the owning thread/during stop-the-world)ob_ref_sharedmust be considered undefined for immortal objects. Otherwise, I expect it is best to just close this PR. -
3: Not sure I follow why it matters that immortalizing leaves the total refcount untouched. Immortalizing seems like "arbitrary code" and arbitrary code may modify it?
-
1-2: This is maybe just more complicated then I think, I can't follow that deeply unfortunately. Part of me keeps wondering if there are two competing concepts of "immortal" here, but I just don't know (the best I can do is find this comment in
PyStackRef_CheckValidright now:)/* Can be immortal if object was made immortal after reference came into existence */ assert(!_Py_IsStaticImmortal(obj));(and it isn't even clear to me that the current check would be enough to protect if that wasn't.)
There was a problem hiding this comment.
Yeah, I'm not sure what the difference between a "static immortal" and "immortal" is.
For 3, it would just be to improve debugging. Lowering the global refcount would make it so -Xshowrefcount shows 0 leaked references at the end.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
As discussed in gh-150865, it seems unnecessary to enforce a unique reference. Rather, an object can be immortalized (even accidentally by overflowing the local reference count) unless it is owned by the current thread.
Allowing immortalizing more generally simplifies immortalization of objects with complex ceration (in the example, new types may be created with multiple references from the start).
This may mean that the
ref_sharedreference count can be nonzero after immortalization, but that should be safe (as it also happens for accidental immortalization).I made the function idempotent (even for unicode on this one) just for the sake of it.
Closes gh-150865
(Not sure this needs a blurp, I guess I can add myself to the existing one, but I don't mind either way.)
(once merged, I guess this should also be ported to pythoncapi-compat)