Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ffi.gc() on the ring nodes to avoid needing a weakref to PersistentPy #134

Merged
merged 1 commit into from Mar 5, 2020

Conversation

jamadden
Copy link
Member

@jamadden jamadden commented Mar 4, 2020

Except on PyPy, where we can already weakref them automatically. On CPython, track them using their id (can't use an ffi.new_handle "pointer" because it strongly references the object it's a handle for).

Fixes #133.

Copy link
Member

@mgedmin mgedmin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very unqualified to review this (what even is ffi.gc? I probably don't even want to know), but other than that this looks very plausible to me. Except for the gc.get_objects() loop. That one seems like a recipe for a delayed disaster.

def _from_addr(self, addr):
for i in gc.get_objects():
if id(i) == addr:
return i
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this. id() values can be reused for unrelated objects in CPython.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very true!

I don't expect this to be used on CPython. Even if it were, the ref counting semantics that are tied to ffi.gc should eliminate the race condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That fallback loop was borrowed from objgraph, IIRC. Perhaps the state of the art for non-CPython implementations has moved on since I checked?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

objgraph is a debugging tool. I allow hacks in debugging code that I wouldn't allow in production code ;)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my primary reason for rejecting this is that AFAIU _from_addr() is supposed to return None if the object associated with the old id() value has been freed, but the get_objects() method cannot reliably implement that.

# they don't cooperate (without this check a bunch of test_picklecache
# breaks)
or not isinstance(value, self._SWEEPABLE_TYPES)):
ring.delete_node(node)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double-checking: does self.ring support mutation during iteration?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happily now it does!

I took this opportunity to improve the semantics of the one-ond-only Ring implementation. It resulted in, i think, some nice simplifications.

# We allow mutation during iteration, which means
# we must get the next ``here`` value before
# yielding, just in case the current value is
# removed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this nicely answers my question ;)

@jamadden
Copy link
Member Author

jamadden commented Mar 4, 2020

I'm very unqualified to review this (what even is ffi.gc? I probably don't even want to know), but other than that this looks very plausible to me. Except for the gc.get_objects() loop.

The gc.get_objects loop is for, nominally, Jython support. Such support is still claimed and supported by comments in a few places. That once mattered a lot to me, but less so now, so if it's a problem I'm happy eliding it.

The ZODB PURE_PYTHON tests on CPython run now with this change…with the exception of a block that's specifically targeting Jython, and one that expects ref counts of None to remain stable: yikes! (I think I might have written that.)

what even is ffi.gc? I probably don't even want to know

Indeed. My first few attempts resulted in segfaults everywhere, and then in undefined behaviour. But I think (hope?) this is close to the simplest possible exploitation of it: a reliable ref-counted object that doesn't depend on the circular CPython GC or __weakref__, yet still delivers a callback. (My first tries used CFFI handles, which wound up using cycles, which is not a good practice. I squashed that history away.)

This doesn't have any size benefit over __weakref__ on CPython, I think. It just has consistency benefits. Maybe that is or is not enough.

@icemac
Copy link
Member

icemac commented Mar 5, 2020

Is Jython still a thing and is anyone using it together with zopefoundation stuff? Currently we are not even testing (at least not automatically) on Jython so I feel not very well in claiming that we still support it.

@jamadden
Copy link
Member Author

jamadden commented Mar 5, 2020

Is Jython still a thing and is anyone using it together with zopefoundation stuff?

It's still a thing. As for its use with this stuff, I don't know anymore.

Currently we are not even testing (at least not automatically) on Jython so I feel not very well in claiming that we still support it.

We don't really claim to support it (it's not in the classifiers, for example). There are just some remnants of support for it left in comments and alternate code paths.

However, as of persistent 4.5.0, which made CFFI a firm requirement, all possibility of using Jython was eliminated (until such time as the CFFI port is finished). (I should have remembered that.) So there's really no need for this code path and I can drop it.

Copy link
Member

@mgedmin mgedmin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm (from someone unqualified to review this, I remind you!), but please fix the cytypes typo in the changelog

CHANGES.rst Outdated Show resolved Hide resolved
…nt objects.

Except on PyPy, where we can already weakref them automatically.

Fixes #133.
@jamadden jamadden merged commit 3ff5f46 into master Mar 5, 2020
@jamadden jamadden deleted the issue133 branch March 5, 2020 12:06
@d-maurer
Copy link

d-maurer commented Mar 5, 2020 via email

@jamadden
Copy link
Member Author

jamadden commented Mar 5, 2020

I believe the two are unrelated. The changes here have nothing to do with making objects weakly referenceable or not. All that changed here is that the pure Python PickleCache can now store persistent objects that cannot be weakly referenced; the C implementation has always been able to do this. (All Persistent objects that come from ZODB are stored in a PickleCache.) By default, subclasses of Persistent can be weakly referenced.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Python/C inconsistency: Cannot store raw Persistent objects in cache
4 participants