Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExtentFilteredSet: index_doc results in extra readCurrent calls for rejected objects #12

Closed
jamadden opened this issue May 13, 2021 · 0 comments · Fixed by #13
Closed

Comments

@jamadden
Copy link
Member

Here's index_doc for ExtentFilteredSet:
https://github.com/NextThought/nti.zope_catalog/blob/b27fd1abdd2f9448ce7d8a81ea95c20c071fb021/src/nti/zope_catalog/topic.py#L129-L133

This method is called quite a bit (every intid added). If the object doesn't pass the filter, the superclass raises a ValueError, and we unindex it (from zope.index):

    def unindex_doc(self, docid):
        try:
            self._ids.remove(docid)
        except KeyError:
            pass

Which ends up in this code from zc.catalog:

    def remove(self, uid):
        self.set.remove(uid)

self._ids.set is a BTree TreeSet. The first thing that TreeSet.remove() does is call self._p_jar.readCurrent(self). This happens even if the docid isn't in the set.

The result is that we can wind up with unchanged objects showing up in readCurrent calls and taking RelStorage locks on them.

If we change index_doc to first check if docid in self._ids:, we can avoid this. Yes, we'll have to traverse the TreeSet, but we're already traversing it to try to remove. So in the common case where a new object just fails to pass the filter, we still only traverse once. The uncommon case of an existing object that no longer passes the filter would traverse it twice, but that's relatively rare.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant