-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
handle None value in field index #12
Conversation
Interestingly, when the now-failing test was first added back in 9c0933d, the Given that the original behaviour under BTrees 3 was to allow None keys, and that the restoration of that behaviour recently was for the benefit of the Plone community (who, IIUC, do want None keys in Products.ZIndex(?)), I'm wondering if the right thing to do might be to go back to the code pre- 9c0933d? That is, remove both the 'if' and 'try' statements (and alter the tests)? That would allow None keys, while not silently hiding TypeError for other invalid keys. |
The OS X 'homebrew 3' build failed. It has switched to Python 3.6, and apparently we don't support Python 3.6 yet (the linux build is pinned to 3.5 max). This has been dealt with before in BTrees and others and probably needs to change here too (although probably in a separate PR). |
@jamadden this follows your suggestion. |
@@ -84,8 +76,9 @@ def index_doc(self, docid, value): | |||
def unindex_doc(self, docid): | |||
"""See interface IInjection""" | |||
rev_index = self._rev_index | |||
value = rev_index.get(docid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use a marker value, to avoid doing both a __contains__
and then a __getitem__
call?
value = rev_index.get(docid, _MARKER)
if value is _MARKER:
return # not in index
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar enough with python interning rules to know how to do this in a completely safe way. Is there a part of the python spec that discusses this or is it something that can be different from implementation to implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I know, there's nothing that's actually in the spec about this specifically.
- Literals can be automatically interned by all implementations.
- PyPy can intern numbers automatically (well, it fakes that)
- Likewise, PyPy fake-interns tuples and frozensets (exactly which ones has changed recently)
- Strings can be interned manually; PyPy can intern strings automatically (which ones depends on the version)
- CPython uses freelists for things like lists and tuples (and dicts?), so in theory it's possible to observe two such "different" objects having the same ID, so long as you don't keep a reference to them.
>>> id([])
4527437080
>>> id([])
4527437080
>>> id([1])
4527437080
>>> id([1, 2])
4527437080
(PyPy does exactly the opposite of this, BTW:
>>> id([])
4562493104L
>>> id([])
4562493128L
>>> id([1])
4562493152L
)
I think that "keeping a reference" thing is generally enough to be safe, though, so long as you avoid things that can be written literally like strings and numbers (typically one sees object
). You can keep the _MARKER
at global module scope or if you want to be extra sure you can cons one up for every function invocation:
marker = object()
value = rev_index.get(docid, marker)
if value is marker:
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would create the module-global _MARKER
and use it as the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do switch back to using rev_index.get(docid, _MARKER)
.
@@ -84,8 +76,9 @@ def index_doc(self, docid, value): | |||
def unindex_doc(self, docid): | |||
"""See interface IInjection""" | |||
rev_index = self._rev_index | |||
value = rev_index.get(docid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do switch back to using rev_index.get(docid, _MARKER)
.
The fact that this requires BTrees > 4.4.1, plus the fact that the |
This should fix a test that depended upon None not being an acceptable key in Btree, which is no longer valid for current versions of BTree.