Inconsistent behavior between dictionary methods of access (d[k] vs d.__getitem__(k)) #1602

Closed
goodmami opened this issue Feb 10, 2017 · 2 comments

@goodmami

A dictionary with single-character unicode keys, when queried with a Py_UNICODE key, raises a KeyError on bracket access (e.g. d[k]), but the lookup succeeds with d.get(k), d.__getitem__(k), k in d, d.__contains__(k), etc. Here is an example:

d = {u'a': 1, u'b': 2}
cdef Py_UNICODE c = u'a'
print(c, c in d, d.get(c), d.__getitem__(c))  # succeeds
print(d[c])  # KeyError

I get that maybe Py_UNICODE isn't compatible with unicode, so the hashes would be different, thus leading to the KeyError, but I wouldn't expect it to work with some methods of dictionary access and not others. This can lead to some hard-to-find bugs.

@scoder
Contributor

scoder commented Feb 11, 2017

Both Py_UNICODE and the more portable Py_UCS4 are treated as C integers here. I agree that they should not be.
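
To make the integer treatment concrete, here is a plain-Python model of the two ways the key could be interpreted. It is only a sketch, not Cython's actual code path: it assumes that the C-integer view of the character corresponds to `ord(c)`, and the `lookup` helper is a hypothetical name introduced for illustration.

```python
# Plain-Python model (an assumption, not Cython itself):
# ord(c) stands in for "the character treated as a C integer".
d = {u'a': 1, u'b': 2}
c = u'a'
as_integer = ord(c)  # the code point of the character


def lookup(mapping, key):
    """Hypothetical helper: return the value, or None on KeyError."""
    try:
        return mapping[key]
    except KeyError:
        return None


unicode_result = lookup(d, c)           # key seen as a unicode string: found
integer_result = lookup(d, as_integer)  # key seen as an integer: not found
print(unicode_result, integer_result)
```

Under this model the integer view never matches a unicode key, which is consistent with the KeyError reported for d[c].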

@scoder scoder closed this as completed in 786e37f Feb 11, 2017
@scoder
Contributor

scoder commented Feb 11, 2017

The old behaviour can be restored in a backwards-compatible way by explicitly casting to a C integer.
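
As a rough illustration of when the integer view is actually wanted, the sketch below uses a table keyed by code point. It is plain Python, not Cython: `ord(c)` plays the role of the explicit C-integer cast (which in Cython would presumably be written with the `<int>` cast syntax), and the `widths` table and its contents are made up for the example.

```python
# Hypothetical table keyed by integer code points (made-up data).
widths = {97: 1, 0x4E2D: 2}  # code point -> display width

c = u'a'
explicit = widths[ord(c)]  # explicit integer conversion: matches the integer key
implicit = widths.get(c)   # unicode key: no match in an integer-keyed table
print(explicit, implicit)
```

Code that indexes by code point in this way is the kind that would need the explicit cast once the character is no longer treated as an integer implicitly.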
