MAINT: PyPy3 compatibility: sys.getsizeof() #8586
…ode representation size This is for PyPy3 compatibility: sys.getsizeof() is CPython-specific and there doesn't seem to be a pure-Python way of getting the size of the internal PEP393 Unicode representation, so recompute it using documented invariants.
Ugh, that's ugly. How does pypy store unicode strings?
It's not even clear to me that the original is correct ;) Maybe for testing purposes.
Sorry, fat fingered the comment and close button.
The use of buffer API on Unicode objects is nasty all around, since it's
necessarily super-tightly bound to the interpreter's internal
representation. But given that we're doing that, I don't think this patch
makes things worse.
I suppose it would also be possible to try ascii, latin1, utf-16, and utf-32 encodings and see which one first had a compatible length in bits, {8,16,32}*number_characters.
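A rough sketch of that encoding-probe idea, for illustration only. The function name `width_by_encoding` is hypothetical, and the little-endian codec variants are an assumption made here so that no byte-order mark inflates the encoded length. Strings containing lone surrogates cannot be encoded by any of these codecs, so the sketch raises for them.

```python
def width_by_encoding(s):
    """Guess the bytes-per-character width by probing encodings.

    Hypothetical helper: tries each codec in order and returns the
    width of the first one whose encoded length is width * len(s).
    """
    n = len(s)
    # The "-le" variants are used so that no BOM is prepended.
    candidates = (
        ("ascii", 1),
        ("latin-1", 1),
        ("utf-16-le", 2),
        ("utf-32-le", 4),
    )
    for codec, width in candidates:
        try:
            encoded = s.encode(codec)
        except UnicodeEncodeError:
            # Some character is not representable in this codec.
            continue
        # utf-16 uses 4 bytes for non-BMP characters, so the length
        # check rejects it whenever such a character is present.
        if len(encoded) == width * n:
            return width
    raise ValueError("no candidate encoding matched (lone surrogate?)")
```

This only relies on the codecs' documented behaviour, so it should behave the same on CPython and PyPy, at the cost of encoding the string up to four times.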
Thanks @rlamy
@njsmith It occurs to me that perhaps we should discourage this usage of unicode strings in Python 3. It was justified in Python 2, which only offered UCS-2 and UCS-4, but in Python 3 the corresponding functionality would (almost) be byte strings encoded in utf-16 or utf-32.
I'm currently working on ensuring compatibility of the upcoming pypy3.5 with numpy[*].
This use of sys.getsizeof() causes many spurious test failures on pypy3, because sys.getsizeof() is CPython-specific. Since there doesn't seem to be a pure-Python way of getting the size of the internal PEP393 Unicode representation, I'm recomputing it using documented invariants instead.
[*]: If you're interested, you can grab a Linux nightly from here and check for yourself. Most things already work, barring the occasional puzzling segfault.
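For readers who have not looked at PEP 393: the invariant in question is that the per-character storage width is determined by the largest code point in the string. Below is a minimal sketch of recomputing it that way. The helper names are hypothetical and this is not the code from the patch itself, only an illustration of the idea.

```python
def pep393_char_width(s):
    """Bytes per character in a PEP 393 style representation of s.

    Hypothetical helper: PEP 393 picks 1, 2 or 4 bytes per character
    depending on the largest code point in the string.
    """
    if not s:
        return 1  # an empty string fits the narrowest kind
    max_cp = max(map(ord, s))
    if max_cp < 0x100:      # ASCII or Latin-1 range
        return 1
    if max_cp < 0x10000:    # Basic Multilingual Plane
        return 2
    return 4                # anything beyond the BMP


def pep393_data_size(s):
    """Size in bytes of the character data only, excluding any object header."""
    return len(s) * pep393_char_width(s)
```

Unlike sys.getsizeof(), this never touches interpreter internals, so it gives the same answer on any Python 3 implementation.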