Create PyUnicode_AsWideCharString() function #54188
Comments
PyUnicode_AsWideChar() doesn't merge surrogate pairs on a system with a 32-bit wchar_t and Python compiled in narrow mode (sizeof(wchar_t) == 4 and sizeof(Py_UNICODE) == 2); see issue bpo-8670.

It is not easy to fix this problem because the callers of PyUnicode_AsWideChar() assume that the output (wide character) string has the same length (in characters) as the input (PyUnicode) string, i.e. that sizeof(wchar_t) == sizeof(Py_UNICODE). In addition, PyUnicode_AsWideChar() doesn't write a nul character at the end if the output string is truncated.

To prepare this change, a new PyUnicode_AsWideCharString() function would help because it computes the size of the output buffer itself (whereas PyUnicode_AsWideChar() requires the caller to pass the output buffer as an argument). Attached patch implements it:

Returns a buffer allocated by PyMem_Alloc() (use PyMem_Free() to free it) on

PyAPI_FUNC(wchar_t*) PyUnicode_AsWideCharString(
    PyUnicodeObject *unicode, /* Unicode object */
    Py_ssize_t *size /* number of characters of the result */
    ); |
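To make the sizing problem concrete, here is a minimal standalone sketch of the idea, not the attached patch. It assumes the input is an array of 16-bit code units (as with a narrow Py_UNICODE) and the output is an array of 32-bit code points (as with a 4-byte wchar_t). The names as_wide_string, is_high and is_low are hypothetical, malloc()/free() stand in for the Python allocator, and reporting the size without the final nul is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers: classify UTF-16 surrogate code units. */
static int is_high(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Hypothetical function, not the CPython implementation: convert `len`
   16-bit code units into a newly allocated, nul-terminated array of 32-bit
   code points, merging surrogate pairs. If size is not NULL, *size receives
   the number of code points written, excluding the final nul (an assumption
   of this sketch). Returns NULL if allocation fails; the caller releases
   the buffer with free(). */
static uint32_t *
as_wide_string(const uint16_t *units, size_t len, size_t *size)
{
    size_t i, n = 0;
    uint32_t *out;

    assert(units != NULL || len == 0);

    /* First pass: count output characters; a valid pair counts once,
       so the output may be shorter than the input. */
    for (i = 0; i < len; i++, n++) {
        if (is_high(units[i]) && i + 1 < len && is_low(units[i + 1]))
            i++;
    }

    /* The function owns the sizing decision: n characters plus a nul. */
    out = malloc((n + 1) * sizeof(uint32_t));
    if (out == NULL)
        return NULL;

    /* Second pass: copy code units, merging each valid surrogate pair.
       Lone surrogates are copied through unchanged. */
    n = 0;
    for (i = 0; i < len; i++) {
        uint32_t ch = units[i];
        if (is_high(units[i]) && i + 1 < len && is_low(units[i + 1])) {
            ch = 0x10000 + ((ch - 0xD800) << 10) + (units[i + 1] - 0xDC00);
            i++;
        }
        out[n++] = ch;
    }
    out[n] = 0;  /* always nul-terminated, never truncated */

    if (size != NULL)
        *size = n;
    return out;
}
```

Because the function allocates the buffer after counting, the caller never has to guess the output length, and the result cannot be truncated or left without a terminating nul, which are the two issues described above for a caller-supplied buffer.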
New version of the patch:
Keep the call to PyUnicode_AsWideChar() in:
|
STINNER Victor wrote:
Great idea! |
+1 from me as well. |
I don't want to do two different things at the same time. My plan is:
So, you agree with the API (and the documentation)? |
I fixed this issue in multiple commits:
Well, you can now directly patch the documentation. I think that the API is simple and fine :-) |