Make _Py_char2wchar() and _Py_wchar2char() public #62595

vstinner opened this issue Jul 7, 2013 · 5 comments


vstinner commented Jul 7, 2013

BPO 18395
Nosy @warsaw, @vstinner, @takluyver, @MojoVampire

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2014-08-01.10:36:33.655>
created_at = <Date 2013-07-07.14:00:36.528>
labels = []
title = 'Make _Py_char2wchar() and _Py_wchar2char() public'
updated_at = <Date 2014-08-01.10:36:33.654>
user = ''

bugs.python.org fields:

activity = <Date 2014-08-01.10:36:33.654>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2014-08-01.10:36:33.655>
closer = 'vstinner'
components = []
creation = <Date 2013-07-07.14:00:36.528>
creator = 'vstinner'
dependencies = []
files = []
hgrepos = []
issue_num = 18395
keywords = []
message_count = 5.0
messages = ['192557', '223393', '223396', '223430', '224483']
nosy_count = 5.0
nosy_names = ['barry', 'vstinner', 'python-dev', 'takluyver', 'josh.r']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = None
url = ''
versions = ['Python 3.4']

Member Author

vstinner commented Jul 7, 2013

The Python C API has two very useful functions: _Py_char2wchar() and _Py_wchar2char(). They must be used to handle undecodable byte sequences correctly. Both functions use the surrogateescape error handler (PEP 383). _Py_char2wchar() also forces the ASCII encoding on FreeBSD and Solaris when the LC_CTYPE locale is C.
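
To illustrate what surrogateescape decoding means here, the following is a minimal sketch, not the code from Python/fileutils.c. It assumes the forced-ASCII case described above: ASCII bytes decode to themselves, and each undecodable byte 0x80..0xFF becomes a lone surrogate U+DC80..U+DCFF. The name `decode_ascii_surrogateescape` is hypothetical; only its signature mirrors the decoder discussed in this issue.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of the CPython API: decode a byte string as
 * ASCII, mapping each undecodable byte 0x80..0xFF to the lone surrogate
 * U+DC80..U+DCFF (the PEP 383 surrogateescape scheme).
 * Returns a malloc()ed, NUL-terminated wide string, or NULL if out of memory.
 * If size is not NULL, *size receives the number of wide characters. */
wchar_t *decode_ascii_surrogateescape(const char *arg, size_t *size)
{
    assert(arg != NULL);
    size_t len = strlen(arg);
    wchar_t *out = malloc((len + 1) * sizeof(wchar_t));
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        unsigned char byte = (unsigned char)arg[i];
        /* ASCII passes through; anything else is escaped as U+DC00 + byte. */
        out[i] = (byte < 0x80) ? (wchar_t)byte : (wchar_t)(0xDC00 + byte);
    }
    out[len] = L'\0';
    if (size != NULL)
        *size = len;
    return out;
}
```

Because every input byte maps to exactly one wide character, the original bytes can be recovered later without loss, which is the property that makes this error handler suitable for command-line arguments and file names.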

Py_Main() expects an array of wide character strings (wchar_t*) for the command-line arguments, whereas main() gets an array of byte strings (char*). _Py_char2wchar() must be used to be able to call Py_Main().
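
The conversion from main()'s char* array to the wchar_t* array that Py_Main() wants could be sketched as below. This is an illustration in standard C only, under stated assumptions: `decode_fn`, `decode_with_mbstowcs` and `decode_argv` are hypothetical names, and the stand-in decoder uses plain mbstowcs(), which fails on undecodable bytes instead of escaping them as _Py_char2wchar() does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Same shape as the decoder discussed in this issue:
 * wchar_t *decode(const char *arg, size_t *size). */
typedef wchar_t *(*decode_fn)(const char *arg, size_t *size);

/* Stand-in decoder for illustration only: plain mbstowcs() in the current
 * locale. Unlike _Py_char2wchar(), it fails on undecodable bytes. */
wchar_t *decode_with_mbstowcs(const char *arg, size_t *size)
{
    /* A decoded string never has more wide chars than the input has bytes. */
    size_t nbytes = strlen(arg);
    wchar_t *out = malloc((nbytes + 1) * sizeof(wchar_t));
    if (out == NULL)
        return NULL;
    size_t len = mbstowcs(out, arg, nbytes + 1);
    if (len == (size_t)-1) {        /* invalid multibyte sequence */
        free(out);
        return NULL;
    }
    if (size != NULL)
        *size = len;
    return out;
}

/* Hypothetical helper: build the wchar_t* array from main()'s char* array.
 * Returns NULL (after freeing partial results) if any argument fails. */
wchar_t **decode_argv(int argc, char **argv, decode_fn decode)
{
    assert(argc >= 0 && argv != NULL && decode != NULL);
    wchar_t **wargv = calloc((size_t)argc + 1, sizeof(wchar_t *));
    if (wargv == NULL)
        return NULL;
    for (int i = 0; i < argc; i++) {
        wargv[i] = decode(argv[i], NULL);
        if (wargv[i] == NULL) {
            for (int j = 0; j < i; j++)
                free(wargv[j]);
            free(wargv);
            return NULL;
        }
    }
    return wargv;   /* wargv[argc] is NULL, like argv[argc] */
}
```

In an embedding program the resulting array would be handed to Py_Main(argc, wargv). If Python's own decoder is passed in instead of the stand-in, the strings would need to be released with Python's matching deallocator rather than free().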

I propose the following names:

wchar_t* Py_DecodeLocale(const char* arg, size_t *size);
char* Py_EncodeLocale(const wchar_t *text, size_t *error_pos);
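
As a rough sketch of how the `error_pos` parameter of the proposed encoder might behave, the helper below encodes to ASCII and reverses the surrogateescape mapping. It is an assumption-laden illustration, not the CPython implementation: the name `encode_ascii_surrogateescape` is hypothetical, and the convention of storing (size_t)-1 in `*error_pos` when no character is at fault is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not the CPython implementation: encode a wide string
 * to ASCII, turning lone surrogates U+DC80..U+DCFF back into bytes 0x80..0xFF.
 * On success, returns a malloc()ed byte string and sets *error_pos to
 * (size_t)-1. On an unencodable character, returns NULL and sets *error_pos
 * to its index. On memory failure, returns NULL with *error_pos == (size_t)-1. */
char *encode_ascii_surrogateescape(const wchar_t *text, size_t *error_pos)
{
    assert(text != NULL);
    size_t len = wcslen(text);
    if (error_pos != NULL)
        *error_pos = (size_t)-1;
    unsigned char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        unsigned long ch = (unsigned long)text[i];
        if (ch < 0x80) {
            out[i] = (unsigned char)ch;             /* plain ASCII */
        } else if (ch >= 0xDC80 && ch <= 0xDCFF) {
            out[i] = (unsigned char)(ch - 0xDC00);  /* escaped raw byte */
        } else {
            free(out);
            if (error_pos != NULL)
                *error_pos = i;                     /* report the offender */
            return NULL;
        }
    }
    out[len] = '\0';
    return (char *)out;
}
```

A caller can tell the two failure modes apart by checking whether `*error_pos` is (size_t)-1 (out of memory) or a valid index (unencodable character).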

See Python/fileutils.c for more information about these functions.

Python 3.3 already has higher-level functions (calling _Py_char2wchar() and _Py_wchar2char()):

PyObject* PyUnicode_DecodeLocale(const char *str, const char *errors);
PyObject* PyUnicode_EncodeLocale(PyObject *unicode, const char *errors);

But these functions cannot be used before Python is initialized.


MojoVampire mannequin commented Jul 18, 2014

How often do people need to do platform-independent locale encoding before Python is initialized? Encouraging use of platform-dependent wchar_t's seems like a bad idea when PyUnicode has abstracted away the difference ever since 3.3 was released.


takluyver mannequin commented Jul 18, 2014

You seem to need wchar_t to call Py_Main and Py_SetProgramName.

I think there's an example in the docs which is wrong, because it appears to pass a char* to Py_SetProgramName:

Member Author

> You seem to need wchar_t to call Py_Main and Py_SetProgramName.

Yes, exactly.


python-dev mannequin commented Aug 1, 2014

New changeset 93a798c7f270 by Victor Stinner in branch 'default':
Issue bpo-18395: Rename ``_Py_char2wchar()`` to :c:func:`Py_DecodeLocale`, rename

New changeset 94d0e842b9ea by Victor Stinner in branch 'default':
Issue bpo-18395, bpo-22108: Update embedded Python examples to decode correctly

@vstinner vstinner closed this as completed Aug 1, 2014
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022