WindowsError messages are not properly encoded #46094

Closed
r37c mannequin opened this issue Jan 7, 2008 · 16 comments

@r37c
Mannequin

r37c mannequin commented Jan 7, 2008

BPO 1754
Nosy @gvanrossum, @loewis, @terryjreedy, @amauryfa, @tiran, @methane
PRs
  • bpo-30801: Fix for shoutdown process error with python 3.4 and pyqt/PySide #2413
Files
  • windowserror.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/amauryfa'
    closed_at = <Date 2010-08-05.00:36:50.725>
    created_at = <Date 2008-01-07.11:24:24.956>
    labels = ['type-bug', 'OS-windows']
    title = 'WindowsError messages are not properly encoded'
    updated_at = <Date 2017-06-27.13:29:46.274>
    user = 'https://bugs.python.org/r37c'

    bugs.python.org fields:

    activity = <Date 2017-06-27.13:29:46.274>
    actor = 'alberfontan1'
    assignee = 'amaury.forgeotdarc'
    closed = True
    closed_date = <Date 2010-08-05.00:36:50.725>
    closer = 'terry.reedy'
    components = ['Windows']
    creation = <Date 2008-01-07.11:24:24.956>
    creator = 'r37c'
    dependencies = []
    files = ['9100']
    hgrepos = []
    issue_num = 1754
    keywords = []
    message_count = 16.0
    messages = ['59441', '59448', '59452', '59455', '59462', '59472', '59474', '59490', '59495', '59498', '59512', '97794', '105120', '112899', '112902', '112934']
    nosy_count = 8.0
    nosy_names = ['gvanrossum', 'loewis', 'terry.reedy', 'amaury.forgeotdarc', 'christian.heimes', 'r37c', 'eckhardt', 'methane']
    pr_nums = ['2413']
    priority = 'normal'
    resolution = 'out of date'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue1754'
    versions = ['Python 2.7']

    @r37c
    Mannequin Author

    r37c mannequin commented Jan 7, 2008

    The message for WindowsError is taken from the Windows API's
    FormatMessage() function, following the OS language. Currently Python
    does no conversion for those messages, so non-ASCII characters end up
    improperly encoded in the console. For example:

      >>> import os
      >>> os.rmdir('E:\\temp')
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      WindowsError: [Error 145] A pasta nÒo estß vazia: 'E:\\temp'

    Should be: "A pasta não está vazia" [Folder is not empty].

    Python could check the code page of the current output device and
    convert the message accordingly.

    @r37c r37c mannequin added the OS-windows and type-bug labels Jan 7, 2008
    @gvanrossum
    Member

    Crys, can you confirm this?

    It would seem we'll need to fix this twice -- once for 2.x, once for 3.0.

    @tiran
    Member

    tiran commented Jan 7, 2008

    Oh nice ...

    Amaury probably knows more about the wide-char Windows API than I do.
    The PyErr_SetExcFromWindows*() functions in Python/errors.c need to be
    modified.

    @amauryfa
    Member

    amauryfa commented Jan 7, 2008

    I confirm the problem (with French accents) on Python 2.5.
    Python 3.0 has already fixed it by using FormatMessageW(), the
    Unicode version of the API.

    We could do the same for Python 2.5, but the error message must be
    converted to str early (i.e., when building the Exception). What is the
    correct encoding to use?

    @r37c
    Mannequin Author

    r37c mannequin commented Jan 7, 2008

    "... but the error message must be converted to str early (i.e when
    building the Exception)."

    Wouldn't that create more problems? What if somebody wants to intercept
    the exception and do something with it, like, say, redirect it to a log
    file? The programmer must, then, be aware of the different encoding. I
    thought about keeping the exception message in Unicode and converting it
    just before printing. Is that possible for Python 2.x?

    @amauryfa
    Member

    amauryfa commented Jan 7, 2008

    I think this is not possible if we want to preserve compatibility; at
    least, str(e.strerror) must not fail.

    I can see different solutions:

    1. Don't fix, and upgrade to python 3.0
    2. Store an additional e.unicodeerror member, use it in a new
      EnvironmentError.__unicode__ method, and call this from PyErr_Display.
    3. Force FormatMessage to return US-English messages.

    My preference is 1): Python 2.5 is mostly encoding-naive, Python 3 is
    Unicode-aware, and I am not sure we want Python 2.6 to contain both
    code paths. Other opinions?

    @gvanrossum
    Member

    3.0 will be a long way away for many users. Perhaps forcing English
    isn't so bad, as Python's own error messages aren't translated anyway?

    @loewis
    Mannequin

    loewis mannequin commented Jan 7, 2008

    I would claim that this is not a bug. Sure, the message doesn't come out
    correctly, but only because you run it in a cmd.exe window, not in (say)
    IDLE.

    IIUC, the problem is that Python computes the message in CP_ACP (i.e.
    the ANSI code page), whereas the terminal interprets it in CP_OEMCP
    (i.e. the OEM code page).
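
The mismatch described above can be reproduced in a few lines. This is only a sketch: the thread does not name the reporter's code pages, so cp1252 (ANSI) and cp850 (OEM) are assumptions, chosen because they are the usual pair on Western European Windows installs and because they yield the garbled text from the original report.

```python
# Assumed code pages (not stated in the thread): typical for Western
# European Windows installs.
ANSI_CODE_PAGE = "cp1252"  # stands in for CP_ACP
OEM_CODE_PAGE = "cp850"    # stands in for CP_OEMCP

message = "A pasta não está vazia"

# Python 2 kept the FormatMessage() result as bytes in the ANSI code page...
raw = message.encode(ANSI_CODE_PAGE)

# ...and the console interpreted those same bytes in the OEM code page.
garbled = raw.decode(OEM_CODE_PAGE)

print(garbled)
```

Under these assumptions the bytes 0xE3 and 0xE1 (ã and á in cp1252) are displayed as Ò and ß, which matches the traceback in the first comment.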

    If we declare that all strings are considered as CP_ACP in the
    exception, then the only way to fix it would be to convert it from
    CP_ACP to CP_OEMCP (or, more generally, sys.stderr.encoding) on
    printing. Such conversion should be implemented in an unfailing way,
    either using replacement characters or falling back to no conversion.
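
A minimal sketch of such an "unfailing" conversion, written in Python 3 syntax for readability. The function name `recode_for_stream` and its parameters are hypothetical and are not part of any patch in this thread; it simply combines the two fallbacks mentioned above (replacement characters, or no conversion at all).

```python
def recode_for_stream(raw, source_encoding, target_encoding):
    """Re-encode message bytes for an output stream without ever raising.

    Hypothetical helper: `raw` is assumed to be bytes in `source_encoding`
    (e.g. the ANSI code page); `target_encoding` would typically come from
    sys.stderr.encoding, which may be None.
    """
    try:
        text = raw.decode(source_encoding)
        # Characters missing from the target code page become '?'.
        return text.encode(target_encoding, errors="replace")
    except (LookupError, TypeError, UnicodeDecodeError):
        # Unknown or missing encoding, or undecodable input:
        # fall back to no conversion.
        return raw


ansi_bytes = "A pasta não está vazia".encode("cp1252")
print(recode_for_stream(ansi_bytes, "cp1252", "cp850"))
print(recode_for_stream(ansi_bytes, "cp1252", "ascii"))
```

The design choice here is that a display problem should never mask the original exception, so every failure path returns something printable.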

    Forcing English messages would certainly reduce the problems, but it
    still might be that the file name in the error message does not come out
    correctly.

    @amauryfa
    Member

    amauryfa commented Jan 7, 2008

    > Forcing English messages would certainly reduce the problems

    And it does not even work: my French Windows XP does not contain the
    English error messages :-(

    > If we declare that all strings are considered as CP_ACP in the
    > exception, then the only way to fix it would be to convert it from
    > CP_ACP to CP_OEMCP (or, more generally, sys.stderr.encoding) on
    > printing. Such conversion should be implemented in an unfailing way,
    > either using replacement characters or falling back to no conversion.

    If this is chosen, I propose to use CharToOem as the "unfailing"
    conversion function. I will try to come up with a patch following this
    idea.

    @loewis
    Mannequin

    loewis mannequin commented Jan 7, 2008

    > If this is chosen, I propose to use CharToOem as the "unfailing"
    > conversion function. I will try to come up with a patch following
    > this idea.

    Sounds fine to me.

    @amauryfa
    Member

    amauryfa commented Jan 8, 2008

    Here is a patch. Now I feel it is a hack, but it is the only place I
    found where I can access both the exception object and the encoding...

    @methane
    Member

    methane commented Jan 14, 2010

    I think WindowsError's message should be in English, like other errors.
    The FormatMessageW() function takes a dwLanguageId parameter, so I
    think Python should pass MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US)
    for it.

    @r37c
    Mannequin Author

    r37c mannequin commented May 6, 2010

    > I think WindowsError's message should be in English, like other errors.
    > The FormatMessageW() function takes a dwLanguageId parameter, so I
    > think Python should pass MAKELANGID(LANG_ENGLISH, SUBLANG_ENGLISH_US)
    > for it.

    On a non-English system, FormatMessageW fails with ERROR_RESOURCE_LANG_NOT_FOUND (The specified resource language ID cannot be found in the image file) when called with that parameter.
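
For illustration, here is a rough ctypes sketch of the proposal together with a fallback for the failure just described. It is not the code path CPython uses. The helper names `make_lang_id`, `format_message`, and `english_or_default_message` are hypothetical; the constants are the documented Win32 values. On non-Windows platforms `format_message` simply returns None.

```python
import ctypes
import sys

# Win32 constants (values from the Windows SDK headers).
FORMAT_MESSAGE_FROM_SYSTEM = 0x00001000
FORMAT_MESSAGE_IGNORE_INSERTS = 0x00000200
LANG_ENGLISH = 0x09
SUBLANG_ENGLISH_US = 0x01


def make_lang_id(primary, sub):
    """Python equivalent of the MAKELANGID macro."""
    return (sub << 10) | primary


def format_message(code, lang_id=0):
    """Return the system message for `code`, or None if it is unavailable.

    lang_id=0 lets Windows choose the language.
    """
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    buf = ctypes.create_unicode_buffer(1024)
    flags = FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS
    length = kernel32.FormatMessageW(
        flags, None, code, lang_id, buf, len(buf), None
    )
    # A zero return means failure, e.g. ERROR_RESOURCE_LANG_NOT_FOUND
    # when the requested language is not installed.
    return buf.value.rstrip() if length else None


def english_or_default_message(code):
    """Try US English first, then fall back to the system language."""
    english = make_lang_id(LANG_ENGLISH, SUBLANG_ENGLISH_US)
    return format_message(code, english) or format_message(code)


print(hex(make_lang_id(LANG_ENGLISH, SUBLANG_ENGLISH_US)))
print(english_or_default_message(2))
```

With a fallback like this the output language depends on which language resources are installed, so messages would still not be uniformly English.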

    @terryjreedy
    Member

    Should we close this?

    There was some opinion that this is not a bug.

    The earlier argument against closing this, "3.0 will be a long way away for many users," is obsolete: 3.1.2 is out and 3.2 will be released in less than 6 months.

    Or, Amaury, do you have any serious prospect of applying the patch to 2.7?

    @loewis
    Mannequin

    loewis mannequin commented Aug 4, 2010

    Somebody should investigate the status of this on 3.x. If the message comes out as a nice Unicode string, I'd close it as fixed. If the message comes out as a byte string, it definitely needs fixing.

    For 2.x, the issue is out of date.

    @terryjreedy
    Member

    The message is definitely a str (unicode) string. Tested on WinXP with 3.1.2:

    import os
    try:
        os.rmdir('nonexist')
    except Exception as e:
        # sep='\n' prints each value on its own line, as in the output below
        print(repr(e.args[1]), repr(e.strerror), e.filename, sep='\n')
    os.rmdir('nonexist')

    # prints
    'The system cannot find the file specified'
    'The system cannot find the file specified'
    nonexist
    ...
    WindowsError: [Error 2] The system cannot find the file specified: 'nonexist'
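
The same check can be written so that it runs on any platform with a current Python 3, where WindowsError is an alias of OSError (since Python 3.3). This is a sketch only: the temporary directory and the name `nonexist` are arbitrary, and the `winerror` attribute is read with `getattr` because it exists only on Windows.

```python
import os
import tempfile

# Create a fresh directory so that the path inside it is guaranteed missing.
base = tempfile.mkdtemp()
missing = os.path.join(base, "nonexist")

caught = None
try:
    os.rmdir(missing)
except OSError as exc:
    caught = exc
finally:
    os.rmdir(base)  # clean up the temporary directory

# strerror is a text string, not bytes, so no code page is baked in.
print(type(caught.strerror).__name__)
print(caught.strerror)
print(caught.filename)
# winerror is only present on Windows; None elsewhere.
print(getattr(caught, "winerror", None))
```

Because the message is already text, the conversion to the console encoding happens when it is written to the output stream rather than when the exception is built.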

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022