lzh_tw is missing in locale.py #76962

Closed
cypressyew mannequin opened this issue Feb 6, 2018 · 8 comments
Labels
stdlib Python modules in the Lib dir

Comments

@cypressyew
Mannequin

cypressyew mannequin commented Feb 6, 2018

BPO 32781
Nosy @malemburg, @benjaminp, @methane, @serhiy-storchaka
Superseder
  • bpo-20087: Mismatch between glibc and X11 locale.alias
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2018-02-15.10:03:40.857>
    created_at = <Date 2018-02-06.08:17:53.407>
    labels = ['library']
    title = 'lzh_tw is missing in locale.py'
    updated_at = <Date 2018-02-15.10:03:40.856>
    user = 'https://bugs.python.org/cypressyew'

    bugs.python.org fields:

    activity = <Date 2018-02-15.10:03:40.856>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-02-15.10:03:40.857>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2018-02-06.08:17:53.407>
    creator = 'cypressyew'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32781
    keywords = []
    message_count = 8.0
    messages = ['311717', '311719', '311720', '311721', '311778', '312195', '312196', '312197']
    nosy_count = 5.0
    nosy_names = ['lemburg', 'benjamin.peterson', 'methane', 'serhiy.storchaka', 'cypressyew']
    pr_nums = []
    priority = 'normal'
    resolution = 'duplicate'
    stage = 'resolved'
    status = 'closed'
    superseder = '20087'
    type = None
    url = 'https://bugs.python.org/issue32781'
    versions = ['Python 3.6']

    @cypressyew
    Mannequin Author

    cypressyew mannequin commented Feb 6, 2018

    The lzh_tw locale (Literary Chinese) is not available in Lib/locale.py.

    This causes an error like:

    Traceback (most recent call last):
      File "/usr/share/apport/apport-gtk", line 598, in <module>
        app.run_argv()
      File "/usr/lib/python3/dist-packages/apport/ui.py", line 694, in run_argv
        return self.run_crashes()
      File "/usr/lib/python3/dist-packages/apport/ui.py", line 245, in run_crashes
        logind_session[1] > self.report.get_timestamp():
      File "/usr/lib/python3/dist-packages/apport/report.py", line 1684, in get_timestamp
        orig_ctime = locale.getlocale(locale.LC_TIME)
      File "/usr/lib/python3.6/locale.py", line 581, in getlocale
        return _parse_localename(localename)
      File "/usr/lib/python3.6/locale.py", line 490, in _parse_localename
        raise ValueError('unknown locale: %s' % localename)
    ValueError: unknown locale: lzh_TW

    This is easy to reproduce on Ubuntu 17.10 with English selected as the default language but the time zone set to Taipei. That combination sets the locale to:

    $ locale
    LANG=en_US.UTF-8
    LANGUAGE=
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC=lzh_TW
    LC_TIME=lzh_TW
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY=lzh_TW
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER=lzh_TW
    LC_NAME=lzh_TW
    LC_ADDRESS=lzh_TW
    LC_TELEPHONE=lzh_TW
    LC_MEASUREMENT=lzh_TW
    LC_IDENTIFICATION=lzh_TW
    LC_ALL=

    Running a Python script that calls into locale.py (for example locale.getlocale(locale.LC_TIME), as in the traceback) then produces the error message above.
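
For callers hit by the traceback above, a defensive wrapper is one possible workaround. This is a minimal sketch, not a fix to locale.py: `getlocale_or_none` is a hypothetical helper name, and treating an unparseable locale as `(None, None)` is an assumption about what the caller wants.

```python
import locale


def getlocale_or_none(category=locale.LC_TIME):
    """Return locale.getlocale(category), or (None, None) if the name cannot be parsed."""
    try:
        return locale.getlocale(category)
    except ValueError:
        # locale.py raises ValueError('unknown locale: ...') when it cannot
        # split the current locale name into a language code and an encoding.
        return (None, None)


# Hypothetical usage: read the LC_TIME setting without risking a crash.
print(getlocale_or_none(locale.LC_TIME))
```

The wrapper only hides the exception; the caller still has to cope with not knowing the language code or encoding.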

    @cypressyew cypressyew mannequin added the stdlib Python modules in the Lib dir label Feb 6, 2018
    @methane
    Member

    methane commented Feb 6, 2018

    @cypressyew
    Mannequin Author

    cypressyew mannequin commented Feb 6, 2018

    Yes, this is related to the language setting in Ubuntu: the locale should be set to zh_TW instead of lzh_TW when the time zone is set to Taipei.

    But even so, I think this bug is still valid, as the lzh_TW does not exist in the lib at all.

    @methane
    Member

    methane commented Feb 6, 2018

    But even so, I think this bug is still valid, as the lzh_TW does not exist in the lib at all.

    Python doesn't have its own locale database, although it does have some aliases.
    Python uses libc's locale.

    This exception is raised because _parse_localename doesn't support
    a locale name without an encoding.

    In the case of zh_TW, an alias is registered:

    'zh_tw':                                'zh_TW.big5',
    

    But I don't think adding lzh_tw to the alias table is a good idea.
    There is no "one right alias table". In the case of zh_tw, you may
    want zh_TW.UTF-8 rather than zh_TW.big5, don't you?

    So I think supporting locale names without an encoding is the right way.
    Maybe we should return None for the encoding in such a situation.
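
The proposal above (accept a locale name without an encoding and report the encoding as None) can be sketched as follows. This is only an illustration under stated assumptions: `parse_localename_lenient` is a hypothetical standalone function, it does not use or modify the locale module, and it ignores the alias table entirely.

```python
def parse_localename_lenient(localename):
    """Split 'lang_TERRITORY.encoding@modifier' into (language code, encoding).

    Hypothetical helper: returns None for the encoding when the name
    carries none, instead of raising ValueError.
    """
    # Drop a trailing modifier such as '@euro'.
    code = localename.split('@', 1)[0]
    if code in ('C', 'POSIX'):
        return None, None
    if '.' in code:
        lang, encoding = code.split('.', 1)
        return lang, encoding
    # No encoding in the name: report it as unknown rather than guessing.
    return code, None


print(parse_localename_lenient('lzh_TW'))
print(parse_localename_lenient('en_US.UTF-8'))
```

Returning None avoids picking between candidates such as big5 and UTF-8, at the cost of pushing the "unknown encoding" case onto the caller.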

    @cypressyew
    Mannequin Author

    cypressyew mannequin commented Feb 7, 2018

    Yes, I think you are right.
    Returning None sounds like a good approach to me, as we might have zh_TW translated but not lzh_TW.

    @methane
    Member

    methane commented Feb 15, 2018

    lzh_tw was added in this commit:
    bminor/glibc@5057e7c#diff-3d056472e12e5dc464fa44144719b82f

    I don't know why Python should have such a large locale alias table.

    I added Serhiy to the nosy list because he is the author of bpo-20079.

    Serhiy, what do you think about making UTF-8 the default charset and
    dropping all aliases like "xx_YY" -> "xx_YY.UTF-8"?

    @serhiy-storchaka
    Member

    I'm not sure that a wrong guess is better than an exception.

    It looks to me as though there is something wrong with the way we use the alias table. It is glibc-centric, but some entries contradict glibc because the X11 aliases take precedence. There are known issues on OS X. I think a different way of determining the locale encoding should be used.
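
One existing way to get the encoding without going through the alias table is to ask the platform directly. The sketch below is only one possible reading of the suggestion above, not necessarily the approach it had in mind; it uses `locale.getpreferredencoding(False)`, which reports the preferred text encoding without changing the process locale.

```python
import locale

# Ask the platform for the preferred text encoding instead of deriving it
# from the locale name. Passing False means setlocale() is not called.
encoding = locale.getpreferredencoding(False)
print(encoding)
```

This yields a single encoding for the process rather than one per locale category, so it does not replace getlocale() for callers that need the language code.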

    @serhiy-storchaka
    Member

    See also bpo-20087. It added the lzh_tw locale, but that change was later reverted. I am therefore closing this issue as a duplicate.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022