locale fails if LANGUAGE has multiple locales #41664

Closed
mixedpuppy mannequin opened this issue Mar 7, 2005 · 13 comments
Assignees
Labels
easy stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@mixedpuppy
Mannequin

mixedpuppy mannequin commented Mar 7, 2005

BPO 1158490
Nosy @malemburg, @vstinner, @serhiy-storchaka
Files
  • remove-support-for-LANGUAGE--in-locale.patch: removes LANGUAGE from envvars kwarg, adds tests
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/malemburg'
    closed_at = <Date 2017-03-07.19:25:11.996>
    created_at = <Date 2005-03-07.19:11:05.000>
    labels = ['easy', 'type-bug', 'library']
    title = 'locale fails if LANGUAGE has multiple locales'
    updated_at = <Date 2017-03-07.19:25:11.995>
    user = 'https://bugs.python.org/mixedpuppy'

    bugs.python.org fields:

    activity = <Date 2017-03-07.19:25:11.995>
    actor = 'serhiy.storchaka'
    assignee = 'lemburg'
    closed = True
    closed_date = <Date 2017-03-07.19:25:11.996>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2005-03-07.19:11:05.000>
    creator = 'mixedpuppy'
    dependencies = []
    files = ['18300']
    hgrepos = []
    issue_num = 1158490
    keywords = ['patch', 'easy']
    message_count = 13.0
    messages = ['24492', '24493', '24494', '24495', '24496', '24497', '24498', '24499', '24500', '112260', '125562', '221816', '228193']
    nosy_count = 9.0
    nosy_names = ['lemburg', 'ber', 'bernhard', 'sorlov', 'mixedpuppy', 'vstinner', 'meatballhat', 'BreamoreBoy', 'serhiy.storchaka']
    pr_nums = []
    priority = 'low'
    resolution = 'out of date'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue1158490'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @mixedpuppy
    Mannequin Author

    mixedpuppy mannequin commented Mar 7, 2005

    The locale module does not correctly handle the
    LANGUAGE environment variable if it contains multiple
    settings. Example:

    LANGUAGE="en_DK:en_GB:en_US:en"

    Note, en_DK does not exist in locale_alias

    In normalize, the colons are replaced with dots, which
    is incorrect. getdefaultlocale should first split the
    value on the colons, then try each entry in order until
    one works or all of them fail.

    GLIBC documentation:
    http://www.delorie.com/gnu/docs/glibc/libc_138.html

    "While for the LC_xxx variables the value should
    consist of exactly one specification of a locale the
    LANGUAGE variable's value can consist of a colon
    separated list of locale names."

    Testing this is simple: set your LANGUAGE
    environment variable to the above example and call
    locale.getdefaultlocale()

    > export LANGUAGE="en_DK:en_GB:en_US:en"
    > python
    ActivePython 2.4 Build 244 (ActiveState Corp.) based on
    Python 2.4 (#1, Feb  9 2005, 19:33:15)
    [GCC 3.3.1 (SuSE Linux)] on linux2
    Type "help", "copyright", "credits" or "license" for
    more information.
    >>> import locale
    >>> locale.getdefaultlocale()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/opt/ActivePython-2.4/lib/python2.4/locale.py",
    line 344, in getdefaultlocale
        return _parse_localename(localename)
      File "/opt/ActivePython-2.4/lib/python2.4/locale.py",
    line 278, in _parse_localename
        raise ValueError, 'unknown locale: %s' % localename
    ValueError: unknown locale: en_DK:en_GB:en_US:en
    >>>

    @mixedpuppy mixedpuppy mannequin assigned malemburg Mar 7, 2005
    @mixedpuppy mixedpuppy mannequin added the stdlib Python modules in the Lib dir label Mar 7, 2005
    @malemburg
    Member

    Logged In: YES
    user_id=38388

    The URL you gave does state that LANGUAGE can take multiple
    entries separated by colons. However, I fail to see how to
    choose the locale from the list of possibilities. Any ideas?

    @sorlov
    Mannequin

    sorlov mannequin commented Mar 10, 2005

    Logged In: YES
    user_id=1235914

    The docs for getdefaultlocale state that it follows the GNU
    gettext search path. OTOH gettext can return a result from any
    of the catalogs en_DK:en_GB:en_US:en, depending on the content
    of the message. So maybe getdefaultlocale should just pick
    up the first value from LANGUAGE?

    @mixedpuppy
    Mannequin Author

    mixedpuppy mannequin commented Mar 10, 2005

    Logged In: YES
    user_id=1234417

    IMHO the proper behaviour is to split on the colon, then try
    each one from start to finish until there is a success, or
    all fail. For example, if you just try en_DK, you will get
    a failure since that is not in locale.locale_alias, but
    en_GB or en_US would succeed.
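
The split-and-try strategy proposed in this comment could be sketched roughly as follows. The helper name `first_usable_locale` and its `is_known` predicate are hypothetical names introduced only for illustration; this is not how the locale module is implemented, and the default predicate (a lookup of the lowercased name in `locale.locale_alias`) is an assumption about what "works" should mean.

```python
import locale


def first_usable_locale(value, is_known=None):
    """Return the first colon-separated entry accepted by is_known, else None.

    Hypothetical helper for illustration; not part of the locale module.
    """
    if is_known is None:
        # Assumption: an entry counts as usable if its lowercased name
        # is a key of locale.locale_alias.
        def is_known(name):
            return name.lower() in locale.locale_alias

    # Walk the entries from left to right, as the comment suggests.
    for candidate in value.split(":"):
        if candidate and is_known(candidate):
            return candidate
    # Every entry was rejected (or the value was empty).
    return None
```

With a predicate that rejects en_DK, as the alias table did when this was reported, the call `first_usable_locale("en_DK:en_GB:en_US:en", is_known=lambda n: n in {"en_GB", "en_US"})` would skip the first entry and return `"en_GB"`.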

    @bernhard
    Mannequin

    bernhard mannequin commented Sep 26, 2005

    Logged In: YES
    user_id=2369

    Another consequence of this bug is that even if
    getdefaultlocale does not fail with an exception, it may
    return an invalid value for the encoding. E.g. one thuban
    user had

    LANGUAGE=pt_BR:pt_PT:pt

    getdefaultlocale did not raise an exception, but returned
    "pt_pt" as the encoding because the normalized form of the
    above value was pt_BR.pt_pt and the locale module assumes
    that the part after the "." is the encoding.
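
The misparse described in this comment can be reproduced with a deliberately simplified model. The function `naive_parse` below is hypothetical and is not the stdlib code; it only imitates the two steps named in the thread (colons replaced with dots, then the field after the first dot treated as the encoding), and the lowercasing is an assumption made to match the reported "pt_pt".

```python
def naive_parse(value):
    """Simplified model of the misparse described above (not the stdlib code)."""
    # Replacing colons with dots turns the locale list into one dotted name.
    flattened = value.lower().replace(":", ".")
    # The first dot-separated field is taken as the language, and the
    # second field is mistaken for the encoding.
    parts = flattened.split(".")
    language = parts[0]
    encoding = parts[1] if len(parts) > 1 else None
    return language, encoding


print(naive_parse("pt_BR:pt_PT:pt"))  # ('pt_br', 'pt_pt')
```

The second locale in the list ends up in the encoding slot, which is the invalid value the Thuban user saw.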

    @malemburg
    Member

    Logged In: YES
    user_id=38388

    The current CVS version returns this value:

    >>> import locale
    >>> locale.getdefaultlocale()
    (None, None)

    Given all the problems with the LANGUAGE environment variable
    (which is a gettext() only thing) I'm inclined to remove
    support for it altogether.

    @ber
    Mannequin

    ber mannequin commented Oct 16, 2005

    Logged In: YES
    user_id=113859

    Hi Marc-Andre,

    do you mean that the current CVS version will return (None, None)
    always or only for special LANGUAGE settings?

    I do not have an overview about other problems with the
    LANGUAGE variable (from gettext), but adding support
    for the proper parsing of the colons and the testing seems
    a good thing to do from my perspective.
    getdefaultlocale() will not get called often, and if additional
    information can be used from the LANGUAGE variable, this will be
    beneficial to the applications.

    Anyway,
    just my 0,02 Euro-Cents.

    Bernhard R.

    @malemburg
    Member

    Logged In: YES
    user_id=38388

    Hi Bernhard,

    sorry my last comment wasn't clear: you get this output if
    you set the LANGUAGE variable to the example you gave
    (LANGUAGE=pt_BR:pt_PT:pt).

    The parsing order was changed, so that LANGUAGE is no longer
    searched for first, but instead as last resort if the other
    locale variables are not set.
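
The changed search order described here can be sketched as below. The function `pick_locale_name` is a hypothetical illustration, not the module's code: the variable order follows this comment, while taking only the first colon-separated LANGUAGE entry follows the earlier suggestion in this thread and is an assumption, as is returning None when nothing is set.

```python
def pick_locale_name(environ, envvars=("LC_ALL", "LC_CTYPE", "LANG", "LANGUAGE")):
    """Return the first non-empty locale setting, consulting LANGUAGE last.

    Hypothetical sketch; `environ` is any mapping such as os.environ.
    """
    for name in envvars:
        value = environ.get(name)
        if not value:
            continue  # unset or empty: fall through to the next variable
        if name == "LANGUAGE":
            # Assumption: only the first entry of the colon-separated list
            # is considered.
            value = value.split(":")[0]
        return value
    return None
```

For example, `pick_locale_name({"LANG": "de_DE.UTF-8", "LANGUAGE": "pt_BR:pt_PT:pt"})` would return `"de_DE.UTF-8"`, because LANGUAGE is only reached when the other variables are unset.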

    @ber
    Mannequin

    ber mannequin commented Oct 17, 2005

    Logged In: YES
    user_id=113859

    Hi,

    using other information first seems to be a step forward to me.
    I just could not see this from the given example.

    But if LANGUAGE will be evaluated, will the colon be parsed correctly
    and the results tested?
    This seems to be the remainder of this bug.

    Bernhard R.

    @devdanzin devdanzin mannequin added type-bug An unexpected behavior, bug, or error labels Mar 20, 2009
    @devdanzin devdanzin mannequin added easy labels Apr 22, 2009
    @meatballhat
    Mannequin

    meatballhat mannequin commented Aug 1, 2010

    I first verified that the relevant parts of locale:getdefaultlocale have been unchanged since 2005-10-17.

    I'm adding a patch to remove default support for the LANGUAGE variable, along with tests asserting that values like 'en_DK:en_GB:en_US' raise ValueError and that reading the value from LC_ALL, LC_CTYPE, and LANG is still supported.

    None of the logic for normalizing candidate env vars has been changed, so the questions about how values like 'en_DK:en_GB:en_US' are handled all still apply -- I've just operated under the assumption that such values will continue to raise ValueError.

    @vstinner
    Member

    vstinner commented Jan 6, 2011

    The initial problem (":" in the LANGUAGE variable) was fixed in an independent (?) issue (bpo-1166938) by r39572.

    If I understood correctly, locale.getdefaultlocale() is supposed to give the locale settings that will be active after the first call to locale.setlocale(locale.LC_ALL, ''). In this case, LANGUAGE should be ignored because it has no effect on the active locale. The variable is specific to the gettext library; it is not used by the locale machinery.

    About remove-support-for-LANGUAGE--in-locale.patch: you should also update the documentation.

    @BreamoreBoy
    Mannequin

    BreamoreBoy mannequin commented Jun 28, 2014

    The documentation at https://docs.python.org/3/library/locale.html#locale.getdefaultlocale reads in part "envvars defaults to the search path used in GNU gettext; it must always contain the variable name 'LANG'.". I think this means that envvars should always contain 'LANG', even if the default is not used, but the code doesn't seem to need that. If somebody can clarify this for me I'll submit a new patch.

    @serhiy-storchaka
    Member

    It looks to me that this issue is already gone.

    >>> import os, locale
    >>> os.environ['LANGUAGE'] = 'en_DK:en_GB:en_US:en'
    >>> locale.getdefaultlocale(['LANGUAGE'])
    ('en_DK', 'ISO8859-1')

    'en_DK' was added in bpo-20079.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022