Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

locale.setlocale does not work with unicode strings #69928

Closed
tierlieb mannequin opened this issue Nov 27, 2015 · 13 comments
Closed

locale.setlocale does not work with unicode strings #69928

tierlieb mannequin opened this issue Nov 27, 2015 · 13 comments

Comments

@tierlieb
Copy link
Mannequin

tierlieb mannequin commented Nov 27, 2015

BPO 25742
Nosy @malemburg, @loewis, @terryjreedy, @vstinner, @ezio-melotti, @serhiy-storchaka
Files
  • setlocale_unicode.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2015-11-29.23:11:34.037>
    created_at = <Date 2015-11-27.14:54:09.182>
    labels = ['expert-unicode']
    title = 'locale.setlocale does not work with unicode strings'
    updated_at = <Date 2015-11-29.23:11:34.036>
    user = 'https://bugs.python.org/tierlieb'

    bugs.python.org fields:

    activity = <Date 2015-11-29.23:11:34.036>
    actor = 'Arfrever'
    assignee = 'none'
    closed = True
    closed_date = <Date 2015-11-29.23:11:34.037>
    closer = 'Arfrever'
    components = ['Unicode']
    creation = <Date 2015-11-27.14:54:09.182>
    creator = 'tierlieb'
    dependencies = []
    files = ['41175']
    hgrepos = []
    issue_num = 25742
    keywords = ['patch']
    message_count = 13.0
    messages = ['255461', '255481', '255494', '255496', '255498', '255501', '255503', '255504', '255506', '255507', '255514', '255574', '255576']
    nosy_count = 8.0
    nosy_names = ['lemburg', 'loewis', 'terry.reedy', 'vstinner', 'ezio.melotti', 'python-dev', 'serhiy.storchaka', 'tierlieb']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue25742'
    versions = ['Python 2.7']

    @tierlieb
    Copy link
    Mannequin Author

    tierlieb mannequin commented Nov 27, 2015

    Within locale.py in setlocale your have this piece of code:

        if locale and type(locale) is not type(""):
            # convert to string
            locale = normalize(_build_localename(locale))

    That does not work with unicode strings as I found out after wondering quite a bit about the difference was between my tests and my production code...

    So either expand the check here to include type(u"") or make _build_localename smarter.

    @tierlieb tierlieb mannequin added the topic-unicode label Nov 27, 2015
    @terryjreedy
    Copy link
    Member

    The doc for setlocale(category [,locale]) says "locale may be a string, or an iterable of two strings (language code and encoding)". The purpose of _build_locale is handle an iterable of two strings. This request looks like an enhancement request, which is not allowed for 2.7. I suspect that the locale locale module and doc predate the addition of unicode. I think this should be closed.

    @malemburg
    Copy link
    Member

    I wouldn't say this is a feature request.

    What the code wanted to check is "if this is an iterable of two strings, convert these to a locale string". I have no idea why the doc string uses "iterable". IMO, a tuple of two strings would have been fine and make the test case a lot simpler - too late to fix, though.

    If the code works with Unicode strings, I think we can change the test to:

    if locale and not isinstance(locale, basestring):
        ...

    In Python 3, the function will only accept Unicode strings, so no need to fix things there.

    @Tierlieb: Could you provide a patch with test for this ? Thanks.

    @vstinner
    Copy link
    Member

    I don't see the benefit of supporting Unicode strings for setlocale() arguments: locale name are always encodable to ASCII, so loc.decode('ascii') is enough to workaround the issue.

    But well, I think it's ok if it doesn't make the code much more complex ;-)

    I wrote a patch, what do you think? Is it worth it? ;-)

    @malemburg
    Copy link
    Member

    On 27.11.2015 23:11, STINNER Victor wrote:

    STINNER Victor added the comment:

    I don't see the benefit of supporting Unicode strings for setlocale() arguments: locale name are always encodable to ASCII, so loc.decode('ascii') is enough to workaround the issue.

    But well, I think it's ok if it doesn't make the code much more complex ;-)

    I wrote a patch, what do you think?

    Thanks :-)

    Is it worth it? ;-)

    I think so, since the current failure for Unicode is rather
    obscure.

    BTW: Why did you use (_str, _unicode) instead of basestring ?

    @vstinner
    Copy link
    Member

    BTW: Why did you use (_str, _unicode) instead of basestring ?

    Serhiy usually insists that technically, it's possible to compile Python 2.7 without Unicode support. I don't believe that anyone uses this crazy feature, but well, it was easier to use _unicode (which is already defined) than trying to run a poll on python users :-)

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 27, 2015

    New changeset 7841e9b614eb by Victor Stinner in branch '2.7':
    Closes bpo-25742: locale.setlocale() now accepts a Unicode string for its second
    https://hg.python.org/cpython/rev/7841e9b614eb

    @python-dev python-dev mannequin closed this as completed Nov 27, 2015
    @malemburg
    Copy link
    Member

    On 27.11.2015 23:50, STINNER Victor wrote:

    STINNER Victor added the comment:

    > BTW: Why did you use (_str, _unicode) instead of basestring ?

    Serhiy usually insists that technically, it's possible to compile Python 2.7 without Unicode support. I don't believe that anyone uses this crazy feature, but well, it was easier to use _unicode (which is already defined) than trying to run a poll on python users :-)

    Hmm, but basestring is always defined, even when Python is compiled
    without Unicode support (which I agree is not used much these
    days). unicode won't exist in such a Python version, so basestring
    is actually safer to use than the tuple.

    @vstinner
    Copy link
    Member

    Marc-Andre Lemburg added the comment:

    Hmm, but basestring is always defined, even when Python is compiled
    without Unicode support (...)

    Oh, I didn't know. Well, I already pushed my patch and it works. Feel
    free to modify locale.py to use basestring. I'm not interested to
    spend time on this *minor* issue anymore ;-)

    Thanks for the review.by the way.

    @malemburg
    Copy link
    Member

    On 28.11.2015 00:00, STINNER Victor wrote:

    STINNER Victor added the comment:

    Marc-Andre Lemburg added the comment:
    > Hmm, but basestring is always defined, even when Python is compiled
    > without Unicode support (...)

    Oh, I didn't know. Well, I already pushed my patch and it works. Feel
    free to modify locale.py to use basestring. I'm not interested to
    spend time on this *minor* issue anymore ;-)

    No big deal. There are probably lots more places in the stdlib which
    break without Unicode compiled in... :-)

    Thanks for the review.by the way.

    Thanks for the patch.

    @vstinner
    Copy link
    Member

    Marc-Andre Lemburg added the comment:

    No big deal. There are probably lots more places in the stdlib which
    break without Unicode compiled in... :-)

    Well, to have more fun, try to run any Python application with a
    Python compiled without Unicode support *and* without thread support
    :-D

    @serhiy-storchaka
    Copy link
    Member

    http://buildbot.python.org/all/builders/x86%20XP-4%202.7/builds/3517/steps/test/logs/stdio
    ======================================================================
    ERROR: test_setlocale_unicode (test.test_locale.TestMiscellaneous)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\lib\test\test_locale.py", line 497, in test_setlocale_unicode
        old_loc = locale.getlocale(locale.LC_ALL)
      File "d:\cygwin\home\db3l\buildarea\2.7.bolen-windows\build\lib\locale.py", line 565, in getlocale
        raise TypeError, 'category LC_ALL is not supported'
    TypeError: category LC_ALL is not supported

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 29, 2015

    New changeset d7481ebeaa4f by Victor Stinner in branch '2.7':
    Issue bpo-25742: Try to fix test_locale on Windows
    https://hg.python.org/cpython/rev/d7481ebeaa4f

    @Arfrever Arfrever mannequin closed this as completed Nov 29, 2015
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants