Gconvert() obeys LC_NUMERIC without "use locale" in 5.19.8 and 5.19.9 #13620

Closed
p5pRT opened this issue Feb 24, 2014 · 18 comments

Comments

@p5pRT
commented Feb 24, 2014

Migrated from rt.perl.org#121317 (status was 'resolved')

Searchable as RT121317$

@p5pRT

commented Feb 24, 2014

From calle@init.se

Created by calle@init.se

Commit bc8ec7c makes Gconvert() obey
the LC_NUMERIC environment variable even without "use locale" (or with
"no locale"). The behavior is the same on at least OSX 10.9, FreeBSD
10 and Debian 7. Gconvert() may not be listed in perlapi, but there is
at least one module on CPAN that uses it (which is why I found this),
and if this change is intentional it would be nice if it was at least
mentioned in perldelta.

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl 5.19.9:

Configured by called at Thu Feb 20 15:54:16 CET 2014.

Summary of my perl5 (revision 5 version 19 subversion 9) configuration:
   
  Platform:
    osname=darwin, osvers=13.0.0, archname=darwin-2level
    uname='darwin necronomicon-ii.local 13.0.0 darwin kernel version 13.0.0: thu sep 19 22:22:27 pdt 2013; root:xnu-2422.1.72~6release_x86_64 x86_64 '
    config_args='-de -Dprefix=/Users/called/perl5/perlbrew/perls/perl-5.19.9 -Dusedevel -Aeval:scriptdir=/Users/called/perl5/perlbrew/perls/perl-5.19.9/bin'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include',
    optimize='-O3',
    cppflags='-fno-common -DPERL_DARWIN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/5.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /usr/lib
    libs=-lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-ldl -lm -lutil -lc
    libc=, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector'



@INC for perl 5.19.9:
    /Users/called/perl5/perlbrew/perls/perl-5.19.9/lib/site_perl/5.19.9/darwin-2level
    /Users/called/perl5/perlbrew/perls/perl-5.19.9/lib/site_perl/5.19.9
    /Users/called/perl5/perlbrew/perls/perl-5.19.9/lib/5.19.9/darwin-2level
    /Users/called/perl5/perlbrew/perls/perl-5.19.9/lib/5.19.9
    .


Environment for perl 5.19.9:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/called
    LANG=sv_SE.UTF-8
    LANGUAGE (unset)
    LC_CTYPE=en_GB.UTF-8
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/Users/called/perl5/perlbrew/bin:/Users/called/perl5/perlbrew/perls/perl-5.19.9/bin:/Users/called/.gem/ruby/2.1.0/bin:/Users/called/.rubies/ruby-2.1.0/lib/ruby/gems/2.1.0/bin:/Users/called/.rubies/ruby-2.1.0/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/X11/bin:/Users/called/bin:/usr/local/sbin:/usr/local/opt/ruby/bin
    PERLBREW_BASHRC_VERSION=0.64
    PERLBREW_HOME=/Users/called/.perlbrew
    PERLBREW_MANPATH=/Users/called/perl5/perlbrew/perls/perl-5.19.9/man
    PERLBREW_PATH=/Users/called/perl5/perlbrew/bin:/Users/called/perl5/perlbrew/perls/perl-5.19.9/bin
    PERLBREW_PERL=perl-5.19.9
    PERLBREW_ROOT=/Users/called/perl5/perlbrew
    PERLBREW_VERSION=0.64
    PERL_BADLANG (unset)
    SHELL=/bin/bash

@p5pRT

commented Feb 27, 2014

From @khwilliamson

On 02/24/2014 07​:39 AM, (via RT) wrote​:

Commit bc8ec7c makes Gconvert() obey
the LC_NUMERIC environment variable even without "use locale" (or with
"no locale"). The behavior is the same on at least OSX 10.9, FreeBSD
10 and Debian 7. Gconvert() may not be listed in perlapi, but there is
at least one module on CPAN that uses it (which is why I found this),
and if this change is intentional it would be nice if it was at least
mentioned in perldelta.

I'm thinking we should revert the code changes in that commit. The
basic premise of this bug report is wrong, but the underlying issue may
still justify changing back. The commit did not change Gconvert in
any way. None of the bottom-level interfaces to libc depend on being
in the scope of 'use locale'. In other words, Gconvert only by
chance obeyed 'use locale' in the cases that this ticket mentions.
There were ways to get it to not obey 'use locale' before the commit.

That said, the commit does change behavior. I don't believe we are
under any real obligation to maintain the behavior of undocumented
features, and avoiding such changes has the deleterious effect of
encouraging people to use things they shouldn't. If someone wants to
use an undocumented feature, they should at least tell p5p that
they want it supported in the future, and see where that leads.

On the other hand, if we don't have to break existing code, then I don't
think we should. Since that commit was made, I was forced to come up
with a more general scheme for other reasons, and it turns out that
after reverting it, the code still passes all the tests it added, and
all others currently in the suite.

So I'm thinking it's best to revert.

@p5pRT

commented Feb 27, 2014

The RT System itself - Status changed from 'new' to 'open'

@p5pRT

commented Mar 7, 2014

From @khwilliamson

On Mon Feb 24 06​:39​:54 2014, calle@​init.se wrote​:

Commit bc8ec7c makes Gconvert() obey
the LC_NUMERIC environment variable even without "use locale" (or with
"no locale"). The behavior is the same on at least OSX 10.9, FreeBSD
10 and Debian 7. Gconvert() may not be listed in perlapi, but there is
at least one module on CPAN that uses it (which is why I found this),
and if this change is intentional it would be nice if it was at least
mentioned in perldelta.

I'm wondering if you have experienced Gconvert prior to 5.19 being affected by "use locale". My reading of the code indicates not. It appears to me that it generally used the dot for a decimal point no matter what the locale and regardless of "use locale". The one place I know of where it could use, say, a comma stemmed from calling POSIX::strtod() before it, as strtod didn't properly clean up after itself, and 'use locale' is irrelevant there. But if you know of cases other than this, I would appreciate hearing about them.
Karl Williamson

@p5pRT

commented Mar 10, 2014

From calle@init.se

On 7 mar 2014, at 21​:05, Karl Williamson via RT <perlbug-followup@​perl.org> wrote​:

I'm wondering if you have experienced Gconvert prior to 5.19 being affected by "use locale". My reading of the code indicates not. It appears to me that it generally used the dot for a decimal point no matter what the locale and regardless of "use locale". The one place I know of where it could use, say, a comma stemmed from calling POSIX::strtod() before it, as strtod didn't properly clean up after itself, and 'use locale' is irrelevant there. But if you know of cases other than this, I would appreciate hearing about them.

I haven’t seen that, no. But then I’ve never actually used locales in real-life code, since I’ve found them to contribute far more problems than value. My mentioning them in the problem report was more of a “this should definitely not happen without ‘use locale’” than a statement that it should happen with it. In retrospect, that may have been more confusing than useful.

--
Calle Dybedahl
calle@​init.se -*- +46 703 - 970 612

@p5pRT

commented Mar 13, 2014

From @eserte

Calle Dybedahl <calle@​init.se> writes​:

On 7 mar 2014, at 21​:05, Karl Williamson via RT <perlbug-followup@​perl.org> wrote​:

I'm wondering if you have experienced Gconvert prior to 5.19 being
affected by "use locale". My reading of the code indicates not. It
appears to me that it generally used the dot for a decimal point no
matter what the locale and regardless of "use locale". The one place I
know of where it could use, say, a comma stemmed from calling
POSIX::strtod() before it, as strtod didn't properly clean up after
itself, and 'use locale' is irrelevant there. But if you know of cases
other than this, I would appreciate hearing about them.

I haven’t seen that, no. But then I’ve never actually used locales in
real-life code, since I’ve found them to contribute far more problems
than value. My mentioning them in the problem report was more of a
“this should definitely not happen without ‘use locale’” than a
statement that it should happen with it. In retrospect, that may have
been more confusing than useful.

I don't know if it's related, but there are a number of CPAN modules which
now fail if a locale with "," for the decimal point is in effect. The
list of failing modules:
* Number::Format
* JSON::XS
* String::Print
* Cpanel::JSON::XS
* SHARYANTO::Number::Util
* Tk (if you happen not to get the segfault that occurs on some
  architectures)

The test suites of these modules fail if LC_NUMERIC is set to
de_DE.UTF-8 on a Linux system (Debian/squeeze) or on FreeBSD systems
(9.2 or 10.0).

Regards,
  Slaven

--
Slaven Rezic - slaven <at> rezic <dot> de
  BBBike - route planner for cyclists in Berlin
  WWW version: http://www.bbbike.de
  Perl/Tk version for Unix and Windows: http://bbbike.sourceforge.net

@p5pRT

commented Mar 16, 2014

From calle@init.se

On 13 mar 2014, at 18​:45, slaven@​rezic.de via RT <perlbug-followup@​perl.org> wrote​:

I don't know if it's related, but there's a number of CPAN modules which
fail now if a locale with "," for the decimal point is in effect. The
list of failing modules​:
* Number​::Format
* JSON​::XS

Failing to install JSON::XS was how I noticed this in the first place; I just tried to narrow it down as far as I could before reporting it.

--
Calle Dybedahl
calle@​init.se -*- +46 703 - 970 612

@p5pRT

commented Mar 16, 2014

From @pjcj

On Thu, Mar 13, 2014 at 06​:39​:40PM +0100, Slaven Rezic wrote​:

I don't know if it's related, but there's a number of CPAN modules which
fail now if a locale with "," for the decimal point is in effect. The
list of failing modules​:
* Number​::Format
* JSON​::XS
* String​::Print
* Cpanel​::JSON​::XS
* SHARYANTO​::Number​::Util
* Tk (if you happen to not get the segfault happening on some
architectures)

The test suites of these modules fail if LC_NUMERIC is set to
de_DE.UTF-8 on a Linux system (Debian/squeeze) or on FreeBSD systems
(9.2 or 10.0).

Just for information, Devel::Cover can be added to this list:

  http://www.cpantesters.org/cpan/report/71ecad34-ac83-11e3-86a6-6f69e0bfc7aa

--
Paul Johnson - paul@pjcj.net
http://www.pjcj.net

@p5pRT

commented Mar 16, 2014

From @khwilliamson

On 03/16/2014 03​:25 AM, Paul Johnson wrote​:

Just for information, Devel​::Cover can be added to this list​:

http​://www.cpantesters.org/cpan/report/71ecad34-ac83-11e3-86a6-6f69e0bfc7aa

Thanks Slaven and Paul. This is most helpful. Is there any guess as to
how complete these lists may be?

@p5pRT

commented Mar 17, 2014

From @pjcj

On Sun, Mar 16, 2014 at 08​:54​:37AM -0600, Karl Williamson wrote​:

Thanks Slaven and Paul. This is most helpful. Is there any guess
as to how complete these lists may be?

I only know about Devel​::Cover because of the cpantesters report from
Slaven. Fortuitously, he was sat in the same room as me when I read the
mail and he was able to diagnose the problem there and then.

Perhaps Slaven has an idea of how much of CPAN has passed through his
smokers since this problem started?

--
Paul Johnson - paul@pjcj.net
http://www.pjcj.net

@p5pRT

commented Mar 17, 2014

From @khwilliamson

On 03/16/2014 06​:01 PM, Paul Johnson wrote​:

I only know about Devel​::Cover because of the cpantesters report from
Slaven. Fortuitously, he was sat in the same room as me when I read the
mail and he was able to diagnose the problem there and then.

Perhaps Slaven has an idea of how much of CPAN has passed through his
smokers since this problem started?

First, I'd like to thank Calle for submitting this ticket, and doing the
initial leg work on it.

It is true that Gconvert is not listed in perlapi, so one might argue
that no module can rely on it even existing in a future Perl release,
much less on its behavior never changing. However, Gconvert
is described in Porting/Glossary, so I think this means it is effectively
part of the API. That description says nothing about locale effects on
it, though the man pages for the functions it wraps do.

I had been leaning towards reverting this commit, but after doing more
research and experimentation, I've come to the conclusion that these
modules were already broken, albeit much more rarely.

The premise of this ticket is incorrect. The commit did not change
whether Gconvert() obeys 'use locale' or not. It has never obeyed
'use locale'. What changed in 5.19 is that the LC_NUMERIC category
started to inherit from the environment variables that are in effect at
start-up, as the documentation says it does, and as all the other
locale categories do. The earlier failure to inherit the environment
caused pain for some people, who rightly filed a ticket, which 5.19 fixed.

I had thought that the only way to get Gconvert to use other than the C
locale was to use POSIX::strtod(), as that function changed the locale
to the underlying one unconditionally, and didn't change it back. But I
was wrong, even though I had worked on this code recently. Besides
strtod(), any call by any code anywhere to POSIX::setlocale() will set
the underlying locale to the new locale, and thus cause Gconvert() to
use that locale. (And BTW, strtod() has been fixed to restore the
locale afterwards.)

In other words, code using Gconvert cannot expect that the decimal point
is going to be a dot unless it has taken steps to ensure that. Code
that makes that assumption and fails to take such steps is buggy, and
the 5.19 changes merely expose these bugs.
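
As an illustration of the point above, here is a minimal C sketch that is not taken from the perl source. It assumes that Gconvert on a given build behaves like a `%g`-style `sprintf` (which libc function it maps to varies by platform), and the helper names `format_g` and `current_radix` are made up for this example. It shows that the radix comes from the process-wide LC_NUMERIC setting, which knows nothing about Perl's lexical 'use locale'.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a "%g"-based Gconvert: the value goes straight
 * through libc, which consults only the process-wide LC_NUMERIC locale. */
int format_g(double x, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%.15g", x);
}

/* Radix string libc uses for the LC_NUMERIC locale currently in effect. */
const char *current_radix(void)
{
    return localeconv()->decimal_point;
}
```

After a successful `setlocale(LC_NUMERIC, "de_DE.UTF-8")` anywhere in the process (where that locale is installed), `format_g(1.5, ...)` would be expected to produce `1,5`, whether or not any Perl code is in the scope of 'use locale'.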

A simple way to expose these bugs in earlier Perl releases is to add
the following line somewhere in a program that uses one of these modules:

  BEGIN { POSIX::setlocale(LC_NUMERIC, "de_DE.utf8"); }

(after making sure POSIX:: is loaded). I tried this with 5.18.0 and a
JSON::XS .t file, and sure enough, I get the same failures as with 5.19.9.

Now, it may be that JSON::XS is agnostic about the radix character: "if
the caller wants it to be a comma, fine; if it wants it to be a dot,
also fine." I don't know the semantics of this module well enough to know
what it should do.

It may also be that the caller does not do anything with locale itself,
and now in 5.19, it is unexpectedly experiencing the effects of the
user's locale. But the setlocale() that creates this failure in all
releases (including 5.18 and before) doesn't have to come directly from
the caller, it could be some other module down the dependency chain that
gets loaded. Thus this bug is lurking even if we revert the blamed commit.

This succinctly demonstrates the crux of the bug: JSON::XS shouldn't
introduce failures because of locale changes outside its control, but it
does.

One solution to this is to wrap Gconvert, as the core now does in its
uses of it, so as to make sure that the radix character is what it
should be. The problem is that this is XS code, and I'm not sure we
know what the writer's intentions are. I *think* that this is too low
level code to be making assumptions about that, but I'm open to
suggestions to the contrary.

My current thinking is to make the wrapping macros part of the API, and
tell XS writers they should use these when calling libc functions which
are affected by LC_NUMERIC. The macros save the current state, set it for
the duration of the call, and then restore it. Depending on the macro, the
temporary setting is one of: 1) whatever being inside or outside
'use locale' dictates; 2) "C"; 3) the current underlying locale.

Also, we could make a pre-wrapped Gconvert that uses the C locale when
called from outside 'use locale'. JSON​::XS and other modules could
convert to use this macro, depending on what they are trying to
accomplish. It could be backported through PPPort.
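
To make the save/set/restore idea concrete, here is a rough C sketch of a pre-wrapped formatter that pins the "C" locale for the duration of one call. It is not the perl.h macros mentioned above, whose names and details are not given in this thread. `format_g_dot` is a hypothetical name, and `%g`-style `snprintf` again stands in for Gconvert.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a Perl API: format x with a dot radix regardless
 * of the LC_NUMERIC locale in effect, then put that locale back.
 * Returns the snprintf result, or -1 if the locale could not be saved. */
int format_g_dot(double x, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    /* setlocale(cat, NULL) only queries. The returned string may be
     * overwritten by the next setlocale call, so copy it first. */
    const char *cur = setlocale(LC_NUMERIC, NULL);
    if (cur == NULL)
        return -1;
    char *saved = malloc(strlen(cur) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, cur);

    setlocale(LC_NUMERIC, "C");        /* "C" is always available */
    int n = snprintf(buf, len, "%.15g", x);
    setlocale(LC_NUMERIC, saved);      /* restore the caller's locale */

    free(saved);
    return n;
}
```

Note that `setlocale` changes process-wide state, so a wrapper like this is not safe if other threads format numbers at the same time.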

I haven't looked at all the modules in the list; I know that several of
them use Gconvert, but not all. The one I've looked at that
doesn't is Number::Format. It has a different set of problems, which
I'll comment on in a different message.

@p5pRT

commented Mar 19, 2014

From @khwilliamson

On 03/17/2014 02​:44 PM, Karl Williamson wrote​:

I haven't looked at all the modules in the list; I know that several of
them use Gconvert, but not all. The one that I've looked at that
doesn't, is Number​::Format. It has a different set of problems, which
I'll comment on in a different message.

And here is that message. I'm repeating a bit of the context of the
previous email, as I've added the maintainer of Number​::Format to the cc
list.

The tests in this module are buggy, and again the bug was exposed by the
change in 5.19 to have LC_NUMERIC be set from the environment variables in
effect at the time perl is started. As I said before, this change was
made in response to a bug report from people adversely affected by perl
not doing this previously. It brings the actual behavior of
LC_NUMERIC into line with its documented behavior, and with how all the
other locale categories behave.

The Number::Format tests do seem to assume that the locale is
inherited from the environment, as they attempt to set it to an
expected value with the following:

setlocale(&LC_ALL, 'en_US');

However, this is omitted from one file, format_bytes.t, and the return
code is not checked. On my machine, and I suspect on many other modern
Linux systems, there is no locale 'en_US', and the setlocale fails.
Prior to 5.19.9, this didn't matter, as LC_NUMERIC was set to the C
locale, ignoring the outside environment. The changes in 5.19 cause
LC_NUMERIC to be set to what the outside environment says it should be,
so the radix character may be a comma in the appropriate locales. That
exposes this bug, which is only in the test files. If I
change the 'en_US' to 'C', they pass (also adding the appropriate
setlocale call to the one file that is missing it). 'C' is the only
locale that is guaranteed to exist on all systems (that have locales).
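
The unchecked-return problem is easy to show at the C level, where `setlocale` returns NULL on failure and leaves the previous locale in place. The sketch below is illustrative only: the tests in question are Perl, and `set_numeric_or_c` is a hypothetical helper name.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Try to select `wanted` for LC_NUMERIC. If libc does not know that
 * locale, fall back to "C", the one locale that always exists.
 * Returns 1 if `wanted` took effect, 0 if the fallback was used. */
int set_numeric_or_c(const char *wanted)
{
    assert(wanted != NULL);
    if (setlocale(LC_NUMERIC, wanted) != NULL)
        return 1;                  /* success: `wanted` is now in effect */
    /* Failure leaves whatever locale was active before, which may have a
     * comma radix inherited from the environment, so set "C" explicitly. */
    setlocale(LC_NUMERIC, "C");
    return 0;
}
```

A test that needs a dot radix can call this once at the top and either proceed or skip, rather than assuming that a name like 'en_US' exists on the machine.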

In researching this, I found another bug, in locale.t. It explicitly
sets the thousands separator to a dot, but leaves the monetary thousands
separator unchanged, and then asks for a monetary format expecting a
dot. On my machine that separator is a space in that locale. I haven't
done the legwork to see if this is a bug in my machine's locales or not,
but it seems likely that if one has to set the regular separator, one
also has to set the monetary one.
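
For reference, libc itself keeps the two separators as independent fields: `thousands_sep` follows LC_NUMERIC and `mon_thousands_sep` follows LC_MONETARY, so nothing forces them to agree. The following C sketch only illustrates that separation. It says nothing about how Number::Format stores its own settings, and `thousands_seps_match` is a made-up helper name.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Returns 1 if the numeric and monetary thousands separators of the
 * locale currently in effect are the same string, 0 otherwise. */
int thousands_seps_match(void)
{
    const struct lconv *lc = localeconv();
    assert(lc != NULL);
    return strcmp(lc->thousands_sep, lc->mon_thousands_sep) == 0;
}
```

In a locale where this returns 0, code that overrides only the numeric separator and then formats a monetary amount can expect to see the locale's monetary separator instead.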

@p5pRT

commented Mar 20, 2014

From @rjbs

* Karl Williamson <public@​khwilliamson.com> [2014-03-19T16​:19​:29]

On 03/17/2014 02​:44 PM, Karl Williamson wrote​:

I haven't looked at all the modules in the list; I know that several of
them use Gconvert, but not all. The one that I've looked at that
doesn't, is Number​::Format. It has a different set of problems, which
I'll comment on in a different message.

And here is that message.

Thanks for both of these enlightening messages. Your reasoning seems sound to
me.

--
rjbs

@p5pRT

commented Mar 31, 2014

From @khwilliamson

On 03/30/2014 09​:58 PM, Karl Williamson wrote​:

On 03/30/2014 11​:38 AM, Slaven Rezic wrote​:

I suspect that every CPAN module using strtod/sprintf indirectly through
a shared library is broken.

I think you didn't understand my previous post on this
<53275EB2.4000809@​khwilliamson.com>. These modules were already broken;
it's just that their breakage didn't surface very often prior to the
blamed patch.

It's like the hash key order randomization change. Most modules that
"broke" as a result of the change were already broken. It's just that
their tests and typical usage didn't cause the hashes to grow enough to
cause an hsplit(), which, when it happens, causes the key order to
change, IIRC. The change, besides being necessary for security reasons,
did the maintainers a favor by exposing a problem that could
occasionally occur in the field and would be very hard to reproduce and
debug.

In my post on this, I show how to easily get the same breakage symptoms
on earlier Perl releases as the blamed commit gives in 5.19.

The blamed commit is not necessary for security, so we as a project
might decide that it's not worth fixing these bugs, and to permanently
revert the patch, documenting the issue. But that is very different
from the idea that this patch "broke" modules, and I believe it's
important to keep that distinction in mind when making whatever decision
gets made.

"The truth shall set you free, but first it will make you miserable"
-- origin disputed, often (mis-)attributed to U.S. president James
Garfield, who BTW came up with an original proof of the
Pythagorean theorem

Having thought about this a little more, I have yet another idea:

1) Revert the commit for 5.20.

2) In 5.21, change POSIX::setlocale() so that it always leaves the
LC_NUMERIC locale as "C", but sets an interpreter variable (which
already is done BTW) to indicate what locale to use when doing
LC_NUMERIC operations within the scope of "use locale". The core
already wraps all such operations it performs (unless I've missed any)
so that it uses the correct locale based on that flag, and "use locale"
scope. Thus pure perl code is unaffected.

3) This would mean that all libc calls from XS would normally get a dot
radix, and the modules that Slaven has given would be automatically
fixed from those bugs I said existed in 5.18 and earlier.

4) There are undoubtedly XS modules that depend on the radix not being
dot when a setlocale asking for such is executed. But judging from the
responses here, these are far fewer than those that always want a dot.
By doing the change very early in 5.21, we find out for sure. Anyway,
such modules would have to save/set/restore LC_NUMERIC around their
non-dot need. There are macros in perl.h that manage this for you.
Some of these have been around and been used by the core and POSIX::XS
since at least 1996, v5.003.

5) POSIX::strtod() has for a long time assumed that the LC_NUMERIC
locale was possibly wrongly C, and used the macros to change it to what
the interpreter variable says it should be (but there was a bug until
5.19 in which it failed to change it back). There are other POSIX::
functions that should do the same. Maybe only localeconv().

This would probably mean we wouldn't deprecate Gconvert, but as I
suggested earlier in the [perl #121317] thread, we document those macros
(for the first time).
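
A rough C sketch of the scheme in points 2) to 4) follows. It is a simplification under stated assumptions, not the interpreter's actual code: `wanted_numeric` is a hypothetical stand-in for the interpreter variable, `request_numeric` and `format_g_localized` are made-up names, and the process is assumed to be in the "C" locale for LC_NUMERIC on entry.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the interpreter variable described above. */
static char wanted_numeric[128] = "C";

/* Record the requested locale but leave the process in "C".
 * Returns 1 if the name is usable, 0 if it is too long or libc rejects it. */
int request_numeric(const char *name)
{
    assert(name != NULL);
    if (strlen(name) >= sizeof wanted_numeric)
        return 0;
    if (setlocale(LC_NUMERIC, name) == NULL)
        return 0;                      /* unknown locale: keep old record */
    strcpy(wanted_numeric, name);
    setlocale(LC_NUMERIC, "C");        /* everyone else keeps seeing "C" */
    return 1;
}

/* Format with the recorded locale, switching to it only for this call.
 * This is the save/set/restore that a module needing a non-dot radix
 * would have to do around its own libc calls. */
int format_g_localized(double x, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    setlocale(LC_NUMERIC, wanted_numeric);
    int n = snprintf(buf, len, "%.15g", x);
    setlocale(LC_NUMERIC, "C");
    return n;
}
```

The design choice is that the default is the safe one: code that never thinks about locales gets a dot, and only code that explicitly asks for the recorded locale sees a comma.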

@p5pRT

commented Apr 1, 2014

From @khwilliamson

Changed by commit 52686f2
--
Karl Williamson

@p5pRT p5pRT closed this Apr 1, 2014
@p5pRT

commented Apr 1, 2014

@khwilliamson - Status changed from 'open' to 'resolved'

@p5pRT

commented Jun 1, 2014

From kaffeetisch@gmx.de

According to 'git bisect', commit 52686f2 broke perl's version parsing
when Gtk3 is loaded:

# perl -e'use Gtk3; BEGIN{ Gtk3::init (); } use 5.8.0;' && echo OK
Invalid version format (non-numeric data) at -e line 1.

Indirectly, this causes failures in Gtk3's test suite.

A similar bug was reported in
<https://rt.perl.org/Public/Bug/Display.html?id=120723> and fixed by
commit bc8ec7c, which the new commit 52686f2 partly reverts.

Gtk3::init is a wrapper around the C function gtk_init which for this
bug boils down to setlocale (LC_ALL, ""), see
<https://git.gnome.org/browse/gtk+/tree/gtk/gtkmain.c#n622>.

However, in contrast to the situation prior to commit bc8ec7c, I'm now
unable to reproduce the problem with POSIX::setlocale alone:

# perl -e'use POSIX qw/locale_h/; BEGIN{ setlocale (LC_ALL, ""); } use 5.8.0;' && echo OK
OK

My locale environment is​:

LANG=en_US.UTF-8
LANGUAGE=en_US:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=de_DE.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_ALL=

@p5pRT

commented Jun 2, 2014

From @khwilliamson

On 06/01/2014 08​:48 AM, Torsten Schoenfeld wrote​:

According to 'git bisect', commit
52686f2 broke perl's version parsing
when Gtk3 is loaded​:

# perl -e'use Gtk3; BEGIN{ Gtk3​::init (); } use 5.8.0;' && echo OK
Invalid version format (non-numeric data) at -e line 1.

Indirectly, this causes failures in Gtk3's test suite.

A similar bug was reported in
<https://rt.perl.org/Public/Bug/Display.html?id=120723> and fixed by
commit bc8ec7c, which the new commit
52686f2 partly reverts.

Gtk3​::init is a wrapper around the C function gtk_init which for this
bug boils down to setlocale (LC_ALL, ""), see
<https://git.gnome.org/browse/gtk+/tree/gtk/gtkmain.c#n622>.

However, in contrast to the situation prior to commit
bc8ec7c, I'm now unable to reproduce
the problem with POSIX​::setlocale alone​:

# perl -e'use POSIX qw/locale_h/; BEGIN{ setlocale (LC_ALL, ""); } use
5.8.0;' && echo OK
OK

My locale environment is​:

LANG=en_US.UTF-8
LANGUAGE=en_US​:en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LC_COLLATE="en_US.UTF-8"
LC_MONETARY=de_DE.UTF-8
LC_MESSAGES="en_US.UTF-8"
LC_PAPER=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_ADDRESS=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
LC_ALL=

Please see the discussion of
https://rt-archive.perl.org/perl5/Ticket/Display.html?id=121930

I think that the branch at
  http://perl5.git.perl.org/perl.git/shortlog/refs/heads/smoke-me/khw-locale

should fix this, and I would appreciate it if you would try it out.
