Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

%g formatting broken on Ubuntu-18.04, NVSIZE == 8 #18170

Closed
sisyphus opened this issue Sep 28, 2020 · 43 comments
Closed

%g formatting broken on Ubuntu-18.04, NVSIZE == 8 #18170

sisyphus opened this issue Sep 28, 2020 · 43 comments

Comments

@sisyphus
Copy link
Contributor

This issue is found with perl-5.32.0, all the way back to perl-5.8.9 (AFAICT), there being no such issue on perl-5.6.2
Same problem exists on my Debian Wheezy system, but Windows (Windows 7) and FreeBSD (freebsd-12.0) are fine.
I'm interested in fixing this - I would appreciate any hints on whereabouts in the perl source I should start looking.
For this report, I'll stick to version 5.32.0 and Ubuntu-18.04.

Description

"%.${digits}g" formatting is broken for $digits > 17, $Config{nvsize} == 8.
A quick check indicates that this type of issue does NOT occur with -Duselongdouble and -Dusequadmath builds of perl.

Steps to Reproduce

For example:
$ perl -le 'printf "%.54g\n", 0.3
0.29999999999999999
$ perl -le 'printf "%.53e\n", 0.3;'
2.99999999999999988897769753748434595763683319091796875e-01

Expected behavior

As you can see, the "%.53e" formatting is fine and produces the correct result.
I expected the "%.54g" formatting to produce:

0.299999999999999988897769753748434595763683319091796875

but, either my expectation is way out of line, or the %g formatting is severely crippled.
The significance of the correct output is that it exactly represents (as a base 10 number) the actual value held by the double 0.3.

The "%.54g" formatting works fine in C, using the compiler that built this perl-5.32.0.
Given that the capability for such extended formatting already exists, I'm hopeful that it can be provided to "%g" without too much difficulty.

Cheers,
Rob

Perl configuration

# perl -V output goes here

Summary of my perl5 (revision 5 version 32 subversion 0) configuration:

Platform:
osname=linux
osvers=4.15.0-106-generic
archname=x86_64-linux
uname='linux sisyphus5-desktop 4.15.0-106-generic #107-ubuntu smp thu jun 4 11:27:52 utc 2020 x86_64 x86_64 x86_64 gnulinux '
config_args='-des -Duse64bitall -Dprefix=/home/sisyphus/perl-5.32.0-d'
hint=recommended
useposix=true
d_sigaction=define
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
bincompat5005=undef
Compiler:
cc='cc'
ccflags ='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
optimize='-O2'
cppflags='-fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
ccversion=''
gccversion='7.5.0'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags =' -fstack-protector-strong -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/7/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib
libs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lpthread -lnsl -ldl -lm -lcrypt -lutil -lc
libc=libc-2.27.so
so=so
useshrplib=false
libperl=libperl.a
gnulibc_version='2.27'
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=so
d_dlsymun=undef
ccdlflags='-Wl,-E'
cccdlflags='-fPIC'
lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl):
Compile-time options:
HAS_TIMES
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
USE_64_BIT_ALL
USE_64_BIT_INT
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
Built under linux
Compiled at Jun 25 2020 19:42:19
@inc:
/home/sisyphus/perl-5.32.0-d/lib/site_perl/5.32.0/x86_64-linux
/home/sisyphus/perl-5.32.0-d/lib/site_perl/5.32.0
/home/sisyphus/perl-5.32.0-d/lib/5.32.0/x86_64-linux
/home/sisyphus/perl-5.32.0-d/lib/5.32.0

@khwilliamson
Copy link
Contributor

khwilliamson commented Sep 28, 2020 via email

@sisyphus
Copy link
Contributor Author

Thanks Karl - that sent me to the right place.
The problem seems to arise at line 13118 in sv.c where we find a:
SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis)

which, AFAICT, equates to:
sprintf(ebuf, "%.*g", 54, (double)fv)

The size of ebuf is 127, precis is set to 54 and fv is a long double representation of the value (0.3 in my example).
The buffer ebuf is being set to the incorrect value of 0.29999999999999999, but I don't yet know why.

In a C script I ran:

char ebuf[127];
long double fv = 0.3L;
int precis = 54;
sprintf(ebuf, "%.*g", precis, (double)fv);
printf("%s\n", ebuf);

and it produced the expected value of
0.299999999999999988897769753748434595763683319091796875

It's getting late ... I'll see what tomorrow brings.

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

I misread the definition of SNPRINTF_G.
(I looked at the definition of Gconvert in uconfig.h - I should have looked at the definition in the generated config.h.)

SNPRINTF_G(fv, ebuf, sizeof(ebuf), precis)
actually equates to:
gcvt((double)fv, precis, ebuf)

and it's the gcvt function that's crippled.
If sprintf() was used instead, then all would be as I like it.

I doubt that there will be any interest (apart from mine) in getting this fixed.
Is there a configure option I can use when building perl on Ubuntu-18.04 which will lead to gcvt() being replaced by sprintf() ?
Or do I need to hack the source ?

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

In the belief that this might not be the best forum on which to ask the questions I've just asked, I've also raised them at:
https://www.perlmonks.org/?node_id=11122363

@khwilliamson
Copy link
Contributor

khwilliamson commented Sep 30, 2020 via email

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 1, 2020

Over on perlmonks (https://www.perlmonks.org/?node_id=11122363) Max and Merijn have suggested that fixing compline/d_gconvert.U is the way to go.
That's a murky little backwater that I've not swum in before - I didn't even know it existed.
Taking into account the point that the linux man page recommends sprintf(), maybe all that's needed is to remove the gcvt() option from that file altogether ?
Could we just do away compile/d_gconvert completely and just define Gconvert to always use sprintf ?

Max also came up with a -Dd_Gconvert='sprintf((b),"%.*g",(n),(x))' configure option that seems to do the job, and doesn't break the test suite.
However, it did create some noise during compilation - see https://www.perlmonks.org/?node_id=11122370 .

And I'm still a little puzzled as to how freebsd-12.0 manages to do things correctly.
According to its config.h, Gconvert uses gcvt(), but an Inline::C script that attempts to use gcvt() directly complains that "gcvt" is not found in any of the header files, nor is it locatable in the .so.

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 2, 2020

The Gconvert macro is used only in 2 places in the perl source - once in sv,c (inside the SNPRINTF_G macro), and once (explicitly) in cpan/Win32API-File/const2perl.h.
My intention is to remove both of those occurrences by replacing them with explicit sprintf() calls.

Before I go any further, I'll spend a bit of time doing some testing of this with perl-5.33.2 builds on both Windows and Linux, for the variety of nvtypes.

Ultimately, it would be nice to then do away completely with the Gconvert macro ... but probably best not to get too far ahead of myself.

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 6, 2020 via email

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 6, 2020

I can't see the 3 patches.
Probably unsupported file type.
Attaching them here with a '.txt. extension instead of the '.patch' extension
sprintf2.t.txt
sv.c.txt
sv.c_with_gconvert_fixed.txt

.

@hvds
Copy link
Contributor

hvds commented Oct 6, 2020

I'm not convinced the changes to the conditions are safe: we have calculated the width of the radix point and rolled that into float_need, but you no longer include that in the checks.

I tried to create a locale with an extra-wide radix point for testing, but on this Ubuntu system while locale(5) documents decimal_point only as "string", locale-gen(1) actually restricts it to a single character. I imagine not every OS is so restrictive, but that means I cannot readily test it with anything wider than a single Unicode codepoint.

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 7, 2020 via email

@hvds
Copy link
Contributor

hvds commented Oct 7, 2020

Would this shortcoming in my changes be satisfactorily addressed by changing the condition:
&& sizeof(ebuf) - precis > 10
to
&& sizeof(ebuf) - precis > 9 + SvCUR(PL_numeric_radix_sv)

No, we need to go through hoops to get to it: act only #ifdef USE_LOCALE_NUMERIC; then safely swap in the actual locale (under mutex if needed), get it, then safely swap back (either immediately or later - but one of the two without fail).

This is what the #ifdef USE_LOCALE_NUMERIC block just after the initialization of float_need is doing.

or to simply: && sizeof(ebuf) - precis > 17 /* allow for worst case radix point of 8 */

This is what I was referring to in my previous comment: I do not think it is safe to assume that every platform will guarantee that the radix point string is restricted to a single codepoint, though it appears to be doing its best to guarantee it on my Ubuntu system. I think we need to assume we may need to cope with a radix point like "<-----------------decimal point here------------------>", or at the very least a combining form that requires multiple codepoints.

I think the optimizer considers that the value of float_need is 0 ... but I'm not sure.

Yes, it must treat the float_need += SvCUR(PL_numeric_radix_sv) - 1 as "modify it by some arbitrary amount from a variable", so worst case that would leave it as 0 - there's no intrinsic reason it couldn't be trying to add STRLEN_MAX - 34.

It's possible wrapping full range checks around that addition would also be sufficient to make the warning go away, I might give that a try tomorrow.

@khwilliamson
Copy link
Contributor

khwilliamson commented Oct 7, 2020 via email

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 7, 2020 via email

@khwilliamson
Copy link
Contributor

khwilliamson commented Oct 7, 2020 via email

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 7, 2020 via email

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 8, 2020

This is becoming insane.
All along I've been using perl-5.33.2 source and I've only just now discovered that the overflow warning can be avoided simply by removing the condition:
&& float_need < sizeof(ebuf)

With just that one condition removed, I can then replace the SNPRINTF_G() call with:
PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (double)fv))

and no overflow warning is emitted.
Not sure why I didn't realize that earlier. I've just updated to Ubuntu-20.04 yesterday ... I don't know whether that might have altered something.

I don't really know why that condition is in there to begin with.
The conditions:
&& precis
&& sizeof(ebuf) - float_need > precis

effectively guarantee that float_need < sizeof(ebuf), don't they ?

Anyway, although this apparently fixes my original issue, it doesn't stop the premature fallthrough - if we think that's worth fixing.

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 8, 2020

Currently testing with the attached (sv.c2.txt) patch to sv.c.
It is patched against the version of sv.c that shipped with perl-5.33.2.
So far, so good - no overflow warnings and correct results.

All it does is:

  1. remove the "&& float_need < sizeof(ebuf)" condition;
  2. replace the "&& sizeof(ebuf) - float_need > precis" condition with:
    && sizeof(ebuf) - (float_need - 25) > precis,
    which means we don't fall through this block of code while precis <= 116 ..... so there's still a 3-byte safety net.
  3. replace the SNPRINTF_G() call with PERL_UNUSED_RESULT(sprintf(....))

Any obvious problems with that ?
AFAICT, (float_need - 34) should give us the bytesize of the radix point.
At least, I don't see that it can be less than the bytesize of the radix point, and if it's larger than the bytesize of the radix point, then we just fall through that block of code before it's actually necessary.

sv.c2.txt

@hvds
Copy link
Contributor

hvds commented Oct 8, 2020

Any obvious problems with that ?

Well, mainly the justification.

  1. remove the && float_need < sizeof(ebuf) condition

Following sizeof(ebuf) - float_need ... checks would be invalid if float_need < sizeof(ebuf) is not true. I don't think it's a good idea to remove it.

If removing that check is what makes the gcc warning go away, that's clearly a bug in gcc: making it less restrictive cannot possibly make it suddenly safe enough not to need a warning. If that really is the case, we'd be better off disabling the warning for affected versions of gcc.

I don't really know why that condition is in there to begin with.
The conditions: && precis && sizeof(ebuf) - float_need > precis effectively guarantee that float_need < sizeof(ebuf), don't they ?

No, not if float_need + precis overflows.

  1. replace the && sizeof(ebuf) - float_need > precis condition with && sizeof(ebuf) - (float_need - 25) > precis

We've calculated float_need to be worst case what we need. If worst case is actually float_need - 25, our calculation is wrong and we should fix it rather than applying a magic-number adjustment inside the test.

Fixing it would mean understanding what part of it is wrong, and correcting that part.

  1. replace the SNPRINTF_G() call with PERL_UNUSED_RESULT(sprintf(....))

This is the part I know least about. I believe that historically there were lots of buggy sprintf() implementations out there, and I'm reasonably sure that's why we have this SNPRINTF_G macro in the first place. If those buggy implementations are still out there, this will break for them without any easy workaround.

On the other hand, fixing it by updating configure or the hints files to make a different choice would give users of such hypothetical platforms a relatively easy workaround - the same one you found was available to you, by supplying a configure option.

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 9, 2020

The way I'm seeing it, in the original sv.c we have
&& float_need < sizeof(ebuf)
which can be rewritten as
&& sizeof(ebuf) - float_need > 0
This is followed by the more restrictive:
&& sizeof(ebuf) - float_need > precis
If an expression needs to greater than precis, why waste time by first checking to see that it's greater than zero ?
I think gcc regards the less restrictive && float_need < sizeof(ebuf) as susceptible to buffer overflow and issues the warning.
But it is satisfied that the more restrictive && sizeof(ebuf) - float_need > precis removes the possibility of overflow.
I am assuming that precis is greater than 0, and apparently gcc is satisfied that precis is greater than 0.

But then ... it all turns out to be very flaky.
With the condition float_need - 26 < sizeof(ebuf) the overflow warning is emitted. (Need to subtract at least 27 to avoid the warning.)
But change that to the equivalent 0 < sizeof(ebuf) - (float_need - 26) and there's no warning.
(Seems odd ... and I did test it a few times thinking I must have been doing something dumb ... maybe I was.)

Unless there really is a need (that I'm still too dim to notice) for the condition && float_need < sizeof(ebuf) then I think it should be removed, as it is the condition that's responsible for the overflow warning.
The other conditions could stay as they are.
I'm not too concerned about falling through that block if
precis > 92 - width_of_radix_point
even though there's no real need to fall through unless
precis > 117 - width_of_radix_point (or thereabouts).

As to the removal of that SNPRINTF_G() call, we know it's not doing the right thing on Ubuntu and (probably) Debian because of the buggy gcvt() that Gconvert invokes.
AFAICT, fixing it involves either hard coding the sprintf() call, or fixing the Configure process to not assign these buggy gcvt() implementations to Gconvert.

Perhaps the latter can be achieved simply by adding an appropriate test to the probing that the Configure script is already doing.
I dunno ... I'll have a play around with the Configure script and see if I can learn somethng.

So much work, for so little gain - the story of my life ;-)

BTW, I think that when I first tried removing flloat_need from the conditions, I removed it from both places.
That is I made 2 changes at once - and I think that's how I came to initially misdiagnose the source of the overflow warnings.

@hvds
Copy link
Contributor

hvds commented Oct 9, 2020

The way I'm seeing it, in the original sv.c we have
&& float_need < sizeof(ebuf)
which can be rewritten as
&& sizeof(ebuf) - float_need > 0
This is followed by the more restrictive:
&& sizeof(ebuf) - float_need > precis
If an expression needs to greater than precis, why waste time by first checking to see that it's greater than zero ?

If float_need + precis overflows, and wraps for example to 0, then sizeof(ebuf) - float_need > precis will be true even though we don't have room in the buffer to store float_need bytes followed by precis bytes.

It seems my earlier attempts to communicate this were unsuccessful, so if what I'm saying still isn't clear please try to point out what part of what I'm saying is not clear.

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 9, 2020

Yes - I just needed to remind myself that x - y is always >= 0 in unsigned land irrespective of the relative magnitudes of x and y. (Pathetic, I know.)
Casting to IV seemed to distract gcc sufficiently as to avoid any warnings.
That is, from the original version of sv.c I replaced:
&& float_need < sizeof(ebuf)
with
&& (IV)float_need < (IV)sizeof(ebuf)

Not sure whether that meets with approval.
Obviously it could be unsatisfactory if we need to allow for the highest bits of float_need and/or the size_t returned by sizeof(ebuf) being set.
But it would be scary to think that either float_need or sizeof(ebuf) could exceed IV_MAX.

Apart from that, the only other change I made was to replace the SNPRINTF_G call with:
PERL_UNUSED_RESULT(sprintf(ebuf, "%.*g", (int)precis, (double)fv))

As I said, that was sufficient to avoid compilation warnings,
I'll do the testing later on tonight, unless I get a message that I'd only be wasting my time.

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

That last suggestion doesn't even sit well with my own cavalier standards.
A better way is to simply reverse the order of the conditions and do:

&& sizeof(ebuf) - float_need > precis
&& float_need < sizeof(ebuf)

Normally one would run the validity check as a pre-check, and it's a bit unorthodox to run a validity check after the event - but I believe the subtraction is well defined, irrespective of the relative magnitudes of the 2 operands.
AFAICS, the only (minor) downside is that running those checks in that order means that there's no point in checking the first condition when the second is going to fail.

I tried a few other sneaky ways to test whether the '<' (or the '<=') condition held, like:
&& sizeof(ebuf) / float_need
and
&& !(float_need / sizeof(ebuf)
but I still got warnings if those checks were run as pre-checks.
One approach that did avoid the warning when run as a pre-check was:
sizeof(ebuf) % float_need != sizeof(ebuf)
but not
&& float_need % sizeof(ebuf) == float_need

For my own builds of perl, I think I'll just suppress that warning by changing the order of the existing conditions, and by providing the -Dd_Gconvert=sprintf(....) argument to the Configure command - at least until such time as I can get the Configure script to do the right thing.

I did manually add a test to the Configure script which resulted in sprintf() being assigned to Gconvert.
All seemed ok with that until I noticed that the XS-APITest compilation then issues another -Wformat-overflow warning.
So, there's something I have to look at there, too.

@hvds
Copy link
Contributor

hvds commented Oct 10, 2020

A better way is to simply reverse the order of the conditions and do:
&& sizeof(ebuf) - float_need > precis && float_need < sizeof(ebuf)

That looks fine to me, if accompanied with a comment explaining why.

One approach that did avoid the warning when run as a pre-check was:
sizeof(ebuf) % float_need != sizeof(ebuf)

Ditto.

Note that there's also some value in trying to cut this down to a minimal reproducer that could be reported to the gcc team - fixing the warning there would be a better solution all round.

Similarly, I hope that you will report your buggy gcvt() if you haven't already done so.

@sisyphus
Copy link
Contributor Author

Bug report about the gcvt/qgcvt behaviour has (now) been submitted:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97365

I also have a demo for a bug report regarding the -Wformat-overflow warnings:
##################################
#include <stdio.h>

void foo(double, unsigned int);

int main(void) {
double d = 5.1;
unsigned int precis = 15;

foo(d, precis);
}

void foo(double dub, unsigned int prec) {
char buf[127];
if(
prec < sizeof(buf) &&
sizeof(buf) - prec > 10
){
sprintf (buf, "%.*g", prec, dub);
printf("%s\n", buf);
}
}
##################################
I'm finding that either -O2 -Wall or -O1 -Wall is enough switches to provoke the warning on Ubuntu-20.04, gcc-9.3.0 and on Windows 7 with gcc-8.3.0.

The warning, and all of its associated noise, disappears as soon as the line prec < sizeof(buf) && is removed from the script.
(Reversing the order of the 2 conditions did not suppress the warning this time - but I think the program provides a clear demonstration of the bug, anyway.)

Before I file the report, are there any thoughts about the script's fitness for submission in a bug report ?
Have I missed something significant ?

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

I filed the gcvt and qgcvt bug reports with gcc, but they are disavowing any responsibility, stating:
"This isn't a problem in the compiler but the used C library. Please report the issue there".

Anyone here know exactly where "there" is located ?

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

As I mentioned, when Gconvert uses spritnf() on Ubuntu, I get similar warnings in XS-APItest as experienced with sv.c:

APItest.c: In function ‘XS_XS__APItest__Magic_test_Gconvert’:
../../config.h:909:39: warning: ‘%.*g’ directive writing between 1 and 106 bytes into a region of size 100 [-Wformat-overflow=]


This turns out to be a little more straightforward. (See attached APItest.xs.patch.txt .)
It's just a matter of replacing if(len > 99)with if(len > 92).
Actually "len > 93" was sufficient to suppress the warning ... but not quite right for double precision.
Apparently the number of bytes consumed by the radix point is not an issue here.

There was also another warning in XS-APItest:

APItest.xs: In function ‘blockhook_csc_start’:
APItest.xs:380:23: warning: comparison of integer expressions of different signedness: ‘I32’ {aka ‘int’} and ‘size_t’ {aka ‘long unsigned int’} [-Wsign-compare]

The patch takes care of that, too - by casting the 'long unsigned int' to I32.

And I've included a revised patch to sv.c (sv.c.patch.txt) in accordance with latest suggestions.
It doesn't feel to me like it's a robust way to avoid the warnings, though it seems to be doing that just fine.

APItest.xs.patch.txt
sv.c.patch.txt

@hvds
Copy link
Contributor

hvds commented Oct 12, 2020

I filed the gcvt and qgcvt bug reports with gcc, but they are disavowing any responsibility, stating:
"This isn't a problem in the compiler but the used C library. Please report the issue there".

Anyone here know exactly where "there" is located ?

% ldd `which perl` | grep libc
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fef37dde000)
% dpkg -S /lib/x86_64-linux-gnu/libc.so.6
libc6:amd64: /lib/x86_64-linux-gnu/libc.so.6
% 

=> https://packages.ubuntu.com/search?keywords=libc6
=> https://packages.ubuntu.com/bionic/libc6
=> [bug reports] https://launchpad.net/ubuntu/+source/glibc/+bugs

You'll need a launchpad account; when I posted a recent issue for the locale package, they asked me to post it upstream (but pointed me where to go for that).

@hvds
Copy link
Contributor

hvds commented Oct 12, 2020

As I mentioned, when Gconvert uses spritnf() on Ubuntu, I get similar warnings in XS-APItest as experienced with sv.c:
APItest.c: In function ‘XS_XS__APItest__Magic_test_Gconvert’: ../../config.h:909:39: warning: ‘%.*g’ directive writing between 1 and 106 bytes into a region of size 100 [-Wformat-overflow=]
This turns out to be a little more straightforward. (See attached APItest.xs.patch.txt .)
It's just a matter of replacing if(len > 99)with if(len > 92).
Actually "len > 93" was sufficient to suppress the warning ... but not quite right for double precision.
Apparently the number of bytes consumed by the radix point is not an issue here.

That probably also merits a comment, something like:

  /* from Perl_sv_vcatpvfn_flags, 92 = 127 - 35 = sizeof(ebuf) - float_need */

There was also another warning in XS-APItest:
APItest.xs: In function ‘blockhook_csc_start’: APItest.xs:380:23: warning: comparison of integer expressions of different signedness: ‘I32’ {aka ‘int’} and ‘size_t’ {aka ‘long unsigned int’} [-Wsign-compare]
The patch takes care of that, too - by casting the 'long unsigned int' to I32.

This was already fixed a couple of weeks ago by 4e3e027.

And I've included a revised patch to sv.c (sv.c.patch.txt) in accordance with latest suggestions.
It doesn't feel to me like it's a robust way to avoid the warnings, though it seems to be doing that just fine.

I agree it's not robust, but that's not uncommon when trying to work around toolchain bugs. This LGTM.

@sisyphus
Copy link
Contributor Author

Getting closer, I think:

diff.txt

As regards my changes to Configure, I suspect the same alteration would need to be made to https://perl5.git.perl.org/metaconfig.git/blob/HEAD:/U/compline/d_gconvert.U
but I assume that, for the purposes of smoking, it would be ok to simply use my patched Configure script.
I did go to some trouble to check not only that d_Gconvert is set to sprintf() if the new test fails, but also that d_Gconvert is set gcvt/qgcvt if the new test passes.
I also checked the -Duselongdouble and -Dusequadmath builds even though it doesn't seem to matter what d_Gconvert is set to on those builds.
On my Ubuntu -Duselongdouble builds, with the amended Configure, I found that the buggy qgcvt was being replaced with the correct sprintf() call ... which, I think is worth doing even though I can't find any breakage when Gconvert is set to use qgcvt().
On my Ubuntu -Dusequadmath builds, with the amended Configure, I found that Gconvert switched from using gcvt() to sprintf's "%.*g" formatting - neither of which is appropriate for a quadmath build, and neither of which is of any consequence, AFAICT.

As regards my changes to sv.c, I still get the feeling that we should remove all references to float_need because:

  1. float_need allows for an unnecessary "safety net" of 20 bytes and a leading "0x1" - thereby leading to wastefulness and confusion;
  2. float_need allows for an 8-byte exponent even though 5 bytes will be the maximum - more wastefulness and confusion.
    I do understand the need to cater for the radix point but, for accuracy and clarity, I'm thinking of replacing
    && sizeof(ebuf) - float_need > precis
    && float_need < sizeof(ebuf)
    with:
    #if defined(USE_LOCALE_NUMERIC)
    && sizeof(ebuf) - precis > 6 + SvCUR(PL_numeric_radix_sv)
    #else
    && sizeof(ebuf) - precis > 7
    #endif
    && precis < sizeof(ebuf)

That allows just enough for a possible leading '-', a radix point, a (max) 5-byte exponent and the terminating NULL.
There's also the possibility of a leading zero prior to the radix point - but then there should be no exponent, so we shouldn't have to worry about that case.
What do you think ?
Ought we provide a safety net of a few bytes ?
Does that revision allow for the width of the radix point correctly ?

I'm currently comfortable with the amendments to APItest.xs and sprintf2.t that I've proposed in this latest attached diff.
I think sprintf2.t is unchanged from last time, but I've altered APItest.xs to correctly cater for long double and __float128 types - though I wouldn't be surprised if test_Gconvert() is not even being run for those nvtypes.

Cheers,
Rob

@hvds
Copy link
Contributor

hvds commented Oct 15, 2020

Getting closer, I think:

diff.txt

As regards my changes to Configure, I suspect the same alteration would need to be made to https://perl5.git.perl.org/metaconfig.git/blob/HEAD:/U/compline/d_gconvert.U
but I assume that, for the purposes of smoking, it would be ok to simply use my patched Configure script.
I did go to some trouble to check not only that d_Gconvert is set to sprintf() if the new test fails, but also that d_Gconvert is set gcvt/qgcvt if the new test passes.
I also checked the -Duselongdouble and -Dusequadmath builds even though it doesn't seem to matter what d_Gconvert is set to on those builds.
On my Ubuntu -Duselongdouble builds, with the amended Configure, I found that the buggy qgcvt was being replaced with the correct sprintf() call ... which, I think is worth doing even though I can't find any breakage when Gconvert is set to use qgcvt().
On my Ubuntu -Dusequadmath builds, with the amended Configure, I found that Gconvert switched from using gcvt() to sprintf's "%.*g" formatting - neither of which is appropriate for a quadmath build, and neither of which is of any consequence, AFAICT.

@Tux could you take a look at the Configure change proposed here?

@hvds
Copy link
Contributor

hvds commented Oct 15, 2020

As regards my changes to sv.c, I still get the feeling that we should remove all references to float_need because:

1. `float_need` allows for an unnecessary "safety net" of 20 bytes and a leading "0x1" - thereby leading to wastefulness and confusion;

The "safety net" is there on non-DEBUGGING builds, because we'll get coredumps or security issues if we get it wrong. The safety net is removed on DEBUGGING builds. While the size of it may be arguable, I think the principle here is a good idea.

This is described in the block comment above the derivation of float_need.

The leading 0x1 is there for %a formats.

This is described in detail in the block comment above the derivation of float_need.

2. `float_need` allows for an 8-byte exponent even though 5 bytes will be the maximum - more wastefulness and confusion.

When you say "8-byte exponent", do you mean the sum of:

                          +  2  /* "e-", "p+" etc */
                          +  6  /* exponent: up to 16383 (quad fp) */

? The comments do explain the derivation of that, and it is described in more detail in the block comment above the derivation of float_need. I have no idea where your 5 maximum comes from.

There are more formatting specifiers than just %g being handled here. These are all described in detail in the block comment above the derivation of float_need.

I am finding this discussion increasingly frustrating: I exhort you to read those comments. If you find something wrong there, please point that out specifically so we can correct it. If you find something unclear, please point out the specific part you find unclear so we can attempt to improve it.

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 16, 2020 via email

@hvds
Copy link
Contributor

hvds commented Oct 16, 2020

My preference is that I report a bug, and some competent person fixes it.

Well the base bug here seemed to be in your C library's gcvt, and a secondary one was in gcc's printf warning logic.

Apart from the size of the safety net, I've no question with the way that float_need is being set. It just seems odd to me that we reference that value inside a block where we are guaranteed to be performing %g formatting of a double.

The desire is for consistency: we know we need some logic to determine the minimum number of extra bytes, far better to have that in one place and then used in all relevant pathways.

I would not necessarily be against setting float_need separately for different format letters, but I'd want to do it by expanding the existing setting of it with some conditionals, so that the logic being implemented continues to relate clearly to the existing comments. The question becomes whether there's enough benefit to justify the added complexity.

But it would scare me to make tight assumptions about the maximum width of a radix point; I'd want some locale expert like @khwilliamson to offer assurance that those assumptions will be safe across the myriad platforms perl supports, and will continue to be when we add more platforms in the future.

Independently of that, if you feel you could get some noticeable benefit from dropping the 20 byte safety margin, it would not be unreasonable to make it depend on a new #define that you could override at configure time - the current tight coupling to debugging perls does not really seem ideal to me.

/* in a header somewhere */
#ifndef PERL_SPRINTF_SAFETY_MARGIN
#  ifdef DEBUGGING
#    define PERL_SPRINTF_SAFETY_MARGIN 0
#  else
#    define PERL_SPRINTF_SAFETY_MARGIN 1
#  endif
#endif
...
/* in sv_vcatpvn */
#if PERL_SPRINTF_SAFETY_MARGIN
                          + 20  /* safety net */
#endif

.. or we could even #define the size of the safety net itself, which would let you set it to -7 if you were really feeling racy.

@sisyphus
Copy link
Contributor Author

Ok ... it's probably a good idea to just concentrate on the bugs.
And, as you say, the base bugs are not perl bugs.
I would regard the following as perl bugs :

  1. that Configure does not detect the fact that gcvt and qgcvt are not fit for purpose on Ubuntu/Debian;
  2. that test_Gconvert() in APItest.xs is prone to buffer overflow.

(Yes, the -Wformat-overflow warning issued when compiling APItest.c is justified AFAICT.)

So - let's just fix those 2 bugs, and do whatever we need to do to silence the bogus warnings.
And I should not go off on a tangent thinking about that "else if (c == 'g' ...){}" block and what it can/will/could/should do.
The point that @hvds makes about the convenience of having a float_need variable that you can drop into different situations is a pertinent one.

So ... I think I'm just waiting on the verdict from @Tux regarding proposed changes to Configure.

The link to the glibc bug report (re gcvt and qgcvt) is:
https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1899553
No response to that as yet.

And the pre-existing report regarding the bogus -Wformat-overflow warnings (to which I added) is at:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89161
It's expected (or hoped, anyway) to be fixed soon.

@Tux
Copy link
Contributor

Tux commented Oct 16, 2020

Sorry to be late, busy :(
I looked at the Configure part of https://github.com/Perl/perl5/files/5384834/diff.txt , and it looks harmless. I'd like to see either an explanation of what the Ubuntu/Debian bug is or a reference url in some comment.

@sisyphus
Copy link
Contributor Author

sisyphus commented Oct 16, 2020 via email

@sisyphus
Copy link
Contributor Author

Heh ... sorry, @Tux probably meant that he'd like to see the reference/explanation presented in the Configure file ;-)
I can attend to that.

Do I now create a pull request ?

Cheers,
Rob

@sisyphus
Copy link
Contributor Author

Just noticed that the Configure patch is incorrect. (I had fiddled about with that test to check that Gconvert would still use gcvt/qgcvt if the test passed ... and then had not properly reversed the fiddling.)

Attached is the corrected Configure patch.
The second arg provided to Gconvert() has been corrected from 21 to 55.
Oh ... and I've also added the comment that @Tux requested.

Configure.patch.txt

@hvds
Copy link
Contributor

hvds commented Oct 19, 2020

Do I now create a pull request ?

That's usually the easiest way to get stuff reviewed and merged, yes.

@Tux
Copy link
Contributor

Tux commented Oct 26, 2020

FWIW @khwilliamson created a branch. The patch to Configure has already been merged to meta, so if the complete pictura is accepted, a regen of Configure is ready for use. Accept pending :)

@khwilliamson
Copy link
Contributor

This looks like it has been merged. Is there a reason to keep this ticket open @sisyphus

@sisyphus
Copy link
Contributor Author

@khwilliamson , AFAIK there's no reason to keep this open.
Closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants