Workaround XS that wants non-STRLEN for PV len #20037

khwilliamson · 2022-08-04T01:56:38Z

 This should fix GH #19983.
 
 Some of the macros that extract a PV from an SV also will set a 'len'
 parameter to how many bytes long it is.  The len parameter is supposed
 to be declared as a STRLEN (or equivalently, Size_t).  But there is a
 significant amount of code that declares the parameter wrongly, such as
 an int, and this code generally has worked.  I do believe that warnings
 are generated.  With 1ef9039 such code broke.
 
 One could view this as similar to the hash key retrieval order problem
 from years past, where we viewed the breakage as a "good thing" to catch
 real bugs early.  But in this case, an int may be large enough so that
 the issue wouldn't ever arise in practice.
 
 What this commit does is to see if the 'len' parameter is the same size
 and sign as STRLEN.  If so, it follows the code in 1ef9039.  I believe
 this is technically undefined behavior, as the only defined behavior is
 if the pointers point to the same object type, but we do such things all
 the time without negative consequences.
 
 If 'len' isn't equivalent to STRLEN, the implementation falls back to
 using gcc brace groups, when available, to only evaluate the passed in
 SV once.  If not available, it uses temporary variables for the same
 effect.

sisyphus · 2022-08-04T06:24:11Z

Allowing this mistake of passing an int* to a function that expects a STRLEN* sits a bit uncomfortably with me when the STRLEN is bigger than int && the architecture is big-endian.
On my big-endian ppc64 machine, if a C sub that expects STRLEN* is given an int*, then the 'int' ends up being zero unless the value overflows the 32 low bits.
That is, the high 32- bits of the 64-bit value is assigned to the int, not the low 32 bits.
(On little-endian machines it seems that the desired "low" bits are assigned, so it's less of a concern with them.)

Below is an XS (Inline::C) demo script that exhibits such behaviour on a big-endian machine with intsize=4 and sizesize=8.
However, I've been unable to reproduce this issue on that machine by specifying an int as the second arg to SvPV() - so perhaps there's no need to worry about this usage wrt SvPV().
Perhaps the perl internals take care of it (via casts ?), or maybe there's something wrong with my demo, or maybe this is just the beauty of undefined behaviour.

Here's the demo script:

use strict;
use warnings;
use Config;

use Inline C => Config =>
  BUILD_NOISY => 1,
;

use Inline C => <<'EOC';

void fu(STRLEN * in, STRLEN * out) {
  *out = *in;
}

void foo(SV * in) {
  dXSARGS;
  int i, count = 0, x = 0;
  STRLEN a = (STRLEN)SvUV(in);

  fu(&a, &x);

  for(i = 0; i < x; i++) {
    count++;
  }

  printf("count: %d\n", count);

  ST(0) = sv_2mortal(newSVuv(x)); /* the int */
  ST(1) = sv_2mortal(newSVuv(a)); /* the STRLEN */
  XSRETURN(2);
}

EOC

print "BYTEORDER:   $Config{byteorder}\n";
print "STRLEN size: $Config{sizesize} \n";
print "INT SIZE   : $Config{intsize}  \n\n";

for(1, 1000, 4294967295, 4294967296, 8589934591, 8589934592) {
  my @ret = foo($_);
  print "$ret[1] becomes $ret[0]\n\n";
}

And here's the output:

BYTEORDER:   87654321
STRLEN size: 8
INT SIZE   : 4

count: 0
1 becomes 0

count: 0
1000 becomes 0

count: 0
4294967295 becomes 0

count: 1
4294967296 becomes 1

count: 1
8589934591 becomes 1

count: 2
8589934592 becomes 2

On a little-endian machine, I find that there's no problem until the value of the STRLEN overflows 32-bits - which is going to be a pretty rare case where SvPV() is concerned.
The only warnings issued during compilation are the expected ones:

strlen_pl_c68c.xs: In function ‘foo’:
strlen_pl_c68c.xs:15:3: warning: passing argument 2 of ‘fu’ from incompatible pointer type [enabled by default]
strlen_pl_c68c.xs:6:6: note: expected ‘STRLEN *’ but argument is of type ‘int *’

Cheers,
Rob

dur-randir · 2022-08-05T08:06:27Z

I'd be really, really wary for commiting workarounds that may break silently - as they're the worst ones to debug. Luckily, I can't imagine any easy way to break this (get magic looks like the only one, and they aren't common on cpan), but conceptually introducing those scratch_* variables really bothers me. Does the original idea of "evaluate SV in a macro once" really warrants this?

PS: shouldn't scratch_* vars be a subject to the same ifdef as theirs usage?

khwilliamson · 2022-08-05T20:33:02Z

To @sisyphus, I used U16 to test this.

To @RandiR. I'm unsure as to if it is worth it or not. But this implementation is less susceptible to magic and overloading issues than the one it replaces.

But we can make it as loud as we want, including refusing to compile, or deprecating calling with a too-small len parameter, or issuing a runtime warning at the first wrong call of these at each calling instance, as examples

The core has used PL_na and PL_Sv for many years to implement things like this. But they are API. I personally spent too many hours finding a bug that turned out to be caused by having one function that used them calling another function that used them. The core needs to stop using those API ones regardless of this PR. This would be a step towards that, and so the variables are not #ifdefd Of course, each such potential use would have to be examined to be sure that the same call stack error couldn't occur. But having core-only ones would allow us to control things better than currently.

Other options include:

Doing the single evaluation only for the nolen forms, and documenting that you can do a SvCUR yourself afterwards if you want that behavior
Reverting the behavior to the old way of doing things if the len parameter isn't declared STRLEN, and documenting that. So XS writers would get single-evaluation if they used the proper declaration.

tonycoz · 2022-08-17T01:41:49Z

However, I've been unable to reproduce this issue on that machine by specifying an int as the second arg to SvPV() - so perhaps there's no need to worry about this usage wrt SvPV().

The problem is that when SvPV() was more than a thin wrapper around a function, it could safely be called with a non-STRLEN len parameter for many cases:

#define SvPV_flags(sv, len, flags) \
    (SvPOK_nog(sv) \
     ? ((len = SvCUR(sv)), SvPVX(sv)) : sv_2pv_flags(sv, &len, flags))

But now to avoid repeated evaluation of macro parameters it directly calls a function:

#define SvPV_flags(sv, len, flags)                                          \
   Perl_SvPV_helper(aTHX_ sv, &len, flags, SvPVnormal_type_,                \
                    Perl_sv_2pv_flags, FALSE, 0)

which causes "stack smashing" as described in #19983 and similar problems even for non-magical POK SVs, which the older definition allowed to "work".

Both versions caused compilers to produce a type mismatch diagnostic, which unfortunately some CPAN authors ignored.

jkeenan · 2022-08-24T13:02:29Z

@tonycoz, @khwilliamson, Can we get an update on the status of this p.r. (which now has merge conflicts)?

I can confirm that we are still getting CPANtester failure reports (below, Linux, unthreaded, gcc-4.9.2) on some of the distros cited in #19983:

test FAIL Locale-Hebrew-1.05 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154
test FAIL Digest-OAT-0.04 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154
...
test FAIL Math-FastGF2-0.07 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154

Thank you very much.

khwilliamson · 2022-10-11T08:47:32Z

On 8/24/22 07:02, James E Keenan wrote: @tonycoz <https://github.com/tonycoz>, @khwilliamson <https://github.com/khwilliamson>, Can we get an update on the status of this p.r. (which now has merge conflicts)?

I rebased out the conflicts. I'm waiting for replies about what levels of warnings etc to add to it

…

I can confirm that we are still getting CPANtester failure reports (below, Linux, unthreaded, gcc-4.9.2) on some of the distros cited in #19983 <#19983>: |test FAIL Locale-Hebrew-1.05 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154 test FAIL Digest-OAT-0.04 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154 ... test FAIL Math-FastGF2-0.07 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154 | Thank you very much. — Reply to this email directly, view it on GitHub <#20037 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAA2DH7FPJB37R4F3LJM25LV2YMPDANCNFSM55Q2KCWQ>. You are receiving this because you were mentioned.Message ID: ***@***.***>

khwilliamson · 2022-10-15T19:54:43Z

We don't seem to be getting new failures from this, so it looks like the problems are limited in scope.

These are for internal use only, and won't collide with external uses of the public API' PL_na'.

This should fix GH Perl#19983. Some of the macros that extract a PV from an SV also will set a 'len' parameter to how many bytes long it is. The len parameter is supposed to be declared as a STRLEN (or equivalently, Size_t). But there is a significant amount of code that declares the parameter wrongly, such as an int, and this code generally has worked. I do believe that warnings are generated. With 1ef9039 such code broke. One could view this as similar to the hash key retrieval order problem from years past, where we viewed the breakage as a "good thing" to catch real bugs early. But in this case, an int may be large enough so that the issue wouldn't ever arise in practice. What this commit does is to see if the 'len' parameter is the same size and sign as STRLEN. If so, it follows the code in 1ef9039. I believe this is technically undefined behavior, as the only defined behavior is if the pointers point to the same object type, but we do such things all the time without negative consequences. If 'len' isn't equivalent to STRLEN, the implementation falls back to using gcc brace groups, when available, to only evaluate the passed in SV once. If not available, it uses temporary variables for the same effect.

demerphq · 2023-02-08T08:39:19Z

I think we should merge this. @tonycoz, @sisyphus, @dur-randir any feedback on this?

tonycoz · 2023-02-12T23:13:07Z

Doing the single evaluation only for the nolen forms, and documenting that you can do a SvCUR yourself afterwards if you want that behavior

This wouldn't be correct, SvCUR() might not be set (when SvPV() calls an overload) or for a different representation (SvPVutf8() on a read only SV).

tonycoz · 2023-02-12T23:34:52Z

I think we should merge this. @tonycoz, @sisyphus, @dur-randir any feedback on this?

I'm inclined away from this, the type mismatches in the failing modules are diagnosed by the compiler - the modules were always broken.

But that's a soft rejection, I wouldn't merge it, but I won't object if someone else does.

khwilliamson · 2023-02-13T00:36:31Z

I would push for merging this if we had continued to get failure reports. But the lack of them after the initial flurry, indicates to me that there aren't that many more modules out there that failed to act on the compiler warnings they had presented to them.

sisyphus · 2023-02-13T01:54:50Z

Another alternative, and one that I would much prefer to see, is for someone to take ownership or co-maintenance of these modules and release new versions of them with the proper fix in place.
I'm quite happy to be that someone. I'm also quite happy for someone else to instead take on that role for one or more of the affected modules.

AFAICS, there are 3 modules affected:

Locale::Hebrew (which is up for adoption)
Digest::OAT
Math::FastGF2

Are there any other affected modules with non-responsive owners/maintainers that are similarly affected ?

I find it a bit sad that we're even considering covering up these coding errors simply because we're having trouble finding someone to apply the correct fixes to them.
However, I won't complain if the cover-up patch is committed.

Cheers,
Rob

dur-randir · 2023-02-13T12:41:11Z

I agree with Tony. I can't say this'd break anything, just my feelings.

khwilliamson · 2023-02-28T01:53:27Z

I would push for merging this if we had continued to get failure reports. But the lack of them after the initial flurry, indicates to me that there aren't that many more modules out there that failed to act on the compiler warnings they had presented to them.

khwilliamson · 2024-02-14T01:39:34Z

I'm reopening this for further comments, based on new information from Debian. reported by @ntyni

I can't find the details just now. They had to do a workaround for a module that is important but isn't being maintained, and with little possibility of it getting maintained. Maybe Niko can fill in some details.

tonycoz · 2024-02-14T22:19:38Z

It is https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1060458 which they reported upstream as https://rt.cpan.org/Ticket/Display.html?id=151141

khwilliamson · 2024-08-26T21:00:32Z

There doesn't seem to be a need for this still, so reclosing

khwilliamson force-pushed the newsvpv branch from 67af761 to 2d4b96e Compare August 4, 2022 04:04

github-actions bot added the hasConflicts label Aug 10, 2022

khwilliamson force-pushed the newsvpv branch from 2d4b96e to 04ab174 Compare August 24, 2022 14:20

github-actions bot removed the hasConflicts label Aug 24, 2022

khwilliamson force-pushed the newsvpv branch 2 times, most recently from fc2c181 to 951ae4b Compare December 2, 2022 02:57

github-actions bot added the hasConflicts label Dec 24, 2022

khwilliamson added 2 commits January 15, 2023 10:56

Add scratch per_interpreter variables for PV and STRLEN

b5b157d

These are for internal use only, and won't collide with external uses of the public API' PL_na'.

khwilliamson force-pushed the newsvpv branch from 951ae4b to 6faec27 Compare January 15, 2023 17:56

github-actions bot removed the hasConflicts label Jan 15, 2023

demerphq approved these changes Feb 8, 2023

View reviewed changes

khwilliamson closed this Feb 28, 2023

khwilliamson mentioned this pull request Apr 28, 2023

BBC: Math-FastGF2-0.07 triggers "stack smashing detected" #19983

Open

khwilliamson mentioned this pull request Jun 26, 2023

BBC: Crypt-OpenSSL-PKCS10-0.19 triggers "stack smashing detected" #21174

Closed

khwilliamson reopened this Feb 14, 2024

khwilliamson closed this Aug 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Workaround XS that wants non-STRLEN for PV len #20037

Workaround XS that wants non-STRLEN for PV len #20037

khwilliamson commented Aug 4, 2022

sisyphus commented Aug 4, 2022

dur-randir commented Aug 5, 2022

khwilliamson commented Aug 5, 2022

tonycoz commented Aug 17, 2022

jkeenan commented Aug 24, 2022

khwilliamson commented Oct 11, 2022 via email

khwilliamson commented Oct 15, 2022

demerphq commented Feb 8, 2023

tonycoz commented Feb 12, 2023

tonycoz commented Feb 12, 2023

khwilliamson commented Feb 13, 2023

sisyphus commented Feb 13, 2023

dur-randir commented Feb 13, 2023

khwilliamson commented Feb 28, 2023

khwilliamson commented Feb 14, 2024

tonycoz commented Feb 14, 2024

khwilliamson commented Aug 26, 2024

Workaround XS that wants non-STRLEN for PV len #20037

Workaround XS that wants non-STRLEN for PV len #20037

Conversation

khwilliamson commented Aug 4, 2022

sisyphus commented Aug 4, 2022

dur-randir commented Aug 5, 2022

khwilliamson commented Aug 5, 2022

tonycoz commented Aug 17, 2022

jkeenan commented Aug 24, 2022

khwilliamson commented Oct 11, 2022 via email

khwilliamson commented Oct 15, 2022

demerphq commented Feb 8, 2023

tonycoz commented Feb 12, 2023

tonycoz commented Feb 12, 2023

khwilliamson commented Feb 13, 2023

sisyphus commented Feb 13, 2023

dur-randir commented Feb 13, 2023

khwilliamson commented Feb 28, 2023

khwilliamson commented Feb 14, 2024

tonycoz commented Feb 14, 2024

khwilliamson commented Aug 26, 2024