Workaround XS that wants non-STRLEN for PV len #20037
Conversation
khwilliamson
commented
Aug 4, 2022
Allowing this mistake of passing an int* to a function that expects a STRLEN* sits a bit uncomfortably with me when the STRLEN is bigger than int && the architecture is big-endian. Below is an XS (Inline::C) demo script that exhibits such behaviour on a big-endian machine with intsize=4 and sizesize=8. Here's the demo script:
And here's the output:
On a little-endian machine, I find that there's no problem until the value of the STRLEN overflows 32-bits - which is going to be a pretty rare case where SvPV() is concerned.
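The endianness effect described above can be modelled in plain C without XS. This is a minimal sketch, not the Inline::C demo script from this comment: it assumes a 4-byte int and an 8-byte STRLEN, modelled here with `uint32_t` and `uint64_t`, and the helper names are hypothetical. It uses `memcpy` to show which bytes a 4-byte length variable would end up holding if the callee stored a full 8-byte length at its address.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

/* Models a callee that stores an 8-byte length while the caller only
 * looks at the first 4 bytes at that address as its "int" length.
 * (Hypothetical helper; the real bug also overwrites the 4 bytes that
 * follow the caller's int, which this model does not reproduce.) */
static uint32_t length_seen_by_4byte_int(uint64_t stored_len)
{
    uint32_t seen;
    memcpy(&seen, &stored_len, sizeof seen);
    return seen;
}
```

On a little-endian host `length_seen_by_4byte_int(5)` yields 5, because the low-order bytes come first. On a big-endian host it yields 0, because the first four bytes are the high-order half of the length.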
Cheers, |
I'd be really, really wary of committing workarounds that may break silently, as they're the worst ones to debug. Luckily, I can't imagine any easy way to break this (get magic looks like the only one, and those aren't common on CPAN), but conceptually, introducing those scratch_* variables really bothers me. Does the original idea of "evaluate SV in a macro once" really warrant this? PS: shouldn't the scratch_* vars be subject to the same ifdef as their usage? |
To @sisyphus: I used U16 to test this. To @RandiR: I'm unsure whether it is worth it or not. But this implementation is less susceptible to magic and overloading issues than the one it replaces. And we can make it as loud as we want, for example by refusing to compile, by deprecating calls with a too-small len parameter, or by issuing a runtime warning at the first wrong call at each calling instance. The core has used PL_na and PL_Sv for many years to implement things like this. But they are API. I personally spent too many hours finding a bug that turned out to be caused by one function that used them calling another function that used them. The core needs to stop using those API ones regardless of this PR. This would be a step towards that, and so the variables are not #ifdef'd. Of course, each such potential use would have to be examined to be sure that the same call stack error couldn't occur. But having core-only ones would allow us to control things better than currently. Other options include:
|
The problem is that when SvPV() was more than a thin wrapper around a function, it could safely be called with a non-STRLEN len parameter for many cases:
But now to avoid repeated evaluation of macro parameters it directly calls a function:
which causes "stack smashing" as described in #19983 and similar problems even for non-magical POK SVs, which the older definition allowed to "work". Both versions caused compilers to produce a type mismatch diagnostic, which unfortunately some CPAN authors ignored. |
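The difference between the two shapes can be sketched with a toy model. This is not perl's actual definition of SvPV(): `struct toy_sv`, `toy_pv_and_len`, `TOY_PV_OLD` and `TOY_PV_NEW` are hypothetical names, and magic, overloading and flags are left out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-in for an SV holding a plain string. */
struct toy_sv {
    const char *pv;   /* string buffer */
    size_t cur;       /* length in bytes */
};

/* Function form: the length is written through a size_t pointer. */
static const char *toy_pv_and_len(const struct toy_sv *sv, size_t *lenp)
{
    *lenp = sv->cur;
    return sv->pv;
}

/* Old shape: a plain assignment, so the value is converted to whatever
 * integer type 'len' has. Note that 'sv' is evaluated twice. */
#define TOY_PV_OLD(sv, len) ((len) = (sv)->cur, (sv)->pv)

/* New shape: 'sv' is evaluated once, but '&(len)' must really be a
 * size_t pointer. Passing the address of a smaller variable makes the
 * function write past the end of it. */
#define TOY_PV_NEW(sv, len) toy_pv_and_len((sv), &(len))
```

With `unsigned short len`, `TOY_PV_OLD(&sv, len)` stores a converted value and stays within the variable, while `TOY_PV_NEW(&sv, len)` draws an incompatible-pointer diagnostic and, if that is ignored, writes `sizeof(size_t)` bytes at the address of a 2-byte object.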
@tonycoz, @khwilliamson, Can we get an update on the status of this p.r. (which now has merge conflicts)? I can confirm that we are still getting CPANtester failure reports (below, Linux, unthreaded, gcc-4.9.2) on some of the distros cited in #19983:
Thank you very much. |
On 8/24/22 07:02, James E Keenan wrote:
> @tonycoz <https://github.com/tonycoz>, @khwilliamson
> <https://github.com/khwilliamson>, Can we get an update on the status of
> this p.r. (which now has merge conflicts)?
I rebased out the conflicts. I'm waiting for replies about what levels
of warnings etc to add to it
…
> I can confirm that we are still getting CPANtester failure reports
> (below, Linux, unthreaded, gcc-4.9.2) on some of the distros cited in
> #19983:
>
> test FAIL Locale-Hebrew-1.05 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154
> test FAIL Digest-OAT-0.04 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154
> ...
> test FAIL Math-FastGF2-0.07 (perl-5.37.4) x86_64-linux 5.17.5-x86_64-linode154
>
> Thank you very much.
|
We don't seem to be getting new failures from this, so it looks like the problems are limited in scope. |
fc2c181 to 951ae4b
These are for internal use only, and won't collide with external uses of the public API 'PL_na'.
This should fix GH Perl#19983. Some of the macros that extract a PV from an SV also will set a 'len' parameter to how many bytes long it is. The len parameter is supposed to be declared as a STRLEN (or equivalently, Size_t). But there is a significant amount of code that declares the parameter wrongly, such as an int, and this code generally has worked. I do believe that warnings are generated. With 1ef9039 such code broke. One could view this as similar to the hash key retrieval order problem from years past, where we viewed the breakage as a "good thing" to catch real bugs early. But in this case, an int may be large enough so that the issue wouldn't ever arise in practice. What this commit does is to see if the 'len' parameter is the same size and sign as STRLEN. If so, it follows the code in 1ef9039. I believe this is technically undefined behavior, as the only defined behavior is if the pointers point to the same object type, but we do such things all the time without negative consequences. If 'len' isn't equivalent to STRLEN, the implementation falls back to using gcc brace groups, when available, to only evaluate the passed in SV once. If not available, it uses temporary variables for the same effect.
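The dispatch described in this commit message can be sketched with a toy model. This is not the code in the commit: all names are hypothetical, `__typeof__` and brace groups are GNU extensions (gcc and clang), and the temporary-variable fallback for other compilers is omitted.

```c
#include <stddef.h>

/* Hypothetical stand-in for an SV holding a plain string. */
struct toy_sv {
    const char *pv;
    size_t cur;
};

static const char *toy_pv_and_len(const struct toy_sv *sv, size_t *lenp)
{
    *lenp = sv->cur;
    return sv->pv;
}

/* True when 'len' has the same size as size_t and is unsigned. */
#define LEN_MATCHES_SIZE_T(len) \
    (sizeof(len) == sizeof(size_t) && ((__typeof__(len))-1 > 0))

/* Fallback: a brace group evaluates 'sv' once, collects the length in
 * a real size_t, then converts it to the caller's type. */
#define TOY_PV_VIA_TEMP(sv, len) __extension__ ({                   \
        size_t toy_tmp_len_;                                        \
        const char *toy_tmp_pv_ =                                   \
            toy_pv_and_len((sv), &toy_tmp_len_);                    \
        (len) = (__typeof__(len)) toy_tmp_len_;                     \
        toy_tmp_pv_;                                                \
    })

/* 'sv' appears in both arms, but only one arm is ever evaluated.
 * The pointer cast in the first arm is only reached when size and
 * signedness match; as the commit message says, that is still
 * technically undefined if the types are distinct. */
#define TOY_PV(sv, len)                                             \
    (LEN_MATCHES_SIZE_T(len)                                        \
        ? toy_pv_and_len((sv), (size_t *)(void *)&(len))            \
        : TOY_PV_VIA_TEMP((sv), len))
```

With `int len`, `TOY_PV(sv, len)` takes the brace-group arm and never hands the address of the int to the function; with `size_t len` it takes the direct call.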
I think we should merge this. @tonycoz, @sisyphus, @dur-randir any feedback on this? |
This wouldn't be correct, SvCUR() might not be set (when SvPV() calls an overload) or for a different representation (SvPVutf8() on a read only SV). |
I'm inclined away from this, the type mismatches in the failing modules are diagnosed by the compiler - the modules were always broken. But that's a soft rejection, I wouldn't merge it, but I won't object if someone else does. |
I would push for merging this if we had continued to get failure reports. But the lack of them after the initial flurry indicates to me that there aren't many more modules out there whose authors failed to act on the compiler warnings presented to them. |
Another alternative, and one that I would much prefer to see, is for someone to take ownership or co-maintenance of these modules and release new versions of them with the proper fix in place. AFAICS, there are 3 modules affected:
Are there any other affected modules with non-responsive owners/maintainers? I find it a bit sad that we're even considering covering up these coding errors simply because we're having trouble finding someone to apply the correct fixes to them. Cheers, |
I agree with Tony. I can't say this'd break anything, just my feelings. |
I'm reopening this for further comments, based on new information from Debian, reported by @ntyni. I can't find the details just now. They had to do a workaround for a module that is important but isn't being maintained, and with little possibility of it getting maintained. Maybe Niko can fill in some details. |
It is https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1060458 which they reported upstream as https://rt.cpan.org/Ticket/Display.html?id=151141 |
There doesn't seem to be a need for this still, so reclosing |