-
Notifications
You must be signed in to change notification settings - Fork 528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build fails when building with PERL_NO_COW #19987
Comments
Observed on FreeBSD as well. |
On Sat, Jul 23, 2022 at 01:16:13PM -0700, Bram wrote:
Noticed when attempting to build with `-Accflags='-DPERL_NO_COW'`
$ ./Configure -des -Dusedevel -Accflags='-DPERL_NO_COW'
$ make -j8 miniperl
...
./miniperl -Ilib -f write_buildcustomize.pl
makefile:363: recipe for target 'lib/buildcustomize.pl' failed
make: *** [lib/buildcustomize.pl] Segmentation fault (core dumped)
Given that it's been broken for a year and you're the first to notice,
I wonder whether the time has come to remove the facility to build a
non-COW perl? It's been on by default for 8 years now.
…--
All wight. I will give you one more chance. This time, I want to hear
no Wubens. No Weginalds. No Wudolf the wed-nosed weindeers.
-- Life of Brian
|
It's broken in a dev release since v5.35.4 (2021-Sep-20) Anyway, briefly discussing this with @leonerd :
The question is if someone still has a use-case for building without Copy on Write.. (I do not have one) |
I have delayed upgrading to newer perl because they fail when PERL_NO_COW is on. But that too is now failing, because older perls have other compilation errors on the latest Linux systems. COW does not seem to extend to C functions: when passing a string reference to a C function that updates it (as a byte array) then the cloning does not happen it seems. If this feature disappears I will have to do explicit copying in many places. I think this compilation problem should be fixed (same as above, except on Linux Mint 21.1) rather than shipping for a long time something that doesn't work. Whether the feature is useful or not, thats a separate discussion. I think it is. So my big software package is stuck between one compilation error or another. I will even pay reasonably to have it fixed. (Btw, the switch to Github instead of email means shifting convenience from users to developers, maybe not a good idea, at least I always try to do the opposite with my code). |
@nielsl I suspect you just need to change your code a bit so that SvPV() are changed to be one of the following:
You can also use the function COW does not refer to the Unix COW facility, it is purely an internal feature. I would be happy to discuss more privately, you can find my email on the list. Having said that, I do think that we should make PERL_NO_COW work. Nothing we do should depend on COW. That would be/is a very sad state of affairs, as it would mean we have a hard dependency on some particular implementation of COW, and IMO we should be able to change how COW works without causing havok. The whole scheme of putting the COW bytes at the end of the PV buffer, and various other aspects of its implementation definitely could be improved, as it leaves cases that can't be COW'ed and causes other problems (there is some kind of hairy interaction with file operations and regexes that we never really fixed afair), and if we have somehow hard coded a dependency on COW then IMO we should fix it. |
@bram-perl said:
That doesn't make sense to me, it feels like an unforced issue that we can resolve if we choose to. I will add this to my todo list. |
Yah; firstclass-bools don't actually rely on COW, they just happen to exploit the fact that if they set the COW flag, then the PV buffer is shared by plain pointer address, and so we can use that pointer to distinguish if it is the special boolean true/false value. You could arrange for that pointer to be statically-shared by some other mechanism if COW was not enabled. |
@demerphq Thank you for giving me a way out at the C level. I am not very good with Perl/C, but managed to create "pidgin-C" functions in all the places where there is a memory or speed bottleneck, and where the code doesn't change. |
Oh that would be wonderful, thanks. I can then update to the latest Perl and to all the non-standard Perl and GNU packages that I depend on. Then I can install my system on the latest Linux and Unix systems again. And if the fix is not immediate, I can now make temporary workarounds knowing that there will be a fix. Bravo. |
Is this done by now? |
I can answer my own question: in 5.38.2, compiling with -DPERL_NO_COW still crashes on the most commonly used Linux systems. Attached are the configure and make outputs from 5.38.2 and 5.26.1, and the hardware configuration. Below the gcc version and system software. I am using 5.26.1 with a large amount of perl modules that I wrote for a DNA analysis package. Customers use it and I cannot build it. I could upgrade to the latest stable, and I will, but it means updating CPAN modules and testing them in combination for breakage (no, I cannot just always take the latest, something will break). As was suggested above, I could change C-function calls throughout, but doing and testing that would take me 1-2 weeks of studying and repair probably (well I have no idea really). I hoped it would be fixed by now. I don't mind paying in advance if this will be fixed quickly, since customers will have to postpone otherwise. Do you see it as an easy fix, or is it difficult? can the fix be back-ported to 5.26.1? is there a fix already in development versions? Contact info below. 5.38.2-error.txt gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 DISTRIB_ID=LinuxMint Niels Larsen, PhD, CEO E-mail: niels@genomics.dk |
Apparently not. Trying the reduced case provided by @bram-perl in July 2022 (on Ubuntu Linux 22.04 LTS with
Note that perl-5.26 was originally released on May 30 2017. So it's long out-of-maintenance. |
Perl-5.26.1 does compile with PERL_NO_COW on Ubuntu 20 (gcc-9), with a lot of warnings, but it fails on Ubuntu 22 (gcc-11). Perl-5.34.3 does compile with PERL_NO_COW on Ubuntu 22, so that takes the immediate customer pressure off me. I tried your suggestions, but get either symbol lookup error or clearly wrong results. I am not fluent in C though. As far as I can tell, this affects anyone who calls a C-function from Perl that modifies a string/byte array, so isn't it a bit odd no-one has complained .. |
Thanks for mentioning this, after I saw your comment I did have luck with PERL_NO_COW in Perl-5.34.3 on Ubuntu 22 (gcc-11). As far as I can tell, this affects anyone who calls a C-function from Perl that modifies a string/byte array, so isn't it a bit odd no-one has complained .. |
We're calling |
You mean I should I call
void SV_CHECK_THINKFIRST(SV * sv )
on every scalar passed from Perl that C operates on
in every C-function in every program until PERL_NO_COW is fixed?
I could, if that would make it work for sure. For now,
I am ok with 5.34.3, even if it takes some weeks to fix.
Sent with [Proton Mail](https://proton.me/) secure email.
…On Tuesday, April 30th, 2024 at 11:44 AM, Leon Timmermans ***@***.***> wrote:
> As far as I can tell, this affects anyone who calls a C-function from Perl that modifies a string/byte array, so isn't it a bit odd no-one has complained ..
We're calling SV_CHECK_THINKFIRST quite aggressively (probably even too aggressively) so usually that's not a problem.
—
Reply to this email directly, [view it on GitHub](#19987 (comment)), or [unsubscribe](https://github.com/notifications/unsubscribe-auth/AAIPDLSVK25R5EBFKFFNNYTY75RWJAVCNFSM54OP5TQ2U5DIOJSWCZC7NNSXTN2JONZXKZKDN5WW2ZLOOQ5TEMBYGQ4DIOJWGIYA).
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
Well, according to https://perldoc.perl.org/perlapi#SV_CHECK_THINKFIRST you should call |
Do you have some code you can share? That would make it a lot easier for us to help you. |
Thank you, yes certainly. I did send the piece below to
you, but must not have gotten the right e-mail address.
Nearly all my byte allocations are done this way
static void* get_ptr( SV* obj ) { return SvPVX( obj ); }
and calls look like
unsigned int interleave_sims_C( float maxdif,
SV* old_ndcs_sv, SV* old_sims_sv, unsigned int old_len,
SV* add_ndcs_sv, SV* add_sims_sv, unsigned int add_len,
SV* new_ndcs_sv, SV* new_sims_sv )
{
unsigned int* old_ndcs = get_ptr( old_ndcs_sv );
unsigned int* add_ndcs = get_ptr( add_ndcs_sv );
unsigned int* new_ndcs = get_ptr( new_ndcs_sv );
float* old_sims = get_ptr( old_sims_sv );
float* add_sims = get_ptr( add_sims_sv );
float* new_sims = get_ptr( new_sims_sv );
....
}
The get_ptr function is identical across many modules so it
would be quite easy to fix it turns out, until the PERL_NO_COW
compilation option is fixed (as you suggested it should be).
If I change SvPVX to SvPVX_force as suggested, then I get
undefined symbol: SvPVX_force
But I don't really know what I am doing in C, so maybe that
is why. My e-mail address is niels at genomics dot dk if you
want to ask me something off the list.
With thanks,
Niels L
|
What happens if you use |
With 5.38.2 compiled without PERL_NO_COW, SvPV_force_nolen (for both readonly and read/write buffers) runs but produces very wrong results. I would have to track where that happens and why, but that is not a quick thing to do. I am guessing it will make no difference to use SvPV_nolen for readonly and SvPV_force_nolen for read/write, but I will try that if you want. Well I know what I put into all the linear buffers (Perl strings) and their boundaries, but I am weak at C and largely unfamiliar with Perl guts. I have ways to check these buffers and overall results and SvPVX has not produced errors since 2012 and it has been battle-tested "in production". But I will follow you Perl gurus advice of course. I am able to compile 5.34.3 with PERL_NO_COW and have things working, which takes pressure off me for weeks, maybe months. Longer term I sure hope PERL_NO_COW will be back, as Yves Orton argued above it should. |
I think you want SvPV_mutable() or SvPVX_mutable(), it requires a len argument, but you can provide it and ignore it. It would be nice if you could flesh out what you mean by "very wrong" would help us understand what the problem is a bit better. The function @Leont suggested, SvPV_force_nolen() is similar to SvPV_mutable() for many purposes, but it can also break things. See the docs:
So i can easily imagine that excessive use of SvPV_force_nolen() would break things. I didnt notice anything in this thread mentioning SvPV_mutable() or SvPVX_mutable() so could you try those? Some kind of script which replicates the issues you are having would be super nice. Some kind of idea of what "very wrong" means would as well. Really you shouldn't /need/ a PERL_NO_COW, so lets figure out why you think you do. |
Yeah you're right. |
With SvPVX_mutable (which takes no second argument), it runs and makes same wrong results as SvPV_force_nolen. With SvPV_mutable (which takes a length argument) I got compile errors that I don't yet understand. I will make a test case, but I am struggling with deadlines and it make take two weeks before it appears. I see two separate questions with PERL_NO_COW, 1) that it crashes compilation in latest stable on all platforms probably, and 2) whether the feature is useful or not, and should be phased out or not. One reason no-one have complained could be that very few other than module writers use C functions from Perl, but that is pure guessing. |
To me the recent comments look a bit out of scope for this issue and look more like a "support/help/... request" and should be discussed elsewhere (mailing list?) and not in this particular issue. Footnotes
|
Helping me with a work-around I agree could be taken off this forum, and I agree it is a bit noisy and off-topic. But having stable Perl crash since 5.36.0 on probably all platforms when compiled with PERL_NO_COW is certainly on-topic. I brought it up last summer, and with a timely fix there would have been no need for work-arounds. |
Ok, well until you have time to flesh that out I am not sure what else we can do. Those functions should ensure that the SV owns the ptr, and that the buffer it points to is safe to modify in place. So if you are experiencing breakage I don't see why PERL_NO_COW would fix things.
Regarding 1, we have a general problem that we don't test esoteric build config very well as a general rule of thumb. Building with all the different flags would be very resource intensive so we have shied away from testing them. Regarding 2, my personal view is that COW is an optimization, and that as a matter of general principal we should able to disable an optimizations and still function correctly. Having said that I am pretty sure that PERL_NO_COW is implemented in such a way that the code is divergent from the COW case such that we have to maintain two separate implementations. Ideally PERL_NO_COW would be "copy buffer always" instead of the "copy buffer when refcount=254" (i am pretty sure 255 is used for something special), but I suspect it is more like "don't do COW stuff at all". I think it probably makes sense to rework PERL_NO_COW so the code doesn't diverge but that we still have a "never COW" mode of operation. I will reach out to you privately to see if I can help you sort this out. |
There's an underlying assumption behind this discussion that It gets very awkward if we were to remain down the path of making all features optional for all of time - as the number of optional features increases over time, the number of possible combinations increases exponentially and it becomes impossible to test all of them. It could now be the case that we declare COW strings a successful experiment and simply say they're always present now, thus removing the configuration option. |
@leonerd I dont think you can put an optimization, especially a kinda hacky optimization like COW in the same categories as other features like tie or overload, those are language level features that are visible to a programmer, COW is not, its presence or absence should be completely invisible to a user except for the performance impact of having it. IMO optimizations should be disablable, if only for testing and bug detection. As I said in my previous comment, I think it is probably quite reasonable to rework things so we don't have divergant code-paths, but we should also have a mode where we set the refcount at which a buffer is copied to be low enough that the buffer is copied always. COW isnt a /feature/ of perl, it is an implementation detail of how we manage certain practical problems. There are other options and other approaches (especially at the implementation level), and we should be free to change the code to use a different strategy without breaking everything. If anything is expecting COW for anything other than performance reasons then that is a barrier to improving our current solutions to these problems. The current implementation arguably has issues. It does not work consistently, it does not play nicely with certain forms of XS string initialization, it puts the refcount at the end of the buffer where it causes cache line spoilation for nothing, and it has a practical limit of 254 copies per buffer. So arguably it should at some point be improved or changed. Ensuring we can build with it disabled is about the best way to ensure we dont end up on dependencies on the implementation. I agree with you that not every /feature/ should have a way to disable it. But i do think that pretty much every /optimization/ should have a way to disable it and still get correct behavior. |
Agreed. That's the only way to be sure it's good or not. |
@nielsl s problem sounds like his C funcs are doing writing buffer overruns on SvPVX and overwriting with "garbage" the CowREFCNT(sv) which is the last malloc allocated byte in the buffer, which CowREFCNT is directly after invisible C str terminating NUL, or CowREFCNT is some uninit/unused bytes further down the buffer, after the terminating NULL byte. Also remember about SINCE FOREVER (20 yrs) there was a tiny bit of faux-COW, in the sorta rareish on PP level SVPVs with share_hek_hek/SvIsCOW_shared_hash/unshare_hek PV buffers which are per interp/thread "globals".
should really be
or atleast not memory corruption to Perl, but maybe corrupt/uninit input to your 3rd party C lib/funcs.
I can't tell from " BTW, keeping " Another thing, if your doing in PP before calling the XSUB.
smells risky but I think in prior years this always worked, but with COW strings (im not sure I havent C stepped newer perls), $packedvec could be a shared some-type-COW PV buffer, and DEF not a "malloc" buffer for your purpose. On 2nd thought, your XS code problems start with |
Noticed when attempting to build with
-Accflags='-DPERL_NO_COW'
Rebuilding with -DDEBUGGING shows an assertion failure:
Reducing to a minimal test case:
Bisecting points to commit 914bb57: (@leonerd)
(Note: I have no use case for
-DPERL_NO_COW
, I only tried it while looking at another issue)Steps to reproduce
./Configure -des -Dusedevel -Accflags='-DPERL_NO_COW' -DDEBUGGING
make -j8 miniperl
./miniperl -wle 'my $x = (0 == 1);'
Actual result
Segmentation fault/Assertion failure:
Expected result
No crash.
The text was updated successfully, but these errors were encountered: