SEGV with debugging perls with multiplicity on #90
@sjaeckel do you have any idea what might have gone wrong in The line of the segfault is https://github.com/DCIT/perl-CryptX/blob/master/src/ltc/prngs/fortuna.c#L234 |
The first thing that comes to my mind is that the allocated struct isn't big enough. Could be because How can this be reproduced? |
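To illustrate the "allocated struct isn't big enough" hypothesis in general terms, here is a minimal sketch of how such a mismatch can arise when a caller and a library are compiled with different feature macros. All names (`struct state_lib`, `struct state_caller`, `layout_is_compatible`, `lib_init`) are hypothetical and are not taken from CryptX or libtomcrypt; nothing in this thread confirms that this is the actual cause.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: the state struct as the library sees it when an
 * optional feature macro is enabled at compile time. */
struct state_lib {
    unsigned char pool[32];
    unsigned long counter;
    void *lock;              /* only present with the feature enabled */
};

/* Hypothetical: the same struct as seen by a caller that was compiled
 * without that macro, so the trailing field is missing. */
struct state_caller {
    unsigned char pool[32];
    unsigned long counter;
};

/* Returns 1 if a buffer of buf_size bytes can hold the library's view. */
int layout_is_compatible(size_t buf_size)
{
    return buf_size >= sizeof(struct state_lib);
}

/* Library-side initialiser. Without the size check, an undersized
 * buffer would be overrun as soon as the extra field is written. */
void lib_init(void *buf, size_t buf_size)
{
    assert(layout_is_compatible(buf_size));
    memset(buf, 0, sizeof(struct state_lib));
}
```

A check like this only helps if the caller passes the size it actually allocated; it is meant as a debugging aid for testing the hypothesis, not as a fix.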
A fresh report with a more recent perl (5.38.0) that exposes the problem: http://www.cpantesters.org/cpan/report/d4da173a-1f77-11ee-a370-d61eba172296 Not every perl with a similar configuration exposes the problem. But it seems that once you have a build that exhibits it, it is reproducible. I just let the perl from the report above run the t/prng_fortuna.t test ~1000 times and the SEGV happened every time. The stack trace for this perl looks practically the same as above:
|
How can I reproduce this locally? Can I somehow get access to this exact version that fails? I tried it locally with the latest version and
|
It is not easy to reproduce. I have tried to build a perl-5.36.1 binary on Ubuntu-22.04 with the same options as in the original failing report:
But I was unable to reproduce the failure in |
I have been able to reproduce it. The problem is in these innocent looking lines.
Somehow those can result in a null-pointer dereference. I don't understand what's going on here either; it only happens with I worked around it by removing those two lines and using this instead (before initializing the pools)
Obviously, this is not a very satisfying fix. |
@sjaeckel ^^^ |
@karel-m I'm already watching this issue :)
@Leont How?
TBH I would prefer to leave the fortuna code as it is and wait for the moment when someone solves the underlying problem, since that can't be the real solution. Or am I mistaken here? |
Just for completeness here is a code fragment from my perl xs/c module, something may be wrong here:
|
And it is also worth mentioning that the same code works without crash for Crypt::PRNG::ChaCha20 / Crypt::PRNG::RC4 / Crypt::PRNG::Sober128 / Crypt::PRNG::Yarrow the difference is only in |
Lines 121 to 125 in fc61205
perl-CryptX/inc/CryptX_PRNG.xs.inc Lines 25 to 40 in fc61205
IMO that code looks fine. As pointed out by @Leont the crash also doesn't happen on the call of |
I suspect the issue only occurs on debugging perls. I don't fully understand that, because AFAICT that shouldn't affect the crypto code at all. |
That doesn't matter, it shouldn't happen. Please write down how it can be reproduced :) |
While looking through the Perl internals regarding memory management ... Could this issue be related to mixing native and Perl-specific malloc/free calls? That is, using native malloc to allocate memory but Perl's free to release it, or vice versa? Have you ever thought of using the Perl-specific malloc/free calls inside ltc/ltm instead of the native ones? As the macro magic involved is quite extensive before you arrive at the Perl MM function that is actually called, I guess the easiest would be to trampoline those inside cryptx ...
Then pre-define Or do you already do that and I missed it while searching through the sources? :) |
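For readers following the trampoline suggestion, a minimal sketch of the idea might look like the following. The `cryptx_*` function names and the `cryptx_live_allocs` counter are hypothetical, and plain `malloc`/`calloc`/`free` stand in for Perl's own allocation functions so that the sketch compiles without Perl headers. `XMALLOC`, `XCALLOC` and `XFREE` are the allocation macros libtomcrypt lets a build override; `XREALLOC` would follow the same pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: number of blocks currently allocated
 * through the trampolines. Useful for spotting blocks that were
 * allocated here but released through a different allocator. */
long cryptx_live_allocs = 0;

/* Trampoline for allocation. In an XS build this is where Perl's
 * allocator would be called instead of the native malloc. */
void *cryptx_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL) cryptx_live_allocs++;
    return p;
}

/* Trampoline for zero-initialised allocation. */
void *cryptx_calloc(size_t n, size_t s)
{
    void *p = calloc(n, s);
    if (p != NULL) cryptx_live_allocs++;
    return p;
}

/* Trampoline for release; must pair with the allocator used above. */
void cryptx_free(void *p)
{
    if (p != NULL) cryptx_live_allocs--;
    free(p);
}

/* Route the library's allocation macros through the trampolines.
 * In a real build these would be passed on the compiler command line
 * or defined before the library headers are included. */
#define XMALLOC cryptx_malloc
#define XCALLOC cryptx_calloc
#define XFREE   cryptx_free
```

The point of the indirection is that every allocation and release inside the library goes through one matched pair of functions, so the two allocators cannot be mixed by accident.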
That would be But I don't think that's what's going on here. |
https://github.com/Perl/perl5/blob/dd4eb78c55aab441aec1639b1dd49f88bd960831/perl.h#L1697-L1739 You're sure?
If nobody reveals how it can be reproduced I'm pretty sure we will never find out. |
If using perlbrew, compile a perl with «perlbrew install perl-5.38.0 --debug --thread», and install the distribution on that perl. |
It still doesn't fail on any of my machines... and with those two (slightly) different build configurations. After looking through some of the failed builds on https://www.cpantesters.org/distro/C/CryptX.html I saw that all of the segfaults were on a machine called |
@Leont you've been able to reproduce the issue on a machine that you have access to? |
Yes, I can reliably reproduce it on my computer. |
Can you maybe tell me all the details of the tools you're using in the process? Which Distro and Compiler versions are you using? Can you please write down the exact command how you run all the tools? perlbrew etc.? Or could you maybe even create a docker image to reproduce this, based on your distro? Or do you see another way how we can debug this? |
I can confirm that I cannot reproduce the issue with CryptX 0.080_006 |
@karel-m what exactly does that label mean? At first I thought that an issue tagged with this label "is fixed in ltc". On second thought, is it instead "depends on ltc to be fixed"? |
@sjaeckel the label indicates that the issue requires a fix in the libtomcrypt sources (at least, that's my opinion, which you might not share :). Maybe I should rename it to "needs a fix in libtomcrypt." FYI CryptX 0.080_006 = libtomcrypt current develop branch 12bf723b which includes many changes since CryptX 0.080. Interestingly, there were basically no changes to the Fortuna code, so I have no idea why the above reported issue seems to have disappeared. |
I share your opinion and I doubt that the underlying issue is fixed. @Leont which CPU model does the computer you were seeing this on have?
👍 |
AMD Ryzen 5 3600 6-Core Processor. |
OK, that CPU has AES-NI support. ... and is AES-NI even enabled? Never mind. I was just thinking aloud and I still don't get where the problem could originate... |
Sample fail report: http://www.cpantesters.org/cpan/report/9dadce3e-e8fc-11ed-a654-b70f1145618a
With that same perl I produced a core file and then got this stack trace: