openssl-3.2.0-alpha1 fails tests when built out-of-source #21999
Comments
Not able to reproduce this currently. That doesn't mean this isn't a real issue, but maybe I don't have enough parallelism on this system to reproduce it. The most likely cause is some kind of race condition around Consider adding |
I'll poke a bit more now, but while I do, here's a log from a failure in our packaging w/ |
A possible issue here is running the tests while Is it possible |
Hmmm. I also just tried this and have been unable to reproduce it here either. |
I'm going to play with this over the weekend and dig into it. I can still reproduce it in one environment consistently, others are more picky. Please don't worry about this bug until I've had a chance to give an update and hopefully some useful info over the next few days. Cheers! |
@thesamesam - do you have an update for us on this issue? |
No update, closing. If it still happens with alpha2 or fresh master branch checkout, please reopen. |
retested using master: I get the result:
Steps to reproduce:
Note: https://github.com/nvinson/openssl/tree/unix-Makefile.tmpl_fix includes a patch that alters how libcrypto.ld is created: it writes the file to libcrypto.ld.tmp and moves libcrypto.ld.tmp to libcrypto.ld once the file is complete. This results in build failures because libcrypto.ld.tmp is missing. configure.pm dump
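For readers unfamiliar with the write-then-rename pattern that patch uses, here is a minimal Python sketch of the general idea. It is not the patch itself: the function name `write_atomically` is made up for illustration, and unlike the patch (which uses the fixed name libcrypto.ld.tmp) this sketch picks a unique temporary name per writer. With a fixed temporary name, two concurrent writers can move the file out from under each other, which would be consistent with the "libcrypto.ld.tmp is missing" failure, though that is an inference and not something confirmed in this thread.

```python
import os
import tempfile


def write_atomically(path, text):
    """Write `text` to `path` so readers never see a half-written file.

    Hypothetical helper for illustration only; not part of OpenSSL's build.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # A unique temp name per call, in the same directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(
        dir=directory, prefix=os.path.basename(path) + "."
    )
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
        # os.replace overwrites the destination in a single rename step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that this only protects readers from partial files. It does not stop the generator from running twice.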
|
Come to think of it, I don't quite believe this, as that would have had an impact on things like |
Both @nvinson and I are sort of stunned here as well and don't really get it. I didn't reopen it immediately earlier because I wanted to try to get something more useful to debug it, and have so far failed. @nvinson later managed to come up with that recipe, which implies something is mangling the source dir. |
It's possible that running Another thing: never do parallel make with the install targets. We do have diamond dependencies there which are hard to resolve cleanly... and installation shouldn't normally need to be done in parallel anyway. |
(I'll add |
See openssl/openssl#21999 (comment) - upstream say parallelism isn't supported for the install targets. Bug: openssl/openssl#21999 Signed-off-by: Sam James <sam@gentoo.org>
Can this be closed? |
No, I only addressed the aside you mentioned for install in our packaging. It still happens for me with tests as originally reported. @nvinson's reproducer might work as well with test instead of install, not tried. |
Silly question: are you running on a smaller system, or in a containerized environment? I ask because this failure Can you recreate the issue, and run df immediately after the issue reproduces? |
My assumption has been that's from two runs of the generator and there's no locking/a missing dependency. No, I'm not running out of disk, or ram (just checked). |
Neither disk nor memory was an issue.
|
Reverting 0e55c3a seems to fix the problem for me. |
This suggests a time stamp issue, 'cause this commit should not affect day-to-day building... but could potentially do so still if the |
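As background on why time stamps matter here: make treats a target as out of date when it is missing or older than any of its prerequisites. The sketch below models that rule in Python. The function name `is_out_of_date` is illustrative and not taken from any build tool, and it deliberately ignores everything else make considers (phony targets, order-only prerequisites, and so on).

```python
import os


def is_out_of_date(target, prerequisites):
    """Rough model of make's rebuild rule, for illustration only."""
    if not os.path.exists(target):
        return True
    target_mtime = os.stat(target).st_mtime_ns
    # The target is stale if any prerequisite was modified after it.
    return any(
        os.stat(prereq).st_mtime_ns > target_mtime
        for prereq in prerequisites
    )
```

Under this rule, a checked-in generated file whose prerequisite happens to carry a newer time stamp after checkout would be regenerated even though nothing really changed.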
Just out of curiosity, what happens if you build 3.2.0 pre-release (alpha 1 or beta 1, it shouldn't matter) without the parallelism, i.e. without Could you please try that? |
@levitte I have never reproduced this bug using -j1 for any version. I have always needed -j2 or higher to reproduce. |
Yes, a serial build works fine. |
Does dropping -jx from |
There seems to be a misunderstanding about the conditions needed to trigger this bug. Therefore, let me make it clear: -j2 or higher is needed to trigger the bug. Neither @thesamesam nor I have observed this bug while building a valid make target when make was invoked with |
@nvinson I run make -j8 or -j16 all the time on out-of-source builds and I've never seen this. However I never use |
BTW you have the wrong command in the issue description, as it should be |
@nvinson wrote:
Using This should work fine: So what I usually do is: |
Really??? I build with |
@DDvO wrote:
This appears to work. I'll run it a few more times to be sure. |
Ah, looks like this meanwhile has improved! |
Not a 3.2.0 blocker -> removing the milestone. |
I'm not sure why it's not a blocker, as it's a regression for us and it's meant we haven't been able to test any of it. Running tests with -j1 seems to be a fair bit slower because actually compiling them takes a while, but I guess it's a workaround. |
You can build the tests first with |
I think we are doing that, though. But I'll check again. |
So far, it seems fine with |
+1. FWIW when developing I always find running 'make test' after changing a file to result in trying to run some test binaries before they are actually finished compiling, resulting in an error, which is why I always run |
That's an odd one, @hlandau. This issue should really be closed, perhaps after raising a few topical ones from the details we've found here. |
$(SRCDIR)/util/libcrypto.num and $(SRCDIR)/util/libssl.num were made their own targets to have 'make ordinals' reproduce them (run mknum.pl) only if needed. Unfortunately, because the shared library linker scripts depend on these .num files, we suddenly have mknum.pl run at random times when building. Furthermore, this created a diamond dependency, which disturbs parallel building because multiple mknum.pl on the same file could run at the same time. This reverts commit 0e55c3a. Fixes #21999 Partially fixes #22841 Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org> (Merged from #22890) (cherry picked from commit c08b21a)
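To illustrate the diamond dependency the revert describes, here is a toy Python model. It is only a sketch under stated assumptions: the target names (`left.ld`, `right.ld`, `shared.num`) are hypothetical stand-ins, and the builder is a naive depth-first walk, not how make or OpenSSL's generated Makefile actually schedules work. It shows that when two branches of a build each reach the same generated file without sharing any "already built" state, the generator recipe runs once per branch.

```python
def build(target, deps, runs, done=None):
    """Naive depth-first build. `done` is shared 'already built' state,
    or None to model branches that do not coordinate at all."""
    if done is not None and target in done:
        return
    for dep in deps.get(target, ()):
        build(dep, deps, runs, done)
    # "Running the recipe" is modelled as bumping a counter.
    runs[target] = runs.get(target, 0) + 1
    if done is not None:
        done.add(target)


# Hypothetical diamond: two linker scripts share one generated .num file.
DEPS = {
    "all": ["left.ld", "right.ld"],
    "left.ld": ["shared.num"],
    "right.ld": ["shared.num"],
}


def count_generator_runs(shared_state):
    """Return how many times the shared.num recipe ran."""
    runs = {}
    done = set() if shared_state else None
    build("all", DEPS, runs, done)
    return runs["shared.num"]
```

In this model `count_generator_runs(False)` counts two runs of the generator and `count_generator_runs(True)` counts one. In a real parallel build those two runs could overlap in time and write the same file concurrently, which is the hazard the commit message points at.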
In Gentoo, we build out-of-source for multilib as it makes life a lot easier (and it's a bit quicker).
In 3.2.0-alpha1, tests seem to fail in this configuration: