build randomly fails with automake-1.16 #529

Closed
bmwiedemann opened this issue Feb 28, 2019 · 25 comments · Fixed by #686

@bmwiedemann

When building heimdal-7.5.0 in openSUSE with make -j1, we get

Making all in hcrypto
make[2]: Entering directory '/home/abuild/rpmbuild/BUILD/heimdal-7.5.0/lib/hcrypto'
  CC       test_rand.o
In file included from test_rand.c:42:
rand.h:46:10: fatal error: hcrypto/engine.h: No such file or directory
 #include <hcrypto/engine.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [Makefile:2064: test_rand.o] Error 1
make[2]: Leaving directory '/home/abuild/rpmbuild/BUILD/heimdal-7.5.0/lib/hcrypto'
make[1]: *** [Makefile:564: all-recursive] Error 1

to reproduce:

osc checkout openSUSE:Factory/libheimdal && cd $_
osc build --noservice -j1

I think a dependency is missing from lib/hcrypto/Makefile.am; in a parallel build, the missing file usually gets created in time by another job.

This bug was found while working on reproducible builds for openSUSE.

@MilhouseVH

I see this same failure on Ubuntu 17.10 when building heimdal-7.7.0 with automake-1.16 and -j8.

I have no problem building heimdal-7.7.0 with automake-1.15.1 (with -j8), so there's an issue when building with automake-1.16 (also automake-1.16.1).

@bmwiedemann what version of automake are you using?

@bmwiedemann
Author

We are building with automake-1.16.1 in openSUSE:Factory. Building for Leap:15.1 with automake-1.15.1 indeed does not fail in this way. So maybe libheimdal triggers a problem/regression in automake-1.16 ?

@MilhouseVH

MilhouseVH commented Jun 18, 2019

The following automake commit seems to be the trigger for the heimdal failure:

http://git.savannah.gnu.org/cgit/automake.git/commit/?id=f4e91bfc490da63209aad19636568da3b955dcd4&h=master

I can apply this commit on automake-1.15.1 and reproduce the above failure.

I've no idea if this is a heimdal or automake issue.

@MilhouseVH

@bmwiedemann it might be worth updating the title of this issue to reference automake-1.16+ as I don't think the problem has anything to do with -j1.

@bmwiedemann bmwiedemann changed the title from "build fails with make -j1" to "build randomly fails with automake-1.16" Jul 8, 2019
@MilhouseVH

@nicowilliams sorry for the ping, but any ideas on this one? automake-1.16 is starting to be used more widely (it's been available for 19 months) and heimdal isn't compatible - the lack of any similar reports against automake would suggest this is a heimdal issue...

@jaltman
Member

jaltman commented Oct 6, 2019

@MilhouseVH my opinion is that, since you have identified the specific change that introduced a regression in automake 1.16, the bug report should be filed against automake. The commit message for http://git.savannah.gnu.org/cgit/automake.git/commit/?id=f4e91bfc490da63209aad19636568da3b955dcd4&h=master indicates that there should be no change in behavior. Shorter pathnames are supposed to be used when doing so would not introduce an ambiguity. Clearly that change breaks Heimdal. Notifying the author of the unintended side effect of the commit is the appropriate thing to do.

Unfortunately Heimdal will need to work around this regression as well.

@MilhouseVH

MilhouseVH commented Oct 6, 2019

@jaltman thanks for the reply. And thanks for prompting me to dig a little deeper - I have in fact now found an automake email discussion[1] where the subdir-objects commit[2] has been acknowledged as causing a regression[3], although not in exactly the same way that it fails for heimdal but maybe close enough (I myself don't have an issue building gnulib with automake-1.16.1 and GNU make-4.2.1).

There may be a work-around[4] but I'm not entirely sure how this can be applied to heimdal as to be quite honest I have limited understanding of Makefile syntax - right now, I'm just stuck in the middle... 😄

I've sent a reply to the automake mailing list (a reply to Thomas Martitz[3] - I've subscribed, but no idea when my post will appear) with details of this heimdal issue in the hope of a potential workaround, but also to inform any potential fix.

  1. https://lists.gnu.org/archive/html/automake/2019-08/msg00000.html
  2. https://lists.gnu.org/archive/html/automake/2019-08/msg00004.html
  3. https://lists.gnu.org/archive/html/automake/2019-08/msg00010.html
  4. https://lists.gnu.org/archive/html/automake/2019-08/msg00005.html

@MilhouseVH

My mailing list post is here:

https://lists.gnu.org/archive/html/automake/2019-10/msg00000.html

@MilhouseVH

This heimdal build failure continues to be an issue with the newly released automake-1.16.2.

Either this isn't an automake issue, or the automake maintainers don't consider it an issue worth fixing. Could the heimdal developers possibly reach out to automake directly to try to resolve this?

@jaltman jaltman added the bug label Apr 7, 2020
@jaltman jaltman added this to To do in Heimdal 8 Release via automation Apr 7, 2020
@jaltman jaltman added this to the Heimdal 8 milestone Apr 7, 2020
@lhoward lhoward self-assigned this Apr 7, 2020
@lhoward
Member

lhoward commented Apr 7, 2020

The comments in the mailing list messages you referenced suggest it's something to do with pathname equivalences, e.g. ./foo/libbar.a not matching foo/libbar.a.
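
As a rough illustration of that kind of mismatch (not automake's or make's actual logic), two spellings of the same path compare unequal as plain strings until a leading `./` is stripped. The helper name `strip_dot_slash` is hypothetical and exists only for this sketch:

```shell
#!/bin/sh
# Remove any leading "./" components so that "./foo/libbar.a" and
# "foo/libbar.a" end up with the same spelling.
# Purely textual; it never touches the filesystem.
strip_dot_slash() {
    p=$1
    while [ "${p#./}" != "$p" ]; do
        p=${p#./}
    done
    printf '%s\n' "$p"
}

a=./foo/libbar.a
b=foo/libbar.a

# Compare the raw spellings first, then the normalised ones.
if [ "$a" = "$b" ]; then
    echo "raw: same"
else
    echo "raw: different"
fi
if [ "$(strip_dot_slash "$a")" = "$(strip_dot_slash "$b")" ]; then
    echo "normalised: same"
else
    echo "normalised: different"
fi
```

A tool that keys its dependency tracking on the raw spelling would treat the two names as unrelated files, which is the class of problem the mailing list messages appear to describe.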

@lhoward
Member

lhoward commented Apr 7, 2020

I'm trying with automake-1.16.2 on macOS but haven't been able to duplicate the issue yet.

@lhoward
Member

lhoward commented Apr 7, 2020

With automake-1.16.2 (and latest autoconf/libtool) on Ubuntu 18.04.3 LTS and Heimdal master I can't seem to duplicate this using make -j1. I wonder if the issue is that in your build system the built artefacts go outside the source tree?

@lhoward
Member

lhoward commented Apr 7, 2020

OK. Could you please try grabbing commit cc6a3f3 and seeing if that fixes it?

@bmwiedemann
Author

Commit cc6a3f3 fixed it for me (still with automake-1.16.1). If others don't object, this issue can be closed.

@lhoward
Member

lhoward commented Apr 7, 2020

I'll leave it to @jaltman to close, in case he wants to cut another release.

@MilhouseVH

@lhoward many thanks for taking a look at this!

I'm testing with automake-1.16.2 (also tested with automake-1.16.1 but result is the same) and unfortunately cc6a3f3 doesn't appear to be a complete fix, although it does move the failure on a step.

Previously, heimdal-7.7.0 (unpatched) with automake-1.16.2 would fail as follows:

Making all in hcrypto
make[3]: Entering directory '/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/.x86_64-linux-gnu/lib/hcrypto'
  CC       test_rand.o
In file included from /home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/lib/hcrypto/test_rand.c:42:0:
/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/lib/hcrypto/rand.h:46:10: fatal error: hcrypto/engine.h: No such file or directory
 #include <hcrypto/engine.h>
          ^~~~~~~~~~~~~~~~~~
compilation terminated.
make[3]: *** [Makefile:2061: test_rand.o] Error 1
make[3]: Leaving directory '/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/.x86_64-linux-gnu/lib/hcrypto'
make[2]: *** [Makefile:566: all-recursive] Error 1
make[2]: Leaving directory '/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/.x86_64-linux-gnu/lib'
make[1]: *** [Makefile:614: all-recursive] Error 1
make[1]: Leaving directory '/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/.x86_64-linux-gnu'

(build log: http://ix.io/2h34)

With cc6a3f3, heimdal-7.7.0 and automake-1.16.2 now fail as follows - heimdal seems to get a little further in the build than hcrypto:

make  all-am
make[4]: Entering directory '/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/.x86_64-linux-gnu/lib/hx509'
  GEN    /home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/lib/hx509/hx509-protos.h
updating hx509-protos.h
  GEN    /home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/lib/hx509/hx509-private.h
updating hx509-private.h
  GEN    hxtool-commands.h
;  CC       hxtool.o
In file included from /home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/lib/hx509/hxtool.c:34:0:
/home/ubuntu/projects/LibreELEC.tv/build.LibreELEC-RPi2.arm-9.80-devel/build/heimdal-7.7.0/lib/hx509/hx_locl.h:66:10: fatal error: ocsp_asn1.h: No such file or directory
 #include <ocsp_asn1.h>
          ^~~~~~~~~~~~~
compilation terminated.
make[4]: *** [Makefile:1320: hxtool.o] Error 1

(build log: http://ix.io/2h33, and build log with V=1: http://ix.io/2h3o)

ocsp_asn1.h has not been generated, and does not exist.

In both cases I'm building heimdal with export MAKEFLAGS=-j1
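
When comparing failures like the two above, it can help to pull just the missing header names out of a build log. A minimal sketch, assuming GCC-style `fatal error: ...: No such file or directory` lines; the function name `missing_headers` is hypothetical and not part of Heimdal or automake:

```shell
#!/bin/sh
# Print each header the compiler reported as missing, once per name.
# Reads a build log on stdin; assumes GCC-style "fatal error:" messages.
missing_headers() {
    sed -n 's/.*fatal error: \(.*\): No such file or directory.*/\1/p' | sort -u
}
```

It could be run as `missing_headers < build.log`, where `build.log` is a placeholder for a saved copy of the build output.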

@MilhouseVH

Testing automake-1.16.2 with/without cc6a3f3 and MAKEFLAGS=-j8 is actually successful!

-j8 with cc6a3f3: log
-j8 without cc6a3f3: log

Although I doubt it will always build reliably with -j8, as there is clearly race-related behaviour.

But with -j1, it always fails as per the previous post.

@jaltman
Member

jaltman commented Apr 7, 2020

Since "master" 7055365 doesn't experience the problem, there must be more commits on master that fix build dependency issues and need to be cherry-picked to "heimdal-7-1-branch".

@jaltman
Member

jaltman commented Apr 7, 2020

@MilhouseVH please try building c4cff68 which is the new tip of the heimdal-7-1-branch

@lhoward
Member

lhoward commented Apr 7, 2020

BTW OT but @bmwiedemann – when did SuSE switch back to Heimdal?

@MilhouseVH

@jaltman I will build latest heimdal master and get back to you with feedback. Unfortunately my remote build server VM fell off the network a few hours ago which may delay me!

@lhoward
Member

lhoward commented Apr 8, 2020

@MilhouseVH I think @jaltman is asking you to build c4cff68 which is the new tip of the heimdal-7-1-branch, rather than master (which is not in a shippable state).

@MilhouseVH

@jaltman ah OK thanks - as soon as I re-establish contact with my VM I'll give it a go!

@MilhouseVH

@jaltman, @lhoward: I've successfully built c4cff68 (heimdal-7-1-branch) and can confirm this branch has no issue with automake-1.16.2 (and make-4.3, just for completeness). I tested both -j1 and also -j8, building both multiple times and had no failures. Many thanks!
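
Because the failure was intermittent, a single successful build proves little; repeating the build is what gives confidence. A minimal sketch of such a loop, assuming a POSIX shell; the function name `repeat_build` is hypothetical and this is not the script actually used in the test above:

```shell
#!/bin/sh
# Run a command N times and report how many runs failed.
# Usage: repeat_build N command [args...]
# Returns success only if every run succeeded.
repeat_build() {
    n=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$n" ]; do
        i=$((i + 1))
        # Discard the command's output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
    done
    echo "$failures/$n runs failed"
    [ "$failures" -eq 0 ]
}
```

For example, `repeat_build 5 sh -c 'make clean && make -j1'` would rebuild from a clean tree five times, assuming an already configured build directory; without the clean step, later runs would find nothing left to build.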

@jaltman
Member

jaltman commented Apr 8, 2020

@MilhouseVH thanks for testing. I'm closing this issue.
