Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
…e system to be present) Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
I never trusted the travis builds too much, so I did a bunch of sbuilds locally.
Now cosmic was forked just a few weeks ago, so something in the overall releases has changed.
But when I enter the chroot, not only do links look sane:
In addition, I can enter the very same
I get the feeling this is limited to chroots; I'll be building in VMs later to test that theory.
I have built it on all arches here. You can find buildlogs there. It works on all of them except ppc64el, but I actually believe this is a race with something on the FS not being ready. I wonder whether cp has always used --reflink=auto, or whether that might be related.
After re-running enough builds I'd want to summarize:
I seem to have reduced (but not understood) the race window when accessing the files before dh_install.
I'm still puzzled by the symlink issue on dh_install
But it shows me this:
Doesn't that prove it is not a symlink loop, only for it to then fail with exactly that error?
Interesting - today it works in Ubuntu Disco, but not yet in Debian unstable. At least I can reproduce it at the time in d/rules. And there I realized that the file failing with "Too many levels of symbolic links" is not even the one that is a symlink.
More and more, I think cp misdetects some other error as -ELOOP and the error message is misleading.
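One way to check the "not a symlink loop" claim independently of cp is to walk the link chain one hop at a time. The following is a minimal Python sketch, not something from this branch: `follow_links` and `max_hops` are hypothetical names, and the helper only inspects the final path component and resolves targets lexically.

```python
import errno
import os


def follow_links(path, max_hops=40):
    """Follow a symlink chain hop by hop (hypothetical helper).

    Returns (final_path, hops). Raises OSError with errno ELOOP only if
    the chain revisits a path or needs more than max_hops hops.
    """
    seen = set()
    hops = 0
    current = path
    while os.path.islink(current):
        if current in seen or hops >= max_hops:
            raise OSError(errno.ELOOP, os.strerror(errno.ELOOP), path)
        seen.add(current)
        target = os.readlink(current)
        # A relative target is resolved against the link's own directory;
        # an absolute target replaces the path entirely.
        current = os.path.normpath(
            os.path.join(os.path.dirname(current), target)
        )
        hops += 1
    return current, hops
```

If this returns normally for the file that cp complains about, the chain on disk is finite, which would support the suspicion that the ELOOP comes from somewhere other than a real loop.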
Hmm, now after about a day of experiments (when I added too much debug output it went away) it seems to be no longer reproducible on my local system. Maybe whatever fix was pushed to Ubuntu Disco has now also appeared in Debian unstable? It now worked 4/4 times for me, while formerly I had a 100% failure rate. Next I rebuilt it in a place where I can run multi-arch autopkgtests on it later on, to be sure how that behaves now. But there one architecture (ppc64el) still hits the same bug. I'll continue to respin different ideas whenever I have time, as this is very odd ...
The ppc64el build seems to be the one most exposed to the bug. Even with a lot of extra actions in place it still hits the issue. I was even desperate enough to try a silly sleep/sync in the build, and that still fails (good for repro, bad for me). Now maybe I could at least get an strace of the cp from ppc64el.
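For reference, a sleep/sync style workaround of the kind mentioned above can be sketched as follows. This is an illustrative Python helper (`copy_with_retry` and its parameters are hypothetical names), not the change that was actually tried in d/rules, and as noted that approach did not help on ppc64el.

```python
import os
import shutil
import time


def copy_with_retry(src, dst, attempts=3, delay=0.5):
    """Copy src to dst without following symlinks, retrying on OSError.

    Between attempts the helper flushes filesystem buffers and sleeps,
    in the hope that a racing writer has finished by the next try.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            # follow_symlinks=False recreates a symlink instead of
            # copying the file it points to.
            return shutil.copy2(src, dst, follow_symlinks=False)
        except OSError as exc:
            last_error = exc
            os.sync()
            time.sleep(delay)
    raise last_error
```

A retry like this can only hide a timing problem. If the failure is deterministic for a given file, every attempt fails the same way.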
On travis it builds fine, only the autopkgtests fail.
Yeah, it is a race for sure - any build environment might be fine or not. I didn't find any differentiator yet. But I'd want to autopkgtest it on all architectures to be sure, so I'd want to build it. But I'm lost, as I'm down to strace telling me:
@bzed - does it even fail on travis autopkgtest with the fixes I added here? I might just run the test in a local VM to be sure (giving up on waiting for the ppc64el build).
With the branch here applied I ran the autopkgtest and it LGTM now.
$ sudo autopkgtest -o verify-gpsd-test --no-built-binaries --apt-upgrade --shell --setup-commands="add-apt-repository ppa:ci-train-ppa-service/3522; apt update; apt -y upgrade" gpsd_3.18.1-2~ppa16.dsc -- qemu --qemu-options='-cpu host' --ram-size=4096 --cpus 8 ~/work/autopkgtest-disco-amd64.img
I posted a full log of this here.
If anything, I'd be somewhat afraid of this odd build issue, but you are right: if it builds fine for you, please feel free to go on. I don't have many more ideas on how to debug it - it is no symlink loop but fails as one :-/
What are the next steps you'd suggest - try uploading and see what happens? Maybe some testing of 3.18 on real devices first?
As a reference, look at this build log. Even in an iteration (for the race to close) it repeatedly shows:
My wild guess: fakeroot is broken...
I also ran dmesg in the build env after the error, but got no new insight from that :-( The tests are good, as shown a few comments before. For the potential build issue I'm out of ideas for now.
good case (s390x):
Bad case:
Both lstat calls show that it is a link -> S_IFLNK. You can see that in the good case cp detects that it is a symlink, and then uses readlink+symlinkat to "copy" it. It seems to me that it is a symlink on disk, but within fakeroot's stat it is not. A new strace was released just recently; it is complete in Debian and in proposed in Ubuntu. BTW - Thanks @apw for helping me to read most of that strace output! Sadly I still don't know what to do about it now :-/
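One documented way to get "Too many levels of symbolic links" without any loop is open(2) with O_NOFOLLOW: it fails with ELOOP when the final path component is a symlink. The Python sketch below mirrors the good-case sequence (lstat, then readlink plus symlink) and simulates a stat layer that misreports the file type through a hypothetical `is_link` override. Whether cp actually took this path in the bad case is an assumption, since the bad-case trace is not reproduced here.

```python
import os
import shutil
import stat


def copy_no_dereference(src, dst, is_link=None):
    """Copy src to dst without dereferencing a final symlink.

    is_link overrides the lstat() result so a caller can simulate a stat
    layer that reports a symlink as a regular file (demo assumption).
    """
    if is_link is None:
        is_link = stat.S_ISLNK(os.lstat(src).st_mode)
    if is_link:
        # Good case: recreate the link instead of copying its target.
        os.symlink(os.readlink(src), dst)
        return "symlink"
    # Regular-file path: O_NOFOLLOW makes open() fail with ELOOP if src
    # turns out to be a symlink after all (Linux behaviour).
    fd = os.open(src, os.O_RDONLY | os.O_NOFOLLOW)
    with os.fdopen(fd, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return "file"
```

Calling `copy_no_dereference(link, dst, is_link=False)` on a real symlink raises `OSError` with `errno.ELOOP` on Linux, which is the same error text as in the build log even though no loop exists.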
I think we are good. The build issue that is left is odd and makes no sense; if anything, it seems like a fakeroot fix is needed on the Ubuntu builders. @bzed - so what are the next steps now?
Or is there anything in the MP here that I have to fix for you?
Actually, I think I came up with a quirk for the build issue. I'll be pushing the fix to this MP after another round of test builds (cleaning up all my debug changes). P.S. I also have a new theory about what causes all of this - the multi-scons build (per python version) will end up "make installing" files multiple times - that is uncommon and might be related.
…ME with a quirk to avoid issues by broken fakeroot Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Ok, the quirk is working fine on all architectures for me - give it a check and let me know what you think.
The test failure is some wget issue not related to this PR. Therefore I wanted to ping about merging and uploading, if that is OK for you?
A few issues with the tests were mentioned; this should fix them.
Also, all install files follow the LIBGPSSONAME rename, so let's do the same for libgps itself as well.