Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test error on i386 arch #59

Closed
jgmbenoit opened this issue Nov 6, 2021 · 27 comments
Closed

test error on i386 arch #59

jgmbenoit opened this issue Nov 6, 2021 · 27 comments

Comments

@jgmbenoit
Copy link
Contributor

Version 0.9.9 can build on i386 ach in Debian Sid environment but it fails the test.
In a gap session at the source root folder:

SetPackagePath("float",".");
TestPackage("float");

gives the message

Error, GET_MPFR: object must be an MPFR, not a integer in
  Inverse( r.PI ) at .//lib/mpfr.gi:73 called from 
<function "unknown">( <arguments> )
 called from read-eval loop at .//lib/mpfr.gi:324
type 'quit;' to quit to outer loop

The issue does not show up on other (official) architectures.
Any hint to fix it ?

@laurentbartholdi
Copy link
Collaborator

@fingolfin , is this the issue you've been working on? Is it fixed now?

@fingolfin
Copy link
Member

It's not the issue I was working on, or at least not that I could tell; in particular, I never observed the behavior described by @jgmbenoit . But of course it is possible that one of the changes we made affected it.

To debug this issue, we need more information by @jgmbenoit:

  • which version of GAP is that?
  • does the issue still occurs with the master branch?
  • what was the full output from starting GAP (i.e., including its banner) until that error?
  • ideally also attach the config.log for float

@jgmbenoit
Copy link
Contributor Author

The following CI log test responses to some of the questions. Otherwise the previous version 0.9.1 built well.

@laurentbartholdi
Copy link
Collaborator

This is very strange. I have a hard time guessing which object is supposed to be a MPFR float but is an integer.

As I see it, the culprit has most chances of being in the input parser. I just added a few lines to tst/arithmetic.tst to see if the problem's there. Do you have a way of re-running the test on your debian-i386 setup? I guess you don't directly have access to such a machine, otherwise it might be easier to inspect actually what goes on by typing in the lines in tst/arithmetic.tst (after SetFloats(MPFR)) and seeing exactly where GAP chokes.

@fingolfin
Copy link
Member

Let me stress that we are testing 32bit in the CI here, and it works. I also have tested both 0.9.9 and the current master on various 32bit systems, and the tests pass. However, my tests all were with GAP master, not 4.11.1 (but I don't think this matters, but I will try to rule it out)

@fingolfin
Copy link
Member

@jgmbenoit could you try again with float 1.0.1 ?

@jgmbenoit
Copy link
Contributor Author

I have just packaged it. Let see what we get.

@jgmbenoit
Copy link
Contributor Author

The error persists: see the build log.

@fingolfin
Copy link
Member

This is compiling a 32bit .so file; So it can't possibly work. What I don't get, though, is why it doesn't refuse to load float in the first place. Weird

To fix the build, add -m32 to CPPFLAGS and LDFLAGS

@fingolfin
Copy link
Member

Actually that was probably nonsense, assuming your compiler defaults to producing 32bit binaries! Sorry for the confusion

@jgmbenoit
Copy link
Contributor Author

Can you reproduce the issue on your side ?
Can it be a gap configuration issue ?

@fingolfin
Copy link
Member

So far I was not able to reproduce the issue. However, I only had an Ubuntu 20.04 64bit system on which I installed 32bit tooling. I'll try and see if I can set up a "proper" Debian 32bit VM

@fingolfin
Copy link
Member

@jgmbenoit a bit off-topic, but: I just tried to apt install gap-core, and that resulted in:

The following NEW packages will be installed:
  adwaita-icon-theme at-spi2-core dbus dbus-bin dbus-daemon dbus-session-bus-common dbus-system-bus-common
  dbus-user-session dconf-gsettings-backend dconf-service dmsetup dvisvgm fontconfig fonts-droid-fallback
  fonts-lmodern fonts-mathjax fonts-mathjax-extras fonts-noto-mono fonts-urw-base35 gap gap-alnuth gap-atlasrep
  gap-autpgrp gap-character-tables gap-core gap-dev gap-doc gap-factint gap-gapdoc gap-io gap-libs gap-online-help
  gap-polycyclic gap-primgrp gap-smallgrp gap-table-of-marks gap-transgrp gtk-update-icon-cache hicolor-icon-theme
  libapparmor1 libargon2-1 libatk-bridge2.0-0 libatk1.0-0 libatk1.0-data libatspi2.0-0 libauthen-sasl-perl
  libavahi-client3 libavahi-common-data libavahi-common3 libcairo-gobject2 libcairo2 libclone-perl libcolord2
  libcryptsetup12 libcups2 libdata-dump-perl libdatrie1 libdbus-1-3 libdconf1 libdevmapper1.02.1 libdrm-amdgpu1
  libdrm-common libdrm-intel1 libdrm-nouveau2 libdrm-radeon1 libdrm2 libelf1 libencode-locale-perl libepoxy0
  libfile-basedir-perl libfile-desktopentry-perl libfile-listing-perl libfile-mimeinfo-perl libfont-afm-perl
  libfontenc1 libfribidi0 libgap-dev libgap7 libgdk-pixbuf-2.0-0 libgdk-pixbuf2.0-bin libgdk-pixbuf2.0-common
  libgl1 libgl1-mesa-dri libglapi-mesa libglib2.0-0 libglib2.0-data libglvnd0 libglx-mesa0 libglx0 libgraphite2-3
  libgs9 libgs9-common libgtk-3-0 libgtk-3-bin libgtk-3-common libharfbuzz0b libhtml-form-perl libhtml-format-perl
  libhtml-parser-perl libhtml-tagset-perl libhtml-tree-perl libhttp-cookies-perl libhttp-daemon-perl
  libhttp-date-perl libhttp-message-perl libhttp-negotiate-perl libice6 libicu67 libidn12 libijs-0.35
  libio-html-perl libio-socket-ssl-perl libio-stringy-perl libip4tc2 libipc-system-simple-perl libjbig2dec0
  libjs-mathjax libjson-c5 libkmod2 libkpathsea6 liblcms2-2 libllvm12 liblua5.3-0 liblwp-mediatypes-perl
  liblwp-protocol-https-perl libmailtools-perl libnet-dbus-perl libnet-http-perl libnet-smtp-ssl-perl
  libnet-ssleay-perl libnss-systemd libopenjp2-7 libpam-systemd libpango-1.0-0 libpangocairo-1.0-0
  libpangoft2-1.0-0 libpaper-utils libpaper1 libpciaccess0 libpixman-1-0 libptexenc1 libreadline8 librsvg2-2
  librsvg2-common libsensors-config libsensors5 libsm6 libsynctex2 libteckit0 libtexlua53 libtexluajit2
  libtext-iconv-perl libthai-data libthai0 libtie-ixhash-perl libtimedate-perl libtry-tiny-perl liburi-perl
  libvte-2.91-0 libvte-2.91-common libvulkan1 libwayland-client0 libwayland-cursor0 libwayland-egl1 libwoff1
  libwww-perl libwww-robotrules-perl libx11-protocol-perl libx11-xcb1 libxaw7 libxcb-dri2-0 libxcb-dri3-0
  libxcb-glx0 libxcb-present0 libxcb-randr0 libxcb-render0 libxcb-shape0 libxcb-shm0 libxcb-sync1 libxcb-xfixes0
  libxcomposite1 libxcursor1 libxdamage1 libxfixes3 libxft2 libxi6 libxinerama1 libxkbcommon0 libxkbfile1
  libxml-parser-perl libxml-twig-perl libxml-xpathengine-perl libxml2 libxmu6 libxrandr2 libxrender1 libxshmfence1
  libxt6 libxtst6 libxv1 libxxf86dga1 libxxf86vm1 libz3-4 libzzip-0-13 lmodern mesa-vulkan-drivers pari-doc
  pari-elldata pari-galdata pari-gp pari-seadata perl-openssl-defaults poppler-data readline-common
  shared-mime-info systemd systemd-sysv systemd-timesyncd t1utils termit tex-common texlive-base texlive-binaries
  x11-common x11-utils x11-xserver-utils xdg-user-dirs xdg-utils xfonts-encodings xfonts-utils xkb-data

and I was like "whaaaaat????" I don't get why it has so many dependencies?! E.g. why is mesa and x11 in there?

@jgmbenoit
Copy link
Contributor Author

Have you try without installing the Recommended packages ?
It might shorten the list.
Hint: in /etc/apt/apt.conf put

APT:Install-Recommends "0";

@jgmbenoit
Copy link
Contributor Author

Could you finally reproduce the issue ?

@fingolfin
Copy link
Member

Nope.

@jgmbenoit
Copy link
Contributor Author

Have you tried in the current Sid (unstable) environment ? in a i386 one ?

@fingolfin
Copy link
Member

No. I tried but I gave up after spending an hour making it work. Sorry, but there is only so much time I can invest in debugging a package that I do not maintain and that is not essential to core GAP.

Others (e.g. @laurentbartholdi the maintainer, or other parties interested in making it work) will have to take care of this one, I am afraid.

@jgmbenoit
Copy link
Contributor Author

Okay. Fair enough.

@fingolfin
Copy link
Member

"Good" (?) news we also still get a crash when testing float 1.0.1 in 32bit mode within the GAP CI. Alas a quite different segfault:

Loading float ... 
gap: /home/runner/work/gap/gap/src/stringobj.h:140: CONST_CSTR_STRING: Assertion `IS_STRING_REP(list)' failed.
Abort

@fingolfin
Copy link
Member

This is with a GAP debug build so it notices corruption earlier

@fingolfin
Copy link
Member

I finally was able to repro the crash in the GAP CI: I think the problem is caused by -malign-double added by m4/ax_cc_maxopt.m4. I've not analyzed why / how it causes the issue (other than speculation), but it seems generally plausible, given that this potentially affects the ABI of packages using double, such as all the libraries float uses...

That said, the crash there is different from what @jgmbenoit observes, but it might be worth a try to see if it helps there, too...

Also, one complication and the reason why I couldn't replicate this locally at first was that I was testing with float from git (using the v1.0.1 tag), and there it worked! I only was able to repro the crash with the float-1.0.1.tar.gz. I don't understand it yet, but apparently the different autotools version used to make that tarball vs. what I have locally lead to a behavior difference. In the end, I noticed this when diffing the two directories:

  • with float from git, I got: CFLAGS = -m32
  • with the float 1.0.1 tarball, I got: CFLAGS = -O3 -fomit-frame-pointer -malign-double -fstrict-aliasing -ffast-math -march=corei7-avx

I am not sure yet why this difference happens, but it certainly was helpful :-)

There are other suspicious things in there: the -march is not a good idea (als for the debian package), and also -ffast-math is dangerous. Honestly, I think m4/ax_cc_maxopt.m4 overall is a bad idea, I removed it from GAP packages I maintain long ago, I would recommend doing the same here.

@laurentbartholdi
Copy link
Collaborator

laurentbartholdi commented Dec 13, 2021 via email

@fingolfin
Copy link
Member

See also PR #76

@fingolfin
Copy link
Member

Could you make a new release? Even if it doesn't help @jgmbenoit (I hope it does), it would help unbreak the GAP CI. Thanks!

@jgmbenoit
Copy link
Contributor Author

Release 1.0.2 builds well in the current i386 Debian Sid environment. So I am closing the issue. Thanks a lot.

@laurentbartholdi
Copy link
Collaborator

laurentbartholdi commented Dec 15, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants