Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the C++ compiler to link semigroups.so #942

Merged
merged 1 commit into from
Sep 11, 2023

Conversation

fingolfin
Copy link
Contributor

Resolves #928

This is a simpler (?) alternative to PR #937 and thus closes #937 if it gets merged.

@dimpase could you verify if this also resolves the issue (modulo #938 of course) for you?

@dimpase
Copy link
Contributor

dimpase commented Sep 7, 2023

Trying to work around #943 to test this atm.

@dimpase
Copy link
Contributor

dimpase commented Sep 8, 2023

OK, I'm finally down to testing on OpenBSD. I use an external Eigen library (it's an OpenBSD package), and build/install libsemigroups as follows:

$ ./config.status --config  
'--prefix=/home/dima' '--disable-hpcombi' '--disable-backward' '--with-external-eigen' 

(these --disable-* are needed, otherwise it doesn't build)

For semigroups
I set PKG_CONFIG_PATH=/home/dima/lib/pkgconfig - that's the installation prefix of libsemigroups with lib/pkgconfig appended. It's needed for configuring external libsemigroup, using pkg-config.

For running GAP (and even for running ldd on semigroups.so) I need LD_LIBRARY_PATH= /home/dima/lib, in view of #938.

So far I tested the rebased branch on #937, it works (package loads in GAP, TestPackage("semigroups"); runs, with some errors, but test errors are another story). On OpenBSD GAP does not need to be linked with -pthread, it works with and without it.

Testing this branch now.

The semigroups GAP package is configured with

$ ./config.status --config                                                                                                                                                                                                                                                                                                               
'--with-external-libsemigroups' '--disable-hpcombi'  'PKG_CONFIG_PATH=/home/dima/lib/pkgconfig'

@dimpase
Copy link
Contributor

dimpase commented Sep 8, 2023

This branch does not play as nice with tests. TestPackage("semigroups"); ends with a coredump.

...
######## > Diff in:
/home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semiquo.tst:209
# Input is:
T = H;
# Expected output:
true
# But found:
Error, Variable: 'H' must have an assigned value
########
######## > Diff in:
/home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semiquo.tst:211
# Input is:
Factorization(H, images[2]);
# Expected output:
[ 2 ]
# But found:
Error, Variable: 'H' must have an assigned value
########
libc++abi: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc
Abort trap (core dumped)

@fingolfin
Copy link
Contributor Author

An uncaught exception sounds like a problem in libsemigroups. In any case, I don't see how that could fail in this PR and work in PR #937... Just to double check, did you also run the tests in otherwise identical setup (in particular: same gap and same Semigroups commit) with PR #937?

@fingolfin
Copy link
Contributor Author

Ah sorry, I now see you wrote in a previous comment that you first tested with PR #937 and it all passed and then did the same test with this PR here and it gave that crash, is that right?

Is the H must have an assigned value also visible with PR #937? It would be useful to have the full test logs for both PRs, perhaps you can attach them to a comment here (note that file attachments in GitHub only work with certain file extensions, such as .txt)

@dimpase
Copy link
Contributor

dimpase commented Sep 8, 2023

some tests with #937 fail, with a bad alloc error, too, but GAP stays up. So it looks like it's a slightly different error handling, more robust.

will post the logs

@dimpase
Copy link
Contributor

dimpase commented Sep 9, 2023

OK, ran tests number of times, with both this and the old branch. The new branch doesn't really display a different behaviour (crashes don't happen during every run, and anyway it's a memory corruption/leak that finally brings GAP down sometimes).

Leaking/broken test is apparently semigroups/tst/standard/semigroups/semidp.tst and semigroups/tst/standard/semigroups/semiquo.tst, possibly more, only when run with acting methods disabled.

So here what all the logs I made have in common:

...
testing: /home/dima/tmp/gap/pkg/semigroups/tst/standard/tools/io.tst
    2438 ms (1711 ms GC) and 7.58MB allocated for tools/io.tst
-----------------------------------
total   1072164 ms (115772 ms GC) and 17.0GB allocated
              0 failures in 72 files

ESC[1m#I  Running tests with acting methods disabled
ESC[0mArchitecture: x86_64-unknown-openbsd7.3-default64-kv9

testing: /home/dima/tmp/gap/pkg/semigroups/tst/standard/attributes/acting.tst
    2281 ms (1625 ms GC) and 47.6MB allocated for attributes/acting.tst
...
testing: /home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semicons.ts\
t
   22820 ms (1734 ms GC) and 105MB allocated for semigroups/semicons.tst
testing: /home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semidp.tst
######## > Diff in:
/home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semidp.tst:319
# Input is:
Size(D) = Size(T) ^ 3;
# Expected output:
ESC[30;42mtrueESC[0m
# But found:
ESC[30;41mError, std::bad_allocESC[0m
########
   10906 ms (1812 ms GC) and 109MB allocated for semigroups/semidp.tst
testing: /home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semieunit.t\
st
...
testing: /home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semiquo.tst
######## > Diff in:
/home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semiquo.tst:121
# Input is:
Size(S / cong);
# Expected output:
ESC[30;42m120ESC[0m
# But found:
ESC[30;41mError, std::bad_allocESC[0m
########
######## > Diff in:
/home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semiquo.tst:123
# Input is:
IsomorphismPermGroup(S / cong);
# Expected output:
ESC[30;42m<quotient of <semigroup congruence over <inverse partial perm semigrou\
p 
  of size 720, rank 10 with 14 generators> with congruence pair (6,1)>> -> 
<permutation group of size 120 with 2 generators>ESC[0m
# But found:
ESC[30;41mError, std::bad_allocESC[0m
########
######## > Diff in:
/home/dima/tmp/gap/pkg/semigroups/tst/standard/semigroups/semiquo.tst:138
# Input is:
Size(S);
# Expected output:
ESC[30;42m336ESC[0m
# But found:
ESC[30;41mError, std::bad_allocESC[0m
########
...

Also,when it comes to running these tests, memory used by GAP shots up by about 1GB - another indication of a leak to me.

@dimpase
Copy link
Contributor

dimpase commented Sep 9, 2023

@fingolfin - a typo to be fixed on fingolfin#1

@fingolfin
Copy link
Contributor Author

@dimpase the typo is not in this PR, though, wouldn't it make more sense to directly submit it to this repository?

@dimpase
Copy link
Contributor

dimpase commented Sep 9, 2023

It's a typo made by you in an earlier iteration. Let's not be overtly bureacratic here.

Doing it the other way would likely create a merge conflict to resolve.

@james-d-mitchell
Copy link
Collaborator

@dimpase can I have access to a machine where I can reproduce the bug? I agree with @fingolfin that this looks like an issue with the package or libsemigroups rather than the build system

@dimpase
Copy link
Contributor

dimpase commented Sep 9, 2023

@dimpase can I have access to a machine where I can reproduce the bug? I agree with @fingolfin that this looks like an issue with the package or libsemigroups rather than the build system

Absolutely, no problem. Email me the public part (.pub) of your ssh key.

@dimpase
Copy link
Contributor

dimpase commented Sep 9, 2023

@dimpase can I have access to a machine where I can reproduce the bug? I agree with @fingolfin that this looks like an issue with the package or libsemigroups rather than the build system

Absolutely, no problem. Email me the public part (.pub) of your ssh key.

Hey @james-d-mitchell , you already have an account on this machine - you accessed in in June afaict. :-)
try ssh jdm@sagemath.openbsd.amsterdam

@fingolfin
Copy link
Contributor Author

It's a typo made by you in an earlier iteration. Let's not be overtly bureacratic here.

??? This PR here may never be merged. Making a fix to something not touched by this PR only in here at best delays the fix from becoming generally available and at worst means the fix is never applied.

Doing it the other way would likely create a merge conflict to resolve.

How? My PR does not even touch configure.ac

@dimpase
Copy link
Contributor

dimpase commented Sep 9, 2023

How? My PR does not even touch configure.ac

oh, OK, I confused it with Makefile.in

@james-d-mitchell
Copy link
Collaborator

Thanks @dimpase i will try to look asap, but probably not until Monday

Copy link
Contributor

@dimpase dimpase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@james-d-mitchell
Copy link
Collaborator

Tests fail, would be better if they passed

@dimpase
Copy link
Contributor

dimpase commented Sep 9, 2023

Tests fail, would be better if they passed

Did they work before?

They all (with exception of cygwin, which I can't say much about now) seem to fail due to inability to find libsemigroups at runtime, i.e. either LD_LIBRARY_PATH must be set, or better #938 resolved)

@james-d-mitchell
Copy link
Collaborator

Tests fail, would be better if they passed

Did they work before?

They all (with exception of cygwin, which I can't say much about now) seem to fail due to inability to find libsemigroups at runtime, i.e. either LD_LIBRARY_PATH must be set, or better #938 resolved)

Yes they passed before for example in your most recent pr.

@dimpase
Copy link
Contributor

dimpase commented Sep 10, 2023

are they all running in miniconda environment with libsemigroups coming from conda?

@dimpase
Copy link
Contributor

dimpase commented Sep 10, 2023

By the way, I ran make check on libsemigroup on the OpenBSD box, and got a bad::alloc error in one test:

...
FpSemigroup 041: equal_to . . .                                          2205μs
FpSemigroup 042: cbegin/cend_rules . . .                                  177μs
FpSemigroup 043: semigroup of size 3 . . .                                458μs
FpSemigroup 044: run_for/ulibc++abi: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc
ntil . . .                                     2574μs
FpSemigroup 045: constructors . . .                                       264μs
FpSemigroup 046: set_inverses . . .                                       421μs
FpSemigroup 047: smalloverlap . . .                                        19ms
FpSemigroup 048: quaternion group Q8 . . .                                492μs

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_all is a Catch v2.13.7 host application.
Run with -? for options

-------------------------------------------------------------------------------
FpSemigroup 049: symmetric group Coxeter presentation
-------------------------------------------------------------------------------
tests/test-fpsemi.cpp:1040
...............................................................................

tests/test-fpsemi.cpp:1040: FAILED:
due to a fatal error condition:
  SIGABRT - Abort (abnormal termination) signal

FpSemigroup 049: symmetric group Coxeter presentation . . .                    
===============================================================================
test cases:    293 |    292 passed | 1 failed
assertions: 434209 | 434208 passed | 1 failed

Abort trap (core dumped) 
make: *** [Makefile:5367: check] Error 134

@james-d-mitchell
Copy link
Collaborator

Seems to fail with the external libsemigroups. Are you sure that the failing tests aren't just because the test machine has limited memory, and the tests are legitimately using large amounts of memory?

@dimpase
Copy link
Contributor

dimpase commented Sep 10, 2023

Seems to fail with the external libsemigroups. Are you sure that the failing tests aren't just because the test machine has limited memory, and the tests are legitimately using large amounts of memory?

the machine has 2GB of physical RAM and 4GB of swap; and the default user is limited to processes using 4GB of memory.

Do you really run tests which need gigabytes of RAM?

@james-d-mitchell
Copy link
Collaborator

Probably not, just checking!

@dimpase
Copy link
Contributor

dimpase commented Sep 10, 2023

@fingolfin - I reproduced these CI errors locally on Debian oldstable linux.

The problem is that these incantations on Linux don't lead to libsemigroups being linked with semigroups.so

dimpase@penguin:~/work/software/gap/pkg/semigroups$ make -j2 V=1
g++  -L/home/dimpase/lib -lsemigroups  gen/src/bipart.o gen/src/cong.o gen/src/conglatt.o gen/src/froidure-pin-base.o gen/src/froidure-pin-bipart.o gen/src/froidure-pin-bmat.o gen/src/froidure-pin-fallback.o gen/src/froidure-pin-matrix.o gen/src/froidure-pin-max-plus-mat.o gen/src/froidure-pin-min-plus-mat.o gen/src/froidure-pin-pbr.o gen/src/froidure-pin-pperm.o gen/src/froidure-pin-transf.o gen/src/pkg.o gen/src/to_gap.o gen/gapbind14/src/gapbind14.o -o bin/x86_64-pc-linux-gnu-default64-kv9/semigroups.so   -shared -fPIC
dimpase@penguin:~/work/software/gap/pkg/semigroups$ ldd bin/x86_64-pc-linux-gnu-default64-kv9/semigroups.so 
        linux-vdso.so.1 (0x00007fff349b3000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007c11eea9a000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007c11ee956000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007c11ee93c000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007c11ee768000)
        /lib64/ld-linux-x86-64.so.2 (0x00007c11efd38000)

dimpase@penguin:~/work/software/gap/pkg/semigroups$ ls -l /home/dimpase/lib/*semi*
-rw-r--r-- 1 dimpase dimpase 3775574 Sep 10 11:26 /home/dimpase/lib/libsemigroups.a
-rwxr-xr-x 1 dimpase dimpase     987 Sep 10 11:26 /home/dimpase/lib/libsemigroups.la
lrwxrwxrwx 1 dimpase dimpase      22 Sep 10 11:26 /home/dimpase/lib/libsemigroups.so -> libsemigroups.so.2.0.0
lrwxrwxrwx 1 dimpase dimpase      22 Sep 10 11:26 /home/dimpase/lib/libsemigroups.so.2 -> libsemigroups.so.2.0.0
-rwxr-xr-x 1 dimpase dimpase 1766768 Sep 10 11:26 /home/dimpase/lib/libsemigroups.so.2.0.0
dimpase@penguin:~/work/software/gap/pkg/semigroups$ ldd /home/dimpase/lib/libsemigroups.so
        linux-vdso.so.1 (0x00007ffe761c5000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007d2076671000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007d20764a4000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007d2076360000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007d207618c000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007d2076172000)
        /lib64/ld-linux-x86-64.so.2 (0x00007d207682a000)

@dimpase
Copy link
Contributor

dimpase commented Sep 10, 2023

Here is what the current main branch is doing instead

$ make V=1
"/home/dimpase/work/software/gap/gac" -d -p "-MQ "bin/x86_64-pc-linux-gnu-default64-kv9/semigroups.so" -MMD -MP -MF bin/x86_64-pc-linux-gnu-default64-kv9/.d" -P "-lstdc++   -L/home/dimpase/lib -lsemigroups " gen/src/bipart.o gen/src/cong.o gen/src/conglatt.o gen/src/froidure-pin-base.o gen/src/froidure-pin-bipart.o gen/src/froidure-pin-bmat.o gen/src/froidure-pin-fallback.o gen/src/froidure-pin-matrix.o gen/src/froidure-pin-max-plus-mat.o gen/src/froidure-pin-min-plus-mat.o gen/src/froidure-pin-pbr.o gen/src/froidure-pin-pperm.o gen/src/froidure-pin-transf.o gen/src/pkg.o gen/src/to_gap.o gen/gapbind14/src/gapbind14.o -o bin/x86_64-pc-linux-gnu-default64-kv9/semigroups.so
gcc -o bin/x86_64-pc-linux-gnu-default64-kv9/semigroups.so gen/src/bipart.o gen/src/cong.o gen/src/conglatt.o gen/src/froidure-pin-base.o gen/src/froidure-pin-bipart.o gen/src/froidure-pin-bmat.o gen/src/froidure-pin-fallback.o gen/src/froidure-pin-matrix.o gen/src/froidure-pin-max-plus-mat.o gen/src/froidure-pin-min-plus-mat.o gen/src/froidure-pin-pbr.o gen/src/froidure-pin-pperm.o gen/src/froidure-pin-transf.o gen/src/pkg.o gen/src/to_gap.o gen/gapbind14/src/gapbind14.o -shared -fPIC -lstdc++ -L/home/dimpase/lib -lsemigroups
rm -f

and this works correctly:

dimpase@penguin:~/work/software/gap/pkg/semigroups$ ldd bin/x86_64-pc-linux-gnu-default64-kv9/semigroups.so 
        linux-vdso.so.1 (0x00007ffe7517e000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007eac1b85c000)
        libsemigroups.so.2 => /home/dimpase/lib/libsemigroups.so.2 (0x00007eac1b6df000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007eac1b6c5000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007eac1b4f1000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007eac1b3ad000)
        /lib64/ld-linux-x86-64.so.2 (0x00007eac1cafa000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007eac1b38b000)

@dimpase
Copy link
Contributor

dimpase commented Sep 10, 2023

here is how to fix it - just to shuffle the arguments of the linking call around. Only checked on Linux.

--- a/Makefile.gappkg
+++ b/Makefile.gappkg
@@ -138,7 +138,7 @@ gen/%.$(GAP_OBJEXT): %.s Makefile
 # build rule for linking all object files together into a kernel extension
 $(KEXT_SO): $(KEXT_OBJS)
        @mkdir -p $(@D)
-       $(QUIET_GAC)$(GAP_CXX) $(KEXT_LDFLAGS) $(KEXT_OBJS) -o $@ $(GAP_LDFLAGS) $(GAC_LDFLAGS)
+       $(QUIET_GAC)$(GAP_CXX) -o $@ $(GAP_LDFLAGS) $(GAC_LDFLAGS)  $(KEXT_OBJS) $(KEXT_LDFLAGS)
 
 # hook into `make clean`
 clean: clean-kext

@dimpase
Copy link
Contributor

dimpase commented Sep 10, 2023

@fingolfin - please see fingolfin#2

on my fork this CI step passes.
https://github.com/dimpase/Semigroups/actions/runs/6138576172

I'll check the rest now.
Update: the standard tests pass too (on Cygwin there forking errors during execution of some tests,
but I guess it's OK).
https://github.com/dimpase/Semigroups/actions/runs/6138704922

@fingolfin
Copy link
Contributor Author

Thanks @dimpase I've updated this PR to fix the link order

@james-d-mitchell
Copy link
Collaborator

Thanks very much @fingolfin and @dimpase

@james-d-mitchell james-d-mitchell merged commit 5c5f780 into semigroups:main Sep 11, 2023
12 checks passed
@fingolfin fingolfin deleted the mh/link-cxx branch September 11, 2023 11:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

do not link libstdc++ unconditionally
3 participants