Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken CI on master / conflict with semigroups #4878

Closed
fingolfin opened this issue Apr 22, 2022 · 14 comments
Closed

Broken CI on master / conflict with semigroups #4878

fingolfin opened this issue Apr 22, 2022 · 14 comments
Labels
regression A bug that only occurs in the branch, not in a release topic: ci Anything related to GitHub Actions, Codecov, AppVeyor, Coveralls, Travis, ... topic: packages issues or PRs related to package handling, or specific to a package (for packages w/o issue tracker)
Milestone

Comments

@fingolfin
Copy link
Member

A recent package update has broken our CI tests, specifically the testpackages jobs. I am looking into figuring out what exactly broke them. Also, the PackageDistro CI will be improved to reduce the likelihood of this happening again, see gap-system/PackageDistro#324

@fingolfin fingolfin added regression A bug that only occurs in the branch, not in a release topic: packages issues or PRs related to package handling, or specific to a package (for packages w/o issue tracker) topic: ci Anything related to GitHub Actions, Codecov, AppVeyor, Coveralls, Travis, ... labels Apr 22, 2022
@fingolfin
Copy link
Member Author

Unfortunately this breakage is caused by semigroups 4.0.1 (CC @james-d-mitchell). I am going to revert semigroups in the package distribution to the previous version for now, to unbreak CI here. Then we can investigate what is wrong.

@james-d-mitchell
Copy link
Contributor

Thanks for raising this, and apologies for breaking the CI @fingolfin, any chance you can link me to the log showing the failure?

@fingolfin
Copy link
Member Author

@james-d-mitchell to be clear I don't think it'd be fair to see that "you" broke the CI. If at all, most like it's my patch for semigroups that's the culprit (but I actually did not yet have time to dig into what exactly is wrong). Also, of course we before broke semigroups, so... Let's not dwell too much on who broken what, just let's find a way out of this :-).

So one mistake I made with the package distro was to not also run the GAP test suite. I feel pretty stupid about that in retrospect sigh.

Next, I just realized that semigroups doesn't run CI tests against GAP master right now(again, in hindsight, this is pretty obvious, I just didn't reflect up on this), so of course CI tests for my semigroups PR did not test there -- but I did test it locally -- but I only tested that the semigroups test suite passes, not the GAP test suite sigh.

Anyway, enough of that. I can reproduce the breakage locally by running the GAP master testinstall testsuite after loading Semigroups. I.e.:

gap> LoadPackage("semigroups");
gap> ReadGapRoot("tst/testinstall.g");
...
testing: /Users/mhorn/Projekte/GAP/gap/tst/testinstall/ctbl.tst
########> Diff in /Users/mhorn/Projekte/GAP/gap/tst/testinstall/ctbl.tst:311
# Input is:
IsCharacterTable( t mod 3 );
# Expected output:
true
# But found:
Error, arg[1] must be mutable
########
########> Diff in /Users/mhorn/Projekte/GAP/gap/tst/testinstall/ctbl.tst:313
# Input is:
IsCharacterTable( t mod 5 );
# Expected output:
true
# But found:
Error, arg[1] must be mutable
########
...
testing: /Users/mhorn/Projekte/GAP/gap/tst/testinstall/grpmat.tst
########> Diff in /Users/mhorn/Projekte/GAP/gap/tst/testinstall/grpmat.tst:49
# Input is:
IsTransitive( Image( IsomorphismPermGroup( SO( 1, 8, 2 ) ) ) );
# Expected output:
true
# But found:
Error, arg[1] must be mutable
########
     647 ms (208 ms GC) and 448MB allocated for grpmat.tst
...
testing: /Users/mhorn/Projekte/GAP/gap/tst/testinstall/meataxe.tst
########> Diff in /Users/mhorn/Projekte/GAP/gap/tst/testinstall/meataxe.tst:30
# Input is:
IdGroup(MTX.ModuleAutomorphisms(M2));
# Expected output:
[ 1344, 11301 ]
# But found:
Error, arg[1] must be mutable
########
     343 ms (272 ms GC) and 237MB allocated for meataxe.tst
...
testing: /Users/mhorn/Projekte/GAP/gap/tst/testinstall/random.tst
########> Diff in /Users/mhorn/Projekte/GAP/gap/tst/testinstall/random.tst:98
# Input is:
randomTest(CyclicGroup(IsMatrixGroup, GF(2), 3), Random);
# Expected output:
# But found:
Error, arg[1] must be mutable
########
    1074 ms (96 ms GC) and 292MB allocated for random.tst
...

There are more places with failures, I just selected a few.

@fingolfin
Copy link
Member Author

Digging a bit deeper into one of these:

gap> G:=SymmetricGroup(3);;
gap> M:=PermutationGModule(G,GF(2));
rec( IsOverFiniteField := true, dimension := 3, field := GF(2),
  generators := [ <an immutable 3x3 matrix over GF2>, <an immutable 3x3 matrix over GF2>
     ], isMTXModule := true )
gap> M2:=TensorProductGModule(M,M);
rec( IsOverFiniteField := true, dimension := 9, field := GF(2),
  generators := [ <an immutable 9x9 matrix over GF2>, <an immutable 9x9 matrix over GF2>
     ], isMTXModule := true )
gap> MTX.ModuleAutomorphisms(M2);
Error, arg[1] must be mutable at GAPROOT/lib/listcoef.gi:79 called from
AddRowVector( row2, coeffs[head], x
 ); at GAPROOT/lib/matrix.gi:2541 called from
DoSemiEchelonMatTransformationDestructive( DefaultFieldOfMatrix( mat ), mat
 ) at GAPROOT/lib/matrix.gi:2576 called from
SemiEchelonMatTransformationDestructive( mat
 ) at GAPROOT/lib/matrix.gi:2187 called from
TriangulizedNullspaceMatDestructive( MutableCopyMatrix( mat )
 ) at GAPROOT/lib/matrix.gi:2180 called from
TriangulizedNullspaceMat( grpalg[1]
 ) at GAPROOT/lib/meatauto.gi:513 called from
...  at *stdin*:5
you can 'quit;' to quit to outer loop, or
you can 'return;' to continue
brk> Where(20);
AddRowVector( row2, coeffs[head], x
 ); at GAPROOT/lib/matrix.gi:2541 called from
DoSemiEchelonMatTransformationDestructive( DefaultFieldOfMatrix( mat ), mat
 ) at GAPROOT/lib/matrix.gi:2576 called from
SemiEchelonMatTransformationDestructive( mat
 ) at GAPROOT/lib/matrix.gi:2187 called from
TriangulizedNullspaceMatDestructive( MutableCopyMatrix( mat )
 ) at GAPROOT/lib/matrix.gi:2180 called from
TriangulizedNullspaceMat( grpalg[1]
 ) at GAPROOT/lib/meatauto.gi:513 called from
SpinHomFindVector( work ) at GAPROOT/lib/meatauto.gi:703 called from
SpinHom( m1, m2 ) at GAPROOT/lib/meatauto.gi:1654 called from
SMTX.BasisModuleHomomorphisms( m, m
 ) at GAPROOT/lib/meatauto.gi:1668 called from
SMTX.BasisModuleEndomorphisms( M ) at GAPROOT/lib/meatauto.gi:1071 called from
ProperModuleDecomp( stack[i][2] ) at GAPROOT/lib/meatauto.gi:1184 called from
SMTX.Indecomposition( m ) at GAPROOT/lib/meatauto.gi:1271 called from
MTX.HomogeneousComponents( m ) at GAPROOT/lib/meatauto.gi:1515 called from
<function "SMTX_ModuleAutomorphisms">( <arguments> )
 called from read-eval loop at *errin*:1
brk> row2;
<an immutable GF2 vector of length 9>
brk> IsMutable(row2);
false

Note that semigroups does not appear in that callstack. But somehow that row vector ended up being immutable when it should be mutable. Weird.

@hulpke
Copy link
Contributor

hulpke commented Apr 25, 2022

Binary search finds that if not loading

gap/elements/semiringmat.gi

from semigroups, the error is gone.

In fact, only removing the artificial upranking in lines 304 and 310 of that file make the error disappear.

@fingolfin
Copy link
Member Author

Weird. Anyone have a theory how and why methods that only trigger for integer matrices could affect the Meataxe dealing with finite field matrices?

@hulpke
Copy link
Contributor

hulpke commented Apr 26, 2022

The meataxe calls Matrix to bring lists of vectors into matrix form. The package method (function SEMIGROUPS_MatrixForIsSemiringIsHomogenousListFunc) gets called since the field is a semiring, and goes into NewMatrixOverFiniteField that forces the entries to be immutable.
I'm unsure why the package installs such methods (that apply to semirings which are fields) and doesn't just use the matrices from the library.

@james-d-mitchell
Copy link
Contributor

I'm unsure why the package installs such methods (that apply to semirings which are fields) and doesn't just use the matrices from the library.

When we did this back in 2015 or so, the reason was that we couldn't make the code for semigroups of matrices over finite fields work using the matrices in the library. There were several reasons for this, perhaps these have disappeared in the meantime, and I'm happy to revisit this.

This wasn't an arbitrary weird choice to reimplement things for ourselves, but a thought through workaround for deficiencies in the GAP library approach to matrices (such as them not be associative elements, not being permitted to construct 0-dimensional matrices, there being no means of identifying matrices (i.e. no filter IsMatrixOverFiniteField or similar) (since they were only plain lists of lists of values), no means of recovering the base domain of the matrices, etc). Since the code for MatrixObjs still isn't complete some 7 years later, but the code for semigroups of matrices over finite fields in Semigroups has been working all along, I still feel like this was the correct choice.

@james-d-mitchell
Copy link
Contributor

@james-d-mitchell to be clear I don't think it'd be fair to see that "you" broke the CI. If at all, most like it's my patch for semigroups that's the culprit (but I actually did not yet have time to dig into what exactly is wrong). Also, of course we before broke semigroups, so... Let's not dwell too much on who broken what, just let's find a way out of this :-).

Thanks @fingolfin, I appreciate this, didn't think you were blaming me! I'm happy to try and resolve this however makes most sense.

@james-d-mitchell
Copy link
Contributor

In fact, only removing the artificial upranking in lines 304 and 310 of that file make the error disappear.

Looks like this only makes the error disappear in the GAP tests, but breaks the Semigroups tests:

https://github.com/semigroups/Semigroups/runs/6173025291?check_suite_focus=true

@hulpke
Copy link
Contributor

hulpke commented Apr 26, 2022

OK, so since matrix objects were not yet available,semigroups installed its own methods (for the library operation Matrix) to create matrices (to allow cases such as dimension zero, or matrices over more general structures).
Now the library provides certain methods for matrix objects that allow in certain cases more general settings (such as matrices with mutable rows) than the ones in semigroups. The problem with method ranking happens in that the MeatAxe creates matrices with Matrix that assume the functionality of the library, but (by happenstance) the method in semigroups ranks higher and is applied. I can see the following ways of resolving it:

  1. semigroups does not use Matrix, but its own operation, named differently. This is very cheap to implement but might be impalatable if it affects the user interface. On the other hand, it might be necessary to do so if Matrixin the library is declared in ways to make implications (such as associativity) that would conflict.
  2. The Matrix methods in semigroups get extended to the full library functionality -- for example. distinguish mutable and immutable rows. This is likely the most work intensive option.
  3. The library methods for Matrix get extended to provide the full functionality of what is needed for semigroups. I can't judge how much work it would be, since I don't know what is missing.
  4. The Matrix methods in semigroups check whether the arguments are in a case that is already provided for by the library, and if so exit with TryNextMethod(). That is, they only apply for the situations in which the library does not provide suitable functionality. This should also be reasonably easy to implement; it requires tests in the methods in semigroups and potentially a few more methods for the interaction of library matrices with semigroups matrices. (This could be combined with parts of option 3. to have the library provide as much as possible.)

@james-d-mitchell
Copy link
Contributor

Thanks @hulpke, I think a combination of 3 and 4 is probably the best way forwards. It'd be better for semigroups to use the better code for matrices that GAP provides when they are compatible, and contain less special case code of its own. If there's a pressing need to resolve this very quickly, then we could temporarily do 1), although this might at some point become permanent, and so if possible I'd prefer to try 3 and 4 first.

I think the intersection of the library methods and those in Semigroups is limited to integer matrices and matrices over finite fields. Resolving things for integer matrices should be most straightforward because the functionality required and implemented for these in Semigroups is limited. Matrices of finite fields will be more complicated.

Is there somewhere I can read about the requirements of MatrixObjs so that I can start to figure out what changes might be required to do 3 and 4?

@fingolfin
Copy link
Member Author

@james-d-mitchell to learn more about MatrixObj, take a look at Chapter 26 "Vector and Matrix Objects " of the GAP reference manual, and in particular Section 26.13 "Implementing New Vector and Matrix Objects Types". Best to do it in the development version, though, as the manual for 4.11.1 is of course older.

At the same time, nothing in there (still!) is 100% settled, simply because not enough practical experience was gathered. What I am trying to say is: if something doesn't seem to make sense or is confusing or not explained well or... don't hesitate to question it.

@fingolfin
Copy link
Member Author

Resolved by Semigroups 5. Thanks a million to @james-d-mitchell for investing a lot of work into this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
regression A bug that only occurs in the branch, not in a release topic: ci Anything related to GitHub Actions, Codecov, AppVeyor, Coveralls, Travis, ... topic: packages issues or PRs related to package handling, or specific to a package (for packages w/o issue tracker)
Projects
None yet
Development

No branches or pull requests

3 participants