check_9 "InvariantRing" sometimes fails #1773

d-torrance · 2021-01-01T23:42:26Z

Hrm, the test failure is puzzling. It definitely seems to be caused by these commits, and in particular the changes to startup.m2.in that get compiled into the binary. I get different results depending on which binary I use, but using the same source directory for the .m2 files:

Normal M2 binary

From PPA package based on release-1.17 branch

i1 : needsPackage "InvariantRing";

i2 : R = QQ[x_1..x_4];

i3 : invariants diagonalAction(matrix {{0,1,-1,1},{1,0,-1,-1}}, R)

               2
o3 = {x x x , x x x }
       1 2 3   1 3 4

o3 : List

M2 binary compiled from this branch

i1 : needsPackage "InvariantRing";

i2 : R = QQ[x_1..x_4];

i3 : invariants diagonalAction(matrix {{0,1,-1,1},{1,0,-1,-1}}, R)

       5   3 2   5 3 4     4 2 3     6 3 3   2
o3 = {x x x x , x x x x , x x x x , x x x , x x x , x x x }
       1 2 3 4   1 2 3 4   1 2 3 4   1 3 4   1 3 4   1 2 3

o3 : List

Originally posted by @d-torrance in #1750 (comment)

The text was updated successfully, but these errors were encountered:

d-torrance · 2021-01-01T23:45:40Z

This failed again on a CI test of the Debian experimental package on armhf, so I figured I'd make an issue out of it:
https://ci.debian.net/data/autopkgtest/unstable/armhf/m/macaulay2/9176633/log.gz

#1750 (comment) is also relevant:

Not sure what the underlying issue was, but it seemed like the new code in the binary was somehow changing how the keys of some hash tables used by invariants were sorted, which apparently messed up the algorithm

mahrud · 2021-02-11T02:06:23Z

Yikes, if your guess is correct, I'm blaming all the hardcoded magic numbers in the interpreter.

mahrud · 2021-02-11T02:08:45Z

Did it happen consistently? If so we can build from that commit and see what is going wrong.

d-torrance · 2021-02-11T02:09:19Z

Did it happen consistently? If so we can build from that commit and see what is going wrong.

I think so, yes.

DanGrayson · 2021-02-11T15:33:12Z

Keys in hash tables are not "sorted", they're just deposited, along with corresponding values, in various "buckets". A mathematical algorithm that changes its answer when hash codes change is incorrect.

d-torrance · 2021-02-11T15:39:51Z

Keys in hash tables are not "sorted", they're just deposited, along with corresponding values, in various "buckets".

Yeah, "sort" was a poor choice of words. If I recall from fiddling with this issue a couple months ago, the order that the keys were arranged into a list by keys was different using one binary vs. the other, and so we ended up getting something different when calling first:

M2/M2/Macaulay2/packages/InvariantRing/Invariants.m2

Lines 197 to 201 in fed620f

    
              nonemptyU := select(keys U, w -> #(U#w) > 0); 
        
              while  #nonemptyU > 0 do( 
        
           v = first nonemptyU; 
        
           m = first (U#v);

d-torrance · 2021-02-11T16:15:58Z

Yeah the order of keys U definitely changes things:

diff --git a/M2/Macaulay2/packages/InvariantRing/Invariants.m2 b/M2/Macaulay2/packages/InvariantRing/Invariants.m2
index c2016b87b..c223a378d 100644
--- a/M2/Macaulay2/packages/InvariantRing/Invariants.m2
+++ b/M2/Macaulay2/packages/InvariantRing/Invariants.m2
@@ -193,8 +193,8 @@ invariants DiagonalAction := List => o -> D -> (
     S = new MutableHashTable from apply(C, w -> w => {});
     scan(#mons, i -> S#(W1_i) = S#(W1_i)|{mons#i});
     U = new MutableHashTable from S;
-    
-    nonemptyU := select(keys U, w -> #(U#w) > 0);
+    randomKeys := random keys U;
+    nonemptyU := select(randomKeys, w -> #(U#w) > 0);
     while  #nonemptyU > 0 do(
        v = first nonemptyU;
        m = first (U#v);
@@ -220,7 +220,7 @@ invariants DiagonalAction := List => o -> D -> (
                )
            );
        U#v = delete(m, U#v);
-       nonemptyU = select(keys U, w -> #(U#w) > 0)
+       nonemptyU = select(randomKeys, w -> #(U#w) > 0)
        );
     
     if S#?(0_(ZZ^r)) then mons = S#(0_(ZZ^r)) else mons = {};

Then:

i2 : R = QQ[x_1..x_4];

i3 : unique apply(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))

                2         2                         8 4 4   4 2 2   2      
o3 = {{x x x , x x x }, {x x x , x x x }, {x x x , x x x , x x x , x x x },
        1 2 3   1 3 4     1 3 4   1 2 3     1 2 3   1 3 4   1 3 4   1 3 4  
     --------------------------------------------------------------------------
       5 3 4     6 2 4 2   5   3 2   6 3 3   5 5 5           2               
     {x x x x , x x x x , x x x x , x x x , x x x , x x x , x x x }, {x x x ,
       1 2 3 4   1 2 3 4   1 2 3 4   1 3 4   1 2 3   1 2 3   1 3 4     1 2 3 
     --------------------------------------------------------------------------
      4 2 2   2
     x x x , x x x }}
      1 3 4   1 3 4

o3 : List

DanGrayson · 2021-02-11T16:16:06Z

I would like hash codes not to depend on the architecture or the compiler used, so if that happens again, let's track it down. Could it be that there is another instance where certain compilers detect an overflow at compile time and take advantage of the result being "undefined"?

DanGrayson · 2021-02-11T16:18:37Z

Maybe invariants should minimize and sort its answer to make it unique.

mahrud · 2021-02-11T16:42:34Z

Maybe invariants should minimize and sort its answer to make it unique.

If anyone should do that, it should probably be the interpreter.

DanGrayson · 2021-02-11T16:51:02Z

Maybe invariants should minimize and sort its answer to make it unique.

If anyone should do that, it should probably be the interpreter.

How could it possibly do that?

mahrud · 2021-02-11T16:54:56Z

It should make sure that keys HashTable is always returned in the same order.

mahrud · 2021-02-11T16:55:46Z

@d-torrance instead of unique, could you call tally on that output?

d-torrance · 2021-02-11T16:58:49Z

i4 : tally apply(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))

                     2
o4 = Tally{{x x x , x x x } => 43                                     }
             1 2 3   1 3 4
                     6 3 3   4 2 2   2
           {x x x , x x x , x x x , x x x } => 1
             1 2 3   1 3 4   1 3 4   1 3 4
             2
           {x x x , x x x } => 4
             1 3 4   1 2 3
             5 3 4     6 2 4 2           2
           {x x x x , x x x x , x x x , x x x } => 1
             1 2 3 4   1 2 3 4   1 2 3   1 3 4
             5 3 4     6 2 4 2   7   4 3   8 4 4           2
           {x x x x , x x x x , x x x x , x x x , x x x , x x x } => 1
             1 2 3 4   1 2 3 4   1 2 3 4   1 3 4   1 2 3   1 3 4

DanGrayson · 2021-02-11T16:58:55Z

It should make sure that keys HashTable is always returned in the same order.

Hash codes change from one version to the next, so this is too much to hope for. (This is because hash codes of mutable objects are serial numbers, and the series may be altered.)

mahrud · 2021-02-11T19:10:44Z

Maybe we should ask @galettof if there's another reason why the order of the keys matters.

DanGrayson · 2021-02-11T19:18:10Z

Ideally, @galettof, mathematical algorithms should give answers that are made unique in some way, thereby insulating the code from changes in Macaulay2 that ought to be irrelevant mathematically. In this case, an extra step to minimize and sort the set of generators would do that.

mahrud · 2021-02-11T20:06:49Z

I can do this in #1929, if that sounds correct to Federico, but @DanGrayson, I'm more generally concerned about this:

i1 : setRandomSeed()

i2 : keys set {1,2,3}

o2 = {1, 2, 3}

o2 : List

i3 : keys set {10,20,30}

o3 = {20, 30, 10}

o3 : List

I don't think this makes sense for Set, hence why I think the interpreter isn't doing all it can to prevent this.

galettof · 2021-02-11T20:23:31Z

For reference, this is an implementation of Algorithm 4.3.1 in Derksen, Kemper - Computational Invariant Theory (modified to incorporate an algorithm in Francesca Gandini's dissertation, though that's earlier code that does not seem responsible for the issue). The algorithm suggests order should not matter but I can reach out to colleagues to ask if their implementation altered the algorithm in a way that would make order relevant.

The algorithm is already supposed to output minimal invariants, so minimizing generators before returning them is not an ideal solution. If the goal is to ensure the output is always the same, I would be more inclined to sort the list nonemptyU but there are two issues:

as Doug shows, order seems to matter;
the list contains vectors, which Macaulay2 does not know how to sort (?).

galettof · 2021-02-11T20:30:26Z

Does it help to add

nonemptyU = sort(nonemptyU/entries)/vector;

after the line where nonemptyU is defined?

d-torrance · 2021-02-11T20:40:19Z

Sorting helps in the sense that we get the same thing each time, but it's the same wrong thing!

i38 : tally apply(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))

                      8 4 4   6 3 3   4 2 2   2
o38 = Tally{{x x x , x x x , x x x , x x x , x x x } => 50}
              1 2 3   1 3 4   1 3 4   1 3 4   1 3 4

It's also significantly slower with all the extra sorting (since we need to sort not just at the initial creation of nonemptyU, but each time it's re-defined at the bottom of the while loop).

Without sorting:

i43 : elapsedTime scan(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))
 -- 8.82792 seconds elapsed

With sorting:

i45 : elapsedTime scan(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))
 -- 75.7911 seconds elapsed

DanGrayson · 2021-02-11T21:14:21Z

I can do this in #1929, if that sounds correct to Federico, but @DanGrayson, I'm more generally concerned about this:
i1 : setRandomSeed()

i2 : keys set {1,2,3}

o2 = {1, 2, 3}

o2 : List

i3 : keys set {10,20,30}

o3 = {20, 30, 10}

o3 : List
I don't think this makes sense for Set, hence why I think the interpreter isn't doing all it can to prevent this.

I see nothing to prevent. The order of the keys in a hashtable is something determined by internal aspects of the code for handling hashtables, and no particular order should be expected by the user.

mahrud · 2021-02-11T22:15:35Z

I see nothing to prevent. The order of the keys in a hashtable is something determined by internal aspects of the code for handling hashtables, and no particular order should be expected by the user.

Can you explain why that happens? The "internal" aspects should not be mysterious, and from reading the code I don't see an explanation why in the second call to set the order is changed.

DanGrayson · 2021-02-11T23:22:48Z

You can use "buckets" to see the internal structure of a hash table:

Macaulay2, version 1.17.1.1
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, LLLBases,
               MinimalPrimes, PrimaryDecomposition, ReesAlgebra, Saturation, TangentCone

i1 : debug Core

i2 : buckets set {1,2,3}

o2 = {{}, {(1, 1)}, {(2, 1)}, {(3, 1)}}

o2 : List

i3 : buckets set {10,20,30}

o3 = {{(20, 1)}, {}, {(30, 1), (10, 1)}, {}}

o3 : List

Yes, it's not mysterious. Both hash tables have 4 elements, the hash code of a small integer is the number itself, and the
bucket the key ends up in is the hash code modulo 4. That's why the order of keys ends up the way it does.

mahrud · 2021-02-11T23:30:05Z

Thanks, I didn't know about buckets!

DanGrayson · 2021-02-12T01:08:48Z

Thanks, I didn't know about buckets!

Maybe it should be exported and documented, as a debugging aid.

galettof · 2021-02-12T03:36:35Z

My colleague Matt Mastroeni looked at this. It seems there is an issue with the algorithm itself. As new monomials are added to the list of previously found invariant monomials, we need to also check whether any previously found monomial is divisible by the new one and delete those that are from the list. I wonder if it might be faster to just minimize the monomial ideal at the end though.

DanGrayson · 2021-02-12T12:54:00Z

It might be faster to minimize as you go along, for then the list of monomials doesn't get gigantic.

galettof · 2021-02-12T14:34:44Z

That makes sense. Matt is going to look into it but he will need some time. If you need a quick fix for the time being, you are welcome to add a line to minimize the result at the end (maybe with a comment so we know where it's coming from). Thanks!

mahrud · 2021-02-12T19:30:47Z

@DanGrayson, I still don't agree with you that the interpreter can't do anything to prevent this. Maybe I should clarify that by "this" I mean the fact that this bug seems to occur consistently on the github builds, but not locally, despite the fact that:

examples and tests run with --no-randomize
I downloaded the binary artifact from github and ran it in a docker container with the exact same libraries

and yet I couldn't reproduce this without artificially randomizing the list. As @d-torrance noted, this bug seemed to happen based on changes to startup.m2, so I think we should acknowledge that there's a bug in hashtables.

DanGrayson · 2021-02-12T20:20:51Z

Hashcodes can change from one commit to the next, and the algorithm coded in the function invariants is faulty, in the sense that the answer it yields depends on the values of hashcodes. So, no, I don't agree there's a bug in hashtables.

mahrud · 2021-02-12T20:30:30Z

Hashcodes can change from one commit to the next

How is the commit relevant? I'm trying on the same commit! And why is this inevitable? I would think keeping the random seed, architecture, and even executable the same would make M2 run deterministically.

To be clear, I agree that the function should be fixed! The issue I'm concerned about is the difficulty of debugging this.

DanGrayson · 2021-02-12T20:58:40Z

Hashcodes can change from one commit to the next

How is the commit relevant? I'm trying on the same commit! And why is this inevitable? I would think keeping the random seed, architecture, and even executable the same would make M2 run deterministically.

I would think so, too. Could you show me what code you ran and the output from two different runs of the same binary with the same code that produced different hashcodes? Or different results of some other code?

mahrud · 2021-02-12T21:56:01Z

Sure, download this binary artifact: https://github.com/Macaulay2/M2/suites/2024651034/artifacts/40656232 produced by github on #1929 and follow these instructions. Basically:

go to M2/BUILD/docker
run make build
run make shell
run sudo apt-get install normaliz (somehow normaliz isn't installed, not sure why)
run M2 --check 2

When I try M2 --check 2 in the Docker instance I get:

 -- capturing check(8, "InvariantRing")                                      -- 0.0703749 seconds elapsed
 -- capturing check(9, "InvariantRing")                                      -- 0.159413 seconds elapsed
 -- capturing check(10, "InvariantRing")                                     -- 0.0528693 seconds elapsed

On that build, with the same exact executable (you can confirm the checksum!) this output is produced:

2021-02-12T08:16:04.9916176Z InvariantRing
2021-02-12T08:16:04.9916597Z *************
...
2021-02-12T08:16:06.1441919Z  -- capturing check(9, "InvariantRing")                                      -- capture failed; retrying ...
2021-02-12T08:16:06.6458855Z  -- running   check(9, "InvariantRing")                                     
2021-02-12T08:16:06.6461300Z  ulimit -c unlimited; ulimit -t 700; ulimit -m 850000; ulimit -s 8192; ulimit -n 512;  cd /tmp/M2-161613-0/9-rundir/; GC_MAXIMUM_HEAP_SIZE=400M "/home/runner/work/M2/M2/M2/BUILD/build/usr-dist/x86_64-Linux-Ubuntu-20.04/bin/M2-binary" -q --int --no-randomize --no-readline --silent --stop --print-width 77 -e 'needsPackage("InvariantRing",Reload=>true,FileName=>"/home/runner/work/M2/M2/M2/Macaulay2/packages/InvariantRing.m2")' <"/tmp/M2-161613-0/8.m2" >>"/tmp/M2-161613-0/8.tmp" 2>&1
2021-02-12T08:16:06.6463654Z /tmp/M2-161613-0/8.tmp:0:1: (output file) error: Macaulay2 exited with status code 1
2021-02-12T08:16:06.6464491Z /tmp/M2-161613-0/8.m2:0:1: (input file)
2021-02-12T08:16:06.6465022Z M2: *** Error 1
2021-02-12T08:16:06.6465623Z  -- 0.658691 seconds elapsed

DanGrayson · 2021-02-12T22:20:34Z

Hashcodes can change from one commit to the next

How is the commit relevant? I'm trying on the same commit! And why is this inevitable? I would think keeping the random seed, architecture, and even executable the same would make M2 run deterministically.

I would think so, too. Could you show me what code you ran and the output from two different runs of the same binary with the same code that produced different hashcodes? Or different results of some other code?

Correction -- all the *.m2 files have to be the same, too.

DanGrayson · 2021-02-12T22:21:07Z

Re: "On that build, with the same exact executable (you can confirm the checksum!) this output is produced:"

What does "that build" refer to?

mahrud · 2021-02-12T22:41:49Z

Like I said, #1929: https://github.com/Macaulay2/M2/pull/1929/checks?check_run_id=1885201625

DanGrayson · 2021-02-12T23:13:06Z

Okay. I don't see what the difference could be. Are the *.m2 files the same? Are all the dynamic libraries linked with M2 the same? Is the OS the same?

Anyway, even if the hashcodes differ between those two runs, it's still an error for a mathematical algorithm to depend on them. Maybe we should go to the opposite extreme and randomize the hashcodes each time M2 starts up, so we could detect problems like that more easily. I may try that...

mahrud · 2021-02-15T17:42:35Z

Okay. I don't see what the difference could be. Are the *.m2 files the same? Are all the dynamic libraries linked with M2 the same? Is the OS the same?

All the same, though could you explain why the *.m2 files can affect the hash codes? @d-torrance pointed out that the bug seems to appear and disappear depending on changes in startup.m2, even changes not involving hash tables.

Anyway, even if the hashcodes differ between those two runs, it's still an error for a mathematical algorithm to depend on them. Maybe we should go to the opposite extreme and randomize the hashcodes each time M2 starts up, so we could detect problems like that more easily. I may try that...

This would be good to do on occasion, though not always because certain random choices might make an example or text unusually long!

DanGrayson · 2021-02-15T17:59:51Z

Okay. I don't see what the difference could be. Are the *.m2 files the same? Are all the dynamic libraries linked with M2 the same? Is the OS the same?

All the same, though could you explain why the *.m2 files can affect the hash codes? @d-torrance pointed out that the bug seems to appear and disappear depending on changes in startup.m2, even changes not involving hash tables.

Yes, it's because many hashcodes are simply serial numbers:

Macaulay2, version 1.17.1.1
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, LLLBases,
               MinimalPrimes, PrimaryDecomposition, ReesAlgebra, Saturation, TangentCone

i1 : X = new Type of List;

i2 : hash X

o2 = 1170724

i3 : restart
Macaulay2, version 1.17.1.1
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, LLLBases,
               MinimalPrimes, PrimaryDecomposition, ReesAlgebra, Saturation, TangentCone

i1 : Y = new Type of List;

i2 : X = new Type of List;

i3 : hash X

o3 = 1170728

DanGrayson · 2021-02-15T18:00:43Z

Anyway, even if the hashcodes differ between those two runs, it's still an error for a mathematical algorithm to depend on them. Maybe we should go to the opposite extreme and randomize the hashcodes each time M2 starts up, so we could detect problems like that more easily. I may try that...

This would be good to do on occasion, though not always because certain random choices might make an example or text unusually long!

I don't see how, as I'm talking about the hash codes, not the random number seed.

mahrud · 2021-02-15T18:10:56Z

Yes, it's because many hashcodes are simply serial numbers:

You clearly created a new hash table here, but what I said was: "the bug seems to appear and disappear depending on changes in startup.m2, even changes not involving hash tables."
Are hash codes of monomials sequential? (I'm guessing the hash code of the base ring is sequential?)

I didn't realize hash codes could be negative:

i2 : R = kk[x];

i3 : hash x

o3 = -1446996547

i10 : hash 2147483647

o10 = 2147483647

i11 : hash 2147483648

o11 = -2147483648

DanGrayson · 2021-02-15T18:18:57Z

Yes, it's because many hashcodes are simply serial numbers:

You clearly created a new hash table here, but what I said was: "the bug seems to appear and disappear depending on changes in startup.m2, even changes not involving hash tables."

I was answering this question: "All the same, though could you explain why the *.m2 files can affect the hash codes?"

mahrud · 2021-02-15T18:28:32Z

Ah, I see. Sure.

DanGrayson · 2021-02-15T18:36:53Z

About hashcodes being sometimes negative: I don't much like it. It might be worth changing int to uint in this line:

export hash(e:Expr):int := (

d-torrance mentioned this issue Feb 11, 2021

Hookify and add top level caching for basis, pushForward #1929

Closed

15 tasks

mahrud added the bug in our code label Feb 11, 2021

d-torrance mentioned this issue Mar 26, 2021

Update package RationalPoints2 #1996

Merged

mahrud mentioned this issue Jun 25, 2021

hashcodes of mutable objects #2157

Closed

check_9 "InvariantRing" sometimes fails #1773

check_9 "InvariantRing" sometimes fails #1773

Comments

d-torrance commented Jan 1, 2021

Normal M2 binary

M2 binary compiled from this branch

d-torrance commented Jan 1, 2021 • edited

mahrud commented Feb 11, 2021

mahrud commented Feb 11, 2021

d-torrance commented Feb 11, 2021

DanGrayson commented Feb 11, 2021

d-torrance commented Feb 11, 2021

d-torrance commented Feb 11, 2021

DanGrayson commented Feb 11, 2021

DanGrayson commented Feb 11, 2021

mahrud commented Feb 11, 2021

DanGrayson commented Feb 11, 2021

mahrud commented Feb 11, 2021

mahrud commented Feb 11, 2021

d-torrance commented Feb 11, 2021

DanGrayson commented Feb 11, 2021

mahrud commented Feb 11, 2021

DanGrayson commented Feb 11, 2021

mahrud commented Feb 11, 2021

galettof commented Feb 11, 2021

galettof commented Feb 11, 2021

d-torrance commented Feb 11, 2021

Without sorting:

With sorting:

DanGrayson commented Feb 11, 2021

mahrud commented Feb 11, 2021

DanGrayson commented Feb 11, 2021

mahrud commented Feb 11, 2021

DanGrayson commented Feb 12, 2021

galettof commented Feb 12, 2021

DanGrayson commented Feb 12, 2021

galettof commented Feb 12, 2021

mahrud commented Feb 12, 2021

DanGrayson commented Feb 12, 2021

mahrud commented Feb 12, 2021

DanGrayson commented Feb 12, 2021

mahrud commented Feb 12, 2021 • edited

DanGrayson commented Feb 12, 2021

DanGrayson commented Feb 12, 2021

mahrud commented Feb 12, 2021

DanGrayson commented Feb 12, 2021

mahrud commented Feb 15, 2021

DanGrayson commented Feb 15, 2021

DanGrayson commented Feb 15, 2021

mahrud commented Feb 15, 2021 • edited

DanGrayson commented Feb 15, 2021

mahrud commented Feb 15, 2021

DanGrayson commented Feb 15, 2021

d-torrance commented Jan 1, 2021 •

edited

mahrud commented Feb 12, 2021 •

edited

mahrud commented Feb 15, 2021 •

edited