Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

check_9 "InvariantRing" sometimes fails #1773

Open
d-torrance opened this issue Jan 1, 2021 · 45 comments
Open

check_9 "InvariantRing" sometimes fails #1773

d-torrance opened this issue Jan 1, 2021 · 45 comments

Comments

@d-torrance
Copy link
Member

Hrm, the test failure is puzzling. It definitely seems to be caused by these commits, and in particular the changes to startup.m2.in that get compiled into the binary. I get different results depending on which binary I use, but using the same source directory for the .m2 files:

Normal M2 binary

From PPA package based on release-1.17 branch

i1 : needsPackage "InvariantRing";

i2 : R = QQ[x_1..x_4];

i3 : invariants diagonalAction(matrix {{0,1,-1,1},{1,0,-1,-1}}, R)

               2
o3 = {x x x , x x x }
       1 2 3   1 3 4

o3 : List

M2 binary compiled from this branch

i1 : needsPackage "InvariantRing";

i2 : R = QQ[x_1..x_4];

i3 : invariants diagonalAction(matrix {{0,1,-1,1},{1,0,-1,-1}}, R)

       5   3 2   5 3 4     4 2 3     6 3 3   2
o3 = {x x x x , x x x x , x x x x , x x x , x x x , x x x }
       1 2 3 4   1 2 3 4   1 2 3 4   1 3 4   1 3 4   1 2 3

o3 : List

Originally posted by @d-torrance in #1750 (comment)

@d-torrance
Copy link
Member Author

d-torrance commented Jan 1, 2021

This failed again on a CI test of the Debian experimental package on armhf, so I figured I'd make an issue out of it:
https://ci.debian.net/data/autopkgtest/unstable/armhf/m/macaulay2/9176633/log.gz

#1750 (comment) is also relevant:

Not sure what the underlying issue was, but it seemed like the new code in the binary was somehow changing how the keys of some hash tables used by invariants were sorted, which apparently messed up the algorithm

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

Yikes, if your guess is correct, I'm blaming all the hardcoded magic numbers in the interpreter.

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

Did it happen consistently? If so we can build from that commit and see what is going wrong.

@d-torrance
Copy link
Member Author

Did it happen consistently? If so we can build from that commit and see what is going wrong.

I think so, yes.

@DanGrayson
Copy link
Member

Keys in hash tables are not "sorted", they're just deposited, along with corresponding values, in various "buckets". A mathematical algorithm that changes its answer when hash codes change is incorrect.

@d-torrance
Copy link
Member Author

Keys in hash tables are not "sorted", they're just deposited, along with corresponding values, in various "buckets".

Yeah, "sort" was a poor choice of words. If I recall from fiddling with this issue a couple months ago, the order that the keys were arranged into a list by keys was different using one binary vs. the other, and so we ended up getting something different when calling first:

nonemptyU := select(keys U, w -> #(U#w) > 0);
while #nonemptyU > 0 do(
v = first nonemptyU;
m = first (U#v);

@d-torrance
Copy link
Member Author

Yeah the order of keys U definitely changes things:

diff --git a/M2/Macaulay2/packages/InvariantRing/Invariants.m2 b/M2/Macaulay2/packages/InvariantRing/Invariants.m2
index c2016b87b..c223a378d 100644
--- a/M2/Macaulay2/packages/InvariantRing/Invariants.m2
+++ b/M2/Macaulay2/packages/InvariantRing/Invariants.m2
@@ -193,8 +193,8 @@ invariants DiagonalAction := List => o -> D -> (
     S = new MutableHashTable from apply(C, w -> w => {});
     scan(#mons, i -> S#(W1_i) = S#(W1_i)|{mons#i});
     U = new MutableHashTable from S;
-    
-    nonemptyU := select(keys U, w -> #(U#w) > 0);
+    randomKeys := random keys U;
+    nonemptyU := select(randomKeys, w -> #(U#w) > 0);
     while  #nonemptyU > 0 do(
        v = first nonemptyU;
        m = first (U#v);
@@ -220,7 +220,7 @@ invariants DiagonalAction := List => o -> D -> (
                )
            );
        U#v = delete(m, U#v);
-       nonemptyU = select(keys U, w -> #(U#w) > 0)
+       nonemptyU = select(randomKeys, w -> #(U#w) > 0)
        );
     
     if S#?(0_(ZZ^r)) then mons = S#(0_(ZZ^r)) else mons = {};

Then:

i2 : R = QQ[x_1..x_4];

i3 : unique apply(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))

                2         2                         8 4 4   4 2 2   2      
o3 = {{x x x , x x x }, {x x x , x x x }, {x x x , x x x , x x x , x x x },
        1 2 3   1 3 4     1 3 4   1 2 3     1 2 3   1 3 4   1 3 4   1 3 4  
     --------------------------------------------------------------------------
       5 3 4     6 2 4 2   5   3 2   6 3 3   5 5 5           2               
     {x x x x , x x x x , x x x x , x x x , x x x , x x x , x x x }, {x x x ,
       1 2 3 4   1 2 3 4   1 2 3 4   1 3 4   1 2 3   1 2 3   1 3 4     1 2 3 
     --------------------------------------------------------------------------
      4 2 2   2
     x x x , x x x }}
      1 3 4   1 3 4

o3 : List

@DanGrayson
Copy link
Member

I would like hash codes not to depend on the architecture or the compiler used, so if that happens again, let's track it down. Could it be that there is another instance where certain compilers detect an overflow at compile time and take advantage of the result being "undefined"?

@DanGrayson
Copy link
Member

Maybe invariants should minimize and sort its answer to make it unique.

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

Maybe invariants should minimize and sort its answer to make it unique.

If anyone should do that, it should probably be the interpreter.

@DanGrayson
Copy link
Member

Maybe invariants should minimize and sort its answer to make it unique.

If anyone should do that, it should probably be the interpreter.

How could it possibly do that?

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

It should make sure that keys HashTable is always returned in the same order.

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

@d-torrance instead of unique, could you call tally on that output?

@d-torrance
Copy link
Member Author

i4 : tally apply(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))

                     2
o4 = Tally{{x x x , x x x } => 43                                     }
             1 2 3   1 3 4
                     6 3 3   4 2 2   2
           {x x x , x x x , x x x , x x x } => 1
             1 2 3   1 3 4   1 3 4   1 3 4
             2
           {x x x , x x x } => 4
             1 3 4   1 2 3
             5 3 4     6 2 4 2           2
           {x x x x , x x x x , x x x , x x x } => 1
             1 2 3 4   1 2 3 4   1 2 3   1 3 4
             5 3 4     6 2 4 2   7   4 3   8 4 4           2
           {x x x x , x x x x , x x x x , x x x , x x x , x x x } => 1
             1 2 3 4   1 2 3 4   1 2 3 4   1 3 4   1 2 3   1 3 4

@DanGrayson
Copy link
Member

It should make sure that keys HashTable is always returned in the same order.

Hash codes change from one version to the next, so this is too much to hope for. (This is because hash codes of mutable objects are serial numbers, and the series may be altered.)

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

Maybe we should ask @galettof if there's another reason why the order of the keys matters.

@DanGrayson
Copy link
Member

Ideally, @galettof, mathematical algorithms should give answers that are made unique in some way, thereby insulating the code from changes in Macaulay2 that ought to be irrelevant mathematically. In this case, an extra step to minimize and sort the set of generators would do that.

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

I can do this in #1929, if that sounds correct to Federico, but @DanGrayson, I'm more generally concerned about this:

i1 : setRandomSeed()

i2 : keys set {1,2,3}

o2 = {1, 2, 3}

o2 : List

i3 : keys set {10,20,30}

o3 = {20, 30, 10}

o3 : List

I don't think this makes sense for Set, hence why I think the interpreter isn't doing all it can to prevent this.

@galettof
Copy link
Contributor

For reference, this is an implementation of Algorithm 4.3.1 in Derksen, Kemper - Computational Invariant Theory (modified to incorporate an algorithm in Francesca Gandini's dissertation, though that's earlier code that does not seem responsible for the issue). The algorithm suggests order should not matter but I can reach out to colleagues to ask if their implementation altered the algorithm in a way that would make order relevant.

The algorithm is already supposed to output minimal invariants, so minimizing generators before returning them is not an ideal solution. If the goal is to ensure the output is always the same, I would be more inclined to sort the list nonemptyU but there are two issues:

  1. as Doug shows, order seems to matter;
  2. the list contains vectors, which Macaulay2 does not know how to sort (?).

@galettof
Copy link
Contributor

Does it help to add

nonemptyU = sort(nonemptyU/entries)/vector;

after the line where nonemptyU is defined?

@d-torrance
Copy link
Member Author

Sorting helps in the sense that we get the same thing each time, but it's the same wrong thing!

i38 : tally apply(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))

                      8 4 4   6 3 3   4 2 2   2
o38 = Tally{{x x x , x x x , x x x , x x x , x x x } => 50}
              1 2 3   1 3 4   1 3 4   1 3 4   1 3 4

It's also significantly slower with all the extra sorting (since we need to sort not just at the initial creation of nonemptyU, but each time it's re-defined at the bottom of the while loop).

Without sorting:

i43 : elapsedTime scan(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))
 -- 8.82792 seconds elapsed

With sorting:

i45 : elapsedTime scan(50, i -> invariants diagonalAction((matrix{{0,1,-1,1},{1,0,-1,-1}}, R)))
 -- 75.7911 seconds elapsed

@DanGrayson
Copy link
Member

I can do this in #1929, if that sounds correct to Federico, but @DanGrayson, I'm more generally concerned about this:

i1 : setRandomSeed()

i2 : keys set {1,2,3}

o2 = {1, 2, 3}

o2 : List

i3 : keys set {10,20,30}

o3 = {20, 30, 10}

o3 : List

I don't think this makes sense for Set, hence why I think the interpreter isn't doing all it can to prevent this.

I see nothing to prevent. The order of the keys in a hashtable is something determined by internal aspects of the code for handling hashtables, and no particular order should be expected by the user.

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

I see nothing to prevent. The order of the keys in a hashtable is something determined by internal aspects of the code for handling hashtables, and no particular order should be expected by the user.

Can you explain why that happens? The "internal" aspects should not be mysterious, and from reading the code I don't see an explanation why in the second call to set the order is changed.

@DanGrayson
Copy link
Member

You can use "buckets" to see the internal structure of a hash table:

Macaulay2, version 1.17.1.1
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, LLLBases,
               MinimalPrimes, PrimaryDecomposition, ReesAlgebra, Saturation, TangentCone

i1 : debug Core

i2 : buckets set {1,2,3}

o2 = {{}, {(1, 1)}, {(2, 1)}, {(3, 1)}}

o2 : List

i3 : buckets set {10,20,30}

o3 = {{(20, 1)}, {}, {(30, 1), (10, 1)}, {}}

o3 : List

Yes, it's not mysterious. Both hash tables have 4 elements, the hash code of a small integer is the number itself, and the
bucket the key ends up in is the hash code modulo 4. That's why the order of keys ends up the way it does.

@mahrud
Copy link
Member

mahrud commented Feb 11, 2021

Thanks, I didn't know about buckets!

@DanGrayson
Copy link
Member

Thanks, I didn't know about buckets!

Maybe it should be exported and documented, as a debugging aid.

@galettof
Copy link
Contributor

My colleague Matt Mastroeni looked at this. It seems there is an issue with the algorithm itself. As new monomials are added to the list of previously found invariant monomials, we need to also check whether any previously found monomial is divisible by the new one and delete those that are from the list. I wonder if it might be faster to just minimize the monomial ideal at the end though.

@DanGrayson
Copy link
Member

It might be faster to minimize as you go along, for then the list of monomials doesn't get gigantic.

@galettof
Copy link
Contributor

That makes sense. Matt is going to look into it but he will need some time. If you need a quick fix for the time being, you are welcome to add a line to minimize the result at the end (maybe with a comment so we know where it's coming from). Thanks!

@mahrud
Copy link
Member

mahrud commented Feb 12, 2021

@DanGrayson, I still don't agree with you that the interpreter can't do anything to prevent this. Maybe I should clarify that by "this" I mean the fact that this bug seems to occur consistently on the github builds, but not locally, despite the fact that:

  • examples and tests run with --no-randomize
  • I downloaded the binary artifact from github and ran it in a docker container with the exact same libraries

and yet I couldn't reproduce this without artificially randomizing the list. As @d-torrance noted, this bug seemed to happen based on changes to startup.m2, so I think we should acknowledge that there's a bug in hashtables.

@DanGrayson
Copy link
Member

Hashcodes can change from one commit to the next, and the algorithm coded in the function invariants is faulty, in the sense that the answer it yields depends on the values of hashcodes. So, no, I don't agree there's a bug in hashtables.

@mahrud
Copy link
Member

mahrud commented Feb 12, 2021

Hashcodes can change from one commit to the next

How is the commit relevant? I'm trying on the same commit! And why is this inevitable? I would think keeping the random seed, architecture, and even executable the same would make M2 run deterministically.

To be clear, I agree that the function should be fixed! The issue I'm concerned about is the difficulty of debugging this.

@DanGrayson
Copy link
Member

Hashcodes can change from one commit to the next

How is the commit relevant? I'm trying on the same commit! And why is this inevitable? I would think keeping the random seed, architecture, and even executable the same would make M2 run deterministically.

I would think so, too. Could you show me what code you ran and the output from two different runs of the same binary with the same code that produced different hashcodes? Or different results of some other code?

@mahrud
Copy link
Member

mahrud commented Feb 12, 2021

Sure, download this binary artifact: https://github.com/Macaulay2/M2/suites/2024651034/artifacts/40656232 produced by github on #1929 and follow these instructions. Basically:

  • go to M2/BUILD/docker
  • run make build
  • run make shell
  • run sudo apt-get install normaliz (somehow normaliz isn't installed, not sure why)
  • run M2 --check 2

When I try M2 --check 2 in the Docker instance I get:

 -- capturing check(8, "InvariantRing")                                      -- 0.0703749 seconds elapsed
 -- capturing check(9, "InvariantRing")                                      -- 0.159413 seconds elapsed
 -- capturing check(10, "InvariantRing")                                     -- 0.0528693 seconds elapsed

On that build, with the same exact executable (you can confirm the checksum!) this output is produced:

2021-02-12T08:16:04.9916176Z InvariantRing
2021-02-12T08:16:04.9916597Z *************
...
2021-02-12T08:16:06.1441919Z  -- capturing check(9, "InvariantRing")                                      -- capture failed; retrying ...
2021-02-12T08:16:06.6458855Z  -- running   check(9, "InvariantRing")                                     
2021-02-12T08:16:06.6461300Z  ulimit -c unlimited; ulimit -t 700; ulimit -m 850000; ulimit -s 8192; ulimit -n 512;  cd /tmp/M2-161613-0/9-rundir/; GC_MAXIMUM_HEAP_SIZE=400M "/home/runner/work/M2/M2/M2/BUILD/build/usr-dist/x86_64-Linux-Ubuntu-20.04/bin/M2-binary" -q --int --no-randomize --no-readline --silent --stop --print-width 77 -e 'needsPackage("InvariantRing",Reload=>true,FileName=>"/home/runner/work/M2/M2/M2/Macaulay2/packages/InvariantRing.m2")' <"/tmp/M2-161613-0/8.m2" >>"/tmp/M2-161613-0/8.tmp" 2>&1
2021-02-12T08:16:06.6463654Z /tmp/M2-161613-0/8.tmp:0:1: (output file) error: Macaulay2 exited with status code 1
2021-02-12T08:16:06.6464491Z /tmp/M2-161613-0/8.m2:0:1: (input file)
2021-02-12T08:16:06.6465022Z M2: *** Error 1
2021-02-12T08:16:06.6465623Z  -- 0.658691 seconds elapsed

@DanGrayson
Copy link
Member

Hashcodes can change from one commit to the next

How is the commit relevant? I'm trying on the same commit! And why is this inevitable? I would think keeping the random seed, architecture, and even executable the same would make M2 run deterministically.

I would think so, too. Could you show me what code you ran and the output from two different runs of the same binary with the same code that produced different hashcodes? Or different results of some other code?

Correction -- all the *.m2 files have to be the same, too.

@DanGrayson
Copy link
Member

Re: "On that build, with the same exact executable (you can confirm the checksum!) this output is produced:"

What does "that build" refer to?

@mahrud
Copy link
Member

mahrud commented Feb 12, 2021

@DanGrayson
Copy link
Member

Okay. I don't see what the difference could be. Are the *.m2 files the same? Are all the dynamic libraries linked with M2 the same? Is the OS the same?

Anyway, even if the hashcodes differ between those two runs, it's still an error for a mathematical algorithm to depend on them. Maybe we should go to the opposite extreme and randomize the hashcodes each time M2 starts up, so we could detect problems like that more easily. I may try that...

@mahrud
Copy link
Member

mahrud commented Feb 15, 2021

Okay. I don't see what the difference could be. Are the *.m2 files the same? Are all the dynamic libraries linked with M2 the same? Is the OS the same?

All the same, though could you explain why the *.m2 files can affect the hash codes? @d-torrance pointed out that the bug seems to appear and disappear depending on changes in startup.m2, even changes not involving hash tables.

Anyway, even if the hashcodes differ between those two runs, it's still an error for a mathematical algorithm to depend on them. Maybe we should go to the opposite extreme and randomize the hashcodes each time M2 starts up, so we could detect problems like that more easily. I may try that...

This would be good to do on occasion, though not always because certain random choices might make an example or text unusually long!

@DanGrayson
Copy link
Member

Okay. I don't see what the difference could be. Are the *.m2 files the same? Are all the dynamic libraries linked with M2 the same? Is the OS the same?

All the same, though could you explain why the *.m2 files can affect the hash codes? @d-torrance pointed out that the bug seems to appear and disappear depending on changes in startup.m2, even changes not involving hash tables.

Yes, it's because many hashcodes are simply serial numbers:

Macaulay2, version 1.17.1.1
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, LLLBases,
               MinimalPrimes, PrimaryDecomposition, ReesAlgebra, Saturation, TangentCone

i1 : X = new Type of List;

i2 : hash X

o2 = 1170724

i3 : restart
Macaulay2, version 1.17.1.1
with packages: ConwayPolynomials, Elimination, IntegralClosure, InverseSystems, LLLBases,
               MinimalPrimes, PrimaryDecomposition, ReesAlgebra, Saturation, TangentCone

i1 : Y = new Type of List;

i2 : X = new Type of List;

i3 : hash X

o3 = 1170728

@DanGrayson
Copy link
Member

Anyway, even if the hashcodes differ between those two runs, it's still an error for a mathematical algorithm to depend on them. Maybe we should go to the opposite extreme and randomize the hashcodes each time M2 starts up, so we could detect problems like that more easily. I may try that...

This would be good to do on occasion, though not always because certain random choices might make an example or text unusually long!

I don't see how, as I'm talking about the hash codes, not the random number seed.

@mahrud
Copy link
Member

mahrud commented Feb 15, 2021

Yes, it's because many hashcodes are simply serial numbers:

  1. You clearly created a new hash table here, but what I said was: "the bug seems to appear and disappear depending on changes in startup.m2, even changes not involving hash tables."

  2. Are hash codes of monomials sequential? (I'm guessing the hash code of the base ring is sequential?)

I didn't realize hash codes could be negative:

i2 : R = kk[x];

i3 : hash x

o3 = -1446996547

i10 : hash 2147483647

o10 = 2147483647

i11 : hash 2147483648

o11 = -2147483648

@DanGrayson
Copy link
Member

Yes, it's because many hashcodes are simply serial numbers:

  1. You clearly created a new hash table here, but what I said was: "the bug seems to appear and disappear depending on changes in startup.m2, even changes not involving hash tables."

I was answering this question: "All the same, though could you explain why the *.m2 files can affect the hash codes?"

@mahrud
Copy link
Member

mahrud commented Feb 15, 2021

Ah, I see. Sure.

@DanGrayson
Copy link
Member

About hashcodes being sometimes negative: I don't much like it. It might be worth changing int to uint in this line:

export hash(e:Expr):int := (

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants