Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loading and saving BinaryMatroids can cause infinite loops #23437

Closed
sagetrac-Stefan mannequin opened this issue Jul 14, 2017 · 14 comments
Closed

Loading and saving BinaryMatroids can cause infinite loops #23437

sagetrac-Stefan mannequin opened this issue Jul 14, 2017 · 14 comments

Comments

@sagetrac-Stefan
Copy link
Mannequin

sagetrac-Stefan mannequin commented Jul 14, 2017

The following results in an unending loop:

sage: M = matroids.named_matroids.Fano().dual()
sage: _ = list(M.bases())
sage: N = loads(dumps(M))
sage: N.is_isomorphic(M)

My guess is that N thinks some invariant has been computed, when it really hasn't been computed. You need to do something with M before saving to trigger the behavior.

CC: @sagetrac-Rudi @sagetrac-zgershkoff @sagetrac-yomcat @bgillesp

Component: matroid theory

Author: Bryan Gillespie

Branch/Commit: 4075773

Reviewer: Travis Scrimshaw

Issue created by migration from https://trac.sagemath.org/ticket/23437

@sagetrac-Stefan sagetrac-Stefan mannequin added this to the sage-8.1 milestone Jul 14, 2017
@tscrim

This comment has been minimized.

@tscrim
Copy link
Collaborator

tscrim commented Jul 15, 2017

comment:1

I think your guess is probably correct because when you get the bases of the new object (or do not construct them for the old object), then this works.

sage: M = matroids.named_matroids.Fano().dual()
sage: _ = list(M.bases())
sage: Mp = loads(dumps(M))
sage: _ = list(Mp.bases())
sage: Mp.is_isomorphic(M)
True
sage: M = matroids.named_matroids.Fano().dual()
sage: Mp = loads(dumps(M))
sage: Mp.is_isomorphic(M)
True
sage: Mp is M
False

Also, if you look at the pickling data, there is a difference

sage: M = matroids.named_matroids.Fano().dual()
sage: explain_pickle(dumps(M))
pg_unpickle_binary_matroid = unpickle_global('sage.matroids.unpickling', 'unpickle_binary_matroid')
pg_unpickle_binary_matrix = unpickle_global('sage.matroids.unpickling', 'unpickle_binary_matrix')
pg_unpickle_binary_matroid(0r,
 (pg_unpickle_binary_matrix(0r, (4r, 7r, 0r, long(7), 1r, 8r, [(long(97),), (long(82),), (long(52),), (long(120),)])),
 ['d', 'e', 'f', 'g', 'a', 'b', 'c'],
 ['d', 'e', 'f', 'g'],
 None))
sage: _ = list(M.bases())
sage: explain_pickle(dumps(M))
pg_unpickle_binary_matroid = unpickle_global('sage.matroids.unpickling', 'unpickle_binary_matroid')
pg_unpickle_binary_matrix = unpickle_global('sage.matroids.unpickling', 'unpickle_binary_matrix')
pg_unpickle_binary_matroid(0r,
 (pg_unpickle_binary_matrix(0r, (4r, 7r, 0r, long(7), 1r, 8r, [(long(42),), (long(120),), (long(25),), (long(52),)])),
 ['e', 'c', 'd', 'f', 'g', 'a', 'b'],
 ['e', 'c', 'd', 'f'],
 None))

I simplified the example to get the code to fail.

@bgillesp
Copy link
Contributor

comment:2

Hey all! I managed to hunt down the problem here; I'll get the patch up shortly.

The issue is in the method BinaryMatroid.__reduce__. Here's another interesting failure mode:

sage: M = matroids.named_matroids.Fano().dual()
sage: _ = list(M.bases())
sage: N = loads(dumps(M))
sage: N.closure(frozenset({'d'})) # !
frozenset({'e'})

The code in __reduce__ produces a pickled data object which includes a binary matrix representation of the matroid in question, and also a list of the elements in the current basis, in the sense of the algorithms of class BasisExchangeMatroid. The problem stems from the following note in the documentation of BinaryMatroid.__init__:

The extra argument ``basis``, when provided, is an ordered list of
elements of the groundset, ordered such that they index a standard
identity matrix within ``matrix``.

The code which is problematic is the following, at lines 3951 to 3960 in linear_matroid.pyx:

if self._representation is not None:
    A = self._representation
    gs = self._E
    basis = None
else:
    A = self._A
    for e in self.basis():
        basis[self._prow[self._idx[e]]] = e
    rows, cols = self._current_rows_cols()
    gs = rows + cols

When an explicit representation of a matroid hasn't been generated, the existing code in __reduce__ produces an ordering of the matroid groundset and current basis which are not consistent with this requirement. The call to self._current_rows_cols sets rows to the ordered list of groundset elements corresponding to the current basis, ordered as required by the note above so that they index an identity submatrix of the current matrix self._A. The for loop above this effectively recomputes the ordering of self.basis() (a similar loop is used in the implementation of _current_rows_cols), so the resulting return data is an ordered groundset and basis for which the basis is a prefix.

However, the matrix self._A is not necessarily reduced so that the left-most square submatrix is an identity matrix -- the elements of the current basis may be distributed in a different order and on different column indices. This results in an improperly initialized BinaryMatroid object, which results in all kinds of bad behavior, including the unending loop noted in the ticket description. The correct data to include for pickling is the basis (in appropriate order to correspond with consecutive standard basis vectors in the matrix) and the current groundset ordering, given by self._E, which gives the correct correspondence between the elements listed in basis and the columns of the matrix self._A.

These assumptions and the corresponding issue in the __reduce__ method are also found in the TernaryMatroid and QuaternaryMatroid classes, so I've updated the corresponding code in these classes as well. I didn't find specific examples where unpickling causes issues for these two classes, but I didn't search too hard; the problem would show up in examples where a lot of manipulations have been done to the matroid being pickled/unpickled, changing the underlying matrix by row-reduction into a form for which the basis columns are scattered and out of order in the matrix.

@bgillesp
Copy link
Contributor

@bgillesp
Copy link
Contributor

Commit: 94b87e9

@bgillesp
Copy link
Contributor

Author: Bryan Gillespie

@bgillesp
Copy link
Contributor

New commits:

94b87e9Fix issue with pickling of BinaryMatroid objects

@tscrim
Copy link
Collaborator

tscrim commented Aug 13, 2018

comment:5

It is great that you were able to track down the problem (I am not sure I completely understand the issue in full technicality, but I just need to reread what you wrote more carefully). I do have a few little fixes, but otherwise your branch is good.

  • Instead of \\times, it would be better to do r""" at the beginning of the docstring.
  • TESTS:: -> TESTS: since you have text after (not code).
  • Similarly From :trac:`23437` and comments: -> Check that :trac:`23437` is fixed::.
  • Do basis = self._current_rows_cols()[0] as you just need the 1st output; better to not have a _ variable.
  • Similarly _ = list(M.bases())->B = list(M.bases())`.
  • I would probably add the example in the ticket description to the other __reduce__ methods changed (but with an appropriate matroid to test them).

@bgillesp
Copy link
Contributor

comment:6

Great, I'll make the changes and push the updated branch in a bit. Just for my understanding, what's the advantage of avoiding _ variable assignment?

Also, managed to find appropriate corresponding bugs/tests for the TernaryMatroid and QuaternaryMatroid cases; for reference:

sage: from sage.matroids.advanced import TernaryMatroid
sage: X_bin = matroids.named_matroids.Fano().representation()
sage: X = Matrix(GF(3, 'x'), X_bin)
sage: M = TernaryMatroid(matrix=X).dual()
sage: B = list(M.bases())
sage: N = loads(dumps(M))
sage: N.closure(frozenset({3}))
frozenset({4})
sage: N.is_isomorphic(M)

and

sage: from sage.matroids.advanced import QuaternaryMatroid
sage: X_bin = matroids.named_matroids.Fano().representation()
sage: X = Matrix(GF(4, 'x'), X_bin)
sage: M = QuaternaryMatroid(matrix=X).dual()
sage: B = list(M.bases())
sage: N = loads(dumps(M))
sage: N.closure(frozenset({3}))
frozenset({4})
sage: N.is_isomorphic(M)

both hang on the current development branch, but are fixed by the patch.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Aug 14, 2018

Changed commit from 94b87e9 to 4075773

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Aug 14, 2018

Branch pushed to git repo; I updated commit sha1. New commits:

4075773Incorporate small style changes and add doctests

@tscrim
Copy link
Collaborator

tscrim commented Aug 16, 2018

comment:8

Replying to @bgillesp:

Great, I'll make the changes and push the updated branch in a bit. Just for my understanding, what's the advantage of avoiding _ variable assignment?

It is not really a good variable name, even if it is for something to be discarded. I think it might also not be valid or considered as evil on Python3.

Also, managed to find appropriate corresponding bugs/tests for the TernaryMatroid and QuaternaryMatroid cases; for reference:

Great. Thank you. Positive review.

@tscrim
Copy link
Collaborator

tscrim commented Aug 16, 2018

Reviewer: Travis Scrimshaw

@vbraun
Copy link
Member

vbraun commented Aug 20, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants