Complete MatrixObj #945

Open
fingolfin opened this Issue Nov 9, 2016 · 8 comments

Comments

4 participants
@fingolfin
Member

fingolfin commented Nov 9, 2016

This issue is supposed to be the "master issue" for completing the MatrixObj code in the GAP library, which is meant to replace (or complement) our current matrix code (which represents matrices as lists of lists) with proper "matrix objects", which internally can be represented in different ways (as lists of lists in row-major or column-major form; as a list of rows*columns entries; as various kinds of sparse matrices; as compressed matrices; etc.).

I'll try to list some things that needs to be done for MatrixObj to be "complete". But we shouldn't necessarily add everything here, instead we can file separate issues, and also organize these into a GitHub "project", see https://github.com/gap-system/gap/projects/1

(I really have no idea whether everything should be in a separate issue, or in one big list, or... whatever, just do what seems natural. The important thing in the end is to actual code stuff anyway, not how we log the open tasks :-)

@fingolfin

This comment has been minimized.

Show comment
Hide comment
@fingolfin

fingolfin Nov 9, 2016

Member

Here are some things I think should be addressed. Some of these should probably be turned into full issues of their own. Others might turn out to be bad ideas. We'll see

  • Finalize the MatrixObj interface (there were some minor concerns about this by @mohamed-barakat and also myself)
    • consider writing the MatrixObj API so that it also works sanely for non-commutative base domains. E.g. MultRowVector should specify from which side it multiplies with scalars, and perhaps even come in both variants?
  • Make it possible to create groups of MatrixObj matrices with Group([mat1,mat2,...]).
  • Teach the NiceMonomorphism code for groups to support MatrixObj
  • Teach recog to use MatrixObj
  • Complete the documentation, see http://www.gap-system.org/Manuals/doc/ref/chap26.html
  • go through all (well, at least as many as possible) functions in GAP that support matrices as input, and teach them to support MatrixObj
    • as this is a herculean effort, also think about ways to allow backwards compatibility
  • add support for mat[i,j] to access matrix entries (the syntax is already supported, this is mostly about installing suitable methods)
    • perhaps also allow this for plain "lists of lists", to make it easier to write code which accepts both lists of lists and MatrixObj?
  • consider allowing for 0 x n and m x 0 matrix objects. These can then properly remove existing hacks like EmptyMatrix, EmptyRowVector, NullMapMatrix, ...
  • also think about vectors...
    • we want both row and column vectors
    • for non-commutative base domains, row vectors should be left module elements, while column vectors should be right module elements
    • we probably should have a IsVectorObj and then IsRowVectorObj and IsColumnVectorObj specialize it (indeed, virtually all code for them could be identical, it' just a bit that decides whether we interpret something as a 1 x n or n x 1
    • on the other hand, some people may prefer to lazily allow vectors to be both column and row vectors at the same time. This costs us some type safety, but maybe be more convenient??? at the very least, it should be easy and cheap (=fast) to convert between the two
  • RowLength seems like a really bad idea. I propose that instead we add NumberRows and NumberColumns to the MatrixObj API and ditch RowLength. Ideally I'd also disallow Length for MatrixObj, but perhaps it's useful to have it (as alias for NumberRows???) to make transitioning code to using MatrixObj easier. But I think it would be better to start without it, and see whether we really can benefit from it. I.e. show me code that becomes trivially able to use MatrixObj because of Length, and we can keep it... but if you have to modify the code anyway, I think changing it to use NumberRows is better
  • consider adding an API for MatrixObj which indicates if a given matrix obj is stored in row-major or column-major form: Reason: some algorithm are only optimal for one or the other. By detecting this, one could rewrite the matrix first... or invoke different implementations of the algorithm suitable for each rep.... Of course we can add this API later on, it doesn't have to be there initially. But note that some reps may be neither and/or both (think of sparse reps; or block matrices with submatrices of different types -- though perhaps the latter should simply not be done ;-)
Member

fingolfin commented Nov 9, 2016

Here are some things I think should be addressed. Some of these should probably be turned into full issues of their own. Others might turn out to be bad ideas. We'll see

  • Finalize the MatrixObj interface (there were some minor concerns about this by @mohamed-barakat and also myself)
    • consider writing the MatrixObj API so that it also works sanely for non-commutative base domains. E.g. MultRowVector should specify from which side it multiplies with scalars, and perhaps even come in both variants?
  • Make it possible to create groups of MatrixObj matrices with Group([mat1,mat2,...]).
  • Teach the NiceMonomorphism code for groups to support MatrixObj
  • Teach recog to use MatrixObj
  • Complete the documentation, see http://www.gap-system.org/Manuals/doc/ref/chap26.html
  • go through all (well, at least as many as possible) functions in GAP that support matrices as input, and teach them to support MatrixObj
    • as this is a herculean effort, also think about ways to allow backwards compatibility
  • add support for mat[i,j] to access matrix entries (the syntax is already supported, this is mostly about installing suitable methods)
    • perhaps also allow this for plain "lists of lists", to make it easier to write code which accepts both lists of lists and MatrixObj?
  • consider allowing for 0 x n and m x 0 matrix objects. These can then properly remove existing hacks like EmptyMatrix, EmptyRowVector, NullMapMatrix, ...
  • also think about vectors...
    • we want both row and column vectors
    • for non-commutative base domains, row vectors should be left module elements, while column vectors should be right module elements
    • we probably should have a IsVectorObj and then IsRowVectorObj and IsColumnVectorObj specialize it (indeed, virtually all code for them could be identical, it' just a bit that decides whether we interpret something as a 1 x n or n x 1
    • on the other hand, some people may prefer to lazily allow vectors to be both column and row vectors at the same time. This costs us some type safety, but maybe be more convenient??? at the very least, it should be easy and cheap (=fast) to convert between the two
  • RowLength seems like a really bad idea. I propose that instead we add NumberRows and NumberColumns to the MatrixObj API and ditch RowLength. Ideally I'd also disallow Length for MatrixObj, but perhaps it's useful to have it (as alias for NumberRows???) to make transitioning code to using MatrixObj easier. But I think it would be better to start without it, and see whether we really can benefit from it. I.e. show me code that becomes trivially able to use MatrixObj because of Length, and we can keep it... but if you have to modify the code anyway, I think changing it to use NumberRows is better
  • consider adding an API for MatrixObj which indicates if a given matrix obj is stored in row-major or column-major form: Reason: some algorithm are only optimal for one or the other. By detecting this, one could rewrite the matrix first... or invoke different implementations of the algorithm suitable for each rep.... Of course we can add this API later on, it doesn't have to be there initially. But note that some reps may be neither and/or both (think of sparse reps; or block matrices with submatrices of different types -- though perhaps the latter should simply not be done ;-)
@markuspf

This comment has been minimized.

Show comment
Hide comment
@markuspf

markuspf Nov 9, 2016

Member

Obvious question from me: What about semigroups and semirings? We have some code in the semigroups package that might be of some inspiration.

Member

markuspf commented Nov 9, 2016

Obvious question from me: What about semigroups and semirings? We have some code in the semigroups package that might be of some inspiration.

@markuspf

This comment has been minimized.

Show comment
Hide comment
@markuspf

markuspf Nov 9, 2016

Member

Maybe this could be a project for a GAP days or a GAP hackathon. This being the only issue worked on?

Member

markuspf commented Nov 9, 2016

Maybe this could be a project for a GAP days or a GAP hackathon. This being the only issue worked on?

@mohamed-barakat

This comment has been minimized.

Show comment
Hide comment
@mohamed-barakat

mohamed-barakat May 29, 2017

Member

My wish list:

  • The matrix should always know its ring:
    • allow the zero ring
    • do not require the ring to know all its properties, e.g., that knows later that it is the zero ring
    • allow noncommutative (semi)rings.
  • NrRows, NrColumns:
    • allow empty matrices
    • should be attributes not required during creation of the matrix
      (allows implementation of lazy matrices)
Member

mohamed-barakat commented May 29, 2017

My wish list:

  • The matrix should always know its ring:
    • allow the zero ring
    • do not require the ring to know all its properties, e.g., that knows later that it is the zero ring
    • allow noncommutative (semi)rings.
  • NrRows, NrColumns:
    • allow empty matrices
    • should be attributes not required during creation of the matrix
      (allows implementation of lazy matrices)
@fingolfin

This comment has been minimized.

Show comment
Hide comment
@fingolfin

fingolfin May 30, 2017

Member

We are now working on this as part of GAP Days. We created a work branch, see https://github.com/gap-system/gap/tree/MatrixObj resp.
https://github.com/gap-system/gap/compare/MatrixObj

Here is a list of tasks for the rest of the GAP Days:

  • Go through all the operations/attributes in matobj2.gd; then, if we decided to keep the...
    • write a proper documentation comment;
    • verify that a default implementation has been provided (if applicable), else provide one;
    • write some tests
    • possibly provide optimized versions for plistrep, 8bit rep, GF2 rep, ...
  • Write (or complete) a high level description of the MatrixObj interface; in particular, clearly state what a minimal implementation of IsVectorObj resp. IsMatrixObj needs to do (what methods to provide...); also explain things like "vectors are neither row nor column; product is always scalar product"; etc.
  • work on making the library compatible with IsMatrixObj... for example:
    • make it possible to create groups generated by matrix objs, and make sure as many things as possible "work" with it; this probably involves getting the NiceMonomorphism code to deal with MatrixObjs (defined over finite fields)
    • teach the meataxe to work with MatrixObj
    • look for any places in the library using IsMatrix (as a filter), and see if we can / should adapt them for IsMatrixObj
    • for certain operations, we probably can use (almost?) the same code for IsMatrix and IsMatrixObj; possibly provide a way to make it easy to install a method for both at once -> but first see if there really are examples where this would be useful, collect them, then think about how to best tackle this
  • Deal with ConvertToVectorRep / ConvertToMatrixRep
    • ideally, we want to get rid of them completely
    • of course lots of package use them, so we also need to work with package authors, and provide could migration strategies (possibly documented somewhere as well)
  • rename functions like RankMat consistently to RankMatrix; make sure to provide synonyms for the old name, to retain backwards compatibility
  • some love for foo[i,j] syntax
    • benchmark it via foo[i][j] resp. MatElm(foo,i,j)
    • look into replacing the existing non-optimized dispatch code for foo[i,j] by something that is faster in special cases: Right now, foo[i,j] translates to foo[ [i,j] ] (so a temporary plist is created), and then we dispatch generically to methods for \[\] resp. \[\]:=; we could avoid creating the temporary plist, and instead dispatch through MatElm resp. SetMatElm
    • ensure foo[i,j] also works for plists-of-plists resp. for IsMatrix; and in addition, see about extending the hot path from above for this, i.e. don't even dispatch through (Set)MatElm, but directly modify the list;
    • perhaps also have hot paths for compressed matrices???
  • ...
Member

fingolfin commented May 30, 2017

We are now working on this as part of GAP Days. We created a work branch, see https://github.com/gap-system/gap/tree/MatrixObj resp.
https://github.com/gap-system/gap/compare/MatrixObj

Here is a list of tasks for the rest of the GAP Days:

  • Go through all the operations/attributes in matobj2.gd; then, if we decided to keep the...
    • write a proper documentation comment;
    • verify that a default implementation has been provided (if applicable), else provide one;
    • write some tests
    • possibly provide optimized versions for plistrep, 8bit rep, GF2 rep, ...
  • Write (or complete) a high level description of the MatrixObj interface; in particular, clearly state what a minimal implementation of IsVectorObj resp. IsMatrixObj needs to do (what methods to provide...); also explain things like "vectors are neither row nor column; product is always scalar product"; etc.
  • work on making the library compatible with IsMatrixObj... for example:
    • make it possible to create groups generated by matrix objs, and make sure as many things as possible "work" with it; this probably involves getting the NiceMonomorphism code to deal with MatrixObjs (defined over finite fields)
    • teach the meataxe to work with MatrixObj
    • look for any places in the library using IsMatrix (as a filter), and see if we can / should adapt them for IsMatrixObj
    • for certain operations, we probably can use (almost?) the same code for IsMatrix and IsMatrixObj; possibly provide a way to make it easy to install a method for both at once -> but first see if there really are examples where this would be useful, collect them, then think about how to best tackle this
  • Deal with ConvertToVectorRep / ConvertToMatrixRep
    • ideally, we want to get rid of them completely
    • of course lots of package use them, so we also need to work with package authors, and provide could migration strategies (possibly documented somewhere as well)
  • rename functions like RankMat consistently to RankMatrix; make sure to provide synonyms for the old name, to retain backwards compatibility
  • some love for foo[i,j] syntax
    • benchmark it via foo[i][j] resp. MatElm(foo,i,j)
    • look into replacing the existing non-optimized dispatch code for foo[i,j] by something that is faster in special cases: Right now, foo[i,j] translates to foo[ [i,j] ] (so a temporary plist is created), and then we dispatch generically to methods for \[\] resp. \[\]:=; we could avoid creating the temporary plist, and instead dispatch through MatElm resp. SetMatElm
    • ensure foo[i,j] also works for plists-of-plists resp. for IsMatrix; and in addition, see about extending the hot path from above for this, i.e. don't even dispatch through (Set)MatElm, but directly modify the list;
    • perhaps also have hot paths for compressed matrices???
  • ...
@ssiccha

This comment has been minimized.

Show comment
Hide comment
@ssiccha

ssiccha May 31, 2017

Contributor
  • IsPlistMatrixRep / IsPlistVectorRep access: compare performance-wise ![BDPOS] (global vars), ![1] (directly), !.BaseDomain (AsComponentObj), BASE_DOMAIN() (Kernel function)
Contributor

ssiccha commented May 31, 2017

  • IsPlistMatrixRep / IsPlistVectorRep access: compare performance-wise ![BDPOS] (global vars), ![1] (directly), !.BaseDomain (AsComponentObj), BASE_DOMAIN() (Kernel function)
@fingolfin

This comment has been minimized.

Show comment
Hide comment
@fingolfin

fingolfin Jun 18, 2017

Member

@ssiccha any progress on that?

In other news, see also: #1427

Member

fingolfin commented Jun 18, 2017

@ssiccha any progress on that?

In other news, see also: #1427

@ssiccha

This comment has been minimized.

Show comment
Hide comment
@ssiccha

ssiccha Jul 7, 2017

Contributor

@fingolfin See commit for the results.
It looks like there are no big differences, unless you install a kernel function into an Attribute instead of into an Operation. Which is unfortunate because atm BaseDomain is declared as an Attribute in matobj2.gd.

Contributor

ssiccha commented Jul 7, 2017

@fingolfin See commit for the results.
It looks like there are no big differences, unless you install a kernel function into an Attribute instead of into an Operation. Which is unfortunate because atm BaseDomain is declared as an Attribute in matobj2.gd.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment