HPC GAP Plans

Max Horn edited this page Mar 17, 2015 · 12 revisions

HPC-GAP future: questions and planning

For now this is mostly a list of questions, but hopefully they can guide us towards some actual plans.

What needs to be done to...

... make a public HPC-GAP beta release?

  • stable single-threaded mode (see below)

  • documentation for end users: not just how to start HPC-GAP, but also basic hints on experimenting with parallelization, and on common pitfalls

  • documentation for package developers on how to port packages

  • kernel documentation (for people who want to port or fix C / C++ code, both in the GAP kernel and in kernel extensions provided by packages)

  • Tests:

    • For the correct operation of the HPC-specific features
    • For the continued correct operation of the library features that we had to modify to make them thread-safe

... make HPC-GAP in single-threaded mode a viable replacement for plain GAP?

  • in particular, what is slower / faster?
  • provide concrete examples / benchmarks that showcase specific slowdowns
  • any known functionality regressions?
  • Single-threaded mode should (essentially) be performance-neutral and have identical functionality compared to normal GAP, assuming:
    • The Gasman garbage collector is used in single-threaded mode (there are already #ifdefs for this, but the code doesn't compile yet).
    • Ward isn't being used (since it's unnecessary).
    • Relevant concurrency primitives become no-ops.
  • Individual GAP libraries may be adversely affected by being made thread-safe.

... make HPC-GAP usable for a general audience in multi-threaded mode?

  • provide a list of concrete issues that need to be addressed eventually, perhaps with priorities, and ideally also some hints (e.g.: for things that need to be fixed in many places, provide 1-2 examples comparing old code and fixed code)

  • below are a few preliminary lists; please feel free to add to them. We may want to move these into an issue tracker somewhere, though

issues: incorrect / unexpected results, crashes

  • readline / color prompt issues :)

  • conversions can be problematic, i.e. places that use things like ConvertToMatrixRep or ConvertToVectorRep (TODO: give some concrete examples for this)

  • polycyclic group collectors cause problems (TODO: document ways to reproduce that, describe issue; Steve and Max H. know more about it; Max H. might work on it)

  • Typing of plists that you only have read access to raises two problems:

    • In some situations, to get the type or to perform tests such as IsMatrix, you need to recurse down through all nested subobjects, for instance to find out whether the list is loopy (contains itself). To do this safely (or efficiently in the presence of heavily shared sublists), the regular code uses TNUM changes to record where it has already been. This is not possible if some or all of the list is read-only.
    • Because we cannot change the TNUM of the list, some information may not be remembered between successive typings of the same list. GAP code tends to assume that it will be.

  • ... TODO ....

issues: performance regressions

  • provide concrete examples!

  • for each example, analyze what causes the slowdown; e.g. changed memory management; ward guards; other locking features; etc.

  • then we can (a) try to fix it, or at least improve the situation; and (b) document typical regressions of this kind and, where possible, workarounds, so that users / package authors know about them and can (1) decide whether it matters to them and (2) deal with it if necessary and possible

  • ...

issues: usability

  • Probably make the multi-threaded UI a bit nicer and, more importantly, document how it works so that it can be extended by anyone who wants to.

... get better documentation?

Well, obviously, somebody needs to write it... :). But if people want to help with that, then as with everything else, they need to know what needs to be done, and where.

Things that need to be done include:

  • documentation for end users: not just how to start HPC-GAP, but also basic hints on experimenting with parallelization, and on common pitfalls

  • documentation for package developers on how to port packages

  • kernel documentation (for people who want to port or fix C / C++ code, both in the GAP kernel and in kernel extensions provided by packages)

  • in the long run, a tutorial might also be helpful, showcasing things one can do with parallelization, but also the pitfalls

TODO: describe where these need to be done. E.g. where to find the kernel documentation.

TODO: the hpcgap documentation currently lives in a separate world; convert it to GAPDoc format?

... figure out whether my code works in HPC-GAP?

Suppose a user already has some existing code (or a package) and wants to know whether it "works" in HPC-GAP. What should they look out for? How should they test? And after that, how should they address the problems they find along the way?

Of course, passing the test suite is a good first check, but in the end it only tells you whether things work in a single thread. That is a start, but what are good strategies for figuring out whether things keep working across multiple threads?

Some specific hints and/or examples would be helpful. Here's a potential start:

  • You could run your test suite in several threads at once, i.e. start it concurrently in multiple threads.

  • Once that works, try to create objects in one thread and use them in others. TODO: give some examples

  • ...

... make HPC-GAP ready to be the successor of GAP, i.e. turn it into GAP 5.0?

  • of course, all of the above

  • but we also need a good transition plan to be able to migrate most (ideally, all) existing packages and user code; we do not want the GAP 3->4 transition disaster to be repeated