Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metaticket - SageManifolds Code Improvement Discussion #30139

Open
mjungmath opened this issue Jul 14, 2020 · 32 comments
Open

Metaticket - SageManifolds Code Improvement Discussion #30139

mjungmath opened this issue Jul 14, 2020 · 32 comments

Comments

@mjungmath
Copy link

The main purpose of this ticket is a discussion and coordination platform regarding general code improvements of the preexisting SageManifolds code apart from bugfixes and new features. That is, for instance, the removal of existing code duplication, and strategies to improve readability, debugging and compatibility.

I for myself do not have much experience in this field. However, I am eager to improve this nice piece of package as much as I can, so that anyone can benefit from it. In any case, this is not a one man task and I appreciate any support in this matter.

I am looking forward to the discussion! :)

Tasks:

CC: @egourgoulhon @tscrim @tobiasdiez @fchapoton @mkoeppe @nthiery

Component: manifolds

Keywords: sagemanifolds

Issue created by migration from https://trac.sagemath.org/ticket/30139

@mjungmath mjungmath added this to the sage-9.2 milestone Jul 14, 2020
@mjungmath
Copy link
Author

comment:1

Allow me to start the discussion. In my opinion, the highest priority has the removal of code duplication. The new feature of vector bundles, for example, came with a cost: code duplication and redundancies. This is certainly unwanted (see ticket #29234). Take sections and tensor fields. They share a lot of common code snippets and methods. For example the method restrict which is more or less the same. Bugs for sections must now be fixed for tensor fields, and vice versa. But a simple inheritation comes with a problem: we have to drop most of the preexisting documentation. Imho, this would be sad, and I like to discuss how we deal with that.

Of course, this issue is not only restricted to vector bundles. As I modified tensor fields, I had to make changes in three different files simultaneously which had a very similar setup with similar code snippets: manifolds/differentiable/tensorfield.py, manifolds/differentiable/tensorfield_paral.py and tensor/modules/free_module_tensor.py. I had to copy the very same lines over and over again. It was a tedious task not only to modify the code but also to understand how it was built up. A simplification would be desireable, even though I have no clue how. If I look through the code, I am sure, I will find further examples.

Another thing I am bothering with is the invocation of private variables via ._private_variable. I think, these variables are private for a reason, and in my opinion, to keep the code as clean as possible, they should be invoced by methods and not directly. But I think, this has low priority because it has not such a tremendous impact. However, it would be nice, I guess.

@mjungmath

This comment has been minimized.

@mkoeppe
Copy link
Member

mkoeppe commented Jul 14, 2020

comment:4

Let me preface my comments: I have only very recently started to look into sage-manifolds and my understanding of this code is very shallow.

Replying to @mjungmath:

The new feature of vector bundles, for example, came with a cost: code duplication and redundancies. This is certainly unwanted (see ticket #29234). Take sections and tensor fields. They share a lot of common code snippets and methods. For example the method restrict which is more or less the same. Bugs for sections must now be fixed for tensor fields, and vice versa. But a simple inheritation comes with a problem: we have to drop most of the preexisting documentation. Imho, this would be sad, and I like to discuss how we deal with that.

If you are concerned about removing doctests (rather than documentation) as a byproduct of removing code duplications: There is a solution by moving many tests from doctests to _test... methods. These are inherited by subclasses and therefore test coverage can be improved.

I did this in the sage.numerical.backends modules in #20323 and follow-up tickets, and recently gh-kliem started to do the same in sage.geometry.polyhedra (see for example #29918).

In sage.numerical.backends.logging_backend I have some code that assists with transforming doctests to _test... methods, which could be generalized easily.

@mjungmath
Copy link
Author

comment:5

Replying to @mkoeppe:

If you are concerned about removing doctests (rather than documentation) as a byproduct of removing code duplications: There is a solution by moving many tests from doctests to _test... methods. These are inherited by subclasses and therefore test coverage can be improved.

Thank you. This is good to know and definitely an important point. My concern, however, is also the documentation itself. Tensor fields and sections are declared slighly differently. For example, sections are constructed by using an instance of the vector bundle. Tensor fields, on the other hand, are declared by using an instance of the manifold. Therefore, it would be good to have separate closed minimal examples and hence distinct documentations.

What about something like this:

class MyClass(ParentClass)
    r"""
    Blabla
    """

    my_method(self, a, b, c='c'):
        r"""
        Overwritten docstring
        """
        return ParentClass.my_method(self, a, b, c=c)

Or is that too nasty?

@mkoeppe
Copy link
Member

mkoeppe commented Jul 14, 2020

comment:6

Nothing wrong with that, I would say. (Of course, one would use super() instead of referring to the parent class by name.)
The performance impact should be minimal.

@nthiery
Copy link
Contributor

nthiery commented Jul 14, 2020

comment:7

Just one caveat: it makes it more complicated to look at the code of the method by introspection.

I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.

Matthias: glad to see _test methods being adopted at a large scale, with additional tooling :-)

@mkoeppe
Copy link
Member

mkoeppe commented Jul 14, 2020

comment:8

In a different direction: The (multi)linear algebra in sage.tensor that was developed for sage-manifolds has a much better design than the linear algebra provided by sage.matrix (see, for example, #30091).

It would be nice to be able to use it for computational (exact and inexact) (multi)linear algebra. In #30061, Eric already did some improvements that makes better use of sparsity.

In addition to such speedups, it would be good to be able to use different backends for storing tensors instead of the current dictionary backend.

On the one hand, for special cases such as vectors and matrices by existing special implementations; for the special case of fully symmetric tensors, using fast implementations of multivariate polynomials.

On the other hand, for general numerical tensors using mainstream accelerated frameworks like PyTorch (see #30096).

@mjungmath
Copy link
Author

comment:9

Replying to @mkoeppe:

Nothing wrong with that, I would say. (Of course, one would use super() instead of referring to the parent class by name.)

I would have thought that referring to the parent class by name is more robust against changes and increases readability. Is there a particular reason to prefer super()?

The performance impact should be minimal.

That's good.

Replying to @nthiery:

Just one caveat: it makes it more complicated to look at the code of the method by introspection.

What exactly do you mean by that?

I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.

I'd guess this would contradict the well-documented philosophy of Sage, wouldn't it?

@mkoeppe
Copy link
Member

mkoeppe commented Jul 17, 2020

comment:10

Replying to @mjungmath:

Replying to @mkoeppe:

(Of course, one would use super() instead of referring to the parent class by name.)

I would have thought that referring to the parent class by name is more robust against changes and increases readability. Is there a particular reason to prefer super()?

For classes with single inheritance, it is merely a matter of style. But as soon as multiple inheritance enters, using super() ensures that methods of all superclasses are called (and called in the right order)

@mkoeppe

This comment has been minimized.

@bollu
Copy link
Member

bollu commented Jul 22, 2020

comment:12

As someone who's new to sage, but would like to get involved in SageManifolds: how can I help?

@egourgoulhon
Copy link
Member

comment:13

Replying to @bollu:

As someone who's new to sage, but would like to get involved in SageManifolds: how can I help?

Welcome to the project!
There are various ways to contribute, see
https://sagemanifolds.obspm.fr/contrib.html.

If you take a look at the metaticket #18528, you'll see that various tickets are in stalled state (mostly the blue ones), are therefore are open for development. For instance, #18786 about complex structures. You could pick one of those or create a brand new ticket about a topic that you like.

@mjungmath

This comment has been minimized.

@egourgoulhon
Copy link
Member

comment:15

Replying to @mjungmath:

Replying to @mkoeppe:

What about something like this:

class MyClass(ParentClass)
    r"""
    Blabla
    """

    my_method(self, a, b, c='c'):
        r"""
        Overwritten docstring
        """
        return ParentClass.my_method(self, a, b, c=c)

Or is that too nasty?

This is code duplication and should be avoided IMHO, cf. comment:6 in #29234 for the case of sections vs. tensor fields.

@egourgoulhon
Copy link
Member

comment:16

Replying to @nthiery:

I don't have a good solution though. The best approximation I can come it is to include both the examples/doc of vector bundles and tensor fields in the documentation of the method. It makes it a bit long for the reader. But points out the similarity, which is not bad per se.

+1

@mjungmath
Copy link
Author

comment:17

Tobias had a very nice idea how to deal with some code duplication, in particular the restriction process: introduce sheafs (cf. comment:7 in #29234).

I like that idea and I think this is something we definitely should aim for. However, this task is not as easy as one might think on the first glimpse. My first attempt would be utilizing the morphism framework of Sage. Unfortunately, my experience with morphisms in Sage is very limited. What do you think? Any further ideas?

@egourgoulhon
Copy link
Member

comment:18

Replying to @mjungmath:

Replying to @nthiery:

Just one caveat: it makes it more complicated to look at the code of the method by introspection.

What exactly do you mean by that?

I guess Nicolas means that if the user would like to take a look at the source code via t.my_method??, he/she would get return ParentClass.my_method(self, a, b, c=c), i.e.
useless information.

@mkoeppe mkoeppe modified the milestones: sage-9.2, sage-9.3 Sep 5, 2020
@mjungmath
Copy link
Author

comment:20

I thought a bit about optimizations and the first thing that came to my mind: Cythonize code. Of course, one has to check with prun etc. whether it's worth it. However, Cythonizing comp.py and/or the algebraic part in the tensor module should be doable.

One thing that could definitely be Cythonized is the way of how names, i.e. strings, are concatenated, for example in format_utilities.py. I can imagine that this could improve speed a lot. Of course, this needs to be investigated with some performance checks before, too.

Eric, what do you think?

@egourgoulhon
Copy link
Member

comment:21

Replying to @mjungmath:

I thought a bit about optimizations and the first thing that came to my mind: Cythonize code. Of course, one has to check with prun etc. whether it's worth it. However, Cythonizing comp.py and/or the algebraic part in the tensor module should be doable.

One thing that could definitely be Cythonized is the way of how names, i.e. strings, are concatenated, for example in format_utilities.py. I can imagine that this could improve speed a lot. Of course, this needs to be investigated with some performance checks before, too.

Eric, what do you think?

Cythonizing the algebraic part of tensor calculus will unfortunately not help much to improve the computational speed on manifolds. The reason is that the main bottleneck is symbolic calculus on the tensor field components and more specifically the simplification process, which is performed via Maxima (default backend, Lisp based) or by SymPy (optional backend, Python based). To really gain in performance, one shall rely on parallelization rather than cythonization. A real limitation in this respect is #27492, which forbids parallelization with symbolic functions (side note: certainly Sage's symbolic functions require some major rewritting, see #31047)

@bollu
Copy link
Member

bollu commented Jan 18, 2021

comment:22

Perhaps I am missing something. SymPy has a new C++ backend called SymEngine https://github.com/symengine/symengine. Would that improve performance by some degree?

@egourgoulhon
Copy link
Member

comment:23

Replying to @bollu:

Perhaps I am missing something. SymPy has a new C++ backend called SymEngine https://github.com/symengine/symengine. Would that improve performance by some degree?

Thanks for pointing this! This is certainly worth to investigate. Especially SymEngine's README says: Python wrappers allow easy usage from Python and integration with SymPy and Sage (the symengine.py repository).

@bollu
Copy link
Member

bollu commented Jan 18, 2021

comment:24

If one were to attempt to integrate/use SymEngine for the purpose of making the algebra faster, what should one do? I'd imagine

  1. Pull in SymEngine into SAGE
  2. Replace the calls to Maxima to call to SymEngine
  3. Benchmark?

@mjungmath
Copy link
Author

comment:25

Replying to @bollu:

Perhaps I am missing something. SymPy has a new C++ backend called SymEngine https://github.com/symengine/symengine. Would that improve performance by some degree?

That looks extremely promising. It's definitely worth investigating. And who knows, perhaps it could even replace the entire symbolic backend in Sage.

Perhaps we should post this in the sage-devel group and initiate a discussion.

@egourgoulhon
Copy link
Member

comment:26

Replying to @mjungmath:

Replying to @bollu:

Perhaps I am missing something. SymPy has a new C++ backend called SymEngine https://github.com/symengine/symengine. Would that improve performance by some degree?

That looks extremely promising. It's definitely worth investigating. And who knows, perhaps it could even replace the entire symbolic backend in Sage.

Indeed, this is all the more relevant, given that the main developer of Pynac (Sage's symbolic backend) has resigned:
https://groups.google.com/g/sage-devel/c/JVLHRy2mxUI

Perhaps we should post this in the sage-devel group and initiate a discussion.

Yes this is certainly the right place to initiate the discussion.

@bollu
Copy link
Member

bollu commented Jan 18, 2021

comment:27

I've created a thread on sage-devel about moving to SymEngine https://groups.google.com/g/sage-devel/c/sY3zh-pq8T4

@mkoeppe
Copy link
Member

mkoeppe commented Mar 27, 2021

comment:28

Sage development has entered the release candidate phase for 9.3. Setting a new milestone for this ticket based on a cursory review of ticket status, priority, and last modification date.

@mkoeppe mkoeppe modified the milestones: sage-9.3, sage-9.4 Mar 27, 2021
@mjungmath
Copy link
Author

comment:29

Indeed, it would be nice to (co)homology of manifolds. There are two approaches I see right now:

As for Morse homology I'd like to refer to https://www.sciencedirect.com/science/article/pii/S0723086905000642. With the orientation implementation we already have, this could be doable, even though there are some immediate problems that I can think of:

  • How do we check whether a function is Morse?
  • How can we obtain all critial points? How do we determine their indices?

Is there perhaps a nice class of Morse functions we can easily deal with?

Opinions, comments, hints, references and ideas are welcome! :)

@tscrim
Copy link
Collaborator

tscrim commented Apr 7, 2021

comment:30

I would probably start with Čech cohomology since it is more straightforward.

For finding the critical points in Morse homology, I think you can compute them on each chart individually since they are suppose to be isolated IIRC and then remove redundancies. From there, you can check the degeneracy condition.

@mjungmath
Copy link
Author

comment:31

Replying to @tscrim:

I would probably start with Čech cohomology since it is more straightforward.

For that, we need "good covers" consisting of contractible sets, and all its intersections being contractible. Thus we need a way to check contractibility for open subsets. Is that worth a post in sage-devel?

@mjungmath
Copy link
Author

comment:32

The manifolds code is growing steadily and more files are added. Do you think it is time that we consider refactoring the folder structure of the manifolds module?

@mkoeppe
Copy link
Member

mkoeppe commented Apr 27, 2021

comment:33

Replying to @mjungmath:

The manifolds code is growing steadily and more files are added. Do you think it is time that we consider refactoring the folder structure of the manifolds module?

Some of the things added as part of #31740 (Meta-ticket: Families, posets, complexes of manifold subsets) could certainly be moved into a subpackage sage.manifolds.subsets:

@mjungmath
Copy link
Author

comment:34

Opened #32274 for the structuring issue.

@mkoeppe mkoeppe modified the milestones: sage-9.4, sage-9.5 Aug 10, 2021
@mkoeppe mkoeppe modified the milestones: sage-9.5, sage-9.6 Dec 18, 2021
@mkoeppe mkoeppe modified the milestones: sage-9.6, sage-9.7 May 3, 2022
@mkoeppe mkoeppe modified the milestones: sage-9.7, sage-9.8 Aug 31, 2022
@mkoeppe mkoeppe removed this from the sage-9.8 milestone Jan 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants