Make submersion and tangent_submersion abstract in LevelSet #1681

luisfpereira · 2022-10-17T18:53:09Z

Partially answers #1680 for LevelSet.

Missing:

docstrings

Need to discuss:

should value somehow also be forced to be defined or passing as argument to parent works fine? (I think it should be OK as argument)
is the API naming OK? e.g. point and vector, point, respectively?

@ninamiolane @ymontmarin

…e children accordingly

ymontmarin · 2022-10-18T08:39:23Z

Looks promising !!!

I think embedding_space should follow the same logic, as you know the submersion and its tangent before any instantiation you know the space in which it submerse. Again, it can look strange that we can instantiate the space outside

maybe with something like

@abs.abstract
def define_embedding_space(self):
    return

@property
def embedding_space(self):
    if not hasattr(self, '_embedding_space'):
         self._embedding_space = self.define_embedding_space()
    return self._embedding_space

or alternatively:

self._embedding_space = self.define_embedding_space()

in __init__ but then we have to be sure that child class will use super().__init__()

Maybe tangent does not necessary needs to be abstract and we can implement a default autodiff implementation instead, for class with no close form of the tangent immersion

Let us discuss !

ninamiolane

This is great! I also like the fact that having them as abstract method enforces a consistent API of these functions.

geomstats/geometry/base.py

ninamiolane · 2022-10-18T09:24:14Z

Re: "should value somehow also be forced to be defined or passing as argument to parent works fine? (I think it should be OK as argument)"

I think that value could be hardcoded to 0. because the submersion can always be redefined (by subtracting value) to have value zero.

Re: API
Looks good!

ninamiolane · 2022-10-18T09:25:33Z

Another inconsistency in LevelSet: it has default_coords_type being intrinsic, but it uses the shape of the embedding space. The docstring also mention that the shapes of points should be [..., dim] which seems to refer to intrinsic coordinates.

ninamiolane · 2022-10-21T09:55:01Z

Another comment: tangent_submersion should me a method but not be abstract but it does not have to be overwritten. If it is not provided, then we use gs.autodiff.jacobian(submersion). <--- that piece of code would go in the LevelSet method.

ninamiolane · 2022-10-21T10:05:49Z

Another question: we want to put define_embedding_space to prevent the users to change the embedding space once the object (say sphere) of a class (say Hypersphere) subclassing LevelSet has been instantiated.

But what prevents users to still do "sphere.embedding_space = new_space" again?

ymontmarin · 2022-10-23T21:36:10Z

Another inconsistency in LevelSet: it has default_coords_type being intrinsic, but it uses the shape of the embedding space. The docstring also mention that the shapes of points should be [..., dim] which seems to refer to intrinsic coordinates.

Maybe my question is stupid but: we agree we have intrinsic => shape = [..., dim], but what about the other way around ? I guess there is some case in Poincaré disk where extrinsinc coordinate are also shape = [..., dim] ? Should we do a check.

ymontmarin · 2022-10-23T21:40:26Z

Another comment: tangent_submersion should me a method but not be abstract but it does not have to be overwritten. If it is not provided, then we use gs.autodiff.jacobian(submersion). <--- that piece of code would go in the LevelSet method.

I think it is what is written in PullbackDiffeo

ymontmarin · 2022-10-23T21:43:39Z

Another question: we want to put define_embedding_space to prevent the users to change the embedding space once the object (say sphere) of a class (say Hypersphere) subclassing LevelSet has been instantiated.

But what prevents users to still do "sphere.embedding_space = new_space" again?

Because embedding_space will be a property without a setter hook, so it can't be set by the user

… submersion

review-notebook-app · 2022-10-27T18:51:57Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

luisfpereira · 2022-10-27T18:54:58Z

@ninamiolane @ymontmarin I think this may be ready. A new review from your side may be worthwhile, as several things have changed. I think it implements "our vision".

codecov · 2022-10-27T19:07:32Z

Codecov Report

Merging #1681 (557f9af) into master (4b031ac) will decrease coverage by 11.39%.
The diff coverage is 91.16%.

❗ Current head 557f9af differs from pull request most recent head 035d053. Consider uploading reports for the commit 035d053 to get more accurate results

@@             Coverage Diff             @@
##           master    #1681       +/-   ##
===========================================
- Coverage   91.95%   80.56%   -11.38%     
===========================================
  Files         129      117       -12     
  Lines       13327    12604      -723     
===========================================
- Hits        12253    10153     -2100     
- Misses       1074     2451     +1377

Flag	Coverage Δ
autograd	`?`
numpy	`?`
pytorch	`80.56% <91.16%> (+0.06%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
geomstats/geometry/special_euclidean.py	`88.47% <60.00%> (-0.13%)`	⬇️
geomstats/geometry/discrete_curves.py	`91.70% <87.50%> (-1.28%)`	⬇️
geomstats/geometry/hyperboloid.py	`84.49% <90.91%> (-0.06%)`	⬇️
geomstats/geometry/special_orthogonal.py	`97.52% <90.91%> (-0.41%)`	⬇️
geomstats/geometry/base.py	`91.93% <96.00%> (-0.52%)`	⬇️
...omstats/geometry/full_rank_correlation_matrices.py	`100.00% <100.00%> (ø)`
geomstats/geometry/grassmannian.py	`93.94% <100.00%> (+0.33%)`	⬆️
geomstats/geometry/hypersphere.py	`81.33% <100.00%> (-6.69%)`	⬇️
geomstats/geometry/pre_shape.py	`65.93% <100.00%> (-26.55%)`	⬇️
geomstats/geometry/stiefel.py	`76.67% <100.00%> (-2.99%)`	⬇️
... and 59 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

ninamiolane

Very nice, thank you! I've just put minor comments. Maybe double check the docstrings of the submersions so that they match the fact that the value is now 0 (I've put one comment to that effect in SE(n).

I'm still not a huge fan of the _define_embedding_space, because:

this would be the only method across all manifolds that is "talking to the user" (with the keyword "define") and this feels wrong. All the other methods are purely impersonal and geometry-facing.
if it's main role is to prevent the user from changing the embedding_space attribute, then this goes against our vision of "not preventing anything, but letting the user use the manifolds at their own risk". This would also be prevented by the new gold rule which is "never modify the attributes of a manifold, unless it's its numerics objects (ExpSolver)".

But I forget the details of our debate about this : happy to go with it if I'm forgetting important aspect of this choice.

geomstats/geometry/base.py

geomstats/geometry/full_rank_correlation_matrices.py

geomstats/geometry/grassmannian.py

geomstats/geometry/special_euclidean.py

luisfpereira · 2022-11-02T15:35:16Z

@ninamiolane, regarding _define_embedding_space I would say the main strength of this implementation is that it forces the contributor to define an embedding space (since we are asking _define_embedding_space as abstract method). In other words, the "contract" for creating a new child of LevelSet is very clear for the contributor.

I can only see an alternative to this, but without the strength we have above (i.e. a less knowledgeable contributor may not know he/she has to define embedding_space):

embedding space is define in __init__ as self.embedding_space = ...

The alternative of having a property (or function) simply called embedding_space does not work because it would instantiate the embedding space everytime we do space.embedding_space.

Do you have any alternatives in mind? Would we prefer the alternative I mentioned to the current implementation?

Edit: What I would probably like do to make the current implementation more coherent with the second point you make is to not have embedding_space as a property, but have it defined in LevelSet has self.embedding_space = self._define_embedding_space(). What do you think?

ninamiolane · 2022-11-02T17:27:28Z

I would indeed prefer the alternative of the attribute. To tell the less knowledgeable contributor he/she has to define embedding_space, we could have a clear error message: the first time self.embedding_space is needed, if it is not defined then the error tells the user to define it.

I know we talked a lot about this already. Maybe we could poll everybody at the next meeting to get a vote?

ymontmarin · 2022-11-03T00:36:41Z

From what i remember from the discussion the starting point was that in python methods implementation are defined when child class are written, meaning it is decided before any instantiation. Whereas argument of __init__ may not (it allows definition to be outside the object). In addition, having a method defining an attribute only once ensure it canot be modified in the same way method should not be modify.

As we discuss, it is more natural to have immersion (and its derivative) as a method, since it is a function anyway, and it is known when child class are written. Mathematically, the immersion is the tuple $(\phi, E)$ so it makes sense the the embedding space should be at the same level as the immersion, hence a method.

But if we step back, what we really want is that embedding_space implementation is written when a child clas is written, at the same moment as the immersion (it is weird to require that immersion should be implemented when child class is written and not requiring the definition of the space it goes in).
To be more precise, in the current implementation it is possible to have code such as:

class Child(LevelSet):
    def immersion(self, x):
          XXX

e = Euclidean(n)
a = Child(dim=m, embedding_space=e)

which is pretty weird to me, because immersion implementation is written inside the object, and the space it goes to is written outside.
The most natural way of ensuring this cannot happen, is to use method (or some python dark magic) and mark it as abstract.

But I agree with @ninamiolane that a method _define_embedding_space feel weird and different. But actually, I do not see the word define as a name talking to the coder, but as a description of what happen in the method, something like _instantiate_embedding_space is more accurate. Still, it is different from the other methods name that are mathematic. So lexically i totally agree that it does not feel right !

So for a better use experience, it could be a property ! But then the instantiation would occur every time the attribute is asked.

A python magic that can work is this:

class LevelSet(ABC):
    def __init__(self, dim, default_coords_type="intrinsic", **kwargs):
        super().__init__(dim=dim, default_coords_type=default_coords_type, **kwargs)

    def __getattribute__(self, name):
        if name == 'embedding_space':
            if not hasattr(self, '_embedding_space'):
                self._embedding_space = super().__getattribute__('embedding_space')
            return self._embedding_space
        return super().__getattribute__(name)

    @property
    @abstractmethod
    def embedding_space(self):
        pass

    @abstractmethod
    def immersion(self):
        pass

In this way, embedding_space is defined using a property so it is closer to mathematic naming, it is marked as abstract and should be written in the child class with immersion but we still have the lazy computation which call it only once and it does not impact the user experience.

We will have

class Hypersphere(LevelSet):
    def __init__(self, dim):
        super().__init__(dim=dim, default_coords_type='extrasinc', **kwargs)

    @property
    def embedding_space(self):
        return Euclidean(self.dim + 1)

    def immersion(self, x):
        return gs.norm(x) - 1

luisfpereira · 2022-11-03T08:54:05Z

I think @ymontmarin's Child example shows clearly why it may be inadequate to have the definition at __init__ level.

I fully disagree with the idea of instantiating it every time: it will negatively impact performance and will not allow to change any numerical parameter of algorithms associated with a given embedding space.

I don't like so much the tricks of checking if it hasattr (for me it looks like an unnecessary condition - the impact in performance is negligible, but there's still a dictionary key lookup every time we access embedding space). Besides, we know it will have, for sure, the attribute after instantiation. This is the reason why I started this PR by having embedding_space has a property and changed it (today) to simply have it has attribute (instantiated in the init of LevelSet) - also, we need to instantiate it in LevelSet anyway to have access to shape. This is strong due to our "new" way of thinking stating we allow mathematical attributes to change, but at user's risk.

The semantics of _define_embedding_space do not look so strange to me. It is not so far from _create_basis we already have in several other classes (the reason of being of that method is performance).

I agree with the poll @ninamiolane, but after our discussions, I kind of think the solution current implemented in the PR is the strongest technically speaking.

In order to move on, do you agree I merge this solution for now (after addressing @ninamiolane's comments) and then we update it based on our poll? (The implementation on this PR is by far much cleaner than the currently available in master.)

ymontmarin · 2022-11-03T10:06:33Z

@LPereira95 I also add some minor comments in the code.

About the main discussion:

under the hood, any attribute access is a key lookup, which is a constant time operation. Having 2 instead of one does not change performance
I also fully disagree with the idea of instantiating it every time, hence the classical trick of using xxx and _xxx
"also, we need to instantiate it in LevelSet anyway to have access to shape." I don't understand the difference you are talking about. If you use self.embbeding_space.shape with the property implementation, it will call _define_embedding_space so the instantiation also happen in __init__. In any case, none of the above solve the problem of circular need: _define_embedding_space may need attributes (e.g. dim) that are defined in __init__ but __init__ need embedding_space.shape that is define in _define_embedding_space
So maybe the semantic should simply be _create_embedding_space ?

I think the circular issue (and other comments) need to be addressed before merging. I only give suggestion for the love <3 of geomstats; let me know for curiosity, what people want :)

An idea for circular loop issue while following @LPereira95 implementation choice:

class LevelSet(ABC):
    def __init__(self, dim, shape=None, default_coords_type="extrinsic", **kwargs):
        self.embedding_space = self._create_embedding_space(dim, shape, default_coords_type, **kwargs)
        if shape is None:
            shape = self.embedding_space.shape
        super().__init__(dim=dim, shape=shape, default_coords_type=default_coords_type, **kwargs)

    @abstractmethod
    def _create_embedding_space(self, dim, shape, default_coords_type, **kwargs):
        pass

    @abstractmethod
    def immersion(self):
        pass

Such that implementation of Hypersphere looks like

class Hypersphere(LevelSet):

    def _create_embedding_space(self, dim, **kwargs):
        return Euclidean(dim + 1)

    def immersion(self, x):
        return gs.norm(x) - 1

luisfpereira · 2022-11-03T10:48:27Z

Thanks a lot @ymontmarin, we deeply appreciate your comments and suggestions. All this discussion is to try to come up with a solution that pleases everyone and is technically robust.

Thanks for the clarification on the constant time operation for key lookup (in fact I should have thought better before my comment before).

For the third point, I mean that since we definitely have to instantiate embedding_space in the init of LevelSet to access shape, it looks unnecessary to me to instantiate it "implicitly" by having a property verifying if _embedding_space already exist and instantiating it if not. The current solution is more explicit in my opinion.

Your point on circularity is in fact interesting, but I would argue there's not really a circularity. What's happening is that we normally need the arguments passed in child init to instantiate the embedding space. I would say it is a good practice to store init args as attributes (e.g. sklearn forces you to do that to allow a better __repr__), so this solution actually forces us to follow a good practice. Passing arguments to _define/create_embedding_space does not feel right to me. The case of the Hypersphere is a little particular in the sense that it asks for dim, which is also an attribute of Manifold. (But I can fully understand why you are raising this issue, and I'm completely open to find out a better solution)

Both _create and _define work for me.

ymontmarin · 2022-11-03T11:13:52Z

Ok I get the implicit reasoning for the instanciation, I agree with you !
I agree with the sklearn practice, it is also the practice i follow !

But in our case the argument are setted as attribute of the object during the call super().__init__(...) but this line is AFTER the line self.embedding_space = self._create_embedding_space(), which mean than when _create_embedding_space is called the arguments have not been setted as attribute yet ! So self has no dim attribute and _create_embedding_space can't access dim.

The natural solution (which is what happen in sklearn etc...) would be to have super().__init__(...) BEFORE self.embedding_space = self._create_embedding_space(), but in this situation, we also need to have self.embedding_space to access its shape as default shape BEFORE super().__init__(...). Hence the circularity.

Maybe I am wrong but for me this code will fail:

class LevelSet(Manifold):
    def __init__(self, dim, shape=None, default_coords_type="extrinsic", **kwargs):
        self.embedding_space = self._create_embedding_space()
        if shape is None:
            shape = self.embedding_space.shape
        super().__init__(dim=dim, shape=shape, default_coords_type=default_coords_type, **kwargs)

    @abstractmethod
    def _create_embedding_space(self):
        pass

    @abstractmethod
    def immersion(self, x):
        pass

class Hypersphere(LevelSet):

    def _create_embedding_space(self):
        return Euclidean(self.dim + 1)

    def immersion(self, x):
        return gs.norm(x) - 1

And maybe I did not totally understand your response

ymontmarin · 2022-11-03T11:26:19Z

@LPereira95 Nevermind, I dit not see you were resetting dim as attribute manually (it is set twice but I don't think it is a problem). So I think it is ok then !

ymontmarin · 2022-11-03T11:29:04Z

class LevelSet(Manifold):
    def __init__(self, dim, shape=None, default_coords_type="extrinsic", **kwargs):
        self.embedding_space = self._create_embedding_space()
        if shape is None:
            shape = self.embedding_space.shape
        super().__init__(dim=dim, shape=shape, default_coords_type=default_coords_type, **kwargs)

    @abstractmethod
    def _create_embedding_space(self):
        pass

    @abstractmethod
    def immersion(self, x):
        pass

class Hypersphere(LevelSet):
    def __init__(self, dim):
        self.dim = dim
        super().__init__(dim)

    def _create_embedding_space(self):
        return Euclidean(self.dim + 1)

    def immersion(self, x):
        return gs.norm(x) - 1

will indeed work and looks pretty clean to me :) Sorry for the misunderstanding !

luisfpereira · 2022-11-16T15:40:52Z

@ninamiolane I think I've addressed everything here. I'll just ensure tests are passing and then I'll merge.

ymontmarin · 2022-11-16T16:12:53Z

Hello @LPereira95

What about the shapes computation ?

luisfpereira · 2022-11-16T16:53:15Z

Hi @ymontmarin!

It looks like that comments are pending or I really missed them out...

ymontmarin · 2022-11-16T16:58:08Z

@LPereira95 What does the label "pending" mean ? I am not so used to review on github

luisfpereira · 2022-11-16T16:59:14Z

Your review has not been submitted @ymontmarin. Only you have access to it. There should be a button on top with "Submit review".

ymontmarin

First time using the review tool !

ymontmarin · 2022-11-02T21:07:11Z

geomstats/geometry/base.py

+            tuple(-n + 1 + i for i in range(n - 1))
+            if gs.ndim(point) > len(self.shape)
+            else tuple(-n + i for i in range(n))
+        )


Isn't axis = tuple(range(n - len(self.shape), n)) working ? I think you want the all to apply to the axis involved in the shape of point which are the latest. It may be shorter and even work with general batching

Here @LPereira95

Thanks @ymontmarin!

This solution fails for:

space = Hyperboloid(3) space.belongs(space.random_point())

I think this may work in all cases:

axis = tuple(range(n - len(self.shape), n)) if (n - len(self.shape)) >= 0 else ()

What do you think?

Actually I'm not so sure this version increases readability (it may even make it harder to understand). Besides, we can easily achieve general batching by replacing 1 by gs.ndim(point) - len(self.shape) in the first range...

a simplified and more readable version may be:

n_batch = gs.ndim(point) - len(self.shape) axis = tuple(range(-n + n_batch, 0))

After thinking about it, none of the two solutions are very universal.

point is of shape [*batch_shape, *self.shape]
immersed_point is of shape [*batch_shape, *sub_shape]

If the subermesion only give a real sub_value_shape = () otherwise we want the all on the sub_shape axis.

So maybe a sorter, more readable and universal way that work is:

belongs = self.embedding_space.belongs(point, atol) if not gs.any(belongs): return belongs submersed_point = self.submersion(point) constraint = gs.isclose(submersed_point, 0.0, atol=atol) sub_ndim = gs.ndim(submersed_point) - gs.ndim(point) + len(self.shape) if sub_ndim: constraint = gs.all(constraint, axis=tuple(range(-sub_ndim, 0))) return gs.logical_and(belongs, constraint)

@LPereira95 My message was written before i saw your two messages ! I think it is almost the same :) (apart from the tuple construction here being inside the if)

I'm glad we came up with similar solutions independently. These ones I think are pretty universal. I'll merge now.

ymontmarin · 2022-11-02T21:43:23Z

geomstats/geometry/base.py

+            if gs.ndim(base_point) > len(self.shape)
+            or gs.ndim(vector) > len(self.shape)
+            else tuple(-n + i for i in range(n))
+        )


Same as above

Here @LPereira95

geomstats/geometry/special_euclidean.py

ymontmarin · 2022-11-16T17:07:13Z

@LPereira95 Ok !!! Thank you :) sorry I miss that, I left out the only two points (qhape stuff) that was not adress with the discussion and so on

luisfpereira · 2022-11-17T09:04:31Z

Test failures are unrelated... merging.

Make submersion and tangent_submersion abstract in LevelSet and updat…

bcbe856

…e children accordingly

ninamiolane approved these changes Oct 18, 2022

View reviewed changes

geomstats/geometry/base.py Outdated Show resolved Hide resolved

luisfpereira added 5 commits October 27, 2022 20:10

Remove value from submersion and add _define_embedding_space to LevelSet

660b291

Update information geometry LevelSet children

f3e61ad

Merge branch 'master' into level_set

0ddf218

Improve LevelSet children docstrings regarding submersion and tangent…

37a67c2

… submersion

Fix notebooks

1732b3d

Add missing docstrings

06506c9

ninamiolane approved these changes Nov 2, 2022

View reviewed changes

luisfpereira added 2 commits November 3, 2022 09:34

Undo embedding space as property in LevelSet

6e34560

Change default_coords_type to extrinsic in LevelSet

08541c2

Make definition of shape more explicit in LevelSet

b655ad0

luisfpereira added 2 commits November 16, 2022 16:38

Update LevelSet docstrings

557f9af

Merge branch 'master' into level_set

bd365af

ymontmarin reviewed Nov 16, 2022

View reviewed changes

Improve LevelSet belongs and is_tangent based on ymontmarin comments

035d053

luisfpereira merged commit 603c228 into geomstats:master Nov 17, 2022

luisfpereira deleted the level_set branch November 22, 2022 14:56

Make submersion and tangent_submersion abstract in LevelSet #1681

Make submersion and tangent_submersion abstract in LevelSet #1681

Conversation

luisfpereira commented Oct 17, 2022

ymontmarin commented Oct 18, 2022 • edited

ninamiolane left a comment

Choose a reason for hiding this comment

ninamiolane commented Oct 18, 2022 • edited

ninamiolane commented Oct 18, 2022 • edited

ninamiolane commented Oct 21, 2022

ninamiolane commented Oct 21, 2022

ymontmarin commented Oct 23, 2022

ymontmarin commented Oct 23, 2022

ymontmarin commented Oct 23, 2022

review-notebook-app bot commented Oct 27, 2022

luisfpereira commented Oct 27, 2022

codecov bot commented Oct 27, 2022 • edited

Codecov Report

ninamiolane left a comment

Choose a reason for hiding this comment

luisfpereira commented Nov 2, 2022 • edited

ninamiolane commented Nov 2, 2022

ymontmarin commented Nov 3, 2022 • edited

luisfpereira commented Nov 3, 2022

ymontmarin commented Nov 3, 2022 • edited

luisfpereira commented Nov 3, 2022

ymontmarin commented Nov 3, 2022 • edited

ymontmarin commented Nov 3, 2022

ymontmarin commented Nov 3, 2022

luisfpereira commented Nov 16, 2022

ymontmarin commented Nov 16, 2022 • edited

luisfpereira commented Nov 16, 2022

ymontmarin commented Nov 16, 2022

luisfpereira commented Nov 16, 2022

ymontmarin left a comment

Choose a reason for hiding this comment

ymontmarin Nov 2, 2022

Choose a reason for hiding this comment

ymontmarin Nov 16, 2022

Choose a reason for hiding this comment

luisfpereira Nov 16, 2022

Choose a reason for hiding this comment

luisfpereira Nov 16, 2022

Choose a reason for hiding this comment

luisfpereira Nov 16, 2022 • edited

Choose a reason for hiding this comment

luisfpereira Nov 16, 2022 • edited

Choose a reason for hiding this comment

ymontmarin Nov 16, 2022 • edited

Choose a reason for hiding this comment

ymontmarin Nov 16, 2022 • edited

Choose a reason for hiding this comment

luisfpereira Nov 17, 2022

Choose a reason for hiding this comment

ymontmarin Nov 2, 2022

Choose a reason for hiding this comment

ymontmarin Nov 16, 2022

Choose a reason for hiding this comment

ymontmarin commented Nov 16, 2022

luisfpereira commented Nov 17, 2022

ymontmarin commented Oct 18, 2022 •

edited

ninamiolane commented Oct 18, 2022 •

edited

ninamiolane commented Oct 18, 2022 •

edited

codecov bot commented Oct 27, 2022 •

edited

luisfpereira commented Nov 2, 2022 •

edited

ymontmarin commented Nov 3, 2022 •

edited

ymontmarin commented Nov 3, 2022 •

edited

ymontmarin commented Nov 3, 2022 •

edited

ymontmarin commented Nov 16, 2022 •

edited

luisfpereira Nov 16, 2022 •

edited

luisfpereira Nov 16, 2022 •

edited

ymontmarin Nov 16, 2022 •

edited

ymontmarin Nov 16, 2022 •

edited