Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

InheritComparisonUniqueRepresentation #25388

Open
embray opened this issue May 17, 2018 · 29 comments
Open

InheritComparisonUniqueRepresentation #25388

embray opened this issue May 17, 2018 · 29 comments

Comments

@embray
Copy link
Contributor

embray commented May 17, 2018

This is a bit ugly, but it seems to be necessary if you want to use UniqueRepresentation with subclasses of Element (among others) which has the InheritComparison metaclass. This otherwise creates a conflict with ClasscallMetaclass.

This class resolves that by making a UniqueRepresentation with the InheritComparisonClasscallMetaclass. This can then be used as a mix-in class with Element subclasses.

There's at least some prior precedence for this, and I've updated those examples to use InheritComparisonUniqueRepresentation to demonstrate that it works. I'm open to better ideas.

CC: @jdemeyer

Component: refactoring

Author: Erik Bray

Branch/Commit: u/embray/misc/inherit-comp-unique-repr @ 8d665aa

Issue created by migration from https://trac.sagemath.org/ticket/25388

@embray embray added this to the sage-8.3 milestone May 17, 2018
@jdemeyer
Copy link

comment:2

My first reaction is that it doesn't really make sense to use UniqueRepresentation for elements. I wonder what the reason for that is.

@embray
Copy link
Contributor Author

embray commented May 17, 2018

comment:3

Perhaps the entire concept is wrong somehow? That said as you can see there are cases of this (albeit with classes that only inherit Element indirectly).

Why doesn't it make sense? Just as 1 is 1 it does make sense to have this for other objects. Granted 1 is 1 is a consequence of caching small integers, but I don't see the harm of having it for other elements.

Alternatively, it would be nice to have something that behaves somewhat like UniqueRepresentation just for __eq__. That is, something like CachedRepresentation that compares instances according to their cache keys. The idea is to avoid implementing a bunch of __eq__ that compare two objects according to some set of attributes that are the same attributes used to initialize a "unique" instance of that class. Does that make sense?

@embray
Copy link
Contributor Author

embray commented May 17, 2018

comment:4

In any case, it does make sense, as long as we have a complicated metaclass hierarchy, to have a metaclass that combines InheritComparison with UniqueRepresentation.

@tscrim
Copy link
Collaborator

tscrim commented May 18, 2018

comment:5

I can think of certain morphisms that I could see benefiting from being UniqueRepresentation, but I wouldn't do non-equality comparisons on them. The other would be for small sets of key algebraic objects. We do this in combinat/crystals/letters.pyx, but we do it by @cached_method on _element_constructor_ to help improve the construction time and to tie it directly with the parent.

@nbruin
Copy link
Contributor

nbruin commented May 18, 2018

comment:6

There are a couple of reasons why UniqueRepresentation should be used sparingly:

  • It makes instantiation of an object expensive: The construction parameters need to be processed and made into a dictionary key for lookup
  • It is very prone to creating memory leaks if the construction parameters consist of anything more complicated than integers and strings. That's because the construction parameters, being put in the key of a global WeakValueDict, have their life lower bounded by the object. So if those parameters end up carrying (indirectly) a reference to the object, it's now immortal.
  • it means that one has to be very careful caching UniqueRepresentation objects: they're already being globally cached (under expensive keys) and the places where one would like to cache them are generally places where that would create the kind of reference loops that cause memory leaks.

The reason why we have UniqueRepresentation is because it allows the coercion framework to do lightning-fast comparisons (because we can make do with an "is" test rather than a "=="). It's a compromise that makes memory management with parents much more complicated, for the benefit of making some operations with elements much faster.

That's why people think it's strange to make elements UniqueRepresentation.

@embray
Copy link
Contributor Author

embray commented May 18, 2018

comment:7

That's weird about the memory leaks since I thought the point of using weak reference caching was that you could construct an object once--it would be cached--but then if it isn't used again the cached keys would (potentially) go away.

As for this specific ticket, perhaps the use case of using UniqueRepresentation with Elements is not good, though as the diff shows there were still at least a few places where it was useful to define this (unfortunately verbose) InheritComparisonUniqueRepresentation class for reuse.

However, my primary motivation to using more UniqueRepresentation was actually as a possible solution, in many cases, to #24551. The reason it appealed to me is that it gives free hashing and equality comparison implementation for classes that have a property that identity of instances of that class is uniquely determined by some attributes of that instance that are provided at its time of construction (particularly via __init__).

If UniqueRepresentation is not right for this in all cases (something I can believe) then perhaps it would be good to add a new base class that has a similar property, but without caching (and thus without the guaranteed equality-by-id property, which for my use case is not actually so important). In other words, it would basically generate the same cache key that would be generated by CachedRepresentation, but only use that to implement __eq__ and __hash__, but not do any caching.

@embray
Copy link
Contributor Author

embray commented May 18, 2018

comment:8

Replying to @tscrim:

I can think of certain morphisms that I could see benefiting from being UniqueRepresentation, but I wouldn't do non-equality comparisons on them. The other would be for small sets of key algebraic objects. We do this in combinat/crystals/letters.pyx, but we do it by @cached_method on _element_constructor_ to help improve the construction time and to tie it directly with the parent.

On that note, could you have a look at #25389 and tell me if the use of UniqueRepresentation doesn't make sense in any of those cases. I felt that it did make sense but clearly there's a lot of hard-earned experience I don't have as to when it does or doesn't make sense to use. (Note: The branch in that ticket has this one as a dependency, so it's really only interesting to look at the changes in quaternion algebras: https://github.com/sagemath/sagetrac-mirror/blob/9ea55d9933a3cd2bedd391068276f804ff868e1f/src/sage/algebras/quatalg/quaternion_algebra.py&id2=6fc1e20c666283a301b4ff3f855013de8d206b35)

@nbruin
Copy link
Contributor

nbruin commented May 18, 2018

comment:9

Replying to @embray:

That's weird about the memory leaks since I thought the point of using weak reference caching was that you could construct an object once--it would be cached--but then if it isn't used again the cached keys would (potentially) go away.

That's true, but the caching is not the primary design feature. It's a cost more than a benefit (but necessary for what it does). The global cache brings you very close to creating a global anchor for reference cycles: The global weakvaluedict that does the caching has strong references to the keys used (the construction parameters). As has been shown time and again, these keys have a tendency of carrying references to the cached object. Now your memory cycle is globally anchored and will be immune to cyclic garbage collection: a memory leak.

This kind of leak keeps occurring. See #23807 comment:14 for a recent example.

The EqualityAndHashingBasedOnConstructionParameters that you suggest makes a lot more sense: here you get to store the key on the object itself, so memory leaks should be much less of a problem. I suspect it will not be a very memory-efficient class, because the key objects you're storing are likely preserved elsewhere on the object as well.

@nbruin
Copy link
Contributor

nbruin commented May 18, 2018

comment:10

Replying to @embray:

On that note, could you have a look at #25389 and tell me if the use of UniqueRepresentation doesn't make sense in any of those cases.

The construction parameters of quaternionalgebras are sufficiently light that I would not expect memory leaks (as long as people don't take to caching quaternion algebras on the base ring!, which now that I thought of it, people will undoubtedly start doing).

In principle, QuaternionAlgebra is the kind of parent that could participate in coercion discovery, where UniqueRepresentation might be required (in particular, if people consider polynomial ring over a ring and expect coercion discovery, and in particular common covering parent construction, to work with it, then UniqueRepresentation might be required.

I would find it surprising if people are actually doing that with quaternion algebras, so I don't think we're getting the designed benefit from quaternion algebras.

By turning quaternion algebras into globally unique objects, you are requiring the algebras to be immutable in a very strong sense: different parts could end up having references to the same algebra, so ANY change in state (even little representation/caching details) must be assumed to affect completely unrelated code. It's not possible to make your own little local algebra and scribble in it anymore.

More precisely. The kind of scenario in the coercion framework where I know that UniqueRepresentation is really required is to make this assertion hold:

sage: Zxy.<x,y>=ZZ[]
sage: Qx.<x>=QQ[]
sage: Qy.<y>=QQ[]
sage: Qxy=QQ['x','y']
sage: x=Qx.0
sage: y=Zxy.1
sage: (x+y).parent() is Qxy
True
sage: Qy.<y>=QQ[]
sage: (Qy.0+Zxy.0).parent() is Qxy
True

Note that the coercion framework needs to construct a common covering parent for Zxy and Qx and one for Zxy and Qy itself. It would be very surprising if the results were not identical.
(the fact that the resulting parent is equal to Qxy constructed by the user is convenient but less important). I don't think people would be doing that with QuaternionAlgebra so I think you could safely leave off the UniqueRep for now.

@embray
Copy link
Contributor Author

embray commented May 21, 2018

comment:11

Replying to @nbruin:

The EqualityAndHashingBasedOnConstructionParameters that you suggest makes a lot more sense: here you get to store the key on the object itself, so memory leaks should be much less of a problem. I suspect it will not be a very memory-efficient class, because the key objects you're storing are likely preserved elsewhere on the object as well.

While I like the feature of CachedRepresentation that bases the cache key on the construction parameters, I don't consider it a critical feature either. This could just as easily (more easily in fact) be implemented simply by defining a tuple of attribute names (or possibly methods that should be called) that should be used for testing equality and hashing. It's just such a common pattern it really could save a lot of code and effort if it could be coded once for a lot of classes.

Having an additional, usually small, tuple of references is not generally a major memory concern but of course as with anything it depends on how many of these objects you expect to construct.

@embray
Copy link
Contributor Author

embray commented May 21, 2018

comment:12

Replying to @nbruin:

Note that the coercion framework needs to construct a common covering parent for Zxy and Qx and one for Zxy and Qy itself. It would be very surprising if the results were not identical.
(the fact that the resulting parent is equal to Qxy constructed by the user is convenient but less important). I don't think people would be doing that with QuaternionAlgebra so I think you could safely leave off the UniqueRep for now.

Thanks for looking at it and for the explanation. I agree. I guess what I really just want there is the aforementioned EqualityAndHashingBasedOnConstructionParameters, and I was using UniqueRepresentation more for that feature than anything (although it does seem useful to me for preventing construction of multiple copies of otherwise identical objects, but unless that's actually a problem for, say, QuaternionOrders then it's a premature optimization).

@embray
Copy link
Contributor Author

embray commented Jul 18, 2018

comment:13

I believe this issue can reasonably be addressed for Sage 8.4.

@embray embray modified the milestones: sage-8.3, sage-8.4 Jul 18, 2018
@embray
Copy link
Contributor Author

embray commented Sep 7, 2018

comment:14

I think this utility class is still useful as it clearly makes a few minor simplifications.

@jdemeyer
Copy link

jdemeyer commented Sep 8, 2018

comment:15

The problem is that I'm still not convinced that combining UniqueRepresentation and InheritComparisonMetaclass makes sense at all.

@embray
Copy link
Contributor Author

embray commented Sep 10, 2018

comment:16

Oh what basis? On its face there's no reason not to and there are clearly classes that do use that combination. Now, if you disagreed that those classes should be using that combination in the first place that's another story: If those were fixed then there would be no use for adding this class. But first you'd have to justify why those classes shouldn't use UniqueRepresentation in particular and then fix that.

@nbruin
Copy link
Contributor

nbruin commented Sep 10, 2018

comment:17

Looking at the code referenced in your commits: Indeed, it looks to me these objects have no business being UniqueRepresentation. They are differentials: elements; morphisms. There's no need for those to have their equality determined by their ID, which is what UniqueRepresentation is for. Indeed, needing comparison inheritance shows that they are not UniqueRepresentation. The right way to "fix" this (if any fix is necessary) is to change the inheritance of these objects, not to introduce an oxymoronic new class.

It could be that those pieces of code are from before UniqueRepresentation and CachedRepresentation were split off, or perhaps they were written by someone who didn't understand the fine difference between the two. In this case, I don't think you would even want to do it as a transition, since you're causing a net increase in code, so it's not making it shorter either.

@tscrim
Copy link
Collaborator

tscrim commented Sep 10, 2018

comment:18

So I can answer for sage.algebras.commutative_dga.Differential why it is useful to have it inherit from UniqueRepresentation: These morphisms are used to define a differential graded commutative algebra (DGCA), so they need to be hashable and easily comparable. They also contain computationally difficult results that we want to cache and not duplicate, so they should be a cached representation. Hence, it makes sense to have them be UniqueRepresentation. From a quick thought, I believe it removes code duplication between the Z and multigraded cases.

This does not quite hold as much water for sage.algebras.clifford_algebra.ExteriorAlgebraDifferential as those are less likely to be compared or used to construct other objects. However, it does make the comparisons easier to work with in case someone does have a use-case for that.

@nbruin
Copy link
Contributor

nbruin commented Sep 11, 2018

comment:19

Replying to @tscrim:

So I can answer for sage.algebras.commutative_dga.Differential why it is useful to have it inherit from UniqueRepresentation

I see you make an argument for why CachedRepresentation is good for them via the cached bit.
I also see you make an argument for UniqueRepresentation because you want them easily hashed and compared. Indeed, UniqueRepresentation makes those operations very fast because it's a composition of CachedRepresentation and EqualityById.

However, it seems that the comparison and hashing you want is NOT the one from UniqueRepresentation, because otherwise you wouldn't need the InheritComparison thing. The whole difference between UniqueRepresentation and CachedRepresentation is the mixing in of EqualityById. If you don't want that, then obviously you need to be inheriting from CachedRepresentation instead. So InheritComparisonCachedRepresentation might make sense. InheritComparisonUniqueRepresention never does.

Be careful with thinking that CachedRepresentation is actually more efficient: Creating objects the first time with them is actually MORE expensive, because the construction parameters need to be processed and hashed. You'll only see a benefit if you're actually recreating the same object from construction parameters multiple times. Often that's a silly thing to do; just keep the object!
The other penalty to pay, that you're now sharing global objects so that you REALLY have to take immutability seriously, has been mentioned many times already.

@embray
Copy link
Contributor Author

embray commented Sep 11, 2018

comment:20

So should sage.algebras.commutative_dga.Differential be using WithEqualityById? Or InheritComparison? Because I agree it doesn't make sense to use both (in which case I see now why InheritComparisonUniqueRepresentation doesn't make sense, but InheritComparisonCachedRepresentation might.

On another note, I'm not even convinced InheritComparison needs to be a metaclass. ISTM it would work fine as a decorator. Maybe the only argument for it being a metaclass is that then subclasses can inherit its behavior, but I don't know if that's really needed or not...

@nbruin
Copy link
Contributor

nbruin commented Sep 11, 2018

comment:21

Replying to @embray:

On another note, I'm not even convinced InheritComparison needs to be a metaclass. ISTM it would work fine as a decorator. Maybe the only argument for it being a metaclass is that then subclasses can inherit its behavior, but I don't know if that's really needed or not...

I think that is because it actually pokes around in the class data structure, to "hard" inherit the slot value for the comparison functions, the way it would happen normally. The difference is that normally, comparison is not inherited if the hash is changed. The InheritComparisonMetaClass makes this happen even if hash is not inherited. I don't see at which point a decorator would have the opportunity to scribble in the relevant slots.

@embray
Copy link
Contributor Author

embray commented Sep 12, 2018

comment:22

Ah, I just tried to implement it as a decorator and the problem is actually that Cython will not allow you to use "arbitrary" class decorators on cdef classes. Which is a shame because I don't see any reason that couldn't work. Yes, it is dangerous, but consenting adults and all...

What's worse is, it still works just fine if you call it not as a decorator, but just at module-level after the class definition (note: I defined a Python function called inherit_comparison that implements the functionality as a class decorator):

sage: cython('''
....: from sage.misc.inherit_comparison import inherit_comparison
....:
....: cdef class Base(object):
....:     def __richcmp__(left, right, int op):
....:         print("Base.__richcmp__")
....:         return left is right
....:
....: cdef class Derived(Base):
....:     def __hash__(self): return 1
....:
....: inherit_comparison(Derived)
....: ''')

So I don't think it needs to be a decorator. At the same time, as I wrote previously, I suppose the advantage of using a metaclass is that the behavior will automatically be inherited by subclasses, whereas with a decorator (whether used as above, or with the actual decorator syntax) one has to remember to use it when subclassing.

So in light of that, having the metaclass makes sense, unfortunately.

One bit of good news is that Python 3 (specifically 3.6+) has added a number of new dunder methods for customizing class creation that ease many of the most common use cases for metaclasses without explicitly using a metaclass. For example, I think init_subclass might be able to serve as a replacement for InheritComparisonMetaclass. I don't know if Cython supports __init_subclass__ yet, but if/when it does we might be able to drop the metaclass even on Python 2.

@embray embray modified the milestones: sage-8.4, sage-8.5 Oct 28, 2018
@embray embray modified the milestones: sage-8.7, sage-8.8 Mar 25, 2019
@embray
Copy link
Contributor Author

embray commented Jul 3, 2019

comment:26

Moving tickets from the Sage 8.8 milestone that have been actively worked on in the last six months to the next release milestone (optimistically).

@embray embray modified the milestones: sage-8.8, sage-8.9 Jul 3, 2019
@embray
Copy link
Contributor Author

embray commented Dec 30, 2019

comment:27

Ticket retargeted after milestone closed

@embray embray modified the milestones: sage-8.9, sage-9.1 Dec 30, 2019
@mkoeppe
Copy link
Member

mkoeppe commented Apr 14, 2020

comment:28

Batch modifying tickets that will likely not be ready for 9.1, based on a review of the ticket title, branch/review status, and last modification date.

@mkoeppe mkoeppe modified the milestones: sage-9.1, sage-9.2 Apr 14, 2020
@mkoeppe mkoeppe modified the milestones: sage-9.2, sage-9.3 Sep 5, 2020
@fchapoton
Copy link
Contributor

comment:30

red branch => needs work

@mkoeppe
Copy link
Member

mkoeppe commented Mar 15, 2021

comment:31

Setting new milestone based on a cursory review of ticket status, priority, and last modification date.

@mkoeppe mkoeppe modified the milestones: sage-9.3, sage-9.4 Mar 15, 2021
@mkoeppe
Copy link
Member

mkoeppe commented Jul 19, 2021

comment:32

Setting a new milestone for this ticket based on a cursory review.

@mkoeppe mkoeppe modified the milestones: sage-9.4, sage-9.5 Jul 19, 2021
@mkoeppe mkoeppe modified the milestones: sage-9.5, sage-9.6 Dec 18, 2021
@mkoeppe mkoeppe modified the milestones: sage-9.6, sage-9.7 Apr 2, 2022
@mkoeppe mkoeppe modified the milestones: sage-9.7, sage-9.8 Aug 31, 2022
@mkoeppe mkoeppe removed this from the sage-9.8 milestone Jan 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants