-
-
Notifications
You must be signed in to change notification settings - Fork 481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cythoned homsets #14214
Comments
comment:1
Ideally, sage.categories.homset.Homset should be a cdef class, with cdef attributes _domain, _codomain, Problem and potential long term solutions
Short term improvements suggested in my patch
Making Homset cdef requires to put Timings will be in the next post. |
comment:2
Timings with the patch:
Timings without the patch:
Conclusion
Some details I am not yet totally happy with (todo on a different ticket?):
Doctests pass for me. Needs review! |
comment:3
Hooray, the startup_time plugin reports:
I think it would only be significant if it was 95% confidence. Nevertheless, it is a decrease of the startup time! |
comment:4
Replying to @simon-king-jena:
Yes, I don't why we should not cache Hom(X,Y). By the way, is there a Thanks for your work on this! Get in touch with me in case no one Cheers, |
comment:5
Replying to @nthiery:
OK, doing it soon. In fact, what I do now: Try to implement the new coercion framework for all homsets (hence: Remove
Yes, actually three reasons:
|
comment:6
The second patch is not tested yet. But it yields:
Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch |
comment:7
The patchbot finds one failing test, and this is actually a test that relies on non-unique parent behaviour:
But in this example, at least with my patch, we have |
comment:8
Done! Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch |
comment:9
Replying to @simon-king-jena:
What follows is a bit of a rant that probably doesn't immediately affect your patch here. However, it's something I've seen crop up around the coercion/category framework in the last couple of weeks a little and I think has some relevance for work in this direction in general: One design consideration I haven't seen mentioned much is that some parent constructors have very good reasons to return a fresh copy every time one is requested (i.e., don't do the One could design these things differently, e.g., don't make the Mordell-Weil basis part of the elliptic curve itself, but the reality is that design choice hasn't been made in sage and changing that design now will be very difficult. The problem arises elsewhere too: Sometimes you DO want multiple isomorphic-but-not-identical copies of objects. It's convenient to have multiple bivariate polynomial rings over QQ, for instance. We have that in sage by making generator names part of the definition of polynomial rings. For other objects, for instance elliptic curves, there are no convenient generator names to distinguish multiple copies (the fundamental constructor being It terms of the coercion framework I have hopes these are not too much of an issue. I think elliptic curves are sufficiently high up the chain of complexity that people are happy to apply explicit homomorphisms to move things around, and hopefully this is true for most objects (the elephants in the room here are number fields: they are expected to behave nicely wrt coercion and yet are sufficiently complex that they have data hanging off them that makes it hard to consider them immutable. A lot of the problems may be hidden because it's relatively hard to run into exactly the same representation of the field twice) The consistent thing for such non-unique-by-necessity parents is to make "==" indeed just "is", so that uniqueness is simply forced. This means
which is fine with me, since they are isomorphic but not uniquely so. However, I'm not sure how easy it will be to convince other people of such a strict interpretation of "==". Of course such Anyway, please keep in mind: It's very unlikely (and probably even undesirable) that all parents will ever be essentially |
comment:10
Replying to @nbruin:
Since the patch doesn't touch the way how things are cached, it doesn't affect the patch. But it does affect what I try to do next, so, let's discuss it.
They are an issue, if the equality-by-id paradigma is used in some parts and isn't used in other parts of the coercion framework. They are not an issue, if the coercion framework consistently uses equality-by-id even on non-unique parents, as I am now explaining. I am fine with parents P1, P2 that evaluate equal but are non-identical. But then, I think it makes sense that Hom(P1,X) and Hom(P2,X) are non-identical, too. The question is whether they should evaluate equal, and here I actually have no very strong opinion, except that I like fast hash and fast equality test... Let f be a coercion map from P1 to X. Now, we consider an element p of P2. In elder versions of Sage, it used to be the case that f was stored in a usual dictionary as an attribute of X. "Usual dictionary" means: If you have P2 and inspect the cache for a coercion to X, you'll find f. But f.domain() is not the same as P2. Being "not the same as" P2 means: We can't simply coerce p via f---we must first coerce p into P1 and then apply f. Namely, if P1==P2, we certainly expect that there is a coercion from P2 to P1. But it could be that, for example, a non-trivial base change is involved in this coercion. This is why now the coercion maps are stored in a I think this is a sane approach and it works fine with non-unique parents. This is why I think that non-unique parents are not an issue in the coercion framework---but it is essential to be consistent, i.e., inside of the coercion framework one should consistently use comparison by identity even for non-unique parents (as in the P1-versus-P2 example above).
I think it is enough to convince the coercion model...
In fact, this would be a problem. We can not simply use But couldn't we make Homset have the
Inside of the coercion framework, I think nothing prevents us from comparing non-unique parents by identity---except old code. In that way, we have freedom to choose non-trivial coercions between non-identical copies of a parent. |
comment:11
PS: What I currently try on top of this patch is merely to implement the new coercion framework (element_constructor and friends) for Homsets. Only if this is achieved, I would think of changing the cache logic. |
comment:12
One remark on the reliability of the startup_time plugin: With the previous version of the patches, we had:
Hence, we had a significant speed-up of 0.25% of startup time. With the current version of the patches (which only differ in the removal of one doc test, but there is no change in the code!!), we have
Hence, we have a significant slow-down of 0.25% of startup time. But since the code didn't change, perhaps it isn't significant, after all? |
comment:13
Replying to @simon-king-jena:
Yes, I agree that the coercion model can work with "is" even if "==" doesn't. It may surprise people every now and again, but the alternative is almost certainly worse (besides being less efficient).
Good catch. However, the comparison-by-== happens on construction parameters. Hence, if someone implements some equal-but-not-identical rings
That's a serious problem. If you're implementing something that can be used in further (functorial) (cached) parent constructions that are expected to have coercions they have to be I doubt things like elliptic curves disappear into such cached parent constructions, which was why I thought the problem might not be that bad.
Homsets are important in coercion and have proper mathematical meaning. If you make them consider their arguments by identity, you're basically telling people "you can use
I think you're right on that in principle. There is a bit of an issue that "coercions between equal-but-not identical" parents are fundamentally between ""equivalent" objects, whereas the coercion framework seems to work best when there's a hierarchy. It's an invitation to loops in the coercion graph, inviting non-commuting paths. However, I think that's more an issue of the mathematical problem that of how we're modelling it. |
comment:14
Needed to update both patches, because a patch in #14159 has changed. Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch |
comment:15
I made some changes to the Hence, cmp should be changed, and this ticket needs work! |
Work Issues: Do not change cmp too much |
comment:16
Updated! Apply trac_14214-cython_homset.patch trac_14214-backup_cache.patch |
This comment has been minimized.
This comment has been minimized.
comment:17
Do you have a reason to make
|
comment:18
Replying to @nbruin:
Yes. See comment:2. A cached method is much much faster than a Python method that does nothing but return an attribute, and it is a little faster than directly accessing the attribute in a "timeit". Things are slightly different if you access an attribute inside of a method. At some point, I had added a test method that accesses self._domain a million times or calls self.domain() a million times. In the case of HF in comment:2, the latter was faster, but in other cases (it seems: In cases with a short mro), direct attribute access was faster. Anyway, that's why I like to keep the cached method and self._domain and self._codomain around. Note, however, that one could get rid of self._domain and self._codomain. Namely, instead of
one could do
in
Note that keeping the _domain and _codomain attributes is actually a cheat:
Do you see a way to solve this problem in a clean way (i.e., with cdef attributes)? I'd appreciate to know! |
comment:54
Ping!. The present patches here pass doctests on the bot! I think conceptually what you want in this case is really just a nondata descriptor that acts as a backstopper to provide (essentially) a read-only attribute stored in
Since the
is going to give the same performance. You may be able to shave off a few nanoseconds by replacing some of the auto-generated cython code with explicit Python C-API calls. I suspect we'll have to live with many parents not having a (note that any slowdowns due to deeper MRO will go away once the cython upgrade is in). |
comment:55
Replying to @nbruin:
Agreed. The first question is: Where shall we put the code?
I guess you either mean And I guess you could get it still a bit faster, if you do
And that's the second question: Are you really sure that lazy_attribute will be as fast as Note that I tested to cythonize lazy_attribute and to cimport Parent. It failed (circular import or so). So, we come back to the first question: Where shall we put the code? I think it makes sense to cdefine And the third question is: Shall we really make this ticket depend on a proper cythonized version of lazy_attribute? Note that currently the lazy attribute or cached attribute would never come into play, because all our homsets are Python subclasses of the (now cythonized) Homset. |
comment:56
Replying to @simon-king-jena:
Excellent idea! if
My timings suggested this. However, I suspect that providing cdef access to
That can't be So, we come back to the first question: Where shall we put the code? I think it makes sense to cdefine If it's a parent-specific support class it can just go with the parent source, right?
Not if you go the Parent-specific
Talk about academic code then ... I'm sure we'll get cythonized homset instances at some point. |
comment:57
Replying to @nbruin:
sage.structure.parent imports lazy_attribute, and (to my surprise, I must say) cimporting Parent in sage.misc.lazy_attribute did crash.
Hey, that's a good idea!
|
comment:58
I think I will be able to provide a third patch with the non data descriptor tonight. |
comment:59
Replying to @simon-king-jena:
That said: Could we perhaps make this depend on #11935? |
comment:60
Replying to @simon-king-jena:
You're doing the work, so you decide. Looking at the tickets, it seems to me this one is less invasive and hence easier to get in: It's already passing doctests; the only thing left is some backstop code that currently won't be exercised anyway. |
comment:61
Replying to @nbruin:
Yes. And in addition, when I tried to apply a dependency of #11935, I got several errors. So, let's first fix this, and then care about #12876 and finally #11935. |
comment:62
For the record: #12876 and #11935 are applying smoothly on 5.10 beta4 and are passing all tests. So I don't really care about the order, but it would be nice to have them in shortly. And I am bit lazy to rebase them once again. |
comment:63
Replying to @nthiery:
OK, if that's the case then these go in first, because
|
comment:64
Replying to @simon-king-jena:
Thanks Simon, I appreciate that; I know the work it costs you! |
comment:65
Replying to @nbruin:
Note that |
comment:70
TODO: Provide a branch... |
comment:71
Replying to @simon-king-jena:
Note that this should now be a lot easier, by #18756.
See #14279, but both this ticket and #14279 need work.
If I recall correctly, that's not the case any longer. However, if we are able to avoid double inheritance from |
Changed work issues from How to store domain and codomain? to none |
comment:73
Since the patches have bit-rotted for so long, I guess it makes sense to start from scratch. Most part of the discussion here was about different ways to store the _domain and _codomain. Since it should now be possible to have fast ring-behaviour defined via category framework, I suppose one can simply make them cdef attributes, which should be the fastest solution. |
comment:75
I confused two tickets. In fact, it is #18758 which makes it possible to define a ring structure via category framework without too much slow-down. |
Homsets arise whenever a coercion map or a conversion map is involved. Hence, homsets are ubiquitous in Sage and should thus be as a fast as possible.
Therefore I suggest change sage.categories.homset into a cython module. A clean solution seems out of reach at the moment (see comments), and so I just provide a couple of speed-ups here.
Apply
Depends on #14279
Depends on #18758
CC: @nthiery
Component: performance
Keywords: Hom, cached method
Author: Simon King
Issue created by migration from https://trac.sagemath.org/ticket/14214
The text was updated successfully, but these errors were encountered: