Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make calling a cached method independent of source code inspection #17814

Closed
simon-king-jena opened this issue Feb 20, 2015 · 79 comments
Closed

Comments

@simon-king-jena
Copy link
Member

If a pyx file in the sage library uses cached methods and then the source file is removed after compiling Sage, then accessing the cached methods in this file becomes impossible. Example: Move the file src/sage/rings/finite_rings/finite_field_base.pyx away. Then:

sage: K=GF(5)
sage: K.factored_order
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-488672e9c19b> in <module>()
----> 1 K.factored_order()

/home/king/Sage/git/sage/src/sage/structure/parent.pyx in sage.structure.parent.Parent.__getattr__ (build/cythonized/sage/structure/parent.c:7863)()
    838             return self.__cached_methods[name]
    839         except KeyError:
--> 840             attr = getattr_from_other_class(self, self._category.parent_class, name)
    841             self.__cached_methods[name] = attr
    842             return attr

/home/king/Sage/git/sage/src/sage/structure/misc.pyx in sage.structure.misc.getattr_from_other_class (build/cythonized/sage/structure/misc.c:1582)()
    249         dummy_error_message.cls = type(self)
    250         dummy_error_message.name = name
--> 251         raise dummy_attribute_error
    252     try:
    253         attribute = getattr(cls, name)

AttributeError: 'FiniteField_prime_modn_with_category' object has no attribute 'factored_order'

The same does work in attached pyx files, even after removing the temporary copy of the pyx file. Conjecture: The cached method's __get__ method relies on source code inspection. That should of course not be the case. Source code inspection should only come in play if the user wants to see the source code or at least the doc string.

I chose the component "distribution", since it was reported on sage-devel as something that happened on certain sage distributions that do not provide sage sources.

Component: distribution

Author: Simon King

Branch/Commit: 0e8ef00

Reviewer: Volker Braun

Issue created by migration from https://trac.sagemath.org/ticket/17814

@simon-king-jena

This comment has been minimized.

@simon-king-jena
Copy link
Member Author

comment:3

Here one can see how introspection enters:

sage: K=GF(5)
sage: C=type(K)
sage: C.factored_order
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-d955bd892332> in <module>()
----> 1 C.factored_order

/home/king/Sage/git/sage/src/sage/misc/cachefunc.pyx in sage.misc.cachefunc.CachedMethod.__get__ (build/cythonized/sage/misc/cachefunc.c:15452)()
   2673         f = (<CachedFunction>self._cachedfunc).f
   2674         if self.nargs==0:
-> 2675             args, varargs, keywords, defaults = sage_getargspec(f)                                                                                                        
   2676             if varargs is None and keywords is None and len(args)<=1:                                                                                                     
   2677                 self.nargs = 1

/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/misc/sageinspect.pyc in sage_getargspec(obj)
   1394         except TypeError: # arg is not a code object
   1395         # The above "hopefully" was wishful thinking:
-> 1396             return inspect.ArgSpec(*_sage_getargspec_cython(sage_getsource(obj)))
   1397             #return _sage_getargspec_from_ast(sage_getsource(obj))
   1398     try:

/home/king/Sage/git/sage/local/lib/python2.7/site-packages/sage/misc/sageinspect.pyc in _sage_getargspec_cython(source)
   1006 
   1007     """
-> 1008     defpos = source.find('def ')
   1009     assert defpos > -1, "The given source does not contain 'def'"
   1010     s = source[defpos:].strip()

AttributeError: 'NoneType' object has no attribute 'find'

So: cached_method tries to determine the number of arguments. This is because there is a special implementation for functions with no argument (except self).

In other words, the wrong behaviour could be fixed by finding a way to get the number of arguments (or better the argspec) of a cython function without looking at its source code. I'll see if that's possible. So, the fix will likely be in sage.misc.sageinspect, not in sage.misc.cachefunc.

@simon-king-jena
Copy link
Member Author

comment:4

Fixing sageinspect won't be easy. I currently have absolutely no clue how to read off the number of arguments (or even the names of the arguments) of a function that is defined in a Cython file. Those functions don't seem to have any attributes holding useful information.

Has someone else an idea? If not, then I don't see what we can do. We do want a special cached method implementation for methods without arguments, and thus we need to determine the number of arguments of the to-be-wrapped method. If that doesn't work without reading the sources, what could we possibly do?

@jdemeyer
Copy link

comment:5

Replying to @simon-king-jena:

Fixing sageinspect won't be easy. I currently have absolutely no clue how to read off the number of arguments (or even the names of the arguments)

Do you really require the number and names of arguments or do you only want to know whether a function takes any *args and/or **kwds at all (apart from the self argument)?

Technically, Python always uses *args and **kwds to pass arguments (except for some special methods like __new__). It's the Cython method implementation which interprets these *args and **kwds.

@jdemeyer
Copy link

comment:6

The question is also: if there is a method

def f(self, arg=1)

do we require that obj.f() and obj.f(1) and obj.f(arg=1) all use the same cache entry? If you don't require that, there might be a solution by working on the level of *args and **kwds.

@simon-king-jena
Copy link
Member Author

comment:7

Replying to @jdemeyer:

The question is also: if there is a method

def f(self, arg=1)

do we require that obj.f() and obj.f(1) and obj.f(arg=1) all use the same cache entry? If you don't require that, there might be a solution by working on the level of *args and **kwds.

Yes, we do require that. Equivalent inputs should result in identical output.

The problem occurs when the cached method is bound to an instance. At that point, we want to know if the method accepts any argument in addition to self. If it does, then a CachedMethodCaller is set as an attribute of the instance. If it does not, then a CachedMethodCallerNoArgs is bound to the method. If sage can't answer the question, then an AttributeError is raised. We don't need to know all argument names and all defaults, but we do need to know their number.

@jdemeyer
Copy link

comment:8

The two statements below seem like a contraction: using the example above, if you want to ensure that obj.f() and obj.f(1) and obj.f(arg=1) use the same cache entry, you do need to know the names and defaults of the arguments.

Replying to @simon-king-jena:

Yes, we do require that. Equivalent inputs should result in identical output.

We don't need to know all argument names and all defaults, but we do need to know their number.

@jdemeyer
Copy link

comment:9

Replying to @simon-king-jena:

we want to know if the method accepts any argument in addition to self.

That's possible using the Python API function PyCFunction_GetFlags. When applied to a bound method, it returns the flags as defined in https://docs.python.org/2/c-api/structures.html#c.PyMethodDef

@simon-king-jena
Copy link
Member Author

comment:10

Replying to @jdemeyer:

The two statements below seem like a contraction: using the example above, if you want to ensure that obj.f() and obj.f(1) and obj.f(arg=1) use the same cache entry, you do need to know the names and defaults of the arguments.

For calling the function, you of course need it (so, I admit that the title of this ticket is misleading). However, for determining whether it ought to be a CachedMethodCaller or a CachedMethodCallerNoArgs, it suffices to know a little less.

And thank you for the link to PyCFunction_GetFlags. So, it seems that sage.misc.sageinspect should get a little addition sage.misc.sage_inspect_cython written in Cython that provides helpers for inspection of Cython methods.

@jdemeyer
Copy link

comment:11

Replying to @simon-king-jena:

For calling the function, you of course need it (so, I admit that the title of this ticket is misleading). However, for determining whether it ought to be a CachedMethodCaller or a CachedMethodCallerNoArgs, it suffices to know a little less.

Sure, but that doesn't solve the problem really (unless you want to fix the issue of this ticket only for methods taking no arguments). You still need to call the function eventually.

@jdemeyer
Copy link

comment:12

Replying to @simon-king-jena:

Replying to @jdemeyer:

The question is also: if there is a method

def f(self, arg=1)

do we require that obj.f() and obj.f(1) and obj.f(arg=1) all use the same cache entry? If you don't require that, there might be a solution by working on the level of *args and **kwds.

Yes, we do require that.

Is that really so important? Does it happen a lot in practice that people or the Sage library call a function in different but equivalent ways?

@simon-king-jena
Copy link
Member Author

comment:13

Replying to @jdemeyer:

Replying to @simon-king-jena:

Yes, we do require that.

Is that really so important? Does it happen a lot in practice that people or the Sage library call a function in different but equivalent ways?

I believe it is important, and I think it is specified somewhere in the docs of sage.misc.cachefunc. There are places where it is used for UniqueRepresentation. Indeed, if you have a unique parent that depends on arguments which (partially) have a default, then you simply don't know if the user will provide these arguments explicitly or implicitly. But when pickling/unpickling it, the arguments will be provided in a uniform way (always explicitly, if I recall correctly).

Hence, if we changed that, we would break uniqueness of UniqueRepresentation under pickling.

@jdemeyer
Copy link

comment:14

I see. This is important where the semantics of cached_function are really important.

I always had in mind only the application of speeding up the function by not computing the same thing multiple times. For the latter application, you need to consider also the overhead of parsing *args and **kwds to compute the correct cache entry. Where speed is important, we might just use *args and **kwds instead.

@jdemeyer
Copy link

comment:15

Proposal: have 2 different implementations of cached_function:

  • one for Python methods, which works just like the current implementation.

  • one for Cython methods, which works on the level of *args and **kwds.

@jdemeyer
Copy link

comment:16

This proposal would solve the issue on the ticket. Specializing for Cython, it might also be possible to speed-up things more.

@simon-king-jena
Copy link
Member Author

comment:17

Replying to @simon-king-jena:

And thank you for the link to PyCFunction_GetFlags. So, it seems that sage.misc.sageinspect should get a little addition sage.misc.sage_inspect_cython written in Cython that provides helpers for inspection of Cython methods.

It is puzzling. When I call PyCFunction_GetFlags on the function/method being wrapped, no crash occurs in that function. However, importing lazy_attribute then fails, and so Sage won't start. No idea how the two things (a function that gives me information on other functions without raising an error, and an import statement) could possibly interfere.

@jdemeyer
Copy link

comment:18

Replying to @simon-king-jena:

Replying to @simon-king-jena:

And thank you for the link to PyCFunction_GetFlags. So, it seems that sage.misc.sageinspect should get a little addition sage.misc.sage_inspect_cython written in Cython that provides helpers for inspection of Cython methods.

It is puzzling. When I call PyCFunction_GetFlags on the function/method being wrapped, no crash occurs in that function. However, importing lazy_attribute then fails, and so Sage won't start. No idea how the two things (a function that gives me information on other functions without raising an error, and an import statement) could possibly interfere.

It's hard to tell without seeing the code. Are you sure the PyCFunction_GetFlags call is what creates the problem and not for example a changed import or the mere fact of accessing obj.method?

Anyway, I'm still very curious what you think about [comment:11] and [comment:15]

@simon-king-jena
Copy link
Member Author

comment:19

The proposal from comment:15 would of course solve the problem of the introspection that is needed to wrap a cython function in a cached method. However, it would mean that the specification of uniqueness would be violated for those cython functions for which the argument spectrum is not available.

I'll try to show you a diff file shortly, so that you can see how calling PyCFunction_GetFlags makes the import of lazy_attribute fail. Currently, I am rebuilding Sage, so, it may take a while.

@jdemeyer
Copy link

comment:20

Replying to @simon-king-jena:

The proposal from comment:15 would of course solve the problem of the introspection that is needed to wrap a cython function in a cached method. However, it would mean that the specification of uniqueness would be violated for those cython functions for which the argument spectrum is not available.

True, but I think the uniqueness is only important for certain very specific functions. In any case, if we want to solve this ticket, I think it's the only way to go.

@simon-king-jena
Copy link
Member Author

comment:21

Try this:

diff --git a/src/sage/misc/cachefunc.pyx b/src/sage/misc/cachefunc.pyx
index a4276b9..515c74a 100644
--- a/src/sage/misc/cachefunc.pyx
+++ b/src/sage/misc/cachefunc.pyx
@@ -481,6 +481,12 @@ import sage.misc.weak_dict
 from sage.misc.weak_dict import WeakValueDictionary
 from sage.misc.decorators import decorator_keywords
 
+from cpython cimport PyObject
+
+cdef extern from "methodobject.h":
+    cdef int PyCFunction_GetFlags(PyObject *op)
+
+
 cdef frozenset special_method_names = frozenset(['__abs__', '__add__',
             '__and__', '__call__', '__cmp__', '__coerce__', '__complex__', '__contains__', '__del__',
             '__delattr__', '__delete__', '__delitem__', '__delslice__', '__dir__', '__div__',
@@ -2672,6 +2678,7 @@ cdef class CachedMethod(object):
         # we need to analyse the argspec
         f = (<CachedFunction>self._cachedfunc).f
         if self.nargs==0:
+            bla = PyCFunction_GetFlags(<PyObject *>f)
             args, varargs, keywords, defaults = sage_getargspec(f)
             if varargs is None and keywords is None and len(args)<=1:
                 self.nargs = 1

According to the crash report that I get, the error occurs in the line args, varargs, keywords, defaults = sage_getargspec(f), thus, after PyCFunction_GetFlags has returned something.

@jdemeyer
Copy link

comment:22

Replace

cdef int PyCFunction_GetFlags(PyObject *op)

by

cdef int PyCFunction_GetFlags(object op) except -1

@jdemeyer
Copy link

comment:23

With the correct except declaration, there is an error if the argument is not a "built-in" (Cython) method:

SystemError: Objects/methodobject.c:64: bad argument to internal function

@simon-king-jena
Copy link
Member Author

comment:24

Replying to @jdemeyer:

With the correct except declaration, there is an error if the argument is not a "built-in" (Cython) method:

SystemError: Objects/methodobject.c:64: bad argument to internal function

OK, we then have an error whose traceback points to calling the function.

@simon-king-jena
Copy link
Member Author

comment:25

Apparently PyCFunction_GetFlags is the wrong tool (or we first need to extract the cfunction from the method), since always an error is raised.

@jdemeyer
Copy link

comment:26

Replying to @simon-king-jena:

Apparently PyCFunction_GetFlags is the wrong tool (or we first need to extract the cfunction from the method), since always an error is raised.

It is the right tool for bound Cython methods.

@jdemeyer
Copy link

comment:27

I am currently testing something else, but for example

sage: K = GF(5)
sage: m = K.factored_unit_order
sage: m
<built-in method factored_unit_order of FiniteField_prime_modn_with_category object at 0x7f357853c1e0>

This m object should be suitable as argument for PyCFunction_GetFlags.

@nbruin
Copy link
Contributor

nbruin commented Feb 24, 2015

comment:60

Replying to @simon-king-jena:

Because of Cython caching, we can not use the embedded argspec information

This is a one-off hitch. It just means that when we enable embedsignature globally, people should clear their cython cache (once).

even if we could enable "embedsignature(True)" globally

We can.

if "embedsignature(True)" was enabled globally, some valid Cython code wouldn't compile.

but it's easy to reformulate those cases.

@jdemeyer
Copy link

comment:61

Let me mention that #17847 is now in Sage, which makes this ticket feasible.

@simon-king-jena
Copy link
Member Author

comment:62

Would it be ok to rebase this branch on top of the current develop branch (and then force-push)? Or would it be ok if I merged this branch into the current develop and start with it? Or merge the current develop into this branch? I still don't get what is considered best praxis.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Apr 18, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

831173bMerge branch 't/17814/make_calling_a_cached_method_independent_of_source_code_inspection' into t/17814/rebased-make_calling_a_cached_method_independent_of_source_code_inspection
b340eebUse embedded signature for introspection

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Apr 18, 2015

Changed commit from a0b829d to b340eeb

@simon-king-jena
Copy link
Member Author

comment:64

I am now using the embedded signatures (due to #17847) for introspection. I preserved the previous commits by merging the previous ticket branch into the current develop to get the new ticket branch; hopefully the correct procedure.

In order to demonstrate what is happening, I added a dummy method to a nested test class provided in sage.misc.nested_class. As it turns out, there has been a bug in the previous sage_getargspec that is now fixed. I suppose that the bug actually is in Cython:

sage: from sage.misc.nested_class import MainClass
sage: print MainClass.NestedClass.NestedSubClass.dummy.func_defaults
None

Actually the default is not None but is a tuple, which is correctly determined by the new version of sage_getargspec:

sage: sage_getargspec(MainClass.NestedClass.NestedSubClass.dummy)
ArgSpec(args=['self', 'x', 'r'], varargs='args', keywords='kwds', defaults=((1, 2, 3.4),))

Moreover, the above does not involve reading the source code. I wrote a new function that determines the argspec from the embedded signature, and strips the signature at the same time:

sage: print MainClass.NestedClass.NestedSubClass.dummy.__doc__
NestedSubClass.dummy(self, x, *args, r=(1, 2, 3.4), **kwds)
File: sage/misc/nested_class.pyx (starting at line 314)

                A dummy method to demonstrate the embedding of
                method signature for nested classes.
...
sage: from sage.misc.sageinspect import _extract_embedded_signature
sage: print _extract_embedded_signature(MainClass.NestedClass.NestedSubClass.dummy.__doc__, 'dummy')[0]
File: sage/misc/nested_class.pyx (starting at line 314)

                A dummy method to demonstrate the embedding of
                method signature for nested classes.
...
sage: _extract_embedded_signature(MainClass.NestedClass.NestedSubClass.dummy.__doc__, 'dummy')[1]
ArgSpec(args=['self', 'x', 'r'], varargs='args', keywords='kwds', defaults=((1, 2, 3.4),))

The tests in sage.misc pass.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Apr 18, 2015

Changed commit from b340eeb to 0e8ef00

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Apr 18, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

0e8ef00Catch syntax error when extraction of signature fails

@videlec
Copy link
Contributor

videlec commented Apr 24, 2015

comment:66

On 6.7-beta2 I got

sage: K=GF(5)
sage: K.factored_order
Cached version of <method 'factored_order' of
'sage.rings.finite_rings.finite_field_base.FiniteField' objects>

@videlec
Copy link
Contributor

videlec commented Apr 24, 2015

comment:67

sorry already in comment:27... could you update the description then?

@simon-king-jena
Copy link
Member Author

comment:68

Replying to @videlec:

On 6.7-beta2 I got

sage: K=GF(5)
sage: K.factored_order
Cached version of <method 'factored_order' of
'sage.rings.finite_rings.finite_field_base.FiniteField' objects>

You mean, you got it after removing src/sage/rings/finite_rings/finite_field_base.pyx? Then you say the problem is fixed.

@simon-king-jena
Copy link
Member Author

comment:69

Replying to @videlec:

sorry already in comment:27... could you update the description then?

I don't understand what info you need.

@videlec
Copy link
Contributor

videlec commented Apr 24, 2015

comment:70

Replying to @simon-king-jena:

Replying to @videlec:

On 6.7-beta2 I got

sage: K=GF(5)
sage: K.factored_order
Cached version of <method 'factored_order' of
'sage.rings.finite_rings.finite_field_base.FiniteField' objects>

You mean, you got it after removing src/sage/rings/finite_rings/finite_field_base.pyx? Then you say the problem is fixed.

Nope. Just starting from a fresh sage-6.7.beta2.

@videlec
Copy link
Contributor

videlec commented Apr 24, 2015

comment:71

Replying to @videlec:

Replying to @simon-king-jena:

Replying to @videlec:

On 6.7-beta2 I got

sage: K=GF(5)
sage: K.factored_order
Cached version of <method 'factored_order' of
'sage.rings.finite_rings.finite_field_base.FiniteField' objects>

You mean, you got it after removing src/sage/rings/finite_rings/finite_field_base.pyx? Then you say the problem is fixed.

Nope. Just starting from a fresh sage-6.7.beta2.

My mistake... I read too fast!!

@simon-king-jena
Copy link
Member Author

comment:72

I still don't see what info was requested. Hence, I change it back to "needs review".

@vbraun
Copy link
Member

vbraun commented Apr 29, 2015

Reviewer: Volker Braun

@vbraun
Copy link
Member

vbraun commented Apr 29, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants