Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attaching a pyx file in the presence of __init__.py results in wrong module names #12868

Open
simon-king-jena opened this issue Apr 22, 2012 · 16 comments

Comments

@simon-king-jena
Copy link
Member

Create a new folder (say, /home/simon/SAGE/work/tests/) and create a file test.pyx with the content

class bla(type):
    pass

Start sage, attach the file, and you will find something like

sage: attach test.pyx
Compiling ./test.pyx...
sage: bla.__module__
'sage: attach test.pyx
Compiling ./test.pyx...
sage: bla.__module__
'_home_simon_SAGE_work_tests_test_pyx_0'

Then, create an __init__.py file in the same folder. You will find

sage: attach test.pyx
Compiling ./test.pyx...
sage: bla.__module__
'_home_simon_SAGE_work_tests_test_pyx._home_simon_SAGE_work_tests_test_pyx_0'
sage: sys.modules.has_key(bla.__module__)
False

In fact, even if __init__.py is present, the module is still known under the "short" name, but the class doesn't know about it:

sage: import _home_simon_SAGE_work_tests_test_pyx_0
sage: _home_simon_SAGE_work_tests_test_pyx_0.bla
<class '_home_simon_SAGE_work_tests_test_pyx._home_simon_SAGE_work_tests_test_pyx_0.bla'>

Please change the component if you think that there is one that fits less bad than "misc"...

Component: misc

Issue created by migration from https://trac.sagemath.org/ticket/12868

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:1

I'm not yet sure whether that's really a bug, or just confusing. (If I understand correctly, the presence of an __init__.py file was unintentional in your case.)

The behaviour should probably also depend on whether some directory of the path to the attached file is in PYTHONPATH. Haven't yet looked at the code though.

[W.r.t. the component: "User interface" seems a candidate as well.]

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:2

Btw., the output of (in your case)

sage: bla?

looks different as well, depending on whether an __init__.py file is present in the same directory.

Without one, there's also:

Loaded File:	/home/leif/.sage/temp/sleepless/16739/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so
Source File:	/home/leif/.sage/temp/sleepless/16739/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so

(The attached file is /tmp/foo/foo.pyx.)

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:3

Looks like distutils were responsible for that behaviour...

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:4

In the Cython-generated .c file, I see things like

...
/* Module declarations from '_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0' */
#define __Pyx_MODULE_NAME "_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0"
int __pyx_module_is_main__tmp_foo_foo_pyx___tmp_foo_foo_pyx_0 = 0;

/* Implementation of '_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0' */
static char __pyx_k_1[] = "File: _tmp_foo_foo_pyx_0.pyx (starting at line 2)";
static char __pyx_k_2[] = "_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0";
static char __pyx_k__bar[] = "bar";
static char __pyx_k____main__[] = "__main__";
static char __pyx_k____test__[] = "__test__";
static PyObject *__pyx_n_s_2;
static PyObject *__pyx_n_s____main__;
static PyObject *__pyx_n_s____test__;
static PyObject *__pyx_n_s__bar;

static PyMethodDef __pyx_methods[] = {
  {0, 0, 0, 0}
};

#if PY_MAJOR_VERSION >= 3
static struct PyModuleDef __pyx_moduledef = {
    PyModuleDef_HEAD_INIT,
    __Pyx_NAMESTR("_tmp_foo_foo_pyx_0"),
    __Pyx_DOCSTR(__pyx_k_1), /* m_doc */
    -1, /* m_size */
    __pyx_methods /* m_methods */,
    NULL, /* m_reload */
    NULL, /* m_traverse */
    NULL, /* m_clear */
    NULL /* m_free */
};
#endif
...

with an __init__.py file present. (My dummy class is called bar rather than bla.)

In the temporary build directory, also a symbolic link to __init__.py is created:

/home/leif/.sage/temp/sleepless/17317/spyx/_tmp_foo_foo_pyx:
total 112
drwxr-xr-x 3 leif leif  4096 2012-04-22 17:45 .
drwxr-xr-x 3 leif leif  4096 2012-04-22 17:45 ..
drwxr-xr-x 4 leif leif  4096 2012-04-22 17:45 build
-rw-r--r-- 1 leif leif     0 2012-04-22 17:45 err
lrwxrwxrwx 1 leif leif    16 2012-04-22 17:45 foo.pyx -> /tmp/foo/foo.pyx
lrwxrwxrwx 1 leif leif    20 2012-04-22 17:45 __init__.py -> /tmp/foo/__init__.py
-rw-r--r-- 1 leif leif  1536 2012-04-22 17:45 log
-rw-r--r-- 1 leif leif  1400 2012-04-22 17:45 setup.py
-rw-r--r-- 1 leif leif 46397 2012-04-22 17:45 _tmp_foo_foo_pyx_0.c
-rw-r--r-- 1 leif leif  3993 2012-04-22 17:45 _tmp_foo_foo_pyx_0.html
-rw-r--r-- 1 leif leif   157 2012-04-22 17:45 _tmp_foo_foo_pyx_0.pyx
-rwxr-xr-x 1 leif leif 36595 2012-04-22 17:45 _tmp_foo_foo_pyx_0.so

setup.py there looks "clean".

So I'm pretty sure -- if at all -- distutils are to blame for the unexpected behaviour, although it seems quite reasonable to me.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:5

Replying to @nexttime:

So I'm pretty sure -- if at all -- distutils are to blame for the unexpected behaviour, although it seems quite reasonable to me.

Nope, actually Cython is causing this, since the following does not solve your problem (if it really is one):

diff --git a/sage/misc/cython.py b/sage/misc/cython.py
--- a/sage/misc/cython.py
+++ b/sage/misc/cython.py
@@ -432,6 +432,11 @@
     SAGE_ROOT  = os.environ['SAGE_ROOT']
     SAGE_LOCAL = SAGE_ROOT + '/local/'
 
+# Remove unwanted link to an __init__.py file from the current directory,
+# i.e., from the temporary build directory only (#12868):
+if os.path.exists("__init__.py"):
+    os.unlink("__init__.py")
+
 extra_link_args =  ['-L' + SAGE_LOCAL + '/lib']
 extra_compile_args = %s
 

You can verify that Cython changes the module name depending on whether an __init__.py file is present by

$ sage --sh -c 'cython test.pyx' && grep '^#.*__Pyx_MODULE_NAME' test.c`

Although it is not yet clear to me how Cython finds the __init__.py if I remove it from the temporary build directory, as there's only the symbolic link to the "unrelated" original .pyx source file, but probably Cython somehow "knows" about the relation.

@simon-king-jena
Copy link
Member Author

comment:6

Replying to @nexttime:

... the following does not solve your problem (if it really is one):

We have a module containing a class bla, but sys.modules[bla.__module__] gives a key error. I am surprised that you question whether that is a problem.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:7

Replying to @simon-king-jena:

Replying to @nexttime:

... the following does not solve your problem (if it really is one):

We have a module containing a class bla, but sys.modules[bla.__module__] gives a key error. I am surprised that you question whether that is a problem.

No, I'd rather say it's a "user error"... ;-)

Seriously, my above patch didn't work because the link was removed just too late (since I initially thought the problem was caused by distutils, which it is not).

Currently changing the patch to remove the (in your opinion) undesirable link to __init__.py earlier, such that cython, when invoked, won't see it.

@simon-king-jena
Copy link
Member Author

comment:8

Replying to @nexttime:

Currently changing the patch to remove the (in your opinion) undesirable link to __init__.py earlier, such that cython, when invoked, won't see it.

That wouldn't be a solution. Perhaps the user wants (for some reason) to have it in the folder (in particular, it is no user error).

I believe that it should simply not matter for attaching files whether or not __init__.py is there. Of course, if you have a Python file and want that you can import (not attach) it, then you would create the __init__.py (this is why I created it, anyway).

Attaching a file should be independent from importing, and both should simultaneously be possible. So, unlinking __init__.py is bad.

Up to yesterday, I was in the belief that "attaching a pyx-file" means that

  1. the file is copied into a temporary directory,
  2. the compilation is done there, and
  3. the module is imported from the temporary directory.

Since the temporary directory would not contain __init__.py, there should be no problem, isn't it?

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:9

Replying to @simon-king-jena:

Replying to @nexttime:

Currently changing the patch to remove the (in your opinion) undesirable link to __init__.py earlier, such that cython, when invoked, won't see it.

That wouldn't be a solution. Perhaps the user wants (for some reason) to have it in the folder (in particular, it is no user error).

In the temporary build folder? Not sure whether there are situations where this is needed.

This one works for me:

diff --git a/sage/misc/cython.py b/sage/misc/cython.py
--- a/sage/misc/cython.py
+++ b/sage/misc/cython.py
@@ -382,6 +382,10 @@
         os.system(cmd)
         if os.path.exists("%s/setup.py" % build_dir):
             os.unlink("%s/setup.py" % build_dir)
+        if os.path.exists("%s/__init__.py" % build_dir):
+            sys.stderr.write("Note: Ignoring %s/__init__.py ...\n" % abs_base)
+            sys.stderr.flush()
+            os.unlink("%s/__init__.py" % build_dir)
 
     if compile_message:
         print "Compiling %s..."%filename

No idea whether it breaks anything else. Note that you have to run ./sage -b after changing the file, which in turn causes (for whatever reason) dozens of extension modules to get rebuilt.

In case having __init__.py in the temporary build directory is desirable, we could just restore the link after Cython has been run.

I believe that it should simply not matter for attaching files whether or not __init__.py is there. Of course, if you have a Python file and want that you can import (not attach) it, then you would create the __init__.py (this is why I created it, anyway).

Attaching a file should be independent from importing, and both should simultaneously be possible. So, unlinking __init__.py is bad.

Importing in the usual way should still work, since you don't import from the temporary directory, do you?

Up to yesterday, I was in the belief that "attaching a pyx-file means" that

  1. the file is copied into a temporary directory,
  2. the compilation is done there, and
  3. the module is imported from the temporary directory.

That's correct. But in addition, symbolic links to any(!) file in the directory containing the attached file are created in the temporary directory (since e.g. you otherwise wouldn't be able to import or include other files in that directory from the attached file). Above, I've only changed that part.

Since the temporary directory would not contain __init__.py, there should be no problem, isn't it?

That's what I was saying. (Unless you try to import __init__ in the attached file...)

@simon-king-jena
Copy link
Member Author

comment:10

Replying to @nexttime:

That wouldn't be a solution. Perhaps the user wants (for some reason) to have it in the folder (in particular, it is no user error).

In the temporary build folder? Not sure whether there are situations where this is needed.

No, in the original folder.

This one works for me:

diff --git a/sage/misc/cython.py b/sage/misc/cython.py
--- a/sage/misc/cython.py
+++ b/sage/misc/cython.py
@@ -382,6 +382,10 @@
         os.system(cmd)
         if os.path.exists("%s/setup.py" % build_dir):
             os.unlink("%s/setup.py" % build_dir)
+        if os.path.exists("%s/__init__.py" % build_dir):
+            sys.stderr.write("Note: Ignoring %s/__init__.py ...\n" % abs_base)
+            sys.stderr.flush()
+            os.unlink("%s/__init__.py" % build_dir)
 
     if compile_message:
         print "Compiling %s..."%filename

...
That's correct. But in addition, symbolic links to any(!) file in the directory containing the attached file are created in the temporary directory (since e.g. you otherwise wouldn't be able to import or include other files in that directory from the attached file). Above, I've only changed that part.

I see! So, you are not unlinking it from the original directory.

That's what I was saying. (Unless you try to import __init__ in the attached file...)

Hm. Difficult to tell. If you have files foo.pyx and bar.py in your folder, and in foo.pyx you want to do from bar import Bar, then I guess you need to have __init__.py in the folder in which the compilation happens, isn't it? So, there are situations in which the presence of __init__.py in the temporary folder (via a symbolic link) makes sense.

The best solution would be to patch Cython, so that the wrong naming of modules does not occur (in some post above, I think it was shown that the naming is chosen by Cython, not by distutils).

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:11

Replying to @nexttime:

[...] in addition, symbolic links to any(!) file in the directory containing the attached file are created in the temporary directory (since e.g. you otherwise wouldn't be able to import or include other files in that directory from the attached file).

Of course there are other, probably smarter (though not necessarily simpler) ways to achieve this, e.g. by modifying sys.path for the attached module, and by passing according -I options to cython (although I think the generated setup.py already takes care of the latter).

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:12

Replying to @simon-king-jena:

The best solution would be to patch Cython, so that the wrong naming of modules does not occur (in some post above, I think it was shown that the naming is chosen by Cython, not by distutils).

I don't think Cython is wrong by what it does, as __init__.py files belong to the package (and hence directory) they're contained in, i.e. .../<package_name>/__init__.py belongs to <package_name>.

Consequently, all .py and .pyx files there also belong to that package, hence their module names get the package prefix.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:13

Replying to @simon-king-jena:

Replying to @nexttime:

... the following does not solve your problem (if it really is one):

We have a module containing a class bla, but sys.modules[bla.__module__] gives a key error. I am surprised that you question whether that is a problem.

That should IMHO be solvable by importing / loading the module differently, although I think you run into the same if for example you create links to extension modules and import from these rather than their original location. So I'm not sure if it really is an error.

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:14

Replying to @nexttime:

Replying to @simon-king-jena:

Replying to @nexttime:

... the following does not solve your problem (if it really is one):

We have a module containing a class bla, but sys.modules[bla.__module__] gives a key error. I am surprised that you question whether that is a problem.

That should IMHO be solvable by importing / loading the module differently, although I think you run into the same if for example you create links to extension modules and import from these rather than their original location.

I think that's actually a cleaner (and probably also safer) solution.

diff --git a/sage/misc/cython.py b/sage/misc/cython.py
--- a/sage/misc/cython.py
+++ b/sage/misc/cython.py
@@ -382,6 +382,10 @@
         os.system(cmd)
         if os.path.exists("%s/setup.py" % build_dir):
             os.unlink("%s/setup.py" % build_dir)
+        if os.path.exists("%s/__init__.py" % build_dir):
+            sys.stderr.write("Note: Not ignoring %s/__init__.py ...\n" % abs_base)
+            sys.stderr.flush()
+            # os.unlink("%s/__init__.py" % build_dir)
 
     if compile_message:
         print "Compiling %s..."%filename
@@ -527,7 +531,13 @@
         if os.system(cmd):
             raise RuntimeError, "Error making local copy of shared object library for %s"%filename
 
-    return name, build_dir
+    if os.path.exists("%s/__init__.py" % build_dir):
+        # In that case, the module name Cython creates contains the package
+        # name, i.e., the name of the directory the file is contained in.
+        assert not build_dir.endswith(os.path.sep)
+        return "%s.%s" % (base, name), os.path.dirname(build_dir)
+    else:
+        return name, build_dir
 
 
 

Note that this change has the side effect that now __init__.py also gets executed, which IMHO is the desired behaviour:

sage: attach("/tmp/foo/foo.pyx")
Note: Not ignoring /tmp/foo/__init__.py ...
Compiling /tmp/foo/foo.pyx...
Hello, I'm '/tmp/foo/__init__.py'.
sage: bar.__module__
'_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0'
sage: sys.modules[bar.__module__]
<module '_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0' from '/home/leif/.sage//temp/sleepless/21086//spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so'>
sage: bar?
Type:		classobj
String Form:	_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0.bar
Namespace:	Interactive
Loaded File:	/home/leif/.sage/temp/sleepless/21086/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so
Source File:	/home/leif/.sage/temp/sleepless/21086/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so

sage: 

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:15

Just checked: Imports in the attached file also work properly; the imported modules also get the (temporary) package name prepended:

sage: attach("/tmp/foo/foo.pyx")
Note: Not ignoring /tmp/foo/__init__.py ...
Compiling /tmp/foo/foo.pyx...
Hello, I'm '/tmp/foo/__init__.py'.
sage: bar.__module__
'_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0'
sage: sys.modules[bar.__module__]
<module '_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0' from '/home/leif/.sage//temp/sleepless/21279//spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so'>
sage: bar?
Type:		classobj
String Form:	_tmp_foo_foo_pyx._tmp_foo_foo_pyx_0.bar
Namespace:	Interactive
Loaded File:	/home/leif/.sage/temp/sleepless/21279/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so
Source File:	/home/leif/.sage/temp/sleepless/21279/spyx/_tmp_foo_foo_pyx/_tmp_foo_foo_pyx_0.so

sage: Baz?
Type:		classobj
String Form:	_tmp_foo_foo_pyx.baz.Baz
Namespace:	Interactive
Loaded File:	/home/leif/.sage/temp/sleepless/21279/spyx/_tmp_foo_foo_pyx/baz.py
Source File:	/home/leif/.sage/temp/sleepless/21279/spyx/_tmp_foo_foo_pyx/baz.py

sage: Baz.__module__
'_tmp_foo_foo_pyx.baz'
sage: !cat /tmp/foo/foo.pyx
# foo.pyx

from baz import Baz

class bar:
    pass

sage: !cat /tmp/foo/baz.py
# baz.py

class Baz:
    pass
sage: 

@nexttime
Copy link
Mannequin

nexttime mannequin commented Apr 22, 2012

comment:16

I'll attach a proper patch after further testing.

@jdemeyer jdemeyer modified the milestones: sage-5.11, sage-5.12 Aug 13, 2013
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.1, sage-6.2 Jan 30, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.2, sage-6.3 May 6, 2014
@sagetrac-vbraun-spam sagetrac-vbraun-spam mannequin modified the milestones: sage-6.3, sage-6.4 Aug 10, 2014
@mkoeppe mkoeppe removed this from the sage-6.4 milestone Dec 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants