Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: fixing test module name collisions in namespace packages #6966

Open
0cjs opened this issue Mar 25, 2020 · 2 comments
Open

Discussion: fixing test module name collisions in namespace packages #6966

0cjs opened this issue Mar 25, 2020 · 2 comments
Labels
topic: collection related to the collection phase type: proposal proposal for a new feature, often to gather opinions or design the API around the new feature

Comments

@0cjs
Copy link
Contributor

0cjs commented Mar 25, 2020

In #5147 I addressed the issues with pytest when using namespace packages (i.e., directories without a __init__.py file).¹ The problems discussed here are related to that and stem from the same source:

  1. If you have namespace packages foo and bar, you cannot currently have foo/util_test.py and bar/util_test.py because these will both cause creation of a module with a fully-qualified name of util_test and insert it (unnecessarily) into sys.modules. (Pytest does catch this collision, for test files anyway, and generates an error during collection.)
  2. sys.path has an extra entry added to it for each namespace module, which can cause unexpected changes in what modules get imported as compared to running outside pytest.

A solution for this (for Python ≥3.5, anyway) is already available in the py package: use importlib instead of the older import routines so you can avoid both adding the module to sys.modules and changing sys.path. (This is done by passing ensuresyspath='importlib' to LocalPath.pyimport(); the support for this was added in commit 055bcb1a in 2018.) However, pytest does not currently provide any way to do this, and the best way to enable this is not clear at the moment, at least to me. (One of my worries is issues with backwards compatibility.)

To solve this issue for my projects, I've created this plugin. (Depending on when you're reading this, a more recent version may be available on the master branch.)²

This uses essentially the same technique as py's LocalPath.pyimport() above: use the Python 3.5 importlib API to load a spec, create the module and execute its code, all directly from the file without using sys.path, and without inserting the newly-created module into sys.modules. My version is slightly more complex because I do this only for files matching *.pt (and need to use a couple of tricks to get the interpreter to load files not ending in .py); this preserves backward compatibility for *_test.py and test_*.py files.³

So that's a proof-of-concept. I'd love to have feedback on any of the particulars of how I'm doing that, whether related to the collision/path problem or just the ability to use files not ending in .py.

I'm also hoping we can discuss further how we might move forward with a real fix for the issues raised by namespace packages (for both test modules, as discussed here, and code-under-test modules, as discussed in #5147): the collisions and the sys.path changes. Having done this, it seems to have opened up the possibility of being fixed with a plug-in instead of by changing pytest itself, which I suppose has its advantages and disadvantages.

Thoughts?


¹ Namespace package problems have also been discussed in other issues: #2371, I think #1028, and #221. I suspect that what fixed that last one is the source of our current problems.

² If you are a bit confused about what you see under the src/ subdirectory of that repo, yes, I am indeed using pytest to test my 6502 assembler code. :-) You can see some more "normal" testing of Python code under the lib/ directory.

³ Actually, the main reason I did this was so I can use foo.* on the command line to reference all files related to module foo, rather than foo{.md,_test.py,.py}, especially when there are foobar* files kicking around, too, but the ability to separate the files with the new behaviour and maintain backward-compatibility is a nice bonus.

@Zac-HD Zac-HD added topic: collection related to the collection phase type: proposal proposal for a new feature, often to gather opinions or design the API around the new feature labels Mar 26, 2020
@Traktormaster
Copy link

Hi!
I have just faced this issue today myself. (or something related)

Code stucture

There is a code repository I work on where all components of a library and their tests are inside PEP420 (implicit) namespaces. They are like little projects but we manage them in a single repository for convenience. Simplified example:

Repo root
+ Component-A
|   + proj                     # PEP420
|   |   + comp_a
|   |   |  + __init__.py
|   + tests                    # PEP420
|   |   + proj                 # PEP420
|   |   |   + comp_a
|   |   |      + __init__.py
+ Component-B
|   + proj                     # PEP420
|   |   + comp_b
|   |   |  + __init__.py
|   + tests                    # PEP420
|   |   + proj                 # PEP420
|   |   |   + comp_b
|   |   |      + __init__.py
...

Each component root directory in the above structure is added to PYTHONPATH so importing everything can be done so:

from proj.comp_a import xy
from proj.comp_b import ab
# when testing
from tests.proj.comp_a import fixture_x
# and so on...

Where pytest fails

If I call pytest like python -m pytest /path/to/proj/Component-B/tests the tests will be found and they will run. However they will be imported without regard to the implicit namespaces. The import module of tests.proj.comp_b will be comp_b.
This will still kind of work (I'm not sure about unwanted side effects), but it will definitely fail if the name of "comp_b" is something common that gets in conflict with a real top level package/module.

Conflict example

Let's say the comp_b package's name is actually zmq. It contains some utilities to work with zmq and as such pyzmq is installed as a site package. In the intended context the zmq pacakge is not in conflict with the installed library, because it is inside a namespace of proj.zmq.
When pytest ignores the PEP420 namespaces and inserts the folder of /path/to/proj/Component-B/tests/proj so it can import the zmq module it will shadow the real zmq library. Making all tests actually unable to run at all.

My fix

My issue is pytest not respecting PEP420 namespaces. As such I debugged the whole way down where the importing of the test modules takes place. I found a convenient place to insert some changes.
Here are my changes to py/_path/local.py. Warning: I'm fairly certain this could break something else, but it works for me (TM).

--- local.py	2020-03-31 19:30:40.175111163 +0200
+++ local_fix.py	2020-03-31 19:30:48.768444588 +0200
@@ -620,28 +620,41 @@
 
     def pypkgpath(self):
         """ return the Python package path by looking for the last
         directory upwards which still contains an __init__.py.
         Return None if a pkgpath can not be determined.
         """
         pkgpath = None
         for parent in self.parts(reverse=True):
             if parent.isdir():
                 if not parent.join('__init__.py').exists():
                     break
                 if not isimportable(parent.basename):
                     break
                 pkgpath = parent
+        # check for already existing paths that include the pkgpath
+        # in case there is one, it's better to assume that some directories are actually PEP420 namespaces
+        spkgpath = str(pkgpath)
+        best_sys_path = None
+        for sysp in sys.path:
+            if spkgpath.startswith(sysp):
+                if not best_sys_path:
+                    best_sys_path = sysp
+                elif len(best_sys_path) < len(sysp):
+                    best_sys_path = sysp
+        if best_sys_path:
+            while str(pkgpath).startswith(best_sys_path) and not best_sys_path.startswith(str(pkgpath.dirpath())):
+                pkgpath = pkgpath.dirpath()
         return pkgpath
 
     def _ensuresyspath(self, ensuremode, path):
         if ensuremode:
             s = str(path)
             if ensuremode == "append":
                 if s not in sys.path:
                     sys.path.append(s)
             else:
                 if s != sys.path[0]:
                     sys.path.insert(0, s)
 
     def pyimport(self, modname=None, ensuresyspath=True):
         """ return path as an imported python module.

@0cjs
Copy link
Contributor Author

0cjs commented Mar 31, 2020

@Traktormaster Yes, you've implemented more or less the solution I discussed in a comment on #5147, originally proposed by heofling on Stack Overflow.

For your own use, you might find it more convenient to use his technique of monekypatching py in a conftest.py in your repo, rather than having to use a separate, modified version of py. (I believe that monkeypatching is also what I also used in my "test" implementation, which actually did go into production, but I no longer have access to that code.)

In my comment I also proposed an API change to pypkgpath() to add a parameter that would allow you to select the behaviour your want, but of course this would still need pytest to be tweaked to be able to use that parameter.

I'd suggest at least dropping a comment in that issue pointing to your comment here, so that the maintainers know that there's further interest in this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: collection related to the collection phase type: proposal proposal for a new feature, often to gather opinions or design the API around the new feature
Projects
None yet
Development

No branches or pull requests

3 participants