Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

`develop` and `install --single-version-externally-managed` are not compatible with each other for namespace packages #250

Closed
ghost opened this issue Aug 29, 2014 · 28 comments
Labels

Comments

@ghost
Copy link

@ghost ghost commented Aug 29, 2014

Originally reported by: hongqn (Bitbucket: hongqn, GitHub: hongqn)


To reproduce this issue, do the following in a virtualenv:

  1. prepare source code of two packages with same namespace package ns:

    mkdir -p pkg1/ns/pkg1 pkg2/ns/pkg2
    echo "from setuptools import setup, find_packages; setup(name='pkg1', packages=find_packages(), namespace_packages=['ns'])" > pkg1/setup.py
    echo "__import__('pkg_resources').declare_namespace(__name__)" > pkg1/ns/__init__.py
    touch pkg1/ns/pkg1/__init__.py
    echo "from setuptools import setup, find_packages; setup(name='pkg2', packages=find_packages(), namespace_packages=['ns'])" > pkg2/setup.py
    echo "__import__('pkg_resources').declare_namespace(__name__)" > pkg2/ns/__init__.py
    touch pkg2/ns/pkg2/__init__.py
    
  2. install pkg1 using --single-version-externally-managed

    cd pkg1
    python setup.py install --single-version-externally-managed --record record.txt
    cd ..
    
  3. install pkg2 using develop

    cd pkg2
    python setup.py develop
    cd ..
    
  4. pkg1 is accessible

    python -c 'import ns.pkg1'
    
  5. pkg2 is unexpectedly inaccessible

    python -c 'import ns.pkg2'
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ImportError: No module named pkg2
    

This was discussed in pip's issue tracker pypa/pip#3 years ago (because pip uses install --single-version-externally-managed for pip install and use develop for pip install --editable), but they still have not gotten a resolution.

I think this issue should be resolved in setuptools instead of pip. install --single-version-externally-managed creates a {pkg}-nspkg.pth in site-packages directory. It will create a ns module and assign its __path__ to the ns directory in site-packages. So pkg_resources.declare_namespace() in pkg2's ns/init.py has no chance to run.

I have two proposals to fix it:

  1. Fix install --single-version-exteranlly-managed. Modify the {pkg}-nspkg.pth file to append a line

    import pkg_resources; pkg_resources.declare_namespace('ns')
    

    at the end of it. So it will scan sys.path to gather other directories into ns.__path__.

    A tricky point is that we must ensure the {pkg}-nspkg.pth filename is alphabetically larger than easy-install.pth, so that we have source code directory installed by develop command in sys.path when running declare_namespace. Is to rename to zzz-{pkg}-nspkg.pth a good idea?

  2. Fix develop command. Create a {pkg}-nspkg.pth in site-packages directory when a namespace package is installed using develop command, like what install --single-version-externally-managed does. The content of the pth file could be:

    /path/to/sourcecode
    import pkg_resources; pkg_resources.declare_namespace('ns')
    

    The /path/to/sourcecode line is duplicated with what is inserted in easy-install.pth. It is here to make sure sys.path contains it before declare_namespace runs. It could be omitted if we have a zzz- prefix for the pth filename.

What's the dev team's opinion? I prefer to the second proposal because it only affects develop command which is mostly used in develop environment so it will not affect production systems.

I'd like to contribute code fix if you agree with either one of the proposals.


@ghost
Copy link
Author

@ghost ghost commented Aug 29, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Thanks for the detailed bug report. Well stated.

I would say that anything with a zzz- prefix smells of a hack. Let's avoid that.

That implies that --single-version-externally-managed can't be adapted (option 1).

Option (2) looks viable on its surface. Would that technique still work if there were multiple packages (of the same namespace) installed using --develop? The prescribed technique does pollute the sys.path somewhat (as you say, duplicating entries), but that may be acceptable in a development environment.

Feel free to take a stab at tests and/or implementation toward that end. It sure would be nice to see pip issue 3 closed. I do suggest you draw attention to this ticket in distutils-sig and pypa-dev for broader feedback.

@ghost
Copy link
Author

@ghost ghost commented Sep 22, 2014

Original comment by rbtcollins (Bitbucket: rbtcollins, GitHub: rbtcollins):


Hi, I have a related situation: I'd like to be able to clone the source of a tree using a namespace package (say foo.bar) and have 'import foo.bar' work when other elements of foo are already installed.

A pth file with:
import pkg_resources; pkg_resources.declare_namespace('oslo')
import pkg_resources; pkg_resources.fixup_namespace_packages('')

in it suffices to enable this, but ideally that wouldn't be needed. I'm poking around at the moment to figure out just whats up.

@ghost
Copy link
Author

@ghost ghost commented Sep 22, 2014

Original comment by rbtcollins (Bitbucket: rbtcollins, GitHub: rbtcollins):


(And such a .pth file also fixes importing foo.bar after doing pip install -e .)

@ghost
Copy link
Author

@ghost ghost commented Sep 22, 2014

Original comment by rbtcollins (Bitbucket: rbtcollins, GitHub: rbtcollins):


Further data; the key call is fixup_namespace_packages(''), everything else Just Works for both a simple clone on PYTHONPATH and for "pip install -e ." and for 'python setup.py develop'.

I'm sure this indicates a race somewhere in site.py, digging deeper. But it suggests that fixing the race will be sufficient (e.g. ensure that by the end of site.py processing we've done a fixup_namespace_packages('') call).

@ghost
Copy link
Author

@ghost ghost commented Sep 24, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


I'm not sure what role site.py plays.

I don't see a race condition. All I see is that the current directory, '', is never declared as containing namespace packages. Adding it at startup would change that behavior.

@ghost
Copy link
Author

@ghost ghost commented Nov 13, 2014

Original comment by EvaSDK (Bitbucket: EvaSDK, GitHub: EvaSDK):


Got bitten by this a well after splitting my company's huge module into namespaces.
I already did some fixes in pylint/astroid for nested namespaces support but now I am not sure it is even needed (edit: actually it appears it is still needed).
The import pkg_resources; pkg_resources.fixup_namespace_packages('') in {mod}-nspkg.pth file fixed it for me.
Hope we can get a fix for this in a release soon.

@ghost
Copy link
Author

@ghost ghost commented Mar 26, 2015

Original comment by hobson (Bitbucket: hobson, GitHub: hobson):


I'm having the same problem as @EvaSDK... splitting a monolith into managable, independent pieces. Any workarounds?

@ghost
Copy link
Author

@ghost ghost commented May 7, 2015

Original comment by dsully (Bitbucket: dsully, GitHub: dsully):


Also interested in a fix here.

@ghost
Copy link
Author

@ghost ghost commented Jun 30, 2015

Original comment by Guyverthree (Bitbucket: Guyverthree, GitHub: Guyverthree):


I am also interested in the work on this.

@ghost
Copy link
Author

@ghost ghost commented Jul 1, 2015

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


@hobson: The only workaround I know currently is to always install all namespace packages using easy_install or with pip --egg. In that case, setup.py develop works fine with namespace packages.

@ghost
Copy link
Author

@ghost ghost commented Jul 5, 2015

Original comment by ncoghlan (Bitbucket: ncoghlan, GitHub: ncoghlan):


I have an alternative suggestion: give setuptools the ability to use native namespace packages in Python 3, and suggest that folks that need that native namespace support in Python 2 use the importlib2 backport of the Python 3 import system rather than attempting to emulate it on a per-package basis.

That is, I'd like to request the ability to disable the generation of namespace package __init__.py files entirely by passing a new option like --no-namespace-init. We introduced native namespace package support due to all the edge cases associated with attempting to define them externally, so setuptools is now in the situation in Python 3.3+ where it's turning off the native support (by providing an __init__.py file), and then attempting to recreate it while running that file.

If there's no __init__.py file in a directory at all, then Python 3 will assume it's a namespace package, and the same behaviour can be enabled in Python 2 by installing the importlib2 backport.

@ghost
Copy link
Author

@ghost ghost commented Jul 5, 2015

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


the same behaviour can be enabled in Python 2 by installing the importlib2 backport.

I've tried to verify this assertion without success:

$ mkdir a
$ mkdir b
$ mkdir a/foo
$ mkdir b/foo
$ touch a/foo/bar.py
$ touch b/foo/frob.py
$ PYTHONPATH=a:b python3.4 -c "import foo.bar, foo.frob"
$ PYTHONPATH=a:b python2.7 -c "import foo.bar, foo.frob"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named foo.bar$ sudo python2.7 -m easy_install -q importlib2 > /dev/null
$ PYTHONPATH=a:b python2.7 -c "import foo.bar, foo.frob"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named foo.bar
$ PYTHONPATH=a:b python2.7 -c "import importlib2; import foo.bar, foo.frob"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named foo.bar
$ PYTHONPATH=a:b python2.7 -c "import importlib2; importlib2.import_module('foo.bar')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/importlib2-3.5.0.2-py2.7.egg/importlib2/__init__.py", line 104, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/Library/Python/2.7/site-packages/importlib2-3.5.0.2-py2.7.egg/importlib2/_bootstrap.py", line 2313, in _gcd_import
    return _find_and_load(name, _gcd_import)
  File "/Library/Python/2.7/site-packages/importlib2-3.5.0.2-py2.7.egg/importlib2/_bootstrap.py", line 2296, in _find_and_load
    return _find_and_load_unlocked(name, import_)
  File "/Library/Python/2.7/site-packages/importlib2-3.5.0.2-py2.7.egg/importlib2/_bootstrap.py", line 2271, in _find_and_load_unlocked
    _call_with_frames_removed(import_, parent)
  File "/Library/Python/2.7/site-packages/importlib2-3.5.0.2-py2.7.egg/importlib2/_bootstrap.py", line 356, in _call_with_frames_removed
    return f(*args, **kwds)
  File "/Library/Python/2.7/site-packages/importlib2-3.5.0.2-py2.7.egg/importlib2/_bootstrap.py", line 2313, in _gcd_import
    return _find_and_load(name, _gcd_import)
  File "/Library/Python/2.7/site-packages/importlib2-3.5.0.2-py2.7.egg/importlib2/_bootstrap.py", line 2296, in _find_and_load
    return _find_and_load_unlocked(name, import_)
  File "/Library/Python/2.7/site-packages/importlib2-3.5.0.2-py2.7.egg/importlib2/_bootstrap.py", line 2283, in _find_and_load_unlocked
    raise ModuleNotFoundError(name)
importlib2._bootstrap.ModuleNotFoundError: No module named 'foo'

I helped contribute to the PEP-420 implementation (mostly in design discussions and tests), but my understanding was that PEP-420 required some foundational changes to the C-level import machinery itself, so couldn't be readily backported to Python 3.2 or earlier.

@ncoghlan Have you seen or can you produce an example where the presence of importlib2 enables PEP-420 namespace packages for Python prior to 3.3?

@ghost
Copy link
Author

@ghost ghost commented Jul 5, 2015

Original comment by ncoghlan (Bitbucket: ncoghlan, GitHub: ncoghlan):


Running "importlib2.install_import_hooks()" at startup should completely
shadow the default Python 2 machinery. I know it works to enable the
directory caching speedups, but I haven't personally tested the namespace
package support.

@ghost
Copy link
Author

@ghost ghost commented Jul 6, 2015

Original comment by rbtcollins (Bitbucket: rbtcollins, GitHub: rbtcollins):


You may need to hook it as Nick says.

I looked into supporting namespace packages in 2.x via this route about 6 months back and it wasn't quite there, but it certainly was the intent.

I'd file a bug on this in the importlib2 tracker.

@ghost
Copy link
Author

@ghost ghost commented Jul 6, 2015

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Thanks @ncoghlan for the tip. I found that I was able to make the example succeed with this invocation:

$ PYTHONPATH=a:b python2.7 -c "import importlib2.hook; importlib2.hook.install(); import foo.bar, foo.frob"

I'm excited about this prospect and would like to continue to explore it. It seems like maybe this issue isn't the best forum for this discussion. I've opened a new one as #406.

@ghost
Copy link
Author

@ghost ghost commented Jul 6, 2015

Original comment by embray (Bitbucket: embray, GitHub: embray):


Still catching up on this thread, but I would love to see this fixed. I've been aware of this problem for years, and just worked around it whenever it affected me. I like some version of @hongqn's second proposal, but now I'll go back and see what others have said. Just wanted to add my 👍 to fixing this.

@ghost
Copy link
Author

@ghost ghost commented Dec 7, 2015

Original comment by embray (Bitbucket: embray, GitHub: embray):


Ran across this issue yet again, in light of recently experiencing this issue again, having a thought similar to this as to how to fix it, and being quite sure this has been discussed before....

Last time there was any activity on this, the discussion got derailed by the suggestion to use importlib2 and #406. While having import system parity between Python 2 and Python 3 would certainly be ideal, the importlib2 proposal is sort of an all-or-nothing approach, and is maybe overkill and overcomplexity for the majority of import stories where the normal Python 2 import system works fine.

For the narrow case interoperability between the legacy setuptools -nspkg.pth hack and ./setup.py develop, I really think it would be best--at least for Python < 3.3 installs--to stick to fixing -nspkg.pth to use pkg_resources.declare_namespace.

The existing behavior of setuptools with respect to both namespace packages and dev installs are hacks. We just need to fix those hacks to be compatible with each other, and forget them in cases where they aren't needed at all.

@ghost
Copy link
Author

@ghost ghost commented Dec 7, 2015

Original comment by embray (Bitbucket: embray, GitHub: embray):


BTW, I can confirm that modifying ./setup.py develop to install an additional nspkg.pth file for the package being installed in develop mode resolves the issue (this is option 2 in the OP). It doesn't depend on the lexicographical order or anything like that either. Seems to Just Work. Haven't tested all corner cases yet though.

@ghost
Copy link
Author

@ghost ghost commented Dec 7, 2015

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


That's great news. Thanks for helping with this!

@ghost
Copy link
Author

@ghost ghost commented Dec 7, 2015

Original comment by embray (Bitbucket: embray, GitHub: embray):


I'll go ahead and supply a patch and let people decide where to go from there.

(This should probably be accompanied by an update to pip to ensure that these additional -nspkg.pths are uninstalled too)

@jaraco
Copy link
Member

@jaraco jaraco commented May 22, 2016

@embray Did you have any luck working on a patch for this issue?

@jaraco
Copy link
Member

@jaraco jaraco commented Nov 6, 2016

Building on the draft in #789, and with the help of Barry, Andrew, and Eric today, I made some progress on the develop-nspkg branch. Initial testing indicates that this technique does work as long as the __init__.py doesn't exist in the namespace under development.

Some concerns still remain:

  • When pip uninstalling the package, the -nspkg.pth file isn't removed.
  • __init__.py has to be removed, which can affect find_packages. One option is to make find_packages work in this model. Another could be to possibly adapt the -nspkg.pth file to run unconditionally.
  • Should we have some tests?
@jaraco
Copy link
Member

@jaraco jaraco commented Nov 7, 2016

I've added a test that captures the basic expectation (linked above).

In the develop-nspkg-always branch, I've removed the check for __init__.py, which allows the test to pass even without removing __init__.py, but only on Python 3.2 and earlier. It appears that the mixed presence of the __init__.py (present for myns.pkgB under develop, but not present for myns.pkgA installed by pip) causes a conflict between the legacy namespace package support and the PEP 420 namespace package support. I'd like to see if there's a way to support this environment, because if it's necessary for package maintainers to remove these __init__.py files, that's likely to be difficult to support.

@jaraco
Copy link
Member

@jaraco jaraco commented Nov 13, 2016

In #789, @ncoghlan said:

While I agree that situation [where __init__.py in the develop package] isn't ideal, I believe it should still work as long as the __init__.py does the necessary dance to set its own __path__ correctly using pkg_resources or pkgutil.

And he's right. The problem is that the presence of the __init__.py file in any path representing the namespace causes the PEP 420 find/load behavior to be disabled and since #805, the -nspkg.pth behavior is also disabled, so there's nothing left to set up the namespace packages for installed packages.

If I revert the behavior introduced in #805, the tests in the develop-nspkg-always branch now pass. I think I want to write a test for #805 to demonstrate the use case and expectation.

In the meantime, can @ncoghlan or any one else imagine a way that the -nspkg.pth file could be written such that it doesn't interfere with the loading of PEP 420 namespace packages elsewhere in the PYTHONPATH? This comment provides some detail on the differences between these packages.

@jaraco
Copy link
Member

@jaraco jaraco commented Nov 13, 2016

I want to write a test for #805

I've done this and pushed the test into master.

@ncoghlan
Copy link
Member

@ncoghlan ncoghlan commented Nov 14, 2016

@jaraco Looking more closely at that, the problematic piece is this expression: types.ModuleType(%(pkg)r), and in particular the fact it doesn't set __spec__ and various other attributes the interpreter is expected to see post PEP 451 (as long as __spec__ is set, the import system can fill in most of the others, but without __spec__ some operations won't work as expected).

So in 3.5+, what you probably want to be doing is:

  1. Use https://docs.python.org/3/library/importlib.html#importlib.machinery.PathFinder.find_spec to create the namespace spec based on the values of pkg and pth.
  2. Use https://docs.python.org/3/library/importlib.html#importlib.util.module_from_spec to create the module based on that

I think the necessary parameters to replace the current types.ModuleType call in the template would look something like:

module_from_spec(PathFinder.find_spec(%(pkg)r, [os.path.dirname(p)])

The current explicit version check would then be replaced by a check for module_from_spec (which only appeared in 3.5).

If you wanted to handle 3.4 as well, then the simplest approach would likely be to extract the pieces of https://hg.python.org/cpython/file/3.5/Lib/importlib/_bootstrap.py#l570 that are applicable to _NamespaceLoader in 3.4.

@jaraco
Copy link
Member

@jaraco jaraco commented Dec 8, 2016

Thanks @ncoghlan. That's really helpful. I've tested it and it appears to be working on Python 3.5 and 3.6. The problem is now Python 3.3 and 3.4 - since they don't have the interfaces, and since this code needs to be embedded in a .pth line, it's not really feasible to backport importlib machinery.

I think it's also not viable to include the behavior in a library (like setuptools) -- because this logic runs at startup, it probably shouldn't have dependencies on any functionality not in the stdlib. I did draft an implementation using such a technique, but it would be very brittle in scenarios where setuptools is downgraded or possibly upgraded or uninstalled.

Which leads me to think that we might have specifically not to support Python 3.3 and 3.4 for this use case. Can anyone thing of a way around this issue?

@ncoghlan
Copy link
Member

@ncoghlan ncoghlan commented Dec 8, 2016

@jaraco Limiting support for this to 3.5+ only seems like a reasonable decision to me.

There were genuinely missing pieces in the initial iterations of PEP 420 and PEP 451, and they didn't get fully resolved until the new helper APIs were added in 3.5.

@jaraco jaraco closed this in 6f5dc38 Dec 11, 2016
LoicGrobol added a commit to bencrabbe/npdependency that referenced this issue Oct 8, 2020
# Changes

## Added

- setuptools machinery to make this an installable package ([b0a68b5](b0a68b5))
- console entry point for `graph_parser3.py` which is now invokable by the command `graph_parser` when the npdependnecy package is installed ([1ed4e4c](1ed4e4c))
- a version number (0.1.0), I suggest using [semver](https://semver.org/) for future versions of the package

## Changed

- The `src` dir has been renamed to `npdependency` because setuptools [makes](pypa/setuptools#250) everything harder when a package's parent directory is not named after the package
- The readme has been changed to reflect the changes here and now advises to install this as a package (for now using pip and the git http url, but we should probably release it on Pypi at some point) and use the console entry point instead of calling the script directly (although that is still possible, although the path using `src` have to be rewritten, see above)
- `DependencyDataset` is now in `deptree` rather than directly in `graph_parser3`

## Removed

- The Profiterole and SPMRL stuff, which is not actually related (I think?) to the parser. It still lives in the git history anyway, but we could make it a dedicated repo (in fact, I already have a repo for the MCVF conversion stuff somewhere with improvements)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants