Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

setup.py sys calls break in a pip install #4474

Closed
malina-kirn opened this issue Jan 27, 2015 · 32 comments
Closed

setup.py sys calls break in a pip install #4474

malina-kirn opened this issue Jan 27, 2015 · 32 comments
Labels
Build issues Issues with building from source, including different choices of architecture, compilers and OS

Comments

@malina-kirn
Copy link

I'm installing python + various packages (including scipy) using the python Chef recipe. I execute various pip install commands wrapped by the pip API provided by this Chef recipe.

Unfortunately, the setup.py file from scipy Lines 217-219 makes calls to get system arguments in an attempt to determine whether or not the setup is being run through pip. In Chef, these checks don't appear to work and the install script enters the else conditional. The else conditional imports numpy on Line 237. Since I'm using pip to install everything, sometimes numpy is installed prior to scipy setup.py being called, sometimes it isn't. Consequently, I get intermittent installation failures when installing scipy.

I don't fully understand the motivation behind the else block, but assume it's in there for a good reason.

The conditional statement in setup.py Lines 217-219 is terribly hacky. I don't know if there's a better way to determine if setup.py is being called from pip - the only similarly hacky workaround I can think of is to step through the process IDs of the current process and its parents to see if any of them contain pip. Yuck. Any thoughts on how to resolve this?

@pv
Copy link
Member

pv commented Jan 27, 2015

Please post here what is the content of sys.argv in the failing case.

@rgommers
Copy link
Member

I assume it's these lines, which start at line 217 in current master:

if len(sys.argv) >= 2 and ('--help' in sys.argv[1:] or
        sys.argv[1] in ('--help-commands', 'egg_info', '--version',
                        'clean')):

It's the least hacky thing we came up with so far to cover all the use-cases for distutils, setuptools and pip - which is not easy. And it's not that bad.

A full traceback here would be helpful. As would a more detailed explanation of how to reproduce the issue.

@malina-kirn
Copy link
Author

@rgommers: Woops, sorry, was looking at line numbers for the release I've been using. I updated my original post with the correct line numbers. That is the code block in question.

@pv I'll do my best to pull sys.argv info out of Chef and echo back.

@malina-kirn
Copy link
Author

Digging in, I was able to isolate this to an issue associated with utilizing the pip install -e option (unrelated to Chef). When this option is utilized, setup.py is called twice with two different argvs: ['-c', 'egg_info'] and ['-c', 'develop', '--no-deps']. This can be reproduced by calling:

pip install -e git+https://github.com/scipy/scipy.git@master#egg=scipy

You can see the else clause is entered on the second execution of setup.py.

I am not installing scipy itself via git, but I am installing my own packages that depend on scipy using the -e flag. I'm speculating, but I would guess this also causes all dependencies of my internal package to also be installed with these arguments.

I just happened to hit this error while running Chef, but I suspect Chef is entirely unrelated.

Anyway, I suspect the fix is to modify the conditional to include develop as one of the setup.py arguments that could originate from pip.

@rgommers
Copy link
Member

I can replace this. With numpy install things work, but in a clean virtualenv with --no-site-packages the above command fails. pip is doing something very ugly here, as can be seen from the -c in sys.argv. This is what fails:

python -c "import setuptools, tokenize; __file__='/home/rgommers/Code/tmp/tmp_pip_e/src/scipy/setup.py'; exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" develop --no-deps:

@rgommers
Copy link
Member

python setup.py develop indeed doesn't work now; python setupegg.py develop does. We could fix that quite easily it looks like - I'll have a go at it.

@rgommers
Copy link
Member

Probably fixed by gh-4475.

EDIT: eh, probably not, since it fails on importing numpy. But let's get that PR merged first.

@rgommers rgommers added the Build issues Issues with building from source, including different choices of architecture, compilers and OS label Jan 27, 2015
@pv
Copy link
Member

pv commented Jan 27, 2015

Is this actually even fixable on our side? setup.py develop needs to
build Scipy, but that requires numpy.distutils.
.
The "setup.py egg_info", which succeeds here, should have already
informed pip that Scipy requires having Numpy installed.
.
This looks more like a pip problem to me.

@rgommers
Copy link
Member

@pv you're probably right. It's not easy to see from the output what exact commands pip uses, but it looks like it messes up. The setup_requires argument is passed on correctly by Scipy.

@pv
Copy link
Member

pv commented Jan 27, 2015

Not related to this issue, but:
.
We might actually even want to remove setup_requires, since it doesn't
do anything useful:
.

  • its value is not included in the *.egg-info output the egg_info
    command generates
  • it forces numpy be installed when setup() is called
  • but at the point setup() would be called for anything else than
    egg_info, we have already failed to import numpy.distutils if it wasn't
    already installed
    .
    So its only effect seems to be to install Numpy when running egg_info,
    but this installation goes to some temporary location rather than into
    the system. It seems to me it just trigger building numpy a second time
    when installing scipy, completely without any usefulness.

@rgommers
Copy link
Member

I'm pretty sure that it used to be necessary. The description of setup_requires and install_requires on http://pythonhosted.org/setuptools/setuptools.html#new-and-changed-setup-keywords confirms that setup_requires is the one that guarantees that numpy is available at build time.

Numpy will be built twice in some (or all) cases, that's sub-optimal. But without it there's no guarantee that numpy is built and installed into a location visible for scipy before scipy is built.

@rgommers
Copy link
Member

If your analysis is correct, that looks like a setuptools bug.

@malina-kirn
Copy link
Author

Thanks @rgommers, happy to test if you like.

@pv
Copy link
Member

pv commented Jan 27, 2015

@rgommers: note that the page says "setuptools will attempt to obtain these (even going so far as to download them using EasyInstall) before processing the rest of the setup script or command" --- it appears to me that the whole setup_requires stuff is designed only to deal with dependencies arising after setup() is called, so it will not help us here when our setup() comes from numpy. The egg-info stuff produced by setup.py egg_info is identical regardless of whether setup_requires is present or not --- egg info writing entry point for writing out setup-requirements is missing from this list: https://bitbucket.org/pypa/setuptools/src/9db269094f1dd2b79591bdfd95f1e759b647acda/setup.py?at=default#cl-126 Moreover, even if it was present, I don't think pip understands it :(

@malina-kirn
Copy link
Author

@rgommers: comments on your PR, but it looks like the change to add develop may have gone in the wrong place?

@pv
Copy link
Member

pv commented Jan 27, 2015

@malina-kirn: this issue is a bug in pip --- it should respect setup_requires, but it does not.
Not really possible to fix it by changing setup.py in scipy

@rgommers
Copy link
Member

Hmm, I'll go back and look at why adding the setup_requires fixed the install order for Buildout then - that was pretty much the same issue.

@rgommers
Copy link
Member

That issue was gh-3566. I think it does work. Try:

$ virtualenv issue4474 --no-site-packages   # has pip/setuptools, not numpy
$ source issue4474/bin/activate
$ cd path_to_scipy_repo
$ git clean -xdf
$ python setup.py egg_info    # this will actually build a numpy egg (?!)
$ ls
...
numpy-1.9.1-py2.7-linux-i686.egg  scipy.egg-info

When removing setup_requires from setup.py, then egg_info doesn't build anything. So this does work for tools that first run egg_info and then install. It very ugly though that egg_info starts to build dependencies.

@rgommers
Copy link
Member

It's sort of documented here (at least it vaguely mentions setup_requires): https://pythonhosted.org/setuptools/setuptools.html#egg-info-create-egg-metadata-and-set-build-tags

@rgommers
Copy link
Member

And building numpy twice cannot be avoided if you want to build all dependencies in an isolated environment and install them in random order. At least buildout does that by design.

@rgommers
Copy link
Member

Conclusion: I think my PR fixes this issue, first setup.py egg_info followed by setup.py develop should work.

@pv
Copy link
Member

pv commented Jan 28, 2015

@rgommers: That the numpy egg is in the current directory does not appear to make
it available for import. The following does not work in virtualenv --no-site-packages
.
virtualenv --no-site-packages /tmp/env
. /tmp/env/bin activate
python setup.py egg_info # <- builds numpy egg
python setup.py install # <- fails to import numpy
python setup.py develop # <- fails to import numpy
.
My log is at: https://gist.github.com/anonymous/fd73d85c8b2737d5ad92
Same result with python setupegg.py and with gh-4475.
pip install -e git+https://github.com/rgommers/scipy.git@setup-py-fixes#egg=scipy also fails with numpy import in setup.py develop.
If you tried it and it works for you, I don't understand what is the
difference in the setup.

@malina-kirn
Copy link
Author

I see yet another behavior. I can reproduce @pv results with python setup.py, but not with pip install -e. Note: I use virtualenvwrapper, which turns off site packages by default.

Using python setup.py:

mkvirtualenv scipy1
python setup.py egg_enfo # works
pip list # shows numpy is not installed
python setup.py install # fails to import numpy
python setup.py develop # fails to import numpy

Using pip -e:

mkvirtualenv scipy2
pip install -e git+https://github.com/rgommers/scipy.git@setup-py-fixes#egg=scipy # fails on generate_cython since I don't have Cython on my system, but gets past the import numpy
pip list # shows numpy installed

I'm not sure if the second installation mechanism is supposed to call generate_cython in this scenario or not. Certainly I don't need Cython for a typical install of scipy (pip install scipy).

@pv
Copy link
Member

pv commented Jan 28, 2015

@malina-kirn: you mentioned earlier that the failures are sporadic --- the question is whether pip says (i) Installing collected packages: scipy, numpy or (ii) Installing collected packages: numpy, scipy.
If I give the URL, it prints (i) and fails as above. If I instead do pip install -e /path/to/scipy and give location of the local git repository instead of URL, it prints (ii) and succeeds until the Cython error.

Which did it do for you? Also, what is your pip --version? (Mine is 1.5.6)

@rgommers
Copy link
Member

@pv you're correct, I also see that Installing collected packages: scipy, numpy still fails. It looks like setup_requires still does something - triggers the numpy egg build, which cannot be imported. I guess that means that setuptools forgets to add the place where it puts the numpy egg to PYTHONPATH.

Don't know if this is a regression or if it never worked. There's a lot of issues for setup_requires in the setuptools bug tracker. I remember looking at https://bitbucket.org/pypa/setuptools/issue/209/setup_requires-and-install_requires-dont before. https://bitbucket.org/pypa/setuptools/issue/293/make-it-possible-for-pip-to-take-control also seems relevant.

@rgommers
Copy link
Member

I'm going to give up on this one - enough time wasted on poorly implemented setuptools features.

@malina-kirn
Copy link
Author

@pv Ah, indeed, it's undoubtedly an intermittent failure when using pip install -e. I haven't been able to reproduce it on my command line, but my Chef script hits it consistently. The Chef script indirectly installs scipy as a dependency of my own package, which I install via pip install -e. Likely a deterministic but non-ordered operation. I'm also using pip 1.5.6 on my laptop and Chef is configured to always get the latest pip via get-pip.py.

I've put in a workaround in my VM installation by installing a version of numpy prior to installing my own package. If my package states a newer dependency on scipy and/or numpy, it will get updated. The only time this could cause problems is if the scipy setup.py file needs a newer version of numpy than I've referenced in the system install. But the numpy.distutils import has been in setup.py for a while, so it's unlikely I'll need to keep updating the Chef recipe with newer numpy releases to support the scipy install.

In the meantime, I would like to file a bug against setuptools, but am not sufficiently-versed in setuptools to continue with follow-ups on the ticket. @pv, would you be willing to file this bug? I'd like to follow, but probably could not meaningfully contribute.

@rgommers thanks for the attempt, I appreciate your attention to this matter. Would you like me to close this ticket - or would you like to keep open and/or close yourself?

@malina-kirn
Copy link
Author

Update: my Chef VM reports pip --version of 6.0.6. Wow, big jump from 1.5 to 6.0! https://pip.pypa.io/en/latest/news.html Wonder if this is specific to pip 6? I wonder if this will always break in pip 6?

@pv pv changed the title setup.py sys calls break in a Chef pip install setup.py sys calls break in a pip install Jan 29, 2015
@pv
Copy link
Member

pv commented Jan 29, 2015

@malina-kirn: The installation order indeed was changed in pip 6.0.x, see pypa/pip#1934 As to why it still fails: perhaps, you listed scipy before numpy (or vice versa?) in the install_requires of your own package? (setup_requires is not relevant for pip as noted above.) It looks like pip will just blindly install stuff in the reverse order it comes upon it, without trying to resolve dependencies.

@malina-kirn
Copy link
Author

@pv: I've never had much luck changing the order of packages in install_requires. As far as I can tell, pip and/or setuptools never respects that ordering. I've tried with numpy before scipy and vice versa, no joy. My pip --version on my VM now reports pip 6.0.7, since they just released an update. Thankfully my manual workaround still works, so I've got a patch on this for now. The situation in pypa/pip/issues/2381 looks a little bleak. FYI: this is an Ubuntu 12.04 VM, so a wheel wouldn't help.

I suspect pypa/pip/issues/1934 isn't related, since that ticket addresses issues in install_requires, whereas I think the problem here is related to pip not respecting setup_requires.

@pv
Copy link
Member

pv commented Jan 29, 2015

The present issue can be solved by making either install_requires or
setup_requires to behave correctly. Currently, the problem is that
setup_requires does not do anything useful (for us), and install_requires does
not result in pip installing things in dependency order.

@malina-kirn
Copy link
Author

Ah, I understand now. Thanks for the clarification.

I tried digging around in that fix in pypa/pip#1934. Just reversing the requirements order doesn't seem to cover all use cases, as far as I can tell. On a hunch, I removed numpy from my own install_requires, to see if relying on scipy's install_requires fixed the issue. That worked. I'm guessing, but I think by putting numpy in my own install_requires, both scipy and numpy end up being put in the same set of requirements at the same 'level.' Whereas, if I rely on scipy to install numpy for me, I suspect that my own dependency on scipy gets called after scipy's dependency on numpy is called.

I'm not a huge fan of relying on transitive package dependencies to install packages that I actually need for my own code. But this does seem like a viable workaround.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build issues Issues with building from source, including different choices of architecture, compilers and OS
Projects
None yet
Development

No branches or pull requests

4 participants