Dependencies should be installed one-at-a-time #42

Closed
pytoxbot opened this issue Sep 17, 2016 · 20 comments
Labels
area:documentation feature:change something exists already but should behave differently level:medium rought estimate that this might be neither easy nor hard to implement

Comments

@pytoxbot

It is a sad fact that many packages have a setup.py which imports other packages. This is sub-optimal, but there is unfortunately no way to avoid it given the current state of Python packaging tools. For example, any package which uses the NumPy C API has to import numpy in its setup.py in order to ask it for build information. As a result, packages such as 'scipy' and 'pandas' both have a setup-time requirement that numpy is already installed.

The official word from the 'pip' designers is that the only way to install such packages is by running multiple pip invocations: pypa/pip#25

That is, this does not (and will never) work: pip install numpy scipy pandas

This does work, and is the only supported method: pip install numpy; pip install scipy; pip install pandas

My package that I want to test with tox depends on these other packages. But when I try to run tox, it always generates the first form (which doesn't work), and there's no way to tell it to generate the second form (which does). The result is that tox cannot produce working virtualenvs for testing.

AFAICT the only advantage of the first form is that if there is some problem with the packages, pip can notice earlier and avoid trashing your python install. However, tox only ever installs into throwaway virtualenvs, so this is not really an advantage. Therefore I'd suggest that tox always and unconditionally process the deps= option by calling 'pip install' separately for each entry, and in the order they are listed in the config file.
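
The suggestion above amounts to turning one pip call into one call per entry. A minimal sketch of that idea, not tox's actual implementation: `sequential_install_commands` is a hypothetical helper name, and it only builds the argument lists instead of running pip.

```python
from typing import List


def sequential_install_commands(deps: List[str], pip: str = "pip") -> List[List[str]]:
    """Build one 'pip install' argument list per dependency, in listed order."""
    commands = []
    for dep in deps:
        dep = dep.strip()
        if not dep:
            # Skip blank entries such as empty lines in a deps= block.
            continue
        commands.append([pip, "install", dep])
    return commands


if __name__ == "__main__":
    # Prints one command per dependency, in the order given.
    for cmd in sequential_install_commands(["numpy", "scipy", "pandas"]):
        print(" ".join(cmd))
```

Keeping the order of the input list is the whole point here: numpy has to be installed before scipy's setup.py runs.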

@pytoxbot
Author

Original comment by @msabramo

Does it make sense to pursue adding an option to pip to make it "serialize" the installs?

Because:

  • This issue is not limited to just tox.
  • I am leaning lately towards not listing deps directly in tox.ini and instead being more DRY by referencing a requirements file that I already have -- e.g.:
deps = -r{toxinidir}/test.pipreq

In the above case, tox wouldn't know how to serialize and only pip would.

@pytoxbot
Author

Original comment by @kynan

@hpk42: Thanks, I'm not sure this is an ideal workaround, though: isn't it a feature to group by the actual value of the indexserver?

I've now settled on a different, less-than-ideal solution: I only list numpy in the //deps// and explicitly install //h5py// in //commands// via 'pip install h5py>=2.0.0'.

See the updated gist: https://gist.github.com/ae38982d5a53de5eaf8b

Installing //h5py// still fails, however, because I need to tell it where to look for 'mpi.h' by exporting the environment variable 'C_INCLUDE_PATH' before calling pip. I couldn't find a way of exporting environment variables in 'tox.ini' since the commands aren't run in a shell. Could someone help out? Or should I file a separate bug?

@pytoxbot
Author

Original comment by @hpk42

@fr710: Indeed, I just looked at the source and the grouping is done by the value, not the name, of the indexserver. I've fixed it in the development version; install it with "pip install -i http://pypi.testrun.org -U tox", which fixed the example you provided.

@pytoxbot
Author

Original comment by @kynan

The workaround holger mentions doesn't work for me. Despite using 2 different (yet identical) index servers, tox still generates only a single pip invocation:

{{{
cmdargs=[local('/tmp/foo/.tox/py27/bin/pip'), 'install', '-i', 'http://pypi.python.org', '--download-cache=/tmp/foo/.tox/_download', 'numpy>=1.6.0', 'h5py>=2.0.0']
}}}

This is the dummy package I've been using: https://gist.github.com/ae38982d5a53de5eaf8b

@pytoxbot
Author

Original comment by @njsmith

Fair enough. We still need some sort of reasonable declarative syntax for grouping together subsets of the deps, then. Some options:

{{{
deps1 = ...
deps2 = ...
}}}

{{{
deps = (numpy) (scipy) (nose coverage foo bar)
}}}

{{{
deps = numpy | scipy | nose coverage foo bar
}}}

(AFAICT neither () nor | are legal characters in requirements specifications, so they should be safe to use here: http://packages.python.org/distribute/pkg_resources.html#requirements-parsing)

(EDIT: originally I had {} braces instead of (), but of course that clashes with tox's variable substitution syntax, duh.)
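
To make the third option concrete, here is a rough sketch of how a `|`-separated deps value could be split into ordered groups. This is only an illustration of the proposed syntax, which tox does not implement; `split_dep_groups` is a made-up name, and the sketch assumes each requirement is written without internal whitespace (so `numpy>=1.6.0`, not `numpy >= 1.6.0`).

```python
from typing import List


def split_dep_groups(value: str) -> List[List[str]]:
    """Split 'numpy | scipy | nose coverage' into ordered groups of requirements."""
    groups = []
    for chunk in value.split("|"):
        # Assumption: requirements inside a group are separated by whitespace.
        names = chunk.split()
        if names:
            groups.append(names)
    return groups


if __name__ == "__main__":
    # Each printed group would become one pip invocation.
    for group in split_dep_groups("numpy | scipy | nose coverage foo bar"):
        print(group)
```

Each group would then be handed to a single pip call, with the groups processed left to right.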

@pytoxbot
Author

Original comment by @hpk42

I definitely want tox to grow more customizability for dep installation. If we can keep to a declarative way, I'd prefer that; see the indexserver grouping above for an example. Declarations have the advantage that they can usually remain compatible through successive new versions and features of tox. And allowing imperative ways, like a "setup_commands" as a substitute for "pip install {deps}", raises problems: for example, how does this interact with indexserver grouping?

@pytoxbot
Author

Original comment by @njsmith

On further thought, you're right -- there are situations where you have to pass multiple packages to 'pip' at the same time. Example: package "a" depends on "b", and vice-versa. Since this is a run-time dependency, not an install-time dependency, there is no problem. At the setup.py level, you have to install them sequentially, but you can do that in either order. At the 'pip' level, when you do 'pip install a' then it will automatically install b as well, so calling pip twice sequentially like 'pip install a; pip install b' will also work.

But!

If your dependencies are "a==1.0 b==1.2", then you have to pass them both to pip in a single call. If you do 'pip install a==1.0' first, then you'll get some random version of 'b', and vice-versa.

New idea, very similar to Mikhail's: how about having a 'setup_commands=' option that defaults to 'pip install {deps}', but can be overridden. This would replace current deps processing, and have the same semantics (only run at virtualenv setup time, etc.)

@pytoxbot
Author

Original comment by @kmike

Btw, nice hack with indexserver, thanks :)

@pytoxbot
Author

Original comment by @kmike

Just an idea: what about adding 'commands' to CreationConfig and allowing the execution of arbitrary commands right after virtualenv creation via some tox.ini option (something like 'post_creation_commands')?

@pytoxbot
Author

Original comment by @hpk42

If we can default to separate installs without difficulty, it's of course preferable. Tox could then guarantee to install separately in the order in which deps were specified. I can imagine a problem: consider dependencies that are specified as github/bitbucket addresses or files instead of PyPI distribution names. If they require each other then separate installs cannot work, can they?

As to syntax, it's actually possible today to define groups by using different "pypi" servers that all point to the same one, something like this:

{{{
[tox]
indexserver =
g1 = http://pypi.python.org
g2 = http://pypi.python.org

[testenv]
deps = :g1: numpy
:g2: scipy
...
}}}

This works because tox issues different pip invocations for each indexserver. We could extend this syntax to allow numbers directly, so that ":1:" does not need an "indexserver" entry but uses the default one, still maintaining the separation.
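
The grouping behaviour described above can be sketched roughly as follows. This is an illustration only, not tox's code: `group_by_indexserver` is a hypothetical name, the sketch groups by the `:name:` prefix, and it assumes the prefix is a run of word characters between two colons at the start of the line.

```python
import re
from typing import Dict, List, Optional

# Assumed shape of a prefixed dep line: ':g1: numpy'
_PREFIX = re.compile(r"^:(\w+):\s*(.+)$")


def group_by_indexserver(dep_lines: List[str]) -> Dict[Optional[str], List[str]]:
    """Group dep lines by their ':name:' prefix; unprefixed lines go under None."""
    groups: Dict[Optional[str], List[str]] = {}
    for line in dep_lines:
        line = line.strip()
        if not line:
            continue
        match = _PREFIX.match(line)
        if match:
            name, requirement = match.group(1), match.group(2)
        else:
            name, requirement = None, line
        # dicts keep insertion order, so groups appear in first-seen order.
        groups.setdefault(name, []).append(requirement)
    return groups


if __name__ == "__main__":
    for name, reqs in group_by_indexserver([":g1: numpy", ":g2: scipy", "nose"]).items():
        print(name, reqs)
```

Each resulting group would map to one pip invocation, which is what gives the ordering effect.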

On a sidenote, also telling this to myself, let's not get too pip-specific as there are people who would like to see a variant of tox runs that uses easy_install.

@pytoxbot
Author

Original comment by @njsmith

I think that with the way distutils/setuptools/distribute work, if you have two packages that depend on each other at install-time then they are actually impossible to install by any means. So I doubt you'll run into many such packages... Python package managers don't have a concept of separated install and setup phases like dpkg/rpm do.

AFAIU, pip's logic is:

  • First, ask each package for its dependencies (this requires running setup.py, which is what causes the problems)
  • Then, pick a linear order to install them in, and install them just as if by doing multiple calls to pip.

So a linear order should always be possible, since that's what pip does in any case...
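
As a sketch of the "pick a linear order" step, assuming the install-time requirements are already known as a mapping: this uses the standard-library `graphlib` module (Python 3.9+) and is only an illustration of the idea, not pip's actual resolver. `install_order` and the example mapping are made up for the illustration.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def install_order(requires: Dict[str, Set[str]]) -> List[str]:
    """Return packages ordered so that each one comes after its requirements."""
    # TopologicalSorter takes a mapping of node -> set of predecessors.
    return list(TopologicalSorter(requires).static_order())


if __name__ == "__main__":
    # numpy is needed at setup time by both scipy and pandas.
    print(install_order({"scipy": {"numpy"}, "pandas": {"numpy"}, "numpy": set()}))
    try:
        install_order({"a": {"b"}, "b": {"a"}})
    except CycleError:
        print("a and b need each other at install time: no valid order")
```

The cycle case corresponds to the point above: two packages that need each other at install time have no valid linear order at all.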

Now, there are probably tox.ini files in the wild that don't have dependencies listed in the proper order. I think in that case pip will just do its standard dependency resolution anyway, though -- if 'a' depends on 'b', and you do 'pip install a; pip install b' then the first call will install both 'a' and 'b', and the second call will be a no-op. No big deal.

Still, if you want to be conservative, maybe the easiest way would be to have an install_deps_sequentially=False|True option?

(With the 'first:' syntax, you may also have some trouble because ':' can occur in dependency names -- e.g. I have a dependency that is just an http://... URL.)

@pytoxbot
Author

Original comment by @kmike

NLTK test suite has the same issue and a workaround: https://github.com/nltk/nltk/blob/master/tox.ini
Better support for this would be much appreciated!

{{{
deps = first:numpy
scipy
...
}}}

How to make a rule "install numpy first, scipy second and then all other packages" with this syntax?

@pytoxbot
Author

Original comment by @hpk42

Congrats on the nice issue number :)

I think we will need a new option to allow for separation of dep-installs because other configurations may require the current default of installing multiple deps in one go (they might require each other).

Maybe "dep_install = separate"?
Or we could invent a qualifier for a dep-specification like this:
{{{

deps = first:numpy
scipy
...
}}}

It more directly expresses the need to install numpy first, and I'd slightly prefer this implementation.
What do you think?

@mbauman

mbauman commented Mar 17, 2017

Any progress or work-around for this issue? As far as I'm aware, this makes it impossible to test projects with such dependencies with tox on CI services.

Could a simple solution be to allow a list of install_commands?

@obestwalter
Member

obestwalter commented Mar 17, 2017

Hi @mbauman,

> Any progress or work-around for this issue? As far as I'm aware, this makes it impossible to test projects with such dependencies with tox on CI services.

Impossible it is not. The righteous way you only need to know.

[testenv]
deps = 
    pytest
    # whatever else where order does not matter
    
commands =
    pip install <packages which need to be installed before other packages>
    pip install <other packages>        
    # now do your actual testing ...
    py.test tests/unit

See SO.

BTW, pip install numpy scipy pandas works nowadays, so packages are getting their act together, which is much better than implementing workarounds in, e.g., tox. You can do it as described above, and maybe it would help if it were documented better ... attaching a label for this.

I would actually tend to close this as wontfix ..

@mbauman

mbauman commented Mar 17, 2017

Thanks for the response. Yes, that works well enough in many cases. Where it falls short, however, is if you've listed dependencies within setup.py. This case is harder because it now fails during the installation of the package itself into the virtual environment. Given that {package} is not available within the commands list, it seems like the best workaround is to create a custom script to call from install_command that just does pip install numpy && pip install {opts} {packages}. See the patch in dssg/pgdedupe#17 for a concrete example.
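
For readers who can't follow the linked patch, a wrapper along these lines might look roughly like the sketch below. It is not the script from that patch; the file name, `BOOTSTRAP`, and `build_commands` are made up for illustration, and it assumes numpy is the only package that must be present first.

```python
import subprocess
import sys
from typing import List

# Assumption: numpy is the only package that must be installed before the rest.
BOOTSTRAP = ["numpy"]


def build_commands(args: List[str]) -> List[List[str]]:
    """Return the pip invocations to run: bootstrap packages first, then tox's arguments."""
    pip = [sys.executable, "-m", "pip", "install"]
    return [pip + BOOTSTRAP, pip + list(args)]


def main(argv: List[str]) -> int:
    for command in build_commands(argv):
        # check_call raises CalledProcessError if pip exits with a non-zero status.
        subprocess.check_call(command)
    return 0


# Only run pip when arguments were actually passed (i.e. when called by tox).
if __name__ == "__main__" and sys.argv[1:]:
    sys.exit(main(sys.argv[1:]))
```

Saved as, say, `install_numpy_first.py` next to tox.ini, it would be wired up with something like `install_command = python {toxinidir}/install_numpy_first.py {opts} {packages}` in the `[testenv]` section.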

@obestwalter
Member

obestwalter commented Mar 17, 2017

> This case is harder because it now fails during the installation of the package itself into the virtual environment.

But then it fails independently of tox. Then it just doesn't work on install either unless you have numpy installed already, or am I missing something?

@mbauman

mbauman commented Mar 17, 2017

That's correct. In practice, though, this isn't a big deal. Nearly all of our users/environments have numpy already installed, and if they don't, the error is immediate and has a very obvious resolution. Sure, it's not ideal, but it's better than requiring the users to manually track down and install all the dependencies. I'm well aware that the real solution is to change setup.py in the problematic dependency, but it is not a very open or accessible open source project (fastcluster).

@obestwalter
Member

obestwalter commented Mar 17, 2017

@mbauman I understand. Well, for those cases we have the above mentioned workaround in tox.

@obestwalter obestwalter added area:documentation feature:change something exists already but should behave differently level:medium rought estimate that this might be neither easy nor hard to implement and removed documentation labels Sep 3, 2017
@gaborbernat
Member

The use case here is mostly packages missing their build dependencies. The solution to these cases is to migrate them to pyproject.toml and use build requires. In an effort to encourage this best practice while not complicating our code base, we'll close this as won't do.

@tox-dev tox-dev locked and limited conversation to collaborators Jan 14, 2021