
Packages listed in setup_requires are downloaded into the directory that contains setup.py #80

Closed
bb-migration opened this Issue Sep 16, 2013 · 6 comments

Comments

@bb-migration

bb-migration commented Sep 16, 2013

Originally reported by: ngrilly (Bitbucket: ngrilly, GitHub: ngrilly)


I use setup_requires in one of my projects, and was a bit surprised to find the eggs listed in setup_requires added to my working directory after having run python setup.py sdist.

This behavior is clearly documented here but is still quite unexpected.

Is there a reason for this, or is it something that can be changed?
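To illustrate the behavior being reported (the project name here is hypothetical, not from the thread), a setup.py as small as this is enough to reproduce it:

```python
# Illustrative sketch only. With a setup_requires entry, setuptools of
# this era downloads the listed eggs into the directory that contains
# setup.py before executing any command.
from setuptools import setup

setup(
    name='example-project',    # hypothetical project name
    version='0.1',
    setup_requires=['Babel'],  # fetched at setup time, cached next to setup.py
)
```

After `python setup.py sdist`, the working directory ends up containing a downloaded `Babel-*.egg` alongside the generated `dist/` directory.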


@bb-migration

bb-migration commented Sep 16, 2013

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Because those packages are necessary to set up the project (setup_requires) but not necessary to run it (install_requires), they're downloaded temporarily so they're available while setup runs. The current directory is used because it is the most relevant to the setup.py being invoked. It's independent, so it doesn't interfere with other projects, but it's persistent, so the dependencies can be reused on a subsequent run.

I can imagine the dependencies could be stored in $TEMP or ./.egg-cache or not stored at all (downloaded on each invocation).

I'm not inclined to do the last of those - in my experience, re-use of those setup-time dependencies is useful. Storage in a shared temporary directory like $TEMP has the potential for cross-project interaction and possible security concerns on non-Windows systems. I could entertain further the idea of a specific, local directory designated for setup-time dependencies.

What sort of behavior were you expecting? Do you have any suggestions on where the eggs could be stored and how they should be managed?

@bb-migration

bb-migration commented Sep 17, 2013

Original comment by ngrilly (Bitbucket: ngrilly, GitHub: ngrilly):


Hi Jason,

Thank you for following up.

I fully agree with you on:

  • not using $TEMP because of the risk of cross-project interactions;
  • persisting the dependencies somewhere in order to re-use them.

To answer your last question about the sort of behavior I was expecting: I was not expecting the command python setup.py sdist to have any side effect, and I was especially not expecting it to save a bunch of files into my project directory.

Even the command python setup.py --help triggers the download of the dependencies listed in setup_requires. This is really unexpected. It's the first time I've encountered a command-line program that has a side effect when invoked with --help.

A minimal improvement would be to group the downloaded dependencies into a tmp sub-directory, or into a directory already created by setuptools, such as build or dist.

But in my opinion, the real issue is with the feature itself. The whole idea of "downloading a dependency, using it temporarily during setup without really installing it, but keeping it somewhere so it doesn't have to be downloaded again for the next setup" sounds really hacky. That's usually the sign of a design issue.

Here is some information about my use case. I use gettext and need to compile some .po files to the corresponding .mo files. If someone downloads a source distribution of my package, which includes the .po files, but not the .mo files, I want them to be automatically compiled when the user runs python setup.py install or pip install mypackage. This is done with something like this:

from setuptools import setup
from distutils.command.build import build as _build

class build(_build):
    sub_commands = [('compile_catalog', None)] + _build.sub_commands

setup(
    # ...
    cmdclass={
        'build': build,
    },
    install_requires=['Babel', 'Genshi'],
    setup_requires=['Babel'],
)

Frankly, I will be perfectly okay with just using install_requires and removing the line about setup_requires. I don't mind if Babel is installed in my environment, even if I use it only during the setup.

The only reason I had to use setup_requires is that I need Babel to be installed before the setup.py of my package is run. If I don't specify the setup_requires clause, then Babel can be installed afterwards, and my setup script fails.

Is there a way to install Babel in the environment before running the setup script of my package, without using setup_requires?

For the sake of comparison with packaging tools in other programming languages, I think the design adopted by npm is easier to understand and feels more "orthogonal": https://npmjs.org/doc/misc/npm-scripts.html. That said, almost all my code in recent years has been Python, so I'm definitely not a node.js programmer.

Cheers

@bb-migration

bb-migration commented Sep 17, 2013

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Nicolas, those are some good examples. Thanks for elaborating.

You're right that it seems hacky, and that's because it's a rough approximation meant to cover lots of use cases. It's the only hook that provides dependency resolution early in the setup process (one of the first steps when 'setup()' is invoked). It's meant to be used for any case that customizes the 'build', 'test', or 'install' steps (or that extends the commands available). That's why the dependencies are resolved before '--help' is handled: the dependencies could theoretically extend the commands available to this setup script.

There are many cases where the downloading of temporary packages seems unwanted. See keyring #105 for another example. In your case, you want Babel to be available only if 'build' is invoked. In the Keyring case, the users wanted the dependency only available if 'ptr' was invoked. In another package I maintain, hgtools needs to be present both at build time and install time.

To answer your question directly, there is no good way currently in setuptools to indicate that a dependency should be installed to the target environment early. Indeed, even the meaning of 'target environment' isn't well-defined until later in the process.

I can think of a hack that might work. If you added something like this to your script:

import sys
# Prepend an easy_install command so Babel is installed before any of
# the requested commands run.
sys.argv[1:1] = ['easy_install', 'Babel']

setup(...)

That will inject into the stream of commands an instruction to install Babel before invoking the other commands. Again, that's hacky, may not work, and may not work in all environments (or under pip). I would recommend against it for anything but an internally-distributed app/library.

The npm model is interesting. It appears to provide a more general mechanism for a packager to supply arbitrarily complex hooks at various hook points. I believe their approach is wise and has some advantages. However, setuptools depends on (is an extension of) distutils, so to a large extent it is constrained by the limitations distutils imposes. At the same time, there's an effort underway in the Python packaging community to explicitly limit the flexibility of packagers to have arbitrary hooks or extensions (making packaging as declarative as possible, if not fully declarative). Of course, the npm-scripts documentation also discourages use of the scripts, especially where npm already provides declarative support for the same functionality. So in many ways, their approach is similar to the one we're developing. Out of curiosity, if you had the npm-scripts capability for your project, would you explicitly install Babel as a preinstall or prepublish command? Would that cause Babel to become available to the installer?

The packaging metadata work is expected to proceed on the pypa-dev list. Please consider following/joining that group or on distutils-sig.

@bb-migration

bb-migration commented Sep 17, 2013

Original comment by ngrilly (Bitbucket: ngrilly, GitHub: ngrilly):


That's why the dependencies are resolved before '--help' is handled: the dependencies could theoretically extend the commands available to this setup script.

OK, now I understand why the dependencies listed in setup_requires have to be installed even when calling setup.py --help-commands: a dependency can, for example, add a command that then has to be listed by --help-commands.
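For instance (a hypothetical sketch, not taken from this thread: the plugin name, module path, and command class are all invented), a setup-time dependency can contribute a new command through the `distutils.commands` entry-point group, which is why the full command list cannot be known until that dependency is installed:

```python
# In the *dependency's* setup.py (all names hypothetical):
from setuptools import setup

setup(
    name='example-plugin',
    version='0.1',
    entry_points={
        # Commands registered here are discovered by setuptools and
        # would appear in the output of `setup.py --help-commands`
        # for any project that depends on this plugin.
        'distutils.commands': [
            'my_command = example_plugin.commands:MyCommand',
        ],
    },
)
```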

To answer your question directly, there is no good way currently in setuptools to indicate that a dependency should be installed to the target environment early. Indeed, even the meaning of 'target environment' isn't well-defined until later in the process.

I'm not sure about this, but maybe the solution should be looked for in pip instead of setuptools. In my case, my package needs Babel anyway: during the setup, but also after the setup, when using the package. If pip were able to install packages in the correct dependency order (install Babel first because it does not depend on my package, and install my package second because it depends on Babel), then I could specify the dependency on Babel only in install_requires and would not need setup_requires. I think it would be helpful in a lot of use cases. Do you think it is possible? I had a quick look at pip's source code and it looks like dependencies are not sorted in any way before being installed, but I may be mistaken (it was really a quick look...).

Out of curiosity, if you had the npm-scripts capability for your project, would you explicitly install Babel as a preinstall or prepublish command? Would that cause Babel to become available to the installer?

Good question :) I've read again the npm documentation and in my case, if my package was written in node.js, I would:

  • Add Babel to devDependencies;
  • Use a prepublish script to execute Babel's compile_catalog.

devDependencies are installed only for people developing the package; they are not installed for end users. This is the place to declare dependencies used to build, test, and document the package. Maybe the equivalent in the Python world could be a command-line argument to setuptools indicating that we are in "dev" mode?
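One rough Python-world analogue (my assumption, not an established setuptools "dev mode"; the project name is hypothetical) is `extras_require`, which declares optional dependency groups that are installed only on request:

```python
from setuptools import setup

setup(
    name='example-project',       # hypothetical project name
    version='0.1',
    install_requires=['Genshi'],
    # Roughly analogous to npm devDependencies: installed only when
    # explicitly requested, e.g. `pip install example-project[dev]`.
    extras_require={'dev': ['Babel']},
)
```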

The prepublish script is run before building the package tarball. It can be used for any build step that is not architecture-specific, like compiling CoffeeScript/LESS, compiling .po files to .mo files, downloading i18n data from the CLDR repository, etc. The equivalent in the Python world could be data generated by the build step and added as package data to the source tarball:

setup(
    # ...
    include_package_data=True,
    package_data={'': [
        'locale/*/LC_MESSAGES/*.mo',
    ]},
    # ...
)

The packaging metadata work is expected to proceed on the pypa-dev list. Please consider following/joining that group or on distutils-sig.

I joined pypa-dev. By the way, is there any connection between projects hosted at github.com/pypa and those hosted at bitbucket.org/pypa?

@bb-migration

bb-migration commented Sep 18, 2013

Original comment by ngrilly (Bitbucket: ngrilly, GitHub: ngrilly):


I had a quick look at pip's source code and it looks like dependencies are not sorted in any way before being installed, but I may be mistaken (it was really a quick look...).

Answering my own question: pip installs packages in a deterministic order based on the order of the requirements given to it. This is clearly explained by Carl Meyer in this discussion: https://groups.google.com/forum/#!topic/python-virtualenv/Ei2nzDdO6IU

@bb-migration

bb-migration commented Oct 19, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Cache eggs required for building in .eggs dir

This makes it so that these eggs don't prevent install_requires from
installing these packages.

Fixes ticket #80; workaround for ticket #209
