Optional requirements? #793
Comments
What would the backend package be beyond matplotlib and qt? |
Hmm. Good question. I guess it wouldn't be anything. Still, my gut says that I shouldn't wait for Qt to install if I'm never going to use Qt. Nor should it take up space on my disks if it won't be used. I'm just not sure how to accomplish that without breaking things for people who expect Qt to be installed with matplotlib. |
I'm also not sure whether this is a conda issue or an issue for the matplotlib conda package. But I'm guessing that some possible solutions include changes to conda, so I put it here. |
Another example: IPython works without the notebook stuff installed, so ipython-notebook is a separate package. And of course the notebook works fine without nbconvert. But, nbconvert needs pandoc and pygments and maybe a few other things, which are not installed as dependencies of any ipython* package. So if you want nbconvert, you have to know to get pandoc and pygments. Those could be added to the ipython-notebook package, but given how troublesome pandoc is, that's probably a poor plan. You could make an ipython-nbconvert package, but there wouldn't be any actual code in it, only a couple of dependencies. |
What you are suggesting is what we call metapackages (a package with no code, only metadata). The ipython-notebook is a metapackage. It only exists for the dependencies (and for the "app" entry point for the launcher). |
I'm personally a fan of using metapackages to solve these kinds of problems. They are already well supported, and you can actually do non-trivial things by specifying metadata in metapackages and passing off to the SAT solver. In my opinion, the only thing that's still fundamentally missing from the package metadata spec is conflicting packages. Optional dependencies are something that I've thought about before, but I've never been clear how they would actually work. The SAT dependency solver in conda tries to pick the minimal number of packages to install, meaning any optional dependency would not be installed, unless it would be already installed anyway. I guess you could require that if it is installed that it be a certain version, although that also requires #634 to be useful. There was an internal suggestion for a "provides" feature, as a way to deal with things like spyder that can depend on either pyqt or pyside. I quote my response below:
[NB: Except for the conflicting concept I mentioned above]
The idea here was to provide a way to install spyder in environments where pyqt is disallowed due to licensing restrictions. |
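The point made above, that a solver minimising the number of installed packages would never pick up an optional dependency on its own, can be illustrated with a toy model. This is a brute-force sketch, not a SAT solver and not conda's implementation; the function `minimal_install`, the package table, and the `qt-app` package are all hypothetical and exist only for this illustration.

```python
from itertools import combinations

# Hypothetical package index: hard dependencies only.
# "qt" is merely *optional* for matplotlib here, so it is not listed.
AVAILABLE = {"python", "numpy", "matplotlib", "qt", "qt-app"}
DEPENDS = {
    "matplotlib": ["python", "numpy"],
    "numpy": ["python"],
    "qt-app": ["python", "qt"],  # hypothetical package that hard-requires qt
}


def minimal_install(requested, depends=DEPENDS, available=AVAILABLE):
    """Return the smallest set of packages that contains everything
    requested and satisfies every hard dependency (brute force)."""
    requested = set(requested)
    for size in range(len(requested), len(available) + 1):
        for combo in combinations(sorted(available), size):
            chosen = set(combo)
            if not requested <= chosen:
                continue
            # Every chosen package must have all its hard deps chosen too.
            if all(set(depends.get(pkg, ())) <= chosen for pkg in chosen):
                return chosen
    return None


if __name__ == "__main__":
    # qt is left out: nothing requested depends on it.
    print(sorted(minimal_install(["matplotlib"])))
    # qt appears only because qt-app hard-requires it.
    print(sorted(minimal_install(["matplotlib", "qt-app"])))
```

In this model an optional dependency only shows up when something else forces it into the solution, which is why a version constraint on it is about the most that could be expressed without further metadata.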
I have a conda package that depends only on a few things, but I want to package it so that users get some important optional dependencies by default as well. In Debian this is possible with the "recommends" list. @asmeurer – You're saying this should be done by creating a separate conda metapackage? |
pip also has this feature; there it's called "extras". If the recommended solution for this remains to create metapackages, can this be added to the conda docs? (I couldn't find anything useful with Google.) |
Look at, for instance, how |
It would be great to have the matplotlib recipe drop the dependency on Qt. I use matplotlib purely with the agg backend, and I use it with Kivy. Neither use requires Qt, but I still get Qt when I specify matplotlib as a dependency in conda recipes. I have to go through and delete all the Qt stuff after the env is built. I think there should be a 'matplotlib' package, and a 'matplotlib-with-Qt' metapackage. |
I think there's a reasonable overhead in specifying a new yaml and building a new package for each set of optional dependencies. Yes, it can work, but I think it would be much easier to specify them in the same yaml as the required dependencies. My suggestion would be to allow optional dependencies to simply be labelled sections under the build or run requirements. The labelled sections then specify the dependencies for that label, e.g. for the hypothetical meta.yaml below:
package:
  name: mypackage
  version: 1.0.0
requirements:
  build:
    - python
  run:
    - python
    - numpy
  pyqt:
    - pyqt
  docs:
    - sphinx
    - numpydoc >= 0.5.0
  test:
    - nose
    - mock
Only python and numpy would get installed with:
to install all the optional deps you would use
...which would be equivalent to
To install mypackage with just the pyqt dependency would be
Conveniently this would also solve my own issue (#1665) where I want to be able to independently install the test deps. |
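To make the intended semantics of the labelled sections concrete, here is a minimal Python sketch of how a selection of optional groups might resolve to an install list. It only models the idea described above; the function `resolve_run_requirements` and the `"*"` wildcard for "all optional groups" are hypothetical names chosen for this illustration and do not correspond to any existing conda or conda-build behaviour.

```python
# Hypothetical model of the labelled-section proposal; not conda behaviour.
REQUIREMENTS = {
    "run": ["python", "numpy"],               # always installed
    "pyqt": ["pyqt"],                         # optional group
    "docs": ["sphinx", "numpydoc >= 0.5.0"],  # optional group
    "test": ["nose", "mock"],                 # optional group
}


def resolve_run_requirements(requirements, extras=()):
    """Return the run dependencies plus those of each selected optional group.

    `extras` is an iterable of group labels; the assumed wildcard "*"
    selects every optional group.
    """
    optional = {k: v for k, v in requirements.items() if k != "run"}
    if "*" in extras:
        selected = list(optional)
    else:
        unknown = [label for label in extras if label not in optional]
        if unknown:
            raise KeyError(f"unknown optional group(s): {unknown}")
        selected = list(extras)

    resolved = list(requirements["run"])
    for label in selected:
        for dep in optional[label]:
            if dep not in resolved:  # avoid listing a dependency twice
                resolved.append(dep)
    return resolved


if __name__ == "__main__":
    print(resolve_run_requirements(REQUIREMENTS))             # required only
    print(resolve_run_requirements(REQUIREMENTS, ["pyqt"]))   # plus pyqt group
    print(resolve_run_requirements(REQUIREMENTS, ["*"]))      # everything
```

Selecting every label explicitly gives the same result as the wildcard, which matches the equivalence described in the comment.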
Fwiw, I really like this syntax. |
I think something like this is a good idea. However, it's important to note that it depends on the underlying package being able to gracefully handle the different combinations of dependencies. It's also similar in execution to our |
IMO, this syntax
I would find it better if there were an additional way to define binary packages, each of which takes specific files while the rest goes to the main package. Like:
This would build 4 packages: mypackage-tests, mypackage-docs, mypackage-pyqt and mypackage. Each package can be installed as a normal package... In this case, the three additional packages depend on the exact version of the main package, so that updates to e.g. mypackage-pyqt will also update the main package and keep them in sync. See also the debian dir for the matplotlib Debian package, which works similarly, except that the above info is split across multiple files: https://anonscm.debian.org/cgit/python-modules/packages/matplotlib.git/tree/debian
|
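The dependency side of this split-package idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `split_package_metadata`, the dictionary layout, and the `name ==version` pin string are hypothetical, and the file-splitting part of the proposal is not modelled at all.

```python
def split_package_metadata(name, version, run_deps, optional_groups):
    """Build metadata for one main package plus one subpackage per group.

    Each subpackage pins the main package to the exact same version, so
    updating a subpackage forces the main package to be updated with it.
    """
    main = {"name": name, "version": version, "run": list(run_deps)}
    packages = [main]
    exact_pin = f"{name} =={version}"  # assumed pin syntax, for illustration
    for group, deps in optional_groups.items():
        packages.append({
            "name": f"{name}-{group}",
            "version": version,
            "run": [exact_pin] + list(deps),
        })
    return packages


if __name__ == "__main__":
    for pkg in split_package_metadata(
        "mypackage", "1.0.0",
        run_deps=["python", "numpy"],
        optional_groups={
            "tests": ["nose", "mock"],
            "docs": ["sphinx", "numpydoc >= 0.5.0"],
            "pyqt": ["pyqt"],
        },
    ):
        print(pkg["name"], pkg["run"])
```

Because everything is expressed through ordinary package names and version pins, nothing beyond the standard dependency mechanism is needed at install time.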
Upon further reflection, it seems to me that the "package feature" idea really can't be fully executed without some significant improvements to conda's internals. In particular, this will require some sort of persistent storage to save which features the user has installed, and conda currently doesn't save details like this. In fact, conda really doesn't save data about the current environment beyond the list of packages installed. We have some good ideas about what we could do if we did give environments a database of metadata to play with; this would be yet another thing. @JanSchulz's idea looks more feasible, because it relies on the standard package/dependency mechanism. In the meantime, I do think that a lot of the work here can be accomplished on the package side (i.e., without conda changes). For instance, conda packages should nominally specify their minimal set of dependencies. If there is functionality that requires additional packages, that can be offered in the documentation, and the code should be written in such a manner as to detect the presence of those dependencies and fail gracefully if they are absent. |
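On the package side, "detect the presence of those dependencies and fail gracefully" usually comes down to a guarded import. The following is a minimal sketch of that pattern using only the standard library; the helper names `optional_import` and `require_optional` and the exception class are hypothetical, not part of any particular package.

```python
import importlib


class MissingOptionalDependency(RuntimeError):
    """Raised when a feature needs a package that is not installed."""


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def require_optional(module_name, feature):
    """Import an optional dependency or fail with an actionable message."""
    module = optional_import(module_name)
    if module is None:
        raise MissingOptionalDependency(
            f"{feature} needs the optional package '{module_name}'; "
            "install it to enable this feature."
        )
    return module


if __name__ == "__main__":
    # A stdlib module stands in for an optional dependency that is present.
    json_mod = require_optional("json", "JSON export")
    print(json_mod.dumps({"ok": True}))

    # A deliberately nonexistent module stands in for a missing one.
    try:
        require_optional("no_such_module_for_this_sketch", "Qt plotting")
    except MissingOptionalDependency as exc:
        print(exc)
```

The core package stays importable without the extra dependency, and the error appears only when the user reaches for the feature that needs it.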
@mcg1969 There are two sides: IMO the only changes needed for my proposal would be on the conda-build side: instead of one package produce multiple. Conda itself wouldn't need any changes. But then a user would need to do the work, e.g. selecting the right backend. To make that easier things like alternatives ( |
See also #1696 |
I agree with the principle here, but I don't think it really works in practice. I think there are a lot of cases where a package is not strictly a dependency, yet it's something that will be wanted by 80% or 90% of users of another package. Asking each one of those users to discover and install the recommended packages separately is a big burden on them, and requires a lot of human communication via documentation at a time when people are not likely to be reading it (when they are first trying things out and getting started, not diving deep). What will most likely happen in those cases is that the "recommended" packages just won't get installed, and then users will miss out on features that most people will want. Yet if those recommended packages are listed as hard dependencies, then there is no way that the software can be installed without them, which can be a major problem for some subset of users (when there are licensing issues, as for Qt with matplotlib, when e.g. some packages aren't available on certain platforms, due to lack of binaries, or just because the dependencies are really big). NumPy, for instance, can be installed with no dependencies, yet most users will want the associated numerical libraries, and if it's installed without them they may give up on NumPy because they think it's too slow. So it would be really great if recommended packages could be installed by default, unless overridden explicitly to obtain a minimal installation. Yes, you can always solve this problem with metapackages, but that's a very heavyweight solution, multiplying the number of packages greatly. I do think it's the right approach for Matplotlib, which is already splitting into separate packages to avoid requiring Qt, but it seems like quite a burden on package maintainers in general, with the predictable result that they'll usually keep the 80% of users happy while making life very painful for the other 20%. |
Has there been any progress on this issue? We remain very frustrated when trying to make conda packages, being unable to specify optional packages without having them become strict dependencies! |
As an intermediate option, I'd love to see something like
I like that by default all batteries are included, but for people who know what they are doing, such an option would be useful for conda and pip. There does exist a --no-deps option, but that excludes everything. |
Part of conda 4.4 |
As conda 4.4 has not been released, is there some PR you can point to (or other documentation) explaining what's been implemented? |
The PR for conda is #4982. Work is now needed on conda-build and documentation. I've created an issue in conda-build to track progress there: conda/conda-build#1964 |
Hi there, thank you for your contribution to Conda! This issue has been automatically locked since it has not had recent activity after it was closed. Please open a new issue if needed. |
Sounds like an oxymoron. I started thinking about this just now when I did conda install matplotlib. I'm planning to use matplotlib exclusively in the IPython notebook, so I won't be using the Qt backend for matplotlib. However, Qt is installed because it is a dependency of matplotlib. Since Qt is so large, downloading the package took the bulk of the total install time.
Is there a way to mark that some dependencies will only be needed by some users? I'm not sure I can think of a good way to handle this sort of situation. Maybe a separate package called matplotlib-Qtbackend would be sufficient, but then some users will be quite surprised to find that installing matplotlib and Qt is not sufficient to get them a Qt backend. What if the matplotlib and Qt packages both knew to look for the other when they were installed, and if both are present, to install the backend package?
I'm just brainstorming.