Conda installer should clean up unneeded dependencies #232

Closed
elpres opened this issue Dec 18, 2014 · 17 comments

Comments

@elpres

elpres commented Dec 18, 2014

The installer should keep track of which packages were installed by the user and which as dependencies and delete dependencies when they aren't needed any more. For example, numba used to have llvm and llvmmath as dependencies. Now both have been replaced by llvmlite, but after the upgrade they still stick around even though they aren't needed by any other package and weren't explicitly installed by the user.

@ilanschnell
Contributor

Good point, but what if some other program is still using the dependencies? Is removing all dependencies always what you want? I'm not sure. For example, if you install scipy, you probably write some programs which also use numpy, so is removing numpy desired when removing scipy?

@elpres
Author

elpres commented Jan 7, 2015

In this case I would say that NumPy should be kept if:

  1. the user has installed it explicitly (it was already installed before the installation of SciPy).
  2. there are other packages installed by the user which also depend on NumPy.

So the logic would be something like this: For each package that is installed as a dependency, we keep track of the packages that require it. Say, we keep that information in a table with columns "Package name" and "Required for". Whenever a package is installed as a dependency for something else, we add a row with both packages in it ("Package name": NumPy, "Required for": SciPy). Then we install PyMC which also depends on NumPy, so we add another row ("Package name": NumPy, "Required for": PyMC).

Then, if we remove a package (that was explicitly installed), we look into that table and retrieve all rows where this package is in the "Required for" column. We delete all of these rows from the table and check whether the dependencies (the values in the "Package name" column) are present in other rows of the table. If they are, then there are still more packages that depend on them, so we leave them. If there aren't, we know that this dependency has become an orphan and can be cleaned up.
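The bookkeeping described above could be sketched roughly as follows. This is only an illustration of the proposed table, not how conda stores anything: the class name DependencyTable and its methods are made up for this example, and the dependency lists are passed in by hand rather than read from package metadata.

```python
class DependencyTable:
    """Hypothetical table of ("Package name", "Required for") rows."""

    def __init__(self):
        self.rows = set()      # (dependency, dependent) pairs
        self.explicit = set()  # packages the user asked for by name

    def install(self, package, dependencies=()):
        # The package itself was requested explicitly; each dependency
        # gets one row recording which package required it.
        self.explicit.add(package)
        for dep in dependencies:
            self.rows.add((dep, package))

    def remove(self, package):
        """Remove an explicitly installed package and return the orphaned dependencies."""
        self.explicit.discard(package)
        # Dependencies that were recorded as required for this package.
        freed = {dep for dep, req in self.rows if req == package}
        # Delete every row where the package is in the "Required for" column.
        self.rows = {(dep, req) for dep, req in self.rows if req != package}
        # A freed dependency is kept if another row still names it
        # or if the user installed it explicitly.
        still_needed = {dep for dep, _ in self.rows}
        return sorted(d for d in freed
                      if d not in still_needed and d not in self.explicit)


table = DependencyTable()
table.install("scipy", ["numpy"])
table.install("pymc", ["numpy"])
print(table.remove("scipy"))  # [] because PyMC still requires NumPy
print(table.remove("pymc"))   # ['numpy'] is now an orphan
```

If NumPy had been installed explicitly before SciPy, remove("scipy") would also return an empty list, which matches the first condition listed earlier.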

There are more things to keep in mind to make sure this dependency table is consistent, like:

  • when a package is upgraded and its dependencies change like in the case of Numba described in the first post,
  • if a user is trying to delete a package that is still required by another one, what to do? Deny or ask to remove the dependents as well?

and so on, but I think this looks doable without being too hard. I'm running Arch Linux, and its package manager actually shows a message when a previously required package becomes unneeded. Simply asking the user what they want to do with those packages would already be a great course of action.

@rajeevyadav

I needed to install a number of additional packages for my learning project, and now I am unable to update conda: I get dependency errors, can't clean, and can't downgrade without a fresh install. When I run conda update --all, I get an "unsatisfiable package specification" error.

@asmeurer
Contributor

asmeurer commented Mar 3, 2015

conda/conda#454 is an effective dependency of this feature.

@asmeurer
Contributor

asmeurer commented Mar 3, 2015

@rajeevyadav please open a new issue, and give command line output of what happens when you run conda update conda (and also conda info).

@rajeevyadav

conda update conda doesn't give any error. When I try conda update anaconda, I get the following hint:
following combinations of packages create a conflict with the
--python 3.4*

  • anaconda

philippmuller added a commit to OpenSourceEconomics/econ-project-templates that referenced this issue May 15, 2015
…bash, update bash/batch scripts accordngly. Implement check for deletions by keeping a cached version of the installed spec in .env. Entirely checking the installed packages for removals wasn't possible due to possible duplication issues from installed dependencies. Waiting for ContinuumIO/anaconda-issues#232 seemed sensible
@jimmycallin

Any updates on this? My current way to make sure I have a clean environment to export is to remove the conda environment and manually reinstall all necessary dependencies before I run conda env export > file.yaml. I'm sure there is a better way, but this fix would nonetheless really help.

@cbarrick

For reference, see Debian's APT package manager.

APT maintains a flag in its database indicating whether a package was installed manually or automatically. If a package was installed automatically and is no longer a dependency of any installed package, it is a candidate for auto-removal. If such packages exist, the apt-get command issues a warning to the user. These packages can be removed with apt-get autoremove, or APT can be configured to search for and remove these packages every time it is invoked.

This solution is not without faults. For example, many online tutorials will give users a big command that lists the main package being installed, as well as the dependencies. This will cause the dependencies to be marked as manually installed. So there's definitely a UX issue that requires careful training of the community. This definitely applies to conda.
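To make the tutorial problem concrete, here is a small sketch of the manual/auto flag. It is a toy model under stated assumptions, not APT's or conda's actual code: the function name install, the state dictionary, and the dependency graph (spyder needing pyqt, pyqt needing qt) are all invented for illustration.

```python
def install(state, requested, depends):
    """Mark `requested` packages as manual and anything pulled in as auto.

    state maps package name -> "manual" or "auto" (hypothetical database).
    depends maps package name -> list of direct dependencies.
    """
    for name in requested:
        state[name] = "manual"  # named on the command line
    stack = list(requested)
    while stack:
        pkg = stack.pop()
        for dep in depends.get(pkg, ()):
            if dep not in state:
                state[dep] = "auto"  # only here because something needs it
                stack.append(dep)
    return state


# Made-up dependency graph for the example.
depends = {"spyder": ["pyqt"], "pyqt": ["qt"]}

# Naming only the package you want leaves its dependencies flagged auto.
print(install({}, ["spyder"], depends))
# {'spyder': 'manual', 'pyqt': 'auto', 'qt': 'auto'}

# A tutorial-style command that spells out the dependencies marks them
# all manual, so a later autoremove would never consider them.
print(install({}, ["spyder", "pyqt", "qt"], depends))
# {'spyder': 'manual', 'pyqt': 'manual', 'qt': 'manual'}
```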

This is probably my favorite feature of APT and is a big reason that I believe APT is the best package manager out there. It's definitely something conda should consider.

@raphmur

raphmur commented Jan 4, 2018

Bump

It has been quite some time, but I would like to bump this issue up!
This looks like a very good feature.
The package manager can, for each package, keep track of:

  • auto or manual install or core module
  • list of modules that depend on it

Then, removing a package completely may need several calls to "autoremove" since packages can have some sort of recursive dependency.

All in all, something like "conda remove spyder" and "conda autoremove" should for instance reset the env to its state before "conda install spyder"
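A rough sketch of why several "autoremove" passes may be needed, and how they could be repeated until nothing changes. Note that "conda autoremove" is only the command proposed in this comment, and the functions autoremove_once and autoremove, the state dictionary, and the dependency graph below are hypothetical names chosen for this example.

```python
def autoremove_once(state, depends):
    """One pass: drop auto packages that no installed package depends on."""
    needed = {dep for pkg in state for dep in depends.get(pkg, ())}
    orphans = sorted(p for p, how in state.items()
                     if how == "auto" and p not in needed)
    for p in orphans:
        del state[p]
    return orphans


def autoremove(state, depends):
    """Repeat passes until a pass removes nothing; return the removals per pass."""
    passes = []
    while True:
        removed = autoremove_once(state, depends)
        if not removed:
            return passes
        passes.append(removed)


# Made-up environment: spyder pulled in pyqt, which pulled in qt.
depends = {"spyder": ["pyqt"], "pyqt": ["qt"]}
state = {"numpy": "manual", "spyder": "manual", "pyqt": "auto", "qt": "auto"}

del state["spyder"]                # stands in for "conda remove spyder"
print(autoremove(state, depends))  # [['pyqt'], ['qt']]
print(state)                       # {'numpy': 'manual'}
```

The first pass can only remove pyqt, because qt is still required by pyqt at that point; qt becomes an orphan in the second pass. This simple loop would not clean up auto packages that depend on each other in a cycle.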

Do you think this looks doable?

@msarahan
Contributor

msarahan commented Jan 4, 2018

Conda 4.4 differentiates between manually-specified packages and dependencies. It should be possible to do this soon. CC @kalefranz @nehaljwani

@sebma

sebma commented Mar 13, 2018

@msarahan Hi, I'm looking forward to having this great feature. Please let us know when this is done.

@kalefranz

To have conda (4.4+) clean up unneeded packages use the --prune flag with an install or update command.

@sebma

sebma commented Mar 27, 2018

@kalefranz The --prune flag is missing from the --help documentation:

$ conda install --help | grep prune
$ conda -V
conda 4.5.0
$

Is this normal?

@kalefranz

kalefranz commented Mar 28, 2018 via email

@sebma

sebma commented Mar 28, 2018

@kalefranz Before I file an issue in the right GitHub repo, I have one question:

Is the --prune option available to the conda command or to the conda-env command?

@tadeu

tadeu commented Jun 8, 2018

@sebma, it's available only to the conda-env command

@sebma

sebma commented Jun 8, 2018

@tadeu Thanks.
