[ENH] Compare environments #3781

Closed
ijstokes opened this issue Nov 1, 2016 · 13 comments · Fixed by #10022
Labels: locked ([bot] locked due to inactivity); source::anaconda (created by members of Anaconda, Inc.); type::feature (request for a new feature or capability)

Comments

@ijstokes

ijstokes commented Nov 1, 2016

I find I am often in a situation where I want to know if my current Conda environment matches the environment specification of an environment.yml file or a requirements.txt file or some other specification of Conda packages. Why? Because as much as I love Conda environments, sometimes I'd like to know:

"Can I just run this in my current environment without changing anything? Because if so, then I know others can also run this thing in this same environment that I know they have access to, rather than having to complicate my life and theirs with instructions on how to augment their current environment appropriately."

So off the top of my head this would mean the ability to do something like:

conda compare environment.yml # report on delta between current environment and environment.yml
conda compare foo bar # where foo and bar are two named environments

But as it stands I end up writing some complicated conda list -e | grep statements by hand, after doing cat environment.yml.

ijstokes added the type::feature label Nov 1, 2016
@basnijholt
Contributor

This would be really great.

I often encounter problems with parallelization where some commands are executed on a Jupyterhub server and some on a cluster that it is connected to via ssh.

In order to run everything successfully, I need to ensure that the environments on both machines are the same.

Right now I do it by just comparing the outputs of conda list --export.
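A manual comparison like this can be scripted once the export text is parsed. The sketch below assumes the `name=version=build` line format that `conda list --export` prints, with `#` comment lines; the function name `parse_export` and the sample package data are made up for illustration.

```python
def parse_export(text):
    """Parse `conda list --export` style text into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the comment header conda writes.
        if not line or line.startswith("#"):
            continue
        # Each line looks like name=version=build; the build part is ignored here.
        parts = line.split("=")
        packages[parts[0]] = parts[1] if len(parts) > 1 else ""
    return packages


# Illustrative data, not taken from a real environment.
hub_env = parse_export("""# platform: linux-64
numpy=1.19.1=py38hbc911f0_0
python=3.8.5=h7579374_1
""")
cluster_env = parse_export("""# platform: linux-64
numpy=1.18.5=py38ha1c710e_0
python=3.8.5=h7579374_1
""")

print("environments match:", hub_env == cluster_env)
```

The input text could come from running `conda list -n NAME --export` on each machine and saving the output to a file.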

kalefranz added the source::anaconda label May 5, 2017
@bburns

bburns commented Dec 16, 2017

I agree this would be nice to have. I came across a bug in pytest/py and it took me a while to figure out where the problem was. A side-by-side comparison like this would be really helpful:

> conda compare root fooenv

# installed packages
                      root      fooenv
alabaster             0.7.10     
anaconda              5.0.1       
anaconda-client       1.6.5       
anaconda-navigator    1.6.9       
anaconda-project      0.8.0       
asn1crypto            0.22.0    0.22.0
astroid               1.5.3     
astropy               2.0.2       
attrs                           17.3.0
babel                 2.5.0       
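A side-by-side listing of this kind could be produced from two name-to-version mappings. This is only a sketch of the layout shown above, not an existing conda command; `side_by_side` and its arguments are hypothetical names.

```python
def side_by_side(name_a, pkgs_a, name_b, pkgs_b):
    """Render two {package: version} dicts as a two-column text table."""
    names = sorted(set(pkgs_a) | set(pkgs_b))
    name_width = max([len(n) for n in names] + [0]) + 2
    col_width = max([len(v) for v in pkgs_a.values()] + [len(name_a)]) + 2
    # Header row: blank package column, then the two environment names.
    lines = [f"{'':<{name_width}}{name_a:<{col_width}}{name_b}"]
    for n in names:
        # A package missing from one environment leaves that cell empty.
        row = f"{n:<{name_width}}{pkgs_a.get(n, ''):<{col_width}}{pkgs_b.get(n, '')}"
        lines.append(row.rstrip())
    return "\n".join(lines)


root = {"alabaster": "0.7.10", "asn1crypto": "0.22.0"}
fooenv = {"asn1crypto": "0.22.0", "attrs": "17.3.0"}
print(side_by_side("root", root, "fooenv", fooenv))
```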

@mbargull
Member

I find it quite comfortable to use a diff tool of my choice, i.e.,

$diffTool <(conda list -n env1) <(conda list -n env2)

or for environment files

$diffTool env1.yml <(conda list -en env2)

with $diffTool being any of diff, diff -y, sdiff, meld, ...
That way you can even use a tool to do 3-way compares, like

meld base.yml <(conda list -en env1) <(conda list -en env2)

IMHO, a conda compare (if needed at all) could just be a simple wrapper around a diff tool (e.g., given by a parameter --diff-tool=sdiff). So the only functionality of its own would be parsing its arguments to determine whether they are YAML files or environment names/paths.
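The argument parsing described here might look like the following sketch. The classification rules (file extension for YAML, a path separator for environment paths) are assumptions for illustration, and `classify_target` and `listing_command` are hypothetical helpers, not part of conda. The `-n` and `-p` flags are the existing `conda list` options for selecting an environment by name or by prefix.

```python
import os


def classify_target(arg):
    """Guess whether an argument is a YAML file, an env path, or an env name."""
    if arg.lower().endswith((".yml", ".yaml")):
        return "yaml-file"
    if os.sep in arg or (os.altsep and os.altsep in arg):
        return "env-path"
    return "env-name"


def listing_command(arg):
    """Return the conda command whose output would be fed to the diff tool."""
    kind = classify_target(arg)
    if kind == "yaml-file":
        return None  # the file itself is handed to the diff tool
    flag = "-n" if kind == "env-name" else "-p"
    return ["conda", "list", "--export", flag, arg]


for target in ["env1.yml", "env2", os.path.join("opt", "envs", "env3")]:
    print(target, "->", classify_target(target), listing_command(target))
```

A wrapper would then run each command, write the output to a temporary file, and call the chosen diff tool on the resulting files.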

@ChrisBarker-NOAA

I was just using diff on the output of conda list -- and it was actually pretty unhelpful. I guess the problem is that the two environments are too different -- one had a bunch of extra packages, as well as maybe some different versions of the same package -- so the diff was pretty ugly.

I'd love a tool that gave me a clean difference:

  • these packages are in a and not b
  • these packages are in b and not a
  • these packages are in both but different versions

OK -- off to write that ----

-CHB

@ChrisBarker-NOAA

ChrisBarker-NOAA commented Sep 19, 2018

DONE.

Here is a hacked-together python script that compares two conda environments:

https://gist.github.com/ChrisBarker-NOAA/00106a2af2029259ba7496f865c39086

It would be great if similar functionality could be built in to conda.

@mingwandroid
Contributor

mingwandroid commented Sep 19, 2018 via email

@ChrisBarker-NOAA

@mingwandroid: yeah, that would be nice. But would require familiarizing myself with the conda code first -- hopefully I'll have time some day, but don't hold your breath :-(

But nice to know you're open to the idea -- if anyone else has the round tuits, feel free to get ideas from my code -- I was pleasantly surprised how easy it was to use sets to get out what I wanted -- I'd hardly ever used sets for anything meaningful before ...
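The set-based idea can be sketched as follows. This is not the code from the linked gist, just a minimal illustration of how set operations yield the three categories listed earlier in the thread; `compare_envs` and the sample data are hypothetical.

```python
def compare_envs(a, b):
    """Compare two {package: version} dicts.

    Returns (only_in_a, only_in_b, changed), where changed maps a package
    present in both to its (version_in_a, version_in_b) pair.
    """
    names_a, names_b = set(a), set(b)
    only_a = sorted(names_a - names_b)  # in a and not b
    only_b = sorted(names_b - names_a)  # in b and not a
    changed = {
        name: (a[name], b[name])
        for name in sorted(names_a & names_b)  # in both
        if a[name] != b[name]
    }
    return only_a, only_b, changed


env_a = {"numpy": "1.15.1", "attrs": "17.3.0", "pytest": "3.8.0"}
env_b = {"numpy": "1.14.5", "pytest": "3.8.0", "babel": "2.5.0"}
only_a, only_b, changed = compare_envs(env_a, env_b)
print("only in a:", only_a)
print("only in b:", only_b)
print("different versions:", changed)
```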

@ClayCampaigne

I had an issue on this topic. When I typed conda info --envs, I was given a list that included two different environments with the same name, except one was capitalized. I don't know how I got myself into this predicament. I ran @ChrisBarker-NOAA's tool on these two envs, and they were identical. Here's my stupid part: I tried to delete one, and it deleted both.
This was on a Mac.

@ChrisBarker-NOAA

The Mac is weird. It has a case-insensitive but case-preserving file system.

This behaves oddly with *nix tools that are case sensitive.

I’m not sure conda can prevent these oddities.
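Whether or not conda can prevent this, names that differ only by case can at least be flagged before deleting anything. A minimal sketch, assuming the list of environment names has already been collected (for example from `conda info --envs`); `case_collisions` is a hypothetical helper.

```python
from collections import defaultdict


def case_collisions(env_names):
    """Group environment names that are equal when case is ignored."""
    groups = defaultdict(list)
    for name in env_names:
        groups[name.casefold()].append(name)
    # Keep only the groups where more than one spelling exists.
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(case_collisions(["base", "MyEnv", "myenv", "other"]))
```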

@sidhant007
Contributor

I would like to work on this issue. I encountered a similar issue to what the OP did and have written this script: https://github.com/sidhant007/DiffConda to compare between the current conda env and a particular yml config file.

OP's motivation was to be able to check if a conda environment is capable of running what is mentioned in an environment specification (a .yml file).

I think the use case that @ChrisBarker-NOAA highlighted, i.e. comparing two different conda envs, is less interesting: if the user already has both conda environments created, they will just switch to the one required. The OP's aim, by contrast, was to avoid creating the second environment when the .yml file shows that it is a subset of the current environment.

So for the first iteration (a basic version of conda compare), I would like to propose this:

  1. conda compare environment.yml - Reports packages lacking in the current environment required by environment.yml and if there are any version mismatches.
  2. conda compare -n myenv environment.yml - Same as above except in this we consider the environment as myenv

Implementation wise: It will be similar to how conda list is written, since this is also a command that won't change anything on the backend. We use package match specifications to define whether a package in the current environment "matches" with the one mentioned in the environment.yml.
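To illustrate what "matching" against a specification means, here is a deliberately simplified sketch. It is not conda's own match specification logic (conda has a `MatchSpec` class that handles far more cases, including build strings and channels); it only understands bare names, `==` exact pins, `=` prefix pins, and `>=` on numeric versions. `check_spec` and `version_key` are hypothetical names.

```python
import re

SPEC_RE = re.compile(r"^([A-Za-z0-9_.\-]+)\s*(==|>=|=)?\s*(\S*)$")


def version_key(version):
    """Turn '1.19.1' into (1, 19, 1); non-numeric parts are ignored in this sketch."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def check_spec(installed, spec):
    """Return None if `spec` is satisfied by `installed`, else a short reason."""
    match = SPEC_RE.match(spec.strip())
    if match is None:
        return f"cannot parse spec: {spec}"
    name, op, wanted = match.groups()
    have = installed.get(name)
    if have is None:
        return f"{name} is missing"
    if not wanted:
        return None  # bare package name: any installed version is accepted
    op = op or "="
    if op == "==":
        ok = have == wanted
    elif op == "=":
        # Prefix pin: "=3.8" accepts 3.8 and 3.8.x
        ok = have == wanted or have.startswith(wanted + ".")
    else:  # ">="
        ok = version_key(have) >= version_key(wanted)
    return None if ok else f"{name} {have} does not satisfy {op}{wanted}"


installed = {"numpy": "1.19.1", "python": "3.8.5"}
for spec in ["python=3.8", "numpy>=1.18", "python=3.9", "pandas"]:
    print(spec, "->", check_spec(installed, spec) or "ok")
```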

Are there any suggestions/objections to such a feature?

@ChrisBarker-NOAA

Well, what you find interesting depends on what you need to do :-) -- in my case, I knew that one of the environments I had didn't work, but I didn't know why. The problem is that if you build two environments from the same spec at different times, you get a different result if some packages have been updated and aren't pinned, or if you had an environment and added packages one by one.

Anyway, what people find interesting aside, the ability to compare environments is useful, and it would be good if it could be done from an environment file OR a live environment with the same code.

But what is being asked for now is another step: testing whether a given application can run in a given environment, which would require, as stated, using package match specifications, which means checking against a requirements file, not an environment.yaml.

Checking "does anything need to be changed in this environment to fit this spec" has got to be in conda already.

I'm pretty sure that in the general case, there is no robust way to say whether an application that can run in one environment will run in another one without knowing its requirements (unless the environments are identical) -- there's no way to know what particular version it might need otherwise. So:

"OP's motivation was to be able to check if a conda environment is capable of running what is mentioned in an environment specification (a .yml file)"

that depends on the environment specification -- is it in "match" form, which could work, or a full, everything-pinned version, which would not work in the general case.

-CHB

@sidhant007
Contributor

sidhant007 commented Jun 22, 2020

@ChrisBarker-NOAA
Agreed that it varies by use case, and some people might find comparing two envs a better option.

"I'm pretty sure that in the general case, there is no robust way to say whether an application that can run in one environment will run in another one without knowing its requirements (unless the environments are identical) -- there's no way to know what particular version it might need otherwise."

I agree with this and understand that there is no easy way to have a fool-proof check of whether an application will run or not.

I am hopeful that I or someone else can build on my PR and add the ability to compare two conda environments as well, provided the maintainers are happy with my current PR.

"that depends on the environment specification -- is it in "match" form, which could work, or a full, everything-pinned version, which would not work in the general case."

As of now, I believe that my use case and some other people's use case (in the thread above) is resolved by providing the environment specification in the "match" form.

@github-actions

Hi there, thank you for your contribution to Conda!

This issue has been automatically locked since it has not had recent activity after it was closed.

Please open a new issue if needed.

github-actions bot added the locked label Aug 20, 2021
github-actions bot locked as resolved and limited conversation to collaborators Aug 20, 2021
9 participants