Discussion: Problems/Solutions when using conda for reproducible environments. #2997

Open
cpaulik opened this issue Jul 7, 2016 · 6 comments

@cpaulik cpaulik commented Jul 7, 2016

I'd like to open a discussion about issues that need to be solved to make conda a real solution for producing reproducible environments.

A few of the issues that I noticed are:

  1. Different conda versions have different package resolution strategies. As a result, an environment.yml file can stop working with a newer conda version. This might be solved by including the conda version in the YAML file. But AFAIK it is not possible to have multiple conda versions in one miniconda/anaconda installation.
  2. It seems that older conda versions cannot be used after some time since the package format is still changing. See e.g. #1642. But this might only apply to new packages, so please correct me if that's wrong.
  3. Conda and conda-env have to work together and work with channels. This is already covered in #2800 and other issues.
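
To make the idea in item 1 concrete, here is a minimal sketch of recording the conda version alongside an environment file and checking it later. It assumes a hypothetical convention, not an existing conda feature: the version is stored in a leading `# conda_version: X.Y.Z` comment so that conda ignores it when reading the YAML. The function names and the example file contents are illustrative only.

```python
import re

# Hypothetical convention (not a conda feature): the conda version used to
# write the file is kept in a leading comment, so conda ignores it.
HEADER = re.compile(r"^#\s*conda_version:\s*(\d+(?:\.\d+)*)\s*$", re.MULTILINE)


def recorded_conda_version(env_text):
    """Return the version from the '# conda_version: X.Y.Z' header, or None."""
    match = HEADER.search(env_text)
    return match.group(1) if match else None


def parse_conda_version_output(output):
    """Extract 'X.Y.Z' from text such as the 'conda X.Y.Z' line printed by
    `conda --version`."""
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    return match.group(1) if match else None


def versions_match(env_text, conda_version_output):
    """True if the recorded version equals the running one, False if not.

    Returns None when the file carries no recorded version.
    """
    recorded = recorded_conda_version(env_text)
    if recorded is None:
        return None
    return recorded == parse_conda_version_output(conda_version_output)


# Made-up file contents, for illustration only.
EXAMPLE_ENV = """\
# conda_version: 4.1.6
name: demo
dependencies:
  - numpy=1.11
"""

if __name__ == "__main__":
    print(versions_match(EXAMPLE_ENV, "conda 4.1.6"))
    print(versions_match(EXAMPLE_ENV, "conda 4.2.0"))
```

A check like this only detects a mismatch. It does not address the second half of item 1, namely that a single installation cannot hold several conda versions.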

This is related to a discussion started in conda-forge/openblas-feedstock#10

@ocefpaf and others, please add other problems or solutions to my issues in this thread.

@conda-maintainers If you think there is a better place to discuss this "meta"-issue then please feel free to redirect this to the proper channels.

Member

@kalefranz kalefranz commented Jul 7, 2016

Happy to have the discussion here!

@ocefpaf ocefpaf commented Jul 7, 2016

> 1. Different conda versions have different package resolution strategies. Because of this it can happen that an environment.yml file stops working with a newer conda version. This might be solved by including the conda version in the YAML file. But AFAIK it is not possible to have multiple conda's in one miniconda/anaconda installation.

I guess it is hard to rely on conda for that specific issue b/c we are asking the environment.yml, that conda will create, to manage conda's version. Not sure if that is solvable 😕

For cases like that I rely on docker or a two-step install where I pin conda before creating the environment.yml.
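
One possible reading of the two-step install, sketched in Python: first pin conda to an exact version, then let that pinned conda create the environment from the file. The helper names and the version number are assumptions made for illustration. The commands are only printed unless `dry_run=False` is passed, and the first command assumes the root (base) installation is the active one.

```python
import subprocess


def two_step_commands(conda_version, env_file="environment.yml"):
    """Build the two commands: pin conda first, then create the environment."""
    return [
        # Step 1: install one exact conda version into the active installation.
        ["conda", "install", "--yes", "conda={}".format(conda_version)],
        # Step 2: let that pinned conda resolve and create the environment.
        ["conda", "env", "create", "--file", env_file],
    ]


def run_two_step(conda_version, env_file="environment.yml", dry_run=True):
    """Print the commands, and execute them only when dry_run is False."""
    commands = two_step_commands(conda_version, env_file)
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            subprocess.check_call(command)
    return commands


if __name__ == "__main__":
    # The version number is chosen only for illustration.
    run_two_step("4.1.6")
```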

I cannot say much about 2 and 3 as I haven't stumbled upon them (yet).

Author

@cpaulik cpaulik commented Jul 8, 2016

> I guess it is hard to rely on conda for that specific issue b/c we are asking the environment.yml, that conda will create, to manage conda's version. Not sure if that is solvable

I think I've heard that the long-term goal for conda is to be installable via pip and not depend on a miniconda installation. If this is still the goal, then it should be possible for conda to bootstrap itself into an environment and then install the rest of the packages in the .yml file.

Author

@cpaulik cpaulik commented Jul 8, 2016

Is a conda environment truly separated from the system environment? My guess would be that if e.g. a C library is not available through conda then the system library is used.

Contributor

@jakirkham jakirkham commented Jul 12, 2016

> Is a conda environment truly separated from the system environment?

Nope. In some cases you don't want true separation anyways. For example, we want to use system libraries such as glibc on Linux or libSystem on Mac, and we do not want to package them. Similarly, there are some other libraries that are tied to the system, and it would be a bad idea for us to package them or otherwise not use the system ones. Though I do get the drift of your question and note that you have a point.

> My guess would be that if e.g. a C library is not available through conda then the system library is used.

Yes, this can happen and is known to cause problems in some cases. It's not clear to me whether those problems are easily solved in conda-build itself or whether it is simply up to packagers to remain vigilant. I have tended to accept the latter as the status quo and thus appreciate things like conda-forge to keep myself and others honest w.r.t. using conda packages and not relying heavily on the system. Additionally, tools like `conda inspect linkages` are a good way to see what libraries are linked to by a package. This can help ensure the right things are packaged.
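
As a rough illustration of that kind of check, the sketch below filters `ldd`-style output on Linux for libraries that resolve to paths outside an environment prefix. It is not how `conda inspect linkages` works internally. The function name, the sample output, and the prefix path are all made up for the example.

```python
import posixpath


def libraries_outside_prefix(ldd_output, prefix):
    """Return {library name: resolved path} for libraries found outside prefix.

    Only lines of the form 'name => /path (address)' are considered. Entries
    without a resolved path (linux-vdso, 'not found') are skipped.
    """
    prefix = posixpath.normpath(prefix) + "/"
    outside = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue
        name, _, rest = line.partition("=>")
        fields = rest.split()
        if not fields or not fields[0].startswith("/"):
            continue  # e.g. 'libfoo.so => not found'
        path = fields[0]
        if not path.startswith(prefix):
            outside[name.strip()] = path
    return outside


# Made-up sample in the format printed by `ldd`, for illustration only.
SAMPLE_LDD = """\
\tlinux-vdso.so.1 (0x00007ffd0a5f2000)
\tlibz.so.1 => /home/user/miniconda/envs/demo/lib/libz.so.1 (0x00007f1c2a000000)
\tlibopenblas.so.0 => /usr/lib/libopenblas.so.0 (0x00007f1c29000000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c28000000)
"""

if __name__ == "__main__":
    found = libraries_outside_prefix(SAMPLE_LDD, "/home/user/miniconda/envs/demo")
    for name, path in found.items():
        print(name, "->", path)
```

In this sample, `libc.so.6` shows up as a system library, which matches the point above that glibc is deliberately not packaged. An entry like the system `libopenblas` is the kind a packager would want to look at more closely.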

@ChrisBarker-NOAA ChrisBarker-NOAA commented Aug 8, 2016

with regards to:

"""
Different conda versions have different package resolution strategies. Because of this it can happen that an environment.yml file stops working with a newer conda version.

...
This might be solved by including the conda version in the YAML file.
"""

As conda is getting more traction, it's time for the project to pay serious attention to backwards compatibility. Ideally, you would always be able to install an environment.yaml file generated with an older conda.

I don't understand the issue in this case -- but if the conda version used to create the environment.yaml file were preserved, then newer versions of conda would know which "strategy" to use to re-install the environment. So you wouldn't need to actually use an older version of conda.
