Missing packages in OS X #925

Closed
dhimmel opened this issue Jul 20, 2016 · 8 comments

@dhimmel

dhimmel commented Jul 20, 2016

When trying to install this environment.yml on OS X, the following error occurs:

Solving package specifications: .
Error: Packages missing in current osx-64 channels: 
  - fontconfig 2.11.1 6
  - libsodium 1.0.10 0
  - mistune 0.7.2 py35_0
  - zeromq 4.1.4 0

Looking at zeromq on Anaconda, it appears that 4.1.4 is available for linux, but not OS X. This is causing issues for our project.

Is it expected behavior that version-specific package availability is platform dependent? We want linux, OS X, and windows users to be able to use the same environment. What is the recommended solution for these issues? It looks like #844, #855, #856, and this issue are all related.

@ilanschnell
Contributor

Not all packages are always available on all platforms. On OSX, for example, we link pyzmq statically to zeromq, so there is no reason to include the package in the distribution. Also, versions, and in particular build numbers, are in general not the same across platforms. Was the environment.yml file created on Linux, and now you want to use it on OSX?

@mingwandroid

we link pyzmq statically to zeromq, so there is no reason to include the package in the distribution

That's a Python-distribution-centric viewpoint; zeromq is a useful library in and of itself. For example, R could share the same zeromq in r-rzmq if we built it (as it stands, on OS X, I believe we build another static version for R). Also, other developers wishing to source development libraries from conda may want to use it.

Also, IMHO it would be better if our build numbers were standardized whenever possible (if I fix a bug that only affects Linux, I won't make a new build for OS X, but it would be good if the user - or a process - could work backwards to the next lowest number on their platform and be assured that it's very nearly equivalent).
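
The "work backwards" idea could look roughly like the sketch below. It is only an illustration: `nearest_lower_build` is a hypothetical helper, not something conda provides, and it assumes build numbers are plain integers and that "next lowest" means the highest build at or below the requested one.

```python
from typing import Iterable, Optional


def nearest_lower_build(requested: int, available: Iterable[int]) -> Optional[int]:
    """Return the highest available build number that is <= requested.

    Hypothetical helper for illustration only; conda does not expose this.
    Returns None when the platform has no build at or below the requested one.
    """
    candidates = [build for build in available if build <= requested]
    return max(candidates) if candidates else None


# Made-up example: build 6 was requested, but this platform only has 0, 2 and 5.
print(nearest_lower_build(6, [0, 2, 5]))
```

This only works as a guarantee of near-equivalence if build numbers are standardized across platforms, which is the premise of the suggestion above.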

@dhimmel
Author

dhimmel commented Jul 20, 2016

Was the environment.yml file created on Linux, and now you want to use it on OSX?

Yes. Our goal is to specify the environment in a way that you can develop/run on OS X, Windows, or Linux.

Are we only supposed to specify the versions of the user-facing packages (such as jupyter and pandas)? Should we be using environment.yml or requirements.txt?

@ilanschnell
Contributor

Even with more standardization, taking the environment.yml file from one architecture and trying to use it on another is not going to work in general. My recommendation is to start fresh on the new platform with Miniconda, and then add the packages you need.

@dhimmel
Author

dhimmel commented Jul 20, 2016

My recommendation is to start fresh on the new platform with Miniconda, and then add the packages you need.

I started with miniconda on linux and just specified installing jupyter, pandas, and numexpr. Now how do we make sure everyone in our project can use the same version of those packages?

Should we upload multiple environment files to GitHub, e.g. environment-osx.yml and environment-linux.yml? Or should we change our environment.yml to include only the user-facing packages? For example:

name: cognoma-cancer-data
dependencies:
- ipython=5.0.0=py35_0
- nbconvert=4.2.0=py35_0
- notebook=4.2.1=py35_0
- numexpr=2.6.0=np111py35_0
- numpy=1.11.1=py35_0
- pandas=0.18.1=np111py35_0
- python=3.5.2=0
prefix: /home/dhimmel/anaconda3/envs/cognoma-cancer-data

Should we store package versions with build information removed? We're looking for a solution that allows a diverse team of individuals to contribute to a project with the same package versions.
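
To make the "build information removed" question concrete, here is a minimal sketch of stripping the build string from each spec. The helper name `strip_build` is made up for illustration, and the sketch assumes every spec uses the single-`=` form `name=version=build` seen in the file above.

```python
def strip_build(spec: str) -> str:
    """Drop the build string from a 'name=version=build' spec.

    Specs with fewer than three '='-separated fields are returned unchanged.
    Assumes single '=' separators only (no '==' or '>=' constraints).
    """
    return "=".join(spec.split("=")[:2])


# Entries copied from the environment.yml above.
pinned = [
    "numexpr=2.6.0=np111py35_0",
    "numpy=1.11.1=py35_0",
    "pandas=0.18.1=np111py35_0",
    "python=3.5.2=0",
]

for spec in pinned:
    print("- " + strip_build(spec))
```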

dhimmel added a commit to dhimmel/cancer-data that referenced this issue Jul 22, 2016
Switch environment.yml from specifying every installed package to just listing
explicit dependencies (packages users directly import or interact with). Remove
build numbers from version specification.

Aims to address ContinuumIO/anaconda-issues#925 where an environment created on
Linux was not available on OS X.
@dhimmel
Author

dhimmel commented Jul 22, 2016

Possible workaround?

I updated our environment.yml to the following (see dhimmel/cancer-data@8fb66e9):

name: cognoma-cancer-data
dependencies:
- jupyter=1.0.0
- numexpr=2.6.0
- numpy=1.11.1
- pandas=0.18.1
- python=3.5.2

So essentially I removed build numbers and specified only user-facing packages. By "user-facing", I mean dependencies that users will explicitly interact with (like jupyter) or import (like pandas).
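
As a rough sketch of that reduction (not the actual procedure used for the commit), the following filters a full export down to a chosen set of user-facing names and drops the build field. `keep_user_facing` is a hypothetical helper, and the specs are assumed to use the `name=version=build` form.

```python
from typing import Iterable, List, Set


def keep_user_facing(specs: Iterable[str], user_facing: Set[str]) -> List[str]:
    """Keep only specs whose package name is in user_facing, as 'name=version'."""
    kept = []
    for spec in specs:
        fields = spec.split("=")
        if fields[0] in user_facing:
            # Keep name and version; drop the platform-specific build string.
            kept.append("=".join(fields[:2]))
    return kept


# A few entries in the style of a full Linux export. mistune and zeromq are
# indirect dependencies of the kind that were reported missing on OS X.
full_export = [
    "mistune=0.7.2=py35_0",
    "numpy=1.11.1=py35_0",
    "pandas=0.18.1=np111py35_0",
    "zeromq=4.1.4=0",
]

print(keep_user_facing(full_export, {"numpy", "pandas"}))
```

The set of user-facing names still has to be chosen by hand, since the export itself does not record which packages were requested explicitly.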

I expect this approach will resolve operating system compatibility issues, at the cost of some reproducibility, since the computing environments will no longer be identical. @ilanschnell and @mingwandroid, can you comment on this workaround? Do you think it's the right way forward?

@mingwandroid

We have talked of recording which packages were explicitly installed, as opposed to those that were installed as dependencies. I think knowing this and having a way to query it is roughly what you are looking for here? @kalefranz, @mcg1969, is this still something we're considering?

@dhimmel
Author

dhimmel commented Jul 22, 2016

what you are looking for here?

Taking a step back, our priority is to allow users on OS X, Linux, and Windows to contribute to our codebase, while maintaining an identical environment. Now, I think it's okay if the environment isn't 100% identical, as long as the packages people directly use are the same versions.

So it seems that there are possibly different options:

  1. Specifying different environments for each operating system. For example, we could have an environment-osx.yml and an environment-linux.yml. This seems like it will excel at reproducibility but will be a pain to maintain.
  2. Specifying only explicitly installed packages as described above. By not specifying builds and only specifying versions of fewer packages, we would be less likely to run into OS-specific availability issues.

I went with option 2 (dhimmel/cancer-data@8fb66e9) and am wondering whether that's the best approach currently?
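
For comparison, option 1 would need some way to pick the right file on each machine. A minimal sketch, assuming the per-platform file names mentioned above plus a hypothetical environment-windows.yml:

```python
import sys


def environment_file(platform: str = sys.platform) -> str:
    """Map a sys.platform value to a per-OS environment file name.

    environment-windows.yml is a hypothetical name; only the osx and linux
    file names are mentioned in this thread.
    """
    if platform.startswith("linux"):
        return "environment-linux.yml"
    if platform == "darwin":
        return "environment-osx.yml"
    if platform in ("win32", "cygwin"):
        return "environment-windows.yml"
    raise ValueError("no environment file for platform: " + platform)


print(environment_file())
```

The returned name could then be passed to `conda env create -f`. Each file would still have to be generated and updated on its own platform, which is the maintenance cost mentioned in option 1.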
