PackagesNotFoundError when setting up virtual environment #19

Closed
1 task done
banksad opened this issue Dec 29, 2021 · 7 comments
Labels
bug Something isn't working

Comments

@banksad

banksad commented Dec 29, 2021

When creating the environment from environment.yml in the Anaconda Prompt, I get the following:

PackagesNotFoundError: The following packages are not available from current channels:

  - datatable

Current channels:

  - https://conda.anaconda.org/oxfordcontrol/win-64
  - https://conda.anaconda.org/oxfordcontrol/noarch
  - http://conda.anaconda.org/gurobi/win-64
  - http://conda.anaconda.org/gurobi/noarch
  - https://repo.anaconda.com/pkgs/main/win-64
  - https://repo.anaconda.com/pkgs/main/noarch
  - https://repo.anaconda.com/pkgs/r/win-64
  - https://repo.anaconda.com/pkgs/r/noarch
  - https://repo.anaconda.com/pkgs/msys2/win-64
  - https://repo.anaconda.com/pkgs/msys2/noarch

I'm pretty sure this is because datatable needs to be installed with pip rather than conda.

A less likely explanation could be operating system dependency (I'm on Windows 10), in which case appending --no-builds may be a solution.

I have put datatable down to the end of the pip section as below:

  - pip:
    - specification_curve
    - twopiece
    - stargazer
    - matplotlib-scalebar
    - black-nb
    - pyhdfe
    - skimpy
    - dataprep
    - graphviz
    - pygraphviz
    - ruptures
    - deadlinks
    - datatable
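The edit above can be sketched in Python on an already-parsed environment spec. This is only an illustration: the `move_to_pip` helper and the `example-env` name are hypothetical, and the spec is written as a plain dict rather than read from the real environment.yml.

```python
import re
from copy import deepcopy


def move_to_pip(env, package):
    """Return a copy of `env` with `package` moved from the conda-level
    dependencies into the nested pip section (hypothetical helper)."""
    env = deepcopy(env)
    deps = env.setdefault("dependencies", [])

    def base_name(spec):
        # Strip any version constraint such as "=1.0" or ">=1.0".
        return re.split(r"[=<>! ]", spec, maxsplit=1)[0]

    # Drop the package from the conda-level entries (plain strings only).
    deps[:] = [d for d in deps
               if not (isinstance(d, str) and base_name(d) == package)]

    # Reuse the existing {"pip": [...]} entry, or create one.
    pip_entry = next((d for d in deps if isinstance(d, dict) and "pip" in d), None)
    if pip_entry is None:
        if "pip" not in deps:
            deps.append("pip")  # pip itself has to be installed by conda first
        pip_entry = {"pip": []}
        deps.append(pip_entry)

    if package not in pip_entry["pip"]:
        pip_entry["pip"].append(package)
    return env


# Illustrative spec; the name and the package selection are assumptions.
example = {
    "name": "example-env",
    "dependencies": ["python", "pip", "datatable", {"pip": ["skimpy", "ruptures"]}],
}
fixed = move_to_pip(example, "datatable")
print(fixed["dependencies"])
```

The function works on a copy, so the original spec is left untouched and the two can be compared.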

This works, but the current set of dependencies seems to have conflicts, as I get:

Collecting package metadata (repodata.json): done
Solving environment: -
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
  • I will investigate which packages have conflicts
@aeturrell
Owner

Unfortunately this looks like a Windows issue: conda-forge has pre-built binaries of datatable for macOS and Linux only, so running the build command on Windows will fail.

More generally, the environment is a mess! This is chiefly because of the large number of packages. To make the environment more consistent and reproducible, I've thought about:

  1. cutting down the number of packages
  2. building a docker container (tried this a couple of times without success)
  3. having different environments for different parts of the book (not clear if this is supported by Jupyter Book though)
  4. using poetry instead (tried this but again little success in getting a reproducible environment)

It would be desirable to have an environment that reliably builds across Mac, Linux, and even Windows, and a pre-built Docker container that users could interact with via Binder (this option is available in the book but doesn't currently work). It looks like 1 is probably the best long-term solution—but it would be useful to know the packages that cause the inconsistency so we could work around them.

@aeturrell added the bug (Something isn't working) label Dec 30, 2021
@banksad
Author

banksad commented Dec 31, 2021

I've had a look at package conflicts and there are quite a lot of them (conflicts2.txt).

I tried installing in sections and also experimented with pip-compile and there doesn't seem to be a set of specific versions that avoid conflicts. As you say, it's pretty difficult / impossible to achieve this given the number of packages.
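Installing in sections, as mentioned above, could look roughly like the sketch below. It only builds the commands and does not run them. The `install_commands` helper, the batch size, and the package names are assumptions for illustration, not what was actually run.

```python
def install_commands(packages, batch_size=5, tool="conda"):
    """Split `packages` into batches and build one install command per batch.

    Hypothetical helper: installing batch by batch makes it easier to see
    which group of packages first triggers a solver conflict.
    """
    if tool not in ("conda", "mamba"):
        raise ValueError("tool must be 'conda' or 'mamba'")
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    commands = []
    for start in range(0, len(packages), batch_size):
        batch = packages[start:start + batch_size]
        # --yes skips the interactive confirmation prompt.
        commands.append([tool, "install", "--yes", *batch])
    return commands


# Illustrative package names only.
for cmd in install_commands(["numpy", "pandas", "matplotlib"], batch_size=2):
    print(" ".join(cmd))
```

Each command list could be passed to `subprocess.run(cmd, check=True)`, which stops at the first batch that fails to solve.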

Agree with the four options you've suggested (poetry looks interesting).

I guess in the interim it's sufficient to just have contributors install an environment as close to yours as possible - given all of the packages are being used for a small set of isolated examples.

Therefore it might be good to extract the specific versions of the packages you wrote the book with, using conda list --export > package-list.txt, so that others can install that list sequentially. At least then the book can be tested in a Windows environment built from that list, and when new packages are added or versions are upgraded, tests can be rerun from this point. What do you reckon?
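A minimal sketch of how such an export could be turned into version pins that do not depend on platform-specific builds. It assumes the usual name=version=build line format of conda list --export, and that pip-installed packages carry the build string pypi_0. The `split_export` helper and the versions in the sample are illustrative, not taken from the book's environment.

```python
def split_export(export_text):
    """Parse `conda list --export` output into (conda_specs, pip_specs).

    Build strings are dropped because they are platform-specific; packages
    whose build string is "pypi_0" are treated as pip installs (assumption).
    """
    conda_specs, pip_specs = [], []
    for line in export_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        parts = line.split("=")
        if len(parts) < 2:
            continue  # not a name=version line
        name, version = parts[0], parts[1]
        build = parts[2] if len(parts) > 2 else ""
        if build.startswith("pypi"):
            pip_specs.append(f"{name}=={version}")  # pip pin syntax
        else:
            conda_specs.append(f"{name}={version}")  # conda pin syntax
    return conda_specs, pip_specs


# Illustrative sample; versions and build strings are made up.
sample = """\
# platform: win-64
numpy=1.21.2=py39hfca59bb_0
pandas=1.3.5=py39h6214cd6_0
datatable=1.0.0=pypi_0
"""
conda_specs, pip_specs = split_export(sample)
print(conda_specs)
print(pip_specs)
```

Keeping the two lists separate matters because a pip-only package such as datatable cannot be resolved from the conda channels listed at the top of this issue.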

@aeturrell
Owner

That is indeed quite a list of inconsistencies! I've created a package-list.txt file now, so you should be able to see exact versions. I'll have a think about how to tackle approach 1.

@aeturrell
Owner

Merging #21 brings reproducible builds (at least for the most recent commit) via a Dockerfile. I've also pushed a pre-built image to the Coding for Economists Docker Hub. Be warned though, it's 12GB! Using Mamba, I managed to get the build time down to something quite sensible.

Can you check if you can build or alternatively pull down and run the image?

@banksad
Author

banksad commented Jan 9, 2022

Sounds good. I tested the Dockerfile before: it started to run fine but stalled when trying to resolve all the package conflicts.

Will test this tomorrow

@banksad
Author

banksad commented Jan 11, 2022

Success! Took about an hour

[Screenshot: Capture2]

@aeturrell
Owner

An hour is a long time, but a previous attempt took 6 hours to build, so I'm counting this as a win on both reproducibility and build time. There's a future question of reducing the number of packages to make everything a bit more agile (and perhaps allow development outside a Docker environment...), but I'll close this as the immediate concern is addressed.
