Conda driven dependency management #297
|
The conda-lock file was created with:
    conda-lock \
        --mamba \
        --lockfile composed.conda-lock.yml \
        -f environment.base.yml \
        -f environment.dev.yml \
        -f environment.tutorials.yml \
        -f environment.fourc.yml \
        -p linux-64 \
        -p osx-arm64
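For illustration, the same invocation can be assembled from a list of environment files, which makes it easier to see how each layer maps onto a `-f` flag. This is only a minimal sketch: `build_lock_cmd` is a hypothetical helper, not part of QUEENS or conda-lock, and it prints the command for review instead of running it. The flags and platforms are the ones from the command above.

```shell
#!/bin/sh
# Sketch: assemble the conda-lock invocation from a list of environment files.
# build_lock_cmd is a hypothetical helper, not part of QUEENS or conda-lock.
build_lock_cmd() {
    lockfile=$1
    shift
    cmd="conda-lock --mamba --lockfile $lockfile"
    # One -f flag per environment file, in the order given.
    for env_file in "$@"; do
        cmd="$cmd -f $env_file"
    done
    # Platforms as listed in the command above.
    for platform in linux-64 osx-arm64; do
        cmd="$cmd -p $platform"
    done
    printf '%s\n' "$cmd"
}

# Print the command so it can be reviewed before it is run.
build_lock_cmd composed.conda-lock.yml \
    environment.base.yml \
    environment.dev.yml \
    environment.tutorials.yml \
    environment.fourc.yml
```

Adding or dropping an optional layer then means changing only the list of files passed to the helper.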
|
I must admit, I have some doubts about this solution. Without a requirements file, turning QUEENS into a PyPI package is more difficult and requires cumbersome manual work from the user, namely installing the requirements manually. So, after/before Currently, on Ubuntu, all the packages should be installable via pip, so that we 'only' use conda for performance reasons. I guess the main motivation for this move is to support Mac? Could you elaborate on why Mac requires conda? I did not understand this from the PR description 😅 Some questions that come to mind:
I believe most of the problems stem from frontend dependencies (iterators + ML libraries), particularly due to the large number and the lack of a clear boundary between the front and backends. To be fair, this is a remnant from the days when QUEENS was seen as a monolith, before the (mainstream) library interpretation. My gut feeling is that moving the ML libraries' related functionality into an optional package (much like 4C interfaces) would make this problem less relevant, particularly if it is not required for most users. Supporting multiple platforms is always a pain, but essential for a healthy user base. I do think this is an essential step. I do support any improvement over the current state, as it has grown into a weird limbo :) |
Thanks a lot for kicking off this discussion on the PR. Just to add some context: this PR is not coming out of the blue, even though I admit that I failed to open a proper discussion or issue for it. It builds on discussions we've had over the past weeks in our meetings. I know you haven't been able to join those recently, but for transparency, the relevant background can also be found in the meeting notes and the related issue.
The main motivation behind this change is that our current dependency workflow has effectively stalled. We haven't been able to update dependencies in a long time, and recent attempts have shown that the current setup makes this increasingly difficult. The proposed solution is meant as a pragmatic way to break this deadlock. It builds on the status quo, where we already recommend using Conda (primarily for performance reasons). So in that sense, the default user experience does not fundamentally change. While it has technically always been possible to install via pip, this has never been an officially supported or particularly reliable path.
A key observation during this process was that Conda's dependency resolver proved significantly more robust in our case. In particular, it consistently produced valid environments (something I also encountered while working on the (unofficial) macOS support), which was not the case with the pip-based setup. However, macOS support is more of a secondary benefit here; the primary goal is to get back to a maintainable and updatable state.
That said, I fully agree with your broader point: this should not be the end state. In my view, this is an intermediate step. The goal should absolutely be to support both pip install queens (via PyPI) and conda install queens. However, getting to that point requires additional restructuring and packaging work, which we are not yet set up for.
So the intention here is to first stabilize and modernize the current setup with a solution that is already working in practice and is maintainable (i.e., simple). Once that foundation is in place, we can take the next step toward a cleaner separation of dependencies (e.g., frontend vs. backend, optional ML components) and proper multi-channel distribution. For this, I was looking into pixi, which might support everything we want but would require further changes, and we did not want to commit to it yet. @gilrrei Have you heard of pixi? |
I disagree on this point on pip. The only supported way was to install QUEENS via pip in the sense of Pip-tools-conda-combi was a pragmatic solution introduced by me 🙈, with consequences later on. Sorry about that. I would therefore not call this (= conda-lock) a pragmatic solution, but rather a simplifying one. This worked with the move to dask: we reduced functionality before reintroducing it, so I won't block this, that's not my style :) But I do want to point out the consequences of the current move.
I know we were in agreement on this :) I just have a different opinion on the order for reaching the goal: fix the optional dependencies first, so the resolvers have less work to do. Of course, this is more work, so I understand why you are doing this simpler step first.
I'm not arguing against the conda/mamba/pip/pixi resolvers; these work beautifully. I'm not familiar with pixi :/ I am trying to point out that we need a good resolver because of the large number of (large) dependencies, even for simple use cases. Just to reiterate: I understand the move, but I do think this is a (small) step backward in terms of usability, hopefully only temporarily. Again, I'm not blocking this, even if I could. It's a non-hacky working solution, and if the community sees this as a necessary step, then we should go for it. In the end, this only affects the installation process, and we have the luxury of being able to change things dynamically :) |
|
Ohh, I forgot the most important point, the human one! Thanks for putting in the work and identifying a working solution for this important problem :) |
|
@sbrandstaeter sorry for the comment here but I am always super interested in Python package management and repo setup. Just one question from my side after I read all the comments with high interest. Have you also looked into https://docs.astral.sh/uv/? More and more projects are moving to it and I also plan to change a lot of stuff to use it. |
|
@davidrudlstorfer Happy to exchange ideas here — @gilrrei input pushed me to aim for a more complete endgame solution, so here’s a quick update: I explored different approaches and currently find My goal for QUEENS is to support both workflows: pip/uv as well as conda-based usage. For developers, I’d recommend pixi as the default since it integrates both cleanly while keeping everything in I’m also aiming to make Happy to keep discussing and iterating on this together. |
Description and Context:
What and Why?
This PR migrates QUEENS from the previous mixed pip-tools + conda dependency setup to a conda-first workflow.
The new setup is based on layered environment files for normal user installation with mamba and on conda-lock for reproducible CI and development environments.
The goal is that dependencies are managed centrally through conda environments instead of a parallel requirements.in / requirements.txt workflow.
Why this change
The old setup split dependency management between conda and pip-tools, which made maintenance and CI behavior harder to reason about. It also made updating the versions difficult. And the setup worked only on Linux/Ubuntu.
With this PR:
- mamba workflow
- conda-lock workflow
How to use the new setup
Minimal QUEENS installation
Use this if you only want the core QUEENS functionality:
    mamba env create -n queens -f environment.base.yml
    conda activate queens
    pip install --no-deps -e .
Full QUEENS installation
Use this if you want the full QUEENS functionality:
    mamba env create -n queens -f environment.base.yml
    mamba env update -n queens -f environment.dev.yml
    mamba env update -n queens -f environment.tutorials.yml
    mamba env update -n queens -f environment.fourc.yml
    conda activate queens
    pip install --no-deps -e .
Optional features and what they are for:
- environment.dev.yml: development tools, linting, typing, docs
- environment.tutorials.yml: tutorial notebooks and examples
- environment.fourc.yml: 4C integration
Recommended setup for development
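As an aside on the layered installation above, the create-then-update sequence can be scripted as a loop over the optional layers. The following is a minimal sketch under stated assumptions: `DRY_RUN`, `run`, and `install_layers` are hypothetical names introduced here for illustration, and the script defaults to printing the commands instead of executing them. The `mamba` commands themselves are the ones listed in the full installation.

```shell
#!/bin/sh
# Sketch: layered installation as a loop. DRY_RUN, run, and install_layers
# are hypothetical helpers; the mamba commands are those shown above.
DRY_RUN=${DRY_RUN:-1}   # default: only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

install_layers() {
    env_name=$1
    shift
    # The base layer creates the environment.
    run mamba env create -n "$env_name" -f environment.base.yml
    # Each optional layer updates the same environment in place.
    for layer in "$@"; do
        run mamba env update -n "$env_name" -f "environment.$layer.yml"
    done
}

# Full installation: base plus all optional layers.
install_layers queens dev tutorials fourc
```

Calling `install_layers queens` with no further arguments corresponds to the minimal installation. Activating the environment and running `pip install --no-deps -e .` remain separate manual steps, as in the instructions above.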
Reproducible QUEENS installation
Use this if you want the most reproducible setup and the one that matches CI (you need to install conda-lock first). It includes all optional dependency groups.
    conda-lock install -n queens composed.conda-lock.yml
    conda activate queens
    pip install --no-deps -e .
Mac support
These changes also open the door to using QUEENS on macOS, including Apple Silicon, because the conda-based environment setup and lockfile now support that platform as well.
This is not tested in CI yet, but it should make such support much more feasible. I am planning to extend this further in this PR if possible.
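To check whether a given machine is covered by the lockfile, one option is to map the output of `uname` to the platform identifiers the lockfile was created for (`linux-64` and `osx-arm64`, per the conda-lock command quoted in the conversation). This is a rough sketch only: `conda_platform` is a hypothetical helper, and any platform outside those two is simply reported as unsupported.

```shell
#!/bin/sh
# Sketch: map uname output to the platform identifiers used for the lockfile.
# conda_platform is a hypothetical helper for illustration.
conda_platform() {
    os=$1
    arch=$2
    case "$os-$arch" in
        Linux-x86_64) echo "linux-64" ;;
        Darwin-arm64) echo "osx-arm64" ;;
        *)
            echo "unsupported platform: $os $arch" >&2
            return 1
            ;;
    esac
}

# Report which lockfile platform, if any, matches this machine.
if platform=$(conda_platform "$(uname -s)" "$(uname -m)"); then
    echo "lockfile platform: $platform"
else
    echo "composed.conda-lock.yml was not created for this machine" >&2
fi
```

On an unsupported machine, the layered `mamba` installation is the path to try, since it does not depend on the platforms baked into the lockfile.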
Main changes
- pip-tools workflow and deleted outdated requirements*.in / requirements*.txt files
- environment.base.yml
- environment.dev.yml
- environment.tutorials.yml
- environment.fourc.yml
- mamba
- conda-lock environment
- composed.conda-lock.yml from pre-commit large-file and YAML lint checks
Related Issues and Pull Requests
Interested Parties