Conda driven dependency management #297

Closed
sbrandstaeter wants to merge 14 commits into queens-py:main from sbrandstaeter:conda-driven-dependency-management

@sbrandstaeter
Member

Description and Context:
What and Why?

This PR migrates QUEENS from the previous mixed pip-tools + conda dependency setup to a conda-first workflow.

The new setup is based on layered environment files for normal user installation with mamba and on conda-lock for reproducible CI and development environments.

The goal is that dependencies are managed centrally through conda environments instead of a parallel requirements.in / requirements.txt workflow.

Why this change

The old setup split dependency management between conda and pip-tools, which made maintenance and CI behavior harder to reason about. It also made updating dependency versions difficult, and the setup worked only on Linux/Ubuntu.

With this PR:

  • normal users get a clean mamba workflow
  • developers get a reproducible conda-lock workflow
  • CI uses the same locked dependency set consistently
  • dependency definitions are now centralized in the conda environment files
  • the installation works the same way on macOS with Apple Silicon

How to use the new setup

Minimal QUEENS installation

Use this if you only want the core QUEENS functionality:

mamba env create -n queens -f environment.base.yml
conda activate queens
pip install --no-deps -e .

Full QUEENS installation

Use this if you want the full QUEENS functionality:

mamba env create -n queens -f environment.base.yml
mamba env update -n queens -f environment.dev.yml
mamba env update -n queens -f environment.tutorials.yml
mamba env update -n queens -f environment.fourc.yml
conda activate queens
pip install --no-deps -e .
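
The layered installation above could be scripted so that only the needed optional layers are applied. The following is a minimal sketch and not part of this PR: the function name queens_env_commands and the QUEENS_ENV_NAME variable are hypothetical, and the function only prints the commands shown above instead of running them.

```shell
# Hypothetical helper (not part of QUEENS): print the install commands for
# the base environment plus the requested optional layers.
queens_env_commands() {
  qec_name="${QUEENS_ENV_NAME:-queens}"
  # Validate all requested layers before printing anything.
  for qec_layer in "$@"; do
    case "${qec_layer}" in
      dev|tutorials|fourc) ;;
      *)
        echo "unknown layer: ${qec_layer}" >&2
        return 1
        ;;
    esac
  done
  # The base layer always comes first.
  echo "mamba env create -n ${qec_name} -f environment.base.yml"
  # Each optional layer updates the same environment.
  for qec_layer in "$@"; do
    echo "mamba env update -n ${qec_name} -f environment.${qec_layer}.yml"
  done
  echo "conda activate ${qec_name}"
  echo "pip install --no-deps -e ."
}
```

Calling queens_env_commands without arguments prints the minimal installation; queens_env_commands dev fourc adds only those two layers. Printing the commands for review keeps the sketch free of side effects.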

Optional features and what they are for:

  • environment.dev.yml: development tools, linting, typing, docs
  • environment.tutorials.yml: tutorial notebooks and examples
  • environment.fourc.yml: 4C integration

Recommended setup for development

Reproducible QUEENS installation

Use this if you want the most reproducible setup and the one that matches CI (you need to install conda-lock first). It includes all optional dependency groups.

conda-lock install -n queens composed.conda-lock.yml
conda activate queens
pip install --no-deps -e .
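
Since the locked installation depends on conda-lock being available, a small guard can fail early with a readable message. This is only a sketch under assumptions: require_tool and install_locked_queens are hypothetical helper names, not part of QUEENS or conda-lock. To my knowledge conda-lock is published both on conda-forge and on PyPI, so either route should work for installing it.

```shell
# Hypothetical helper: succeed only if the given tool is on PATH.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "error: '$1' not found on PATH; install it first" >&2
  return 1
}

# Hypothetical wrapper: run the locked install only when conda-lock exists.
install_locked_queens() {
  require_tool conda-lock || return 1
  conda-lock install -n queens composed.conda-lock.yml
}
```

After install_locked_queens succeeds, the remaining steps are the same as above: activate the environment and run the editable pip install.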

Mac support

These changes also open the door to using QUEENS on macOS, including Apple Silicon, because the conda-based environment setup and lockfile now support that platform as well.

This is not tested in CI yet, but it should make such support much more feasible. I am planning to extend this further in this PR if possible.

Main changes

  • removed the old pip-tools workflow and deleted outdated requirements*.in / requirements*.txt files
  • aligned dependency management around:
    • environment.base.yml
    • environment.dev.yml
    • environment.tutorials.yml
    • environment.fourc.yml
  • updated documentation and installation instructions to use mamba
  • clarified that optional environment files are only needed for optional features
  • updated GitHub and GitLab CI to use the conda-lock environment
  • updated remote/cluster environment creation to support the new workflow
  • exempted composed.conda-lock.yml from pre-commit large-file and YAML lint checks

Related Issues and Pull Requests

  • Closes
  • Related to

Interested Parties

Note: More information on the merge request procedure in QUEENS can be found in the Submit a pull request section in the CONTRIBUTING.md file.

@sbrandstaeter
Member Author

sbrandstaeter commented Apr 13, 2026

The conda-lock file was created with:

conda-lock \
  --mamba \
  --lockfile composed.conda-lock.yml \
  -f environment.base.yml \
  -f environment.dev.yml \
  -f environment.tutorials.yml \
  -f environment.fourc.yml \
  -p linux-64 \
  -p osx-arm64
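
If the set of layers changes, the same invocation could be assembled from a list of layer names. This is a sketch only: build_lock_command is a hypothetical helper, and the flags and platforms are simply the ones used in the command above.

```shell
# Hypothetical helper: assemble the conda-lock command for the given layers.
build_lock_command() {
  blc_cmd="conda-lock --mamba --lockfile composed.conda-lock.yml"
  # One -f flag per environment file, in the order given.
  for blc_layer in "$@"; do
    blc_cmd="${blc_cmd} -f environment.${blc_layer}.yml"
  done
  # Platforms taken from the command above.
  for blc_platform in linux-64 osx-arm64; do
    blc_cmd="${blc_cmd} -p ${blc_platform}"
  done
  echo "${blc_cmd}"
}
```

Running build_lock_command base dev tutorials fourc prints the command in a single line, which can be reviewed before it is executed.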

@gilrrei
Member

gilrrei commented Apr 13, 2026

I must admit, I have some doubts about this solution.

Without a requirements file, turning QUEENS into a PyPI package is more difficult and requires cumbersome manual work from the user, namely installing the requirements manually. So, after/before pip install queens, we would still need to install the dependencies manually, hence clone the repo (or get the info from somewhere else). This solution essentially forces users to use an environment-yaml-supporting package management system. Although I enjoy using conda/mamba, this is quite a strong restriction for a library.

Currently, on Ubuntu, all the packages should be installable via pip, so we 'only' use conda for performance reasons. I guess the main motivation for this move is to support Mac? Could you elaborate on why Mac requires conda? I did not understand this from the PR description 😅 Some questions that come to mind:

  • Which packages are not available? Are they needed? What if we split front and backend dependencies?
  • How do other projects solve this problem?

I believe most of the problems stem from frontend dependencies (iterators + ML libraries), particularly due to the large number and the lack of a clear boundary between the front and backends. To be fair, this is a remnant from the days when QUEENS was seen as a monolith, before the (mainstream) library interpretation. My gut feeling is that moving the ML libraries' related functionality into an optional package (much like 4C interfaces) would make this problem less relevant, particularly if it is not required for most users.

Supporting multiple platforms is always a pain, but essential for a healthy user base. I do think this is an essential step. I do support any improvement over the current state, as it has grown into a weird limbo :)

@sbrandstaeter
Member Author

Thanks a lot for kicking off this discussion on the PR.

Just to add some context: this PR is not coming out of the blue, even though I admit that I failed to open a proper discussion or issue on that. It builds on discussions we’ve had over the past weeks in our meetings. I know you haven’t been able to join those recently, but for transparency, the relevant background can also be found in the meeting notes and related issue.

The main motivation behind this change is that our current dependency workflow has effectively stalled. We haven’t been able to update dependencies in a long time, and recent attempts have shown that the current setup makes this increasingly difficult.

The proposed solution is meant as a pragmatic way to break this deadlock. It builds on the status quo where we already recommend using Conda (primarily for performance reasons). So in that sense, the default user experience does not fundamentally change. While it has technically always been possible to install via pip, this has never been an officially supported or particularly reliable path.

A key observation during this process was that Conda’s dependency resolver proved significantly more robust in our case. In particular, it consistently produced valid environments (something I also saw while working on the unofficial macOS support), which was not the case with the pip-based setup. However, macOS support is more of a secondary benefit here; the primary goal is to get back to a maintainable and updatable state.

That said, I fully agree with your broader point: this should not be the end state. In my view, this is an intermediate step. The goal should absolutely be to support both pip install queens (via PyPI) and conda install queens. However, getting to that point requires additional restructuring and packaging work, which we are not yet set up for.

So the intention here is to first stabilize and modernize the current setup with a solution that is already working in practice and is maintainable (i.e., simple). Once that foundation is in place, we can take the next step toward a cleaner separation of dependencies (e.g., frontend vs. backend, optional ML components) and proper multi-channel distribution. For this I was looking into pixi which might support all that we want but requires some changes again, and we did not want to commit as of now. @gilrrei Have you heard of pixi?

@gilrrei
Member

gilrrei commented Apr 14, 2026

> The proposed solution is meant as a pragmatic way to break this deadlock. It builds on the status quo where we already recommend using Conda (primarily for performance reasons). So in that sense, the default user experience does not fundamentally change. While it has technically always been possible to install via pip, this has never been an officially supported or particularly reliable path.

I disagree on this point about pip. The only supported way was to install QUEENS via pip, in the sense of pip install -e . (I was not talking about PyPI). The new installation method is a major change (you have to install dependencies yourself), especially since even Python developers may not understand the difference between pip and conda.

The pip-tools + conda combination was a pragmatic solution introduced by me 🙈, with consequences later on. Sorry about that. I would therefore not call this (= conda-lock) a pragmatic solution, but rather a simplifying one. This worked with the move to dask: we reduced functionality before reintroducing it, so I won't block this, not my style :) But I do want to point out the consequences of the current move.

> That said, I fully agree with your broader point: this should not be the end state. In my view, this is an intermediate step. The goal should absolutely be to support both pip install queens (via PyPI) and conda install queens. However, getting to that point requires additional restructuring and packaging work, which we are not yet set up for.
>
> So the intention here is to first stabilize and modernize the current setup with a solution that is already working in practice and is maintainable (i.e., simple). Once that foundation is in place, we can take the next step toward a cleaner separation of dependencies (e.g., frontend vs. backend, optional ML components) and proper multi-channel distribution.

I know we were in agreement on this :) I just have a different opinion on the order for reaching the goal: fix the optional dependencies first, so the resolvers have less work to do. Of course, this is more work, so I understand why you are doing this simpler step first.

> A key observation during this process was that Conda’s dependency resolver proved significantly more robust in our case. In particular, it consistently produced valid environments (something I also saw while working on the unofficial macOS support), which was not the case with the pip-based setup. However, macOS support is more of a secondary benefit here; the primary goal is to get back to a maintainable and updatable state.
>
> For this I was looking into pixi which might support all that we want but requires some changes again, and we did not want to commit as of now. @gilrrei Have you heard of pixi?

I'm not arguing against the conda/mamba/pip/pixi resolvers; these work beautifully. I'm not familiar with pixi :/ I am trying to point out that we need a good resolver because of the large number of (large) dependencies, even for simple use cases.

Just to reiterate: I understand the move, but I do think this is a (small) step backward in terms of usability, hopefully only temporarily. Again, I'm not blocking this, even if I could. It's a non-hacky working solution, and if the community sees this as a necessary step, then we should go for it. In the end, this only affects the installation process, and we have the luxury of being able to change things dynamically :)

@gilrrei
Member

gilrrei commented Apr 14, 2026

Ohh, I forgot the most important point, the human one! Thanks for putting in the work and identifying a working solution for this important problem :)

@davidrudlstorfer

@sbrandstaeter sorry for the comment here but I am always super interested in Python package management and repo setup. Just one question from my side after I read all the comments with high interest.

Have you also looked into https://docs.astral.sh/uv/? More and more projects are moving to it and I also plan to change a lot of stuff to use it.

@sbrandstaeter
Member Author

sbrandstaeter commented Apr 17, 2026

@davidrudlstorfer Happy to exchange ideas here. @gilrrei's input pushed me to aim for a more complete endgame solution, so here’s a quick update:

I explored different approaches and currently find pixi the most promising. Incidentally, it combines conda (for conda dependencies) and uv (for PyPI), giving us the benefits of both ecosystems in a unified setup. During development, I also tried pure uv for the first time; I had heard about it before but never used it. I found that uv is a significant improvement over pip for PyPI dependencies. So I now have a setup where QUEENS can be installed from PyPI dependencies only, but this requires uv, as pip was not able to resolve QUEENS' complex dependency tree.

My goal for QUEENS is to support both workflows: pip/uv as well as conda-based usage. For developers, I’d recommend pixi as the default since it integrates both cleanly while keeping everything in pyproject.toml. I find that important, and its absence is one of the things I don't like about conda.

I’m also aiming to make pip install queens work smoothly as an end goal, and maybe even conda install queens.

Happy to keep discussing and iterating on this together.

@sbrandstaeter sbrandstaeter mentioned this pull request Apr 19, 2026
@sbrandstaeter
Member Author

As mentioned above, I have been working on a better solution based on pixi.

Please find a suggestion in #301.

Thus closing in favour of #301.
