Add support for YML files in `conda create --file` (WIP) #14113

jaimergp · 2024-08-01T15:38:36Z

Description

Looking into parts of #11633 (conda env create -> conda create).

This is the minimum effort required to have YML support in conda create. It has a few problems UX-wise, though:

- There's no yes/no prompt
- The reported information is different (terser)
- We can't get the --name from the environment.yml file
- It doesn't support the pseudo-plugin system

This is just here to see what breaks and inform of the challenges in conda.cli.install to make a case for a bigger refactor.

What I would like to have instead

I envision a new Environment class that CLIs need to fill in with details such as packages, channels or solver settings. This class then delegates to the appropriate install backend (solver, explicit...). This starts to build up a homogenized Environment schema that could look like a potential conda.toml or environment.yml v2. That's a big change, so I'll draft a roadmap to get there if there's interest. My idea would be to have a single file that represents the input state of the environment; operating on the environment would then mean editing that file (on disk or virtually) and applying the changes to disk. Again, this borrows a few concepts from the Pixi model.

Edit: I went ahead and implemented parts of the details dropdown. See this comment: #14113 (comment)

Checklist - did you ...

Add a file to the news directory (using the template) for the next release's release notes?
Add / update necessary tests?
Add / update outdated documentation?

codspeed-hq · 2024-08-01T16:41:52Z

CodSpeed Performance Report

Merging #14113 will degrade performances by 14.78%

Comparing jaimergp:conda-create-yml (eee2ea7) with main (7c4941c)

Summary

❌ 3 regressions
✅ 18 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `main` | `jaimergp:conda-create-yml` | Change |
|---|---|---|---|---|
| ❌ | `test_install[classic]` | 220.5 ms | 258.7 ms | -14.78% |
| ❌ | `test_update[classic-update]` | 275.1 ms | 309.5 ms | -11.12% |
| ❌ | `test_update[classic-upgrade]` | 270.7 ms | 309 ms | -12.41% |

jaimergp · 2024-08-06T07:54:40Z

I expanded a little bit on that details tag above.

conda environment modification UX

There are four main commands:

- conda create: Creates a new environment (identified by name or full path). Accepts a list of MatchSpec strings and/or a list of TXT files containing MatchSpecs. Paths and URLs are also accepted.
- conda install: Same as create, but it expects an existing environment.
- conda remove: Removes package(s) from the environment, and their dependents. Can also remove the full environment if --all is passed.
- conda update: Installs the latest version of a package already installed, as long as it's solvable.

All of these can be tackled as the same type of input if expressed as a single input file. Let's assume that file is environment.yml for practical purposes.

conda create -n env -c conda-forge python=3.10:

name: env
channels:
- conda-forge
dependencies:
- python=3.10

conda install -n env numpy just means adding numpy to the implicit file:

name: env
channels:
- conda-forge
dependencies:
- python=3.10
- numpy=*  #

conda update -n env python means unpinning python, solving, and then repinning to the resolved version. Note this command only accepts package names.

name: env
channels:
- conda-forge
dependencies:
- python=*  # would become python=3.12 
- numpy=*

conda remove numpy is as obvious as it sounds; remove its entry and re-solve:

name: env
channels:
- conda-forge
dependencies:
- python=*  # would become python=3.12

However, conda remove can also remove dependencies that are not explicitly in the input file, such as transitive dependencies that are part of one of the variants. For example, mkl BLAS for numpy. We can conda remove mkl and we would obtain openblas instead, while still keeping numpy. We would need an extra input field to prevent a package from being installed, similar to run_constrained in meta.yaml. This can operate as a series of pins too!

name: env
channels:
- conda-forge
dependencies:
- python=*  # would become python=3.12
constraints:
- mkl <0 # removes or prevents installation of mkl
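The four commands above can be read as pure edits on an in-memory copy of the input file. Below is a minimal sketch of that idea, not the PR's implementation: the function names, the plain-dict layout, and the simplified `spec_name` parser are all assumptions made for illustration, and no solver is invoked (so nothing is repinned after `update`).

```python
import re


def spec_name(spec):
    """Package name of a simplified spec such as 'python=3.10' or 'mkl <0' (hypothetical helper)."""
    return re.split(r"[\s=<>!]", spec.strip(), maxsplit=1)[0]


def create(name, channels, specs):
    """conda create: start a fresh input file."""
    return {
        "name": name,
        "channels": list(channels),
        "dependencies": list(specs),
        "constraints": [],
    }


def install(env, specs):
    """conda install: add specs, replacing existing entries for the same package."""
    new_names = {spec_name(s) for s in specs}
    kept = [s for s in env["dependencies"] if spec_name(s) not in new_names]
    return {**env, "dependencies": kept + list(specs)}


def update(env, names):
    """conda update: unpin the named packages; a solver would repin them afterwards."""
    deps = [
        f"{spec_name(s)}=*" if spec_name(s) in names else s
        for s in env["dependencies"]
    ]
    return {**env, "dependencies": deps}


def remove(env, names):
    """conda remove: drop explicit entries; constrain away packages that were only transitive."""
    explicit = {spec_name(s) for s in env["dependencies"]}
    deps = [s for s in env["dependencies"] if spec_name(s) not in names]
    constraints = env["constraints"] + [f"{n} <0" for n in names if n not in explicit]
    return {**env, "dependencies": deps, "constraints": constraints}


env = create("env", ["conda-forge"], ["python=3.10"])
env = install(env, ["numpy=*"])
env = update(env, {"python"})
env = remove(env, {"numpy"})
env = remove(env, {"mkl"})
```

Each function returns a new dict instead of mutating its input, which keeps the "edit the file, then apply to disk" steps separate.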

`--file` flags

The --file flag (which can be specified multiple times) accepts TXT files. These TXT files have newline-separated MatchSpecs or URLs. #-leading lines are ignored. If @EXPLICIT is in the file, it's considered an explicit install with no solver invocation: the URLs are just fetched and linked in order of appearance.

Since they only specify specs, they can be concatenated easily. They are meant to complement the CLI, not replace it!
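The TXT rules described above (newline-separated entries, `#` comments, the `@EXPLICIT` marker) are simple enough to sketch. This is an illustration only and does not call conda's own parser; `parse_txt` and `combine_spec_files` are hypothetical names, and refusing to concatenate explicit files is an assumption of this sketch rather than documented conda behaviour.

```python
def parse_txt(text):
    """Return (is_explicit, entries) for the contents of a requirements-style TXT file."""
    explicit = False
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no information
        if line == "@EXPLICIT":
            explicit = True  # entries are URLs to fetch and link in order, no solve
            continue
        entries.append(line)
    return explicit, entries


def combine_spec_files(texts):
    """Concatenate the specs of several TXT files, in the order the files were given."""
    combined = []
    for text in texts:
        explicit, entries = parse_txt(text)
        if explicit:
            # Assumption of this sketch: explicit files are not mixed with spec files.
            raise ValueError("explicit files cannot be concatenated in this sketch")
        combined.extend(entries)
    return combined


base = "# core\npython=3.10\n\nnumpy\n"
extra = "pytest\n"
specs = combine_spec_files([base, extra])
```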

environment.yml files are not supported by this option (yet). This format carries more details, and how to concatenate them is not as obvious: what to do with name or channels? Who wins? First? Last? These questions would need to be answered before this is added to the new input file format. As a result, we would start by accepting only one of these at a time.

Solving-affecting flags

There are also a number of solver flags that can affect the result of the solve:

Postprocess dependencies of passed specs after the solve. These can only be implemented under the "everything is a file" model by post-processing the hypothetical lockfile that gets created. Note this leaves the environment in an inconsistent state.
- --no-deps: only the packages corresponding to the spec are installed. This can be replicated by adding the URL to the artifact directly as part of the spec.
- --only-deps: everything but the passed packages is installed. The user would need to craft the list of dependencies required by the package. I feel this is not used much in practice, and it's mostly there to support development environments that are better served by specialized tooling.

Adjust how dependencies are dealt with. These flags are useful to work around the complications of the "installed packages should not change unless necessary" behaviour that conda prefers. In my opinion, these could be implemented as a series of constraints for freeze-installed, or as a series of spec=* in dependencies:
- --specs-satisfied-skip-solve: if the installed packages satisfy the constraints, do not update even if newer compatible versions exist.
- --freeze-installed: constrain everything else while solving the new specs.
- --update-deps: force the update of the dependencies of the specs we passed (consider it a partial update-all).
- --update-specs: this is the default behaviour when not freezing.
- --update-all: this is in principle the default behaviour of a fresh input file. So it's a matter of solving it again with the new repodata.
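As a rough illustration of the "constraints for freeze-installed, spec=* for updates" idea, the two transformations could look like the sketch below. It assumes the installed state is available as a plain name-to-version mapping and uses a simplified spec parser; `freeze_installed` and `update_all` are hypothetical helpers here, not conda APIs, and the `==` pin format is likewise an assumption.

```python
import re


def spec_name(spec):
    """Package name of a simplified spec string (hypothetical helper)."""
    return re.split(r"[\s=<>!]", spec.strip(), maxsplit=1)[0]


def freeze_installed(installed, requested_names):
    """Pin every installed package that was not explicitly requested."""
    return [
        f"{name}=={version}"
        for name, version in sorted(installed.items())
        if name not in requested_names
    ]


def update_all(dependencies):
    """Unpin every explicit dependency so a fresh solve can pick newer versions."""
    return [f"{spec_name(spec)}=*" for spec in dependencies]


installed = {"python": "3.12.4", "numpy": "2.0.1", "openssl": "3.3.1"}
constraints = freeze_installed(installed, {"numpy"})
unpinned = update_all(["python=3.10", "numpy >=1.26"])
```

The resulting `constraints` list would go into the `constraints` field of the input file, while `unpinned` would replace `dependencies` before solving again.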

There are also some extra CLI flags that concern how the channels are fetched and can have an effect on the solution:

- --repodata-fn: which repodata file(s) are fetched from the remote channel(s). This can affect the solution.
- --no/strict-channel-priority: how to deal with several channels at once.
- only_tar_bz2 (from condarc). This could be deprecated.

And of course, the --solver flag.

All other flags

Everything else in the CLI should be considered a runtime option that does not affect the solution of the environment, and hence could be just kept around when necessary. For example, --copy can be used without issues in the CLI and its presence won't affect which packages are installed.

The proposed schema

This is the proposed schema for a more explicit input file that can potentially replace the state stored in conda-meta/history, conda-meta/state and conda-meta/pinned.

name: str
description: str
last_modified: datetime
channels: list of str  # these should be ideally URLs for fully resolved channels
channel_options:
  repodata_fn: list of str
  # maybe authentication stuff
platforms: list of str
solver_options: dict
  solver: str
  channel_priority: flexible or strict
  use_only_tar_bz2: bool
  aggressive_update_packages: list of str
dependencies: list of str or dict of (str, list(str))
constraints: list of str  # conda-meta/pinned
variables: dict of (str, str)  # conda-meta/state
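To make the field types concrete, the schema above could be mirrored by a small data class. This is a sketch under assumptions, not the `Environment` class from this PR: the class name `EnvironmentFile`, the defaults, and the `to_dict` serialization are invented for illustration, and `dependencies` is simplified to a flat list of strings (the schema also allows a dict form).

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class EnvironmentFile:
    """Hypothetical in-memory mirror of the proposed input file."""

    name: str
    description: str = ""
    last_modified: datetime = field(default_factory=datetime.now)
    channels: List[str] = field(default_factory=list)
    channel_options: Dict[str, object] = field(default_factory=dict)  # e.g. repodata_fn
    platforms: List[str] = field(default_factory=list)
    solver_options: Dict[str, object] = field(default_factory=dict)  # solver, channel_priority...
    dependencies: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)  # replaces conda-meta/pinned
    variables: Dict[str, str] = field(default_factory=dict)  # replaces conda-meta/state

    def to_dict(self):
        """Plain-dict form suitable for dumping to YAML or JSON."""
        data = asdict(self)
        data["last_modified"] = self.last_modified.isoformat(timespec="seconds")
        return data


env_file = EnvironmentFile(
    name="env",
    last_modified=datetime(2024, 8, 6, 7, 54, 40),
    channels=["conda-forge"],
    solver_options={"solver": "libmamba", "channel_priority": "strict"},
    dependencies=["python=3.10"],
    constraints=["mkl <0"],
)
as_dict = env_file.to_dict()
```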

travishathaway · 2024-08-06T09:29:13Z

@jaimergp,

I'm curious why you see the need to eventually support multiple --file options. Wouldn't this just needlessly overcomplicate our implementation and the CLI interface? Under what circumstances would this be useful?

Another thing we could do to simplify the CLI even further would be to accept either a --file option or a list of MatchSpecs on the command line, but not both. This would also help simplify our implementation.

jezdez · 2024-08-06T09:36:13Z

To support @travishathaway: I've mentioned this to @jaimergp in person already. I also think multiple --file invocations with .yml files are a nice-to-have, since they would imply settling on a merging strategy first. We can take a look at how micromamba handles it, of course, but then that's no longer just a refactor but a larger feature addition.

Could we achieve the end goals of this ticket basically in multiple steps, to reduce the code churn? Deprecation of existing flags and code paths need to be accounted for as well.

jaimergp · 2024-08-06T10:21:00Z

> I'm curious why you see the need to eventually support multiple --file options. Wouldn't this just needlessly overcomplicate our implementation and the CLI interface? Under what circumstances would this be useful?

Because it's already supported with .txt files. We can choose to only allow multiple ones with different formats, but it could be useful on setups like conda create -n dev --file base-deps.yml --file os-specific-deps.yml.

Eventually it doesn't matter as long as we are able to construct "The Source Of Truth File" from all those input files, and dump it in the conda-meta.

> Could we achieve the end goals of this ticket basically in multiple steps, to reduce the code churn? Deprecation of existing flags and code paths need to be accounted for as well.

Absolutely. I don't intend this PR to be merged. It's mostly a conversation driver so we can discuss code challenges with good technical context (e.g. what the diff looks like). When we have a decision, we can create an epic/meta with the smaller items and work on them one by one.

Maybe the first step is a quick prototype of the dreamt CLI plus the draft implementation of the new explicit-state environment file, which maybe I drop here in this PR.

jaimergp · 2024-08-20T16:58:54Z

Hello @conda/conda-core! This PR is still in draft but we have reached a milestone here. conda create --file passes the conda env create -f tests :)

Let me recap what I've done here:

- Created a conda.cli.install2 module that reimplements parts of conda.cli.install.
- Added a new Environment class that is able to accumulate much of the input data necessary to operate on an environment. It has a weird scope overlap with PrefixData but it's more CLI-ish, if that makes sense. Still unsure if this would be an implementation detail or a first-class API citizen (maybe encapsulating the logic of a potential environment.yml v2 file format).
  - The merge classmethod allows us to combine Environment objects, regardless of the source (CLI, txt, yml...). This is so we can support multiple TXT files, but it also handles multiple YMLs if you want. The CLI data is massaged into an Environment object too, so this makes it super easy to combine all the possible sources.
  - Fun fact: This is mostly so we can deal with the annoying feature of environment.yml being able to provide an env name / path, and then have it overridden in the CLI (sometimes). I don't like this but we need it for a smooth transition I guess.
- I've also split the main install function into smaller ones, and rewrote some of them so we can get Transaction objects out of them. This allows us to have the same UX across input sources.
  - Note that conda env create never asked for confirmation or reported the summary of the transaction. Same with explicit files. So, technically, the correct translation for conda env create -f some.yml is conda create --file some.yml --yes.
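For readers who want a picture of what merging several input sources might involve, here is a minimal sketch. It is not the code in this PR: the `EnvSketch` class, its three fields, and the merge policy (the last non-empty name wins, channels and dependencies are concatenated in order without duplicates) are assumptions chosen to match the "CLI overrides the name in environment.yml" behaviour described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EnvSketch:
    """Hypothetical, heavily reduced stand-in for the Environment class."""

    name: Optional[str] = None
    channels: List[str] = field(default_factory=list)
    dependencies: List[str] = field(default_factory=list)

    @classmethod
    def merge(cls, *envs):
        """Combine sources left to right; later sources override the name."""
        merged = cls()
        for env in envs:
            if env.name is not None:
                merged.name = env.name  # assumption: last non-empty name wins
            for channel in env.channels:
                if channel not in merged.channels:
                    merged.channels.append(channel)  # keep first-seen order
            for dep in env.dependencies:
                if dep not in merged.dependencies:
                    merged.dependencies.append(dep)
        return merged


from_yml = EnvSketch(name="from-file", channels=["conda-forge"], dependencies=["python=3.10"])
from_txt = EnvSketch(dependencies=["numpy", "python=3.10"])
from_cli = EnvSketch(name="dev", dependencies=["pytest"])
merged = EnvSketch.merge(from_yml, from_txt, from_cli)
```

Passing the CLI-derived object last is what lets `-n dev` override the name found in the YML file under this policy; a different ordering or policy would give a different answer to the "who wins?" question.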

There's still some more work to do (like a smarter file format detector that doesn't have to deal with the now deprecated Anaconda.org environments), but you get the idea.

jaimergp · 2024-08-22T08:33:47Z

pre-commit.ci autofix


kcpevey · 2024-09-13T14:46:56Z

Excited to see this! I need to specify my optional test/docs dependencies in a separate file from my core dependencies. It would be great to be able to use the yml format instead of the txt format for specifying multiple files 💜

Minimum diff possible to implement 'conda create --file environment.yml'

6cd347c

jaimergp requested a review from a team as a code owner August 1, 2024 15:38

conda-bot added the cla-signed label Aug 1, 2024

jaimergp marked this pull request as draft August 1, 2024 15:39

Avoid circular imports

a59bbfb

pass name for txt

af3192b

add Environment data model

ab873ee

beeankha mentioned this pull request Aug 6, 2024

Combine conda env create with conda create and make the former an alias #13015

Open

jaimergp added 17 commits August 7, 2024 09:27

Draft --dry-run support

f6514f7

enable for install too

82d68e1

enable for update too

41ec664

Raise if no name or prefix are provided

1393823

Implement create --clone

72c9118

Add solver transactions

80298e7

Merge branch 'main' of github.com:conda/conda into conda-create-yml

0af1ed5

some fixes

c77fefb

some more fixes

661613e

handle empty requirements early

5c9c494

guess what... more fixes

32f80bb

let context define the default channels (condarc + cli)

f52e962

fix .to_dict()

d8ec61a

add note about force_32bit

319b9b2

fix _should_retry_unfrozen

b107269

pre-commit

112f1f6

more pre-commit

d612cce

jaimergp added 11 commits August 13, 2024 15:57

fix str prefix

f83ae40

Split in better isolated helper functions

082168d

reimplement clone as a transaction driven func, add handle_txn callables

507aebd

pre-commit

17db613

unused import

b564546

this is a set

c96cbf6

change a couple names

a3beff1

more prefix checks for 'conda create'

efcfafd

extend 'conda env create' tests to 'conda create' too

82ad0dc

this test should also complain about protected envs?

3126283

more pre-commit

86b5dbe

jaimergp added 2 commits August 22, 2024 08:18

add tests

c527169

py <3.11 friendly isoformat strings

a264c0a

pre-commit-ci bot and others added 6 commits August 22, 2024 08:34

[pre-commit.ci] auto fixes from pre-commit.com hooks

da2bb8d

for more information, see https://pre-commit.ci

name == prefix doesn't apply to name=base

fd21e55

Add base protection proto

fbfad16

warn only for now

ce9cb20

Merge branch 'conda-create-yml' of github.com:jaimergp/conda into con…

3b5841f

…da-create-yml

no timezone in timestamps for py<3.11

eee2ea7

beeankha mentioned this pull request Sep 23, 2024

Collapse conda_env into conda #11633

Open

17 tasks
