To do: per-package channels #25

Closed
cjdoris opened this issue Feb 24, 2022 · 7 comments

@cjdoris
Collaborator

cjdoris commented Feb 24, 2022

Here's a proposal.

channels = ["channel1", "channel2"]

[deps]
foo = "1"

[deps.bar]
version = "2"
channels = ["channel3"]

So each package can have a list of channels that it can be installed from. If a package does not specify one, it uses the top-level channels list, which in turn defaults to ["conda-forge"] if not specified (as currently).

So in the example, foo implicitly has channels=["channel1", "channel2"].

Now when merging two packages of the same name from different CondaPkg.toml files, take the intersection of their channel lists. Throw an error if the intersection is empty (showing which TOML files specified what). If non-empty, pick one of the channels (arbitrarily?) and install with the channel::package syntax. Don't use the --channel argument at all.

This means that it is guaranteed that a package will be installed from one of the channels specified in the CondaPkg.toml file.
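The merge rule proposed above can be sketched roughly as follows. This is only an illustration of the idea, not CondaPkg's code: the function names, the shape of `occurrences` and the choice of "first surviving channel" as the arbitrary pick are all assumptions.

```python
# Hypothetical sketch of the proposed merge rule; not CondaPkg's actual code.
DEFAULT_CHANNELS = ["conda-forge"]


def effective_channels(pkg_channels, top_level_channels):
    """Channels a package may come from: its own list, else the file's
    top-level list, else the default."""
    if pkg_channels:
        return list(pkg_channels)
    if top_level_channels:
        return list(top_level_channels)
    return list(DEFAULT_CHANNELS)


def merge_package_channels(name, occurrences):
    """Intersect the channel lists of one package across TOML files.

    `occurrences` maps a TOML file path to that file's effective channel
    list for the package. The order of the first file's list is preserved.
    """
    files = list(occurrences)
    merged = list(occurrences[files[0]])
    for path in files[1:]:
        allowed = set(occurrences[path])
        merged = [c for c in merged if c in allowed]
    if not merged:
        # Report which file asked for which channels.
        detail = "; ".join(f"{p}: {occurrences[p]}" for p in files)
        raise ValueError(f"no common channel for {name!r} ({detail})")
    return merged


def install_spec(name, version, channels):
    """Pick the first surviving channel and use channel::package syntax."""
    spec = f"{channels[0]}::{name}"
    return f"{spec}={version}" if version else spec
```

With the example above, bar restricted to ["channel3"] in one file and ["channel3", "channel4"] in another would merge to ["channel3"] and be installed as channel3::bar=2.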

The empty intersection thing might be an issue though - the CondaPkg.toml author ideally should include all the channels that the package could be acceptably installed from. Perhaps only default to conda-forge for a given package if no occurrence of that package in any CondaPkg.toml file specifies any channels.

It also avoids the current weird "channel crossover" behaviour where one TOML file specifies Package1 and Channel1 and another specifies Package2 and Channel2, and it's possible to have Package1 installed from Channel2 and Package2 installed from Channel1. (Because we just do conda install -c Channel1 -c Channel2 Package1 Package2.)
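To make the crossover concrete, here is a minimal sketch of how a single command is built when the channels and packages of every file are simply pooled. The dict keys "channels" and "packages" and the function name are hypothetical, chosen only for this illustration.

```python
def current_install_args(toml_files):
    """Build one `conda install` argv from several parsed TOML files.

    Each entry is a dict with hypothetical keys "channels" and "packages".
    """
    channels, packages = [], []
    for f in toml_files:
        for c in f["channels"]:
            if c not in channels:
                channels.append(c)
        for p in f["packages"]:
            if p not in packages:
                packages.append(p)
    args = ["conda", "install"]
    for c in channels:
        args += ["-c", c]
    return args + packages


files = [
    {"channels": ["Channel1"], "packages": ["Package1"]},
    {"channels": ["Channel2"], "packages": ["Package2"]},
]
print(" ".join(current_install_args(files)))
# conda install -c Channel1 -c Channel2 Package1 Package2
```

Once the lists are pooled, nothing in the command records which file asked for which channel, so either package may be resolved from either channel.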

@moble

moble commented Feb 24, 2022

Ah, this is interesting. Instead of just the channel::package=ver syntax, it looks like you could support the whole range of MatchSpec capabilities by using package[channel='channel',version='ver'] syntax, with other key-value pairs as needed — and just some extra processing to resolve conflicting channels, as you've suggested. In particular, the difference between a package version '=0.53' and '==0.53' seems easier to deal with if you just put it in as a key-value pair.

It's not clear to me what exactly you mean by

merging two packages of the same name from different CondaPkg.toml files

Are you talking about just merging the channel spec for each package, or merging the whole package spec so that conda only sees each package once? For example, suppose one of the Project.toml dependencies in your example has this CondaPkg.toml:

[deps.bar]
channels = ["channel3", "channel4"]
md5 = "12345678901234567890123456789012"

When combined with your CondaPkg.toml above, you could merge the channels in the two bar specs in the julia code, but still supply them as separate specs to conda, so that the resulting command would look like this:

conda install foo[version='1'] bar[version='2',channel='channel3'] bar[channel='channel3',md5='12345678901234567890123456789012']

Conda has no trouble with the fact that bar is specified twice, and already has lots of complex logic to merge the specs (which you surely don't want to reproduce). Is this what you're intending?
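As a rough sketch of that approach, each TOML entry could be rendered to its own bracketed spec and handed to conda unmerged. The helper name `match_spec` is hypothetical, and the sketch does no escaping of quotes inside values.

```python
def match_spec(name, **fields):
    """Render name[key='value',...], skipping fields that are None."""
    parts = [f"{k}='{v}'" for k, v in fields.items() if v is not None]
    return f"{name}[{','.join(parts)}]" if parts else name


# One spec per occurrence of a package; bar appears twice on purpose.
specs = [
    match_spec("foo", version="1"),
    match_spec("bar", version="2", channel="channel3"),
    match_spec("bar", channel="channel3", md5="12345678901234567890123456789012"),
]

# Kept as an argv list so no shell quoting of brackets or quotes is needed.
command = ["conda", "install", *specs]
print(" ".join(command))
```

Keeping the specs separate leaves all of the version and build reconciliation to conda's own solver.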

@cjdoris
Collaborator Author

cjdoris commented Feb 24, 2022

You're absolutely right, and in fact, if a package currently appears in several CondaPkg.toml files with different versions, we already specify the package several times. So indeed we could just allow the user to specify any of the spec fields and pass the specs on to Conda without changing them.

(Unfortunately Pip and Julia dependencies do need to be merged.)

So we could leave the current behaviour as is and just allow specifying more fields.

The main issue is with taking the union of the top-level channels from different TOML files. It would be nice to guarantee that the packages in a file are installed from one of the channels specified in that file, which isn't currently the case. Doing this would require some logic combining the top-level channels with those on the package.

It would also be nice to let a package specify several channels it could come from.

@moble

moble commented Feb 26, 2022

I think the --channel CLI flags (especially when priority is turned off) are intended to say "You'll be able to find every package at least once by searching these channels, but I don't care where you get these packages." So in that sense, it's a non-restrictive statement, and taking the union would be fine. On the other hand, I would think of per-package channels as being more precise, and thus restrictive: "You must not install the package from anywhere else."

If you make the top-level channels option restrictive, you run into the problem that someone might say channels=["defaults", "bioconda"], just because they know their packages can be found in those channels, and don't want to bother checking if there are any other channels that might also work. But then I come along and specify that one of the packages they request must actually come from conda-forge, and now there's a conflict where there doesn't need to be.

Ultimately the problem is that conda itself doesn't support more complex channel specification, making it difficult for you to do so. But I'm also not sure it's really necessary. How often do people know that a package should only be installed from a limited set of channels, while being unable to decide on one channel to install from? Even the bioconda people recommend prioritizing conda-forge, for example. It seems like it would be fair for you to go the easiest route of just taking the union of all top-level channels, and if people turn up problems with that approach in the future, address it with more specific options.

@cjdoris
Collaborator Author

cjdoris commented Feb 28, 2022

You make a lot of good points, and I should go with what is most natural in Conda-land. I just find it weird that you can install the "same" (but actually different) package from different channels. But if that's how Conda works there's no point me trying to fight it.

So this proposal becomes: keep precisely the current behaviour, but allow the user to set all the fields of the Conda package spec, in particular the channel.

@moble

moble commented Feb 28, 2022

I agree that it's weird, but also just how conda works. The only alternatives I can think of involve

  1. Calling conda install multiple times — once for each CondaPkg.toml. But that still involves ambiguity about the ordering, and would potentially install and reinstall the same packages, taking lots of time.
  2. Trying to solve things yourself, which would require either using the conda API in complicated ways, or recreating that API in julia code — all in an effort to make CondaPkg behave differently from Conda itself.

So both for simplicity and consistency, I think your proposal sounds like the best option.

@cjdoris
Collaborator Author

cjdoris commented Mar 2, 2022

@moble I've implemented this on the main branch now, and I'd be grateful if you could try it out. You can now do CondaPkg.add("llvmlite", channel="numba") or equivalently pkg> conda add numba::llvmlite.

For now I've only added support for specifying the channel - supporting subdir and build is for another day.
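For illustration only, splitting a channel::package string such as numba::llvmlite into its two parts might look like the sketch below. The helper is hypothetical and is not how CondaPkg itself parses the argument.

```python
def split_channel_spec(spec):
    """Split "channel::package" into (channel, package).

    Returns (None, spec) when no channel prefix is present.
    """
    channel, sep, rest = spec.partition("::")
    if not sep:
        return None, spec
    return channel, rest


print(split_channel_spec("numba::llvmlite"))
# ('numba', 'llvmlite')
```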

cjdoris closed this as completed Mar 11, 2022

@moble

moble commented Apr 13, 2022

Sorry, this fell off my radar for a bit. Yes, this works very nicely with the latest release. Thanks so much!
