Lockfile implementation of categories instead of category #278
Comments
Preliminary experiment with micromamba indicates success. 🚀 |
Is there an existing branch or implementation to test? |
Sorry, not yet. I just ran the micromamba experiment described above by hand. Would you like to try an implementation? |
Sure! One question: how should duplicate entries in separate input files be handled? Say, for example, we are running
Currently, Conda Lock will solve for I would argue that it should be only in the |
As written, that would be a dependency conflict. The idea is to merge the dependencies into a single simultaneous solve. So what would happen logically is:
dependencies:
- python =3.10
- pandas =1.4
- pandas =1.5
and this goes to the solver, which in this case would fail because there's no Pandas package with version 1.4 and 1.5 simultaneously. But let's suppose the versions were compatible, for example if
dependencies:
- python =3.10
- pandas =1.5.1
- pandas =1.5
Then the solver would find the solution To compute which things should go in the Similarly, to compute which things should go in Therefore, in this case, all packages go into both Does this logic make sense? |
Yes, that does. Concatenating them together seems like the least error-prone solution. Just to clarify though, do we expect conda-lock to currently fail, or is this the wanted future behavior? Because when I ran the previous example on the current main branch, it did not fail. |
It is a bug that solving succeeds. I'm guessing that there is some failed merging behavior. I haven't looked at the code, but I expect in pseudocode (class and parameter names are probably incorrect!!!) that the Pandas dependencies should look something like: From They should merge into: |
So after some research and planning, I decided to split the implementation into 2 pieces. First, as you said, the current behavior when seeing repeated dependencies in multiple input files is to overwrite the version constraint rather than merge it. Since this is unwanted behavior, I wrote a mini PR to merge the version strings instead. I have a second branch with an implementation for supporting multiple categories in lockfiles (built on top of the merging versions branch) located here: https://github.com/srilman/conda-lock/tree/multiple-categories |
This sounds excellent!!! Thanks so much for digging into this! I will try to review this soon. |
I made a first pass at thinking through the details. (I might have some misconceptions about the implementation, so please correct any nonsense I write!) I get the impression that we may want to refactor a few things in order to implement this correctly... First we need to understand the context of It seems to me like we don't need to, and shouldn't, merge platforms in the One minor complication is that we don't know the list of platforms before running Perhaps one way to proceed is to refactor source-file-specific functionality into a
platforms = platform_overrides or union(sf.platforms for sf in source_files) or DEFAULT_PLATFORMS
spec = {platform: aggregate_lock_specs([sf.spec(platform) for sf in source_files]) for platform in platforms}
This way Does it make sense what I'm writing? |
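The two pseudocode lines above can be fleshed out into something runnable. This is only a sketch under assumptions: the `SourceFile` class, its fields, `build_specs`, and the value of `DEFAULT_PLATFORMS` are all invented for illustration, and `aggregate_lock_specs` here is a toy stand-in that shares only its name with whatever conda-lock actually provides.

```python
from dataclasses import dataclass, field

# Assumed default; the real default list may differ.
DEFAULT_PLATFORMS = ["linux-64", "osx-64", "win-64"]


@dataclass
class SourceFile:
    """Hypothetical container for one parsed input file."""
    platforms: list = field(default_factory=list)      # platforms the file declares
    common: dict = field(default_factory=dict)         # name -> spec, all platforms
    per_platform: dict = field(default_factory=dict)   # platform -> {name: spec}

    def spec(self, platform):
        # Platform-specific entries are layered on top of the common ones.
        return {**self.common, **self.per_platform.get(platform, {})}


def aggregate_lock_specs(specs):
    """Toy aggregation: AND together constraints for repeated package names."""
    merged = {}
    for spec in specs:
        for name, constraint in spec.items():
            merged[name] = f"{merged[name]},{constraint}" if name in merged else constraint
    return merged


def build_specs(source_files, platform_overrides=None):
    # Union of declared platforms, sorted for a deterministic order.
    declared = sorted({p for sf in source_files for p in sf.platforms})
    platforms = platform_overrides or declared or DEFAULT_PLATFORMS
    return {
        p: aggregate_lock_specs([sf.spec(p) for sf in source_files])
        for p in platforms
    }


main = SourceFile(platforms=["linux-64"], common={"python": "=3.10"})
dev = SourceFile(
    platforms=["osx-64"],
    common={"pytest": "=7"},
    per_platform={"linux-64": {"patchelf": "=0.17"}},
)
specs = build_specs([main, dev])
print(specs)
```

The point of the shape is that platforms are never merged inside the aggregation step: the platform list is decided once, up front, and each platform gets its own independent aggregation.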
Sorry for the delay @maresb! Overall, I agree with the approach you suggested, especiall
But to clarify, how will we parse source files in this approach? Right now, I believe we parse a source file given a platform, in order to handle things like os preprocessing selectors. In this new approach, would we parse the source file in a platform-independent fashion, and then apply a platform to it? I would much prefer that method, since that would make it easier to add more selector-related features in the future, such as
Either way, I would be happy to take a first pass implementing this. Any ideas on how to split this into smaller tasks in order to reduce the size of the overall PR? |
No worries, great to hear back from you @srilman! I am not sure I understand your question:
I think that multiple parsing passes are required because we don't know at the beginning what the final list of platforms is. So I think we need to parse unfiltered to extract the |
Feel free to take a stab at it. I think you are thinking about this more deeply than I am. |
In order to make it possible to, for instance, reliably install packages in both `dev` and `main`, I proposed in mamba-org/mamba#1209 changing the `category` attribute from a string to a list of strings called `categories`.

There hasn't been any movement yet (which I'm aware of) towards an implementation. I have an idea which might offer an easy way forward, purely in conda-lock at first...
What would happen if we duplicate the entries in the lockfile, so that the lockfile looks roughly like:?
If we are lucky, then perhaps the conda-lock and micromamba implementations will follow the expected behavior. (To be tested!) This way we wouldn't need a new lockfile format version.
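To make the duplication idea concrete, here is a rough sketch of how one entry per category could be generated. The dictionary layout is a simplification and the helper name `duplicate_per_category` is hypothetical; the package versions are illustrative only, and real lockfile entries carry more fields than shown.

```python
def duplicate_per_category(packages):
    """Emit one lockfile-style entry per (package, category) pair.

    Hypothetical helper: each input package lists every category it
    belongs to, and each output entry has a single string `category`.
    """
    entries = []
    for pkg in packages:
        for category in pkg["categories"]:
            # Copy everything except the list, then set the single category.
            entry = {k: v for k, v in pkg.items() if k != "categories"}
            entry["category"] = category
            entries.append(entry)
    return entries


packages = [
    {"name": "python", "version": "3.10.8", "categories": ["main", "dev"]},
    {"name": "pytest", "version": "7.2.0", "categories": ["dev"]},
]
entries = duplicate_per_category(packages)
for entry in entries:
    print(entry)
```

A package needed by both `main` and `dev` then appears twice, once under each category, which is the redundancy discussed in this issue.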
If this works, this is still rather redundant, but not catastrophically so. We should aim to do better in lockfile format v2. In the meantime, since the micromamba folks believe that extra attributes will be ignored, we could start generating interoperable lockfiles which look like
so that this can be read as v1 by ignoring `categories`, or alternatively read as v2 by ignoring `category` (effectively containing duplicate entries). This way, we could first develop everything in `conda-lock`, and then only once we're ready, work on implementing v2 in micromamba.
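The two ways of reading such an interoperable lockfile could look roughly like the sketch below. Everything here is an assumption made for illustration: the helper names, the simplified entry layout, and the choice of `(name, version)` as the key for collapsing duplicates are not taken from conda-lock or micromamba.

```python
def to_interoperable(packages):
    """Write each package once per category, keeping the full `categories` list."""
    return [
        {**pkg, "category": category}
        for pkg in packages
        for category in pkg["categories"]
    ]


def read_as_v1(entries):
    """A v1-style reader: ignore the unknown `categories` attribute."""
    return [{k: v for k, v in e.items() if k != "categories"} for e in entries]


def read_as_v2(entries):
    """A v2-style reader: ignore `category` and collapse the duplicates."""
    unique = {}
    for e in entries:
        key = (e["name"], e["version"])  # assumed identity of a package
        unique.setdefault(key, {k: v for k, v in e.items() if k != "category"})
    return list(unique.values())


lock_entries = to_interoperable(
    [{"name": "python", "version": "3.10.8", "categories": ["main", "dev"]}]
)
v1_view = read_as_v1(lock_entries)
v2_view = read_as_v2(lock_entries)
print(v1_view)
print(v2_view)
```

Both views are derived from the same entries, so nothing in the file has to change when a reader moves from the v1 interpretation to the v2 one.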