Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New feature resolver, workspace inheritance #17

Open
wants to merge 31 commits into
base: main
Choose a base branch
from

Conversation

o-santi
Copy link

@o-santi o-santi commented Jan 15, 2024

This MR adds support for weak dependency features, and the new optional feature syntax, as well as workspace inheritance of cargo toml properties.

Feature resolver

I've completely rewritten the resolver.rs file, in order to support the new features. I stopped using overlays and instead went with attribute sets together with a merging function. I thought I could do it entirely stateless (without remembering which features were activated before the one we're activating), but I think it is impossible, exactly because of weak dependency features. Instead, the code looks more like a continuation passing style, where I pass the state of the packages into the next function (the seen argument, present in almost all functions).

Still, the code does look very similar to cargo's new feature resolver code, and it gives pretty good results: I've been able to compile cargo and alacritty with it and it is reasonably fast. Both of them are quite feature/dependency heavy so I think they give a fairly good stress test.

There still are some rough edges that I'm not sure I implemented correctly, mainly dev features not ending up mangling with normal features (see resolver.nix#L195), but the way I fixed it seems to give correct results. Also, nix flake check is failing, but with an error related to gen-workspaces git sources (it seems like there are some old urls? I haven't given a proper look), and I don't believe my changes have caused it.

Workspace Inheritance

This feature was mainly needed in order to compile cargo, because it is used fairly often there. The main changes that come from this is that pkgSet now must hold the edition when we call mkPkgInfoFromCargoToml, which also needs workspace variable to inherit from. I haven't added all the features that should be added here, just the ones necessary to compile cargo.

Rígille S. B. Menezes and others added 30 commits June 21, 2023 14:36
this is to not enable build dependencies on normal dependencies, but I
think I will refactor this to be a better pattern.
now features never turn packages on, and instead
`enablePackageWithFeatures` must be called. that already was the
pattern that was being used, so this actually simplifies the code.
now I should fix the cyclic references causing infinite loops.
it was missing an input for the noc template
@o-santi
Copy link
Author

o-santi commented Jan 16, 2024

I believe I've fixed the nix flake check issue. When rigille added the nix-filter, he forgot to add it for the tests too.

Copy link
Owner

@oxalica oxalica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, thanks for your help on this! Sorry for that I'm busy on moving recently and suffers from many problems IRL, thus cannot give a quick response. Hopefully I can start (re?)working on this next week. Currently I took a quick look of this PR.

Some general ideas:

  1. It's good to have workspace inherited fields. But it worth a split. It can be a much smaller change than the resolver itself.

  2. Ideally, it's preferred to update on the current overlay-based approach, since I believe it can use the lazy-evaluation to calculate in reverse-dependency order, only merging on necessary. But you mentioned that the performance of iterative approach (this PR) is also competent for most cases. It would be good to have some benchmark data.

  3. The format you are using seems differs from current one. Are you using a specific formater and/or configuration? Currently we don't auto-format, but it's better to be consistent with other code.

  4. nix flake check passes for me, which is a good start. But we need more tests, both unit tests and integration tests, for the new feature syntax.

    Also, nix flake check is failing, but with an error related to gen-workspaces git sources

    I don't reproduce that failure. Is that statement outdated or there's some impurities?

I pushed this PR into the branch resolver-v2-iterative now, for whom want to test and/or use v2 features.

Comment on lines +62 to +66
if [[ -z "$edition" ]]; then
globalEdition="$(jq --raw-output '.package.edition // ""' "$cargoTomlJson")"
else
globalEdition="$edition"
fi
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are also same changes in other builders. Are they here to allow edition overriding? It's weird since edition is bounded by the code written, and using wrong edition may fail the compilation or cause unintended behavior changes. Why would this be necessary?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is related to the workspace inheritance, because crates are allowed to inherit the edition from the main workspace, and so I defined edition as a variable that we derived from the workspace and only check the Cargo.toml if that isn't defined (because we don't have access to the workspace's toml at the build script).

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is related to the workspace inheritance, because crates are allowed to inherit the edition from the main workspace, and so I defined edition as a variable that we derived from the workspace and

Ok, I understood. But again, all workspace inheritance related changes should be split into another PR.

only check the Cargo.toml if that isn't defined

Also according the cargo reference, workspace inheritance only happens when explicitly specified {key}.workspace = true.

(because we don't have access to the workspace's toml at the build script).

We can make it possible by passing additional attribute like workspaceCargoToml. I don't think one workspace can spread in multiple repositories, or can it?

Copy link
Author

@o-santi o-santi Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also according the cargo reference, workspace inheritance only happens when explicitly specified {key}.workspace = true.

Yes, what I meant by not defined is that that line was returning {workspace : true} instead of the year of the edition.

We can make it possible by passing additional attribute like workspaceCargoToml. I don't think one workspace can spread in multiple repositories, or can it?

I think the actual problem is that you need to crawl these workspaces, because a workspace might have a nested workspace that inherits from the first one, AFAIK (I might be mistaken). Otherwise that sounds plausible.

@@ -77,6 +82,7 @@ configurePhase() {
while read -r name; do
read -r path
read -r binEdition

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code here is untouched and this diff seems unnecessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, I changed that line for debugging and didn't realize that it was there when changing. Will remove!

Comment on lines 65 to +68
filter (pkg:
(lockVersion != null -> pkg.version == lockVersion) &&
(lockSource != null -> pkg.source or null == lockSource))
(pkgsByName.${lockName} or []);
(pkgsByName.${lockName} or []);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a confusing since it looks like tree lines are all arguments. But the last line is actually the second argument for filter, while everything above is its first argument as a lambda.

I'd suggest:

Suggested change
filter (pkg:
(lockVersion != null -> pkg.version == lockVersion) &&
(lockSource != null -> pkg.source or null == lockSource))
(pkgsByName.${lockName} or []);
(pkgsByName.${lockName} or []);
candidates =
filter
(pkg:
(lockVersion != null -> pkg.version == lockVersion) &&
(lockSource != null -> pkg.source or null == lockSource))
(pkgsByName.${lockName} or []);

Comment on lines 131 to +134
in if elem feat prev then
prev
else if defs' ? ${feat} then
foldl' go ([ feat ] ++ prev) defs'.${feat}
else if hasPrefix "dep:" feat then
[ { dep = substring 4 (-1) feat; } ] ++ prev
else if m == null then
[ feat ] ++ prev
else if isWeak then
[ { dep = depName; feat = depFeat; } ] ++ prev
else
[ { dep = depName; } { dep = depName; feat = depFeat; } ] ++ prev;
else if defs' ? ${feat} then
foldl' go ([ feat ] ++ prev) defs'.${feat}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I makes the first branch have different indentation to all branches below. I'd like to keep it as-is, or break the line before if.

in
  if ... then
    prev
  else if .. then
    ...

Comment on lines +158 to +159
} else
{ type = "normal";
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last branch is using a different indentation. Please align all branches.

lib/resolve.nix Show resolved Hide resolved
Comment on lines +180 to +182
lib.attrsets.foldAttrs (args: acc:
{ features = lib.lists.unique ((acc.features or []) ++ (args.features or []));
deferred = lib.lists.unique ((acc.deferred or []) ++ (args.deferred or []));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that lib.lists.unique costs O(n^2) for each invocation. I'm concerning about the performance on crates with lots of features, eg. windows-sys. Though Windows is unsupported by Nix, cross compilation and WSL are still relevant.

Also the features are unsorted (thus you have lib.naturalSort on tests). This will make the feature list passing into drvs dependent on the dep graph order (ie. definition order in Cargo.toml), thus preventing the actually-same dependency drvs to be reused.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, and I would prefer if it was as hashset of feature names, but it also wouldn't provide any order to that, and I didn't think that using attribute sets with null values for features were readable. Given that I didn't feel any performance issues I didn't think to change that, but I think I will soon.

cycle2 = testUpdate defs [ "c" ] [ "b" "c" ];
cycle3 = testUpdate defs [ "a" ] [ "a" "b" "c" ];
});
link1 = testUpdate defs [ "a" ] [ "a" ];
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there so many indentations? Are your trying to align the entries inside attrset with the let variables? They are irrelevant.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, emacs' nix-mode's doing. I think I will end up needing an auto formatter.

@@ -80,6 +80,7 @@ rec {
mkRustPackageOrWorkspace =
{ defaultRegistries, pkgsBuildHost, buildRustCrate, stdenv }@default:
{ src # : Path
, lockPath ? null
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
, lockPath ? null
, lockPath ? src + "/Cargo.lock"

Thus the null check below can be removed.

workspace_members = (if manifest ? workspace then members else []);
root_package = (if manifest ? package then [ "" ] else []);
main-workspace = if manifest ? "workspace" then manifest.workspace else {};
maybe-crates = lib.attrNames (lib.filterAttrs (path: type: type == "directory") (builtins.readDir src));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why assuming all directories are members? Even though some directories looks like a crate, but we cannot import it just because of this, since it may fail the resolution if it's actually another workspace or orphan crate.

Ideally, we need traverse through path-dependencies from root Cargo.toml for workspaces without explicit members, but I think ignoring that case for now is a more simpler and correct way to go. Also because this PR is about the v3 resolver.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I changed this because they were using crates from the sub directory without listing them in the members section (I think cargo did this). The proper way would be to crawl the Cargo toml and figure out which of them should be included, but AFAIK, all this does is list these crates as "possible crates to include", so it shouldn't make any difference if they aren't imported.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I changed this because they were using crates from the sub directory without listing them in the members section (I think cargo did this).

Yes and no. According to cargo reference,

All path dependencies residing in the workspace directory automatically become members. Additional members can be listed with the members key, ...

If one sub directory is not included by path dependencies, it's a standalone package or another workspace, which needs their own Cargo.lock. It's an uncommon usage, but still possible.

The proper way would be to crawl the Cargo toml and figure out which of them should be included, but AFAIK, all this does is list these crates as "possible crates to include", so it shouldn't make any difference if they aren't imported.

The difference is non-members should not be resolved and their features should not be merged into the current workspace. Eg. if workspace members a and b both depend on c but with different features, their features will get merged, and both a and b will be built on the c with union of all features enabled (thus they can interoperate). If b is not a member, this never happen, and they can be built on different feature set of c.

@o-santi
Copy link
Author

o-santi commented Feb 2, 2024

Thanks for your review! I will address some comments, but I think it is worth going back and rewrite it in a overlay based approach. The main ideas should stay the same, and I think it should overall be faster. I'm also interested in seeing how NixOS's merging functions work, because they do almost the same things as I was trying to achieve (concatenate lists and or booleans on merge), and they seem to work relatively well for that.

The main problem that I had when I tried using overlays was deep changing the features sets without overriding any other part (which is what made me write the mergeChanges function), because they looked so unnatural and so ugly that I had to think that I was doing something wrong.

@o-santi
Copy link
Author

o-santi commented Feb 2, 2024

But yeah, just to finish up. The seen argument present in all the functions is already the prev: argument provided in the overlay approach, I would just need to add the final and switch a little bit the functions in order to use overlay functions. I think that the main issue would be that I would still like to go with the approach that I did (of merging the attribute sets), and overlays are kind of orthogonal to that.

@oxalica
Copy link
Owner

oxalica commented Feb 2, 2024

I think it is worth going back and rewrite it in a overlay based approach.

I was also working on this and had some sketch code. It would be good to estimate its complexity and performance comparing to the current PR before the decision on algorithms. So no need to rush rewriting it :)

Could you please make another PR for the workspace inheritance (with edition passing)? We can finish that first.

@o-santi
Copy link
Author

o-santi commented Feb 2, 2024

I can, I will get it done next week. The actual code pertaining to that is much much smaller and relatively simple, compared to the resolver.

@therishidesai
Copy link

Any update on when this will be updated and merged in? I'm currently using this fork to use nocargo since I have v2 dependencies.

@rigille
Copy link

rigille commented Apr 16, 2024

@therishidesai I would recommend using this one https://gitlab.com/deltaex/nocargo until this branch merges, we've been using that in production for the last months so it tends to receive updates/fixes faster

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants