Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convenient support for out-of-tree sources #102

Open
kolloch opened this issue Feb 12, 2020 · 29 comments
Open

Convenient support for out-of-tree sources #102

kolloch opened this issue Feb 12, 2020 · 29 comments
Labels
enhancement New feature or request

Comments

@kolloch
Copy link
Collaborator

kolloch commented Feb 12, 2020

If you use crate2nix with your own project, you can either use a tools.nix based workflow in which Cargo.nix is generated in a derivation and imported. Or you can pregenerate your Cargo.nix file if you prefer or using git with sub modules.

If you want to package a crate which source is not in the tree but that you fetch with a derivation, only the tools option is currently convenient. Pregeneration of the source is not really supported.

I propose that crate2nix:

  • Automatically detects the persence of a crate-src.nix file that evaluates to a derivation fetching the sources.
  • If so, during generation it will instantiate crate-src.nix and expect the Cargo.lock/Cargo.toml file in the retrieved source.
  • The generated Cargo.nix files and the crate-hashes.json files are stored in the same file as the sources.
@andir
Copy link
Collaborator

andir commented Feb 13, 2020

That does sound good. How do you think about just providing an identiier from crates.io (as an alternative path)?

Perhaps I am thinking a bit too far here but having the ability to feed crate2nix a list of crates (from crates.io) would be nice for cli tools (and testing a bunch of crates for compatibility, caching them on a binary cache, …).

@Mic92
Copy link
Member

Mic92 commented Feb 13, 2020

Regarding inclusion into nixpkgs: How big would be the generated files?

@kolloch
Copy link
Collaborator Author

kolloch commented Feb 13, 2020

@Mic92 Pretty much proportional to the number of transitive dependencies - 135KB for Crate2nix for hundreds of dependencies. 2-3 times the size of Cargo.lock + Cargo. toml. If it were a priority to reduce that, there are some things we can do but I would not expect wonders.

Similar binaries could be grouped into virtual workspaces so that their dependencies resolve to the same versions where possible and to avoid duplication.

@kolloch
Copy link
Collaborator Author

kolloch commented Feb 13, 2020

Oh, the static part of Crate2nix code is about 17KB and can be easily factored out of needed.

@Mic92
Copy link
Member

Mic92 commented Feb 13, 2020

I wonder if recursive nix could be used to avoid having the generated files.

@kolloch
Copy link
Collaborator Author

kolloch commented Feb 13, 2020

You can easily avoid checking in the generated files with "import from derivation". With recursive nix, you have to know all your dependencies beforehand - so, no, that doesn't help.

You could code most of the functionality of crate2nix in nix itself now that the Toml parser in nix is good enough. Then you would still need to copy the Cargo.lock files into nixpkgs. And I doubt that it would be faster.

And speed is probably the main concern? Because of the template language, the dependency tree of crate2nix is e.g. 5x bigger than ripgrep.

@kolloch
Copy link
Collaborator Author

kolloch commented Feb 13, 2020

The approach with checking in the Cargo.lock file should also work in principle with naersk @nmattia

@Mic92
Copy link
Member

Mic92 commented Feb 13, 2020

Yeah speed and memory consumption are the main concern we have with evaluation of nixpkgs.

@kolloch
Copy link
Collaborator Author

kolloch commented Feb 13, 2020

I think we can cut down on file size by a factor of two or so -- afterwards it becomes difficult if it should stay readable. We at least have to preserve the info that allows feature resolution and the information necessary to retrieve the code.

@kolloch
Copy link
Collaborator Author

kolloch commented Apr 15, 2020

I pushed early but fairly complete support for out-of-tree sources!

See https://github.com/kolloch/crate2nix/blob/master/out-of-tree-sources.md for a description.

This does not (yet) try to trim down the size of the generated nix-file since that is somewhat orthogonal. Tell me what you think! @Mic92 @andir

@Mic92
Copy link
Member

Mic92 commented Apr 18, 2020

This looks really good. I just tried it. For use in nixpkgs we probably would import the Cargo.nix file into a different file to add things like metadata and some other customization. I wonder if we should default to niv to update all the sources. It handles github & git & tarballs well so far, which is more or less how all our Rust projects are built so far. I will experiment a bit.

@Mic92
Copy link
Member

Mic92 commented Apr 18, 2020

I just tried to package a couple of command line tools we have in nixpkgs:

jq -r "keys" < nix/sources.json.txt
[
  "bat",
  "exa",
  "fd",
  "hexyl",
  "lsd",
  "miniserve",
  "vivid"
]

However it run into conflicts:

$ crate2nix generate
Generated ./workspace.nix successfully.
Building ./workspace.nix workspaceMemberDirectory: done.
Generated ./Cargo.toml successfully.
Updating Cargo.lock:
  "cargo" "generate-lockfile" "--manifest-path" "./Cargo.toml"
  warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
  package:   /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/workspace_members/fd/Cargo.toml
  workspace: /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/Cargo.toml
  warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
  package:   /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/workspace_members/bat/Cargo.toml
  workspace: /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/Cargo.toml
  warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
  package:   /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/workspace_members/exa/Cargo.toml
  workspace: /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/Cargo.toml
  warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
  package:   /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/workspace_members/hexyl/Cargo.toml
  workspace: /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/Cargo.toml
  warning: profiles for the non root package will be ignored, specify profiles at the workspace root:
  package:   /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/workspace_members/miniserve/Cargo.toml
  workspace: /home/joerg/git/nixpkgs/pkgs/tools/rust-applications/Cargo.toml
      Updating crates.io index
  error: failed to select a version for `git2`.
      ... required by package `bat v0.13.0 (/home/joerg/git/nixpkgs/pkgs/tools/rust-applications/workspace_members/bat)`
  versions that meet the requirements `^0.13` are: 0.13.2, 0.13.1, 0.13.0

  the package `git2` links to the native library `git2`, but it conflicts with a previous package which links to `git2` as well:
  package `git2 v0.9.1`
      ... which is depended on by `exa v0.9.0 (/home/joerg/git/nixpkgs/pkgs/tools/rust-applications/workspace_members/exa)`

  failed to select a version for `git2` which could resolve this conflict
Error: "cargo" "generate-lockfile" "--manifest-path" "./Cargo.toml"
=> exited with: 101

Here is the full list of sources:

sources.nix.txt
sources.json.txt

{
  "sources": {
    "bat": {
      "type": "Nix",
      "import": "nix/sources.nix",
      "attr": "bat"
    },
    "exa": {
      "type": "Nix",
      "import": "nix/sources.nix",
      "attr": "exa"
    },
    "fd": {
      "type": "Nix",
      "import": "nix/sources.nix",
      "attr": "fd"
    },
    "hexyl": {
      "type": "Nix",
      "import": "nix/sources.nix",
      "attr": "hexyl"
    },
    "lsd": {
      "type": "Nix",
      "import": "nix/sources.nix",
      "attr": "lsd"
    },
    "miniserve": {
      "type": "Nix",
      "import": "nix/sources.nix",
      "attr": "miniserve"
    },
    "vivid": {
      "type": "Nix",
      "import": "nix/sources.nix",
      "attr": "vivid"
    }
  }
}

Do you think there is a way to work around these conflicts?

@kolloch
Copy link
Collaborator Author

kolloch commented Apr 18, 2020

Hi @Mic92, Thank you so much for your experiments!

Conflicts

Aaah, to bad about the conflicts in the native libraries. Since crate2nix would libs independently again, I think that this would actually not lead to compile time errors. But the cargo error blocks this.

I see two ways forward:

  1. Merge the various Cargo.lock files.
    1a. Require a Cargo.lock file to already exist in the source. Makes the code easier but makes it much more likely that we have the same dependency with different minor versions in different Cargo.lock files.
    1b. Generate new Cargo.lock files for each source from the same index state. On cargo update call per source.
  2. Resolve the dependencies independently in crate2nix. For crates in the standard index this is fairly easy (see below). We would need to prefetch dependencies from git. BTW, it would be nice if nix-prefetch-git had the store path as part of the JSON output.

I'll give it a go and report back. It would basically mean that the Cargo.toml/Cargo.lock would go away. At the same time, I hated having them as a "clutch" and liked having them because you still could use Cargo to build.

Resolving deps without Cargo

I am fairly confident that I could write the resolver in rust. It is actually quite easy to read the index and so on. Similar to a vendored index, this resolving could even happen in a derivation (well, not in nixpkgs because of the import-from-derivation not-wanted issue) I implemented the version matching in nix today/yesterday, because, ...ehem... an invisible force made me. ;)

https://github.com/kolloch/nix-cargo-index

It is not terribly fast, though, but fun to play with! The most complex piece is the semver parser. I wonder if someone already wrote one...

In rust code, I could just use the existing crate to do that..

niv

I basically included the "Nix" source type also as an escape hatch to do crazy stuff but mostly for niv. It definitely makes sense for github sources.

Ergonomically, I could even make using niv sources easier by including an option to make "--import ./nix/sources.nix" the default for "source add nix". In that case you'd only have to write source add nix ..your name in the sources....

@Mic92
Copy link
Member

Mic92 commented Apr 18, 2020

Hi @Mic92, Thank you so much for your experiments!

I see two ways forward:

  1. Merge the various Cargo.lock files.
    1a. Require a Cargo.lock file to already exist in the source. Makes the code easier but makes it much more likely that we have the same dependency with different minor versions in different Cargo.lock files.

We require so far Cargo.lock for most applications, when using buildRustPackage, so its not a regression.

1b. Generate new Cargo.lock files for each source from the same index state. On cargo update call per source.

This has the advantage though that we build the same version that upstream tests.

  1. Resolve the dependencies independently in crate2nix. For crates in the standard index this is fairly easy (see below). We would need to prefetch dependencies from git. BTW, it would be nice if nix-prefetch-git had the store path as part of the JSON output.

Should be straightforward to fix. Otherwise does nix-prefetch work for you?

I'll give it a go and report back. It would basically mean that the Cargo.toml/Cargo.lock would go away. At the same time, I hated having them as a "clutch" and liked having them because you still could use Cargo to build.

Resolving deps without Cargo

I am fairly confident that I could write the resolver in rust. It is actually quite easy to read the index and so on. Similar to a vendored index, this resolving could even happen in a derivation (well, not in nixpkgs because of the import-from-derivation not-wanted issue) I implemented the version matching in nix today/yesterday, because, ...ehem... an invisible force made me. ;)

Would this produce the same results as cargo's dependencies tree.

kolloch/nix-cargo-index

It is not terribly fast, though, but fun to play with! The most complex piece is the semver parser. I wonder if someone already wrote one...

Do you think it could run into performance problems if we have more packages?
i.e. 100-1000?
I think this is what most projects where using: https://github.com/steveklabnik/semver

niv

I basically included the "Nix" source type also as an escape hatch to do crazy stuff but mostly for niv. It definitely makes sense for github sources.

Ergonomically, I could even make using niv sources easier by including an option to make "--import ./nix/sources.nix" the default for "source add nix". In that case you'd only have to write source add nix ..your name in the sources....

Yes that would be great! Right now we would have redundant information.

@Mic92
Copy link
Member

Mic92 commented Apr 18, 2020

store path support for nix-prefetch-git: NixOS/nixpkgs#85511

@kolloch
Copy link
Collaborator Author

kolloch commented Apr 18, 2020

store path support for nix-prefetch-git: NixOS/nixpkgs#85511

Thank you :)

@kolloch
Copy link
Collaborator Author

kolloch commented Apr 18, 2020

Cargo.lock vs other dependency resolution: I think you lean towards going with the Cargo.lock files from upstream for now. And I agree.

Advantage: Same versions as upstream, probably working better. Disadvantage: More minor version differences.

I agree. Overall, reliability is more important than squashing build time. I'll try to make this work.

If we get into trouble with scaling to a lot of packages, the workaround is simple: Use multiple workspaces. That said, I could imagine that it scales quite well. Not sure about nix reading that much data though. I already thought about splitting it into multiple files...

kolloch added a commit that referenced this issue May 4, 2020
I first built the out-of-tree source support from Cargo workspaces.
This had the advantage and the disadvantage of resolving the crate
versions over all out-of-tree sources.

See discussion in #102.
@Mic92
Copy link
Member

Mic92 commented May 4, 2020

The nix-prefetch-git is merged now.

kolloch added a commit that referenced this issue May 4, 2020
Lock files are no longer merged by generated a Cargo workspace but by
crate2nix itself.

There is `tools.nix` support but it is still a little weird.
@kolloch
Copy link
Collaborator Author

kolloch commented May 4, 2020

Here is the full list of sources:

sources.nix.txt
sources.json.txt

Interesting enough, I get a niv error if I use those...

❯ niv show
miniserve
  homepage: <barabajagal>
  url: https://github.com/svenstaro/miniserve/archive/4934905d87c1d9fb1d6deaa3adf7908450da4410.tar.gz
  owner: svenstaro
  branch: master
  url_template: https://github.com/<owner>/<repo>/archive/<rev>.tar.gz
  repo: miniserve
  type: tarball
  sha256: 1a7hqpr71ffsxaj1cvbf479cxqzif2byif1j6lq2km2zszwp3sra
  description: :star2: For when you really just want to serve some files over HTTP right now!
  rev: 4934905d87c1d9fb1d6deaa3adf7908450da4410
exa
  homepage: https://the.exa.website/
  url: https://github.com/ogham/exa/archive/78ba0b8973cbfbb48b85b8beae6f7de412064d11.tar.gz
  owner: ogham
  branch: master
  url_template: https://github.com/<owner>/<repo>/archive/<rev>.tar.gz
  repo: exa
  type: tarball
  sha256: 02mppg3bca6fxizvwga4nvpq8yhwpgbf29szgz8qaf84hqphzm58
  description: A modern version of niv: <stdout>: commitBuffer: invalid argument (invalid character)

kolloch added a commit that referenced this issue May 4, 2020
@kolloch
Copy link
Collaborator Author

kolloch commented May 4, 2020

The "workspace-less" out-of-tree sources support is ready for testing.

Unfortunately, some of the crates that you wanted to test, fail to build. It is not impossible that I introduced an error with my changes. It is also possible that they fail to build individually.

@Mic92
Copy link
Member

Mic92 commented May 4, 2020

I can try some other ones.

@kolloch
Copy link
Collaborator Author

kolloch commented May 5, 2020

Some where about needing the nightly compiler, some were about missing native dependencies (solvable with overrides hopefully).

@Mic92
Copy link
Member

Mic92 commented May 5, 2020

I tried with coloursum, mdcat, ripgrep and tre. Works well so far.
In nixpkgs we could implement it like this:

rustPackages.nix:

{ newScope, workspaceMembers }: let
  callPackage = newScope workspaceMembers;
in {
  ripgrep = callPackage ./ripgrep {};
  tre = callPackage ./tre {};
}

ripgrep

{ ripgrep }:
ripgrep.overrideAttrs (old: {
  preFixup = ''
    (cd target/release/build/ripgrep-*/out
    installManPage rg.1
    installShellCompletion rg.{bash,fish})
    installShellCompletion --zsh "$src/complete/_rg"
  '';
  meta = {
    # ...
  };
})

There is one thing that bugs me a bit however. To build rust packages we not have potentially 3 locations to touch:

  1. Adding the source either with niv or your builtin fetcher support.
  2. Adding crates overrides.
  3. The entry in rustPackages to add metadata and further overrides

The other problem is that we currently have no way of specifying patches.

In an ideal world one could have a derivation file that were crate2nix would take source + patches from as well as the meta data:

{ ripgrep, fetchFromGitHub, fetchpatch }:
ripgrep.overrideAttrs (old: {
  src = fetchFromGitHub {
    owner = "BurntSushi";
    repo = pname;
    rev = version;
    sha256 = "1c0v51s05kbg9825n6mvpizhkkgz38wl7hp8f3vzbjfg4i8l8wb0";
  };
  patches = [(fetchpatch {})];
  preFixup = ''
    (cd target/release/build/ripgrep-*/out
    installManPage rg.1
    installShellCompletion rg.{bash,fish})
    installShellCompletion --zsh "$src/complete/_rg"
  '';
  meta = {
    # ...
  };
})

crate2nix would it could call it like that to get the source

(callPackage ./ripgrep {
  # actually one could import `rustPackages.nix` and do a `builtins.attrNames` to
  # get all package attributes to generate the dummy derivations. 
  ripgrep = stdenv.mDerivation { 
    name = "ripgrep";
  };
}).overrideDerivation (old: {
  phases = [ "unpackPhase" "patchPhase" "installPhase" ];
  installPhase = ''
    mkdir -p $out
    cp * $out
  '';
})

After building the source it could update the workspaceMember as before. Does this sound reasonable to you? This way we don't need special support for any fetcher logic + patch logic we have in nixpkgs.

@Mic92
Copy link
Member

Mic92 commented May 5, 2020

I now you like having your builtin prefetcher to not having to manually calculating checksums. However we have already more advanced tools for that like https://github.com/ryantm/nixpkgs-update or https://github.com/Mic92/nix-update that can handle way more than fetchgit.

@kolloch
Copy link
Collaborator Author

kolloch commented May 5, 2020

Hi @Mic92, I think that automatic updates would be awesome. Let's optimize the flow! I can't immediately see what requirements a package has to fulfill to be automatically updateable.

In the case of rust, the "auto-update" should be on the root package with its Cargo.lock. The other versions should still be whatever the Cargo.lock specifies. Correct?

E.g. with niv it would mean that we update the sources and the rerun crate2nix generate.

We could hook this into the update script mechanism. (I haven't dabbled with that before)

It would help me, if you could tell me what "interface" or "package format" crate2nix has to provide to make the packages auto-updateable. Maybe, the tools are already quite general, or we need to tune them a bit to work with this set up.

I can probably write the rust-specific glue if I know how it should look like.

@kolloch
Copy link
Collaborator Author

kolloch commented May 5, 2020

@Mic92 also, did you see my email about a VC?

@Mic92
Copy link
Member

Mic92 commented Jul 24, 2020

@kolloch do you still have time to work on this? If not we maybe should just describe the design we discussed and hope that someone else from the community can help on this.

@maisiliym
Copy link

There are crates without lock files, like serde_json, so it should be able to try without one.

@ghost
Copy link

ghost commented Sep 1, 2022

(I'm assuming that the new crate2nix source command is what this issue turned into).

Is there a best practice here for how to use this feature when upstream doesn't include a Cargo.lock?

crate2nix seems to insist on it being in the same directory as Cargo.toml, but with crate2nix source that directory is an immutable store path, so I can't put it there between running crate2nix source generate and crate2nix generate:

https://github.com/kolloch/crate2nix/blob/198de9f237cb97ad32e0da7fb914b8e8d9fd87b7/crate2nix/src/lib.rs#L228

Should I submit a PR to add a command-line flag to tell crate2nix generate where to look for the Cargo.lock? Or is there some much simpler way to deal with this that I've just totally missed? rg Cargo.lock sample_* didn't turn up any ideas.

@kolloch kolloch removed this from the 0.9 milestone Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants