Support revisions and more #1775

Merged

Conversation

andreabedini
Member

Hello,

this PR consolidates my efforts of the past few weeks. It solves the problem of supporting revisions (#1675) and streamlines the creation of the nix plan.

The major changes are to call-cabal-project-to-nix.nix.

I replace cabal v2-freeze with another tool (./nix-tools/make-install-plan) that

  1. computes an install plan just like v2-freeze would
  2. writes a plan.json
  3. writes a cabal.freeze file
  4. splits out the cabal files for all the configured packages in the plan (pre-existing packages don't have a cabal file in the plan)
  5. converts those cabal files to nix using Cabal2Nix

I changed plan-to-nix to reference the cabal files and their nix versions in the nix plan (when available). When we don't have a cabal file in the plan (I assume this happens only for the pre-existing packages) we fall back to the old reference to hackage.nix.

e.g. the nix plan for a simple package that depends on zlib looks like this

{
  pkgs = hackage:
    {
      packages = {
        bytestring.revision = (((hackage.bytestring)."0.10.12.0").revisions).default;
        zlib.revision = import ./cabal-files/zlib.nix;
        zlib.cabalFile = ./cabal-files/zlib.cabal;
        zlib.flags.non-blocking-ffi = false;
        zlib.flags.bundled-c-zlib = false;
        zlib.flags.pkg-config = false;

./cabal-files/zlib.cabal is the exact revision used by cabal for the plan (according to the index-state, for example).
./cabal-files/zlib.nix is generated from the same cabal file.

There's also another change I had to make around how we pass repositories to cabal. I realised that since we rewrite the repository URLs, the install plan will have the wrong pkg-src:

      "pkg-src": {
        "type": "repo-tar",
        "repo": {
          "type": "secure-repo",
          "uri": "file:/nix/store/9df5wp18lssqk3w1nvw32kiwqi99793x-hackage-repo-hackage.haskell.org-at-2022-11-09T000000Z"
        }
      },

Rather than working around the issue (e.g. by another bunch of substitutions), I found a way to let cabal fetch the repositories without messing up the URLs. Basically I made a fake version of curl that maps URLs to paths in the nix store. E.g.

    echo "Installing fake curl"
    PATH=${
      fakeCurl ({
        "http://hackage.haskell.org" = hackageRepo { index-state = cached-index-state; sha256 = index-sha256-found; };
      } // inputMap)
    }/bin:$PATH

This implementation is quite crude: a shell wrapper that does a simple substitution on the curl arguments, turning, e.g., http://hackage.haskell.org into file:///nix/store/.... I have another implementation which uses an actual webserver. Another alternative would be to wrap wget instead, which has a much simpler command-line interface.
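
For illustration only (this is not the exact fakeCurl used in the PR), such a wrapper could look roughly like this, with hackageRepoPath standing in for the store path of the downloaded repository:

    # Sketch of a curl shim that rewrites a URL prefix to a nix store path
    # before delegating to the real curl. Names and structure are illustrative.
    { pkgs, hackageRepoPath }:

    pkgs.writeShellScriptBin "curl" ''
      args=()
      for a in "$@"; do
        case "$a" in
          http://hackage.haskell.org*)
            # rewrite http://hackage.haskell.org/... to file:///nix/store/.../...
            a="file://${hackageRepoPath}''${a#http://hackage.haskell.org}"
            ;;
        esac
        args+=("$a")
      done
      exec ${pkgs.curl}/bin/curl "''${args[@]}"
    ''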

This involved quite a few changes and, as it stands, it gets rid of dotCabal entirely. (extra-hackages is just a collateral victim; it could be restored, maybe.)

Now the plan has the right information

      "pkg-src": {
        "type": "repo-tar",
        "repo": {
          "type": "secure-repo",
          "uri": "http://hackage.haskell.org/"
        }
      },

and so the nix version can have the right source too

    src = (pkgs.lib).mkDefault (pkgs.fetchurl {
      url = "http://hackage.haskell.org/package/zlib-0.6.2.3/package/zlib-0.6.2.3.tar.gz";
      sha256 = "807f6bddf9cb3c517ce5757d991dde3c7e319953a22c86ee03d74534bd5abc88";
      });
    }

I like this approach because it allows us to just do cabal update without having to parse and rewrite anything. Extra repositories and index-states are naturally managed by cabal. We only need to provide an input map (which we do already).
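
For reference, the input map is the same attribute set mapping repository URLs to locally fetched content (e.g. flake inputs) that projects already pass to haskell.nix today; roughly something like this (assuming a flake input named CHaP):

    haskell-nix.cabalProject' {
      src = ./.;
      compiler-nix-name = "ghc925";
      # map the remote repository URL to the locally available content
      inputMap = {
        "https://input-output-hk.github.io/cardano-haskell-packages" = inputs.CHaP;
      };
    }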

The last bit was rerouting the cabal file all the way down to the builder, sometimes replacing the revision.

The result is that this works™. I checked I can build cardano-node with it.

Final notes:

  • While I think this is the right approach, the implementation still leaves some technical debt. In particular, make-install-plan and plan-to-nix should be merged into a single program. As it stands the former converts the plan to JSON and the latter parses the JSON back again.
  • This might or might not trigger other simplifications in the codebase but, not having a good picture of it, I cannot tell.
  • AFAIK there are only two references to hackage.nix left: one to get the index (basically only to get index-state-hashes.nix) and the other for cabal's pre-existing packages.
  • I haven't even looked at the tests. Likely I broke something.

I am keen on receiving feedback. Let me know what you think of this.

@hamishmack
Collaborator

bors try

iohk-bors bot added a commit that referenced this pull request Nov 11, 2022
@iohk-bors
Contributor

iohk-bors bot commented Nov 11, 2022

try

Build failed:

@hamishmack
Collaborator

Did you check that this:

zlib.cabalFile = ./cabal-files/zlib.cabal;

Does not trigger unwanted rebuilds? I can't help worrying that this will include the hash of the plan in the hash for the library derivation itself.

Does every change to the plan result in a rebuild of zlib?

I think we discussed encoding the cabal files as nix strings to avoid this if it is a problem. Using __readFile ./cabal-files/zlib.cabal might also work.
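
For illustration, the two options look roughly like this (cabalFileText is a made-up name for this example):

    {
      # as a path: the cabal file is referenced as a file
      zlib.cabalFile = ./cabal-files/zlib.cabal;

      # as a string: the contents are embedded at evaluation time
      # (builtins.readFile is the same primitive as __readFile)
      zlib.cabalFileText = builtins.readFile ./cabal-files/zlib.cabal;
    }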

@michaelpj
Collaborator

Rather than working around the issue (e.g. by another bunch of substitutions), I found a way to let cabal fetch the repositories without messing up the URLs. Basically I made a fake version of curl that maps URLs to paths in the nix store.

Yikes. This sounds a bit scary.

The thing I would naturally think to do in this case is to get cabal to use a file:/ repository pointing to the tarball that we have. Would that not allow cabal to do its thing normally without requiring us to intercept HTTP requests, which feels quite fragile?

@michaelpj
Collaborator

I don't really have a good sense of whether this makes sense. The main thing that makes me a bit nervous is the curl hack.

inherit parseIndexState parseSourceRepositoryPackages parseRepositories
# These are only exposed for tests
parseSourceRepositoryPackageBlock parseRepositoryBlock;
inherit parseIndexState parseSourceRepositoryPackages;
Collaborator

Can your tool also spit out at least the index-states? would that let us fix the problem with us parsing them wrong? I guess the s-r-ps are harder because we rely on the magic comments...

@@ -540,7 +494,6 @@ final: prev: {
evalPackages = final.lib.mkDefault evalPackages;
inputMap = final.lib.mkDefault inputMap;
} ];
extra-hackages = args.extra-hackages or [] ++ callProjectResults.extra-hackages;
Collaborator

did you purge this from everywhere? since now it seems like it's totally unsupported

@andreabedini
Member Author

@hamishmack

Does not trigger unwanted rebuilds? I can't help worrying that this will include the hash of the plan in the hash for the library derivation itself.

It seems to only include the hash of the cabal file itself, which (I guess) would be fine.

      "prePatch": "cat /nix/store/2l2hnj92khrxkny7cwbzq5l3diwy30nb-zlib.cabal > zlib.cabal",

To test this I made a simple empty package, built it adding zlib as a dependency, and built it again adding also another package (foldl). The derivation of zlib does not seem to have changed.

 {
-  "/nix/store/gvx4nv5bab8bhkf95vc8qzh6r73pblla-tmp-project2-exe-tmp-project2-0.1.0.0.drv": {
+  "/nix/store/xmr2ihs54kcs3a0zsrjyfy9mdgzsfvx5-tmp-project2-exe-tmp-project2-0.1.0.0.drv": {
     "outputs": {
       "data": {
-        "path": "/nix/store/a9r9pdd0kfh0ibwm8xlqc8sk1ps97a8h-tmp-project2-exe-tmp-project2-0.1.0.0-data"
+        "path": "/nix/store/b65sarw7gnjq4aqwnvqxz9x05k8iaj7v-tmp-project2-exe-tmp-project2-0.1.0.0-data"
       },
       "out": {
-        "path": "/nix/store/sf3wpjw5p52gdf4lcxai2jx6xaplnjhs-tmp-project2-exe-tmp-project2-0.1.0.0"
+        "path": "/nix/store/ihf8zzwzqx7shl6nqii5d78zgl4dmd3a-tmp-project2-exe-tmp-project2-0.1.0.0"
       }
     },
     "inputSrcs": [
-      "/nix/store/zhmajpzng3jjzqfjrs05nfrd8qknbbqy-source-root-exe-tmp-project2"
+      "/nix/store/0zrc9ynjpvfwsa9r9l3kisngxm2k5hib-source-root-exe-tmp-project2",
     ],
     "inputDrvs": {
       "/nix/store/3p8lp2fzr30nws00x811lg7k9z5axg45-glibc-locales-2.35-163.drv": [
         "out"
       ],
-      "/nix/store/4lcnpwddj2ii5dj01j5djn48k90jl80i-tmp-project2-exe-tmp-project2-0.1.0.0-config.drv": [
-        "out"
-      ],
       "/nix/store/f6k3g1wyzb3wk9rdvq60ifkd7pxvxzwa-bash-5.1-p16.drv": [
         "out"
       ],
@@ -28,13 +25,19 @@
       "/nix/store/i25q6bwyb6xjr59vbly2yr4785i0i977-remove-references-to.drv": [
         "out"
       ],
+      "/nix/store/i3511kdvz1mk918psg6isvcfz0mng4kf-tmp-project2-exe-tmp-project2-0.1.0.0-ghc-8.10.7-env.drv": [
+        "out"
+      ],
       "/nix/store/l5z5blrzvmkkxzah8j9mmhhdal3ppywh-stdenv-linux.drv": [
         "out"
       ],
+      "/nix/store/lxvgibbmjfxr0kmmzw89vg6pr7gvx7r6-foldl-lib-foldl-1.4.12.drv": [
+        "out"
+      ],
       "/nix/store/qjlndz62lyrb1xwfk5lysadsfsaqgwx6-zlib-lib-zlib-0.6.3.0.drv": [
         "out"
       ],
-      "/nix/store/ykj2l9y4g4hj3ncpd3hc4pdqpn5bc496-tmp-project2-exe-tmp-project2-0.1.0.0-ghc-8.10.7-env.drv": [
+      "/nix/store/v8ccikkvdinwv21z809zyqqwks08l79g-tmp-project2-exe-tmp-project2-0.1.0.0-config.drv": [
         "out"
       ]
     },
     ...

@andreabedini
Member Author

andreabedini commented Nov 14, 2022

@michaelpj

The thing I would naturally think to do in this case is to get cabal to use a file:/ repository pointing to the tarball that we have. Would that not allow cabal to do its thing normally without requiring us to intercept HTTP requests, which feels quite fragile?

Notice that we cannot "just" point cabal to a 01-index.tar.gz or even to a checked out CHaP repo. Cabal will insist on generating 01-index.cache and 01-index.tar.idx right next to those files (which it cannot do because they live in /nix/store). So the previous solution was to do one extra derivation (per repo, I guess) to move the files to a tmp spot and do cabal update there, and then link that output back into .cabal/packages. Also we had to know what repositories cabal would try to access ahead of time, because we needed to know what to link into `.cabal/packages`.

In this way we are free to do cabal update inside the build scripts without having to care about any of this. In particular we don't need to parse cabal.project.

The shell script that wraps curl does feel a bit hacky, yes. I think the solution based on a webserver accessible over a local unix socket is cleaner: https://gist.github.com/andreabedini/d726c191fd7b6eb93dd954115d679547

@michaelpj
Collaborator

A couple of questions:

  1. What happens if we provide a cabal.project.local that includes a repository stanza for a repository that already exists in cabal.project but with a different location? Will it override it?
  2. You can cabal update a specific repository. Can you cabal update a file:/ repository without hitting the network?

Where I'm going with this is: what if we added an overriding repository stanza for each repository that pointed to a file:/ repository with our remote repo content, and then cabal updated just those repositories in a way that doesn't hit the network. Does this make any sense at all?

@andreabedini
Member Author

  1. What happens if we provide a cabal.project.local that includes a repository stanza for a repository that already exists in cabal.project but with a different location? Will it override it?

It turns out they all get appended, despite having the same identifier. Clearly this is a code path that nobody has exercised before.

        , projectConfigRemoteRepos =
            [ RemoteRepo
                { remoteRepoName = RepoName
                    { unRepoName = "hackage.haskell.org" }
                , remoteRepoURI = http://hackage.haskell.org/
                , remoteRepoSecure = Just True
                , remoteRepoRootKeys =
                    [ "fe331502606802feac15e514d9b9ea83fee8b6ffef71335479a2e68d84adc6b0"
                    , "1ea9ba32c526d1cc91ab5e5bd364ec5e9e8cb67179a471872f6e26f0ae773d42"
                    , "2c6c3627bd6c982990239487f1abd02e08a02e6cf16edb105a8012d444d870c3"
                    , "0a5c7ea47cd1b15f01f5f51a33adda7e655bc0f0b0615baa8e271f4c3351e21d"
                    , "51f0161b906011b52c6613376b1ae937670da69322113a246a09f807c62f6921"
                    ]
                , remoteRepoKeyThreshold = 3
                , remoteRepoShouldTryHttps = True
                }
            , RemoteRepo
                { remoteRepoName = RepoName
                    { unRepoName = "cardano-haskell-packages" }
                , remoteRepoURI = https://input-output-hk.github.io/cardano-haskell-packages
                , remoteRepoSecure = Just True
                , remoteRepoRootKeys =
                    [ "aaa"
                    , "bbb"
                    , "ccc"
                    ]
                , remoteRepoKeyThreshold = 0
                , remoteRepoShouldTryHttps = False
                }
            , RemoteRepo
                { remoteRepoName = RepoName
                    { unRepoName = "cardano-haskell-packages" }
                , remoteRepoURI = https://somewhere-else
                , remoteRepoSecure = Just True
                , remoteRepoRootKeys =
                    [ "ddd"
                    , "eee"
                    , "fff"
                    ]
                , remoteRepoKeyThreshold = 0
                , remoteRepoShouldTryHttps = False
                }
            ]
  2. You can cabal update a specific repository. Can you cabal update a file:/ repository without hitting the network?

You can

Where I'm going with this is: what if we added an overriding repository stanza for each repository that pointed to a file:/ repository with our remote repo content, and then cabal updated just those repositories in a way that doesn't hit the network. Does this make any sense at all?

The result is that the plan will still have file:///nix/store/... as URIs. To be clear, it's not a big problem, we had a workaround before. But we will have to do a bunch of string substitutions to remove those paths from the plan and I wanted to avoid doing that.

@michaelpj
Collaborator

The result is that the plan will still have file:///nix/store/... as URIs.

Continuing my dumb questions: why is this a problem at all?

@andreabedini
Member Author

Continuing my dumb questions: why is this a problem at all?

Because we don't want the plan to depend on how it's built. The nix version of the plan should be entirely executable on its own. E.g. it should say: plutus-ledger-api from https:///... at hash 11223344 depends on lens-5.0.1 from https://hackage.haskell.org/package/lens-5.2/lens-5.2.tar.gz at hash xxxyyyzzz with this particular cabal file, etc, etc.

@michaelpj
Collaborator

Because we don't want the plan to depend on how its built. The nix version of the plan should be entirely executable on its own.

... why? The plan is there to be consumed by more Nix code? I guess there's a problem if it depends on stuff in the Nix store that might not be there (e.g. if you've materialised it)?

Anyway, so trying to again phrase the problem:

  • We want cabal to plan for us, and we'd like it to do that with downloaded repository that we have (?) and without hitting the network for the repository (?) although we're allowed to do it for the remote sources (?) so we can get their hashes (?)
  • We want to have a build plan that refers to the remote repository for package sources
  • We want to be able to build the plan just using the downloaded repository that we have and we're allowed to hit the network because we've got FODs for all the remote sources (?)

@michaelpj
Collaborator

michaelpj commented Nov 16, 2022

More dumb questions:

There's also another change I had to make around how we pass repositories to cabal. I realised that since we rewrite the repository urls, the install plan will have the wrong pkg-src:

Can we just... not rewrite the repository URLs at this stage? Why are we doing that? Could we let cabal generate the plan using the normal repository URLs, and then just mess with it later when we're building? 🤔

@andreabedini
Member Author

andreabedini commented Nov 17, 2022

... why? The plan is there to be consumed by more Nix code? I guess there's a problem if it depends on stuff in the Nix store that might not be there (e.g. if you've materialised it)?

One fundamental thing is that changing the plan should not arbitrarily rebuild everything (which would happen if every package derivation depended on the plan). We already do this, it's one of the main features of haskell.nix.

Can we just... not rewrite the repository URLs at this stage? Why are we doing that? Could we let cabal generate the plan using the normal repository URLs, and then just mess with it later when we're building? 🤔

I find it cleaner if the produced plan is as self-contained as possible, but I have to agree it is not a hard requirement. I think we could still have a plan that points to nix paths for the packages. @hamishmack @angerman what do you think?

@hamishmack
Collaborator

Will this break https://github.com/mlabs-haskell/haskell-nix-extra-hackage ?

@hamishmack
Collaborator

CC @L-as @brainrake

@angerman
Collaborator

I'm not much of a fan of the curl thing, but Hamish also argued it uses the default path for cabal, i.e. the same codepaths are executed as outside. I still wonder how much of this we could push into "fixing cabal" rather than working around it.

@andreabedini
Member Author

@hamishmack

Will this break https://github.com/mlabs-haskell/haskell-nix-extra-hackage ?

🤔 Likely. I will make sure that it keeps working.

@angerman

I'm not much of a fan of the curl thing, but Hamish also argued it uses the default path for cabal, i.e. the same codepaths are executed as outside. I still wonder how much of this we could push into "fixing cabal" rather than working around it.

I know! But I wouldn't know what exactly we need to fix. The problem is that cabal needs to "initialise the repo" and there's no other way than doing cabal update to trigger that, but if we do it wrong cabal update will also actually try to update the index. I wonder if I could make another utility that only "initialises the repo" (I keep using quotes because I am not entirely sure what this consists of).

@L-as
Contributor

L-as commented Nov 22, 2022

We just depend on being able to pass hackages to cabalProject'

@michaelpj
Collaborator

@L-as could you migrate to using something like CHaP instead?

@andreabedini
Member Author

andreabedini commented Nov 29, 2022

I went to remove the curl hack as discussed on Slack and I realised it's really not necessary at all. The reason we get

          "uri": "file:/nix/store/9df5wp18lssqk3w1nvw32kiwqi99793x-hackage-repo-hackage.haskell.org-at-2022-11-09T000000Z"

is because we leave that URL in .cabal/config after bootstrapping the hackage tarball, but we don't have to. Indeed for extra repositories like CHaP we already do something slightly different: we use a nix store path to bootstrap but we then use the original cabal configuration for the rest. Cabal still recognises the bootstrapped repo and won't do any network IO.

What I changed is basically to bootstrap hackage just like we do with extra repositories, which is: 1) download the tarball, 2) make a temporary .cabal/config with the nix store path, 3) run cabal v2-update, and 4) copy .cabal/packages/hackage.haskell.org to $out. That will be a valid repository that cabal will later recognise as hackage (when linked into .cabal/packages/hackage.haskell.org).
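
A rough sketch of that bootstrap step (hypothetical helper, not the PR's exact code; root keys and the other secure-repository settings are omitted for brevity):

    # Initialise a repository by running cabal v2-update against a throwaway
    # config pointing at the nix store content, then keep the resulting
    # .cabal/packages/<repo> directory as the derivation output.
    { pkgs, repoName ? "hackage.haskell.org", repoPath }:

    pkgs.runCommand "bootstrapped-${repoName}"
      { nativeBuildInputs = [ pkgs.cabal-install ]; } ''
        export HOME=$(mktemp -d)
        mkdir -p $HOME/.cabal
        {
          echo "repository ${repoName}"
          echo "  url: file:${repoPath}"
        } > $HOME/.cabal/config
        # a file: repository needs no network IO to update
        cabal v2-update ${repoName}
        cp -r $HOME/.cabal/packages/${repoName} $out
      ''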

Also, I did stress test that the builder derivations do not depend on the plan itself. I kept changing the plan derivation (to add comments, simplify scripts, etc.) and doing nix build would rebuild the plan, but rebuilding the project was a no-op.

edit:

I should have put back all the extra-hackages stuff.

@angerman
Collaborator

angerman commented Dec 1, 2022

Cabal2nix?

@andreabedini
Member Author

Cabal2nix?

Right, I am referring to nix-tools/lib/Cabal2Nix.hs. This is not nixpkgs's cabal2nix!

@michaelpj
Collaborator

Where are we with this?

@andreabedini
Member Author

Where are we with this?

We were waiting for hadrian support to be merged, which has just happened. Now I am updating the materialisations to check that everything is ok.

@andreabedini
Member Author

Reporting here: @hamishmack has updated all the materialisations but we discovered a bug. In a situation where a plan contains different versions of the same package (e.g. cabal building cabal), haskell.nix somehow applies the cabal file revision from hackage to the local package, which is very wrong.

This prompted me to fix a design wart that should eradicate this kind of issues. I know we are all keen to get this in but it's better to do it right (given the cost of rebuilding everything and busting all the caches).

The design wart is that I am passing the cabal files along with their nix versions, while they should be inside the nix versions. Given they are keyed by name, it's too easy to mix them up.
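
A hypothetical sketch of the restructuring (not the PR's final attribute names): the generated package expression carries its own cabal file reference, instead of the plan holding two sibling attributes keyed by package name:

    {
      # ...whatever plan-to-nix already generates for the package...
      package = { identifier = { name = "zlib"; version = "0.6.3.0"; }; };
      # the cabal file now travels with the package expression itself
      cabal-file = ./zlib.cabal;
    }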

@andreabedini mentioned this pull request Dec 20, 2022
Cabal files are now obtained from the index tarball (as cabal does)
directly. This allows us to always pick the cabal file revision that
cabal would pick, without having to understand or reference hackage.
hamishmack and others added 7 commits January 27, 2023 14:47
Nix derivations are already able to pass an attribute to the builder as a
file. This means we don't need to use pkgs.writeText to turn the
cabalFile attribute into a file, saving one derivation.
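
For illustration, the mechanism referred to here is passAsFile; a minimal sketch with made-up attribute names:

    pkgs.runCommand "apply-cabal-file" {
      cabalFileContents = builtins.readFile ./zlib.cabal;
      # passAsFile makes the builder see the value as a file
      # ($cabalFileContentsPath) instead of an environment variable
      passAsFile = [ "cabalFileContents" ];
    } ''
      cp "$cabalFileContentsPath" $out
    ''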