Skip to content
This repository has been archived by the owner on Oct 25, 2023. It is now read-only.

refactor: split code for lake update and lake build #138

Merged
merged 5 commits into from
Nov 24, 2022

Conversation

gebner
Copy link
Member

@gebner gebner commented Nov 22, 2022

Fixes #119 and fixes #137.

The code for lake update and lake build is now completely disjoint. lake build will never modify the manifest, and lake update always builds the manifest from scratch. (Which replaces the rm lean_packages/manifest.json hack I've been putting into lean3port/mathlib3port.)

The resolution policy is changed to "first revision wins" (going from the root package to the leaves of the dependency tree). If the root package specifies a dependency, then this dependency is always updated and overrides dependencies with the same name in dependent packages (e.g. std4 in mathlib4, which is imported both directly and transitively via aesop).

Local dependencies (.path) are now stored in the manifest as well. The manifest thus always contains the full information about where all the dependencies come from. When running lake build, dependencies are resolved by only looking at the manifest.

There is a bit of complication due to the -d flag. Since we don't have IO.FS.chdir but want to store relative paths in the manifest.json file, we need to keep two books during lake update, one for the relative paths (relative to the root package) and one for the actual paths relative to the current directory (this is MaterializeResult.pkgDir vs. MaterializeResult.relPkgDir).

@gebner
Copy link
Member Author

gebner commented Nov 22, 2022

So the windows story is bad, but as far as I can tell there's no regression here.

  • Apparently you can't delete a git repo using IO.FS.removeDirAll. Not sure why, but from what I can tell we didn't test this before. (And we still don't because I've replaced the deletion by a warning on Windows now.)
  • Test 44 (lake --old) randomly fails on Windows. It worked the second time. 🤷

@tydeu
Copy link
Member

tydeu commented Nov 22, 2022

@gebner

Thanks! Now I don't have to fix this all myself! I am a little confused though as to why you are storing local paths in the manifest? Can you elaborate some one the your reasoning there?

  • Apparently you can't delete a git repo using IO.FS.removeDirAll. Not sure why, but from what I can tell we didn't test this before. (And we still don't because I've replaced the deletion by a warning on Windows now.)

Yeah this is a bug I have encountered before. I should probably have filed an issue about in on the Lean 4 repository, but it honestly never occurred to me to do so.

@tydeu
Copy link
Member

tydeu commented Nov 22, 2022

Also, feel free to add yourself to the authors listing in the files you radically changed (e.g., Lake.Load.Main, Lake.Load.Manifest).

@tydeu tydeu added bug Something isn't working chore Maintenance tasks or refactors labels Nov 22, 2022
@gebner
Copy link
Member Author

gebner commented Nov 22, 2022

I am a little confused though as to why you are storing local paths in the manifest? Can you elaborate some one the your reasoning there?

The idea is to record the whole output of the dependency resolution step in the manifest, which has some advantages:

  1. There is less for lake build to do, and hence less opportunities to introduce nondeterminism.
  2. The manifest.json file documents all dependency locations, and you can check it as a user.
  3. If Lake decides to switch from one local source to another (or git for that matter), then it will show up in your git diff.
  4. You can edit the lockfile and replace a git dependency by a local one (and vice versa).
  5. It is also what Nix does in the flake lockfiles.

There are still some decisions done during lake build, most significantly the options are computed at that point.

Note that with the new resolution procedure, you no longer need to edit the manifest to achieve (4), you can just add a require foo from "bla" to your root lakefile.

I am happy to remove the PackageEntry.local case and rely on the dependency resolution at lake build time to figure out what the local paths should be. Then we could get rid of all the relative path bookkeeping.

@gebner
Copy link
Member Author

gebner commented Nov 22, 2022

This whole PR is super urgent BTW. The mathport CI has been broken for two days now. We need to get 4.1.0 out today.

@gebner
Copy link
Member Author

gebner commented Nov 23, 2022

Friendly ping!

It's okay if you don't have time right now, just say so. I can also do the release myself.

@tydeu
Copy link
Member

tydeu commented Nov 23, 2022

@gebner

Friendly ping!

Sorry about missing this. I didn't check GitHub yesterday. 😞

This whole PR is super urgent BTW. The mathport CI has been broken for two days now. We need to get 4.1.0 out today.

Ah, that's bad. But #137 should not be fatal. You should be able to pin the relevant version of transitive dependencies in the top-level lakefile to workaround this. Right?

I do not think 4.1.0 is ready for a formal release. However, if no workaround is possible, updating lean4 to the prerelease is fine.

@tydeu
Copy link
Member

tydeu commented Nov 23, 2022

@gebner

As to the design decision:

The idea is to record the whole output of the dependency resolution step in the manifest,
[...]
There are still some decisions done during lake build, most significantly the options are computed at that point.
[...]
I am happy to remove the PackageEntry.local case and rely on the dependency resolution at lake build time to figure out what the local paths should be. Then we could get rid of all the relative path bookkeeping.

This gives me some mixed feelings. You make a strong case for storing a static reproducible configuration in the lockfile. However, the fact that some decisions are still made at build time renders the current solution a bit of a mix (which I imagine is unlike Nix). My feeling is that we should either lean into the reproducibly completely or avoid it for now. In the former case, we could store the options in the manifest and even compile dependency lakefiles into oleans. This way multiple calls to lake build are guaranteed to use the same dependency configuration.

@tydeu
Copy link
Member

tydeu commented Nov 23, 2022

As an aside, you may also want to update the author information for Lake.Load.Materialize.

Lake/Load/Main.lean Outdated Show resolved Hide resolved
Lake/Load/Materialize.lean Outdated Show resolved Hide resolved
Lake/Load/Materialize.lean Outdated Show resolved Hide resolved
@gebner
Copy link
Member Author

gebner commented Nov 23, 2022

You should be able to pin the relevant version of transitive dependencies in the top-level lakefile to workaround this. Right?

I just checked and that's possible, but it's super inconvenient as you need to manually keep it in sync.

I do not think 4.1.0 is ready for a formal release.

What do you think is missing? The branch has been idling for months, and there's a couple of breaking changes in there that we should deploy rather sooner than later (like the manifest location change).

@gebner
Copy link
Member Author

gebner commented Nov 23, 2022

However, the fact that some decisions are still made at build time renders the current solution a bit of a mix (which I imagine is unlike Nix). My feeling is that we should either lean into the reproducibly completely or avoid it for now.

Yeah, no. In Nix you can still pass options to the locked dependency ("flake input"). The current solution in this PR is fully reproducible though.

@tydeu
Copy link
Member

tydeu commented Nov 23, 2022

Yeah, no. In Nix you can still pass options to the locked dependency ("flake input").

Oh, okay. In that case, I am fine with the PR's current approach.

@tydeu
Copy link
Member

tydeu commented Nov 24, 2022

@gebner

I just checked and that's possible, but it's super inconvenient as you need to manually keep it in sync.

I agree. I suggested it simply as a stop gap measure to reduce the urgency of the a Lake update.

I do not think 4.1.0 is ready for a formal release.

What do you think is missing? The branch has been idling for months, and there's a couple of breaking changes in there that we should deploy rather sooner than later (like the manifest location change).

I agree. The three major topics I still wanted to address before releasing 4.1.0 were: (1) issues with dependency resolution (i.e., this), (2) issues with dynlib builds (including #132), and (3) some problems with the custom target API (partially including leanprover/lean4#2759). I actually have coded a fix for (2) that I just need to debug and have planned what to do for (3). I have just been busy.

Here is my proposal: I will work to get my changes pushed by this weekend and then whatever is done will be released beginning of next week with a hard cut off on the end of the month (i.e., end of next Wednesday). That gives me a week to get everything ship shape.

@tydeu tydeu added this to the v4.1.0 milestone Nov 24, 2022
@gebner
Copy link
Member Author

gebner commented Nov 24, 2022

Would you prefer doing two releases (4.1.0 and say 4.2.0) in short order, or updating Lean 4 to a Lake prerelease? We need to get this out asap.

@tydeu
Copy link
Member

tydeu commented Nov 24, 2022

@gebner

We need to get this out asap.

Really? Even a week is too much? Even by this weekend? This issue has been here for quite a while and was only recently discovered, what is the urgency? The mathport build can be fixed by pinning the version as you confirmed worked above (and looking at the commit history of mathport it does not update often enough for that to be too much of a pain that it cannot wait a week).

On the PR front, I am fine merging it as soon as its done. Is it done?

Would you prefer doing two releases (4.1.0 and say 4.2.0) in short order, or updating Lean 4 to a Lake prerelease?

If the situation is desperate, updating to a prerelease is fine.

@gebner
Copy link
Member Author

gebner commented Nov 24, 2022

Really? Even a week is too much? Even by this weekend?

I mean yeah, why wait? There is a fix available after all.

This issue has been here for quite a while and was only recently discovered, what is the urgency?

We've only added aesop as a dependency last week, and this immediately triggered #137. We didn't hit the issue before because we didn't have many dependencies.

Now Scott wants to get started with category theory using aesop, and I'm not keen out keeping dependency revisions in sync manually.

There will be lots of other issues like this during the mathlib port, where we run into bugs simply by scaling up. The solution clearly cannot be to stay small because that used to work. We need to grow, and we're under time pressure.

looking at the commit history of mathport it does not update often enough

Every other day is still enough.

On the PR front, I am fine merging it as soon as its done. Is it done?

As far as I'm concerned, yes.

If the situation is desperate, updating to a prerelease is fine.

I don't really mind either way, I'll just update the Lake submodule to the latest v4.x.x revision then.

@tydeu tydeu merged commit 86a95c3 into leanprover:v4.x.x Nov 24, 2022
@tydeu
Copy link
Member

tydeu commented Nov 24, 2022

@gebner

Really? Even a week is too much? Even by this weekend?

I mean yeah, why wait? There is a fix available after all.

I think it would be nice to wait until I can write some proper release notes (which I can get done by this weekend) with migration information including the breaking change here and others like the breaking change of the manifest location, but that is up to you.

@tydeu
Copy link
Member

tydeu commented Nov 24, 2022

@gebner Also, do ping me if you update Lean 4 to the last v4.x.x. That way I know to merge it into master (to keep that branch containing the proper history of commits on Lean 4 master).

@gebner
Copy link
Member Author

gebner commented Nov 24, 2022

Sorry for the rush btw. The Lean 4 PR is on the queue leanprover/lean4#1879 and should merge itself in an hour or so.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working chore Maintenance tasks or refactors
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants