Skip to content

Merging github repos and history

Eric Wieser edited this page Jul 20, 2023 · 16 revisions

Merging the git history

Reasons to do this:

  • It makes git blame trace back to mathlib3
  • It make the github contributor stats reflect not just the porting effort, but also the mathlib3 effort:
    • past contributors to mathlib3 are still recognized in the total contributor count, even if they no longer contribute
    • the age of the project is clearer; the history goes back much further
  • It gives us the option of merging the repos without breaking links to commits.

The approach I suggested is to avoid rewriting any history, and instead carefully structure merge commits to produce

gitGraph
   branch mathlib3
   commit
   checkout main
   commit
   checkout mathlib3
   commit
   checkout main
   commit id: "last porting commit in mathlib4"
   commit
   checkout mathlib3
   commit id: "last commit in mathlib3" tag: "port-complete"
   checkout main
   commit id: "last forward-porting commit in mathlib4" tag: "port-complete"
   checkout mathlib3
   branch fix-hist
   checkout main
   branch _
   checkout main
   commit
   checkout fix-hist
   commit id: "mathport output"
   checkout main
   commit
   checkout fix-hist
   merge _ id: "Merge of last mathport output into master, discarding all changes from mathport"
   checkout main
   commit
   checkout fix-hist
   merge main id: "Combine the mathlib3 and mathlib4 repos"

Note that the fix-hist branch can happen in parallel to further mathlib4 development; as shown in the diagram, we merge the histories at the port-complete tags, and then can re-merge any changes that happened while we were working on the history merge. This subsequent merge is trivial, since the previous one should be a no-op. This means we do not need to rush into doing this.

The final commit of fix-hist is then pushed directly to master: crucially, without going through bors which would flatten the history we spent all the effort creating.

Zulip message:
* "Combine the mathlib3 and mathlib4 repos"
|\
| * Merge of last mathport output into master, discarding all changes from mathport
| |\
| | * automated fixed (Mathbin -> Mathlib)
| | * mathport output for last commit in mathlib3, with lean3 versions deleted
| | * last commit in mathlib3
| | |
| |
* | misc mathlib4 features that happen
* | misc mathlib4 features that happen
|/
/
|
* last forward-porting commit in mathlib4
* last porting commit in mathlib4
* misc mathlib4 commits
|

Merging the repos

There is a Zulip Poll here. Note that option 1 has a typo in the poll, and the corrected version is shown as the fourth option!

Option 1: do all future work in leanprover-community/mathlib4

This is the "do nothing" option, where the merged history is pushed to mathlib4, and mathlib3 is left alone.

Advantages:

  • Mathlib4 commit messages of the form feat: some feature (#XXXX) still auto-link to the right PR
  • We continue with a "clean" issue tracker that doesn't have ancient lean3 issues in it

Disadvantages:

  • Mathlib3 commit messages of the form feat: some feature (#XXXX) do not link correctly

Option 2: do all future work in leanprover-community/mathlib

In this option we push the merged history to mathlib/main, and rename the current mathlib/master branch to lean3-main (by dropping the master branch entirely we avoid confusion between the two; and main is now the git default anyway). We then transfer all the open issues, copy across all the branches, and then ask contributors to re-open their PRs (we can write a bot that gives them a link to do this)

Advantages:

  • Mathlib3 commit messages of the form feat: some feature (#XXXX) still auto-link to the right PR
  • We don't have to type a 4
  • Existing mathlib3 PRs can be revived by merging master!

Disadvantages:

  • Mathlib4 commit messages of the form feat: some feature (#XXXX) do not link correctly
  • Open PRs have to be re-opened, and discussion is discontinuous as a result.
  • We inherit all the old lean3 issues and PRs
  • Downstream lean3 projects running leanproject up (there should be almost none of these) have to add branch = "lean3-main" to the mathlib entry in their leanpkg.toml.

Option 3: rename leanprover-community/mathlib to leanprover-community/mathlib3, leanprover-community/mathlib4 to leanprover-community/mathlib

Advantages:

  • Mathlib4 commit messages of the form feat: some feature (#XXXX) still auto-link to the right PR
  • We don't have to type a 4
  • Full URLs to mathlib4 issues and PRs automatically redirect to urls without the 4

Disadvantages:

  • Mathlib3 commit messages of the form feat: some feature (#XXXX) do not link correctly
  • Any full URLs to mathlib3 issues and PRs now break