rebase: invent a better way to recreate commit topology (think: `--preserve-merges` done right) #447

Closed
wants to merge 37 commits into from

6 participants
@dscho
Member

commented Dec 23, 2017

The Git for Windows project uses the "Git garden shears" (a Unix shell script, piggy-backing on the interactive rebase) to rebase a thicket of branches, maintaining the branch structure.

To this end, it invents a couple of new commands for the todo list to

  1. label the current revision with an easy-to-read name
  2. reset the current revision to a previously-labeled one
  3. merge previously-labeled revisions

In contrast to --preserve-merges, this design makes the topology clear in the todo list and allows for reordering commits or even for changing the branch topology (introducing new branches, reordering commits from several branches into a single one, etc.).
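As a rough illustration of the three commands, the sketch below builds a tiny history with a single merge and prints the todo list Git generates for it. It assumes a Git that already ships this feature under the name `--rebase-merges` (the name used in the commit messages at the end of this page; the description above still calls it `--recreate-merges`). The branch, tag and file names are made up for the demo.

```shell
#!/bin/sh
# Sketch: show the label/reset/merge commands in a generated todo list.
# Assumes a Git that supports --rebase-merges; all names below are made up.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk   # do not rely on the default branch name

commit() {  # commit <name>: add a file called <name>.txt and commit it
	echo "$1" >"$1.txt"
	git add "$1.txt"
	git commit -q -m "$1"
}

commit base
git tag base
git checkout -q -b topic
commit topic-change
git checkout -q trunk
commit trunk-change
git merge -q --no-ff -m "Merge branch 'topic'" topic

# `cat` stands in for the editor: it prints the todo list and leaves it
# unchanged, so the rebase then simply runs through.
todo=$(GIT_SEQUENCE_EDITOR=cat git rebase -i --rebase-merges base 2>/dev/null)
printf '%s\n' "$todo" | grep -E '^(label|reset|pick|merge) '
```

The printed lines should show the topic branch being picked and labeled, a `reset` back to the starting point, and a final `merge` that refers to the label, which is what makes the topology editable in the todo list.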

The shears.sh script uses some ugly tricks to "add" those commands, causing issues with stability, testability and performance.

This Pull Request (which is of course not intended to be merged by the Git project; why use Pull Requests when you can force everybody to send patches through a lossy medium like a mailing list?) contains the patches that teach core Git's rebase -i proper to perform the same trick.

The PR was opened mainly to leverage the Travis CI configuration to get this tested more thoroughly than a mere patch review ever could.

Funnily enough, those patches are already maintained in a thicket of branches, which are of course maintained using the patched rebase -i itself (using --recreate-merges=no-rebase-cousins, to be precise). So there has been some interactive testing already :-)

@dscho force-pushed the dscho:sequencer-shears branch 2 times, most recently from 83d7862 to 3661c96 Dec 23, 2017

@boogisha

left a comment

Could be an unimportant nitpick in comparison to the functionality itself, but for what it's worth, alongside "this is what we have" and "this is what we (sometimes) don't want", the commit message might benefit from the additional/final "this is what we're now making possible" diagram, too, concluding the text itself and making the point clear (if not already).

For example, I myself am not sure whether D' is going to stay with A as ancestor, which seems to be desired here (basically making the whole process a no-op if the "todo" script is left unchanged)...?

That said, wouldn't this mode actually be a better default?

In rebasing from HEAD (F) to B it kind of seems unexpected that commit D gets rebased in the first place, as that one isn't found in "F to B" traversal, and even less expected that topology might/will change, and by default.

Sorry if I'm missing something obvious, might be lacking additional knowledge (or some fundamental one, even).

Thanks, Buga

p.s. Not sure if I did this correctly, I (thought I) was commenting on commit d4d755d, "rebase -i: introduce --recreate-merges=no-rebase-cousins".

@dscho force-pushed the dscho:sequencer-shears branch 3 times, most recently from e0d8c0d to 199c030 Dec 28, 2017

@dscho

Member Author

commented Jan 2, 2018

@boogisha first of all: thank you for your interest and your comments.

Could be an unimportant nitpick in comparison to the functionality itself, but for what it's worth, alongside "this is what we have" and "this is what we (sometimes) don't want", the commit message might benefit from the additional/final "this is what we're now making possible" diagram, too, concluding the text itself and making the point clear (if not already).

For example, I myself am not sure whether D' is going to stay with A as ancestor, which seems to be desired here (basically making the whole process a no-op if the "todo" script is left unchanged)...?

Good point. Here is the updated commit: ce0dc5c

That said, wouldn't this mode actually be a better default?

Yes, I would agree. Alas, backwards-compatibility prohibits us from making it the default. We could, of course, introduce a config setting later to opt-in to recreate merges by default.

Even better: it does not need to be me who makes that patch ;-)

In rebasing from HEAD (F) to B it kind of seems unexpected that commit D gets rebased in the first place, as that one isn't found in "F to B" traversal, and even less expected that topology might/will change, and by default.

        C
      /   \
A - B - E - F
  \   /
    D

In this example, we rebase "from" B "onto" B. That means that all commits that are reachable from F but not from B get rebased. Including D. If you were to perform a traditional rebase, git rebase B would flatten all those patches into a single branch (and skip the merges):

A - B - D' - C' - E' - F'

In short: it is totally expected that commits C, D, E and F are rebased.
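To make the reachability argument concrete, here is a small sketch that rebuilds the graph above in a throwaway repository and asks Git which commits lie in the range B..F. The `commit` and `merge` helpers, the branch names and the one-file-per-commit layout are made up for the demo. The second query uses `--ancestry-path`, which is only one possible reading of the "F to B traversal" intuition from the question.

```shell
#!/bin/sh
# Sketch: which commits does a rebase onto B pick up in the graph above?
# Helper and branch names are made up for this demo.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk

commit() {  # commit <name>: add <name>.txt, commit it, tag the commit <name>
	echo "$1" >"$1.txt"
	git add "$1.txt"
	git commit -q -m "$1"
	git tag "$1"
}
merge() {   # merge <name> <other>: merge <other> into HEAD as merge commit <name>
	git merge -q --no-ff -m "$1" "$2"
	git tag "$1"
}

commit A
commit B
git checkout -q -b side-d A
commit D                      # D forks off A, not B
git checkout -q -b side-c B
commit C                      # C forks off B
git checkout -q trunk
merge E D                     # E merges D into the mainline
merge F C                     # F merges C into the mainline

# (1) Reachable from F but not from B: the set that gets rebased.
all=$(git log --format=%s B..F | sort | tr '\n' ' ')
# (2) Only commits on a chain of descendants from B up to F.
path=$(git log --format=%s --ancestry-path B..F | sort | tr '\n' ' ')

echo "reachable from F, not from B: $all"
echo "on an ancestry path B..F:     $path"
```

The first query should list C, D, E and F; the second drops D, because D descends from A rather than from B. Rebase uses the first definition.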

What the no-rebase-cousins mode accomplishes is simply to avoid re-rooting the commits that are "incomparable" to the base commit of the rebase operation, i.e. commits which are neither ancestor of the base commit nor have the base commit as ancestor are not being forced into having the base commit as ancestor.

The idea is that running git rebase -i --recreate-merges=no-rebase-cousins <any-ancestor-of-HEAD> will create a todo list that, unless edited interactively, results in the identical HEAD after completing the rebase because every single pick and merge will fast-forward to the original commit.
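The no-op property can be checked with a short sketch. It assumes a Git that ships the feature as `--rebase-merges` (the later name of `--recreate-merges`), and it uses `true` as the sequence editor so that the todo list is accepted unedited. Helper, branch and file names are made up for the demo.

```shell
#!/bin/sh
# Sketch: an unedited rebase in no-rebase-cousins mode leaves HEAD untouched.
# Assumes a Git that supports --rebase-merges; all names are made up.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk

commit() {  # commit <name>: add <name>.txt, commit it, tag the commit <name>
	echo "$1" >"$1.txt"
	git add "$1.txt"
	git commit -q -m "$1"
	git tag "$1"
}

# Same shape as the graph above: D forks off A, C forks off B.
commit A
commit B
git checkout -q -b side-d A
commit D
git checkout -q -b side-c B
commit C
git checkout -q trunk
git merge -q --no-ff -m E D
git merge -q --no-ff -m F C

before=$(git rev-parse HEAD)
# `true` exits successfully without touching the todo list.
GIT_SEQUENCE_EDITOR=true \
	git rebase -i --rebase-merges=no-rebase-cousins B >/dev/null 2>&1
after=$(git rev-parse HEAD)

echo "before: $before"
echo "after:  $after"
# The "cousin" D (second parent of E) is still rooted on A.
echo "parent of D after the rebase: $(git rev-parse 'HEAD^1^2^1')"
```

Both hashes should be identical, and the parent of D should still be A: the cousin was not re-rooted onto B.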

@dscho force-pushed the dscho:sequencer-shears branch 2 times, most recently from 04033bb to b5b2815 Jan 6, 2018

@jacob-keller

Contributor

commented Jan 12, 2018

I haven't dug into this all the way, but it caught my eye, as I've run into the problems that --preserve-merges has, and I've seen your shears.sh script before.

I definitely want to see this feature in the main tree and built into the sequencer, as it's done here. Thanks for continuing to push this concept!

@dscho

Member Author

commented Jan 12, 2018

I definitely want to see this feature in the main tree and built into the sequencer, as it's done here. Thanks for continuing to push this concept!

Will contribute it directly after v2.16.0 is out!

@boogisha

commented Jan 19, 2018

@dscho Thanks, updated commit ce0dc5c now looks clear, even to me :)

(and sorry for a bit delayed reply)

Yes, I would agree. Alas, backwards-compatibility prohibits us from making it the default. We could, of course, introduce a config setting later to opt-in to recreate merges by default.

This one confuses me a bit, though - if --recreate-merges is a new option being added inside this series, what kind of "backwards-compatibility" are we to be concerned with...?

Unless you mean in comparison to the existing default behavior of --preserve-merges, which --recreate-merges seems aimed at superseding - but even then it shouldn't matter that much, I would think, as you are not breaking any existing contracts/scripts, just that it should be clearly communicated that (once deprecated) --preserve-merges can/should be replaced with --recreate-merges=rebase-cousins (instead of plain --recreate-merges, which could then default to the more sensible "no-rebase-cousins" mode).

In this example, we rebase "from" B "onto" B. That means that all commits that are reachable from F but not from B get rebased. Including D. If you were to perform a traditional rebase, git rebase B would flatten all those patches into a single branch (and skip the merges):

A - B - D' - C' - E' - F'

In short: it is totally expected that commits C, D, E and F are rebased.

Yeah, it came to me a bit after I posted the message that commits for rebasing are actually picked by (1) "reachable from HEAD but not reachable from B", instead of (2) "found inside HEAD to B traversal only". I'm just not using rebase much with merge commits (one reason being its current fragility), thus slipped the difference (with no merge commits, (1) and (2) make no difference, picking the same commits).

Thanks for clarifying.

The idea is that running git rebase -i --recreate-merges=no-rebase-cousins will create a todo list that, unless edited interactively, results in the identical HEAD after completing the rebase because every single pick and merge will fast-forward to the original commit.

This I understand (and like), and thus I find "no-rebase-cousins" to be a more appropriate default mode for --recreate-merges - which we seem to agree on, except the "backwards-compatibility" part, which I might be missing.

That said, I would even argue --recreate-merges=no-rebase-cousins should be default rebase behavior, but yeah, I can understand the backwards-compatibility constraint here, and the point of possibly having a config setting :) (hmm, might be that's what you thought I think by talking about defaults... or not? :) )

@jacob-keller

Contributor

commented Jan 20, 2018

This one confuses me a bit, though - if --recreate-merges is a new option being added inside this series, what kind of "backwards-compatibility" are we to be concerned with...?

I thought he meant that you can't make "recreate-merges" be the default for rebase interactive mode. I certainly think that we could make no-rebase-cousins the default for recreate-merges.

@dscho

Member Author

commented Jan 22, 2018

Yes, I would agree. Alas, backwards-compatibility prohibits us from making it the default. We could, of course, introduce a config setting later to opt-in to recreate merges by default.

This one confuses me a bit, though - if --recreate-merges is a new option being added inside this series, what kind of "backwards-compatibility" are we to be concerned with...?

Oh, I misunderstood! I thought you wanted to make --recreate-merges the default (as guessed correctly by @jacob-keller).

I can certainly make the no-rebase-cousins mode the default for --recreate-merges. It would make my life easier, anyway.

@boogisha

commented Jan 24, 2018

Oh, I misunderstood! I thought you wanted to make --recreate-merges the default (as guessed correctly by @jacob-keller).

All clarified now, thanks both! :)

I can certainly make the no-rebase-cousins mode the default for --recreate-merges. It would make my life easier, anyway.

Yes, and it seems to make the most sense (to me at least) - having unchanged git rebase --recreate-merges "todo" script eventually ending up as a no-op.

@dscho force-pushed the dscho:sequencer-shears branch from b5b2815 to 06b23ae Feb 26, 2018

@dscho force-pushed the dscho:sequencer-shears branch from 06b23ae to 08d2ae2 Mar 9, 2018

@boogisha

commented Mar 10, 2018

@dscho With all due respect to the great work you did so we actually have this feature implemented, and understanding that your impression might be different, but being heavily involved in the discussion / thinking / testing that led to it, too, I'm kind of left with a bitter aftertaste that d41a29c commit message doesn't do justice to Sergey Organov.

I have a long list of reasons to support this claim (and I'm willing to discuss it, as I might have understood him better from the beginning), but not to waste your time, I'm proposing a slightly updated commit message instead, might be serving the purpose better to give credit where credit is due, something like this:

rebase -i --recreate-merges: offer a smart way to rebase merge commits

Previously, we punted on the question how to carry over amendments to
merge commits. Instead, we always performed new merges.

Such amendments to merge commits may very well be necessary, though,
e.g.  if one side of the history changed a function signature and the
other side added a caller.

However, Sergey Organov came up with an amazingly natural idea[1] how to 
preserve such amendments: instead of recreating the merge commit from 
scratch, we can incorporate the changes of the original merge commit.

Phillip Wood further decomposed and beautifully simplified its 
implementation[2], the fundamental idea behind it still being: whether a 
branch was merged or rebased, the resulting trees are the exact same. In 
mathematical terms, "merging" and "rebasing" are "dual" operations
(explained in more detail in "patch theory"[3], too).

Therefore, when we rebased a merge commit's parent onto upstream, we can
re-interpret the result as being "merged with upstream".

By merging those "merged parents" into the original merge commit (using
the original merge parent as merge base), we can combine the amendments
of the original merge commit with the changes introduced by rebasing the
merge commit's parents.

This includes changes introduced in the upstream, but also changes
introduced by the user e.g. when amending, dropping or reordering
commits in the todo list.

Using aforementioned duality between merging and rebasing, we can now
re-interpret the result of that latest merge as "rebasing the merge
commit".

This is a very powerful technique with oddly intuitive results. Let's
expose this by introducing a new flag `-R` for the todo command `merge`,
and let's use it by default when generating those todo lists.

It needs to be a new option because that strategy requires an original
merge commit, with a matching number of parents, otherwise it simply
won't make sense.

[1] https://public-inbox.org/git/87r2oxe3o1.fsf@javad.com/
[2] https://public-inbox.org/git/6c8749ca-ec5d-b4b7-f1a0-50d9ad2949a5@talktalk.net/
[3] https://en.wikibooks.org/wiki/Understanding_Darcs/Patch_theory

Reported-by: Sergey Organov <sorganov@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

@boogisha

commented Mar 11, 2018

@dscho Regarding the code itself, not sure if I'm doing something wrong, but I'm getting a consistent/reproducible crash on Ubuntu 16.04, built Git from this pull request (hopefully correctly), git version 2.16.2.546.g08d2ae2ce.

Repo setup script:

#!/bin/bash

# rm -rf ./.git
# rm -f ./test.txt

git init

touch ./test.txt
git add -- test.txt

# prepare repository
for i in {1..8}
do
	echo X$i >>test.txt
	git commit -am "X$i"
done
git tag master-old

# prepare branch A
git checkout -b A
sed -i '2iA1' test.txt
git commit -am "A1"
sed -i '4iA2' test.txt
git commit -am "A2"
sed -i '6iA3' test.txt
git commit -am "A3"

# prepare branch B
git checkout -b B master
sed -i '5iB1' test.txt
git commit -am "B1"
sed -i '7iB2' test.txt
git commit -am "B2"
sed -i '9iB3' test.txt
git commit -am "B3"

git checkout -b topic A
git merge -s ours --no-commit B # merge A and B with `-s ours`
sed -i '8iM' test.txt           # amend merge commit ("evil merge")
git commit -am "M"
git tag original-merge

# master moves on...
git checkout master
git cherry-pick B^     # cherry-pick B2 into master
sed -i "1iX9" test.txt # add X9
git commit -am "X9"

git checkout topic

# (0) ---X8--B2'--X9 (master)
#        |\
#        | A1---A2---A3 (A)
#        |             \
#        |              M (topic)
#        |             /
#        \-B1---B2---B3 (B)

In this situation, I do git rebase --recreate-merges --onto master master-old, and it halts with this message:

The previous cherry-pick is now empty, possibly due to conflict resolution.
If you wish to commit it anyway, use:

    git commit --allow-empty

Otherwise, please use 'git reset'
interactive rebase in progress; onto f3352a0
Last commands done (4 commands done):
   pick 745cc1a B1
   pick e89c081 B2
Next commands to do (7 remaining commands):
   pick 6defd59 B3
   label M
You are currently rebasing branch 'topic' on 'f3352a0'.

Untracked files:
	rebasing-merge--recreate-merges.sh

nothing added to commit but untracked files present
Could not apply e89c081... B2

Now, no matter if I first do git reset or git commit --allow-empty as proposed (or none of it, even), after git rebase --continue, Git seems to die:

error: Your local changes to the following files would be overwritten by merge:
	test.txt
Please commit your changes or stash them before you merge.
Aborting
error: merge conflicts while merging ee095db into 30f197d with merge base 0ba351e:
error: merging of trees b784ebd87f53c3f597428c71a046018913eaf822 and 6fda341f54d8b7c36f6f82155ba81961a34fe5ae failed

error: git-rebase died of signal 11

Let me know if you need more data... and please note I'm a very novice Linux user ;)

@dscho force-pushed the dscho:sequencer-shears branch 2 times, most recently from 0a0f2e0 to 14115b9 Mar 11, 2018

@dscho

Member Author

commented Mar 11, 2018

I'm kind of left with a bitter aftertaste that d41a29c commit message doesn't do justice to Sergey Organov.

I am sorry, that is not my intention. It was my impression that his approach was not viable, and that Phillip's approach is vastly superior (even if Sergey apparently did not bother to weigh its pros and cons).

But I do not want you to be bitter. So I changed the commit message (copy-edited yours).

I'm getting a consistent/reproducible crash on Ubuntu 16.04

I will try to find some time to reproduce this here.

Thank you so much for being so thorough and helpful! It has been a pleasure working with you so far, and I think the result is already so much better than what I came up with on my own, alone.

@dscho

Member Author

commented Mar 11, 2018

I'm getting a consistent/reproducible crash on Ubuntu 16.04

I will try to find some time to reproduce this here.

I cut out some time I wanted to spend on exercise, and exercised my brain muscle instead: dscho@b3aad3a (this test still needs a lot of love, of course, but I think I'll get there, as soon as I have refactored the unpack_trees() call so I do not have to duplicate it all over the place).

Tomorrow, though. Or day after tomorrow if I am still sick.

@boogisha

commented Mar 12, 2018

I am sorry, that is not my intention. It was my impression that his approach was not viable, and that Phillip's approach is vastly superior (even if Sergey apparently did not bother to weigh its pros and cons).

But I do not want you to be bitter. So I changed the commit message (copy-edited yours).

No worries, nothing to be sorry about, we all have different perspectives. But, thank you, really, for everything you did, and you're doing, it means (and shows) a lot.

Thank you so much for being so thorough and helpful! It has been a pleasure working with you so far,

Hehe, wanting to write this before I even read your reply, I'll just say now that the feeling is mutual :) I'm glad if I can be of help, and if I get to learn something in the process, even better.

I cut out some time I wanted to spend on exercise, and exercised my brain muscle instead: dscho/git@b3aad3a (this test still needs a lot of love, of course, but I think I'll get there, as soon as I have refactored the unpack_trees() call so I do not have to duplicate it all over the place).

Tomorrow, though. Or day after tomorrow if I am still sick.

Take your time. I suffered a nasty stomach flu a few days ago, so I'm still recovering myself, too.

@dscho

Member Author

commented Mar 27, 2018

I am sorry, that is not my intention. It was my impression that his approach was not viable, and that Phillip's approach is vastly superior (even if Sergey apparently did not bother to weigh its pros and cons).

But I do not want you to be bitter. So I changed the commit message (copy-edited yours).

No worries, nothing to be sorry about, we all have different perspectives.

We do. But we also have different priorities, it seems, and we also have different ideas how to form consensus. I now regret editing in Sergey into the commit message, because his messages are the reason I don't want to read this mail thread anymore. He ignores everything I say (except the parts that can be contorted into seemingly agreeing with him), he does not answer any question, let alone consider that his strategy might be awful. And then he ridicules me for still trying to convince him. I so want to throw out this mail thread from my memory. It makes me sick.

@boogisha

commented Mar 28, 2018

Eh, I guess this has to do with recent replies to that mailing list topic...? :( I did get a bunch of e-mails from you and Sergey lately, but didn't have time to check them out yet (and I won't be able to do so for a while, at least), so I'm not really sure what's happening, but I wanted to reply here as it seems it got a bit out of control - I'm sorry that you feel like that, it shouldn't be the way all this works, especially for the people that are actually doing the most of the work... and I'm sorry if anything I did took part in the feeling :(

But all this said, Sergey does have his part in raising the issue and pushing for its solution, coming up with the initial idea, even, so I think the commit message mention is fair enough, and the right thing to do - which you did, and at least shouldn't be a thing to regret over, unrelated to the feelings between the two of you, not to be confused, even if it seems to spill over and color the thread itself.

It's just pretty unfortunate that you seem not to (get along with | understand) each other too much, discussions needlessly getting overheated, eventually causing bad feelings... and for no good reason, I'm afraid :/ I can only suggest to avoid further direct communication, not making it any worse, and possibly have me look into the current state of it (soon), hopefully being able to come to some middle ground, and for the better of everyone.

What makes me personally a bit upset is that I was able to sense what both of you are talking about so far (not sure if that is going to be the case with the latest replies once I get to them, though, but I hope so), where you both seem to aim for the best of it, but eventually just get to annoy each other so much that the main purpose of your very discussion falls out of sight, lost in the noise.

But I don't mind it much as I really find (interactive) rebase to be one of Git's greatest possibilities, thus trying to keep myself motivated to have the new merge rebasing logic as good as possible, helping in possibly the only way I can at the moment, discussing it through, as much as my humble knowledge allows me to.

Please don't feel bad about all this, but also feel free to follow your inner senses. I might prefer to see some things discussed further, or changed, even, but I would also totally understand if you would like to get over with all this already, nothing to blame you for - and I guess some changes will be possible after the fact as well, if needed.

No matter what you decide upon, might be after letting it settle a bit, thanks again, for everything, and heads up! Please :) You're doing a great job, and without me telling you that. All this should be fun and enjoyable, and if it slipped off the path, let's try making it so again ;)

@dscho

Member Author

commented Mar 30, 2018

it shouldn't be the way all this works, especially for the people that are actually doing the most of the work...

That's certainly how this feels.

But all this said, Sergey does have his part in raising the issue and pushing for its solution, coming up with the initial idea, even, so I think the commit message mention is fair enough, and the right thing to do - which you did, and at least shouldn't be a thing to regret over, unrelated to the feelings between the two of you, not to be confused, even if it seems to spill over and color the thread itself.

Hannes Sixt came up with the original idea. So I think Hannes deserves the credit. And Phillip deserves the credit for putting the derailed train wreck back into a productive direction. I cannot mention them all.

But if I should mention what made me implement the changes, it was Phillip's idea. I will change the commit message accordingly, to set the record straight.

If you want, I can give Sergey credit for annoying me so much that I took a break from this project for almost two weeks.

Please don't feel bad about all this

For the moment, I do. There is nothing you or I can do about it.

I guess this comes back to the difference between computer scientists, programmers and software engineers: computer scientists come up with theories that look good on paper, programmers write code, engineers use programming to create solutions. While I certainly fall prey to the appeal of nice theories (and could talk all night about them over a good drink or three), I was always interested in solutions (and consequently, I am annoyed when others stand in the way of solutions).

All this should be fun and enjoyable, and if it slipped off the path, let's try making it so again ;)

Let's see. For the moment, I am struggling with the problem that Phillip's strategy -- even if it is simple in theory -- does not map well into the code present in merge-recursive.c/unpack-trees.c. The problem is keeping those merge conflicts as merge conflicts while continuing to merge the next merge head's changes (i.e. I have to perform two 3-way merges: the first between the original merge commit and the updated first parent, the second one between the result of the first merge and the updated second parent, and if the first merge fails, the current code prevents the next merge from happening.)

At least I am in the process again of focusing on the solution. That should turn the fun back on for me.
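The two sequential 3-way merges described above can be sketched at the level of a single file with `git merge-file`. This is only an illustration under simplifying assumptions: the file contents and names are made up, both merges are conflict-free, and it is not how the sequencer implements the strategy. It leaves out the hard case mentioned above, where the first merge conflicts.

```shell
#!/bin/sh
# Sketch: "rebase" a merge result with two sequential 3-way merges.
# File names and contents are made up; both merges are conflict-free here.
set -e
dir=$(mktemp -d)
cd "$dir"

# Common history: sixteen lines x1..x16.
i=1
while [ "$i" -le 16 ]; do echo "x$i"; i=$((i + 1)); done >base

sed '2s/.*/A/' base >p1_old       # old first parent: change on line 2
sed '6s/.*/B/' base >p2_old       # old second parent: change on line 6
# Original merge: both changes plus an amendment M on line 10.
sed -e '2s/.*/A/' -e '6s/.*/B/' -e '10s/.*/M/' base >merge_old
# Both parents after rebasing onto an upstream that changed line 14.
sed '14s/.*/U/' p1_old >p1_new
sed '14s/.*/U/' p2_old >p2_new

# Step 1: original merge vs. new first parent, old first parent as base.
git merge-file -p merge_old p1_old p1_new >step1
# Step 2: result of step 1 vs. new second parent, old second parent as base.
git merge-file -p step1 p2_old p2_new >merge_new

sed -e '2s/.*/A/' -e '6s/.*/B/' -e '10s/.*/M/' -e '14s/.*/U/' base >expected
cmp merge_new expected && echo "amendment M kept, upstream change U added"
```

The result carries A, B and the amendment M from the original merge plus the upstream change U. If step 1 produced conflict markers instead, step 2 would have nothing clean to start from, which is the problem described above.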

@winksaville

Contributor

commented Apr 6, 2018

@dscho I tried using --recreate-merges on a directory in which I'm using contrib/subtree. I.e. I added the directory using:

git subtree add --prefix lib/gbenchmark --squash gb-v1.4.0

I was hoping --recreate-merges would work, as --preserve-merges doesn't, but it turns out --recreate-merges didn't work either. I definitely could have made a mistake; are you expecting --recreate-merges to work in this case?

@dscho

Member Author

commented Apr 6, 2018

@winksaville this part of the sequencer-shears patch thicket is not supposed to address anything requiring merges other than regular recursive merges. And I think subtree requires a different merge strategy...

Note, however, that the sequencer-shears patch thicket is in a non-functional state right now, as I struggle with getting the idea of Phillip Wood implemented, which would allow us to at least try to rebase non-recursive merges. The problem is this: the strategy calls for the original merge commit to be merged with the new first parent (using the old first parent as merge base) and then with the new second parent (using the old second parent as merge base). If the first of these two merges produces a merge conflict (and it is easy to construct a case where it would), there is currently no way in Git to continue with another merge. However, I want to continue with another merge... If you have any idea how to implement this, please speak up.

@winksaville

Contributor

commented Apr 6, 2018

@dscho, I'm not sure whether it is a different merge strategy or not, I'm too inexperienced, but here is the "graph" of the subtree:

$ git log --graph --pretty="format:%h %s" master..
* c37dcd18 In gbenchmark st.range is not int64_t
*   26fa64e6 git subtree add --prefix lib/gbenchmark --squash gb-v1.4.0
|\  
| * 91fa3bf5 Squashed 'lib/gbenchmark/' content from commit 54d92f93
* 0df831af Delete lib/gbenchmark in preparation for updating to newer version.
* a6bc71d6 Updates to use gbenchmark plus some tools and documentation.

So 91fa3bf5 is a squash of 54d92f93 and is just dangling which is probably unnatural, but I would hope it would be "easy" to recreate. Let me know what else you might need or give me some hints on what I need to do to have --recreate-merges work in this scenario.

dscho and others added some commits Dec 22, 2017

sequencer: fast-forward `merge` commands, if possible

Just like with regular `pick` commands, if we are trying to rebase a
merge commit, we now test whether the parents of said commit match HEAD
and the commits to be merged, and fast-forward if possible.

This is not only faster, but also avoids unnecessary proliferation of
new objects.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase: start to deprecate --preserve-merges

The design of --preserve-merges was never meant to allow any interactive
rebase, as demonstrated by the inability to reorder commits, to change
merge commits' ancestry or to introduce new merge commits.

The --rebase-merges mode we just introduced has a design that fixes
those issues, and therefore we can now safely start to deprecate the
--preserve-merges.

While at it, explain a little better in the man page what the
`--rebase-merges` mode is all about.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase -i --rebase-merges: offer a smart way to rebase merge commits

Previously, we punted on the question how to carry over amendments to
merge commits. Instead, we always performed new merges from scratch.

Such amendments to merge commits may very well be necessary, though,
e.g.  if one side of the history changed a function signature and the
other side added a caller.

In a discussion inspired by Johannes Sixt's "cherry-pick -m 1" idea,
Phillip Wood came up with the fundamental idea behind this commit.

The premise is that both "rebase" and "merge" try to reconcile diverging
changes. Both "rebase" and "merge" should result in identical trees
(after merge conflicts are resolved, if there were any).

A "merge" typically boils down to a "3-way" merge, with a "merge base"
and two "merge heads" that diverged from said merge base. Think e.g.
last week's master as the merge base, from where a developer branched
off a topic branch (the first merge head), and the current master as
second merge head.

A "rebase" is more complicated than a single "3-way merge": When
rebasing commits, they are "cherry-picked" one by one. That way, the
changes introduced by that commit are reconciled with the changes
introduced by the "rebase" so far: a "3-way merge" between HEAD and the
cherry-picked commit, with the latter's parent commit as merge base
(HEAD would be the rebased parent commit in the common case).

In the context of a cherry-pick, it is important to keep in mind that
the "diverging changes" are not reflected by commit history. For the
purpose of a 3-way merge, they don't have to be.
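As a small sanity check of the description above, the following sketch compares the outcome of `git cherry-pick` with a single `git merge-file` call that uses the picked commit's parent as merge base. The repository layout, file name and branch names are made up for the demo, and the comparison only covers a conflict-free, single-file case.

```shell
#!/bin/sh
# Sketch: a cherry-pick yields the same file content as one 3-way merge
# whose merge base is the picked commit's parent. All names are made up.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/trunk

printf '%s\n' 1 2 3 4 5 6 7 8 9 10 >f.txt
git add f.txt
git commit -q -m base

git checkout -q -b topic
sed '2s/.*/topic/' f.txt >f.tmp
mv f.tmp f.txt
git commit -q -am "change line 2"

git checkout -q trunk
sed '9s/.*/trunk/' f.txt >f.tmp
mv f.tmp f.txt
git commit -q -am "change line 9"

# The three inputs of the 3-way merge, extracted as plain files.
git show trunk:f.txt >ours      # HEAD before the pick
git show topic^:f.txt >base     # parent of the picked commit: the merge base
git show topic:f.txt >theirs    # the picked commit itself
git merge-file -p ours base theirs >by-merge

git cherry-pick topic >/dev/null
cmp by-merge f.txt && echo "cherry-pick and 3-way merge agree"
```

Note that `trunk` and `topic^` are related by history here only because the demo is small; the merge itself needs nothing but the three file versions.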

Now let's look again at the problem of "rebasing merge commits", i.e.
how to "cherry-pick a merge commit"? It is not as simple as a "3-way
merge" (a single one, that is) because we do not have a single merge
base from where sprung exactly two diverging changesets.

Just like when we cherry-pick a regular commit, in the case of a merge
commit we have diverging changes for *every* one of its parents. And
just like before, the original parent commit is the merge base, and the
rebased parent commit is a merge head. The other merge head is the
original merge commit itself.

By starting with the original merge commit and then performing these
3-way merges sequentially, one for every parent, we reconcile all
diverging changes. So we "rebased" the merge commit, including all of
its amendments.

This is the most logical generalization of the cherry-pick concept: if
we only have one parent, the strategy outlined above is identical to
Git's cherry-pick operation.

Side-note: While the description above only talked about merge commits
with exactly two parent commits, the principle still holds for merge
commits with more than two parent commits ("octopus merges").
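
The strategy described above can be sketched in a few lines of Python.
This is only a toy model under stated assumptions, not the sequencer's
implementation: a tree is a dict mapping paths to contents, `merge3()` is
a hypothetical file-level 3-way merge (Git merges line by line), and
`rebase_merge()` is a name invented here for the sequential per-parent
merge.

```python
CONFLICT = "<conflict>"  # placeholder content for a conflicted path


def merge3(base, ours, theirs):
    """Toy file-level 3-way merge of trees (dicts of path -> content)."""
    result = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            merged = o          # both sides agree
        elif o == b:
            merged = t          # only "theirs" changed this path
        elif t == b:
            merged = o          # only "ours" changed this path
        else:
            merged = CONFLICT   # both sides changed it differently
        if merged is not None:  # None means the path is absent/deleted
            result[path] = merged
    return result


def rebase_merge(original_merge, original_parents, rebased_parents):
    """Start from the original merge, then run one 3-way merge per parent:
    merge base = original parent, other merge head = rebased parent."""
    if len(original_parents) != len(rebased_parents):
        raise ValueError("number of parents must match")
    result = original_merge
    for orig, rebased in zip(original_parents, rebased_parents):
        result = merge3(base=orig, ours=result, theirs=rebased)
    return result


# Hypothetical history: the original merge added fixup.c as an amendment,
# and both parents were rebased onto an upstream that added up.c.
p1 = {"main.c": "v1"}
p2 = {"main.c": "v1", "topic.c": "t1"}
merge = {"main.c": "v1", "topic.c": "t1", "fixup.c": "evil"}
p1_rebased = {"main.c": "v1", "up.c": "u1"}
p2_rebased = {"main.c": "v1", "topic.c": "t1", "up.c": "u1"}

rebased = rebase_merge(merge, [p1, p2], [p1_rebased, p2_rebased])
```

With a single parent, `rebase_merge()` degenerates to one `merge3()` call
with the commit's parent as merge base, which is the cherry-pick case. The
loop is not limited to two parents, so the same sketch covers octopus
merges.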

That strategy requires an original merge commit, with a matching number
of parents, though, and therefore it would not make sense in general to
do this for every `merge` command in the todo list: what if there was no
original merge commit, or if the specified merge commit had a different
number of parents? It therefore cannot be the default mode for the
`merge` command, thus we introduce a new flag `-R` for that.

However, for existing merge commits, this strategy *is* valid and *does*
lead to the most intuitive results. Therefore, let's use it by default
when generating todo lists in `git rebase --rebase-merges`.

Original-idea-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Igor Djordjevic <igor.d.djordjevic@gmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase --rebase-merges: a "merge" into a new root is a fast-forward

When a user provides a todo list containing something like

	reset [new root]
	merge my-branch

let's do the same as if pulling into an orphan branch: simply
fast-forward.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase -i --rebase-merges: add a section to the man page

The --rebase-merges mode is probably not half as intuitive to use as
its inventor hopes, so let's document it some.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase --rebase-merges: adjust man page for octopus support

Now that we support octopus merges in the `--rebase-merges` mode,
we should give users who actually read the manuals a chance to know
about this fact.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase --rebase-merges: add test for --keep-empty

If there are empty commits on the left hand side of $upstream...HEAD
then the empty commits on the right hand side that we want to keep are
being pruned.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>

sequencer: extract helper to update active_cache_tree

This patch extracts the code from is_index_unchanged() to initialize or
update the index' cache tree (i.e. a tree object reflecting the current
index' top-level tree).

The new helper will be used in the upcoming code to support `git rebase
-i --root` via the sequencer.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

t3430: add some realistic tests for --rebase-merges

This commit adds a lengthy test case to t3430 that reflects some
challenging use cases for the --rebase-merges option.

In particular, it sets up a scenario which demonstrates that "evil merges"
happen in practice, and that they introduce their extra changes out of
necessity.

It then sets up three "upstream" branches with competing changes that
are designed to conflict with the changes to rebase.

The purpose of this added test case is two-fold:

1. to document what we expect --rebase-merges to accomplish, and even more
   to document what we do *not* expect it to be able to do.

2. to explore what kinds of merge conflicts --rebase-merges can produce
   (spoiler: we can end up with some bad ones, with unintuitively-nested
   merge conflicts).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase --rebase-merges: give rerere a chance

Just like with `pick` commands, `merge` commands should also get a chance
to resolve already-recorded conflicts.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase --rebase-merges: root commits can be cousins, too

Reported by Wink Saville: when rebasing with no-rebase-cousins, we
will want to refrain from rebasing all of them, even when they are
root commits.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase --rebase-merges: avoid "empty merges"

The `git merge` command does not allow merging commits that are already
reachable from HEAD: `git merge HEAD^`, for example, will report that we
are already up to date and not change a thing.

Previously, such a merge could occur in an interactive rebase, e.g. when
competing (or slightly modified) versions of a patch series were applied
upstream, the user had to `git rebase --skip` all of the local commits,
and the topic branch became "empty" as a consequence.

Let's teach the todo command `merge` to behave the same as `git merge`.

Seeing as it requires some low-level trickery to create such merges with
Git's commands in the first place, we do not even have to bother to
introduce an option to force `merge` to create such merge commits.
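
The check described above amounts to a reachability test. A minimal
sketch in Python, assuming a toy commit graph given as a dict from commit
name to its list of parents; `is_ancestor()` and `plan_merge()` are
hypothetical names, and real Git uses its own commit-walking machinery:

```python
def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant`."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def plan_merge(parents, head, merge_head):
    """Decide what a `merge` todo command should do in this toy model."""
    if is_ancestor(parents, merge_head, head):
        # Same outcome as `git merge HEAD^`: nothing to do.
        return "already up to date"
    return "merge"


# A - B - C (HEAD), with topic commit T branching off A.
graph = {"A": [], "B": ["A"], "C": ["B"], "T": ["A"]}
empty_case = plan_merge(graph, "C", "B")   # B is HEAD^, already reachable
normal_case = plan_merge(graph, "C", "T")  # T is not reachable from HEAD
```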

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

rebase --rebase-merges: use decreasing marker sizes for nested conflicts

When encountering nested conflicts, it can really be challenging to make
sense of what goes where. The semi-realistic example that was added to
t3430-rebase-merges.sh, for example, shows this nested conflict:

	int hi(void) {
		printf("Hello, world!\n");
	}
	<<<<<<< intermediate merge
	<<<<<<< HEAD
	/* main event loop */
	void event_loop(void) {
		/* TODO: place holder for now */
	=======
	=======
	}
	>>>>>>> <HASH>... merge head #1
	/* caller */
	void caller(void) {
		hi();
	>>>>>>> <HASH>... original merge
	}

This is really confusing, in particular because the nested merge
conflict is not contained in one arm of the outer merge conflict, but
they seem to be interleaved.

With this patch, the first 3-way merge produces conflict markers that are
one character longer than the second 3-way merge's conflict markers, and
it becomes a *little* more readable:

	int hi(void) {
		printf("Hello, world!\n");
	}
	<<<<<<< intermediate merge
	<<<<<<<< HEAD
	/* main event loop */
	void event_loop(void) {
		/* TODO: place holder for now */
	========
	=======
	}
	>>>>>>> <HASH>... merge head #1
	/* caller */
	void caller(void) {
		hi();
	>>>>>>>> <HASH>... original merge
	}

It still does not immediately make a whole lot of sense, and instead
requires some brain-twisting and inspection of the intermediate state to
understand.

So what is going on? Well, after the intermediate merge, the event loop
was added (via upstream, onto which we rebased), but with conflict
markers, because the second parent had added the caller() function in the
same place in the original merge. Since the rebased second parent also has
the event loop added (through upstream, onto which it was rebased), the
conflict markers added in the first 3-way merge *cause* the conflict in
the second 3-way merge.

And the last conflict marker, which looked as if it was concluding the
outer conflict, is actually part of the "inner" (i.e. nested) conflict
and just happens to not cause any further conflicts in the outer merge.

Granted, it would be slightly more obvious if the conflict markers
causing conflicts were wrapped in their own little conflict markers:

	int hi(void) {
		printf("Hello, world!\n");
	}
	<<<<<<< intermediate merge
	<<<<<<<< HEAD
	=======
	>>>>>>> <HASH>... merge head #1
	/* main event loop */
	void event_loop(void) {
		/* TODO: place holder for now */
	<<<<<<< intermediate merge
	========
	=======
	}
	>>>>>>> <HASH>... merge head #1
	/* caller */
	void caller(void) {
		hi();
	>>>>>>>> <HASH>... original merge
	}

At least now it is obvious that the extra `<<<<<<<< HEAD` before the
event loop, and the extra `========` after it, *caused* the "outer"
merge conflict. But xdl_merge() does not wrap the merge conflicts this
way because there are only three unconflicting lines between the
conflicting lines, and xdl_merge() tries to optimize for a minimal total
number of lines (including the added conflict markers).

In practice, the functions would be longer, and xdl_merge() *would* wrap
only the nested conflict markers in outer conflict markers.

It is still not something you would want to encounter in your every-day
work, but presenting it this way is better than what we had before.
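
The marker-size rule can be illustrated with a small Python sketch. The
helper names are hypothetical, and the generalization to more than two
3-way merges (each earlier merge getting one more character) is an
assumption made here for illustration; the text above only describes the
two-merge case.

```python
def marker_size_for(merge_index, merge_count, base_size=7):
    """Earlier 3-way merges get longer markers; the last one uses
    the regular size of 7 characters."""
    return base_size + (merge_count - 1 - merge_index)


def conflict_hunk(ours, theirs, ours_label, theirs_label, marker_size=7):
    """Format one conflict hunk as a list of lines."""
    return (
        ["<" * marker_size + " " + ours_label]
        + ours
        + ["=" * marker_size]
        + theirs
        + [">" * marker_size + " " + theirs_label]
    )


# First of two merges: 8-character markers.
outer = conflict_hunk(
    ["/* main event loop */"], ["/* caller */"],
    "HEAD", "<HASH>... original merge",
    marker_size_for(0, 2),
)
# Second of two merges: the usual 7-character markers.
inner = conflict_hunk(
    ["/* main event loop */"], ["}"],
    "intermediate merge", "<HASH>... merge head #1",
    marker_size_for(1, 2),
)
```

Because the two sizes differ, a reader (or a script) can tell from the
marker length alone which of the two merges produced a given marker line.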

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

WIP rebase --rebase-merges: try to avoid unnecessary merge conflicts

When rebasing a regular merge between two parent commits, we have the
problem that we have to perform *two* 3-way merges, because we want to
merge in the changes (like amendments, merge conflict resolutions, etc.)
from the original merge commit, too.

When the first of these 3-way merges already had conflicts, we run the
risk of ending up with nested conflicts in the second 3-way merge.

So let's see in that case whether we gain something by merging the
original merge commit with the other parent first, and if that resulted in
a clean merge, proceed to merge the first parent (in this case, we cannot
end up with nested merge conflicts).
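
The ordering decision can be sketched as follows. This is a toy model
only: trees are dicts of path -> content, `merge3()` is a hypothetical
file-level merge that reports conflicted paths, and `pick_parent_order()`
is a name invented here. A file-level model cannot show nested conflict
markers, so the sketch only covers which parent gets merged first.

```python
CONFLICT = "<conflict>"  # placeholder content for a conflicted path


def merge3(base, ours, theirs):
    """Toy file-level 3-way merge; returns (tree, conflicted_paths)."""
    tree, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or t == b:
            merged = o            # sides agree, or only "ours" changed
        elif o == b:
            merged = t            # only "theirs" changed
        else:
            merged = CONFLICT     # both changed differently
            conflicts.append(path)
        if merged is not None:
            tree[path] = merged
    return tree, conflicts


def pick_parent_order(merge, parents, rebased):
    """Return the parent indices in the order their merges should run."""
    _, conflicts = merge3(parents[0], merge, rebased[0])
    if not conflicts:
        return (0, 1)             # usual order is fine
    _, conflicts = merge3(parents[1], merge, rebased[1])
    if not conflicts:
        return (1, 0)             # other parent merges cleanly: do it first
    return (0, 1)                 # both conflict: keep the original order


# Hypothetical case: upstream changed f under the first parent only.
merge = {"f": "m"}
parents = [{"f": "a"}, {"f": "b"}]
rebased = [{"f": "a2"}, {"f": "b"}]
order = pick_parent_order(merge, parents, rebased)
```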

This simplifies the realistic example of a nested merge conflict to a
non-nested merge conflict. Before:

	int hi(void) {
		printf("Hello, world!\n");
	}
	<<<<<<< intermediate merge
	<<<<<<<< HEAD
	/* main event loop */
	void event_loop(void) {
		/* TODO: place holder for now */
	========
	=======
	}
	>>>>>>> <HASH>... merge head #1
	/* caller */
	void caller(void) {
		hi();
	>>>>>>>> <HASH>... original merge
	}

With this patch, this becomes much simpler:

	int hi(void) {
		printf("Hello, world!\n");
	}
	/* main event loop */
	void event_loop(void) {
		/* TODO: place holder for now */
	}
	<<<<<<<< HEAD
	========
	/* caller */
	void caller(void) {
		hi();
	}
	>>>>>>>> <HASH>... intermediate merge

Note: this needs to be refactored and stuff and things. It may even be
necessary to dive deeper into the code and implement a "W merge" that
avoids the problem where (one part of) one file would benefit from merging
the second parent before the first, while another (part of the same) file
would benefit from the reverse order.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

TODO

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

@dscho dscho force-pushed the dscho:sequencer-shears branch from 7f23c20 to 0e9cb8f Apr 23, 2018

@dscho

Member Author

commented Oct 22, 2018

This made it into master a long time ago.

@dscho dscho closed this Oct 22, 2018

@dscho dscho deleted the dscho:sequencer-shears branch Oct 22, 2018
