Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a flag to fix past merges. #89

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

matts1
Copy link

@matts1 matts1 commented Jul 5, 2023

This is a simple implementation where a commit chain A -> {B, C} -> merge commit -> D is just treated as A -> B -> C -> D for the purposes of absorb.

This is a simple implementation where a commit chain A -> {B, C} -> merge commit -> D is just treated as A -> B -> C -> D for the purposes of absorb.
@tummychow
Copy link
Owner

instinctively i think this is unsound. what if the merge is nontrivial and contains conflict resolutions? modifying B/C (the parents of the merge) might invalidate the resolution or introduce new conflicts. have you tested such a case?

@matts1
Copy link
Author

matts1 commented Jul 5, 2023

Those sound like reasonable concerns. Here's an algorithm that should address that. If you think it sounds reasonable I might implement it, but it does seem like a lot of work.

Given a tree:

       +-- A --+
base --+       +-- merge -- C
       +-- B --+

We would treat it internally as:

       +-- A -- mergeA=$(diff A merge) --+
base --+                                 +-- C
       +-- B -- mergeB=$(diff B merge) --+

We would only absorb if:

  • Of the set of all commits that don't commute, there is a single most common ancestor (eg. If both A and B fail to commute, we don't absorb)
  • It commutes with all mergeX commits

This would mean that:

  • Absorbing into C is a non-issue
  • Absorbing into base is a non-issue because it guarantees that every other commit commutes
  • Absorbing into A is fine, because that would guarantee that the branch B commutes, and that the changes that we make to A would also commute with mergeA (and vice versa)

@tummychow
Copy link
Owner

tummychow commented Jul 5, 2023

i'm going to be honest with you: convincing me that the described algorithm is correct will be as hard, possibly harder, than actually implementing it. which is why i didn't even attempt to support merges in the current design. can you demonstrate with a worked example? like, patches with line offsets.

Of the set of all commits that don't commute, there is a single most common ancestor (eg. If both A and B fail to commute, we don't absorb)

i'm not sure i follow this sentence. if both A and B fail to commute, they have a common ancestor (base). can you rigorously define "most common"?

@matts1
Copy link
Author

matts1 commented Jul 6, 2023

Oh, sorry, most common child, not most common ancestor. Let's only work with a single file, since we only need a single file

Base:

foo

A:

+ def
+ fn
+ here
foo
+ a = ...

B:

foo
+ b = ...

merged:

def
fn
extra
here
foo
a = ...
b = ...

I'll give you a more concrete algorithm with a slight modification - some pseudocode for the algorithm in something between rust and python

def select_absorb_commit(commit, hunk, base) -> Option<Commit>:
  if commit == base:
    return None

  if len(commit.parents) == 1:
    if not commute(commit, hunk):
      return Some(commit)
    return select_absorb_commit(commit.parent, hunk, base)
  
  commits  = {
    parent: select_absorb_commit(parent, hunk, base)
    for parent in commit.parents
  }
  # Ignore parents of the merge commit that commuted
  filtered = {k: c for k, v in commits if v == Some(c)}
  if len(filtered) == 1:
    parent, absorb_commit = next(filtered)
    if commute(diff(commit, parent), hunk):
      return absorb_commit

  return None

In all following examples, we will be writing f(commit) as a shorthand for select_absorb_commit(commit=commit, hunk=<hunk>, base=base)

Examples:
Hunk containing changes between def and fn:

f(merged)
  f(a) -> a
  f(b) -> None
  filtered = {a -> a}
  commute(diff(merged, a), hunk) -> True
  -> return a

Hunk containing changes between foo and a = ...

f(merged)
  f(a) -> a
  f(b) -> b
  filtered = {a -> a, b -> b}
  len != 1 -> return None

Hunk containing changes between fn and extra:

f(merged)
  f(a) -> a
  f(b) -> None
  filtered = {a -> a}
  commute(diff(merged, a), hunk) -> False
  -> return None

@tummychow
Copy link
Owner

hmmm, okay. this looks like something i could see myself accepting. i won't take this pr as-is, but you're welcome to start hacking on the algorithm (or throw your hands up in the air and say "nah it's too much work", in which case i'll close this pr and move the description to a new issue).

i think the only detail that's missing from the pseudocode is that all merge parents could return the same result (eg filtered = {a -> base, b -> base}) if the stack was deep enough to include their git-merge-base. that might be simple as replacing len(filtered) with len(unique(values(filtered))) and then iterating across all the merge parents, or perhaps it ends up being more complicated.

@devinrhode2
Copy link

Maybe this is answered in the pseudo-code, but I'm curious exactly how you'll decide the order of the two commits that are being merged.

Naively, I think the commit with the older commit date could go first. I think commit date is better than author date because it's a more likely indication on conflict resolution.

A commit that was authored 12 months ago and committed 12 months ago is more likely to have conflicts if applied on top of a commit made 2 days ago (regardless of author date), compared to applying a commit from 2 days ago on top of a commit made 12 months ago.

Given this, we now have a reasonable heuristic for choosing which commit goes first.

The second commit is then just the diff of A and the merge commit.

@matts1
Copy link
Author

matts1 commented Jul 7, 2023

@tummychow Good catch, I hadn't considered that. Yes, it should be as simple as len(unique(values(filtered))).

@devinrhode2 I believe you're misunderstanding something, but it's hard to say exactly what without an example of what you think would happen. If it helps clarify anything, we specifically do not perform any merges, and absorb will never create a fixup unless the following git rebase --rebase-merges is guaranteed to succeed, so it's not neccesary to do anything heuristic-based, and we don't need to decide which commit goes first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants