This repository was archived by the owner on Jul 24, 2020. It is now read-only.

Fix: Broken round-trip fidelity for merges of converged files #207

Merged: durin42 merged 3 commits into schacon:master from cwalther:convergedmerge on May 25, 2011

@cwalther
Contributor

I have found a circumstance in which pushing a changeset from Mercurial to Git and then pulling it back results in a different changeset ID, along with slight repository corruption that breaks some operations such as copy detection.

This happens when the changeset in question is a merge containing files that had the same contents in both parents but arrived at those contents through different histories (and therefore have different file revisions in Mercurial). When making such a merge, Mercurial creates a new file revision that ties together the two parent revisions, and therefore reports the file as changed in the merge, even though its content did not change with respect to either parent. Git, on the other hand, tracks only file contents, not their histories, and therefore does not report the file as changed once the merge has been imported into Git.

On pulling the merge back into Mercurial, Hg-Git must recreate the merged file revision, even though Git does not tell it to. It currently does not do that and instead keeps the file revision from the first parent, with two results:

1. The changeset ID is different from the original.
2. The filelog history is disjoint and doesn’t match the changeset history, so that activities that travel back along the file history, e.g. copy detection while merging, fail when they try to reach an ancestor that lies in the history of the second parent.
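
To make the difference concrete, here is a minimal Python sketch of the two hashing schemes. It assumes the usual revlog rule (a node is the SHA-1 of the two sorted parent nodes followed by the text) and Git's blob rule (SHA-1 of a `blob <size>\0` header followed by the content). The function names `hg_filenode` and `git_blob_id` are made up for this illustration and are not part of Mercurial or Hg-Git; copy metadata and other filelog details are ignored.

```python
import hashlib

NULLID = b"\0" * 20  # Mercurial's "no parent" node


def hg_filenode(text: bytes, p1: bytes = NULLID, p2: bytes = NULLID) -> bytes:
    """Hypothetical helper: revlog-style node = SHA-1(sorted parents + text)."""
    a, b = sorted((p1, p2))
    return hashlib.sha1(a + b + text).digest()


def git_blob_id(content: bytes) -> bytes:
    """Hypothetical helper: Git blob id = SHA-1('blob <size>\\0' + content)."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).digest()


# File history of `afile`, following the reproduction script in this description.
rev_a = hg_filenode(b"A\n")                 # "origin"
rev_b = hg_filenode(b"B\n", rev_a)          # "A->B"
rev_c_via_b = hg_filenode(b"C\n", rev_b)    # "B->C"
rev_c_direct = hg_filenode(b"C\n", rev_a)   # "A->C"
# The merge ties both C revisions together, so it gets a third, distinct node.
rev_merge = hg_filenode(b"C\n", rev_c_direct, rev_c_via_b)

distinct_hg_nodes = len({rev_c_via_b, rev_c_direct, rev_merge})
same_git_blob = git_blob_id(b"C\n") == git_blob_id(b"C\n")

print("distinct Mercurial file nodes for content 'C':", distinct_hg_nodes)
print("Git sees a single blob for content 'C':", same_git_blob)
```

Because the parents are part of the Mercurial hash, three file revisions with identical content are three different nodes, while Git has only one blob and so nothing to report as changed.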

The same thing also happens when a merge conflict is resolved by taking the entire file contents from the first parent and discarding any changes from the second parent.
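
Both cases come down to the same question: which filelog parents should the merged file record? The following is a simplified Python sketch of that decision on a toy in-memory history. It is an illustration under assumptions, not the code of the attached fix and not Mercurial's actual commit logic; `is_ancestor`, `merged_file_parents`, and the node labels are hypothetical.

```python
def is_ancestor(parents, anc, node):
    """Return True if `anc` is `node` itself or reachable via parent links."""
    stack, seen = [node], set()
    while stack:
        n = stack.pop()
        if n == anc:
            return True
        if n is None or n in seen:
            continue
        seen.add(n)
        stack.extend(parents.get(n, ()))
    return False


def merged_file_parents(parents, f1, f2):
    """Pick the filelog parents for one file in a merge changeset.

    f1 / f2 are the file's revisions in the first / second changeset
    parent (None if the file is absent on that side).
    """
    if f2 is None or f1 == f2 or is_ancestor(parents, f2, f1):
        return (f1, None)   # second side contributes no separate history
    if f1 is None or is_ancestor(parents, f1, f2):
        return (f2, None)   # first side is fully contained in the second
    return (f1, f2)         # diverged histories: a real file-level merge


# Toy filelog matching the reproduction script (labels are made up).
filelog = {
    "A": (),
    "B": ("A",),
    "C_via_B": ("B",),
    "C_direct": ("A",),
}

converged = merged_file_parents(filelog, "C_direct", "C_via_B")
linear = merged_file_parents(filelog, "B", "A")
print(converged)  # two parents: a new file revision is needed despite equal content
print(linear)     # one parent: the existing revision can be reused
```

In this sketch, a result with two parents means a new file revision has to be written even when the content equals that of the first parent, which is exactly the information Git does not carry.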

The following shell script will reproduce the issue, showing that the original and back-imported revision IDs are different, and why:

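# Requires the Hg-Git extension to be enabled in Mercurial.

# Build a history in which afile reaches content "C" along two paths.
hg init hgrepo1
cd hgrepo1
echo A > afile
hg add afile
hg ci -m "origin"
echo B > afile
hg ci -m "A->B"
echo C > afile
hg ci -m "B->C"
hg up -r0
echo C > afile
hg ci -m "A->C"
# Merge the two heads; afile has the same content on both sides.
hg merge -r2
hg ci -m "merge"
hg log --template "{node}\n" -r tip > ../mergeid-orig
cd ..
# Push the merge to Git.
git init --bare gitrepo
cd hgrepo1
hg bookmark -r4 master
hg push -r master ../gitrepo
cd ..
# Pull it back into a fresh Mercurial repository.
hg init hgrepo2
cd hgrepo2
hg pull -r master ../gitrepo
hg log --template "{node}\n" -r tip > ../mergeid-from-git
cd ..
# Compare the original and back-imported changeset IDs.
cmp mergeid-orig mergeid-from-git && echo equal || echo different

# Show the filelog of afile in both repositories to see why they differ.
hg debugindex hgrepo1/.hg/store/data/afile.i
hg debugindex hgrepo2/.hg/store/data/afile.i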

The attached commits add a test for the problem, which currently fails, and a fix that makes the test succeed without breaking any other test that was not already failing.

@durin42 durin42 merged commit 709b284 into schacon:master May 25, 2011
@durin42
Collaborator

durin42 commented May 25, 2011

Thanks, pushed. Outstanding detective work!

On Tue, May 24, 2011 at 1:49 PM, cwalther wrote:


@cwalther
Contributor Author

Thanks, that was quick!
