merge-ort: fix bug with renormalization and rename/delete conflicts #1174

Closed


newren
Contributor

@newren newren commented Dec 27, 2021

Original report: https://lore.kernel.org/git/CAN0XMOK8iHZnbtYw7CPAQGJcmuVSDxQoFNFEwiaa41V89F1rzA@mail.gmail.com/

Built on v2.34.1, but rebases onto and/or merges cleanly with newer versions.

Changes since v1:

  • Added Stolee's Reviewed-by

cc: Derrick Stolee stolee@gmail.com
cc: Elijah Newren newren@gmail.com

@newren
Contributor Author

newren commented Dec 28, 2021

/submit

@gitgitgadget-git

Submitted as pull.1174.git.git.1640650846612.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-git-1174/newren/merge-ort-rename-delete-renormalization-bug-v1

To fetch this version to local tag pr-git-1174/newren/merge-ort-rename-delete-renormalization-bug-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-git-1174/newren/merge-ort-rename-delete-renormalization-bug-v1

@gitgitgadget-git

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 12/27/2021 7:20 PM, Elijah Newren via GitGitGadget wrote:
> From: Elijah Newren <newren@gmail.com>
> 
> Ever since commit a492d5331c ("merge-ort: ensure we consult df_conflict
> and path_conflicts", 2021-06-30), when renormalization is active AND a
> file is involved in a rename/delete conflict BUT the file is unmodified
> (either before or after renormalization), merge-ort was running into an
> assertion failure. 

This "the file is unmodified" is critical, as when I looked at the test,
it seemed too simple. I asked myself, "Why does renormalization matter
here?" Turns out it is just an artifact of the carefully organized cases.

>  		if (opt->renormalize &&
>  		    blob_unchanged(opt, &ci->stages[0], &ci->stages[side],
>  				   path)) {
> -			ci->merged.is_null = 1;
> -			ci->merged.clean = 1;
> -			assert(!ci->df_conflict && !ci->path_conflict);
> +			if (!ci->path_conflict) {
> +				/*
> +				 * Blob unchanged after renormalization, so
> +				 * there's no modify/delete conflict after all;
> +				 * we can just remove the file.
> +				 */
> +				ci->merged.is_null = 1;
> +				ci->merged.clean = 1;
> +				 /*
> +				  * file goes away => even if there was a
> +				  * directory/file conflict there isn't one now.
> +				  */
> +				ci->df_conflict = 0;
> +			} else {
> +				/* rename/delete, so conflict remains */
> +			}

This breakdown of the cases is informative, and I like how self-contained
the change is.
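
The case split in the hunk above can be modeled in isolation. The following is a minimal sketch, not the code in merge-ort.c: `struct entry_state` and `resolve_unchanged_vs_delete()` are hypothetical names, and the two boolean parameters stand in for `opt->renormalize` and the result of `blob_unchanged()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Reduced stand-in for the fields the patch touches.
 * Hypothetical layout; the real struct conflict_info has more members.
 */
struct entry_state {
	bool is_null;        /* merged result is "file removed" */
	bool clean;          /* path merged without conflict */
	bool df_conflict;    /* directory/file conflict flagged earlier */
	bool path_conflict;  /* e.g. rename/delete flagged earlier */
};

/*
 * Model of the patched branch. One side deleted the file and the other
 * side's blob matches the base, possibly only after renormalization.
 * Returns true when this branch resolved the path cleanly.
 */
bool resolve_unchanged_vs_delete(struct entry_state *ci,
				 bool renormalize, bool unchanged)
{
	assert(ci != NULL);

	if (!(renormalize && unchanged))
		return false;   /* branch not taken */

	if (ci->path_conflict)
		return false;   /* rename/delete, so conflict remains */

	/* No modify/delete conflict after all; just remove the file. */
	ci->is_null = true;
	ci->clean = true;
	/* The file goes away, so any directory/file conflict goes too. */
	ci->df_conflict = false;
	return true;
}
```

With `path_conflict` set, the function leaves the entry untouched, which corresponds to the empty `else` arm of the patch. With only `df_conflict` set, the entry becomes clean and the flag is cleared rather than asserted on.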

> +test_expect_success 'rename/delete vs. renormalization' '
> +	git init subrepo &&
> +	(
> +		cd subrepo &&
> +		echo foo >oldfile &&
> +		git add oldfile &&
> +		git commit -m original &&
> +
> +		git branch rename &&
> +		git branch nuke &&
> +
> +		git checkout rename &&
> +		git mv oldfile newfile &&
> +		git commit -m renamed &&
> +
> +		git checkout nuke &&
> +		git rm oldfile &&
> +		git commit -m deleted &&
> +
> +		git checkout rename^0 &&
> +		test_must_fail git -c merge.renormalize=true merge nuke >out &&
> +
> +		grep "rename/delete" out
> +	)
> +'
> +
>  test_done

I tested this on the latest 'master' and saw the following:

  git: merge-ort.c:3846: process_entry: Assertion `!ci->df_conflict && !ci->path_conflict' failed

so it indeed hits this case.

This patch looks good to me. Thanks!

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>

@gitgitgadget-git

User Derrick Stolee <stolee@gmail.com> has been added to the cc: list.

Ever since commit a492d53 ("merge-ort: ensure we consult df_conflict
and path_conflicts", 2021-06-30), when renormalization is active AND a
file is involved in a rename/delete conflict BUT the file is unmodified
(either before or after renormalization), merge-ort was running into an
assertion failure.  Prior to that commit (or if assertions were compiled
out), merge-ort would mis-merge instead, ignoring the rename/delete
conflict and just deleting the file.

Remove the assertions, fix the code appropriately, leave some good
comments in the code, and add a testcase for this situation.

Reported-by: Ralf Thielow <ralf.thielow@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
@newren newren force-pushed the merge-ort-rename-delete-renormalization-bug branch from 5841f3d to 72876b9 Compare December 30, 2021 01:55
@newren
Contributor Author

newren commented Dec 30, 2021

/submit

@gitgitgadget-git

Submitted as pull.1174.v2.git.git.1640902135926.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-git-1174/newren/merge-ort-rename-delete-renormalization-bug-v2

To fetch this version to local tag pr-git-1174/newren/merge-ort-rename-delete-renormalization-bug-v2:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-git-1174/newren/merge-ort-rename-delete-renormalization-bug-v2

@gitgitgadget-git

On the Git mailing list, Junio C Hamano wrote (reply to this):

Derrick Stolee <stolee@gmail.com> writes:

> This breakdown of the cases is informative, and I like how self-contained
> the change is.
>  ....
>
> This patch looks good to me. Thanks!
>
> Reviewed-by: Derrick Stolee <dstolee@microsoft.com>

Thanks, both.

A related tangent, but I was looking at the data structure involved
and noticed that the casting between structure types "merged_info"
and "conflict_info" looked a bit ugly.  It might be worth cleaning
them up into 

 (A) a union of two structs, with a "clean" member in the union to
     switch between the two structures; or

 (B) a single structure that looks like "conflict_info" but inlines
     members of "merged_info" into it.

The latter may be cleaner and simpler, and the unified data type
would be the "merge info", which may be representing cleanly merged
path, or conflicted path, and would justify conditional use of some
members based on the value of the .clean member.
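
To make the two proposals concrete, here is a rough sketch of what (A) and (B) could look like. The field sets are heavily reduced and every type name ending in `_sketch`, `_a` or `_b` is made up for illustration; git's real "merged_info" and "conflict_info" carry more members.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the two existing types (hypothetical layouts). */
struct merged_info_sketch {
	unsigned is_null : 1;
	unsigned clean : 1;
	const char *directory_name;
};

struct conflict_info_sketch {
	struct merged_info_sketch merged;  /* first member, as discussed */
	const char *pathnames[3];
	unsigned df_conflict : 1;
	unsigned path_conflict : 1;
};

/* (A): a union of the two; .clean says which member is meaningful. */
union merge_info_a {
	struct merged_info_sketch merged;
	struct conflict_info_sketch conflict;
};

/* (B): one struct with the "merged" members inlined. */
struct merge_info_b {
	unsigned is_null : 1;
	unsigned clean : 1;
	const char *directory_name;
	const char *pathnames[3];          /* only meaningful if !clean */
	unsigned df_conflict : 1;          /* only meaningful if !clean */
	unsigned path_conflict : 1;        /* only meaningful if !clean */
};

/* With (A), the discriminator is readable through either member. */
int is_clean_a(const union merge_info_a *mi)
{
	assert(mi != NULL);
	return mi->merged.clean;
}
```

In both variants every instance is at least as large as the conflict case, since a union is as big as its largest member.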

@gitgitgadget-git

This branch is now known as en/merge-ort-renorm-with-rename-delete-conflict-fix.

@gitgitgadget-git

This patch series was integrated into seen via f5761ec.

@gitgitgadget-git

On the Git mailing list, Elijah Newren wrote (reply to this):

On Thu, Dec 30, 2021 at 2:56 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Derrick Stolee <stolee@gmail.com> writes:
>
> > This breakdown of the cases is informative, and I like how self-contained
> > the change is.
> >  ....
> >
> > This patch looks good to me. Thanks!
> >
> > Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
>
> Thanks, both.
>
> A related tangent, but I was looking at the data structure involved
> and noticed that the casting between structure types "merged_info"
> and "conflict_info" looked a bit ugly.

Yes, that's true.

> It might be worth cleaning them up into
>
>  (A) a union of two structs, with a "clean" member in the union to
>      switch between the two structures; or
>
>  (B) a single structure that looks like "conflict_info" but inlines
>      members of "merged_info" into it.
>
> The latter may be cleaner and simpler, and the unified data type
> would be the "merge info", which may be representing cleanly merged
> path, or conflicted path, and would justify conditional use of some
> members based on the value of the .clean member.

These are heavily used data structures.  Note that:
  sizeof(struct conflict_info) = 216
  sizeof(struct merged_info) = 64
In particular, we have to allocate one or the other of these for every
path (both file and directory) involved in the merge.  Since the
former is 3.375 times bigger than the latter, and the vast majority of
paths involved in a merge usually do not conflict (think of files only
changed on one side), using just one combined struct would require
more than 3x the amount of memory.  So I'd rather avoid (B).
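
As a back-of-the-envelope illustration of that cost, the sketch below compares the two schemes using the sizes quoted above. The function names are hypothetical, and it assumes a combined struct would be no smaller than today's conflict_info; the exact sizes depend on platform and git version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sizes quoted in this message; treat them as example inputs. */
enum { CONFLICT_INFO_SIZE = 216, MERGED_INFO_SIZE = 64 };

/* Current scheme: small struct for clean paths, large one otherwise. */
size_t bytes_two_structs(size_t clean_paths, size_t conflicted_paths)
{
	return clean_paths * MERGED_INFO_SIZE +
	       conflicted_paths * CONFLICT_INFO_SIZE;
}

/* Option (B): every path pays for the large struct. */
size_t bytes_single_struct(size_t clean_paths, size_t conflicted_paths)
{
	/* Guard the addition against wrap-around. */
	assert(conflicted_paths <= SIZE_MAX - clean_paths);
	return (clean_paths + conflicted_paths) * CONFLICT_INFO_SIZE;
}
```

For a made-up merge touching 100,000 paths of which 100 conflict, the first function gives 6,415,200 bytes and the second 21,600,000 bytes, a bit under the 3.375 ratio that applies when no path conflicts at all.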

(A) may work, but I'd still have to allocate merged_info instead of
the union type to avoid the memory increase.  And since we have an
amount of memory allocated that is smaller than the union, when
accessing it via the union, Stolee would probably still want all the
same casting safeguards (as a safety check to avoid out-of-bounds
accesses) that I think you're complaining about.
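
A minimal sketch of the pattern being described, with hypothetical type and function names and reduced field sets: allocate only the small struct for clean paths, and check the `clean` flag before treating a pointer as the larger type.

```c
#include <assert.h>
#include <stdlib.h>

/* Reduced stand-ins; not the real git layouts. */
struct merged_sketch {
	unsigned is_null : 1;
	unsigned clean : 1;
};

struct conflict_sketch {
	struct merged_sketch merged;   /* must stay the first member */
	unsigned path_conflict : 1;
	const char *pathnames[3];
};

/*
 * Allocate only as much memory as the path needs. Clean paths get the
 * small struct; everything else gets the large one. The caller frees
 * the result with free().
 */
struct merged_sketch *alloc_path_info(int clean)
{
	size_t sz = clean ? sizeof(struct merged_sketch)
			  : sizeof(struct conflict_sketch);
	struct merged_sketch *mi = calloc(1, sz);

	if (mi)
		mi->clean = clean ? 1 : 0;
	return mi;
}

/*
 * Checked downcast: for a clean entry the memory behind `mi` is too
 * small to be read as a conflict_sketch, so refuse before casting.
 */
struct conflict_sketch *as_conflict(struct merged_sketch *mi)
{
	assert(mi && !mi->clean);
	return (struct conflict_sketch *)mi;
}
```

Switching to a union would change the type of the pointer being passed around, but as long as clean entries are allocated at the smaller size, a check like the one in `as_conflict()` is still what keeps accesses in bounds.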

@gitgitgadget-git

User Elijah Newren <newren@gmail.com> has been added to the cc: list.

@gitgitgadget-git

This patch series was integrated into seen via 69edbf7.

@gitgitgadget-git

There was a status update in the "New Topics" section about the branch en/merge-ort-renorm-with-rename-delete-conflict-fix on the Git mailing list:

A corner case bug in the ort merge strategy has been corrected.

Will merge to 'next'.
source: <pull.1174.git.git.1640650846612.gitgitgadget@gmail.com>

@gitgitgadget-git

This patch series was integrated into seen via e25f11a.

@gitgitgadget-git

This patch series was integrated into seen via 5ed02b3.

@gitgitgadget-git

This patch series was integrated into seen via 942b4c5.

@gitgitgadget-git

This patch series was integrated into next via bb81dd4.

@gitgitgadget-git gitgitgadget-git bot added the next label Jan 5, 2022
@gitgitgadget-git

There was a status update in the "Cooking" section about the branch en/merge-ort-renorm-with-rename-delete-conflict-fix on the Git mailing list:

A corner case bug in the ort merge strategy has been corrected.

Will merge to 'master'.
source: <pull.1174.git.git.1640650846612.gitgitgadget@gmail.com>

@gitgitgadget-git

This patch series was integrated into seen via 2c54104.

@gitgitgadget-git

This patch series was integrated into next via 2c54104.

@gitgitgadget-git

This patch series was integrated into master via 2c54104.

@gitgitgadget-git

Closed via 2c54104.

@newren newren deleted the merge-ort-rename-delete-renormalization-bug branch January 13, 2022 16:51