@wetneb commented Nov 8, 2023

Changes since v1:

  • improve commit message to mention the use case of custom merge drivers
  • improve documentation to show available options and recommend switching to "histogram"
  • add tests

I have left out:

  • switching the default to "histogram", because it should only be done in a subsequent release
  • adding a configuration variable to control this option, because I was not sure how to call it. Perhaps "merge-file.diffAlgorithm"?

cc: Phillip Wood <phillip.wood123@gmail.com>

Welcome to GitGitGadget

Hi @wetneb, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that your Pull Request has a good description, as it will be used as cover letter. You can CC potential reviewers by adding a footer to the PR description with the following syntax:

CC: Revi Ewer <revi.ewer@example.com>, Ill Takalook <ill.takalook@example.net>

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed.

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this. Note that any reviewers CC'd via the list in the PR description will not actually be sent emails.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@Ikke commented Nov 8, 2023

/allow

User wetneb is now allowed to use GitGitGadget.

@wetneb commented Nov 8, 2023

/preview

Preview email sent as pull.1606.git.git.1699479242716.gitgitgadget@gmail.com

@wetneb commented Nov 8, 2023

/submit

Submitted as pull.1606.git.git.1699480494355.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v1

To fetch this version to local tag pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v1

On the Git mailing list, Antonin Delpeuch wrote (reply to this):

Hi all,

Here are a few more thoughts about this patch, to explain what brought me to needing that. If this need is misguided, perhaps you could redirect me to a better solution.

I am writing a custom merge driver for Java files. This merge driver internally calls git-merge-file and then solves the merge conflicts which only consist of import statements (there might be cases where it gets it wrong, but I can then use other tools to clean up those import statements). When testing this, I noticed that the merge driver performed more poorly on other sorts of conflicts, compared to the standard "ort" merge strategy. This is because "ort" uses the "histogram" diff algorithm, which gives better results than the "myers" diff algorithm that merge-file uses.

Intuitively, if "histogram" is the default diff algorithm used by "git merge", then it would also make sense to have the same default for "git merge-file", but I assume that changing this default could be considered a bad breaking change. So I thought that making this diff algorithm configurable would be an acceptable move, hence my patch.

Of course, the diffing could be configured in other ways, for instance with its handling of whitespace or EOL (similarly to what the "git-diff" command offers). I think those options would definitely be worth exposing in merge-file as well. If you think this makes sense, then I would be happy to work on a new version of this patch which would attempt to include all the relevant options. I could also try to add the corresponding tests.

But perhaps my need is misguided? Could it be that I should not be writing a custom merge driver, but instead use another extension point to only process the conflicting hunks after execution of the existing merge driver? I couldn't find such an extension point, but it can well be that I missed it.

Thank you,

Antonin
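
For readers following along, the setup described in this message can be sketched as a small shell driver. This is only an illustration under assumptions: the function name `java_merge_driver`, the driver name `java-imports` and the script name `java-merge-driver.sh` are hypothetical, and the `--diff-algorithm` option only exists in a Git build that includes this patch. The `%O`, `%A` and `%B` placeholders and the `merge.<name>.driver` setting are the documented merge driver interface.

```shell
#!/bin/sh
# Hypothetical driver entry point. Git substitutes %O (ancestor), %A (ours,
# also the file that must receive the merge result) and %B (theirs).
#
# Registration, shown as comments so that sourcing this file has no side
# effects (names are made up for illustration):
#   git config merge.java-imports.driver 'java-merge-driver.sh %O %A %B'
#   echo '*.java merge=java-imports' >>.gitattributes

java_merge_driver () {
	base=$1 ours=$2 theirs=$3

	# merge-file writes the result into its first argument, which is what
	# a merge driver is expected to do with %A. Its exit status is 0 for
	# a clean merge and non-zero when conflicts remain (or on error).
	status=0
	git merge-file --diff-algorithm histogram \
		"$ours" "$base" "$theirs" || status=$?

	# A real driver would post-process the remaining conflicts here,
	# for instance the import-only hunks mentioned in the message above.
	return "$status"
}
```

Git treats a non-zero exit status from the driver as a conflicted merge, so returning the status of `git merge-file` unchanged preserves that signal.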

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Antonin

On 08/11/2023 21:54, Antonin Delpeuch via GitGitGadget wrote:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> This makes it possible to use other diff algorithms than the 'myers'
> default algorithm, when using the 'git merge-file' command.

I think being able to select the diff algorithm is reasonable. It might be nice to mention the use of "git merge-file" in custom merge drivers as a motivation in the commit message.

> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> ---
>      merge-file: add --diff-algorithm option
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1606%2Fwetneb%2Fmerge_file_configurable_diff_algorithm-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v1
> Pull-Request: https://github.com/git/git/pull/1606
>
>   Documentation/git-merge-file.txt |  5 +++++
>   builtin/merge-file.c             | 28 ++++++++++++++++++++++++++++
>   2 files changed, 33 insertions(+)
>
> diff --git a/Documentation/git-merge-file.txt b/Documentation/git-merge-file.txt
> index 6a081eacb72..917535217c1 100644
> --- a/Documentation/git-merge-file.txt
> +++ b/Documentation/git-merge-file.txt
> @@ -92,6 +92,11 @@ object store and the object ID of its blob is written to standard output.
>   	Instead of leaving conflicts in the file, resolve conflicts
>   	favouring our (or their or both) side of the lines.
>
> +--diff-algorithm <algorithm>::
> +	Use a different diff algorithm while merging, which can help
> +	avoid mismerges that occur due to unimportant matching lines
> +	(such as braces from distinct functions).  See also
> +	linkgit:git-diff[1] `--diff-algorithm`.

Perhaps we could list the available algorithms here so the user does not have to go searching for them in another man page.

>   EXAMPLES
>   --------
> diff --git a/builtin/merge-file.c b/builtin/merge-file.c
> index 832c93d8d54..1f987334a31 100644
> --- a/builtin/merge-file.c
> +++ b/builtin/merge-file.c
> @@ -1,5 +1,6 @@
>   #include "builtin.h"
>   #include "abspath.h"
> +#include "diff.h"
>   #include "hex.h"
>   #include "object-name.h"
>   #include "object-store.h"
> @@ -28,6 +29,30 @@ static int label_cb(const struct option *opt, const char *arg, int unset)
>   	return 0;
>   }
>
> +static int set_diff_algorithm(xpparam_t *xpp,
> +			      const char *alg)
> +{
> +	long diff_algorithm = parse_algorithm_value(alg);
> +	if (diff_algorithm < 0)
> +		return -1;
> +	xpp->flags = (xpp->flags & ~XDF_DIFF_ALGORITHM_MASK) | diff_algorithm;
> +	return 0;
> +}
> +
> +static int diff_algorithm_cb(const struct option *opt,
> +				const char *arg, int unset)
> +{
> +	xpparam_t *xpp = opt->value;
> +
> +	BUG_ON_OPT_NEG(unset);
> +
> +	if (set_diff_algorithm(xpp, arg))
> +		return error(_("option diff-algorithm accepts \"myers\", "
> +			       "\"minimal\", \"patience\" and \"histogram\""));
> +
> +	return 0;
> +}
> +
>   int cmd_merge_file(int argc, const char **argv, const char *prefix)
>   {
>   	const char *names[3] = { 0 };
> @@ -48,6 +73,9 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
>   			    XDL_MERGE_FAVOR_THEIRS),
>   		OPT_SET_INT(0, "union", &xmp.favor, N_("for conflicts, use a union version"),
>   			    XDL_MERGE_FAVOR_UNION),
> +		OPT_CALLBACK_F(0, "diff-algorithm", &xmp.xpp, N_("<algorithm>"),
> +			     N_("choose a diff algorithm"),
> +			     PARSE_OPT_NONEG, diff_algorithm_cb),
>   		OPT_INTEGER(0, "marker-size", &xmp.marker_size,
>   			    N_("for conflicts, use this marker size")),
>   		OPT__QUIET(&quiet, N_("do not warn about conflicts")),

This patch looks sensible to me, it would be nice to have some tests though.

Best Wishes

Phillip

> base-commit: 98009afd24e2304bf923a64750340423473809ff

User Phillip Wood <phillip.wood123@gmail.com> has been added to the cc: list.

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Antonin

On 17/11/2023 21:42, Antonin Delpeuch wrote:
> Hi all,
>
> Here are a few more thoughts about this patch, to explain what brought me to
> needing that. If this need is misguided, perhaps you could redirect me
> to a better solution.
>
> I am writing a custom merge driver for Java files. This merge driver
> internally calls git-merge-file and then solves the merge conflicts
> which only consist of import statements (there might be cases where it
> gets it wrong, but I can then use other tools to clean up those import
> statements). When testing this, I noticed that the merge driver
> performed more poorly on other sorts of conflicts, compared to the
> standard "ort" merge strategy. This is because "ort" uses the
> "histogram" diff algorithm, which gives better results than the "myers"
> diff algorithm that merge-file uses.

I cannot comment on this particular use but I think in general calling "git merge-file" from a custom merge driver is perfectly sensible. Have you tested your driver with this patch to see if you get better results with the histogram diff algorithm?

> Intuitively, if "histogram" is the default diff algorithm used by "git
> merge", then it would also make sense to have the same default for "git
> merge-file", but I assume that changing this default could be considered
> a bad breaking change. So I thought that making this diff algorithm
> configurable would be an acceptable move, hence my patch.

I can see there's an argument for changing the default algorithm of "git merge-file" to match what "ort" uses. I know Elijah found the histogram algorithm gave better results in his testing when he was developing "ort". While it would be a breaking change if on the average the new default gives better conflicts it might be worth it. This patch would mean that someone wanting to use the "myers" algorithm could still do so.

> Of course, the diffing could be configured in other ways, for instance
> with its handling of whitespace or EOL (similarly to what the "git-diff"
> command offers). I think those options would definitely be worth
> exposing in merge-file as well. If you think this makes sense, then I
> would be happy to work on a new version of this patch which would
> attempt to include all the relevant options. I could also try to add the
> corresponding tests.

It would be nice to see some tests for this patch, ideally using a test case that gives different conflicts for "myers" and "histogram". We could add the other options later if there is a demand.

Best Wishes

Phillip

> But perhaps my need is misguided? Could it be that I should not be
> writing a custom merge driver, but instead use another extension point
> to only process the conflicting hunks after execution of the existing
> merge driver? I couldn't find such an extension point, but it can well
> be that I missed it.
>
> Thank you,
>
> Antonin

On the Git mailing list, Antonin Delpeuch wrote (reply to this):

Hi Phillip,

Thank you so much for taking the time to review this!

On 19/11/2023 17:43, Phillip Wood wrote:
> I cannot comment on this particular use but I think in general calling
> "git merge-file" from a custom merge driver is perfectly sensible.
> Have you tested your driver with this patch to see if you get better
> results with the histogram diff algorithm?

Yes, I can confirm that the results are better in my use case indeed.

> I can see there's an argument for changing the default algorithm of
> "git merge-file" to match what "ort" uses. I know Elijah found the
> histogram algorithm gave better results in his testing when he was
> developing "ort". While it would be a breaking change if on the
> average the new default gives better conflicts it might be worth it.
> This patch would mean that someone wanting to use the "myers"
> algorithm could still do so.

Agreed. I would be happy to submit a follow-up patch to change the default. Or would you prefer to have it in the same patch (as a separate commit)? I was worried this would make my patch less likely to get merged.

> It would be nice to see some tests for this patch, ideally using a
> test case that gives different conflicts for "myers" and "histogram".
> We could add the other options later if there is a demand.

Will do.

> Perhaps we could list the available algorithms here so the user does
> not have to go searching for them in another man page.

This part is copied from "Documentation/merge-strategies.txt", which redirects to the manual for git-diff in the same way. I assume it was done so that whenever a new diff algorithm is introduced, it only needs documenting in one place. But I agree it is definitely more user-friendly to list the algorithms directly. Should I change the documentation of merge strategies in the same way?

Best wishes,

Antonin

@wetneb force-pushed the merge_file_configurable_diff_algorithm branch 3 times, most recently from d6eacab to ce139d2, on November 19, 2023 23:07

On the Git mailing list, Junio C Hamano wrote (reply to this):

Phillip Wood <phillip.wood123@gmail.com> writes:

> I can see there's an argument for changing the default algorithm of
> "git merge-file" to match what "ort" uses. I know Elijah found the
> histogram algorithm gave better results in his testing when he was
> developing "ort". While it would be a breaking change if on the
> average the new default gives better conflicts it might be worth
> it. This patch would mean that someone wanting to use the "myers"
> algorithm could still do so.

Sounds like a sensible thing to do.  First allow to configure the
custom algorithm from the command line option (and optionally via a
configuration variable) and ship it in a release, start giving a
warning if the using script did not specify the configuration or the
command line option and used the current default and ship it in the
next release, wait for a few releases and then finally flip the
default, or something like that.

Thanks.
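
A script that calls `git merge-file` and wants to stay unaffected by such a transition could pin the algorithm explicitly, while still working with a Git that predates the option. The sketch below is one hypothetical way to do that: the function names are made up for illustration, and it assumes that the `-h` usage output lists `--diff-algorithm` once the option exists (the usual parse-options behaviour).

```shell
# Succeeds when the installed "git merge-file" advertises --diff-algorithm.
merge_file_has_diff_algorithm () {
	# "-h" prints the usage and exits with a non-zero status, hence "|| :".
	usage=$(git merge-file -h 2>&1) || :
	case "$usage" in
	*--diff-algorithm*) return 0 ;;
	esac
	return 1
}

# Hypothetical wrapper: takes the same arguments as "git merge-file", but
# pins the algorithm to "histogram" whenever the option is available.
merge_file_pinned () {
	if merge_file_has_diff_algorithm
	then
		git merge-file --diff-algorithm histogram "$@"
	else
		git merge-file "$@"
	fi
}
```

Probing the usage text keeps the wrapper independent of version numbers, at the cost of one extra `git` invocation per call.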

This makes it possible to use other diff algorithms than the 'myers'
default algorithm, when using the 'git merge-file' command. This helps
avoid spurious conflicts by selecting a more recent algorithm such as
'histogram', for instance when using 'git merge-file' as part of a custom
merge driver.

Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk>

@wetneb force-pushed the merge_file_configurable_diff_algorithm branch from ce139d2 to 842b5ab on November 20, 2023 18:45

@wetneb commented Nov 20, 2023

/submit

Submitted as pull.1606.v2.git.git.1700507932937.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v2

To fetch this version to local tag pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v2

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Antonin

On 20/11/2023 19:18, Antonin Delpeuch via GitGitGadget wrote:
> From: Antonin Delpeuch <antonin@delpeuch.eu>
>
> This makes it possible to use other diff algorithms than the 'myers'
> default algorithm, when using the 'git merge-file' command. This helps
> avoid spurious conflicts by selecting a more recent algorithm such as
> 'histogram', for instance when using 'git merge-file' as part of a custom
> merge driver.
>
> Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
> Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk>

This version looks good to me. Thanks for adding the tests and well done for finding a test case that shows the benefits of changing the diff algorithm so clearly.

For future reference note that the custom on this list is not to add "Reviewed-by:" unless the reviewer explicitly suggests it. In this case I'm happy for it to be left as is.

Best Wishes

Phillip

> ---
>      merge-file: add --diff-algorithm option
>
>      Changes since v1:
>
>       * improve commit message to mention the use case of custom merge
>         drivers
>       * improve documentation to show available options and recommend
>         switching to "histogram"
>       * add tests
>
>      I have left out:
>
>       * switching the default to "histogram", because it should only be done
>         in a subsequent release
>       * adding a configuration variable to control this option, because I was
>         not sure how to call it. Perhaps "merge-file.diffAlgorithm"?
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1606%2Fwetneb%2Fmerge_file_configurable_diff_algorithm-v2
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1606/wetneb/merge_file_configurable_diff_algorithm-v2
> Pull-Request: https://github.com/git/git/pull/1606
>
> Range-diff vs v1:
>
>   1:  4aa453e30be ! 1:  842b5abf33c merge-file: add --diff-algorithm option
>       @@ Commit message
>            merge-file: add --diff-algorithm option
>
>            This makes it possible to use other diff algorithms than the 'myers'
>       -    default algorithm, when using the 'git merge-file' command.
>       +    default algorithm, when using the 'git merge-file' command. This helps
>       +    avoid spurious conflicts by selecting a more recent algorithm such as
>       +    'histogram', for instance when using 'git merge-file' as part of a custom
>       +    merge driver.
>
>            Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
>       +    Reviewed-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>
>         ## Documentation/git-merge-file.txt ##
>        @@ Documentation/git-merge-file.txt: object store and the object ID of its blob is written to standard output.
>         	Instead of leaving conflicts in the file, resolve conflicts
>         	favouring our (or their or both) side of the lines.
>
>       -+--diff-algorithm <algorithm>::
>       -+	Use a different diff algorithm while merging, which can help
>       ++--diff-algorithm={patience|minimal|histogram|myers}::
>       ++	Use a different diff algorithm while merging. The current default is "myers",
>       ++	but selecting more recent algorithm such as "histogram" can help
>        +	avoid mismerges that occur due to unimportant matching lines
>       -+	(such as braces from distinct functions).  See also
>       ++	(such as braces from distinct functions). See also
>        +	linkgit:git-diff[1] `--diff-algorithm`.
>
>         EXAMPLES
>       @@ builtin/merge-file.c: int cmd_merge_file(int argc, const char **argv, const char
>         		OPT_INTEGER(0, "marker-size", &xmp.marker_size,
>         			    N_("for conflicts, use this marker size")),
>         		OPT__QUIET(&quiet, N_("do not warn about conflicts")),
>       +
>       + ## t/t6403-merge-file.sh ##
>       +@@ t/t6403-merge-file.sh: test_expect_success 'setup' '
>       + 	deduxit me super semitas jusitiae,
>       + 	EOF
>       +
>       +-	printf "propter nomen suum." >>new4.txt
>       ++	printf "propter nomen suum." >>new4.txt &&
>       ++
>       ++	cat >base.c <<-\EOF &&
>       ++	int f(int x, int y)
>       ++	{
>       ++		if (x == 0)
>       ++		{
>       ++			return y;
>       ++		}
>       ++		return x;
>       ++	}
>       ++
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u < 30)
>       ++		{
>       ++			u++;
>       ++		}
>       ++		return u;
>       ++	}
>       ++	EOF
>       ++
>       ++	cat >ours.c <<-\EOF &&
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u < 30)
>       ++		{
>       ++			u++;
>       ++		}
>       ++		return u;
>       ++	}
>       ++
>       ++	int h(int x, int y, int z)
>       ++	{
>       ++		if (z == 0)
>       ++		{
>       ++			return x;
>       ++		}
>       ++		return y;
>       ++	}
>       ++	EOF
>       ++
>       ++	cat >theirs.c <<-\EOF
>       ++	int f(int x, int y)
>       ++	{
>       ++		if (x == 0)
>       ++		{
>       ++			return y;
>       ++		}
>       ++		return x;
>       ++	}
>       ++
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u > 34)
>       ++		{
>       ++			u--;
>       ++		}
>       ++		return u;
>       ++	}
>       ++	EOF
>       + '
>       +
>       + test_expect_success 'merge with no changes' '
>       +@@ t/t6403-merge-file.sh: test_expect_success '--object-id fails without repository' '
>       + 	grep "not a git repository" err
>       + '
>       +
>       ++test_expect_success 'merging C files with "myers" diff algorithm creates some spurious conflicts' '
>       ++	cat >expect.c <<-\EOF &&
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u < 30)
>       ++		{
>       ++			u++;
>       ++		}
>       ++		return u;
>       ++	}
>       ++
>       ++	int h(int x, int y, int z)
>       ++	{
>       ++	<<<<<<< ours.c
>       ++		if (z == 0)
>       ++	||||||| base.c
>       ++		while (u < 30)
>       ++	=======
>       ++		while (u > 34)
>       ++	>>>>>>> theirs.c
>       ++		{
>       ++	<<<<<<< ours.c
>       ++			return x;
>       ++	||||||| base.c
>       ++			u++;
>       ++	=======
>       ++			u--;
>       ++	>>>>>>> theirs.c
>       ++		}
>       ++		return y;
>       ++	}
>       ++	EOF
>       ++
>       ++	test_must_fail git merge-file -p --diff3 --diff-algorithm myers ours.c base.c theirs.c >myers_output.c &&
>       ++	test_cmp expect.c myers_output.c
>       ++'
>       ++
>       ++test_expect_success 'merging C files with "histogram" diff algorithm avoids some spurious conflicts' '
>       ++	cat >expect.c <<-\EOF &&
>       ++	int g(size_t u)
>       ++	{
>       ++		while (u > 34)
>       ++		{
>       ++			u--;
>       ++		}
>       ++		return u;
>       ++	}
>       ++
>       ++	int h(int x, int y, int z)
>       ++	{
>       ++		if (z == 0)
>       ++		{
>       ++			return x;
>       ++		}
>       ++		return y;
>       ++	}
>       ++	EOF
>       ++
>       ++	git merge-file -p --diff3 --diff-algorithm histogram ours.c base.c theirs.c >histogram_output.c &&
>       ++	test_cmp expect.c histogram_output.c
>       ++'
>       ++
>       + test_done
>
>   Documentation/git-merge-file.txt |   6 ++
>   builtin/merge-file.c             |  28 +++++++
>   t/t6403-merge-file.sh            | 124 ++++++++++++++++++++++++++++++-
>   3 files changed, 157 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-merge-file.txt b/Documentation/git-merge-file.txt
> index 6a081eacb72..71915a00fa4 100644
> --- a/Documentation/git-merge-file.txt
> +++ b/Documentation/git-merge-file.txt
> @@ -92,6 +92,12 @@ object store and the object ID of its blob is written to standard output.
>   	Instead of leaving conflicts in the file, resolve conflicts
>   	favouring our (or their or both) side of the lines.
>
> +--diff-algorithm={patience|minimal|histogram|myers}::
> +	Use a different diff algorithm while merging. The current default is "myers",
> +	but selecting more recent algorithm such as "histogram" can help
> +	avoid mismerges that occur due to unimportant matching lines
> +	(such as braces from distinct functions). See also
> +	linkgit:git-diff[1] `--diff-algorithm`.
>
>   EXAMPLES
>   --------
> diff --git a/builtin/merge-file.c b/builtin/merge-file.c
> index 832c93d8d54..1f987334a31 100644
> --- a/builtin/merge-file.c
> +++ b/builtin/merge-file.c
> @@ -1,5 +1,6 @@
>   #include "builtin.h"
>   #include "abspath.h"
> +#include "diff.h"
>   #include "hex.h"
>   #include "object-name.h"
>   #include "object-store.h"
> @@ -28,6 +29,30 @@ static int label_cb(const struct option *opt, const char *arg, int unset)
>   	return 0;
>   }
>
> +static int set_diff_algorithm(xpparam_t *xpp,
> +			      const char *alg)
> +{
> +	long diff_algorithm = parse_algorithm_value(alg);
> +	if (diff_algorithm < 0)
> +		return -1;
> +	xpp->flags = (xpp->flags & ~XDF_DIFF_ALGORITHM_MASK) | diff_algorithm;
> +	return 0;
> +}
> +
> +static int diff_algorithm_cb(const struct option *opt,
> +				const char *arg, int unset)
> +{
> +	xpparam_t *xpp = opt->value;
> +
> +	BUG_ON_OPT_NEG(unset);
> +
> +	if (set_diff_algorithm(xpp, arg))
> +		return error(_("option diff-algorithm accepts \"myers\", "
> +			       "\"minimal\", \"patience\" and \"histogram\""));
> +
> +	return 0;
> +}
> +
>   int cmd_merge_file(int argc, const char **argv, const char *prefix)
>   {
>   	const char *names[3] = { 0 };
> @@ -48,6 +73,9 @@ int cmd_merge_file(int argc, const char **argv, const char *prefix)
>   			    XDL_MERGE_FAVOR_THEIRS),
>   		OPT_SET_INT(0, "union", &xmp.favor, N_("for conflicts, use a union version"),
>   			    XDL_MERGE_FAVOR_UNION),
> +		OPT_CALLBACK_F(0, "diff-algorithm", &xmp.xpp, N_("<algorithm>"),
> +			     N_("choose a diff algorithm"),
> +			     PARSE_OPT_NONEG, diff_algorithm_cb),
>   		OPT_INTEGER(0, "marker-size", &xmp.marker_size,
>   			    N_("for conflicts, use this marker size")),
>   		OPT__QUIET(&quiet, N_("do not warn about conflicts")),
> diff --git a/t/t6403-merge-file.sh b/t/t6403-merge-file.sh
> index 2c92209ecab..fb872c5a113 100755
> --- a/t/t6403-merge-file.sh
> +++ b/t/t6403-merge-file.sh
> @@ -56,7 +56,67 @@ test_expect_success 'setup' '
>   	deduxit me super semitas jusitiae,
>   	EOF
>
> -	printf "propter nomen suum." >>new4.txt
> +	printf "propter nomen suum." >>new4.txt &&
> +
> +	cat >base.c <<-\EOF &&
> +	int f(int x, int y)
> +	{
> +		if (x == 0)
> +		{
> +			return y;
> +		}
> +		return x;
> +	}
> +
> +	int g(size_t u)
> +	{
> +		while (u < 30)
> +		{
> +			u++;
> +		}
> +		return u;
> +	}
> +	EOF
> +
> +	cat >ours.c <<-\EOF &&
> +	int g(size_t u)
> +	{
> +		while (u < 30)
> +		{
> +			u++;
> +		}
> +		return u;
> +	}
> +
> +	int h(int x, int y, int z)
> +	{
> +		if (z == 0)
> +		{
> +			return x;
> +		}
> +		return y;
> +	}
> +	EOF
> +
> +	cat >theirs.c <<-\EOF
> +	int f(int x, int y)
> +	{
> +		if (x == 0)
> +		{
> +			return y;
> +		}
> +		return x;
> +	}
> +
> +	int g(size_t u)
> +	{
> +		while (u > 34)
> +		{
> +			u--;
> +		}
> +		return u;
> +	}
> +	EOF
>   '
>
>   test_expect_success 'merge with no changes' '
> @@ -447,4 +507,66 @@ test_expect_success '--object-id fails without repository' '
>   	grep "not a git repository" err
>   '
>
> +test_expect_success 'merging C files with "myers" diff algorithm creates some spurious conflicts' '
> +	cat >expect.c <<-\EOF &&
> +	int g(size_t u)
> +	{
> +		while (u < 30)
> +		{
> +			u++;
> +		}
> +		return u;
> +	}
> +
> +	int h(int x, int y, int z)
> +	{
> +	<<<<<<< ours.c
> +		if (z == 0)
> +	||||||| base.c
> +		while (u < 30)
> +	=======
> +		while (u > 34)
> +	>>>>>>> theirs.c
> +		{
> +	<<<<<<< ours.c
> +			return x;
> +	||||||| base.c
> +			u++;
> +	=======
> +			u--;
> +	>>>>>>> theirs.c
> +		}
> +		return y;
> +	}
> +	EOF
> +
> +	test_must_fail git merge-file -p --diff3 --diff-algorithm myers ours.c base.c theirs.c >myers_output.c &&
> +	test_cmp expect.c myers_output.c
> +'
> +
> +test_expect_success 'merging C files with "histogram" diff algorithm avoids some spurious conflicts' '
> +	cat >expect.c <<-\EOF &&
> +	int g(size_t u)
> +	{
> +		while (u > 34)
> +		{
> +			u--;
> +		}
> +		return u;
> +	}
> +
> +	int h(int x, int y, int z)
> +	{
> +		if (z == 0)
> +		{
> +			return x;
> +		}
> +		return y;
> +	}
> +	EOF
> +
> +	git merge-file -p --diff3 --diff-algorithm histogram ours.c base.c theirs.c >histogram_output.c &&
> +	test_cmp expect.c histogram_output.c
> +'
> +
>   test_done
>
> base-commit: 98009afd24e2304bf923a64750340423473809ff
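
The difference exercised by these two tests can also be observed by hand. The helper below is a sketch, not part of the patch: `report_merge_status` is a hypothetical name, and it assumes a Git that already has `--diff-algorithm` for `merge-file` as well as three files laid out like `ours.c`, `base.c` and `theirs.c` from the test setup.

```shell
# Run "git merge-file" once per algorithm on the same three files and
# print the exit status of each run (0 means a clean merge).
report_merge_status () {
	ours=$1 base=$2 theirs=$3
	for algo in myers minimal patience histogram
	do
		status=0
		# -p writes the merge result to stdout instead of into "ours".
		git merge-file -p --diff-algorithm "$algo" \
			"$ours" "$base" "$theirs" >/dev/null 2>&1 || status=$?
		echo "$algo: exit status $status"
	done
}
```

Calling `report_merge_status ours.c base.c theirs.c` in a directory holding the files from the test setup should show a non-zero status for "myers" and zero for "histogram", matching the `test_must_fail` and plain invocations in the tests.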

This branch is now known as ad/merge-file-diff-algo.

This patch series was integrated into seen via c3697c1.

There was a status update in the "New Topics" section about the branch ad/merge-file-diff-algo on the Git mailing list:

"git merge-file" learned to take the "--diff-algorithm" option to
use algorithm different from the default "myers" diff.

Will merge to 'next'?
source: <pull.1606.v2.git.git.1700507932937.gitgitgadget@gmail.com>

This patch series was integrated into seen via da72d4d.

This patch series was integrated into seen via 2951092.

There was a status update in the "Cooking" section about the branch ad/merge-file-diff-algo on the Git mailing list:

"git merge-file" learned to take the "--diff-algorithm" option to
use algorithm different from the default "myers" diff.

Will merge to 'next'.
source: <pull.1606.v2.git.git.1700507932937.gitgitgadget@gmail.com>

This patch series was integrated into seen via 7c48c60.

This patch series was integrated into next via ab43a54.

This patch series was integrated into seen via a41f9ee.

There was a status update in the "Cooking" section about the branch ad/merge-file-diff-algo on the Git mailing list:

"git merge-file" learned to take the "--diff-algorithm" option to
use algorithm different from the default "myers" diff.

Will merge to 'master'.
source: <pull.1606.v2.git.git.1700507932937.gitgitgadget@gmail.com>

This patch series was integrated into seen via 05af269.

This patch series was integrated into seen via 54ff021.

This patch series was integrated into seen via 30a95ac.

This patch series was integrated into seen via 8c1cfc6.

This patch series was integrated into seen via 7895686.

This patch series was integrated into master via 7895686.

This patch series was integrated into next via 7895686.

Closed via 7895686.
