Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge-tree: load default git config #1530

Conversation

derrickstolee
Copy link

@derrickstolee derrickstolee commented May 10, 2023

This patch was reviewed on the Git security list, but the impact seemed limited to Git forges using merge-ort to create merge commits. The forges represented on the list have deployed versions of this patch and thus are no longer vulnerable.

Thanks,
-Stolee

cc: gitster@pobox.com
cc: me@ttaylorr.com
cc: christian.couder@gmail.com
cc: Felipe Contreras felipe.contreras@gmail.com
cc: Elijah Newren newren@gmail.com

The 'git merge-tree' command handles creating root trees for merges
without using the worktree. This is a critical operation in many Git
hosts, as they typically store bare repositories.

This builtin does not load the default Git config, which can have
several important ramifications.

In particular, one config that is loaded by default is
core.useReplaceRefs. This is typically disabled in Git hosts due to
the ability to spoof commits in strange ways.

Since this config is not loaded specifically during merge-tree, users
were previously able to use refs/replace/ references to make pull
requests that looked valid but introduced malicious content. The
resulting merge commit would have the correct commit history, but the
malicious content would exist in the root tree of the merge.

The fix is simple: load the default Git config in cmd_merge_tree().
This may also fix other behaviors that are effected by reading default
config. The only possible downside is a little extra computation time
spent reading config. The config parsing is placed after basic argument
parsing so it does not slow down usage errors.

Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
@derrickstolee derrickstolee self-assigned this May 10, 2023
@derrickstolee
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2023

Submitted as pull.1530.git.1683745654800.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1530/derrickstolee/stolee/refs-replace-upstream-v1

To fetch this version to local tag pr-1530/derrickstolee/stolee/refs-replace-upstream-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1530/derrickstolee/stolee/refs-replace-upstream-v1

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

>     This patch was reviewed on the Git security list, but the impact seemed
>     limited to Git forges using merge-ort to create merge commits. The
>     forges represented on the list have deployed versions of this patch and
>     thus are no longer vulnerable.

Let's queue directly on 'next' (unlike 'master', where we want to
merge only commits that had exposure in 'next' for a week or so,
there is no formal requirement for topics to enter 'next' before
spending any time in 'seen') and fast-track to 'master', as I've
seen it already reviewed adequately over there.

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2023

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 5/10/2023 3:18 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>>     This patch was reviewed on the Git security list, but the impact seemed
>>     limited to Git forges using merge-ort to create merge commits. The
>>     forges represented on the list have deployed versions of this patch and
>>     thus are no longer vulnerable.
> 
> Let's queue directly on 'next' (unlike 'master', where we want to
> merge only commits that had exposure in 'next' for a week or so,
> there is no formal requirement for topics to enter 'next' before
> spending any time in 'seen') and fast-track to 'master', as I've
> seen it already reviewed adequately over there.

Thanks for picking it up so quickly. I did not mean to imply that
we should merge to master without giving the public list time to
digest the patch. It is a rather simple one, though.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2023

On the Git mailing list, Felipe Contreras wrote (reply to this):

Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> The 'git merge-tree' command handles creating root trees for merges
> without using the worktree. This is a critical operation in many Git
> hosts, as they typically store bare repositories.
> 
> This builtin does not load the default Git config, which can have
> several important ramifications.

For the record, I had already sent a better version of this patch almost 2
years ago [1], not just for `git merge-tree`, but other commands as well.

The obvious fix was completely ignored by the maintainer.

The reason why it should be git_xmerge_config and not git_default_config, is
that merge.conflictstyle would not be parsed if you call git_default_config.

> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index aa8040c2a6a..b8f8a8b5d9f 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -17,6 +17,7 @@
>  #include "merge-blobs.h"
>  #include "quote.h"
>  #include "tree.h"
> +#include "config.h"
>  
>  static int line_termination = '\n';
>  
> @@ -628,6 +629,8 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
>  	if (argc != expected_remaining_argc)
>  		usage_with_options(merge_tree_usage, mt_options);
>  
> +	git_config(git_default_config, NULL);

It should be git_xmerge_config.

> +
>  	/* Do the relevant type of merge */
>  	if (o.mode == MODE_REAL)
>  		return real_merge(&o, merge_base, argv[0], argv[1], prefix);

[1] https://lore.kernel.org/git/20210622002714.1720891-3-felipe.contreras@gmail.com/

-- 
Felipe Contreras

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2023

User Felipe Contreras <felipe.contreras@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> The fix is simple: load the default Git config in cmd_merge_tree().
> This may also fix other behaviors that are effected by reading default
> config. The only possible downside is a little extra computation time
> spent reading config. The config parsing is placed after basic argument
> parsing so it does not slow down usage errors.

Presumably merge-tree wants to serve a low-level machinery that
gives reliable reproducible result, we may want to keep the
configuration variables we read as narrow as practical.  The
default_config() callback may still be wider than desirable from
that point of view, but I guess that is the most reasonable choice?

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2023

On the Git mailing list, Taylor Blau wrote (reply to this):

On Wed, May 10, 2023 at 12:18:15PM -0700, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >     This patch was reviewed on the Git security list, but the impact seemed
> >     limited to Git forges using merge-ort to create merge commits. The
> >     forges represented on the list have deployed versions of this patch and
> >     thus are no longer vulnerable.
>
> Let's queue directly on 'next' (unlike 'master', where we want to
> merge only commits that had exposure in 'next' for a week or so,
> there is no formal requirement for topics to enter 'next' before
> spending any time in 'seen') and fast-track to 'master', as I've
> seen it already reviewed adequately over there.

Agreed. I also participated in the earlier rounds of review and the
resulting patch looks obviously correct to me. I would be happy to see
it merged.

I added Elijah to the CC list, since he is likely to be interested in
this topic and may have thoughts to share.

Thanks,
Taylor

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2023

On the Git mailing list, Felipe Contreras wrote (reply to this):

Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > The fix is simple: load the default Git config in cmd_merge_tree().
> > This may also fix other behaviors that are effected by reading default
> > config. The only possible downside is a little extra computation time
> > spent reading config. The config parsing is placed after basic argument
> > parsing so it does not slow down usage errors.
> 
> Presumably merge-tree wants to serve a low-level machinery that
> gives reliable reproducible result, we may want to keep the
> configuration variables we read as narrow as practical.  The
> default_config() callback may still be wider than desirable from
> that point of view, but I guess that is the most reasonable choice?

If you want to *intentionally* ignore merge.conflictStyle in `git merge-tree`
then the documentation has to be updated as this is clearly not true:

  The performed merge will use the same feature as the "real"
  linkgit:git-merge[1], including:

    * three way content merges of individual files
    * rename detection
    * proper directory/file conflict handling
    * recursive ancestor consolidation (i.e. when there is more than one
      merge base, creating a virtual merge base by merging the merge bases)
    * etc.

If you want to ignore merge.conflictStyle, then `git merge-tree` would *not* be
using the same features as the real `git merge`, in particular proper conflict
halding (the conflict markers will be different).

-- 
Felipe Contreras

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Elijah Newren wrote (reply to this):

On Wed, May 10, 2023 at 12:33 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <derrickstolee@github.com>
>
> The 'git merge-tree' command handles creating root trees for merges
> without using the worktree. This is a critical operation in many Git
> hosts, as they typically store bare repositories.
>
> This builtin does not load the default Git config, which can have
> several important ramifications.
>
> In particular, one config that is loaded by default is
> core.useReplaceRefs. This is typically disabled in Git hosts due to
> the ability to spoof commits in strange ways.
>
> Since this config is not loaded specifically during merge-tree, users
> were previously able to use refs/replace/ references to make pull
> requests that looked valid but introduced malicious content. The
> resulting merge commit would have the correct commit history, but the
> malicious content would exist in the root tree of the merge.

Ouch!  So sorry for creating this problem.

> The fix is simple: load the default Git config in cmd_merge_tree().
> This may also fix other behaviors that are effected by reading default
> config. The only possible downside is a little extra computation time
> spent reading config. The config parsing is placed after basic argument
> parsing so it does not slow down usage errors.
>
> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>     merge-tree: load default git config
>
>     This patch was reviewed on the Git security list, but the impact seemed
>     limited to Git forges using merge-ort to create merge commits. The
>     forges represented on the list have deployed versions of this patch and
>     thus are no longer vulnerable.
>
>     Thanks, -Stolee
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1530%2Fderrickstolee%2Fstolee%2Frefs-replace-upstream-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1530/derrickstolee/stolee/refs-replace-upstream-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1530
>
>  builtin/merge-tree.c  |  3 +++
>  t/t4300-merge-tree.sh | 18 ++++++++++++++++++
>  2 files changed, 21 insertions(+)
>
> diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> index aa8040c2a6a..b8f8a8b5d9f 100644
> --- a/builtin/merge-tree.c
> +++ b/builtin/merge-tree.c
> @@ -17,6 +17,7 @@
>  #include "merge-blobs.h"
>  #include "quote.h"
>  #include "tree.h"
> +#include "config.h"
>
>  static int line_termination = '\n';
>
> @@ -628,6 +629,8 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
>         if (argc != expected_remaining_argc)
>                 usage_with_options(merge_tree_usage, mt_options);
>
> +       git_config(git_default_config, NULL);
> +

Always nice when it's a simple fix.  :-)

I am curious though...

init_merge_options() in merge-recursive.c (which is also used by
merge-ort) calls merge_recursive_config().  merge_recursive_config()
does a bunch of config parsing, regardless of whatever config parsing
is done beforehand by the caller of init_merge_options().  This makes
me wonder if the config which handles replace refs should be included
in merge_recursive_config() as well.  Doing so would have the added
benefit of making sure all the builtins calling the merge logic behave
similarly.  And if we copy/move the replace-refs-handling config
logic, does that replace the fix in this patch, or just supplement it?

To be honest, I've mostly ignored the config side of things while
working on the merge machinery, so I didn't even know (or at least
remember) the above details until I went digging just now.  I don't
know if the way init_merge_options()/merge_recursive_config() is how
we should do things, or just vestiges of how it's evolved from 15
years ago.

>         /* Do the relevant type of merge */
>         if (o.mode == MODE_REAL)
>                 return real_merge(&o, merge_base, argv[0], argv[1], prefix);
> diff --git a/t/t4300-merge-tree.sh b/t/t4300-merge-tree.sh
> index c52c8a21fae..57c4f26e461 100755
> --- a/t/t4300-merge-tree.sh
> +++ b/t/t4300-merge-tree.sh
> @@ -334,4 +334,22 @@ test_expect_success 'turn tree to file' '
>         test_cmp expect actual
>  '
>
> +test_expect_success 'merge-tree respects core.useReplaceRefs=false' '
> +       test_commit merge-to &&
> +       test_commit valid base &&
> +       git reset --hard HEAD^ &&
> +       test_commit malicious base &&
> +
> +       test_when_finished "git replace -d $(git rev-parse valid^0)" &&
> +       git replace valid^0 malicious^0 &&
> +
> +       tree=$(git -c core.useReplaceRefs=true merge-tree --write-tree merge-to valid) &&
> +       merged=$(git cat-file -p $tree:base) &&
> +       test malicious = $merged &&
> +
> +       tree=$(git -c core.useReplaceRefs=false merge-tree --write-tree merge-to valid) &&
> +       merged=$(git cat-file -p $tree:base) &&
> +       test valid = $merged
> +'
> +
>  test_done

Thanks for adding the test case too, as always.

> base-commit: 5597cfdf47db94825213fefe78c4485e6a5702d8
> --
> gitgitgadget

Looks good.  I am curious for other's thoughts on whether it may make
sense to add parsing of core.useReplaceRefs within
merge_recursive_config().

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

User Elijah Newren <newren@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Elijah Newren wrote (reply to this):

On Wed, May 10, 2023 at 4:21 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> On Wed, May 10, 2023 at 12:18:15PM -0700, Junio C Hamano wrote:
> > "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> >
> > >     This patch was reviewed on the Git security list, but the impact seemed
> > >     limited to Git forges using merge-ort to create merge commits. The
> > >     forges represented on the list have deployed versions of this patch and
> > >     thus are no longer vulnerable.
> >
> > Let's queue directly on 'next' (unlike 'master', where we want to
> > merge only commits that had exposure in 'next' for a week or so,
> > there is no formal requirement for topics to enter 'next' before
> > spending any time in 'seen') and fast-track to 'master', as I've
> > seen it already reviewed adequately over there.
>
> Agreed. I also participated in the earlier rounds of review and the
> resulting patch looks obviously correct to me. I would be happy to see
> it merged.
>
> I added Elijah to the CC list, since he is likely to be interested in
> this topic and may have thoughts to share.

Thanks.  I took a look and left some comments (it looks like the merge
machinery already parses _most_ relevant merge-related config
unconditionally, each time we set up a merge), but I had more
questions than answers.  :-)

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Felipe Contreras wrote (reply to this):

Elijah Newren wrote:
> On Wed, May 10, 2023 at 12:33 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: Derrick Stolee <derrickstolee@github.com>
> >
> > The 'git merge-tree' command handles creating root trees for merges
> > without using the worktree. This is a critical operation in many Git
> > hosts, as they typically store bare repositories.
> >
> > This builtin does not load the default Git config, which can have
> > several important ramifications.
> >
> > In particular, one config that is loaded by default is
> > core.useReplaceRefs. This is typically disabled in Git hosts due to
> > the ability to spoof commits in strange ways.
> >
> > Since this config is not loaded specifically during merge-tree, users
> > were previously able to use refs/replace/ references to make pull
> > requests that looked valid but introduced malicious content. The
> > resulting merge commit would have the correct commit history, but the
> > malicious content would exist in the root tree of the merge.
> 
> Ouch!  So sorry for creating this problem.
> 
> > The fix is simple: load the default Git config in cmd_merge_tree().
> > This may also fix other behaviors that are effected by reading default
> > config. The only possible downside is a little extra computation time
> > spent reading config. The config parsing is placed after basic argument
> > parsing so it does not slow down usage errors.
> >
> > Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> > ---
> >     merge-tree: load default git config
> >
> >     This patch was reviewed on the Git security list, but the impact seemed
> >     limited to Git forges using merge-ort to create merge commits. The
> >     forges represented on the list have deployed versions of this patch and
> >     thus are no longer vulnerable.
> >
> >     Thanks, -Stolee
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1530%2Fderrickstolee%2Fstolee%2Frefs-replace-upstream-v1
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1530/derrickstolee/stolee/refs-replace-upstream-v1
> > Pull-Request: https://github.com/gitgitgadget/git/pull/1530
> >
> >  builtin/merge-tree.c  |  3 +++
> >  t/t4300-merge-tree.sh | 18 ++++++++++++++++++
> >  2 files changed, 21 insertions(+)
> >
> > diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
> > index aa8040c2a6a..b8f8a8b5d9f 100644
> > --- a/builtin/merge-tree.c
> > +++ b/builtin/merge-tree.c
> > @@ -17,6 +17,7 @@
> >  #include "merge-blobs.h"
> >  #include "quote.h"
> >  #include "tree.h"
> > +#include "config.h"
> >
> >  static int line_termination = '\n';
> >
> > @@ -628,6 +629,8 @@ int cmd_merge_tree(int argc, const char **argv, const char *prefix)
> >         if (argc != expected_remaining_argc)
> >                 usage_with_options(merge_tree_usage, mt_options);
> >
> > +       git_config(git_default_config, NULL);
> > +
> 
> Always nice when it's a simple fix.  :-)
> 
> I am curious though...
> 
> init_merge_options() in merge-recursive.c (which is also used by
> merge-ort) calls merge_recursive_config().  merge_recursive_config()
> does a bunch of config parsing, regardless of whatever config parsing
> is done beforehand by the caller of init_merge_options().  This makes
> me wonder if the config which handles replace refs should be included
> in merge_recursive_config() as well.  Doing so would have the added
> benefit of making sure all the builtins calling the merge logic behave
> similarly.  And if we copy/move the replace-refs-handling config
> logic, does that replace the fix in this patch, or just supplement it?
> 
> To be honest, I've mostly ignored the config side of things while
> working on the merge machinery, so I didn't even know (or at least
> remember) the above details until I went digging just now.  I don't
> know if the way init_merge_options()/merge_recursive_config() is how
> we should do things, or just vestiges of how it's evolved from 15
> years ago.

It's obvious this is not the way we should do things as configurations are
overriden when they shouldn't be.

Something like this [1] is obviously needed.

[1] https://lore.kernel.org/git/20210609192842.696646-5-felipe.contreras@gmail.com/

-- 
Felipe Contreras

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 5/11/2023 2:34 AM, Elijah Newren wrote:
> On Wed, May 10, 2023 at 12:33 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> The 'git merge-tree' command handles creating root trees for merges
>> without using the worktree. This is a critical operation in many Git
>> hosts, as they typically store bare repositories.
>>
>> This builtin does not load the default Git config, which can have
>> several important ramifications.
>>
>> In particular, one config that is loaded by default is
>> core.useReplaceRefs. This is typically disabled in Git hosts due to
>> the ability to spoof commits in strange ways.

>> +       git_config(git_default_config, NULL);
>> +
> 
> Always nice when it's a simple fix.  :-)
> 
> I am curious though...
> 
> init_merge_options() in merge-recursive.c (which is also used by
> merge-ort) calls merge_recursive_config().  merge_recursive_config()
> does a bunch of config parsing, regardless of whatever config parsing
> is done beforehand by the caller of init_merge_options().  This makes
> me wonder if the config which handles replace refs should be included
> in merge_recursive_config() as well.  Doing so would have the added
> benefit of making sure all the builtins calling the merge logic behave
> similarly.  And if we copy/move the replace-refs-handling config
> logic, does that replace the fix in this patch, or just supplement it?
> 
> To be honest, I've mostly ignored the config side of things while
> working on the merge machinery, so I didn't even know (or at least
> remember) the above details until I went digging just now.  I don't
> know if the way init_merge_options()/merge_recursive_config() is how
> we should do things, or just vestiges of how it's evolved from 15
> years ago.
...
> Looks good.  I am curious for other's thoughts on whether it may make
> sense to add parsing of core.useReplaceRefs within
> merge_recursive_config().

In terms of a "real" fix to this kind of problem, I'm thinking that
we actually need to be sure we've parsed things like core.useReplaceRefs
when loading the object database for the first time.

Here, I'm suggesting the simplest fix before we can go about a more
rigorous change to prevent this from happening again.

The custom ahead-behind builtin that we have in our fork once also had
this same problem, and the fix was exactly like this. The impact was
less severe (mostly, things slowed down because the commit-graph was
disabled, but also the numbers could be different from expected).

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 5/10/2023 4:30 PM, Felipe Contreras wrote:
> Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> The 'git merge-tree' command handles creating root trees for merges
>> without using the worktree. This is a critical operation in many Git
>> hosts, as they typically store bare repositories.
>>
>> This builtin does not load the default Git config, which can have
>> several important ramifications.
> 
> For the record, I had already sent a better version of this patch almost 2
> years ago [1], not just for `git merge-tree`, but other commands as well.
> 
> The obvious fix was completely ignored by the maintainer.
> 
> The reason why it should be git_xmerge_config and not git_default_config, is
> that merge.conflictstyle would not be parsed if you call git_default_config.

As mentioned by Elijah in a different thread, the merge machinery loads
the merge config as needed. I confirmed by creating this test, which I
may submit as an independent patch:

test_expect_success 'merge-tree respects merge.conflictstyle' '
	test_commit conflict-base &&
	for branch in left right
	do
		git checkout -b $branch conflict-base &&
		echo $branch >>conflict-base.t &&
		git add conflict-base.t &&
		git commit -m $branch || return 1
	done &&

	test_must_fail git merge-tree left right >out1 &&
	test_must_fail git -c merge.conflictstyle=diff3 merge-tree left right >out2 &&

	tree1=$(head -n 1 out1) &&
	tree2=$(head -n 1 out2) &&

	git cat-file -p $tree1:conflict-base.t >conflict1 &&
	git cat-file -p $tree2:conflict-base.t >conflict2 &&
	! test_cmp conflict1 conflict2 &&
	! grep "||||||" conflict1 &&
	grep "||||||" conflict2
'

Thus we do not need to use git_xmerge_config at this point in the
process.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Felipe Contreras wrote (reply to this):

Derrick Stolee wrote:
> On 5/10/2023 4:30 PM, Felipe Contreras wrote:
> > Derrick Stolee via GitGitGadget wrote:
> >> From: Derrick Stolee <derrickstolee@github.com>
> >>
> >> The 'git merge-tree' command handles creating root trees for merges
> >> without using the worktree. This is a critical operation in many Git
> >> hosts, as they typically store bare repositories.
> >>
> >> This builtin does not load the default Git config, which can have
> >> several important ramifications.
> > 
> > For the record, I had already sent a better version of this patch almost 2
> > years ago [1], not just for `git merge-tree`, but other commands as well.
> > 
> > The obvious fix was completely ignored by the maintainer.
> > 
> > The reason why it should be git_xmerge_config and not git_default_config, is
> > that merge.conflictstyle would not be parsed if you call git_default_config.
> 
> As mentioned by Elijah in a different thread, the merge machinery loads
> the merge config as needed. I confirmed by creating this test, which I
> may submit as an independent patch:
> 
> test_expect_success 'merge-tree respects merge.conflictstyle' '
> 	test_commit conflict-base &&
> 	for branch in left right
> 	do
> 		git checkout -b $branch conflict-base &&
> 		echo $branch >>conflict-base.t &&
> 		git add conflict-base.t &&
> 		git commit -m $branch || return 1
> 	done &&
> 
> 	test_must_fail git merge-tree left right >out1 &&
> 	test_must_fail git -c merge.conflictstyle=diff3 merge-tree left right >out2 &&
> 
> 	tree1=$(head -n 1 out1) &&
> 	tree2=$(head -n 1 out2) &&
> 
> 	git cat-file -p $tree1:conflict-base.t >conflict1 &&
> 	git cat-file -p $tree2:conflict-base.t >conflict2 &&
> 	! test_cmp conflict1 conflict2 &&
> 	! grep "||||||" conflict1 &&
> 	grep "||||||" conflict2
> '

This test is doing a real merge, not a trivial merge.

Try doing a trivial merge instead--which is what most of our testing framework
checks--and your test fails:
 
@@ -14,17 +14,12 @@ test_expect_success 'merge-tree respects merge.conflictstyle' '
                 git commit -m $branch || return 1
         done &&
 
-        test_must_fail git merge-tree left right >out1 &&
-        test_must_fail git -c merge.conflictstyle=diff3 merge-tree left right >out2 &&
+        test_expect_code 0 git merge-tree conflict-base left right >out1 &&
+        test_expect_code 0 git -c merge.conflictstyle=diff3 merge-tree conflict-base left right >out2 &&
 
-        tree1=$(head -n 1 out1) &&
-        tree2=$(head -n 1 out2) &&
-
-        git cat-file -p $tree1:conflict-base.t >conflict1 &&
-        git cat-file -p $tree2:conflict-base.t >conflict2 &&
-        ! test_cmp conflict1 conflict2 &&
-        ! grep "||||||" conflict1 &&
-        grep "||||||" conflict2
+        ! test_cmp out1 out2 &&
+        ! grep "||||||" out1 &&
+        grep "||||||" out2
 '

-- 
Felipe Contreras

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Felipe Contreras wrote (reply to this):

Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
> > The fix is simple: load the default Git config in cmd_merge_tree().
> > This may also fix other behaviors that are effected by reading default
> > config. The only possible downside is a little extra computation time
> > spent reading config. The config parsing is placed after basic argument
> > parsing so it does not slow down usage errors.
> 
> Presumably merge-tree wants to serve a low-level machinery that
> gives reliable reproducible result, we may want to keep the
> configuration variables we read as narrow as practical.  The
> default_config() callback may still be wider than desirable from
> that point of view, but I guess that is the most reasonable choice?

If you want `git merge-tree` to not call git_xmerge_config, then why did you
merge 1f0c3a29da (merge-tree: implement real merges, 2022-06-18) which
introduces a call to init_merge_options, which calls merge_recursive_config,
which calls git_xmerge_config?

And BTW, the way merge_recursive_config is implemented is completely different
to how the rest of the config infraestructure works.

-- 
Felipe Contreras

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Felipe Contreras wrote (reply to this):

When real merges were implemented in 1f0c3a29da (merge-tree: implement
real merges, 2022-06-18), init_merge_options was called after doing
get_merge_parent, which means core.useReplaceRefs was not parsed at the
moment the commit objects were parsed, and thus essentially this option
was ignored.

This configuration is typically disabled in git hosts due to the ability
to spoof commits in strange ways. Users are able to use refs/replace/
references to make pull requests that look valid but introduce malicious
content. The resulting merge has the correct commit history, but the
malicious content exists in the root tree of the merge.

To fix this let's simply load the configuration before any call to
get_merge_parent().

Cc: Elijah Newren <newren@gmail.com>
Tests-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
---
 builtin/merge-tree.c  |  4 ++--
 t/t4300-merge-tree.sh | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-tree.c b/builtin/merge-tree.c
index aa8040c2a6..b405cf448f 100644
--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -424,6 +424,8 @@ static int real_merge(struct merge_tree_options *o,
 	struct merge_result result = { 0 };
 	int show_messages = o->show_messages;
 
+	init_merge_options(&opt, the_repository);
+
 	parent1 = get_merge_parent(branch1);
 	if (!parent1)
 		help_unknown_ref(branch1, "merge-tree",
@@ -434,8 +436,6 @@ static int real_merge(struct merge_tree_options *o,
 		help_unknown_ref(branch2, "merge-tree",
 				 _("not something we can merge"));
 
-	init_merge_options(&opt, the_repository);
-
 	opt.show_rename_progress = 0;
 
 	opt.branch1 = branch1;
diff --git a/t/t4300-merge-tree.sh b/t/t4300-merge-tree.sh
index c52c8a21fa..57c4f26e46 100755
--- a/t/t4300-merge-tree.sh
+++ b/t/t4300-merge-tree.sh
@@ -334,4 +334,22 @@ test_expect_success 'turn tree to file' '
 	test_cmp expect actual
 '
 
+test_expect_success 'merge-tree respects core.useReplaceRefs=false' '
+	test_commit merge-to &&
+	test_commit valid base &&
+	git reset --hard HEAD^ &&
+	test_commit malicious base &&
+
+	test_when_finished "git replace -d $(git rev-parse valid^0)" &&
+	git replace valid^0 malicious^0 &&
+
+	tree=$(git -c core.useReplaceRefs=true merge-tree --write-tree merge-to valid) &&
+	merged=$(git cat-file -p $tree:base) &&
+	test malicious = $merged &&
+
+	tree=$(git -c core.useReplaceRefs=false merge-tree --write-tree merge-to valid) &&
+	merged=$(git cat-file -p $tree:base) &&
+	test valid = $merged
+'
+
 test_done
-- 
2.40.0+fc1

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

On the Git mailing list, Felipe Contreras wrote (reply to this):

Derrick Stolee wrote:
> On 5/10/2023 4:30 PM, Felipe Contreras wrote:
> > Derrick Stolee via GitGitGadget wrote:
> >> From: Derrick Stolee <derrickstolee@github.com>
> >>
> >> The 'git merge-tree' command handles creating root trees for merges
> >> without using the worktree. This is a critical operation in many Git
> >> hosts, as they typically store bare repositories.
> >>
> >> This builtin does not load the default Git config, which can have
> >> several important ramifications.
> > 
> > For the record, I had already sent a better version of this patch almost 2
> > years ago [1], not just for `git merge-tree`, but other commands as well.
> > 
> > The obvious fix was completely ignored by the maintainer.
> > 
> > The reason why it should be git_xmerge_config and not git_default_config, is
> > that merge.conflictstyle would not be parsed if you call git_default_config.
> 
> As mentioned by Elijah in a different thread, the merge machinery loads
> the merge config as needed. I confirmed by creating this test, which I
> may submit as an independent patch:

I wrote my patches before Elijah wrote the real merge implementation, and in
his function he does `init_merge_options()`, which eventually calls
`git_config(git_xmerge_config, NULL)`.

But if `git_config()` is already called, you shouldn't need to add yet another
`git_config()` call.

The problem is that he added `init_merge_options()` *after* the
`get_merge_parent()` calls, that's why the configuration is ignored.

If we move `init_merge_options()` to the right place, the problem is fixed:

--- a/builtin/merge-tree.c
+++ b/builtin/merge-tree.c
@@ -424,6 +424,8 @@ static int real_merge(struct merge_tree_options *o,
 	struct merge_result result = { 0 };
 	int show_messages = o->show_messages;
 
+	init_merge_options(&opt, the_repository);
+
 	parent1 = get_merge_parent(branch1);
 	if (!parent1)
 		help_unknown_ref(branch1, "merge-tree",
@@ -434,8 +436,6 @@ static int real_merge(struct merge_tree_options *o,
 		help_unknown_ref(branch2, "merge-tree",
 				 _("not something we can merge"));
 
-	init_merge_options(&opt, the_repository);
-
 	opt.show_rename_progress = 0;
 
 	opt.branch1 = branch1;

I ran your test case, and it passes.

I sent a patch for that here:

https://lore.kernel.org/git/20230511215608.1297686-1-felipe.contreras@gmail.com/

This is a more proper fix because a) it doesn't add any new line of code, b) it
doesn't add a new include, and c) it doesn't call `git_config()` twice.

Cheers.

-- 
Felipe Contreras

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

This branch is now known as ds/merge-tree-use-config.

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

This patch series was integrated into seen via git@fcc91b1.

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2023

This patch series was integrated into next via git@e0dab53.

@gitgitgadget
Copy link

gitgitgadget bot commented May 12, 2023

There was a status update in the "New Topics" section about the branch ds/merge-tree-use-config on the Git mailing list:

Allow git forges to disable replace-refs feature while running "git
merge-tree".

Will merge to 'master'.
source: <pull.1530.git.1683745654800.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented May 12, 2023

This patch series was integrated into seen via git@1a08cb6.

@gitgitgadget
Copy link

gitgitgadget bot commented May 15, 2023

This patch series was integrated into seen via git@80754c5.

@gitgitgadget
Copy link

gitgitgadget bot commented May 15, 2023

This patch series was integrated into next via git@80754c5.

@gitgitgadget
Copy link

gitgitgadget bot commented May 16, 2023

This patch series was integrated into master via git@80754c5.

@gitgitgadget gitgitgadget bot added the master label May 16, 2023
@gitgitgadget gitgitgadget bot closed this May 16, 2023
@gitgitgadget
Copy link

gitgitgadget bot commented May 16, 2023

Closed via 80754c5.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant