Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge-tree: handle missing objects correctly #1651

Closed
wants to merge 6 commits into from

Conversation

dscho
Copy link
Member

@dscho dscho commented Feb 5, 2024

I recently looked into issues where git merge-tree calls returned bogus data (in one instance returning an empty tree for non-empty merge parents). By the time I had a look at the corresponding repository, the issue was no longer reproducible, but a closer look at the code combined with some manual experimenting turned up the fact that missing tree objects aren't handled as errors by git merge-tree.

While at it, I added a commit on top that tries to catch all remaining unchecked parse_tree() calls.

This patch series is based on 5f43cf5 (merge-tree: accept 3 trees as arguments, 2024-01-28) (the original tip commit of js/merge-tree-3-trees) because I introduced three unchecked parse_tree() calls in that topic.

Changes since v3:

  • Aligned the translated error messages with pre-existing ones (sorry, I forgot to make that change in v2!).
  • Added a new commit at the end to mark the one error message for translation which I had imitated, after making it consistent with the remaining "unable to read tree" error messages.

Changes since v2:

  • Fixed the new "missing tree object" test case in t4301 that succeeded for the wrong reason.
  • Adjusted the new "missing blob object" test case to avoid succeeding for the wrong reason.
  • Simplified the "missing blob object" test case.

Changes since v1:

  • Simplified the test case, avoiding a subshell and a pipe in the process.
  • Added a patch to remove a superfluous subtree->object.parsed guard around a parse_tree(subtree) call.

cc: Patrick Steinhardt ps@pks.im
cc: Eric Sunshine sunshine@sunshineco.com

@dscho dscho self-assigned this Feb 5, 2024
@dscho
Copy link
Member Author

dscho commented Feb 6, 2024

/submit

Copy link

gitgitgadget bot commented Feb 6, 2024

Submitted as pull.1651.git.1707212981.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1651/dscho/merge-tree-and-missing-objects-v1

To fetch this version to local tag pr-1651/dscho/merge-tree-and-missing-objects-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1651/dscho/merge-tree-and-missing-objects-v1

Copy link

gitgitgadget bot commented Feb 6, 2024

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> I recently looked into issues where git merge-tree calls returned bogus data
> (in one instance returning an empty tree for non-empty merge parents). By
> the time I had a look at the corresponding repository, the issue was no
> longer reproducible, but a closer look at the code combined with some manual
> experimenting turned up the fact that missing tree objects aren't handled as
> errors by git merge-tree.
>
> While at it, I added a commit on top that tries to catch all remaining
> unchecked parse_tree() calls.
>
> This patch series is based on js/merge-tree-3-trees because I introduced
> three unchecked parse_tree() calls in that topic branch.

Thanks.  All the added checks looked reasonable to me.

Will queue.

>
> Johannes Schindelin (4):
>   merge-tree: fail with a non-zero exit code on missing tree objects
>   merge-ort: do check `parse_tree()`'s return value
>   t4301: verify that merge-tree fails on missing blob objects
>   Always check `parse_tree*()`'s return value
>
>  builtin/checkout.c               | 19 ++++++++++++++++---
>  builtin/clone.c                  |  3 ++-
>  builtin/commit.c                 |  3 ++-
>  builtin/merge-tree.c             |  6 ++++++
>  builtin/read-tree.c              |  3 ++-
>  builtin/reset.c                  |  4 ++++
>  cache-tree.c                     |  4 ++--
>  merge-ort.c                      | 16 +++++++++++-----
>  merge-recursive.c                |  3 ++-
>  merge.c                          |  5 ++++-
>  reset.c                          |  5 +++++
>  sequencer.c                      |  4 ++++
>  t/t4301-merge-tree-write-tree.sh | 24 ++++++++++++++++++++++++
>  13 files changed, 84 insertions(+), 15 deletions(-)
>
>
> base-commit: 5f43cf5b2e4b68386d3774bce880b0f74d801635
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1651%2Fdscho%2Fmerge-tree-and-missing-objects-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1651/dscho/merge-tree-and-missing-objects-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1651

@@ -707,7 +707,8 @@ static int reset_tree(struct tree *tree, const struct checkout_opts *o,
init_checkout_metadata(&opts.meta, info->refname,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> @@ -707,7 +707,8 @@ static int reset_tree(struct tree *tree, const struct checkout_opts *o,
>  	init_checkout_metadata(&opts.meta, info->refname,
>  			       info->commit ? &info->commit->object.oid : null_oid(),
>  			       NULL);
> -	parse_tree(tree);
> +	if (parse_tree(tree) < 0)
> +		return 128;
>  	init_tree_desc(&tree_desc, tree->buffer, tree->size);

Another thing that caught my attention during today's integration
cycle was that this "parse the tree object in variable tree and then
call init_tree_desc on it" is a recurring pattern.  After the dust
settles, we might want to have a helper function perhaps like

    int init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree)
    {
	if (parse_tree(tree))
		return -1;
	init_tree_desc(desc, tree->buffer, tree->size);
    }

It was a bit irritating as another topic in flight changes the
function signature for init_tree_desc(), causing a textual conflict
that could be easily avoided.

Copy link

gitgitgadget bot commented Feb 7, 2024

This branch is now known as js/merge-tree-3-trees.

Copy link

gitgitgadget bot commented Feb 7, 2024

This patch series was integrated into seen via git@ffa365f.

@gitgitgadget gitgitgadget bot added the seen label Feb 7, 2024
Copy link

gitgitgadget bot commented Feb 7, 2024

There was a status update in the "Cooking" section about the branch js/merge-tree-3-trees on the Git mailing list:

"git merge-tree" has learned that the three trees involved in the
3-way merge only need to be trees, not necessarily commits.

Will cook in 'next'
source: <pull.1647.v2.git.1706474063109.gitgitgadget@gmail.com>
source: <pull.1651.git.1707212981.gitgitgadget@gmail.com>

@@ -1659,9 +1659,10 @@ static int collect_merge_info(struct merge_options *opt,
info.data = opt;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Feb 06, 2024 at 09:49:38AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> When `git merge-tree` encounters a missing tree object, it should error
> out and not continue quietly as if nothing had happened.
> 
> However, as of time of writing, `git merge-tree` _does_ continue, and
> then offers the empty tree as result.
> 
> Let's fix this.
> 
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  merge-ort.c                      |  7 ++++---
>  t/t4301-merge-tree-write-tree.sh | 10 ++++++++++
>  2 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/merge-ort.c b/merge-ort.c
> index 6491070d965..c37fc035f13 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -1659,9 +1659,10 @@ static int collect_merge_info(struct merge_options *opt,
>  	info.data = opt;
>  	info.show_all_errors = 1;
>  
> -	parse_tree(merge_base);
> -	parse_tree(side1);
> -	parse_tree(side2);
> +	if (parse_tree(merge_base) < 0 ||
> +	    parse_tree(side1) < 0 ||
> +	    parse_tree(side2) < 0)
> +		return -1;

I was wondering whether we also want to print an error in this case. But
`parse_tree()` calls `parse_tree_gently(tree, 0)`, where the second
parameter instructs the function to print an error message when the tree
is missing.

>  	init_tree_desc(t + 0, merge_base->buffer, merge_base->size);
>  	init_tree_desc(t + 1, side1->buffer, side1->size);
>  	init_tree_desc(t + 2, side2->buffer, side2->size);
> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> index 7d0fa74da74..4ea1b74445d 100755
> --- a/t/t4301-merge-tree-write-tree.sh
> +++ b/t/t4301-merge-tree-write-tree.sh
> @@ -950,5 +950,15 @@ test_expect_success '--merge-base with tree OIDs' '
>  	git merge-tree --merge-base=side1^^{tree} side1^{tree} side3^{tree} >with-trees &&
>  	test_cmp with-commits with-trees
>  '

Nit: missing newline.

> +test_expect_success 'error out on missing tree objects' '
> +	git init --bare missing-tree.git &&
> +	(
> +		git rev-list side3 &&
> +		git rev-parse side3^:
> +	) | git pack-objects missing-tree.git/objects/pack/side3-tree-is-missing &&
> +	side3=$(git rev-parse side3) &&
> +	test_must_fail git --git-dir=missing-tree.git merge-tree $side3^ $side3 >actual &&
> +	test_must_be_empty actual
> +'

Can't we avoid the subshell by using curly braces to pipe the outputs

Patrick

Copy link

gitgitgadget bot commented Feb 7, 2024

User Patrick Steinhardt <ps@pks.im> has been added to the cc: list.

@@ -707,7 +707,8 @@ static int reset_tree(struct tree *tree, const struct checkout_opts *o,
init_checkout_metadata(&opts.meta, info->refname,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Feb 06, 2024 at 09:49:41AM +0000, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
[snip]
> diff --git a/cache-tree.c b/cache-tree.c
> index 641427ed410..c6508b64a5c 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -779,8 +779,8 @@ static void prime_cache_tree_rec(struct repository *r,
>  			struct cache_tree_sub *sub;
>  			struct tree *subtree = lookup_tree(r, &entry.oid);
>  
> -			if (!subtree->object.parsed)
> -				parse_tree(subtree);
> +			if (!subtree->object.parsed && parse_tree(subtree) < 0)
> +				exit(128);

It's somewhat weird that we have the `subtree->object.parsed` check in
the first place, because the first thing that `parse_tree_gently()` does
is to check that flag and immediately return when the tree is parsed
already. It's a preexisting issue though, so this is fine.

Patrick

Copy link

gitgitgadget bot commented Feb 7, 2024

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Feb 06, 2024 at 09:49:37AM +0000, Johannes Schindelin via GitGitGadget wrote:
> I recently looked into issues where git merge-tree calls returned bogus data
> (in one instance returning an empty tree for non-empty merge parents). By
> the time I had a look at the corresponding repository, the issue was no
> longer reproducible, but a closer look at the code combined with some manual
> experimenting turned up the fact that missing tree objects aren't handled as
> errors by git merge-tree.
> 
> While at it, I added a commit on top that tries to catch all remaining
> unchecked parse_tree() calls.
> 
> This patch series is based on js/merge-tree-3-trees because I introduced
> three unchecked parse_tree() calls in that topic branch.

These patches all look good to me, thanks! I have two nits for the first
patch, but I don't really think that those are worth a reroll.

Patrick

@dscho dscho force-pushed the merge-tree-and-missing-objects branch from 93abd70 to ffd38ad Compare February 7, 2024 12:05
@dscho
Copy link
Member Author

dscho commented Feb 7, 2024

/submit

Copy link

gitgitgadget bot commented Feb 7, 2024

Submitted as pull.1651.v2.git.1707324461.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1651/dscho/merge-tree-and-missing-objects-v2

To fetch this version to local tag pr-1651/dscho/merge-tree-and-missing-objects-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1651/dscho/merge-tree-and-missing-objects-v2

@@ -779,8 +779,8 @@ static void prime_cache_tree_rec(struct repository *r,
struct cache_tree_sub *sub;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The first thing the `parse_tree()` function does is to return early if
> the tree has already been parsed. Therefore we do not need to guard the
> `parse_tree()` call behind a check of that flag.

Makes sense, and it doubly makes sense to keep this separate from
the change done to the same location in [4/5].

Will queue.

@@ -1659,9 +1659,10 @@ static int collect_merge_info(struct merge_options *opt,
info.data = opt;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Feb 7, 2024 at 11:48 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> When `git merge-tree` encounters a missing tree object, it should error
> out and not continue quietly as if nothing had happened.
>
> However, as of time of writing, `git merge-tree` _does_ continue, and
> then offers the empty tree as result.
>
> Let's fix this.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> @@ -951,4 +951,14 @@ test_expect_success '--merge-base with tree OIDs' '
> +test_expect_success 'error out on missing tree objects' '
> +       git init --bare missing-tree.git &&
> +       git rev-list side3 >list &&
> +       git rev-parse side3^: >list &&

Isn't the git-rev-parse invocation simply overwriting "list" rather
than appending to it? Did you mean:

    git rev-list side3 >list &&
    git rev-parse side3^: >>list &&

An alternative would be:

    {
        git rev-list side3 &&
        git rev-parse side3^:
    } >list &&

> +       git pack-objects missing-tree.git/objects/pack/side3-tree-is-missing <list &&
> +       side3=$(git rev-parse side3) &&
> +       test_must_fail git --git-dir=missing-tree.git merge-tree $side3^ $side3 >actual &&
> +       test_must_be_empty actual
> +'

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Eric,

On Wed, 7 Feb 2024, Eric Sunshine wrote:

> On Wed, Feb 7, 2024 at 11:48 AM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > When `git merge-tree` encounters a missing tree object, it should error
> > out and not continue quietly as if nothing had happened.
> >
> > However, as of time of writing, `git merge-tree` _does_ continue, and
> > then offers the empty tree as result.
> >
> > Let's fix this.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> > diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> > @@ -951,4 +951,14 @@ test_expect_success '--merge-base with tree OIDs' '
> > +test_expect_success 'error out on missing tree objects' '
> > +       git init --bare missing-tree.git &&
> > +       git rev-list side3 >list &&
> > +       git rev-parse side3^: >list &&
>
> Isn't the git-rev-parse invocation simply overwriting "list" rather
> than appending to it? Did you mean:
>
>     git rev-list side3 >list &&
>     git rev-parse side3^: >>list &&

Yes. And the fact that this test case still passed was for the _wrong_
reason! I adjusted the test case by writing the correct contents to `list`
and by verifying that the expected error message is shown.

>
> An alternative would be:
>
>     {
>         git rev-list side3 &&
>         git rev-parse side3^:
>     } >list &&

I find that uglier and longer, so I'll stay with the first option ;-)

Thank you for your review!
Johannes

>
> > +       git pack-objects missing-tree.git/objects/pack/side3-tree-is-missing <list &&
> > +       side3=$(git rev-parse side3) &&
> > +       test_must_fail git --git-dir=missing-tree.git merge-tree $side3^ $side3 >actual &&
> > +       test_must_be_empty actual
> > +'
>

Copy link

gitgitgadget bot commented Feb 7, 2024

User Eric Sunshine <sunshine@sunshineco.com> has been added to the cc: list.

git pack-objects missing-tree.git/objects/pack/side3-tree-is-missing <list &&
side3=$(git rev-parse side3) &&
test_must_fail git --git-dir=missing-tree.git merge-tree $side3^ $side3 >actual &&
test_must_be_empty actual
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Feb 7, 2024 at 11:48 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> We just fixed a problem where `merge-tree` would not fail on missing
> tree objects. Let's ensure that that problem does not occur with blob
> objects (and won't, in the future, either).
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> @@ -961,4 +961,18 @@ test_expect_success 'error out on missing tree objects' '
> +test_expect_success 'error out on missing blob objects' '
> +       seq1=$(test_seq 1 10 | git hash-object -w --stdin) &&
> +       seq2=$(test_seq 1 11 | git hash-object -w --stdin) &&
> +       seq3=$(test_seq 0 10 | git hash-object -w --stdin) &&

Is there significance to the ranges passed to test_seq()? Or, can the
same be achieved by using arbitrary content for each blob?

    blob1=$(echo "one" | git hash-object -w --stdin) &&
    blob2=$(echo "two" | git hash-object -w --stdin) &&
    blob3=$(echo "three" | git hash-object -w --stdin) &&

> +       tree1=$(printf "100644 blob %s\tsequence" $seq1 | git mktree) &&
> +       tree2=$(printf "100644 blob %s\tsequence" $seq2 | git mktree) &&
> +       tree3=$(printf "100644 blob %s\tsequence" $seq3 | git mktree) &&

I found the lack of terminating "\n" in the `printf` confusing,
especially since the variable names (seq1, seq2, etc.) and the use of
`printf` seem to imply, at first glance, that each git-mktree
invocation is receiving multiple lines as input which, it turns out,
is not the case. Adding the missing "\n" would help:

   tree1=$(printf "100644 blob %s\tsequence\n" $seq1 | git mktree) &&
   tree2=$(printf "100644 blob %s\tsequence\n" $seq2 | git mktree) &&
   tree3=$(printf "100644 blob %s\tsequence\n" $seq3 | git mktree) &&

Interpolating the $seqN variable directly into the string rather than
using %s would make it even clearer that only a single line is being
generated as input to git-mktree:

   tree1=$(printf "100644 blob $seq1\tsequence\n" | git mktree) &&
   tree2=$(printf "100644 blob $seq2\tsequence\n" | git mktree) &&
   tree3=$(printf "100644 blob $seq3\tsequence\n" | git mktree) &&

Alternatively `echo` could be used, though it's not necessarily any nicer:

    tree1=$(echo "100644 blob $seq1Qsequence" | q_to_tab | git mktree) &&
    tree2=$(echo "100644 blob $seq2Qsequence" | q_to_tab | git mktree) &&
    tree3=$(echo "100644 blob $seq3Qsequence" | q_to_tab | git mktree) &&

> +       git init --bare missing-blob.git &&
> +       test_write_lines $seq1 $seq3 $tree1 $tree2 $tree3 |
> +       git pack-objects missing-blob.git/objects/pack/side1-whatever-is-missing &&
> +       test_must_fail git --git-dir=missing-blob.git merge-tree --merge-base=$tree1 $tree2 $tree3 >actual &&
> +       test_must_be_empty actual
> +'

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Eric Sunshine <sunshine@sunshineco.com> writes:

>> diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
>> @@ -961,4 +961,18 @@ test_expect_success 'error out on missing tree objects' '
>> +test_expect_success 'error out on missing blob objects' '
>> +       seq1=$(test_seq 1 10 | git hash-object -w --stdin) &&
>> +       seq2=$(test_seq 1 11 | git hash-object -w --stdin) &&
>> +       seq3=$(test_seq 0 10 | git hash-object -w --stdin) &&
>
> Is there significance to the ranges passed to test_seq()? Or, can the
> same be achieved by using arbitrary content for each blob?
>
>     blob1=$(echo "one" | git hash-object -w --stdin) &&
>     blob2=$(echo "two" | git hash-object -w --stdin) &&
>     blob3=$(echo "three" | git hash-object -w --stdin) &&
>
>> +       tree1=$(printf "100644 blob %s\tsequence" $seq1 | git mktree) &&
>> +       tree2=$(printf "100644 blob %s\tsequence" $seq2 | git mktree) &&
>> +       tree3=$(printf "100644 blob %s\tsequence" $seq3 | git mktree) &&
>
> I found the lack of terminating "\n" in the `printf` confusing,
> especially since the variable names (seq1, seq2, etc.) and the use of
> `printf` seem to imply, at first glance, that each git-mktree
> invocation is receiving multiple lines as input which, it turns out,
> is not the case. Adding the missing "\n" would help:
>
>    tree1=$(printf "100644 blob %s\tsequence\n" $seq1 | git mktree) &&
>    tree2=$(printf "100644 blob %s\tsequence\n" $seq2 | git mktree) &&
>    tree3=$(printf "100644 blob %s\tsequence\n" $seq3 | git mktree) &&
>
> Interpolating the $seqN variable directly into the string rather than
> using %s would make it even clearer that only a single line is being
> generated as input to git-mktree:
>
>    tree1=$(printf "100644 blob $seq1\tsequence\n" | git mktree) &&
>    tree2=$(printf "100644 blob $seq2\tsequence\n" | git mktree) &&
>    tree3=$(printf "100644 blob $seq3\tsequence\n" | git mktree) &&

All good points.  Thanks for excellent reviews (not just this one
but another one in the series).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Feb 7, 2024 at 12:24 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> Interpolating the $seqN variable directly into the string rather than
> using %s would make it even clearer that only a single line is being
> generated as input to git-mktree:
>
>    tree1=$(printf "100644 blob $seq1\tsequence\n" | git mktree) &&
>    tree2=$(printf "100644 blob $seq2\tsequence\n" | git mktree) &&
>    tree3=$(printf "100644 blob $seq3\tsequence\n" | git mktree) &&
>
> Alternatively `echo` could be used, though it's not necessarily any nicer:
>
>     tree1=$(echo "100644 blob $seq1Qsequence" | q_to_tab | git mktree) &&
>     tree2=$(echo "100644 blob $seq2Qsequence" | q_to_tab | git mktree) &&
>     tree3=$(echo "100644 blob $seq3Qsequence" | q_to_tab | git mktree) &&

The `printf` example is probably cleaner, thus preferable. For
completeness, though, I should mention that the `echo` example is, of
course, not quite correct. For the interpolation to work as intended,
it would need ${...}:

    tree1=$(echo "100644 blob ${seq1}Qsequence" | q_to_tab | git mktree) &&
    tree2=$(echo "100644 blob ${seq2}Qsequence" | q_to_tab | git mktree) &&
    tree3=$(echo "100644 blob ${seq3}Qsequence" | q_to_tab | git mktree) &&

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Eric,

On Wed, 7 Feb 2024, Eric Sunshine wrote:

> On Wed, Feb 7, 2024 at 11:48 AM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > We just fixed a problem where `merge-tree` would not fail on missing
> > tree objects. Let's ensure that that problem does not occur with blob
> > objects (and won't, in the future, either).
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> > diff --git a/t/t4301-merge-tree-write-tree.sh b/t/t4301-merge-tree-write-tree.sh
> > @@ -961,4 +961,18 @@ test_expect_success 'error out on missing tree objects' '
> > +test_expect_success 'error out on missing blob objects' '
> > +       seq1=$(test_seq 1 10 | git hash-object -w --stdin) &&
> > +       seq2=$(test_seq 1 11 | git hash-object -w --stdin) &&
> > +       seq3=$(test_seq 0 10 | git hash-object -w --stdin) &&
>
> Is there significance to the ranges passed to test_seq()?

Originally, there was, since I wanted the merge to succeed without
conflicts.

However, I simplified the test case similar to what you suggested and
now specifically verify that the expected error message is shown (as
opposed to, say, an error message indicating a merge conflict).

> Or, can the same be achieved by using arbitrary content for each blob?
>
>     blob1=$(echo "one" | git hash-object -w --stdin) &&
>     blob2=$(echo "two" | git hash-object -w --stdin) &&
>     blob3=$(echo "three" | git hash-object -w --stdin) &&
>
> > +       tree1=$(printf "100644 blob %s\tsequence" $seq1 | git mktree) &&
> > +       tree2=$(printf "100644 blob %s\tsequence" $seq2 | git mktree) &&
> > +       tree3=$(printf "100644 blob %s\tsequence" $seq3 | git mktree) &&
>
> I found the lack of terminating "\n" in the `printf` confusing,
> especially since the variable names (seq1, seq2, etc.) and the use of
> `printf` seem to imply, at first glance, that each git-mktree
> invocation is receiving multiple lines as input which, it turns out,
> is not the case. Adding the missing "\n" would help:
>
>    tree1=$(printf "100644 blob %s\tsequence\n" $seq1 | git mktree) &&
>    tree2=$(printf "100644 blob %s\tsequence\n" $seq2 | git mktree) &&
>    tree3=$(printf "100644 blob %s\tsequence\n" $seq3 | git mktree) &&

I added the `\n` and now avoid the `%s`.

Thank you for your review!
Johannes

>
> Interpolating the $seqN variable directly into the string rather than
> using %s would make it even clearer that only a single line is being
> generated as input to git-mktree:
>
>    tree1=$(printf "100644 blob $seq1\tsequence\n" | git mktree) &&
>    tree2=$(printf "100644 blob $seq2\tsequence\n" | git mktree) &&
>    tree3=$(printf "100644 blob $seq3\tsequence\n" | git mktree) &&
>
> Alternatively `echo` could be used, though it's not necessarily any nicer:
>
>     tree1=$(echo "100644 blob $seq1Qsequence" | q_to_tab | git mktree) &&
>     tree2=$(echo "100644 blob $seq2Qsequence" | q_to_tab | git mktree) &&
>     tree3=$(echo "100644 blob $seq3Qsequence" | q_to_tab | git mktree) &&
>
> > +       git init --bare missing-blob.git &&
> > +       test_write_lines $seq1 $seq3 $tree1 $tree2 $tree3 |
> > +       git pack-objects missing-blob.git/objects/pack/side1-whatever-is-missing &&
> > +       test_must_fail git --git-dir=missing-blob.git merge-tree --merge-base=$tree1 $tree2 $tree3 >actual &&
> > +       test_must_be_empty actual
> > +'
>

@@ -707,7 +707,8 @@ static int reset_tree(struct tree *tree, const struct checkout_opts *o,
init_checkout_metadata(&opts.meta, info->refname,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Feb 7, 2024 at 11:48 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> Always check `parse_tree*()`'s return value

If you happen to reroll for some reason, perhaps: s/Always/always/

> Otherwise we may easily run into serious crashes: For example, if we run
> `init_tree_desc()` directly after a failed `parse_tree()`, we are
> accessing uninitialized data or trying to dereference `NULL`.
>
> Note that the `parse_tree()` function already takes care of showing an
> error message. The `parse_tree_indirectly()` and
> `repo_get_commit_tree()` functions do not, therefore those latter call
> sites need to show a useful error message while the former do not.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Eric,

On Wed, 7 Feb 2024, Eric Sunshine wrote:

> On Wed, Feb 7, 2024 at 11:48 AM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > Always check `parse_tree*()`'s return value
>
> If you happen to reroll for some reason, perhaps: s/Always/always/

Unless I am missing something we only ask the part of a oneline after an
initial "<something>:" to be downcased.

At least all the "Merge branch [...]" commits are still capitalized and
nobody complains ;-)

Ciao,
Johannes

>
> > Otherwise we may easily run into serious crashes: For example, if we run
> > `init_tree_desc()` directly after a failed `parse_tree()`, we are
> > accessing uninitialized data or trying to dereference `NULL`.
> >
> > Note that the `parse_tree()` function already takes care of showing an
> > error message. The `parse_tree_indirectly()` and
> > `repo_get_commit_tree()` functions do not, therefore those latter call
> > sites need to show a useful error message while the former do not.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> If you happen to reroll for some reason, perhaps: s/Always/always/
>
> Unless I am missing something we only ask the part of a oneline after an
> initial "<something>:" to be downcased.
>
> At least all the "Merge branch [...]" commits are still capitalized and
> nobody complains ;-)

Of course.  Merge commits and single-parent commits are very
different.

I suspect that the lack of "<area>:" was what triggered the comment,
but this being more about the callers all over the place in the
code, it may be hard to say what "area" this belongs to.  If I were
pressed to choose:

    Subject: parse_tree*(): make callers always check return value

would probably be what I would choose, but I usually opt for going
the lazy way of not labelling the area at all ;-).

Copy link

gitgitgadget bot commented Feb 7, 2024

This patch series was integrated into seen via git@29194ba.

Copy link

gitgitgadget bot commented Feb 7, 2024

This patch series was integrated into seen via git@0f2e55b.

Copy link

gitgitgadget bot commented Feb 8, 2024

This patch series was integrated into seen via git@0e7515c.

Copy link

gitgitgadget bot commented Feb 8, 2024

This patch series was integrated into seen via git@601c324.

Copy link

gitgitgadget bot commented Feb 8, 2024

This patch series was integrated into seen via git@7d51954.

Copy link

gitgitgadget bot commented Feb 9, 2024

This patch series was integrated into seen via git@d7875c0.

Copy link

gitgitgadget bot commented Feb 9, 2024

There was a status update in the "Cooking" section about the branch js/merge-tree-3-trees on the Git mailing list:

"git merge-tree" has learned that the three trees involved in the
3-way merge only need to be trees, not necessarily commits.

Expecting a reroll.
cf. <CAPig+cSs8MFkLasTULh7tybrFm7SwaT+JeR7HnXjh+-agCHYMw@mail.gmail.com>
cf. <CAPig+cSJz3U+vT==NhX5hcrTjsCggnAzhzQOvZcSXbcEGuYaKQ@mail.gmail.com>
source: <pull.1647.v2.git.1706474063109.gitgitgadget@gmail.com>
source: <pull.1651.v2.git.1707324461.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented Feb 10, 2024

This patch series was integrated into seen via git@3ee2012.

Copy link

gitgitgadget bot commented Feb 12, 2024

This patch series was integrated into seen via git@b22149f.

Copy link

gitgitgadget bot commented Feb 23, 2024

There was a status update in the "Cooking" section about the branch js/merge-tree-3-trees on the Git mailing list:

"git merge-tree" has learned that the three trees involved in the
3-way merge only need to be trees, not necessarily commits.

Comments?
source: <pull.1647.git.1706277694231.gitgitgadget@gmail.com>
source: <pull.1651.v3.git.1708612605.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented Feb 23, 2024

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> I recently looked into issues where git merge-tree calls returned bogus data
> (in one instance returning an empty tree for non-empty merge parents). By
> the time I had a look at the corresponding repository, the issue was no
> longer reproducible, but a closer look at the code combined with some manual
> experimenting turned up the fact that missing tree objects aren't handled as
> errors by git merge-tree.

Looking good.  Will replace.  Thanks.

Copy link

gitgitgadget bot commented Feb 23, 2024

This patch series was integrated into seen via git@80f2f2a.

Copy link

gitgitgadget bot commented Feb 24, 2024

This patch series was integrated into seen via git@fffe9e7.

Copy link

gitgitgadget bot commented Feb 27, 2024

This patch series was integrated into seen via git@aa77cae.

Copy link

gitgitgadget bot commented Feb 27, 2024

This patch series was integrated into seen via git@2bf8cc8.

Copy link

gitgitgadget bot commented Feb 28, 2024

This patch series was integrated into seen via git@88d3ea2.

Copy link

gitgitgadget bot commented Feb 28, 2024

There was a status update in the "Cooking" section about the branch js/merge-tree-3-trees on the Git mailing list:

"git merge-tree" has learned that the three trees involved in the
3-way merge only need to be trees, not necessarily commits.

Will merge to 'next'?
source: <pull.1647.git.1706277694231.gitgitgadget@gmail.com>
source: <pull.1651.v4.git.1708677266.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented Feb 28, 2024

This patch series was integrated into seen via git@d423e51.

Copy link

gitgitgadget bot commented Feb 29, 2024

This patch series was integrated into seen via git@c507fb3.

Copy link

gitgitgadget bot commented Feb 29, 2024

This patch series was integrated into next via git@d614f80.

@gitgitgadget gitgitgadget bot added the next label Feb 29, 2024
Copy link

gitgitgadget bot commented Feb 29, 2024

This patch series was integrated into seen via git@d3caece.

Copy link

gitgitgadget bot commented Mar 1, 2024

This patch series was integrated into seen via git@59712e4.

Copy link

gitgitgadget bot commented Mar 2, 2024

This patch series was integrated into seen via git@152fa33.

Copy link

gitgitgadget bot commented Mar 2, 2024

This patch series was integrated into next via git@a75dc95.

Copy link

gitgitgadget bot commented Mar 4, 2024

This patch series was integrated into seen via git@4bea430.

Copy link

gitgitgadget bot commented Mar 4, 2024

There was a status update in the "Cooking" section about the branch js/merge-tree-3-trees on the Git mailing list:

Originally merged to 'next' on 2024-02-28

"git merge-tree" has learned that the three trees involved in the
3-way merge only need to be trees, not necessarily commits.

Will merge to 'master'.
source: <pull.1647.git.1706277694231.gitgitgadget@gmail.com>
source: <pull.1651.v4.git.1708677266.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented Mar 5, 2024

This patch series was integrated into seen via git@47b75e8.

Copy link

gitgitgadget bot commented Mar 6, 2024

This patch series was integrated into seen via git@873db49.

Copy link

gitgitgadget bot commented Mar 7, 2024

This patch series was integrated into seen via git@e857d0d.

Copy link

gitgitgadget bot commented Mar 7, 2024

This patch series was integrated into seen via git@3b182f4.

Copy link

gitgitgadget bot commented Mar 8, 2024

This patch series was integrated into seen via git@ae46d5f.

Copy link

gitgitgadget bot commented Mar 8, 2024

This patch series was integrated into master via git@ae46d5f.

Copy link

gitgitgadget bot commented Mar 8, 2024

This patch series was integrated into next via git@ae46d5f.

@gitgitgadget gitgitgadget bot added the master label Mar 8, 2024
@gitgitgadget gitgitgadget bot closed this Mar 8, 2024
Copy link

gitgitgadget bot commented Mar 8, 2024

Closed via ae46d5f.

@dscho dscho deleted the merge-tree-and-missing-objects branch March 8, 2024 06:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant