Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clone: ignore invalid local refs in remote #1214

Closed
wants to merge 1 commit into from

Conversation

derrickstolee
Copy link

@derrickstolee derrickstolee commented Apr 13, 2022

Update in v2:

  • Subject line is improved to better match the message and fix.
  • I did not explore Ævar's idea to optimistically continue the clone as much as possible. That seems like a bigger change that could be investigated independently.

cc: gitster@pobox.com
cc: me@ttaylorr.com
cc: Ævar Arnfjörð Bjarmason avarab@gmail.com

@derrickstolee derrickstolee self-assigned this Apr 13, 2022
@derrickstolee
Copy link
Author

Linking to the original report: git-for-windows#3781

@derrickstolee
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 18, 2022

Submitted as pull.1214.git.1650301959803.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1214/derrickstolee/refs-bug-v1

To fetch this version to local tag pr-1214/derrickstolee/refs-bug-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1214/derrickstolee/refs-bug-v1

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 18, 2022

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):


On Mon, Apr 18 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
>
> When cloning directly from a local repository, we load a list of refs
> based on scanning the $GIT_DIR/refs/ directory of the "server"
> repository. If files exist in that directory that do not parse as
> hexadecimal hashes, then the ref array used by write_remote_refs()
> ends up with some entries with null OIDs. This causes us to hit a BUG()
> statement in ref_transaction_create():
>
>   BUG: create called without valid new_oid
>
> This BUG() call used to be a die() until 033abf97f (Replace all
> die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
> was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
> valid, 2015-02-17).
>
> The original report for this bug [1] mentioned that this problem did not
> exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
> (refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
> GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
> able to successfully compile and test the Git codebase.
>
> [1] https://github.com/git-for-windows/git/issues/3781
>
> There are two approaches to consider here. One would be to remove this
> BUG() statement in favor of returning with an error. There are only two
> callers to ref_transaction_create(), so this would have a limited
> impact.
>
> The other approach would be to add special casing in 'git clone' to
> avoid this faulty input to the method.
>
> While I originally started with changing 'git clone', I decided that
> modifying ref_transaction_create() was a more complete solution. This
> prevents failing with a BUG() statement when we already have a good way
> to report an error (including a reason for that error) within the
> method. Both callers properly check the return value and die() with the
> error message, so this is an appropriate direction.
>
> The added test helps check against a regression, but does check that our
> intended error message is handled correctly.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>     clone: ignore invalid local refs in remote
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1214%2Fderrickstolee%2Frefs-bug-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1214/derrickstolee/refs-bug-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1214
>
>  refs.c                 | 6 ++++--
>  t/t5605-clone-local.sh | 9 +++++++++
>  2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index 1a964505f92..f300f83e4d4 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1111,8 +1111,10 @@ int ref_transaction_create(struct ref_transaction *transaction,
>  			   unsigned int flags, const char *msg,
>  			   struct strbuf *err)
>  {
> -	if (!new_oid || is_null_oid(new_oid))
> -		BUG("create called without valid new_oid");
> +	if (!new_oid || is_null_oid(new_oid)) {
> +		strbuf_addf(err, "'%s' has a null OID", refname);
> +		return 1;
> +	}
>  	return ref_transaction_update(transaction, refname, new_oid,
>  				      null_oid(), flags, msg, err);
>  }
> diff --git a/t/t5605-clone-local.sh b/t/t5605-clone-local.sh
> index 7d63365f93a..21ab6192839 100755
> --- a/t/t5605-clone-local.sh
> +++ b/t/t5605-clone-local.sh
> @@ -141,4 +141,13 @@ test_expect_success 'cloning locally respects "-u" for fetching refs' '
>  	test_must_fail git clone --bare -u false a should_not_work.git
>  '
>  
> +test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
> +	git init corrupt &&
> +	test_commit -C corrupt one &&
> +	echo a >corrupt/.git/refs/heads/topic &&
> +
> +	test_must_fail git clone corrupt working 2>err &&
> +	grep "has a null OID" err
> +'
> +
>  test_done
>
> base-commit: 11cfe552610386954886543f5de87dcc49ad5735

If I change this test to clone from "$PWD/corrupt" instead we get as far
as dying in "not our ref" in upload-pack", after the "client" said "want
0000....".

I don't see anything wrong with this narrow fix, but I wonder how close
we are to just doing something useful here for the user. I.e. if we
can't parse_object() what we're about to "copy" to e.g. error() on that,
but carry on cloning the rest if possible.

We should probably always return a non-zero at the end, suggesting the
user omit the bad ref with a refspec. But if we did all that we could
usefully recover partially corrupt-repos in this state...

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 18, 2022

User Ævar Arnfjörð Bjarmason <avarab@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 20, 2022

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
> Subject: Re: [PATCH] clone: ignore invalid local refs in remote

After seeing the title, I expected that cloning from such a
repository with cruft in .git/refs/ directory would issue a warning
and succeed without these non-ref files.

But that is not what is happening here?

> +test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
> +	git init corrupt &&
> +	test_commit -C corrupt one &&
> +	echo a >corrupt/.git/refs/heads/topic &&
> +
> +	test_must_fail git clone corrupt working 2>err &&
> +	grep "has a null OID" err
> +'
> +

We keep expecting that clone _will_ fail.

So the net change is that we still do not tolerate a corrupt
repository and do not let corruption to propagate through cloning,
but we diagnose this breakage as an error by calling die(), which 
is appropriate for dealing with runtime data error, instead of
hitting a BUG(), which is reserved for program errors.

I agree with the fixed behaviour and implementation.  It just is
that "ignore" on the title seems misleading.  Other than that,
thanks for a good finding and a clean fix.

> -	if (!new_oid || is_null_oid(new_oid))
> -		BUG("create called without valid new_oid");
> +	if (!new_oid || is_null_oid(new_oid)) {
> +		strbuf_addf(err, "'%s' has a null OID", refname);
> +		return 1;
> +	}

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 21, 2022

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 4/20/2022 4:53 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <derrickstolee@github.com>
>> Subject: Re: [PATCH] clone: ignore invalid local refs in remote
> 
> After seeing the title, I expected that cloning from such a
> repository with cruft in .git/refs/ directory would issue a warning
> and succeed without these non-ref files.
> 
> But that is not what is happening here?

Sorry, this title of the commit message is stale from a version
where I started making the clones succeed (but without these
bad refs). I changed my mind to only switch BUG() to die() to
avoid giving the impression that we have a "matching" repo after
the clone.
 
>> +test_expect_success 'local clone from repo with corrupt refs fails gracefully' '
>> +	git init corrupt &&
>> +	test_commit -C corrupt one &&
>> +	echo a >corrupt/.git/refs/heads/topic &&
>> +
>> +	test_must_fail git clone corrupt working 2>err &&
>> +	grep "has a null OID" err
>> +'
>> +
> 
> We keep expecting that clone _will_ fail.
> 
> So the net change is that we still do not tolerate a corrupt
> repository and do not let corruption to propagate through cloning,
> but we diagnose this breakage as an error by calling die(), which 
> is appropriate for dealing with runtime data error, instead of
> hitting a BUG(), which is reserved for program errors.
> 
> I agree with the fixed behaviour and implementation.  It just is
> that "ignore" on the title seems misleading.  Other than that,
> thanks for a good finding and a clean fix.

Perhaps the commit could instead start with

	clone: die() instead of BUG() on bad refs

?

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 21, 2022

On the Git mailing list, Junio C Hamano wrote (reply to this):

Derrick Stolee <derrickstolee@github.com> writes:

> Sorry, this title of the commit message is stale from a version
> where I started making the clones succeed (but without these
> bad refs). I changed my mind to only switch BUG() to die() to
> avoid giving the impression that we have a "matching" repo after
> the clone.

Ah, that figures ;-)

> Perhaps the commit could instead start with
>
> 	clone: die() instead of BUG() on bad refs
>
> ?

Yeah, it would be the title I would have been happy with.



When cloning directly from a local repository, we load a list of refs
based on scanning the $GIT_DIR/refs/ directory of the "server"
repository. If files exist in that directory that do not parse as
hexadecimal hashes, then the ref array used by write_remote_refs()
ends up with some entries with null OIDs. This causes us to hit a BUG()
statement in ref_transaction_create():

  BUG: create called without valid new_oid

This BUG() call used to be a die() until 033abf9 (Replace all
die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
was added by f04c5b5 (ref_transaction_create(): check that new_sha1 is
valid, 2015-02-17).

The original report for this bug [1] mentioned that this problem did not
exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12f
(refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
able to successfully compile and test the Git codebase.

[1] git-for-windows#3781

There are two approaches to consider here. One would be to remove this
BUG() statement in favor of returning with an error. There are only two
callers to ref_transaction_create(), so this would have a limited
impact.

The other approach would be to add special casing in 'git clone' to
avoid this faulty input to the method.

While I originally started with changing 'git clone', I decided that
modifying ref_transaction_create() was a more complete solution. This
prevents failing with a BUG() statement when we already have a good way
to report an error (including a reason for that error) within the
method. Both callers properly check the return value and die() with the
error message, so this is an appropriate direction.

The added test helps check against a regression, but does check that our
intended error message is handled correctly.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
@derrickstolee
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 25, 2022

Submitted as pull.1214.v2.git.1650894450441.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1214/derrickstolee/refs-bug-v2

To fetch this version to local tag pr-1214/derrickstolee/refs-bug-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1214/derrickstolee/refs-bug-v2

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 25, 2022

This branch is now known as ds/do-not-call-bug-on-bad-refs.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 25, 2022

This patch series was integrated into seen via git@5cb6ecf.

@gitgitgadget gitgitgadget bot added the seen label Apr 25, 2022
@gitgitgadget
Copy link

gitgitgadget bot commented Apr 25, 2022

This patch series was integrated into seen via git@9b6309a.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 26, 2022

This patch series was integrated into seen via git@d4a6080.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 26, 2022

There was a status update in the "New Topics" section about the branch ds/do-not-call-bug-on-bad-refs on the Git mailing list:

Code clean-up.

Will merge to 'next'.
source: <pull.1214.v2.git.1650894450441.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 26, 2022

This patch series was integrated into seen via git@1c7bbea.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 26, 2022

This patch series was integrated into seen via git@0f5143d.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 27, 2022

This patch series was integrated into seen via git@945d404.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 27, 2022

This patch series was integrated into seen via git@d415a42.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 28, 2022

This patch series was integrated into seen via git@65ffd11.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 28, 2022

This patch series was integrated into seen via git@c848471.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 28, 2022

There was a status update in the "Cooking" section about the branch ds/do-not-call-bug-on-bad-refs on the Git mailing list:

Code clean-up.

Will merge to 'next'.
source: <pull.1214.v2.git.1650894450441.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 29, 2022

This patch series was integrated into seen via git@49bf962.

@gitgitgadget
Copy link

gitgitgadget bot commented May 2, 2022

This patch series was integrated into seen via git@18f3145.

@gitgitgadget
Copy link

gitgitgadget bot commented May 2, 2022

There was a status update in the "Cooking" section about the branch ds/do-not-call-bug-on-bad-refs on the Git mailing list:

Code clean-up.

Will merge to 'next'.
source: <pull.1214.v2.git.1650894450441.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented May 3, 2022

This patch series was integrated into seen via git@64359f5.

@gitgitgadget
Copy link

gitgitgadget bot commented May 4, 2022

This patch series was integrated into seen via git@4f429ec.

@gitgitgadget
Copy link

gitgitgadget bot commented May 5, 2022

There was a status update in the "Cooking" section about the branch ds/do-not-call-bug-on-bad-refs on the Git mailing list:

Code clean-up.

Will merge to 'next'.
source: <pull.1214.v2.git.1650894450441.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented May 6, 2022

This patch series was integrated into seen via git@892b473.

@gitgitgadget
Copy link

gitgitgadget bot commented May 10, 2022

This patch series was integrated into seen via git@8cdf52e.

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2022

There was a status update in the "Cooking" section about the branch ds/do-not-call-bug-on-bad-refs on the Git mailing list:

Code clean-up.

Will merge to 'next'.
source: <pull.1214.v2.git.1650894450441.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2022

This patch series was integrated into seen via git@4df5247.

@gitgitgadget
Copy link

gitgitgadget bot commented May 11, 2022

This patch series was integrated into seen via git@2feb3c2.

@gitgitgadget
Copy link

gitgitgadget bot commented May 12, 2022

This patch series was integrated into seen via git@3a47162.

@gitgitgadget
Copy link

gitgitgadget bot commented May 12, 2022

This patch series was integrated into next via git@af4d003.

@gitgitgadget gitgitgadget bot added the next label May 12, 2022
@gitgitgadget
Copy link

gitgitgadget bot commented May 13, 2022

There was a status update in the "Cooking" section about the branch ds/do-not-call-bug-on-bad-refs on the Git mailing list:

Code clean-up.

Will merge to 'master'.
source: <pull.1214.v2.git.1650894450441.gitgitgadget@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented May 13, 2022

This patch series was integrated into seen via git@93031d7.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant