Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make git reflog expire --stale-fix a lot more useful #873

Closed

Conversation

dscho
Copy link
Member

@dscho dscho commented Feb 10, 2021

Yesterday, I tried to run a quick test to find out whether master's version of git repack prevents .bitmap files from being deleted by still having them mmap()ed. Since I do not have a build of master lying around just like that, I checked it out, built the thing, and then ran

./bin-wrappers/git -c alias.c='!(cd /path/to/directory && ./test-whether-bitmaps-are-mmaped-during-repack.sh) c

Do NOT try this at home! The problem with this invocation is that the alias will still have GIT_DIR set, therefore the git init in that script will not create a new Git directory, and the git repack -ad in that script will remove all kinds of precious objects from the Git checkout. Even though I interrupted the run as quickly as I realized that things were going wrong, my repository was corrupted in a major way, and it took me many hours to get back to a healthy state.

It made matters worse that git reflog expire --stale-fix was less helpful than it could have been, and this patch is the result of my directed emotional energy.

cc: Jeff King peff@peff.net

Whenever a user runs `git reflog expire --stale-fix`, the most likely
reason is that their repository is at least _somewhat_ corrupt. Which
means that it is more than just possible that some objects are missing.

If that is the case, that can currently let the command abort through
the phase where it tries to mark all reachable objects.

Instead of adding insult to injury, let's be gentle and continue as best
as we can in such a scenario, simply by ignoring the missing objects and
moving on.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho dscho force-pushed the reflog-expire-with-missing-objects branch from e598d72 to c0d5f65 Compare February 10, 2021 15:38
@dscho
Copy link
Member Author

dscho commented Feb 10, 2021

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 10, 2021

Submitted as pull.873.git.1612973499110.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-873/dscho/reflog-expire-with-missing-objects-v1

To fetch this version to local tag pr-873/dscho/reflog-expire-with-missing-objects-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-873/dscho/reflog-expire-with-missing-objects-v1

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 10, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Whenever a user runs `git reflog expire --stale-fix`, the most likely
> reason is that their repository is at least _somewhat_ corrupt. Which
> means that it is more than just possible that some objects are missing.
>
> If that is the case, that can currently let the command abort through
> the phase where it tries to mark all reachable objects.
>
> Instead of adding insult to injury, let's be gentle and continue as best
> as we can in such a scenario, simply by ignoring the missing objects and
> moving on.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---

I am of two minds.

I appreciate an effort of making it looser and less likely to get in
trouble when running in a corrupt repository, but --stale-fix is a
bit special and should probably be removed.

The only reason the option was added was because we needed to have
an easy way to recover from a specific kind of reflog corruption
that would have resulted by a (then known) bug in "git reflog" in
the released version of Git that came immediately before the version
of Git that added the "fix" option, while the root cause of the
corruption got fixed.  

Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was
written, it was very useful to have a way to recover from the
corruption likely to have happened with the version of Git that came
before it.  But it no longer is relevant after this many years.
There may be other ways to break the reflog entries, but --stale-fix
was never designed to deal with anything but a specific way the
reflog gets corrupted (namely, by the old version of Git that
corrupted reflog in a specific way), so keeping it would not be very
useful.


@gitgitgadget
Copy link

gitgitgadget bot commented Feb 11, 2021

On the Git mailing list, Jeff King wrote (reply to this):

On Wed, Feb 10, 2021 at 12:30:27PM -0800, Junio C Hamano wrote:

> I appreciate an effort of making it looser and less likely to get in
> trouble when running in a corrupt repository, but --stale-fix is a
> bit special and should probably be removed.
> 
> The only reason the option was added was because we needed to have
> an easy way to recover from a specific kind of reflog corruption
> that would have resulted by a (then known) bug in "git reflog" in
> the released version of Git that came immediately before the version
> of Git that added the "fix" option, while the root cause of the
> corruption got fixed.
> 
> Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was
> written, it was very useful to have a way to recover from the
> corruption likely to have happened with the version of Git that came
> before it.  But it no longer is relevant after this many years.
> There may be other ways to break the reflog entries, but --stale-fix
> was never designed to deal with anything but a specific way the
> reflog gets corrupted (namely, by the old version of Git that
> corrupted reflog in a specific way), so keeping it would not be very
> useful.

FWIW, I have used --stale-fix for cases where a repo's reflog was
"corrupted" by its alternate pruning objects.

E.g., if you do something like this:

  git clone -s orig.git new.git

you're now at risk of "git gc" in orig.git corrupting new.git, because
its reachability check doesn't know anything about those refs. You can
mitigate that by keeping a copy of new.git's refs in orig.git. Something
like:

  git -C orig.git fetch ../new.git refs/*:refs/preserve/*
  git -C orig.git gc

But that doesn't know about reflogs at all! It will still corrupt them
(but only sometimes; depending how often you do that fetch, you might
catch the same values in orig.git's reflog).

Possibly the correct answer here is "turn off reflogs in new.git", but
they are sometimes useful, and things _mostly_ work (for history that
doesn't rewind, or where the rewound bits are specific to new.git). So
it's useful to be able to run something like "reflog expire --stale-fix"
to clear out the occasional problem.

(A careful reader will note that objects mentioned only in the index
have a similar problem; but again, those tend to be local to new.git,
and don't exist at all in a server context).

-Peff

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 11, 2021

User Jeff King <peff@peff.net> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 11, 2021

This branch is now known as js/reflog-expire-stale-fix.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 11, 2021

This patch series was integrated into seen via git@8d144f4.

@gitgitgadget gitgitgadget bot added the seen label Feb 11, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 12, 2021

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi,

On Thu, 11 Feb 2021, Jeff King wrote:

> On Wed, Feb 10, 2021 at 12:30:27PM -0800, Junio C Hamano wrote:
>
> > I appreciate an effort of making it looser and less likely to get in
> > trouble when running in a corrupt repository, but --stale-fix is a
> > bit special and should probably be removed.
> >
> > The only reason the option was added was because we needed to have
> > an easy way to recover from a specific kind of reflog corruption
> > that would have resulted by a (then known) bug in "git reflog" in
> > the released version of Git that came immediately before the version
> > of Git that added the "fix" option, while the root cause of the
> > corruption got fixed.
> >
> > Back when 1389d9dd (reflog expire --fix-stale, 2007-01-06) was
> > written, it was very useful to have a way to recover from the
> > corruption likely to have happened with the version of Git that came
> > before it.  But it no longer is relevant after this many years.
> > There may be other ways to break the reflog entries, but --stale-fix
> > was never designed to deal with anything but a specific way the
> > reflog gets corrupted (namely, by the old version of Git that
> > corrupted reflog in a specific way), so keeping it would not be very
> > useful.

Thank you for the original historical context.

> FWIW, I have used --stale-fix for cases where a repo's reflog was
> "corrupted" by its alternate pruning objects.
>
> E.g., if you do something like this:
>
>   git clone -s orig.git new.git
>
> you're now at risk of "git gc" in orig.git corrupting new.git, because
> its reachability check doesn't know anything about those refs. You can
> mitigate that by keeping a copy of new.git's refs in orig.git. Something
> like:
>
>   git -C orig.git fetch ../new.git refs/*:refs/preserve/*
>   git -C orig.git gc
>
> But that doesn't know about reflogs at all! It will still corrupt them
> (but only sometimes; depending how often you do that fetch, you might
> catch the same values in orig.git's reflog).
>
> Possibly the correct answer here is "turn off reflogs in new.git", but
> they are sometimes useful, and things _mostly_ work (for history that
> doesn't rewind, or where the rewound bits are specific to new.git). So
> it's useful to be able to run something like "reflog expire --stale-fix"
> to clear out the occasional problem.
>
> (A careful reader will note that objects mentioned only in the index
> have a similar problem; but again, those tend to be local to new.git,
> and don't exist at all in a server context).

I want to add the experience with that half year during which `git gc`
with worktrees was broken, as it would happily ignore the reflogs of the
worktree-specific `HEAD`s, all except the one from the worktree from which
`git gc` was run. That was some fun time, and `--stale-fix` was a life
saver.

With this _additional_ historical context, I would deem it wise to keep
`--stale-fix` around.

Ciao,
Dscho

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 12, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> Possibly the correct answer here is "turn off reflogs in new.git", but
>> they are sometimes useful, and things _mostly_ work (for history that
>> doesn't rewind, or where the rewound bits are specific to new.git). So
>> it's useful to be able to run something like "reflog expire --stale-fix"
>> to clear out the occasional problem.
>>
>> (A careful reader will note that objects mentioned only in the index
>> have a similar problem; but again, those tend to be local to new.git,
>> and don't exist at all in a server context).
>
> I want to add the experience with that half year during which `git gc`
> with worktrees was broken, as it would happily ignore the reflogs of the
> worktree-specific `HEAD`s, all except the one from the worktree from which
> `git gc` was run. That was some fun time, and `--stale-fix` was a life
> saver.

The option was invented for a specific case, but if its "fix"
applies for breakages caused by more recent bugs and user induced
actions, I would agree that that it gives us a good reason to keep
it around.

I have to wonder if the explanation in the documentation for the
option needs to be made less specific to the originally motivating
case, though.

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 12, 2021

This patch series was integrated into seen via git@857a021.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 12, 2021

This patch series was integrated into next via git@899034e.

@gitgitgadget gitgitgadget bot added the next label Feb 12, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2021

This patch series was integrated into seen via git@e68f62b.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2021

This patch series was integrated into next via git@e68f62b.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2021

This patch series was integrated into master via git@e68f62b.

@gitgitgadget gitgitgadget bot added the master label Feb 19, 2021
@gitgitgadget gitgitgadget bot closed this Feb 19, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 19, 2021

Closed via e68f62b.

@dscho dscho deleted the reflog-expire-with-missing-objects branch February 19, 2021 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant