Hard reset safe mode #1137

Open
wants to merge 1 commit into
base: master
from

Conversation

slamko

@slamko slamko commented Jan 30, 2022

Preventing unexpected code-deletion with hard reset 'safe mode'.

Considering the recommendations for patch v1, and in order to preserve the current robustness of hard reset, I have made the following modifications to the original version (which completely disallowed a hard reset on an unborn branch with staged files):

Changes since v1:

  • The described safety measures are not enabled by default. Safe mode can be turned on with the 'reset.safe' config variable.
  • Replaced the resulting error with a Yes/No choice, so a hard reset on an unborn branch is no longer impossible.
  • Detection of staged changes that would be discarded by the reset is no longer limited to the unborn-branch state. Git will also warn you and ask for confirmation when there are commits on the branch.
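
The situation this safe mode targets can be reproduced in a throwaway repository. The sketch below uses only stock Git commands; the function name `demo_unborn_reset` and the file name `notes.txt` are illustrative assumptions and not part of the patch.

```shell
# Illustrative helper (not part of the patch): shows that 'git reset --hard'
# on an unborn branch removes a file that was staged but never committed.
demo_unborn_reset() {
	(
		demo_dir=$(mktemp -d) || exit 1
		cd "$demo_dir" || exit 1
		git init -q . || exit 1
		echo 'hello' >notes.txt
		git add notes.txt       # staged, but there is no commit yet
		git reset -q --hard     # resets index and working tree to the empty tree
		if [ -e notes.txt ]; then echo 'kept'; else echo 'deleted'; fi
		cd / && rm -rf "$demo_dir"
	)
}
```

With the setting proposed here, `git config reset.safe true` would be expected to make the reset step ask for confirmation first. That variable exists only in this patch series, not in released Git.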

@gitgitgadget

gitgitgadget bot commented Jan 30, 2022

Welcome to GitGitGadget

Hi @Viaceslavus, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that your Pull Request has a good description, as it will be used as cover letter.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@gitgitgadget

gitgitgadget bot commented Jan 30, 2022

There are issues in commit 556ad82:
forbid 'reset --hard' before the initial commit
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.

@slamko slamko force-pushed the hard-reset-safety-on-empty-repo branch from 556ad82 to 8ede26b Compare January 30, 2022 19:14
@gitgitgadget

gitgitgadget bot commented Jan 30, 2022

There are issues in commit 8ede26b:
forbid a hard reset before the initial commit
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.

@slamko slamko force-pushed the hard-reset-safety-on-empty-repo branch from 8ede26b to 33e52c7 Compare January 30, 2022 19:25
@dscho
Member

dscho commented Jan 31, 2022

/allow

@gitgitgadget

gitgitgadget bot commented Jan 31, 2022

User Viaceslavus is now allowed to use GitGitGadget.

WARNING: Viaceslavus has no public email address set on GitHub

@dscho
Member

dscho commented Jan 31, 2022

Lines in the body of the commit messages should be wrapped between 60 and 76 characters.

What this means is that the lines are too long:

Performing 'git reset --hard' on empty repo with staged files may have the only one possible result - deleting all staged files.
Such behaviour may be unexpected or even dangerous.
So now, when running 'git reset --hard', git will check for the existence of commits in the repo;
in case of absence of such, and also if there are files staged, git will return an error.

Also, there are a couple of space-before-tab whitespace problems, which you can fix by calling git rebase --whitespace=fix HEAD^ and then force-pushing.
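
Overlong lines can be caught locally before pushing. This is a minimal sketch, assuming a POSIX shell with `awk`; the helper name `long_lines` is hypothetical, and the default limit of 76 is taken from the bot message above.

```shell
# Hypothetical helper: print every input line longer than the given limit
# (default 76), prefixed with its line number.
long_lines() {
	awk -v max="${1:-76}" 'length($0) > max { printf "%d: %s\n", NR, $0 }'
}
```

For example, `git log -1 --format=%b | long_lines` lists the offending lines in the body of the most recent commit message, and prints nothing when every line fits.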

@dscho
Member

dscho commented Jan 31, 2022

The CI build is failing, and I have a hunch that this points out a real problem. For example, t0021 is failing thusly:

+ git reset --hard empty-branch
fatal: Hard reset isn`t allowed before the first commit.
error: last command exited with $?=128

This, along with the other test failures, suggests that the change provided by this PR cannot be applied as-is. You may want to introduce a new config setting to enable the new behavior, imitating e.g. 4c3abd0

To make this new config setting discoverable, you will want to add a new advice, imitating e.g. 649bf3a

The reason to be so careful is that we cannot simply break existing workflows of millions of users, just like that. And it is virtually guaranteed that existing workflows would be broken, by the mere fact that our very own test suite uses the existing paradigm that would be broken by this PR in its current form.

@slamko slamko force-pushed the hard-reset-safety-on-empty-repo branch 3 times, most recently from e1958fd to 13cc01b Compare February 2, 2022 10:54
@slamko
Author

slamko commented Feb 2, 2022

/submit

@gitgitgadget

gitgitgadget bot commented Feb 2, 2022

Submitted as pull.1137.git.1643802721612.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1137/Viaceslavus/hard-reset-safety-on-empty-repo-v1

To fetch this version to local tag pr-1137/Viaceslavus/hard-reset-safety-on-empty-repo-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1137/Viaceslavus/hard-reset-safety-on-empty-repo-v1

@gitgitgadget

gitgitgadget bot commented Feb 2, 2022

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Viaceslavus via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Viacelaus <vaceslavkozin619@gmail.com>
>
> Performing 'git reset --hard' on empty repo with staged files
> may have the only one possible result - deleting all staged files.

Sure.  It has the only one possible result, which is a sign that the
command is well designed to give a robust and predictable end user
experience.

I know you wanted to say "there is only one possible result, and
that result cannot be anything but bad. You Git folks are stupid to
design a command that only can have a bad result, so I'll fix that
stupidity for you".

But the thing is, not everybody agrees with your "deleting all files
that were added to the index when asked to 'reset --hard' is bad".  It is
the most obvious way to go back to the "pristine" state, and after
all, that is what "reset --hard" is about.

Many readers on the list are non-native speakers.  You must be
careful with your rhetoric, because it often will not be taken in
the way you meant it to be taken.  When you can say "doing
X does Y" and convey the core of what you want to say, do so,
instead of saying "doing X has only one possible result, which is
Y". You may lose the "you Git folks are stupid" part of the message,
but you're better off not to sound rude anyway ;-)

> Such behaviour may be unexpected or even dangerous. With this
> commit, when running 'git reset --hard', git will check for the
> existence of commits in the repo; in case of absence of such, and
> also if there are any files staged, git will die with an error.

This directly contradicts, and likely will regress the fix made
by, what 166ec2e9 (reset: allow reset on unborn branch, 2013-01-14)
wanted to do.  I do not think we want this change in its current
form.

When starting a new project on a hosting provider like GitHub these
days, you can have them create the initial commit that records the
copy of the license file, and the first thing you do on your local
machine after leaving the browser to create the repository over
there is to clone from it.  After that, you'd populate the working
tree with the rest of the project files, and record the result.  If
you say "reset --hard" before committing, you'll equally lose all
the newly added files, but because the history is not empty, the
approach taken by this patch would not work to protect you, I
suspect.  It almost always is a mistake to special case an empty
repository or an empty history.

Having said all that, I am sympathetic to the cause to make it
harder to discard a lot of work by mistake.  It is just that
disabling "reset --hard" only when it is trying to go back to an
empty tree is not an effective way to do so.  It is even less so
when you do not give any escape hatch in case the user knew what
they were doing and really meant to go back to the pristine state.

    Side note.  Yes, "git diff --cached | git apply -R --index" or
    "git rm --cached -r ." as a workaround, but when the user wanted
    to do "reset --hard", we should have a way to let them do so.

Off the top of my head, here are a couple of possible ways to
improve the design of this change (note: I am not saying that I'll
unconditionally take such a patch that implements any of these):

 * Detect if we are being interactive, and offer Yes/No choice to
   give an interactive user a chance to abort when we detect a
   "risky" situation.  Don't do anything if we are not interactive,
   and don't make it impossible to do things that we may (mis)detect
   as risky.

 * Instead of "we are going back to the state without any commit
   yet", use a better heuristic, such as "we'd lose a newly added
   path (i.e. the path exists in the index and in the working tree
   but does not exist in HEAD)" as a sign to flag the situation as
   possibly risky.  Or limit that further to protect only when we'd
   lose more than N-percent of the paths in the index that way.

But both are hard problems.

Many existing scripts do rely on "reset --hard" to be a robust and
predictable way to go back to the pristine state, and they will be
very upset if we misdetect and prompt the user who is not sitting in
front of the keyboard.
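
The two suggestions in this reply can be approximated outside of Git with a wrapper script. This is only a sketch under stated assumptions, not the patch's implementation: the function names `newly_added_paths` and `careful_hard_reset` are hypothetical, "interactive" is taken to mean "stdin is a terminal", and "newly added" is taken to mean paths that `git diff --cached` reports as added relative to HEAD.

```shell
# Hypothetical helper: list paths staged as new (in the index, absent from HEAD).
newly_added_paths() {
	git diff --cached --name-only --diff-filter=A
}

# Hypothetical wrapper around 'git reset --hard'. It prompts only when stdin
# is a terminal and newly added paths would be lost; non-interactive callers
# (scripts) are never blocked.
careful_hard_reset() {
	added=$(newly_added_paths)
	if [ -n "$added" ] && [ -t 0 ]; then
		printf 'These newly added paths would be lost:\n%s\nContinue? [y/N] ' "$added"
		read -r answer
		case $answer in
		[yY]*) ;;
		*) echo 'aborted' >&2; return 1 ;;
		esac
	fi
	git reset --hard "$@"
}
```

A script that calls `careful_hard_reset </dev/null` behaves exactly like plain `git reset --hard`, which addresses the concern about scripts. The heuristic itself still misfires whenever discarding newly added files is what the user intended.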

@slamko slamko force-pushed the hard-reset-safety-on-empty-repo branch 5 times, most recently from 83cbd53 to 450078e Compare February 11, 2022 17:41
The power of hard reset may frequently be inconvenient for a common user. So
this is an implementation of safe mode for hard reset. It can be switched on
by setting 'reset.safe' config variable to true. When running 'reset --hard'
with 'reset.safe' enabled git will check if there are any staged changes
that may be discarded by this reset. If there is a chance of deleting the
changes, git will ask the user for a confirmation with Yes/No choice.

Signed-off-by: Viacelaus <vaceslavkozin619@gmail.com>
@slamko slamko force-pushed the hard-reset-safety-on-empty-repo branch from 450078e to e6eec1b Compare February 11, 2022 20:59
@slamko slamko changed the title Forbid a hard reset on empty repo with staged files. Hard reset safe mode Feb 11, 2022
@slamko
Author

slamko commented Feb 11, 2022

/submit

@gitgitgadget

gitgitgadget bot commented Feb 11, 2022

Submitted as pull.1137.v2.git.1644618404948.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1137/Viaceslavus/hard-reset-safety-on-empty-repo-v2

To fetch this version to local tag pr-1137/Viaceslavus/hard-reset-safety-on-empty-repo-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1137/Viaceslavus/hard-reset-safety-on-empty-repo-v2

@gitgitgadget

gitgitgadget bot commented Feb 11, 2022

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Viaceslavus via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Viacelaus <vaceslavkozin619@gmail.com>
>
> The power of hard reset may frequently be inconvenient for a common user. So
> this is an implementation of safe mode for hard reset. It can be switched on
> by setting 'reset.safe' config variable to true. When running 'reset --hard'
> with 'reset.safe' enabled git will check if there are any staged changes
> that may be discarded by this reset. If there is a chance of deleting the
> changes, git will ask the user for a confirmation with Yes/No choice.

This needs an explanation of how it avoids breaking scripts that
trust that "git reset --hard HEAD" reliably matches the index and
the working tree files to what is recorded in HEAD without getting
stuck waiting for any user input.  "They can turn off reset.safe" is
not an acceptable answer.

> +static int check_commit_exists(const char *refname, const struct object_id *oid, int f, void *d)
> +{
> +	return is_branch(refname);
> +}

The returned value from a for_each_ref() callback is used to tell
the caller "stop here, no need to further iterate and call me with
other refs".  I think this wants to say "if I ever get called even
once, tell the caller to stop, so that it can tell its caller that
it was stopped".

> +static void accept_discarding_changes(void) {
> +	int answer = getc(stdin);
> +	printf(_("Some staged changes may be discarded by this reset. Continue? [Y/n]"));
> +
> +	if (answer != 'y' && answer != 'Y') {
> +		printf(_("aborted\n"));
> +		exit(1);
> +	}
> +}

I'd think at least we should use git_prompt(), instead of
hand-rolled prompt routine like this one that assumes that an
end-user is sitting in front of the terminal waiting to be prompted.

If updating "git reset" like this patch does were a good idea to
begin with, that is.

> +static void detect_risky_reset(int commits_exist) {
> +	int cache = read_cache();
> +	if(!commits_exist) {
> +		if(cache == 1) {
> +			accept_discarding_changes();
> +		}
> +	}
> +	else {
> +		if(has_uncommitted_changes(the_repository, 1)) {
> +			accept_discarding_changes();
> +		}
> +	}
> +}

Style (too many to list---see Documentation/CodingGuidelines).

> +	if (reset_type == HARD) {
> +		int safe = 0;
> +		git_config_get_bool("reset.safe", &safe);
> +		if (safe) {
> +			int commits_exist = for_each_fullref_in("refs/heads", check_commit_exists, NULL);

The callback is called for each and every ref inside "refs/heads/",
so by definition, shouldn't any of them pass "is_branch(refname)"?

In any case, why does this have to be done by the caller?  If the
helper claims to be capable of detecting a "risky reset" (if such a
thing exists, that is), and if the helper behaves differently when
there is any commit on any branch or not as its implementation
detail, shouldn't it figure out if there is a commit _inside_ the
helper itself, not forcing the caller to compute it for it?

> +			detect_risky_reset(commits_exist);
> +		}
> +	}
> +
>  	if (reset_type == NONE)
>  		reset_type = MIXED; /* by default */
>  

> diff --git a/t/t7104-reset-hard.sh b/t/t7104-reset-hard.sh
> index cf9697eba9a..c962c706bed 100755
> --- a/t/t7104-reset-hard.sh
> +++ b/t/t7104-reset-hard.sh
> @@ -44,4 +44,31 @@ test_expect_success 'reset --hard did not corrupt index or cache-tree' '
>  
>  '
>  
> +test_expect_success 'reset --hard in safe mode on unborn branch with staged files results in a warning' '
> +	git config reset.safe on &&

Use either "test_when_finished" or "test_config", which is a good
way to isolate each test from side effects of running the test that
comes before it.
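
The question the patch asks through its `for_each_fullref_in()` callback, "is there any ref under refs/heads/ at all?", has a direct shell-level analogue. The sketch below is not the patch's C code; the function name `any_branch_exists` is hypothetical, and it relies only on the stock `git for-each-ref` command.

```shell
# Hypothetical helper: succeed if at least one ref exists under refs/heads/.
# --count=1 stops after the first match, mirroring a callback that tells the
# iterator to stop as soon as it has been called once.
any_branch_exists() {
	[ -n "$(git for-each-ref --count=1 --format='%(refname)' refs/heads/)" ]
}
```

In a freshly initialized repository the function fails, and it succeeds once the first commit has created a branch. Keeping this check inside a single helper, rather than computing it in the caller, matches the structure suggested in the review.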

@dscho
Member

dscho commented Mar 22, 2022

/allow

@gitgitgadget

gitgitgadget bot commented Mar 22, 2022

User slamko is now allowed to use GitGitGadget.
