stash: reuse cached index entries in --patch temporary index#2306
Open
adamchainz wants to merge 1 commit into
Open
stash: reuse cached index entries in --patch temporary index#2306adamchainz wants to merge 1 commit into
adamchainz wants to merge 1 commit into
Conversation
5163197 to
d8438ee
Compare
`git stash -p` prepares the interactive selection by creating a temporary index at HEAD, switching `GIT_INDEX_FILE` to it, and then running the `add -p` machinery. That temporary index was created by running `git read-tree HEAD`. The resulting index had no useful cached stat data or fsmonitor-valid bits from the real index. When `run_add_p()` refreshed that temporary index before showing the first prompt, it could end up lstat(2)-ing every tracked file, even in a repository where `git diff` and `git restore -p` can use fsmonitor to avoid that work. Create the temporary index in-process instead. Use `unpack_trees()` to reset the real index contents to HEAD while writing the result to the temporary index path. For paths whose index entries already match HEAD, `oneway_merge()` reuses the existing cache entries, preserving their cached stat data and `CE_FSMONITOR_VALID` state. This makes the refresh performed by `run_add_p()` behave like the one used by `git restore -p`: unchanged paths can be skipped via fsmonitor instead of being scanned again. In a 206k file repository with `core.fsmonitor` enabled and a one-line change in one file, time to first prompt dropped from 34.774 seconds to 0.659 seconds. Signed-off-by: Adam Johnson <me@adamj.eu>
d8438ee to
b228160
Compare
Contributor
Author
|
/preview |
|
Preview email sent as pull.2306.git.git.1779194003106.gitgitgadget@gmail.com |
Contributor
Author
|
/submit |
|
Submitted as pull.2306.git.git.1779194605735.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
|
This patch series was integrated into seen via 15bd84f. |
|
Junio C Hamano wrote on the Git mailing list (how to reply to this email): "Adam Johnson via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Adam Johnson <me@adamj.eu>
>
> `git stash -p` prepares the interactive selection by creating a
> temporary index at HEAD, switching `GIT_INDEX_FILE` to it, and then
> running the `add -p` machinery.
>
> That temporary index was created by running `git read-tree HEAD`. The
> resulting index had no useful cached stat data or fsmonitor-valid bits
> from the real index. When `run_add_p()` refreshed that temporary index
> before showing the first prompt, it could end up lstat(2)-ing every
> tracked file, even in a repository where `git diff` and `git restore -p`
> can use fsmonitor to avoid that work.
>
> Create the temporary index in-process instead. Use `unpack_trees()` to
> reset the real index contents to HEAD while writing the result to the
> temporary index path. For paths whose index entries already match HEAD,
> `oneway_merge()` reuses the existing cache entries, preserving their
> cached stat data and `CE_FSMONITOR_VALID` state.
Clever. As the fsmonitor_valid bit is in-core only, updating the
index in-process would be an obvious and probably the only sensible
way to preserve it.
I however have to wonder if simply replacing the external process
invocation with "git read-tree -m HEAD" (i.e., oneway merge) gives
a similar speed-up.
> This makes the refresh performed by `run_add_p()` behave like the one
> used by `git restore -p`: unchanged paths can be skipped via fsmonitor
> instead of being scanned again.
>
> In a 206k file repository with `core.fsmonitor` enabled and a one-line
> change in one file, time to first prompt dropped from 34.774 seconds to
> 0.659 seconds.
Interesting.
> diff --git a/t/t3904-stash-patch.sh b/t/t3904-stash-patch.sh
> index 90a4ff2c10..4b3241c8cd 100755
> --- a/t/t3904-stash-patch.sh
> +++ b/t/t3904-stash-patch.sh
> @@ -84,6 +84,24 @@ test_expect_success 'none of this moved HEAD' '
> verify_saved_head
> '
>
> +test_expect_success 'stash -p with unmodified tracked files present' '
> + git reset --hard &&
> + echo line1 >alpha &&
> + echo line1 >beta &&
> + git add alpha beta &&
> + git commit -m "add alpha and beta" &&
> + echo line2 >>alpha &&
> + echo y | git stash -p &&
> + echo line1 >expect &&
> + test_cmp expect alpha &&
> + test_cmp expect beta &&
> + git stash pop &&
> + printf "line1\nline2\n" >expect &&
> + test_cmp expect alpha &&
> + echo line1 >expect &&
> + test_cmp expect beta
> +'
What I read from the proposed log message is that the change is
purely about performance and should not change any behaviour. Why
do we need a new test in t/t3904? I would not have surprised if we
saw a new test in t/perf/, though.
Thanks. |
|
Junio C Hamano wrote on the Git mailing list (how to reply to this email): "Adam Johnson via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 2 files changed, 83 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/stash.c b/builtin/stash.c
> index 32dbc97b47..48189cb9f7 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -372,6 +372,57 @@ static int reset_tree(struct object_id *i_tree, int update, int reset)
> return 0;
> }
>
> +static int create_index_from_tree(const struct object_id *tree_id,
> + const char *index_path)
> +{
> + int nr_trees = 1;
> + int ret = 0;
> + struct unpack_trees_options opts;
> + struct tree_desc t[MAX_UNPACK_TREES];
> + struct tree *tree;
> + struct index_state dst_istate = INDEX_STATE_INIT(the_repository);
> + struct lock_file lock_file = LOCK_INIT;
> +
> + repo_read_index_preload(the_repository, NULL, 0);
> + if (refresh_index(the_repository->index, REFRESH_QUIET, NULL, NULL, NULL))
> + return -1;
Is this "non-zero return from refresh_index() leads to a failure"
intended? The old "git read-tree HEAD" wouldn't have cared if the
original index were unmerged, for example, but with this update, we
will see an immediate failure. There are other conditions that
refresh_index() flips its local variable has_errors on, which leads
to its non-zero return.
Since "git stash -p" is almost always invoked when the user has
unstaged modifications, I am not sure allowing refresh_index() to
notice and barf is what we want here. |
|
This patch series was integrated into seen via f9c94c1. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
CC: Thomas Gummerer t.gummerer@gmail.com, Elijah Newren newren@gmail.com, Phillip Wood phillip.wood@dunelm.org.uk, Victoria Dye vdye@github.com