From c0c9d35e27e777d117362f6e96f433eb2b55bf76 Mon Sep 17 00:00:00 2001 From: Taylor Blau Date: Fri, 3 Jun 2022 17:55:36 -0400 Subject: [PATCH] builtin/show-ref.c: avoid over-iterating with --heads, --tags MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When `show-ref` is combined with the `--heads` or `--tags` options, it can avoid iterating parts of a repository's references that it doesn't care about. But it doesn't take advantage of this potential optimization. When this command was introduced back in 358ddb62cf (Add "git show-ref" builtin command, 2006-09-15), `for_each_ref_in()` did exist. But since most repositories don't have many (any?) references that aren't branches or tags already, this makes little difference in practice. Though for repositories with a large imbalance of branches and tags (or, more likely in the case of server operators, many hidden references), this can make quite a difference. Take, for example, a repository with 500,000 "hidden" references (all of the form "refs/__hidden__/N"), and a single branch: git commit --allow-empty -m "base" && seq 1 500000 | sed 's,\(.*\),create refs/__hidden__/\1 HEAD,' | git update-ref --stdin && git pack-refs --all Outputting the existence of that single branch currently takes on the order of ~50ms on my machine. The vast majority of this time is wasted iterating through references that we know we're going to discard. Instead, teach `show-ref` that it can iterate just "refs/heads" and/or "refs/tags" when given `--heads` and/or `--tags`, respectively. A few small interesting things to note: - When given either option, we can avoid the general-purpose for_each_ref() call altogether, since we know that it won't give us any references that we wouldn't filter out already. - We can make two separate calls to `for_each_fullref_in()` (and avoid, say, the more specialized `for_each_fullref_in_prefixes()`, since we know that the set of references enumerated by each is disjoint, so we'll never see the same reference appear in both calls. - We have to use the "fullref" variant (instead of just `for_each_branch_ref()` and `for_each_tag_ref()`), since we expect fully-qualified reference names to appear in `show-ref`'s output. When either of `heads_only` or `tags_only` is set, we can eliminate the strcmp() calls in `builtin/show-ref.c::show_ref()` altogether, since we know that `show_ref()` will never see a non-branch or tag reference. Unfortunately, we can't use `for_each_fullref_in_prefixes()` to enhance `show-ref`'s pattern matching, since `show-ref` patterns match on the _suffix_ (e.g., the pattern "foo" shows "refs/heads/foo", "refs/tags/foo", and etc, not "foo/*"). Nonetheless, in our synthetic example above, this provides a significant speed-up ("git" is roughly v2.36, "git.compile" is this patch): $ hyperfine -N 'git show-ref --heads' 'git.compile show-ref --heads' Benchmark 1: git show-ref --heads Time (mean ± σ): 49.9 ms ± 6.2 ms [User: 45.6 ms, System: 4.1 ms] Range (min … max): 46.1 ms … 73.6 ms 43 runs Benchmark 2: git.compile show-ref --heads Time (mean ± σ): 2.8 ms ± 0.4 ms [User: 1.4 ms, System: 1.2 ms] Range (min … max): 1.3 ms … 5.6 ms 957 runs Summary 'git.compile show-ref --heads' ran 18.03 ± 3.38 times faster than 'git show-ref --heads' Signed-off-by: Taylor Blau Signed-off-by: Junio C Hamano --- builtin/show-ref.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/builtin/show-ref.c b/builtin/show-ref.c index 7f8a5332f838ac..5fa207a044e069 100644 --- a/builtin/show-ref.c +++ b/builtin/show-ref.c @@ -52,14 +52,6 @@ static int show_ref(const char *refname, const struct object_id *oid, if (show_head && !strcmp(refname, "HEAD")) goto match; - if (tags_only || heads_only) { - int match; - - match = heads_only && starts_with(refname, "refs/heads/"); - match |= tags_only && starts_with(refname, "refs/tags/"); - if (!match) - return 0; - } if (pattern) { int reflen = strlen(refname); const char **p = pattern, *m; @@ -216,7 +208,14 @@ int cmd_show_ref(int argc, const char **argv, const char *prefix) if (show_head) head_ref(show_ref, NULL); - for_each_ref(show_ref, NULL); + if (heads_only || tags_only) { + if (heads_only) + for_each_fullref_in("refs/heads/", show_ref, NULL); + if (tags_only) + for_each_fullref_in("refs/tags/", show_ref, NULL); + } else { + for_each_ref(show_ref, NULL); + } if (!found_match) { if (verify && !quiet) die("No match");