pr-1609/vdye/vdye/for-each-ref-optimizations-v2
pr-1609/vdye/vdye/for-each-ref-optimizations-v2: for-each-ref optimizations & usability improvements
tagged this on 14 Nov 2023 at 19:53 UTC
This series is a bit of an informal follow-up to [1], adding some more
substantial optimizations and usability fixes around ref
filtering/formatting. Some of the changes here affect user-facing
behavior, some are internal-only, but they're all interdependent enough
to warrant putting them together in one series.

[1] https://lore.kernel.org/git/pull.1594.v2.git.1696888736.gitgitgadget@gmail.com/

Patch 1 changes the behavior of the '--no-sort' option in
'for-each-ref', 'tag', and 'branch'. Currently, it just removes previous
sort keys and, if no further keys are specified, falls back on ascending
refname sort (which, IMO, makes the name '--no-sort' somewhat
misleading). Now, '--no-sort' completely disables sorting (unless
subsequent '--sort' options are provided).

Patches 2-7 incrementally refactor various parts of the ref
filtering/formatting workflows in order to create a
'filter_and_format_refs()' function. If certain conditions are met
(sorting disabled, no reachability filtering or ahead-behind
formatting), ref filtering & formatting is done within a single
'for_each_fullref_in' callback. Especially in large repositories, this
makes a huge difference in memory usage & runtime for certain usages of
'for-each-ref', since it's no longer writing everything to a
'struct ref_array' then repeatedly whittling down/updating its contents.

Patch 8 updates the 'for-each-ref' documentation, making the '--format'
description a bit less jumbled and more clearly explaining the '*'
prefix (to be updated in the next patch).

Patch 9 changes the dereferencing done by the '*' format prefix from a
single dereference to a recursive peel. See [2] + replies for the
discussion that led to this approach (as opposed to a command line
option or new format specifier).

[2] https://lore.kernel.org/git/ZUoWWo7IEKsiSx-C@tanuki/

Finally, patch 10 adds performance tests for 'for-each-ref', showing the
effects of optimizations made throughout the series.
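As a rough illustration of why the single-callback path in patches 2-7
helps, the sketch below models the two strategies in Python rather than
Git's C. It is not the code from 'ref-filter.c': 'collect_then_format()',
'filter_and_format()', 'counting()', and the (refname, object id) tuples
are hypothetical names made up for this example, and 'count' is assumed
to be either None (no limit) or a positive integer. The single-pass
variant only stands in for the case where sorting is disabled and no
reachability filtering or ahead-behind formatting is requested.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

# Hypothetical stand-in for a ref entry: (refname, object id).
Ref = Tuple[str, str]


def collect_then_format(
    refs: Iterable[Ref],
    keep: Callable[[Ref], bool],
    fmt: Callable[[Ref], str],
    count: Optional[int] = None,
) -> List[str]:
    """Two-phase model: store every matching ref, sort, trim, then format."""
    kept = [ref for ref in refs if keep(ref)]  # whole filtered set is held in memory
    kept.sort(key=lambda ref: ref[0])          # default ascending refname sort
    if count is not None:
        kept = kept[:count]
    return [fmt(ref) for ref in kept]


def filter_and_format(
    refs: Iterable[Ref],
    keep: Callable[[Ref], bool],
    fmt: Callable[[Ref], str],
    count: Optional[int] = None,
) -> List[str]:
    """Single-pass model: filter and format each ref as it is visited."""
    out: List[str] = []
    for ref in refs:
        if not keep(ref):
            continue
        out.append(fmt(ref))
        if count is not None and len(out) >= count:
            break  # stop iterating as soon as enough refs were emitted
    return out


def counting(refs: Iterable[Ref], visited: List[str]) -> Iterator[Ref]:
    """Yield refs one by one, recording which ones were actually visited."""
    for ref in refs:
        visited.append(ref[0])
        yield ref


if __name__ == "__main__":
    refs = [("refs/tags/v2", "b"), ("refs/heads/main", "a"), ("refs/tags/v1", "c")]
    is_tag = lambda ref: ref[0].startswith("refs/tags/")
    show = lambda ref: f"{ref[1]} {ref[0]}"

    visited: List[str] = []
    print(filter_and_format(counting(refs, visited), is_tag, show, count=1))
    print("refs visited:", len(visited), "of", len(refs))
```

In this model the two-phase version always walks and stores every
matching ref before it can print anything, while the single-pass version
with a count can stop after the first match. That is only a simplified
picture of the real iteration, but it suggests where the memory and
runtime savings for unsorted output may come from.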
Here are some sample results from my Ubuntu VM (test names shortened for
space):

Test                                              HEAD
----------------------------------------------------------------------------
6300.2: (loose)                                   4.68(0.98+3.64)
6300.3: (loose, no sort)                          4.65(0.91+3.67)
6300.4: (loose, --count=1)                        4.50(0.84+3.60)
6300.5: (loose, --count=1, no sort)               4.24(0.46+3.71)
6300.6: (loose, tags)                             2.41(0.45+1.93)
6300.7: (loose, tags, no sort)                    2.33(0.48+1.83)
6300.8: (loose, tags, dereferenced)               3.65(1.66+1.95)
6300.9: (loose, tags, dereferenced, no sort)      3.48(1.59+1.87)
6300.10: for-each-ref + cat-file (loose, tags)    4.48(2.27+2.22)
6300.12: (packed)                                 0.90(0.68+0.18)
6300.13: (packed, no sort)                        0.71(0.55+0.06)
6300.14: (packed, --count=1)                      0.77(0.52+0.16)
6300.15: (packed, --count=1, no sort)             0.03(0.01+0.02)
6300.16: (packed, tags)                           0.45(0.33+0.10)
6300.17: (packed, tags, no sort)                  0.39(0.33+0.03)
6300.18: (packed, tags, dereferenced)             1.83(1.67+0.10)
6300.19: (packed, tags, dereferenced, no sort)    1.42(1.28+0.08)
6300.20: for-each-ref + cat-file (packed, tags)   2.36(2.11+0.29)

 - Victoria

Changes since V1
================

 * Restructured commit message of patch 1 for better readability
 * Re-added 'ref_sorting_release(sorting)' to 'ls-remote'
 * Dropped patch 2 so we don't commit to behavior we don't want in
   'for-each-ref --omit-empty --count'
 * Split patch 6 into one that renames 'ref_filter_handler()' to
   'filter_one()' and another that creates helper functions from
   existing code
 * Added/updated code comments in patch 7, changed ref iteration "break"
   return value from -1 to 1
 * Added a patch to reword 'for-each-ref' documentation in anticipation
   of updating the description of what '*' does in the format
 * Removed command-line option '--full-deref' for peeling tags in '*'
   format fields in favor of simply cutting over from the current single
   dereference to recursive dereference in all cases. Updated tests to
   match new behavior.
 * Added the '--count=1' tests back to p6300 (I must have
   unintentionally removed them before submitting V1)

Victoria Dye (10):
  ref-filter.c: really don't sort when using --no-sort
  ref-filter.h: add max_count and omit_empty to ref_format
  ref-filter.h: move contains caches into filter
  ref-filter.h: add functions for filter/format & format-only
  ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()'
  ref-filter.c: refactor to create common helper functions
  ref-filter.c: filter & format refs in the same callback
  for-each-ref: clean up documentation of --format
  ref-filter.c: use peeled tag for '*' format fields
  t/perf: add perf tests for for-each-ref

 Documentation/git-for-each-ref.txt |  23 +--
 builtin/branch.c                   |  42 +++--
 builtin/for-each-ref.c             |  39 +----
 builtin/ls-remote.c                |  11 +-
 builtin/tag.c                      |  32 +---
 ref-filter.c                       | 272 ++++++++++++++++++++---------
 ref-filter.h                       |  25 +++
 t/perf/p6300-for-each-ref.sh       |  87 +++++++++
 t/t3200-branch.sh                  |  68 +++++++-
 t/t6300-for-each-ref.sh            |  43 +++++
 t/t6302-for-each-ref-filter.sh     |   4 +-
 t/t7004-tag.sh                     |  45 +++++
 12 files changed, 517 insertions(+), 174 deletions(-)
 create mode 100755 t/perf/p6300-for-each-ref.sh

base-commit: bc5204569f7db44d22477485afd52ea410d83743

Submitted-As: https://lore.kernel.org/git/pull.1609.v2.git.1699991638.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.1609.git.1699320361.gitgitgadget@gmail.com
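
The difference between the previous single dereference and the recursive
peel that patch 9 gives the '*' format prefix can be sketched with a toy
object store. This is an illustrative Python model, not Git code: the
'OBJECTS' dictionary, the object names, and both helper functions are
hypothetical, and returning None for a non-tag object is an assumption
of the sketch standing in for an empty '*' field.

```python
from typing import Dict, Optional, Tuple

# Toy object store (hypothetical): object id -> (type, id the tag points at).
ObjectStore = Dict[str, Tuple[str, Optional[str]]]

OBJECTS: ObjectStore = {
    "outer-tag": ("tag", "inner-tag"),  # annotated tag pointing at another tag
    "inner-tag": ("tag", "commit-1"),   # annotated tag pointing at a commit
    "commit-1": ("commit", None),
}


def deref_once(oid: str, objects: ObjectStore) -> Optional[str]:
    """Old model: follow a tag exactly one step, whatever it points at."""
    kind, target = objects[oid]
    return target if kind == "tag" else None


def peel(oid: str, objects: ObjectStore) -> Optional[str]:
    """New model: keep following tags until a non-tag object is reached."""
    kind, target = objects[oid]
    if kind != "tag":
        return None  # nothing to dereference for a non-tag object
    while kind == "tag":
        oid = target
        kind, target = objects[oid]
    return oid


if __name__ == "__main__":
    for name in ("outer-tag", "inner-tag", "commit-1"):
        print(name, "once:", deref_once(name, OBJECTS), "peeled:", peel(name, OBJECTS))
```

For a tag that points directly at a commit the two helpers agree; they
only differ for nested tags, where the single dereference stops at the
inner tag and the peel continues to the commit.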