[pull] master from git:master by pull[bot] · Pull Request #138 · makesoftwaresafe/git

pull · 2023-07-25T22:53:25Z

See Commits and Changes for more details.

Can you help keep this open source service alive? 💖 Please sponsor : )

When cross-compiling with the mingw toolchain on a system with a case sensitive filesystem, the mixed case (which is technically correct as per the contents of MS Visual C++) doesn't work (the corresponding mingw headers are all lowercase for some reason). Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In 37fec86 (packfile: abstract away hash constant values, 2018-05-02), `nth_packed_object_id()` started using the variable `the_hash_algo->rawsz` instead of a fixed constant when trying to compute an offset into the ".idx" file for some object position. This can lead to surprising truncation when looking for an object towards the end of a large enough pack, like the following: (gdb) p hashsz $1 = 20 (gdb) p n $2 = 215043814 (gdb) p hashsz * n $3 = 5908984 , which is a debugger session broken on a known-bad call to the `nth_packed_object_id()` function. This behavior predates 37fec86, and is original to the v2 index format, via: 74e34e1 (sha1_file.c: learn about index version 2, 2007-04-09). This is due to §6.4.4.1 of the C99 standard, which states that an untyped integer constant will take the first type in which the value can be accurately represented, among `int`, `long int`, and `long long int`. Since 20 can be represented as an `int`, and `n` is a 32-bit unsigned integer, the resulting computation is defined by §6.3.1.8, and the (signed) integer value representing `n` is converted to an unsigned type, meaning that `20 * n` (for `n` having type `uint32_t`) is equivalent to a multiplication between two unsigned 32-bit integers. When multiplying a sufficiently large `n`, the resulting value can exceed 2^32-1, wrapping around and producing an invalid result. Let's follow the example in f86f769 (compute pack .idx byte offsets using size_t, 2020-11-13) and replace this computation with `st_mult()`, which will ensure that the computation is done using 64-bits. While here, guard the corresponding computation for packs with v1 indexes, too. Though the likelihood of seeing a bug there is much smaller, since (a) v1 indexes are generated far less frequently than v2 indexes, and (b) they all correspond to packs no larger than 2 GiB, so having enough objects to trigger this overflow is unlikely if not impossible. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

When the user is in the middle of making a commit, they are not yet at the point where they are ready to think about integrating their local branch with the corresponding remote branch or force-pushing over the remote branch. Don't include advice on how to deal with divergent branches in the commit template, to avoid giving the impression that the divergence needs to be dealt with immediately. Similar advice will be printed when it is most relevant, that is, if the user does try to push without first reconciling the two branches. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a narrow but common case, the user is the only author of a branch and doesn't mind overwriting the corresponding branch on the remote. This workflow is especially common on GitHub, GitLab, and Gerrit, which keep a permanent record of every version of a branch that is pushed while a pull request is open for that branch. On those platforms, force-pushing is encouraged and is analogous to emailing a new version of a patchset. When giving advice about divergent branches, tell the user about `git pull`, but don't unconditionally instruct the user to do it. A less prescriptive message will help prevent users from thinking that they are required to create an integrated history instead of simply replacing the previous history. Likewise, don't imply that `git pull` is only for merging. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a narrow but common case, the user is the only author of a branch and doesn't mind overwriting the corresponding branch on the remote. This workflow is especially common on GitHub, GitLab, and Gerrit, which keep a permanent record of every version of a branch that is pushed while a pull request is open for that branch. On those platforms, force-pushing is encouraged and is analogous to emailing a new version of a patchset. When giving advice about divergent branches, tell the user about `git pull`, but don't unconditionally instruct the user to do it. A less prescriptive message will help prevent users from thinking that they are required to create an integrated history instead of simply replacing the previous history. Also, don't put `git pull` in an awkward parenthetical, because `git pull` can always be used to reconcile branches and is the normal way to do so. Due to the difficulty of knowing which command for force-pushing is best suited to the user's situation, no specific advice is given about force-pushing. Instead, the user is directed to the Git documentation to read about possible ways forward that do not involve integration. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

These two messages were introduced in 8ba221e (bundle: output hash information in 'verify', 2022-03-22) and 105c6f1 (bundle: parse filter capability, 2022-03-09) but never for translation. Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

The reflog-expire command has been unimplemented since it was added in 80f2a60 (t/helper: add test-ref-store to test ref-store functions, 2017-03-26). This causes -Wunused-parameter to complain, since the function just calls die() without looking at its arguments. We could mark these as UNUSED to silence the warning. But let's just drop the function. It has no callers in the test suite and is not doing anything useful, beyond perhaps reminding us that it's something we _could_ be testing. But since the bulk of the work in adding such tests would be the shell bits that actually examine the reflog state before and after expiration, this is not even a useful step in that direction. Somebody who wants to do that work later can easily add this function back. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

This function gets a repository parameter because it's a callback for do_for_each_repo_ref_iterator(). But it's just a wrapper that passes along each call to a regular each_ref_fn callback, and the latter doesn't accept a repository argument. Probably in the long run all of the each_ref_fn callbacks should get a repository parameter, too. But changing that now would require updates all over the code base. Until that happens, let's annotate this wrapper callback to quiet the compiler's -Wunused-parameter warning. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

These functions are all used as callbacks for curl, so they have to conform to a particular interface. But they don't need all of their parameters: - fwrite_null() throws away the input, so it doesn't look at most parameters - fwrite_wwwauth() in theory could take the auth struct in its void pointer, but instead we just access it as the global http_auth (matching the rest of the code in this file) - curl_trace() always writes via the trace mechanism, so it doesn't need its void pointer to know where to send things. Likewise, it ignores the CURL parameter, since nothing we trace requires querying the handle. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

The xml_start_tag() function is passed the expat library's XML_SetElementHandler() function, so it has to conform to the expected interface. But we don't actually care about the attributes list. Mark it so that -Wunused-parameter does not complain. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

When parsing the input, we have a "keep_cr" parameter to tell us how to handle line endings. But this doesn't apply to stgit or hg patches (which are not mailbox formats where we have to worry about that), so we ignore the parameter entirely in those functions. Let's mark these as unused so that -Wunused-parameter does not complain about them. Note that we could just drop these parameters entirely. They are necessary to conform to the mail_conv_fn interface used by split_mail_conv(), but these two callbacks are the only ones used with that function. The other formats (which _do_ care about keep_cr) use split_mail_mbox(). But it's conceivable that we'd eventually add another format that does care about this option, so let's leave it as part of the generic interface. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Callbacks to for_each_altodb() get a void data pointer, but we don't need it here. Mark it as unused to silence -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

The setup_revision_opt struct has a "tweak" function pointer, which can be used to adjust parameters after setup_revisions() parses arguments, but before it finalizes setup. In addition to the rev_info struct, the callback receives a pointer to the setup_revision_opt, as well. But none of the existing callbacks looks at the extra "opt" parameter, leading to -Wunused-parameter warnings. We could mark it as UNUSED, but instead let's remove it entirely. It's conceivable that it could be useful for a callback to have access to the "opt" struct. But in the 13 years that this mechanism has existed, nobody has used it. So let's just drop it in the name of simplifying. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

There are a few callback functions which are used with the fsck code, but it's natural that not all callbacks need all parameters. For reporting, even something as obvious as "the oid of the object which had a problem" is not always used, as some callers are only checking a single object in the first place. And for both reporting and walking, things like void data pointers and the fsck_options aren't always necessary. But since each such parameter is used by _some_ callback, we have to keep them in the interface. Mark the unused ones in specific callbacks to avoid triggering -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Our threeway_callback() does not bother to look at its "n" parameter. It is static in this file and used only by trivial_merge_trees(), which always passes 3 trees (hence the name "threeway"). It also does not look at "dirmask". This is OK, as it handles directories specifically by looking at the mode bits. Other traverse_info callbacks need these, so we can't get drop them from the interface. But let's annotate these ones to avoid complaints from -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

We don't look at the "flags" parameter, which is natural for something that is just printing the contents of the replace refs. But let's mark it to appease -Wunused-parameter. This probably should have been part of 63e14ee (refs: mark unused each_ref_fn parameters, 2022-08-19), but I missed it as this one is a repo_each_ref_fn, which takes an extra repository argument. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

We don't look at the "commit" parameter to our callback, as our "mergetag_data" pointer contains the original name "ref", which we use instead. But we can't get rid of it, since other for_each_mergetag callbacks do use it. Let's mark the parameter to avoid -Wunused-parameter warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

We don't need to use the "data" parameter in this instance. Let's mark it to avoid -Wunused-parameter warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

We iterate over the set of input tag names using callbacks. But not all operations need the same inputs, so some parameters go unused (but of course not the same ones for each operation). Mark the unused ones to avoid -Wunused-parameter warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Many callback interfaces have an extra void data parameter, but we don't always need it (especially for dumping functions like the ones in test helpers). Mark them as unused to avoid -Wunused-parameter warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Prevent an overflow when locating a pack's CRC offset when the number of packed items is greater than 2^32-1/hashsz by guarding the computation with an `st_mult()`. Note that to avoid truncating the result, the `crc_offset` member must itself become a `size_t`. The only usage of this variable (besides the assignment in `load_idx()`) is in `read_v2_anomalous_offsets()` in the index-pack code. There we use the `crc_offset` as a pointer offset, so we are already equipped to handle the type change. Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as the previous commits, ensure that we use `st_add()` or `st_mult()` when computing values that may overflow the 32-bit unsigned limit. Note that in each of these instances, we prevent 32-bit overflow already since we have explicit casts to `size_t`. So this code is OK as-is, but let's clarify it by using the `st_xyz()` helpers to make it obvious that we are performing the relevant computations using 64 bits. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

The `midx_fanout` struct is used to keep track of a set of OIDs corresponding to each layer of the MIDX's fanout table. It stores an array of entries, along with the number of entries in the table, and the allocated size of the array. Both `nr` and `alloc` are stored as 32-bit unsigned integers. In practice, this should never cause any problems, since most packs have far fewer than 2^32-1 objects. But storing these as `size_t`'s is more appropriate, and prevents us from accidentally overflowing some result when multiplying or adding to either of these values. Update these struct members to be `size_t`'s as appropriate. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous commits, avoid overflow when looking up an object's OID in a MIDX when its position is greater than `2^32-1/m->hash_len`. As usual, it is perfectly OK for a MIDX to have as many as 2^32-1 objects (since we use 32-bit fields to count the number of objects at each fanout layer). But if we have more than `2^32-1/m->hash_len` number of objects, we will incorrectly perform the computation using 32-bit integers, overflowing the result. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous patches, avoid an overflow when looking up object offsets in the MIDX's large offset table by guarding the computation via `st_mult()`. This instance is also OK as-is, since the left operand is the result of `sizeof(...)`, which is already a `size_t`. But use `st_mult()` instead here to make it explicit that this computation is to be performed using 64-bit unsigned integers. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In the `write_midx_context` structure, we use two `uint32_t`'s to track the length and allocated size of the packs, and one `uint32_t` to track the number of objects in the MIDX. In practice, having these be 32-bit unsigned values shouldn't cause any problems since we are unlikely to have that many objects or packs in any real-world repository. But these values should be `size_t`'s, so change their type to reflect that. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

When writing a MIDX, we use the chunk-format API to write out each individual chunk of the MIDX. Each chunk of the MIDX is tracked via a call to `add_chunk()`, along with the expected size of that chunk. Guard against overflow when dealing with a MIDX with a large number of entries (and consequently, large chunks within the MIDX file itself) to avoid corrupting the contents of the MIDX itself. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as in previous commits, avoid an integer overflow when computing the expected size of a MIDX. (Note that this is also OK as-is, since `p->pack_size` is an `off_t`, so this computation should already be done as 64-bit integers. But again, let's use `st_mult()` to make this fact clear). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

When a bitmap is used to answer some reachability query, it creates a pseudo-bitmap called the "extended index" on top of any existing bitmaps to store objects that are relevant to the query, but not mentioned in the bitmap. When looking up the ith object in the extended index in a bitmap, it is common to write something like: bitmap_get(result, i + bitmap_num_objects(bitmap_git)) , indicating that we want the ith object following all other objects mentioned in the bitmap_git. Since the type of `i` and the return type of `bitmap_num_objects()` are both `uint32_t`s, But if there are either a large number of objects in the bitmap, or a large number of objects in the extended index (or both), this addition can overflow when the sum is greater than 2^32-1. Having that large of a bitmap position is entirely acceptable, but we need to ensure that the computed bitmap position for that object is performed using 64-bits and doesn't overflow. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

When writing a commit-graph, we use the chunk-format API to write out each individual chunk of the commit-graph. Each chunk of the commit-graph is tracked via a call to `add_chunk()`, along with the expected size of that chunk. Similar to an earlier commit which handled the identical issue in the MIDX machinery, guard against overflow when dealing with a commit-graph with a large number of entries to avoid corrupting the contents of the commit-graph itself. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

The commit-graph uses a fanout table with 4-byte entries to store the number of commits at each shard of the commit-graph. So it is OK to have a commit graph with as many as 2^32-1 stored commits. But we risk overflowing any computation which may exceed the 32-bit (unsigned) maximum when those computations are (incorrectly) performed using 32-bit operands. There are a couple of spots in `add_graph_to_chain()` where we could potentially overflow the result: - First, when comparing the list of existing entries in the commit-graph chain. It is unlikely that this should ever overflow, since it would require having roughly 2^32-1/g->hash_len commit-graphs in the chain. But let's guard that computation with a `st_mult()` just to be safe. - Second, when computing the number of commits in the graph added to the front of the chain. This value is also a 32-bit unsigned, but we should make sure that it does not grow beyond the maximum value. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous commits, ensure that we don't overflow when trying to compute an offset into the `chunk_oid_lookup` table when the `lex_index` of the item we're trying to look up exceeds `2^32-1/g->hash_len`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous commits, ensure that we don't overflow in a few spots within `fill_commit_graph_info()`: - First, when computing an offset into the commit data chunk, which can occur when the `lex_index` of the item we're looking up exceeds 2^32-1/GRAPH_DATA_WIDTH. - A similar issue when computing the generation date offset for commits with `lex_index` greater than 2^32-1/4. Note that in practice this will never overflow, since the left-hand operand is from calling `sizeof(...)` and is thus already a `size_t`. But wrap that in an `st_mult()` to make it clear that we intend to perform this computation using 64-bit operands. - Finally, a nearly identical issue as above when computing an offset into the `generation_data_overflow` chunk. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous commits, ensure that we don't overflow when the lex_index of the commit we are trying to fill out exceeds 2^32-1/(g->hash_len+16). The other hunk touched in this patch is not susceptible to overflow, since an explicit cast is made to a 64-bit unsigned value. For clarity and consistency with the rest of the commits in this series, avoid a tricky to reason about cast, and use `st_mult()` directly. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous commits, ensure that we don't overflow when computing an offset into the commit_data chunk when the (relative) graph position exceeds 2^32-1/GRAPH_DATA_WIDTH. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous commits, ensure that we don't overflow when choosing how to split and merge different layers of the commit-graph. In particular, avoid a potential overflow between `size_mult` and `num_commits`, as well as a potential overflow between the number of commits currently in the merged graph, and the number of commits in the graph about to be merged. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

When merging two commit graphs, ensure that we don't attempt to merge two graphs which, when combined, have more total commits than the 32-bit unsigned maximum. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous commits, ensure that we don't overflow when trying to read an existing OID while writing a new commit-graph. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a similar spirit as previous commits, ensure that we don't overflow when trying to read an OID out of an existing commit-graph during verification. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

I noticed this test was producing output like ``` t4002-diff-basic.sh: test_expect_successdiff can read from stdin: not found ``` which is rather odd. Investigation shows an error of shell syntax: foo'abc' is the same as fooabc to the shell. Perhaps obviously, this is not a valid command for the test. I am surprised this doesn't count as an error in the test, but that accounts for it going unnoticed. Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>

In a test introduced by 26c9c03 (ref-filter: add new "signature" atom, 2023-06-04) the file named "file" is added by a setup step that requires GPG and modified by a second setup step that requires GPGSSH. Systems lacking the first prerequisite skip the initial setup step and then "git commit -a" in the second one doesn't find the modified file. Add it explicitly. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Use the now common skip_prefix() cascade instead of a case statement to parse the strftime(3) format in strbuf_addftime(). skip_prefix() parses the "fmt" pointer and advances it appropriately, making additional pointer arithmetic unnecessary. The resulting code is more compact and consistent with most other strbuf_expand_step() loops. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Tags are dereferenced until reaching a different object type to handle nested tags, e.g. on checkout. In contrast, "git tag --points-at=..." fails to list such nested tags because only one level of indirection is obtained in filter_refs(). Implement the recursive dereferencing for the "--points-at" option when filtering refs to unify the behaviour. Signed-off-by: Jan Klötzke <jan@kloetzke.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

When we peel tags to check if they match a --points-at oid, we recursively parse the tagged object to see if it is also a tag. But since the tag itself tells us the type of the object it points to (and even gives us the appropriate object struct via its "tagged" member), we can use that directly. We do still have to make sure to call parse_tag() before looking at each tag. This is redundant for the outermost tag (since we did call parse_object() to find its type), but that's OK; parse_tag() is smart enough to make this a noop when the tag has already been parsed. In my clone of linux.git, with 782 tags (and only 3 non-tags), this yields a significant speedup (bringing us back where we were before the commit before this one started recursively dereferencing tags): Benchmark 1: ./git.old for-each-ref --points-at=HEAD --format="%(refname)" Time (mean ± σ): 20.3 ms ± 0.5 ms [User: 11.1 ms, System: 9.1 ms] Range (min … max): 19.6 ms … 21.5 ms 141 runs Benchmark 2: ./git.new for-each-ref --points-at=HEAD --format="%(refname)" Time (mean ± σ): 11.4 ms ± 0.2 ms [User: 6.3 ms, System: 5.0 ms] Range (min … max): 11.0 ms … 12.2 ms 250 runs Summary './git.new for-each-ref --points-at=HEAD --format="%(refname)"' ran 1.79 ± 0.05 times faster than './git.old for-each-ref --points-at=HEAD --format="%(refname)"' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

When handling --points-at, we have to try to peel each ref to see if it's a tag that points at a requested oid. We start this process by calling parse_object() on the oid pointed to by each ref. The cost of parsing each object adds up, especially in an output that doesn't otherwise need to open the objects at all. Ideally we'd use peel_iterated_oid() here, which uses the cached information in the packed-refs file. But we can't, because our --points-at must match not only the fully peeled value, but any interim values (so if tag A points to tag B which points to commit C, we should match --points-at=B, but peel_iterated_oid() will only tell us about C). So the best we can do (absent changes to the packed-refs peel traits) is to avoid parsing non-tags. The obvious way to do that is to call oid_object_info() to check the type before parsing. But there are a few gotchas there, like checking if the object has already been parsed. Instead we can just tell parse_object() that we are OK skipping the hash check, which lets it turn on several optimizations. Commits can be loaded via the commit graph (so it's both fast and we have the benefit of the parsed data if we need it later at the output stage). Blobs are not loaded at all. Trees are still loaded, but it's rather rare to have a ref point directly to a tree (and since this is just an optimization, kicking in 99% of the time is OK). Even though we're paying for an extra lookup, the cost to avoid parsing the non-tags is a net benefit. In my git.git repository with 941 tags and 1440 other refs pointing to commits, this significantly cuts the runtime: Benchmark 1: ./git.old for-each-ref --points-at=HEAD Time (mean ± σ): 26.8 ms ± 0.5 ms [User: 24.5 ms, System: 2.2 ms] Range (min … max): 25.9 ms … 29.2 ms 107 runs Benchmark 2: ./git.new for-each-ref --points-at=HEAD Time (mean ± σ): 9.1 ms ± 0.3 ms [User: 6.8 ms, System: 2.2 ms] Range (min … max): 8.6 ms … 10.2 ms 308 runs Summary './git.new for-each-ref --points-at=HEAD' ran 2.96 ± 0.10 times faster than './git.old for-each-ref --points-at=HEAD' In a repository that is mostly annotated tags, we'd expect less improvement (we might still skip a few object loads, but that's balanced by the extra lookups). In my clone of linux.git, which has 782 tags and 3 branches, the run-time is about the same (it's actually ~1% faster on average after this patch, but that's within the run-to-run noise). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

We return the oid that matched, but the sole caller only cares whether we matched anything at all. This is mostly academic, since there's only one caller, but the lifetime of the returned pointer is not immediately clear. Sometimes it points to an oid in a tag struct, which should live forever. And sometimes to the oid passed in, which only lives as long as the each_ref_fn callback we're called from. Simplify this to a boolean return which is more direct and obvious. As a bonus, this lets us avoid the weird pattern of overwriting our "oid" parameter in the loop (since we now only refer to the tagged oid one time, and can just inline the call to get it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Help newbies by suggesting that there are cases where force-pushing is a valid and sensible thing to update a branch at a remote repository, rather than reconciling with merge/rebase. * ah/advise-force-pushing: push: don't imply that integration is always required before pushing remote: don't imply that integration is always required before pushing wt-status: don't show divergence advice when committing

Various offset computation in the code that accesses the packfiles and other data in the object layer has been hardened against arithmetic overflow, especially on 32-bit systems. * tb/object-access-overflow-protection: commit-graph.c: prevent overflow in `verify_commit_graph()` commit-graph.c: prevent overflow in `write_commit_graph()` commit-graph.c: prevent overflow in `merge_commit_graph()` commit-graph.c: prevent overflow in `split_graph_merge_strategy()` commit-graph.c: prevent overflow in `load_tree_for_commit()` commit-graph.c: prevent overflow in `fill_commit_in_graph()` commit-graph.c: prevent overflow in `fill_commit_graph_info()` commit-graph.c: prevent overflow in `load_oid_from_graph()` commit-graph.c: prevent overflow in add_graph_to_chain() commit-graph.c: prevent overflow in `write_commit_graph_file()` pack-bitmap.c: ensure that eindex lookups don't overflow midx.c: prevent overflow in `fill_included_packs_batch()` midx.c: prevent overflow in `write_midx_internal()` midx.c: store `nr`, `alloc` variables as `size_t`'s midx.c: prevent overflow in `nth_midxed_offset()` midx.c: prevent overflow in `nth_midxed_object_oid()` midx.c: use `size_t`'s for fanout nr and alloc packfile.c: use checked arithmetic in `nth_packed_object_offset()` packfile.c: prevent overflow in `load_idx()` packfile.c: prevent overflow in `nth_packed_object_id()`

Test fix. * dk/t4002-syntaxo-fix: t4002: fix "diff can read from stdin" syntax

Names of MinGW header files are spelled in mixed case in some source files, but the build host can be using case sensitive filesystem with header files with their name spelled in all lowercase. * mh/mingw-case-sensitive-build: mingw: use lowercase includes for some Windows headers

Update message mark-up for i18n in "git bundle". * dk/bundle-i18n-more: i18n: mark more bundle.c strings for translation

Mark-up unused parameters in the code so that we can eventually enable -Wunused-parameter by default. * jk/unused-parameter: t/helper: mark unused callback void data parameters tag: mark unused parameters in each_tag_name_fn callbacks rev-parse: mark unused parameter in for_each_abbrev callback replace: mark unused parameter in each_mergetag_fn callback replace: mark unused parameter in ref callback merge-tree: mark unused parameter in traverse callback fsck: mark unused parameters in various fsck callbacks revisions: drop unused "opt" parameter in "tweak" callbacks count-objects: mark unused parameter in alternates callback am: mark unused keep_cr parameters http-push: mark unused parameter in xml callback http: mark unused parameters in curl callbacks do_for_each_ref_helper(): mark unused repository parameter test-ref-store: drop unimplemented reflog-expire command

Test fix. * rs/ref-filter-signature-fix: t6300: fix setup with GPGSSH but without GPG

Code clean-up. * rs/strbuf-addftime-simplify: strbuf: use skip_prefix() in strbuf_addftime()

"git tag --list --points-at X" showed tags that directly refers to object X, but did not list a tag that points at such a tag, which has been corrected. * jk/nested-points-at: ref-filter: simplify return type of match_points_at ref-filter: avoid parsing non-tags in match_points_at() ref-filter: avoid parsing tagged objects in match_points_at() ref-filter: handle nested tags in --points-at option

Signed-off-by: Junio C Hamano <gitster@pobox.com>

glandium and others added 30 commits June 12, 2023 12:31

count-objects: mark unused parameter in alternates callback

506d35f

Callbacks to for_each_altodb() get a void data pointer, but we don't need it here. Mark it as unused to silence -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

rev-parse: mark unused parameter in for_each_abbrev callback

1e6459e

We don't need to use the "data" parameter in this instance. Let's mark it to avoid -Wunused-parameter warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>

ttaylorr and others added 26 commits July 14, 2023 09:32

Merge branch 'dk/t4002-syntaxo-fix'

d4ce185

Test fix. * dk/t4002-syntaxo-fix: t4002: fix "diff can read from stdin" syntax

Merge branch 'dk/bundle-i18n-more'

dd224ce

Update message mark-up for i18n in "git bundle". * dk/bundle-i18n-more: i18n: mark more bundle.c strings for translation

Merge branch 'rs/ref-filter-signature-fix'

261ff51

Test fix. * rs/ref-filter-signature-fix: t6300: fix setup with GPGSSH but without GPG

Merge branch 'rs/strbuf-addftime-simplify'

02f50d0

Code clean-up. * rs/strbuf-addftime-simplify: strbuf: use skip_prefix() in strbuf_addftime()

The fourteenth batch

a80be15

Signed-off-by: Junio C Hamano <gitster@pobox.com>

pull bot added the ⤵️ pull label Jul 25, 2023

pull bot merged commit a80be15 into makesoftwaresafe:master Jul 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[pull] master from git:master#138

[pull] master from git:master#138
pull[bot] merged 56 commits intomakesoftwaresafe:masterfrom
git:master

pull bot commented Jul 25, 2023 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

Conversation

pull bot commented Jul 25, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

pull bot commented Jul 25, 2023 •

edited

Loading