Fix the fedora-breaking-changes CI job failures#900
Merged
dscho merged 40 commits intovfs-2.53.0from Apr 28, 2026
Merged
Conversation
The recent glibc 2.43 release had the following change listed in its
NEWS file:
For ISO C23, the functions bsearch, memchr, strchr, strpbrk, strrchr,
strstr, wcschr, wcspbrk, wcsrchr, wcsstr and wmemchr that return
pointers into their input arrays now have definitions as macros that
return a pointer to a const-qualified type when the input argument is
a pointer to a const-qualified type.
When compiling with GCC 15, which defaults to -std=gnu23, this causes
many warnings like this:
merge-ort.c: In function ‘apply_directory_rename_modifications’:
merge-ort.c:2734:36: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
2734 | char *last_slash = strrchr(cur_path, '/');
| ^~~~~~~
This patch fixes the more obvious ones by making them const when we do
not write to the returned pointer.
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We assign this variable unconditionally, so we do not need to initialize it to NULL where it is defined. Signed-off-by: Collin Funk <collin.funk1@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "lstrip" and "rstrip" options to the %(refname) placeholder both accept a negative length, which asks us to keep that many path components (rather than stripping that many). The code to count components and convert the negative value to a positive was copied from lstrip to rstrip in 1a34728 (ref-filter: add an 'rstrip=<N>' option to atoms which deal with refnames, 2017-01-10). Let's factor it out into a separate function. This reduces duplication and also makes the lstrip/rstrip functions much easier to follow, since the bulk of their code is now the actual stripping. Note that the computed "remaining" value is currently stored as a "long", so in theory that's what our function should return. But this is purely historical. When the variable was added in 0571979 (tag: do not show ambiguous tag names as "tags/foo", 2016-01-25), we parsed the value from strtol(), and thus used a long. But these days we take "len" as an int, and also use an int to count up components. So let's just consistently use int here. This value could only overflow in a pathological case (e.g., 4GB worth of "a/a/...") and even then will not result in out-of-bounds memory access (we keep stripping until we run out of string to parse). The minimal Myers diff here is a little hard to read; with --patience the code movement is shown much more clearly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're walking forward in the string, skipping path components from left-to-right. So when we've stripped as much as we want, the pointer we have is a complete NUL-terminated string and we can just return it (after duplicating it, of course). So there is no need for a temporary allocated string. But we do make an extra temporary copy due to f0062d3 (ref-filter: free item->value and item->value->s, 2018-10-18). This is probably from cargo-culting the technique used in rstrip_ref_components(), which _does_ need a separate string (since it is stripping from the end and ties off the temporary string with a NUL). Let's drop the extra allocation. This is slightly more efficient, but more importantly makes the code much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're stripping path components from the end of a string, which we do by
assigning a NUL as we parse each component, shortening the string. This
requires an extra temporary buffer to avoid munging our input string.
But the way that we allocate the buffer is unusual. We have an extra
"to_free" variable. Usually this is used when the access variable is
conceptually const, like:
const char *foo;
char *to_free = NULL;
if (...)
foo = to_free = xstrdup(...);
else
foo = some_const_string;
...
free(to_free);
But that's not what's happening here. Our "start" variable always points
to the allocated buffer, and to_free is redundant. Worse, it is marked
as const itself, requiring a cast when we free it.
Let's drop to_free entirely, and mark "start" as non-const, making the
memory handling more clear. As a bonus, this also silences a warning
from glibc-2.43 that our call to strrchr() implicitly strips away the
const-ness of "start".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To strip path components from our refname string, we repeatedly call
strrchr() to find the trailing slash, shortening the string each time by
assigning NUL over it. This has two downsides:
1. Calling strrchr() in a loop is quadratic, since each call has to
call strlen() under the hood to find the end of the string (even
though we know exactly where it is from the last loop iteration).
2. We need a temporary buffer, since we're munging the string with NUL
as we shorten it (which we must do, because strrchr() has no other
way of knowing what we consider the end of the string).
Using memrchr() would let us fix both of these, but it isn't portable.
So instead, let's just open-code the string traversal from back to
front as we loop.
I doubt that the quadratic nature is a serious concern. You can see it
in practice with something like:
git init
git commit --allow-empty -m foo
echo "$(git rev-parse HEAD) refs/heads$(perl -e 'print "/a" x 500_000')" >.git/packed-refs
time git for-each-ref --format='%(refname:rstrip=-1)'
That takes ~5.5s to run on my machine before this patch, and ~11ms
after. But I don't think there's a reasonable way for somebody to infect
you with such a garbage ref, as the wire protocol is limited to 64k
pkt-lines. The difference is measurable for me for a 32k-component ref
(about 19ms vs 7ms), so perhaps you could create some chaos by pushing a
lot of them. But we also run into filesystem limits (if the loose
backend is in use), and in practice it seems like there are probably
simpler and more effective ways to waste CPU.
Likewise the extra allocation probably isn't really measurable. In fact,
since our goal is to return an allocated string, we end up having to
make the same allocation anyway (though it is sized to the result,
rather than the input). My main goal was simplicity in avoiding the need
to handle cleaning it up in the early return path.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a strip option to the %(refname) placeholder is asked to leave N path components, we first count up the path components to know how many to remove. That happens with a loop like this: /* Find total no of '/' separated path-components */ for (i = 0; p[i]; p[i] == '/' ? i++ : *p++) ; which is a little hard to understand for two reasons. First, the dereference in "*p++" is seemingly useless, since nobody looks at the result. And static analyzers like Coverity will complain about that. But removing the "*" will cause gcc to complain with -Wint-conversion, since the two sides of the ternary do not match (one is a pointer and the other an int). Second, it is not clear what the meaning of "p" is at each iteration of the loop, as its position with respect to our walk over the string depends on how many slashes we've seen. The answer is that by itself, it doesn't really mean anything: "p + i" represents the current state of our walk, with "i" counting up slashes, and "p" by itself essentially meaningless. None of this behaves incorrectly, but ultimately the loop is just counting the slashes in the refname. We can do that much more simply with a for-loop iterating over the string and a separate slash counter. We can also drop the comment, which is somewhat misleading. We are counting slashes, not components (and a comment later in the function makes it clear that we must add one to compensate). In the new code it is obvious that we are counting slashes here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building with glibc-2.43 there is the following warning:
bloom.c: In function ‘get_or_compute_bloom_filter’:
bloom.c:515:52: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
515 | char *last_slash = strrchr(path, '/');
| ^~~~~~~
In this case, we always write through "path" through the "last_slash"
pointer. Therefore, the const qualifier on "path" is misleading and we
can just remove it.
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When building with glibc-2.43 there is the following warning:
dir.c:3526:15: warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
3526 | slash = strrchr(name, '/');
| ^
In this case we use a non-const pointer to get the last slash of the
unwritable file name, and then use it again to write in the strdup'd
file name.
We can avoid this warning and make the code a bit more clear by using a
separate variable to access the original argument and its strdup'd
copy.
Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
C23 versions of libc (like recent glibc) may provide generic versions of strchr() that match constness between the input and return value. The idea being that the compiler can detect when it implicitly converts a const pointer into a non-const one (which then emits a warning). There are a few cases here where the result pointer does not need to be non-const at all, and we should mark it as such. That silences the warning (and avoids any potential problems with trying to write via those pointers). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 2b9665c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The "path" field of a "struct repo" (a custom http-push struct, not to be confused with "struct repository) is a pointer into a const argv string, and is never written to. The compiler does not traditionally complain about assigning from a const pointer because it happens via strchr(). But with some C23 libc versions (notably recent glibc), it has started to do so. Let's mark the field as const to silence the warnings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 2fb6a18) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We implicitly drop the const from our "key" variable when we do: char *p = strchr(key, ' '); which causes compilation with some C23 versions of libc (notably recent glibc) to complain. We need "p" to remain writable, since we assign NUL over the space we found. We can solve this by also making "key" writable. This works because it comes from a strbuf, which is itself a writable string. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit eedc7ec) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When we do: char *cp = strchr(argv[i], '='); it implicitly removes the constness from argv[i]. We need "cp" to remain writable (since we overwrite it with a NUL). In theory we should be able to drop the const from argv[i], because it is a sub-pointer into our duplicated pager_env variable. But we get it from split_cmdline(), which uses the traditional "const char **" type for argv. This is overly limiting, but changing it would be awkward for all the other callers of split_cmdline(). Let's do an explicit cast with a note about why it is OK. This is enough to silence compiler warnings about the implicit const problems. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 031d29d) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We do this: char *equals = strchr(*e, '='); which implicitly removes the constness from "*e" and cause the compiler to complain. We never write to "equals", but later assign it to a string_list util field, which is defined as non-const "void *". We have to cast somewhere, but doing so at the assignment to util is the least-bad place, since that is the source of the confusion. Sadly we are still open to accidentally writing to the string via the util pointer, but that is the cost of using void pointers, which lose all type information. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 21c57ef) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The find_last_dir_sep() function is implemented as an inline function which takes in a "const char *" and returns a "char *" via strrchr(). That means that just like strrchr(), it quietly removes the const from our pointer, which could lead to accidentally writing to the resulting string. But C23 versions of libc (including recent glibc) annotate strrchr() such that the compiler can detect when const is implicitly lost, and it now complains about the call in this inline function. We can't just switch the return type of the function to "const char *", though. Some callers really do want a non-const string to be returned (and are OK because they are feeding a non-const string into the function). The most general solution is for us to annotate find_last_dir_sep() in the same way that is done for strrchr(). But doing so relies on using C23 generics, which we do not otherwise require. Since this inline function is wrapping a single call to strrchr(), we can take a shortcut. If we implement it as a macro, then the original type information is still available to strrchr(), and it does the check for us. Note that this is just one implementation of find_last_dir_sep(). There is an alternate implementation in compat/win32/path-utils.h. It doesn't suffer from the same warning, as it does not use strrchr() and just casts away const explicitly. That's not ideal, and eventually we may want to conditionally teach it the same C23 generic trick that strrchr() uses. But it has been that way forever, and our goal here is just quieting new warnings, not improving const-checking. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit bc4fd55) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The goal of this commit was to fix a const warning when compiling
with new versions of glibc, but ended up untangling a much deeper
problem.
The find_pseudo_merge() function does a bsearch() on the "commits"
pointer of a pseudo_merge_map. This pointer ultimately comes from memory
mapped from the on-disk bitmap file, and is thus not writable.
The "commits" array is correctly marked const, but the result from
bsearch() is returned directly as a non-const pseudo_merge_commit
struct. Since new versions of glibc annotate bsearch() in a way that
detects the implicit loss of const, the compiler now warns.
My first instinct was that we should be returning a const struct. That
requires apply_pseudo_merges_for_commit() to mark its local pointer as
const. But that doesn't work! If the offset field has the high-bit set,
we look it up in the extended table via nth_pseudo_merge_ext(). And that
function then feeds our const struct to read_pseudo_merge_commit_at(),
which writes into it by byte-swapping from the on-disk mmap.
But I think this points to a larger problem with find_pseudo_merge(). It
is not just that the return value is missing const, but it is missing
that byte-swapping! And we know that byte-swapping is needed here,
because the comparator we use for bsearch() also calls our
read_pseudo_merge_commit_at() helper.
So I think the interface is all wrong here. We should not be returning a
pointer to a struct which was cast from on-disk data. We should be
filling in a caller-provided struct using the bytes we found,
byte-swapping the values.
That of course raises the dual question: how did this ever work, and
does it work now? The answer to the first part is: this code does not
seem to be triggered in the test suite at all. If we insert a BUG("foo")
call into apply_pseudo_merges_for_commit(), it never triggers.
So I think there is something wrong or missing from the test setup, and
this bears further investigation. Sadly the answer to the second part
("does it work now") is still "no idea". I _think_ this takes us in a
positive direction, but my goal here is mainly to quiet the compiler
warning. Further bug-hunting on this experimental feature can be done
separately.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
(cherry picked from commit 25e5ceb)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The skip_prefix() function takes in a "const char *" string, and returns
via a "const char **" out-parameter that points somewhere in that
string. This is fine if you are operating on a const string, like:
const char *in = ...;
const char *out;
if (skip_prefix(in, "foo", &out))
...look at out...
It is also OK if "in" is not const but "out" is, as we add an implicit
const when we pass "in" to the function. But there's another case where
this is limiting. If we want both fields to be non-const, like:
char *in = ...;
char *out;
if (skip_prefix(in, "foo", &out))
*out = '\0';
it doesn't work. The compiler will complain about the type mismatch in
passing "&out" to a parameter which expects "const char **". So to make
this work, we have to do an explicit cast.
But such a cast is ugly, and also means that we run afoul of making this
mistake:
const char *in = ...;
char *out;
if (skip_prefix(in, "foo", (const char **)&out))
*out = '\0';
which causes us to write to the memory pointed by "in", which was const.
We can imagine these four cases as:
(1) const in, const out
(2) non-const in, const out
(3) non-const in, non-const out
(4) const in, non-const out
Cases (1) and (2) work now. We would like case (3) to work but it
doesn't. But we would like to catch case (4) as a compile error.
So ideally the rule is "the out-parameter must be at least as const as
the in-parameter". We can do this with some macro trickery. We wrap
skip_prefix() in a macro so that it has access to the real types of
in/out. And then we pass those parameters through another macro which:
1. Fails if the "at least as const" rule is not filled.
2. Casts to match the signature of the real skip_prefix().
There are a lot of ways to implement the "fails" part. You can use
__builtin_types_compatible_p() to check, and then either our
BUILD_ASSERT macros or _Static_assert to fail. But that requires some
conditional compilation based on compiler feature. That's probably OK
(the fallback would be to just cast without catching case 4). But we can
do better.
The macro I have here uses a ternary with a dead branch that tries to
assign "in" to "out", which should work everywhere and lets the compiler
catch the problem in the usual way. With an input like this:
int foo(const char *x, const char **y);
#define foo(in,out) foo((in), CONST_OUTPARAM((in), (out)))
void ok_const(const char *x, const char **y)
{
foo(x, y);
}
void ok_nonconst(char *x, char **y)
{
foo(x, y);
}
void ok_add_const(char *x, const char **y)
{
foo(x, y);
}
void bad_drop_const(const char *x, char **y)
{
foo(x, y);
}
gcc reports:
foo.c: In function ‘bad_drop_const’:
foo.c:2:35: error: assignment discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers]
2 | ((const char **)(0 ? ((*(out) = (in)),(out)) : (out)))
| ^
foo.c:4:31: note: in expansion of macro ‘CONST_OUTPARAM’
4 | #define foo(in,out) foo((in), CONST_OUTPARAM((in), (out)))
| ^~~~~~~~~~~~~~
foo.c:23:9: note: in expansion of macro ‘foo’
23 | foo(x, y);
| ^~~
It's a bit verbose, but I think makes it reasonably clear what's going
on. Using BUILD_ASSERT_OR_ZERO() ends up much worse. Using
_Static_assert you can be a bit more informative, but that's not
something we use at all yet in our code-base (it's an old gnu-ism later
standardized in C11).
Our generic macro only works for "const char **", which is something we
could improve by using typeof(in). But that introduces more portability
questions, and also some weird corner cases (e.g., around implicit void
conversion).
This patch just introduces the concept. We'll make use of it in future
patches.
Note that we rename skip_prefix() to skip_prefix_impl() here, to avoid
expanding the macro when defining the function. That's not strictly
necessary since we could just define the macro after defining the inline
function. But that would not be the case for a non-inline function (and
we will apply this technique to them later, and should be consistent).
It also gives us more freedom about where to define the macro. I did so
right above the definition here, which I think keeps the relevant bits
together.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
(cherry picked from commit cefb8b7)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The "line" member of a packet_reader struct is marked as const. This kind of makes sense, because it's not its own allocated buffer that should be freed, and we often use const to indicate that. But it is always writable, because it points into the non-const "buffer" member. And we rely on this writability in places like send-pack and receive-pack, where we parse incoming packet contents by writing NULs over delimiters. This has traditionally worked because we implicitly cast away the constness with strchr() like: const char *head; char *p; head = reader->line; p = strchr(head, ' '); Since C23 libc provides a generic strchr() to detect this implicit const removal, this now generate a compiler warning on some platforms (like recent glibc). We can fix it by marking "line" as non-const, as well as a few intermediate variables (like "head" in the above example). Note that by itself, switching to a non-const variable would cause problems with this line in send-pack.c: if (!skip_prefix(reader->line, "unpack ", &reader->line)) But due to our skip_prefix() magic introduced in the previous commit, this compiles fine (both the in and out-parameters are non-const, so we know it is safe). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit d3cd819) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
This is another case where we implicitly drop the "const" from a pointer by feeding it to strstr() and assigning the result to a non-const pointer. This is OK in practice, since the const pointer originally comes from a writable source (a strbuf), but C23 libc implementations have started to complain about it. We do write to the output pointer, so it needs to remain non-const. We can just switch the input pointer to also be non-const in this case. By itself that would run into problems with calls to skip_prefix(), but since that function has now been taught to match in/out constness automatically, it just works without us doing anything further. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit c395126) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In redact_sensitive_header(), a C23 implementation of libc will complain that strstr() assigns the result from "const char *cookie" to "char *semicolon". Ultimately the memory is writable. We're fed a strbuf, generate a const pointer "sensitive_header" within it using skip_iprefix(), and then assign the result to "cookie". So we can solve this by dropping the const from "cookie" and "sensitive_header". However, this runs afoul of skip_iprefix(), which wants a "const char **" for its out-parameter. We can solve that by teaching skip_iprefix() the same "make sure out is at least as const as in" magic that we recently taught to skip_prefix(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 8a0566b) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
In show_one_reflog_ent(), we're fed a writable strbuf buffer, which we parse into the various reflog components. We write a NUL over email_end to tie off one of the fields, and thus email_end must be non-const. But with a C23 implementation of libc, strchr() will now complain when assigning the result to a non-const pointer from a const one. So we can fix this by making the source pointer non-const. But there's a catch. We derive that source pointer by parsing the line with parse_oid_hex_algop(), which requires a const pointer for its out-parameter. We can work around that by teaching it to use our CONST_OUTPARAM() trick, just like skip_prefix(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit f1b8a4d) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
There's a typo in the comment, making it hard to understand. And the macro itself is indented with spaces rather than tab. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 58589c2) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The "lstrip" and "rstrip" options to the %(refname) placeholder both accept a negative length, which asks us to keep that many path components (rather than stripping that many). The code to count components and convert the negative value to a positive was copied from lstrip to rstrip in 1a34728 (ref-filter: add an 'rstrip=<N>' option to atoms which deal with refnames, 2017-01-10). Let's factor it out into a separate function. This reduces duplication and also makes the lstrip/rstrip functions much easier to follow, since the bulk of their code is now the actual stripping. Note that the computed "remaining" value is currently stored as a "long", so in theory that's what our function should return. But this is purely historical. When the variable was added in 0571979 (tag: do not show ambiguous tag names as "tags/foo", 2016-01-25), we parsed the value from strtol(), and thus used a long. But these days we take "len" as an int, and also use an int to count up components. So let's just consistently use int here. This value could only overflow in a pathological case (e.g., 4GB worth of "a/a/...") and even then will not result in out-of-bounds memory access (we keep stripping until we run out of string to parse). The minimal Myers diff here is a little hard to read; with --patience the code movement is shown much more clearly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 5ec4c22) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We're walking forward in the string, skipping path components from left-to-right. So when we've stripped as much as we want, the pointer we have is a complete NUL-terminated string and we can just return it (after duplicating it, of course). So there is no need for a temporary allocated string. But we do make an extra temporary copy due to f0062d3 (ref-filter: free item->value and item->value->s, 2018-10-18). This is probably from cargo-culting the technique used in rstrip_ref_components(), which _does_ need a separate string (since it is stripping from the end and ties off the temporary string with a NUL). Let's drop the extra allocation. This is slightly more efficient, but more importantly makes the code much simpler. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 87cb6dc) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We're stripping path components from the end of a string, which we do by
assigning a NUL as we parse each component, shortening the string. This
requires an extra temporary buffer to avoid munging our input string.
But the way that we allocate the buffer is unusual. We have an extra
"to_free" variable. Usually this is used when the access variable is
conceptually const, like:
const char *foo;
char *to_free = NULL;
if (...)
foo = to_free = xstrdup(...);
else
foo = some_const_string;
...
free(to_free);
But that's not what's happening here. Our "start" variable always points
to the allocated buffer, and to_free is redundant. Worse, it is marked
as const itself, requiring a cast when we free it.
Let's drop to_free entirely, and mark "start" as non-const, making the
memory handling more clear. As a bonus, this also silences a warning
from glibc-2.43 that our call to strrchr() implicitly strips away the
const-ness of "start".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
(cherry picked from commit 2ec30e7)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
To strip path components from our refname string, we repeatedly call
strrchr() to find the trailing slash, shortening the string each time by
assigning NUL over it. This has two downsides:
1. Calling strrchr() in a loop is quadratic, since each call has to
call strlen() under the hood to find the end of the string (even
though we know exactly where it is from the last loop iteration).
2. We need a temporary buffer, since we're munging the string with NUL
as we shorten it (which we must do, because strrchr() has no other
way of knowing what we consider the end of the string).
Using memrchr() would let us fix both of these, but it isn't portable.
So instead, let's just open-code the string traversal from back to
front as we loop.
I doubt that the quadratic nature is a serious concern. You can see it
in practice with something like:
git init
git commit --allow-empty -m foo
echo "$(git rev-parse HEAD) refs/heads$(perl -e 'print "/a" x 500_000')" >.git/packed-refs
time git for-each-ref --format='%(refname:rstrip=-1)'
That takes ~5.5s to run on my machine before this patch, and ~11ms
after. But I don't think there's a reasonable way for somebody to infect
you with such a garbage ref, as the wire protocol is limited to 64k
pkt-lines. The difference is measurable for me for a 32k-component ref
(about 19ms vs 7ms), so perhaps you could create some chaos by pushing a
lot of them. But we also run into filesystem limits (if the loose
backend is in use), and in practice it seems like there are probably
simpler and more effective ways to waste CPU.
Likewise the extra allocation probably isn't really measurable. In fact,
since our goal is to return an allocated string, we end up having to
make the same allocation anyway (though it is sized to the result,
rather than the input). My main goal was simplicity in avoiding the need
to handle cleaning it up in the early return path.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
(cherry picked from commit fe732a8)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Accommodate for C23 enforcing `const`-ness of `strchr()`'s return value. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When a strip option to the %(refname) placeholder is asked to leave N path components, we first count up the path components to know how many to remove. That happens with a loop like this: /* Find total no of '/' separated path-components */ for (i = 0; p[i]; p[i] == '/' ? i++ : *p++) ; which is a little hard to understand for two reasons. First, the dereference in "*p++" is seemingly useless, since nobody looks at the result. And static analyzers like Coverity will complain about that. But removing the "*" will cause gcc to complain with -Wint-conversion, since the two sides of the ternary do not match (one is a pointer and the other an int). Second, it is not clear what the meaning of "p" is at each iteration of the loop, as its position with respect to our walk over the string depends on how many slashes we've seen. The answer is that by itself, it doesn't really mean anything: "p + i" represents the current state of our walk, with "i" counting up slashes, and "p" by itself essentially meaningless. None of this behaves incorrectly, but ultimately the loop is just counting the slashes in the refname. We can do that much more simply with a for-loop iterating over the string and a separate slash counter. We can also drop the comment, which is somewhat misleading. We are counting slashes, not components (and a comment later in the function makes it clear that we must add one to compensate). In the new code it is obvious that we are counting slashes here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 8b0061b) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
There are two very subtle bits to the way we parse ".." (and "...")
range operators:
1. In handle_dotdot_1(), we assume that the incoming arguments "dotdot"
and "arg" are part of the same string, with the first digit of the
range-operator blanked to a NUL. Then when we want the full name
(e.g., to report an error), we replace the NUL with a dot to restore
the original string.
2. In handle_dotdot(), we take in a const string, but then we modify it
by overwriting the range operator with a NUL. This has worked OK in
practice since we tend to pass in buffers that are actually
writeable (including argv), but segfaults with something like:
handle_revision_arg("..HEAD", &revs, 0, 0);
On top of that, building with recent versions of glibc causes the
compiler to complain, because it notices when we use strchr() or
strstr() to launder away constness (basically detecting the
possibility of the segfault above via the type system).
Instead of munging the buffer, let's instead make a temporary copy of
the left-hand side of the range operator. That avoids any const
violations, and lets us pass around the parsed elements independently:
the left-hand side, the right-hand side, the number of dots (via the
"symmetric" flag), and the original full string for error messages.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
(cherry picked from commit 4d5fb93)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The previous commit simplified the way that revision.c parses ".." and "..." range operators. But there's roughly similar code in rev-parse. This is less likely to trigger a segfault, as there is no library function which we'd pass a string literal to, but it still causes the compiler to complain about laundering away constness via strstr(). Let's give it the same treatment, copying the left-hand side of the range operator into its own string. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 268a9ca) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We take in a "const char *", but may write a NUL into it when parsing
parent marks like "foo^-", since we want to isolate "foo" as a string
for further parsing. This is usually OK, as our "const" strings are
often actually argv strings which are technically writeable, but we'd
segfault with a string literal like:
handle_revision_arg("HEAD^-", &revs, 0, 0);
Similar to how we handled dotdot in a previous commit, we can avoid this
by making a temporary copy of the left-hand side of the string. The cost
should negligible compared to the rest of the parsing (like actually
parsing commits to create their parent linked-lists).
There is one slightly tricky thing, though. We parse some of the marks
progressively, so that if we see "foo^!" for example, we'll strip that
down to "foo" not just for calling add_parents_only(), but also for the
rest of the function. That makes sense since we eventually want to pass
"foo" to get_oid_with_context(). But it also means that we'll keep
looking for other marks. In particular, "foo^-^!" is valid, though oddly
"foo^!^-" would ignore the "^-". I'm not sure if this is a weird
historical artifact of the implementation, or if there are important
corner cases.
So I've left the behavior unchanged. Each mark we find allocates a
string with the mark stripped, which means we could allocate multiple
times (and carry a free-able pointer for each to the end). But in
practice we won't, because of the three marks, "^@" jumps immediately to
the end without further parsing, and "^-^!" is nonsense that nobody
would pass. So you'd get one allocation in general, and never more than
two.
Another obvious option would be to just copy "arg" up front and be OK
with munging it. But that means we pay the cost even when we find no
marks. We could make a single copy upon finding a mark and then munge,
but that adds extra code to each site (checking whether somebody else
allocated, and if not, adjusting our "mark" pointer to be relative to
the copied string).
I aimed for something that was clear and obvious, if a bit verbose.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
(cherry picked from commit 22b985e)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
The previous commit cleared up some const confusion in handling parent marks in revision.c, but we have roughly the same code duplicated in rev-parse. This one is much easier to fix, because the handling of the shortened string is all done in one place, after detecting any marks (but without shortening the string between marks). As a side note, I suspect this means that it behaves differently than the revision.c parser for weird stuff like "foo^!^@^-", but that is outside the scope of this patch. While we are here, let's also rename the variable "dotdot", which is totally misleading (and which we already fixed in revision.c long ago via f632ded (handle_revision_arg: stop using "dotdot" as a generic pointer, 2017-05-19)). Doing that here makes the diff a little messier, but it also lets the compiler help us make sure we did not miss any stray mentions of the variable while we are changing its semantics. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit 213b213) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When git-config matches a url, we copy the variable section name and store it in the "section" member of a urlmatch_config struct. That member is const, since the url-matcher will not touch it (and other callers really will have a const string). But that means that we have only a const pointer to our allocated string. We have to cast away the constness when we free it, and likewise when we assign NUL to tie off the "." separating the subsection and key. This latter happens implicitly via a strchr() call, but recent versions of glibc have added annotations that let the compiler detect that and complain. Let's keep our own "section" pointer for the non-const string, and then just point config.section at it. That avoids all of the casting, both explicit and implicit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> (cherry picked from commit d385845) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We need to accommodate for C23 being very strict about the `const`-ness of the return value of the `strchr()` function. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Small code clean-up around the constness area. * cf/constness-fixes: dir: avoid -Wdiscarded-qualifiers in remove_path() bloom: remove a misleading const qualifier This backports 651847f (Merge branch 'cf/constness-fixes', 2026-03-23). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
ISO C23 redefines strchr and friends that tradiotionally took a const pointer and returned a non-const pointer derived from it to preserve constness (i.e., if you ask for a substring in a const string, you get a const pointer to the substring). Update code paths that used non-const pointer to receive their results that did not have to be non-const to adjust. * cf/c23-const-preserving-strchr-updates-0: gpg-interface: remove an unnecessary NULL initialization global: constify some pointers that are not written to This backports 7855eff (Merge branch 'cf/c23-const-preserving-strchr-updates-0', 2026-02-13). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Further work to adjust the codebase for C23 that changes functions like strchr() that discarded constness when they return a pointer into a const string to preserve constness. * jk/c23-const-preserving-fixes-more: git-compat-util: fix CONST_OUTPARAM typo and indentation refs/files-backend: drop const to fix strchr() warning http: drop const to fix strstr() warning range-diff: drop const to fix strstr() warnings pkt-line: make packet_reader.line non-const skip_prefix(): check const match between in and out params pseudo-merge: fix disk reads from find_pseudo_merge() find_last_dir_sep(): convert inline function to macro run-command: explicitly cast away constness when assigning to void pager: explicitly cast away strchr() constness transport-helper: drop const to fix strchr() warnings http: add const to fix strchr() warnings convert: add const to fix strchr() warnings This backports 3eabc35 (Merge branch 'jk/c23-const-preserving-fixes-more', 2026-04-09). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
UI improvements for "git history reword". * ps/history-ergonomics-updates: Documentation/git-history: document default for "--update-refs=" builtin/history: rename "--ref-action=" to "--update-refs=" builtin/history: replace "--ref-action=print" with "--dry-run" builtin/history: check for merges before asking for user input builtin/history: perform revwalk checks before asking for user input This backports ce4530a (Merge branch 'jk/ref-filter-lrstrip-optim', 2026-02-27). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Code clean-up. * jk/ref-filter-lrstrip-optim: ref-filter: clarify lstrip/rstrip component counting ref-filter: avoid strrchr() in rstrip_ref_components() ref-filter: simplify rstrip_ref_components() memory handling ref-filter: simplify lstrip_ref_components() memory handling ref-filter: factor out refname component counting This backports fbd0428 (Merge branch 'jk/c23-const-preserving-fixes', 2026-04-06). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Adjust the codebase for C23 that changes functions like strchr() that discarded constness when they return a pointer into a const string to preserve constness. * jk/c23-const-preserving-fixes: config: store allocated string in non-const pointer rev-parse: avoid writing to const string for parent marks revision: avoid writing to const string for parent marks rev-parse: simplify dotdot parsing revision: make handle_dotdot() interface less confusing This backports fbd0428 (Merge branch 'jk/c23-const-preserving-fixes', 2026-04-06). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
mjcheetham
approved these changes
Apr 28, 2026
Merged
dscho
added a commit
that referenced
this pull request
Apr 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
I noticed today that the
fedora-breaking-changesjob started failing:This is new, I had not seen this before. The reason that this did not fail earlier is that this CI job used to use Fedora 43, where it succeeded, whereas now it uses Fedora 44, where it fails.
Lucky for us, upstream Git already has fixes for this issue & related ones, in the
cf/constness-fixesbranch, thecf/c23-const-preserving-strchr-updates-0branch, thejk/c23-const-preserving-fixes-morebranch, thejk/c23-const-preserving-fixesbranch, thejk/ref-filter-lrstrip-optimbranch and theps/history-ergonomics-updatesbranch, which I hereby backport to thevfs-2.53.0branch.