
Use ^=1 to toggle between 0 and 1 #1620

Open
wants to merge 1 commit into
base: master

@AtariDreams AtariDreams commented Dec 12, 2023

If it is known that an int is either 1 or 0, doing an exclusive or to switch instead of a modulus makes more sense and is more efficient.

Signed-off-by: Seija Kijin doremylover123@gmail.com
cc: Dragan Simic dsimic@manjaro.org
cc: Jeff King peff@peff.net
cc: René Scharfe l.s.r@web.de
cc: Phillip Wood phillip.wood123@gmail.com

Submitted as pull.1620.git.git.1702401468082.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1620/AtariDreams/buffer-v1

To fetch this version to local tag pr-git-1620/AtariDreams/buffer-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1620/AtariDreams/buffer-v1

On the Git mailing list, Dragan Simic wrote (reply to this):

On 2023-12-12 18:17, AtariDreams via GitGitGadget wrote:
> From: Seija Kijin <doremylover123@gmail.com>
>
> If it is known that an int is either 1 or 0,
> doing an exclusive or to switch instead of a
> modulus makes more sense and is more efficient.

Quite frankly, this doesn't seem like an improvement to me.  It makes the code much less readable, more error-prone, and may even end up producing code that isn't portable.

Regarding the efficiency, such optimizations may be perfectly fine as a trade-off in some critical paths, but these cases don't seem like that.

> Signed-off-by: Seija Kijin doremylover123@gmail.com
> ---
>     Use ^=1 to toggle between 0 and 1
>
>     If it is known that an int is either 1 or 0, doing an exclusive or to
>     switch instead of a modulus makes more sense and is more efficient.
>
>     Signed-off-by: Seija Kijin doremylover123@gmail.com
>
> Published-As:
> https://github.com/gitgitgadget/git/releases/tag/pr-git-1620%2FAtariDreams%2Fbuffer-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git
> pr-git-1620/AtariDreams/buffer-v1
> Pull-Request: https://github.com/git/git/pull/1620
>
>  builtin/fast-export.c      | 4 ++--
>  diff.c                     | 2 +-
>  ident.c                    | 2 +-
>  t/helper/test-path-utils.c | 2 +-
>  4 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index 70aff515acb..f9f2c9dd850 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -593,8 +593,8 @@ static void anonymize_ident_line(const char **beg, const char **end)
>  	struct ident_split split;
>  	const char *end_of_header;
>
> -	out = &buffers[which_buffer++];
> -	which_buffer %= ARRAY_SIZE(buffers);
> +	out = &buffers[which_buffer];
> +	which_buffer ^= 1;
>  	strbuf_reset(out);
>
>  	/* skip "committer", "author", "tagger", etc */
> diff --git a/diff.c b/diff.c
> index 2c602df10a3..91842b54753 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1191,7 +1191,7 @@ static void mark_color_as_moved(struct diff_options *o,
>  							    &pmb_nr);
>
>  			if (contiguous && pmb_nr && moved_symbol == l->s)
> -				flipped_block = (flipped_block + 1) % 2;
> +				flipped_block ^= 1;
>  			else
>  				flipped_block = 0;
>
> diff --git a/ident.c b/ident.c
> index cc7afdbf819..188826eed63 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -459,7 +459,7 @@ const char *fmt_ident(const char *name, const char *email,
>  	int want_name = !(flag & IDENT_NO_NAME);
>
>  	struct strbuf *ident = &ident_pool[index];
> -	index = (index + 1) % ARRAY_SIZE(ident_pool);
> +	index ^= 1;
>
>  	if (!email) {
>  		if (whose_ident == WANT_AUTHOR_IDENT && git_author_email.len)
> diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
> index 70396fa3845..241136148a5 100644
> --- a/t/helper/test-path-utils.c
> +++ b/t/helper/test-path-utils.c
> @@ -185,7 +185,7 @@ static int check_dotfile(const char *x, const char **argv,
>  	int res = 0, expect = 1;
>  	for (; *argv; argv++) {
>  		if (!strcmp("--not", *argv))
> -			expect = !expect;
> +			expect ^= 1;
>  		else if (expect != (is_hfs(*argv) || is_ntfs(*argv)))
>  			res = error("'%s' is %s.git%s", *argv,
>  				    expect ? "not " : "", x);
>
> base-commit: 1a87c842ece327d03d08096395969aca5e0a6996

User Dragan Simic <dsimic@manjaro.org> has been added to the cc: list.

On the Git mailing list, Jeff King wrote (reply to this):

On Tue, Dec 12, 2023 at 05:17:47PM +0000, AtariDreams via GitGitGadget wrote:

> diff --git a/builtin/fast-export.c b/builtin/fast-export.c
> index 70aff515acb..f9f2c9dd850 100644
> --- a/builtin/fast-export.c
> +++ b/builtin/fast-export.c
> @@ -593,8 +593,8 @@ static void anonymize_ident_line(const char **beg, const char **end)
>  	struct ident_split split;
>  	const char *end_of_header;
>  
> -	out = &buffers[which_buffer++];
> -	which_buffer %= ARRAY_SIZE(buffers);
> +	out = &buffers[which_buffer];
> +	which_buffer ^= 1;

In the current code, if the size of "buffers" is increased then
everything would just work. But your proposed code (rather subtly) makes
the assumption that ARRAY_SIZE(buffers) is 2.

So even leaving aside questions of readability, I think the existing
code is much more maintainable.
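
To make the maintainability point concrete, here is a minimal sketch (not code from the patch). It assumes a hypothetical three-element `buffers` array standing in for the strbuf pool, a simplified `ARRAY_SIZE` macro rather than Git's real one, and illustrative helper names `next_mod`, `next_xor` and `slots_visited`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for Git's ARRAY_SIZE macro. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical pool that has been grown from two slots to three. */
static int buffers[3];

/* Advance with modulus: cycles through every slot, whatever the size. */
static size_t next_mod(size_t i)
{
	return (i + 1) % ARRAY_SIZE(buffers);
}

/* Advance with XOR: can only alternate between i and i ^ 1. */
static size_t next_xor(size_t i)
{
	return i ^ 1;
}

/* Count how many distinct slots are reached in n steps, starting at slot 0. */
static size_t slots_visited(size_t (*next)(size_t), size_t n)
{
	unsigned char seen[ARRAY_SIZE(buffers)] = { 0 };
	size_t i = 0, count = 0;

	while (n--) {
		assert(i < ARRAY_SIZE(buffers));
		if (!seen[i]) {
			seen[i] = 1;
			count++;
		}
		i = next(i);
	}
	return count;
}
```

With three slots, `slots_visited(next_mod, 6)` reaches all of them, while `slots_visited(next_xor, 6)` reaches only two, so the added buffer would silently go unused.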

> diff --git a/diff.c b/diff.c
> index 2c602df10a3..91842b54753 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -1191,7 +1191,7 @@ static void mark_color_as_moved(struct diff_options *o,
>  							    &pmb_nr);
>  
>  			if (contiguous && pmb_nr && moved_symbol == l->s)
> -				flipped_block = (flipped_block + 1) % 2;
> +				flipped_block ^= 1;
>  			else
>  				flipped_block = 0;

This one I do not see any problem with changing, though I think it is a
matter of opinion on which is more readable (I actually tend to think of
"x = 0 - x" as idiomatic for flipping).

> diff --git a/ident.c b/ident.c
> index cc7afdbf819..188826eed63 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -459,7 +459,7 @@ const char *fmt_ident(const char *name, const char *email,
>  	int want_name = !(flag & IDENT_NO_NAME);
>  
>  	struct strbuf *ident = &ident_pool[index];
> -	index = (index + 1) % ARRAY_SIZE(ident_pool);
> +	index ^= 1;
>  
>  	if (!email) {
>  		if (whose_ident == WANT_AUTHOR_IDENT && git_author_email.len)

This has the same problem as the first case.

> diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c
> index 70396fa3845..241136148a5 100644
> --- a/t/helper/test-path-utils.c
> +++ b/t/helper/test-path-utils.c
> @@ -185,7 +185,7 @@ static int check_dotfile(const char *x, const char **argv,
>  	int res = 0, expect = 1;
>  	for (; *argv; argv++) {
>  		if (!strcmp("--not", *argv))
> -			expect = !expect;
> +			expect ^= 1;

This one is not wrong, but IMHO it is more clear to express negation of
a boolean using "!" (i.e., what the code is already doing).


So of the four hunks, only the second one seems like a possible
improvement, and even there I am not sure the readability is better.

-Peff

User Jeff King <peff@peff.net> has been added to the cc: list.

On the Git mailing list, René Scharfe wrote (reply to this):

Am 12.12.23 um 21:09 schrieb Jeff King:
> On Tue, Dec 12, 2023 at 05:17:47PM +0000, AtariDreams via GitGitGadget wrote:
>
>> diff --git a/diff.c b/diff.c
>> index 2c602df10a3..91842b54753 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -1191,7 +1191,7 @@ static void mark_color_as_moved(struct diff_options *o,
>>  							    &pmb_nr);
>>
>>  			if (contiguous && pmb_nr && moved_symbol == l->s)
>> -				flipped_block = (flipped_block + 1) % 2;
>> +				flipped_block ^= 1;
>>  			else
>>  				flipped_block = 0;
>
> This one I do not see any problem with changing, though I think it is a
> matter of opinion on which is more readable (I actually tend to think of
> "x = 0 - x" as idiomatic for flipping).

Did you mean "x = 1 - x"?

    x   0 - x   1 - x
   --   -----   -----
   -1      +1      +2
    0       0      +1
   +1      -1       0

I don't particularly like this; it repeats x and seems error-prone. ;-)
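
As a rough illustration of the candidates discussed so far, the sketch below puts the four ways of flipping a flag side by side. The function names and `check_flips()` are hypothetical helpers written for this comparison, not code from Git.

```c
#include <assert.h>

/* Four ways to flip a flag; they all agree when x is 0 or 1. */
static int flip_xor(int x) { return x ^ 1; }
static int flip_mod(int x) { return (x + 1) % 2; }
static int flip_sub(int x) { return 1 - x; }
static int flip_not(int x) { return !x; }

/* Self-check: identical on {0, 1}, different once x holds anything else. */
static void check_flips(void)
{
	int x;

	for (x = 0; x <= 1; x++) {
		assert(flip_xor(x) == !x);
		assert(flip_mod(x) == !x);
		assert(flip_sub(x) == !x);
		assert(flip_not(x) == !x);
	}

	/* With x == 2 the four forms no longer agree. */
	assert(flip_xor(2) == 3);
	assert(flip_mod(2) == 1);
	assert(flip_sub(2) == -1);
	assert(flip_not(2) == 0);
}
```

Only the `!` form treats every non-zero input as "true", which matches how the value is later tested in a condition.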

I agree with your assessment of the other three cases in the patch.

Can we salvage something from this bikeshedding exercise?  I wonder if
it's time to use the C99 type _Bool in our code.  It would allow
documenting that only two possible values exist in cases like the one
above.  That would be even more useful for function returns, I assume.

René

User René Scharfe <l.s.r@web.de> has been added to the cc: list.

On the Git mailing list, Jeff King wrote (reply to this):

On Tue, Dec 12, 2023 at 11:30:03PM +0100, René Scharfe wrote:

> Am 12.12.23 um 21:09 schrieb Jeff King:
> > On Tue, Dec 12, 2023 at 05:17:47PM +0000, AtariDreams via GitGitGadget wrote:
> >
> >> diff --git a/diff.c b/diff.c
> >> index 2c602df10a3..91842b54753 100644
> >> --- a/diff.c
> >> +++ b/diff.c
> >> @@ -1191,7 +1191,7 @@ static void mark_color_as_moved(struct diff_options *o,
> >>  							    &pmb_nr);
> >>
> >>  			if (contiguous && pmb_nr && moved_symbol == l->s)
> >> -				flipped_block = (flipped_block + 1) % 2;
> >> +				flipped_block ^= 1;
> >>  			else
> >>  				flipped_block = 0;
> >
> > This one I do not see any problem with changing, though I think it is a
> > matter of opinion on which is more readable (I actually tend to think of
> > "x = 0 - x" as idiomatic for flipping).
> 
> Did you mean "x = 1 - x"?

Oops, yes, of course. I'm not sure how I managed to fumble that.

> I don't particularly like this; it repeats x and seems error-prone. ;-)

Yes. :)

Without digging into the code, I had just assumed that flipped_block was
used as an array index. But it really is a boolean, so I actually think
"flipped_block = !flipped_block" would probably be the most clear (but
IMHO not really worth the churn).

> Can we salvage something from this bikeshedding exercise?  I wonder if
> it's time to use the C99 type _Bool in our code.  It would allow
> documenting that only two possible values exist in cases like the one
> above.  That would be even more useful for function returns, I assume.

Hmm, possibly. I guess that might have helped my confusion, and I do
think returning bool for function returns would help make their meaning
more clear (it would help distinguish them from the usual "0 for
success" return values).

I don't even know that we'd need much of a weather-balloon patch. I
think it would be valid to do:

  #ifndef bool
  #define bool int

to handle pre-C99 compilers (if there even are any these days). Of
course we probably need some conditional magic to try to "#include
<stdbool.h>" for the actual C99. I guess we could assume C99 by default
and then add NO_STDBOOL as an escape hatch if anybody complains.
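
One way the "conditional magic" could look is sketched below. This is only an assumption about the shape of such a fallback: `NO_STDBOOL` is the knob name floated in this message rather than an existing build option, and `starts_with_dash()` is a hypothetical helper used to show a yes/no return value.

```c
/*
 * Sketch only: assume C99 by default, and fall back to int when the
 * (hypothetical) NO_STDBOOL knob is defined by the build.
 */
#ifndef NO_STDBOOL
#include <stdbool.h>
#else
#ifndef bool
#define bool int
#endif
#ifndef true
#define true 1
#define false 0
#endif
#endif

/* Hypothetical helper whose result is a yes/no answer, not an error code. */
static bool starts_with_dash(const char *s)
{
	return s[0] == '-';
}
```

The int fallback compiles, but it does not convert non-zero values to 1 the way a real `bool` does.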

-Peff

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff King <peff@peff.net> writes:

> Without digging into the code, I had just assumed that flipped_block was
> used as an array index. But it really is a boolean, so I actually think
> "flipped_block = !flipped_block" would probably be the most clear (but
> IMHO not really worth the churn).

;-)

> I don't even know that we'd need much of a weather-balloon patch. I
> think it would be valid to do:
>
>   #ifndef bool
>   #define bool int
>
> to handle pre-C99 compilers (if there even are any these days). Of
> course we probably need some conditional magic to try to "#include
> <stdbool.h>" for the actual C99. I guess we could assume C99 by default
> and then add NO_STDBOOL as an escape hatch if anybody complains.

Sounds good.

On the Git mailing list, René Scharfe wrote (reply to this):

Am 13.12.23 um 09:01 schrieb Jeff King:
> On Tue, Dec 12, 2023 at 11:30:03PM +0100, René Scharfe wrote:
>
>> I wonder if
>> it's time to use the C99 type _Bool in our code.  It would allow
>> documenting that only two possible values exist in cases like the one
>> above.  That would be even more useful for function returns, I assume.

> I don't even know that we'd need much of a weather-balloon patch. I
> think it would be valid to do:
>
>   #ifndef bool
>   #define bool int
>
> to handle pre-C99 compilers (if there even are any these days). Of
> course we probably need some conditional magic to try to "#include
> <stdbool.h>" for the actual C99. I guess we could assume C99 by default
> and then add NO_STDBOOL as an escape hatch if anybody complains.

The semantics are slightly different in edge cases, so that fallback
would not be fully watertight.  E.g. consider:

   bool b(bool cond) {return cond == true;}
   bool b2(void) {return b(2);}

b() returns false if you give it false and true for anything else. b2()
returns true.

With int as the fallback this becomes:

   int b(int cond) {return cond == 1;}
   int b2(void) {return b(2);}

Now only 1 is recognized as true, b2() returns 0 (false).

A coding rule to not compare bools could mitigate that.  Or a rule to
only use the values true and false in bool context and to only use
logical operators on them.

René

On the Git mailing list, Jeff King wrote (reply to this):

On Thu, Dec 14, 2023 at 02:08:31PM +0100, René Scharfe wrote:

> > I don't even know that we'd need much of a weather-balloon patch. I
> > think it would be valid to do:
> >
> >   #ifndef bool
> >   #define bool int
> >
> > to handle pre-C99 compilers (if there even are any these days). Of
> > course we probably need some conditional magic to try to "#include
> > <stdbool.h>" for the actual C99. I guess we could assume C99 by default
> > and then add NO_STDBOOL as an escape hatch if anybody complains.
> 
> The semantics are slightly different in edge cases, so that fallback
> would not be fully watertight.  E.g. consider:
> 
>    bool b(bool cond) {return cond == true;}
>    bool b2(void) {return b(2);}

Yeah. b2() is wrong for passing "2" to a bool. I assumed that the
compiler would warn of that (at least for people on modern C99
compilers, not the fallback code), but it doesn't seem to. It's been a
long time since I've worked on a code base that made use of "bool", but I
guess the idea is that silently coercing a non-zero int to a bool is
reasonable in many cases (e.g., "bool found_foo = count_foos()").

I guess one could argue that b() is also sub-optimal, as it should just
say "return cond" or "return !cond" rather than explicitly comparing to
true/false. But I won't be surprised if it happens from time to time.

> A coding rule to not compare bools could mitigate that.  Or a rule to
> only use the values true and false in bool context and to only use
> logical operators on them.

That seems more complex than we want if our goal is just supporting
legacy systems that may or may not even exist. Given your example, I'd
be more inclined to just do a weather-balloon adding <stdbool.h> to
git-compat-util.h, and using "bool" in a single spot in the code. If
nobody screams after a few releases, we can consider it OK. If they do,
it's a trivial patch to convert back.

-Peff

On the Git mailing list, Phillip Wood wrote (reply to this):

On 14/12/2023 22:05, Jeff King wrote:
> On Thu, Dec 14, 2023 at 02:08:31PM +0100, René Scharfe wrote:
>
>>> I don't even know that we'd need much of a weather-balloon patch. I
>>> think it would be valid to do:
>>>
>>>    #ifndef bool
>>>    #define bool int
>>>
>>> to handle pre-C99 compilers (if there even are any these days). Of
>>> course we probably need some conditional magic to try to "#include
>>> <stdbool.h>" for the actual C99. I guess we could assume C99 by default
>>> and then add NO_STDBOOL as an escape hatch if anybody complains.
>>
>> The semantics are slightly different in edge cases, so that fallback
>> would not be fully watertight.  E.g. consider:
>>
>>     bool b(bool cond) {return cond == true;}
>>     bool b2(void) {return b(2);}

Thanks for bringing this up, René. I had similar concerns when I saw the suggestion of using "int" as a fallback.

> Yeah. b2() is wrong for passing "2" to a bool.

I think it depends on what you mean by "wrong": §6.3.1.2 of the standard is quite clear that when any non-zero scalar value is converted to _Bool the result is "1".
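
A small sketch of that conversion rule, using hypothetical helper names `via_bool` and `via_int`: converting to `bool` tests the value against zero, whereas converting to `int` truncates.

```c
#include <stdbool.h>

/* Conversion to bool compares against zero, so any non-zero input gives 1. */
static int via_bool(double v)
{
	bool b = v;
	return b;
}

/* Conversion to int truncates toward zero instead, so 0.5 becomes 0. */
static int via_int(double v)
{
	int i = (int)v;
	return i;
}
```

So `via_bool(0.5)` and `via_bool(2.0)` both yield 1, while `via_int(0.5)` yields 0, which is one reason an int-based fallback cannot fully imitate `_Bool`.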

> I assumed that the
> compiler would warn of that (at least for people on modern C99
> compilers, not the fallback code), but it doesn't seem to. It's been a
> long time since I've worked on a code base that made use of "bool", but I
> guess the idea is that silently coercing a non-zero int to a bool is
> reasonable in many cases (e.g., "bool found_foo = count_foos()").

I guess it is also consistent with the way "if" and "while" consider a non-zero scalar value to be "true".

> I guess one could argue that b() is also sub-optimal, as it should just
> say "return cond" or "return !cond" rather than explicitly comparing to
> true/false. But I won't be surprised if it happens from time to time.

Even if it is unlikely that we would directly compare a boolean variable to "true" or "false", it is certainly conceivable that we'd compare two boolean variables directly. For the integer fallback to be safe we'd need to write

	if (!cond_a == !cond_b)

rather than

	if (cond_a == cond_b)
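
The difference can be sketched with three hypothetical helpers (the names are illustrative only): a direct comparison of ints, the normalized `!a == !b` form, and a direct comparison of real `bool` values.

```c
#include <stdbool.h>

/* Direct comparison of int "booleans": 1 and 2 are both truthy but unequal. */
static int naive_equal_int(int a, int b)
{
	return a == b;
}

/* Normalizing with "!" first compares truthiness rather than raw values. */
static int same_truth_int(int a, int b)
{
	return !a == !b;
}

/* Real bool arguments are already normalized to 0 or 1 on conversion. */
static int naive_equal_bool(bool a, bool b)
{
	return a == b;
}
```

Called with the values 1 and 2, `naive_equal_int` returns 0, while `same_truth_int` and `naive_equal_bool` both return 1.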

>> A coding rule to not compare bools could mitigate that.  Or a rule to
>> only use the values true and false in bool context and to only use
>> logical operators on them.
>
> That seems more complex than we want if our goal is just supporting
> legacy systems that may or may not even exist. Given your example, I'd
> be more inclined to just do a weather-balloon adding <stdbool.h> to
> git-compat-util.h, and using "bool" in a single spot in the code. If
> nobody screams after a few releases, we can consider it OK. If they do,
> it's a trivial patch to convert back.

A weather-balloon seems like the safest route forward. We have been requiring C99 for two years now [1]; hopefully there aren't any compilers out there that claim to support C99 but don't provide "<stdbool.h>" (I did check online and the compiler on NonStop does support _Bool).

Best Wishes

Phillip

[1] 7bc341e21b (git-compat-util: add a test balloon for C99 support, 2021-12-01)

User Phillip Wood <phillip.wood123@gmail.com> has been added to the cc: list.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Phillip Wood <phillip.wood123@gmail.com> writes:

> Even if it is unlikely that we would directly compare a boolean variable
> to "true" or "false" it is certainly conceivable that we'd compare two
> boolean variables directly. For the integer fallback to be safe we'd
> need to write
>
> 	if (!cond_a == !cond_b)
>
> rather than
>
> 	if (cond_a == cond_b)

Eek, it defeats the benefit of using true Boolean type if we had to
train ourselves to write the former, doesn't it?

> A weather-balloon seems like the safest route forward. We have been
> requiring C99 for two years now [1], hopefully there aren't any
> compilers out that claim to support C99 but don't provide
> "<stdbool.h>" (I did check online and the compiler on NonStop does
> support _Bool).
>
> Best Wishes
>
> Phillip
>
> [1] 7bc341e21b (git-compat-util: add a test balloon for C99 support,
> 2021-12-01)

Nice to be reminded of this one.

The cited commit does not start to use any specific feature from
C99, other than that we now require that the compiler claims C99
conformance by __STDC_VERSION__ set appropriately.  The commit log
message says C99 "provides a variety of useful features, including
..., many of which we already use.", which implies that our wish was
to officially allow any and all features in C99 to be used in our
codebase after a successful flight of this test balloon.
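
For readers unfamiliar with this kind of test balloon, a simplified and hypothetical version of a "compiler claims C99" check is sketched below. The actual wording and conditions in git-compat-util.h may differ, and `claimed_c_version()` is an illustrative helper only.

```c
/*
 * Simplified sketch: refuse to build unless the compiler claims
 * conformance to C99 or later via __STDC_VERSION__.
 */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
#error "this code expects a compiler that claims C99 conformance"
#endif

/* Report which standard revision the compiler says it implements. */
static long claimed_c_version(void)
{
	return __STDC_VERSION__;
}
```

A check like this only verifies the claim; it says nothing about whether an individual feature such as `<stdbool.h>` is actually present.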

Now, I think we saw a successful flight of this test balloon by now.
Is allowing all the C99 the next step we really want to take?

I still personally have an aversion against decl-after-statement and
//-comments, not due to portability reasons at all, but because I
find that the code is easier to read without it. But in principle,
it is powerful to be able to say "OK, as long as the feature is in
C99 you can use it", instead of having to decide on individual
features, and I am not fundamentally against going that route if it
is where people want to go.

Thanks.

On the Git mailing list, René Scharfe wrote (reply to this):

Am 15.12.23 um 18:09 schrieb Junio C Hamano:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> Even if it is unlikely that we would directly compare a boolean variable
>> to "true" or "false" it is certainly conceivable that we'd compare two
>> boolean variables directly. For the integer fallback to be safe we'd
>> need to write
>>
>> 	if (!cond_a == !cond_b)
>>
>> rather than
>>
>> 	if (cond_a == cond_b)
>
> Eek, it defeats the benefit of using true Boolean type if we had to
> train ourselves to write the former, doesn't it?

Indeed.

>> [1] 7bc341e21b (git-compat-util: add a test balloon for C99 support,
>> 2021-12-01)
>
> Nice to be reminded of this one.
>
> The cited commit does not start to use any specific feature from
> C99, other than that we now require that the compiler claims C99
> conformance by __STDC_VERSION__ set appropriately.  The commit log
> message says C99 "provides a variety of useful features, including
> ..., many of which we already use.", which implies that our wish was
> to officially allow any and all features in C99 to be used in our
> codebase after a successful flight of this test balloon.
>
> Now, I think we saw a successful flight of this test balloon by now.
> Is allowing all the C99 the next step we really want to take?
>
> I still personally have an aversion against decl-after-statement and
> //-comments, not due to portability reasons at all, but because I
> find that the code is easier to read without it. But in principle,
> it is powerful to be able to say "OK, as long as the feature is in
> C99 you can use it", instead of having to decide on individual
> features, and I am not fundamentally against going that route if it
> is where people want to go.

C99 added a lot of features, but we already use several of them.
Support for individual features may vary, though -- who knows?

E.g. http://www.compilers.de/vbcc.html claims to support "most of
ISO/IEC 9899:1999 (C99)", yet _Bool is not mentioned in its docs (but
__STDC_VERSION__ 199901L is).  It's not a particularly interesting
compiler for us, but still a real-world example of selective C99
support.

The table at the bottom of https://en.cppreference.com/w/c/99 would be
useful if it was filled out for more compilers.  And it also doesn't
mention _Bool and stdbool.h.

TenDRA and the M/o/Vfuscator are the only compilers without stdbool.h
support on https://godbolt.org/ as far as I can see, but that website
doesn't have a lot of commercial compilers (understandably).

So I guess in practice we still need to check each new feature, even
though in theory we should be fine after the two-year test.

René

On the Git mailing list, René Scharfe wrote (reply to this):

Use the data type bool and its values true and false to document the
binary return value of skip_prefix() and friends more explicitly.

This first use of stdbool.h, introduced with C99, is meant to check
whether there are platforms that claim support for C99, as tested by
7bc341e21b (git-compat-util: add a test balloon for C99 support,
2021-12-01), but still lack that header for some reason.

A fallback based on a wider type, e.g. int, would have to deal with
comparisons somehow to emulate that any non-zero value is true:

   bool b1 = 1;
   bool b2 = 2;
   if (b1 == b2) puts("This is true.");

   int i1 = 1;
   int i2 = 2;
   if (i1 == i2) puts("Not printed.");
   #define BOOLEQ(a, b) (!(a) == !(b))
   if (BOOLEQ(i1, i2)) puts("This is true.");

So we'd be better off using bool everywhere without a fallback, if
possible.  That's why this patch doesn't include any.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 git-compat-util.h | 42 ++++++++++++++++++++++--------------------
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 3e7a59b5ff..603c97e3b3 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -225,6 +225,7 @@ struct strbuf;
 #include <stddef.h>
 #include <stdlib.h>
 #include <stdarg.h>
+#include <stdbool.h>
 #include <string.h>
 #ifdef HAVE_STRINGS_H
 #include <strings.h> /* for strcasecmp() */
@@ -684,11 +685,11 @@ report_fn get_warn_routine(void);
 void set_die_is_recursing_routine(int (*routine)(void));

 /*
- * If the string "str" begins with the string found in "prefix", return 1.
+ * If the string "str" begins with the string found in "prefix", return true.
  * The "out" parameter is set to "str + strlen(prefix)" (i.e., to the point in
  * the string right after the prefix).
  *
- * Otherwise, return 0 and leave "out" untouched.
+ * Otherwise, return false and leave "out" untouched.
  *
  * Examples:
  *
@@ -699,57 +700,58 @@ void set_die_is_recursing_routine(int (*routine)(void));
  *   [skip prefix if present, otherwise use whole string]
  *   skip_prefix(name, "refs/heads/", &name);
  */
-static inline int skip_prefix(const char *str, const char *prefix,
-			      const char **out)
+static inline bool skip_prefix(const char *str, const char *prefix,
+			       const char **out)
 {
 	do {
 		if (!*prefix) {
 			*out = str;
-			return 1;
+			return true;
 		}
 	} while (*str++ == *prefix++);
-	return 0;
+	return false;
 }

 /*
  * Like skip_prefix, but promises never to read past "len" bytes of the input
  * buffer, and returns the remaining number of bytes in "out" via "outlen".
  */
-static inline int skip_prefix_mem(const char *buf, size_t len,
-				  const char *prefix,
-				  const char **out, size_t *outlen)
+static inline bool skip_prefix_mem(const char *buf, size_t len,
+				   const char *prefix,
+				   const char **out, size_t *outlen)
 {
 	size_t prefix_len = strlen(prefix);
 	if (prefix_len <= len && !memcmp(buf, prefix, prefix_len)) {
 		*out = buf + prefix_len;
 		*outlen = len - prefix_len;
-		return 1;
+		return true;
 	}
-	return 0;
+	return false;
 }

 /*
- * If buf ends with suffix, return 1 and subtract the length of the suffix
- * from *len. Otherwise, return 0 and leave *len untouched.
+ * If buf ends with suffix, return true and subtract the length of the suffix
+ * from *len. Otherwise, return false and leave *len untouched.
  */
-static inline int strip_suffix_mem(const char *buf, size_t *len,
-				   const char *suffix)
+static inline bool strip_suffix_mem(const char *buf, size_t *len,
+				    const char *suffix)
 {
 	size_t suflen = strlen(suffix);
 	if (*len < suflen || memcmp(buf + (*len - suflen), suffix, suflen))
-		return 0;
+		return false;
 	*len -= suflen;
-	return 1;
+	return true;
 }

 /*
- * If str ends with suffix, return 1 and set *len to the size of the string
- * without the suffix. Otherwise, return 0 and set *len to the size of the
+ * If str ends with suffix, return true and set *len to the size of the string
+ * without the suffix. Otherwise, return false and set *len to the size of the
  * string.
  *
  * Note that we do _not_ NUL-terminate str to the new length.
  */
-static inline int strip_suffix(const char *str, const char *suffix, size_t *len)
+static inline bool strip_suffix(const char *str, const char *suffix,
+				size_t *len)
 {
 	*len = strlen(str);
 	return strip_suffix_mem(str, len, suffix);
--
2.43.0

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi Junio

On 15/12/2023 17:09, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> Even if it is unlikely that we would directly compare a boolean variable
>> to "true" or "false" it is certainly conceivable that we'd compare two
>> boolean variables directly. For the integer fallback to be safe we'd
>> need to write
>>
>> 	if (!cond_a == !cond_b)
>>
>> rather than
>>
>> 	if (cond_a == cond_b)
>
> Eek, it defeats the benefit of using true Boolean type if we had to
> train ourselves to write the former, doesn't it?

Yes, it's horrible. If for some reason it turns out that we cannot use "#include <stdbool.h>" everywhere, I think we should drop it rather than providing a subtly incompatible fallback.

>> A weather-balloon seems like the safest route forward. We have been
>> requiring C99 for two years now [1], hopefully there aren't any
>> compilers out that claim to support C99 but don't provide
>> "<stdbool.h>" (I did check online and the compiler on NonStop does
>> support _Bool).
>>
>> Best Wishes
>>
>> Phillip
>>
>> [1] 7bc341e21b (git-compat-util: add a test balloon for C99 support,
>> 2021-12-01)
>
> Nice to be reminded of this one.
>
> The cited commit does not start to use any specific feature from
> C99, other than that we now require that the compiler claims C99
> conformance by __STDC_VERSION__ set appropriately.  The commit log
> message says C99 "provides a variety of useful features, including
> ..., many of which we already use.", which implies that our wish was
> to officially allow any and all features in C99 to be used in our
> codebase after a successful flight of this test balloon.
>
> Now, I think we saw a successful flight of this test balloon by now.
> Is allowing all the C99 the next step we really want to take?
>
> I still personally have an aversion against decl-after-statement and
> //-comments, not due to portability reasons at all, but because I
> find that the code is easier to read without it. But in principle,
> it is powerful to be able to say "OK, as long as the feature is in
> C99 you can use it", instead of having to decide on individual
> features, and I am not fundamentally against going that route if it
> is where people want to go.

I'm not sure we necessarily want to say "use anything that is in C99" for several reasons.

 - Some features such as C99's variable length arrays are known to be
   problematic.

 - As you say above, there may be features that we think harm the
   readability of our code.

 - As René points out not all compilers necessarily support all
   features.

I think using _Bool could be useful for the reasons Peff outlined. As for other features, I've written code that I think would have benefited from compound literals, but off the top of my head I can't think of any other C99 features that I personally wish we were using. I think that decl-after-statement is occasionally useful to declare a variable near where it is used in a long function body, but it is much simpler just to ban it altogether and encourage people to break up long functions to make them more readable.

Best Wishes

Phillip

> Thanks.

On the Git mailing list, Phillip Wood wrote (reply to this):

Hi René

On 16/12/2023 10:47, René Scharfe wrote:
> Use the data type bool and its values true and false to document the
> binary return value of skip_prefix() and friends more explicitly.
>
> This first use of stdbool.h, introduced with C99, is meant to check
> whether there are platforms that claim support for C99, as tested by
> 7bc341e21b (git-compat-util: add a test balloon for C99 support,
> 2021-12-01), but still lack that header for some reason.
>
> A fallback based on a wider type, e.g. int, would have to deal with
> comparisons somehow to emulate that any non-zero value is true:
>
>     bool b1 = 1;
>     bool b2 = 2;
>     if (b1 == b2) puts("This is true.");
>
>     int i1 = 1;
>     int i2 = 2;
>     if (i1 == i2) puts("Not printed.");
>     #define BOOLEQ(a, b) (!(a) == !(b))
>     if (BOOLEQ(i1, i2)) puts("This is true.");
>
> So we'd be better off using bool everywhere without a fallback, if
> possible.  That's why this patch doesn't include any.

Thanks for the comprehensive commit message. I agree that we'd be better off avoiding adding a fallback. The patch looks good. I did wonder if we really need to convert all of these functions for a test balloon, but the patch is still pretty small overall.
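
The difference the commit message describes can be checked in isolation. What follows is only a sketch: bool_equal(), int_equal() and the lowercase booleq() are names made up for illustration and are not part of Git.

```c
#include <stdbool.h>

/* Hypothetical helpers, for illustration only. */

/* Both arguments are converted to bool (0 or 1) before the comparison. */
static inline int bool_equal(bool a, bool b)
{
	return a == b;
}

/* With plain int, two distinct non-zero values compare unequal. */
static inline int int_equal(int a, int b)
{
	return a == b;
}

/* Same idea as the BOOLEQ() macro quoted above: normalize with "!" first. */
static inline int booleq(int a, int b)
{
	return !a == !b;
}
```

Calling bool_equal(1, 2) and booleq(1, 2) both yield 1, while int_equal(1, 2) yields 0, which is the gap an int-based fallback would have to paper over at every comparison.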

Best Wishes

Phillip

> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>   git-compat-util.h | 42 ++++++++++++++++++++++--------------------
>   1 file changed, 22 insertions(+), 20 deletions(-)
>
> diff --git a/git-compat-util.h b/git-compat-util.h
> index 3e7a59b5ff..603c97e3b3 100644
> --- a/git-compat-util.h
> +++ b/git-compat-util.h
> @@ -225,6 +225,7 @@ struct strbuf;
>   #include <stddef.h>
>   #include <stdlib.h>
>   #include <stdarg.h>
> +#include <stdbool.h>
>   #include <string.h>
>   #ifdef HAVE_STRINGS_H
>   #include <strings.h> /* for strcasecmp() */
> @@ -684,11 +685,11 @@ report_fn get_warn_routine(void);
>   void set_die_is_recursing_routine(int (*routine)(void));
>
>   /*
> - * If the string "str" begins with the string found in "prefix", return 1.
> + * If the string "str" begins with the string found in "prefix", return true.
>    * The "out" parameter is set to "str + strlen(prefix)" (i.e., to the point in
>    * the string right after the prefix).
>    *
> - * Otherwise, return 0 and leave "out" untouched.
> + * Otherwise, return false and leave "out" untouched.
>    *
>    * Examples:
>    *
> @@ -699,57 +700,58 @@ void set_die_is_recursing_routine(int (*routine)(void));
>    *   [skip prefix if present, otherwise use whole string]
>    *   skip_prefix(name, "refs/heads/", &name);
>    */
> -static inline int skip_prefix(const char *str, const char *prefix,
> -			      const char **out)
> +static inline bool skip_prefix(const char *str, const char *prefix,
> +			       const char **out)
>   {
>   	do {
>   		if (!*prefix) {
>   			*out = str;
> -			return 1;
> +			return true;
>   		}
>   	} while (*str++ == *prefix++);
> -	return 0;
> +	return false;
>   }
>
>   /*
>    * Like skip_prefix, but promises never to read past "len" bytes of the input
>    * buffer, and returns the remaining number of bytes in "out" via "outlen".
>    */
> -static inline int skip_prefix_mem(const char *buf, size_t len,
> -				  const char *prefix,
> -				  const char **out, size_t *outlen)
> +static inline bool skip_prefix_mem(const char *buf, size_t len,
> +				   const char *prefix,
> +				   const char **out, size_t *outlen)
>   {
>   	size_t prefix_len = strlen(prefix);
>   	if (prefix_len <= len && !memcmp(buf, prefix, prefix_len)) {
>   		*out = buf + prefix_len;
>   		*outlen = len - prefix_len;
> -		return 1;
> +		return true;
>   	}
> -	return 0;
> +	return false;
>   }
>
>   /*
> - * If buf ends with suffix, return 1 and subtract the length of the suffix
> - * from *len. Otherwise, return 0 and leave *len untouched.
> + * If buf ends with suffix, return true and subtract the length of the suffix
> + * from *len. Otherwise, return false and leave *len untouched.
>    */
> -static inline int strip_suffix_mem(const char *buf, size_t *len,
> -				   const char *suffix)
> +static inline bool strip_suffix_mem(const char *buf, size_t *len,
> +				    const char *suffix)
>   {
>   	size_t suflen = strlen(suffix);
>   	if (*len < suflen || memcmp(buf + (*len - suflen), suffix, suflen))
> -		return 0;
> +		return false;
>   	*len -= suflen;
> -	return 1;
> +	return true;
>   }
>
>   /*
> - * If str ends with suffix, return 1 and set *len to the size of the string
> - * without the suffix. Otherwise, return 0 and set *len to the size of the
> + * If str ends with suffix, return true and set *len to the size of the string
> + * without the suffix. Otherwise, return false and set *len to the size of the
>    * string.
>    *
>    * Note that we do _not_ NUL-terminate str to the new length.
>    */
> -static inline int strip_suffix(const char *str, const char *suffix, size_t *len)
> +static inline bool strip_suffix(const char *str, const char *suffix,
> +				size_t *len)
>   {
>   	*len = strlen(str);
>   	return strip_suffix_mem(str, len, suffix);
> --
> 2.43.0
> 


On the Git mailing list, Junio C Hamano wrote (reply to this):

Phillip Wood <phillip.wood123@gmail.com> writes:

> Thanks for the comprehensive commit message, I agree that we'd be
> better off avoiding adding a fallback. The patch looks good, I did
> wonder if we really need to convert all of these functions for a
> test-balloon but the patch is still pretty small overall.

I do have to wonder, though, if we want to be a bit more careful
than just blindly trusting the platform (i.e. <stdbool.h> might
exist and __STDC_VERSION__ may say C99, but under the hood their
implementation may be buggy and coerce the result of an assignment
of 2 to be different from assigning true).

In any case, this is a good starting place.  Let's queue it, see
what happens, and then think about longer-term plans.

Thanks.


On the Git mailing list, René Scharfe wrote (reply to this):

Am 18.12.23 um 21:19 schrieb Junio C Hamano:
> Phillip Wood <phillip.wood123@gmail.com> writes:
>
>> Thanks for the comprehensive commit message, I agree that we'd be
>> better off avoiding adding a fallback. The patch looks good, I did
>> wonder if we really need to convert all of these functions for a
>> test-balloon but the patch is still pretty small overall.
>
> I do have to wonder, though, if we want to be a bit more careful
> than just blindly trusting the platform (i.e. <stdbool.h> might
> exist and __STDC_VERSION__ may say C99, but under the hood their
> implementation may be buggy and coerce the result of an assignment
> of 2 to be different from assigning true).

We could add a compile-time check like below.  I can't decide if this
would be prudent or paranoid.  It's cheap, though, so perhaps just add
this tripwire for non-conforming compilers without making a judgement?

René



diff --git a/git-compat-util.h b/git-compat-util.h
index 603c97e3b3..8212feaa37 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -705,7 +705,7 @@ static inline bool skip_prefix(const char *str, const char *prefix,
 {
 	do {
 		if (!*prefix) {
-			*out = str;
+			*out = str + BUILD_ASSERT_OR_ZERO((bool)1 == (bool)2);
 			return true;
 		}
 	} while (*str++ == *prefix++);
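
A standalone sketch of the same tripwire, for trying it outside of Git's tree. The BUILD_ASSERT_OR_ZERO() definition below is a local stand-in written for this example (it produces a negative array size on failure); that it behaves like the macro used in the diff above is an assumption. bool_tripwire() is a made-up name.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Local stand-in, assumed to behave like the macro used in the diff:
 * evaluates to 0 if "cond" is true and fails to compile otherwise,
 * because the array size becomes negative.
 */
#define BUILD_ASSERT_OR_ZERO(cond) (sizeof(char [1 - 2 * !(cond)]) - 1)

/*
 * A conforming compiler converts any non-zero scalar to 1 when
 * converting to bool, so the two casts must compare equal.
 */
static inline size_t bool_tripwire(void)
{
	return BUILD_ASSERT_OR_ZERO((bool)1 == (bool)2);
}
```

On a conforming compiler this builds and bool_tripwire() returns 0, so adding it to a pointer as in the diff changes nothing at run time.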


On the Git mailing list, Jeff King wrote (reply to this):

On Fri, Dec 15, 2023 at 02:46:36PM +0000, Phillip Wood wrote:

> > Yeah. b2() is wrong for passing "2" to a bool.
> 
> I think it depends what you mean by "wrong". §6.3.1.2 of the standard is quite
> clear that when any non-zero scalar value is converted to _Bool the result
> is "1"

Yeah, sorry, I was being quite sloppy with my wording. I meant "wrong"
as in "I would ideally flag this in review for being weird and
confusing".

Of course there are many reasonable cases where you might pass an
integer "foo" rather than explicitly booleanizing it with "!!foo". So I
do agree it's a real potential problem (and I'm sufficiently convinced
that we should avoid an "int" fallback if we can).
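
A minimal sketch of that hazard, assuming a hypothetical flag bit OPT_VERBOSE and a hypothetical int-based fallback type int_bool (neither exists in Git):

```c
#include <stdbool.h>

#define OPT_VERBOSE 0x4	/* hypothetical flag bit */

typedef int int_bool;	/* hypothetical int-based fallback type */

/* With a real bool parameter, the argument is converted to 0 or 1. */
static inline int is_true_bool(bool v)
{
	return v == true;
}

/* With the int fallback, the argument keeps its value, e.g. 4. */
static inline int is_true_int(int_bool v)
{
	return v == 1;
}
```

Passing "flags & OPT_VERBOSE" with the bit set makes is_true_bool() return 1 but is_true_int() return 0; only the explicitly booleanized "!!(flags & OPT_VERBOSE)" behaves the same in both.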

-Peff


On the Git mailing list, Jeff King wrote (reply to this):

On Sat, Dec 16, 2023 at 11:47:21AM +0100, René Scharfe wrote:

> Use the data type bool and its values true and false to document the
> binary return value of skip_prefix() and friends more explicitly.
> 
> This first use of stdbool.h, introduced with C99, is meant to check
> whether there are platforms that claim support for C99, as tested by
> 7bc341e21b (git-compat-util: add a test balloon for C99 support,
> 2021-12-01), but still lack that header for some reason.
> 
> A fallback based on a wider type, e.g. int, would have to deal with
> comparisons somehow to emulate that any non-zero value is true:
> 
>    bool b1 = 1;
>    bool b2 = 2;
>    if (b1 == b2) puts("This is true.");
> 
>    int i1 = 1;
>    int i2 = 2;
>    if (i1 == i2) puts("Not printed.");
>    #define BOOLEQ(a, b) (!(a) == !(b))
>    if (BOOLEQ(i1, i2)) puts("This is true.");
> 
> So we'd be better off using bool everywhere without a fallback, if
> possible.  That's why this patch doesn't include any.

Thanks for putting this together. I agree this is the right spot to end
up for now (and that if for whatever reason we find that some platforms
can't handle it, we probably should revert and not try the naive
fallback).

-Peff


On the Git mailing list, phillip.wood123@gmail.com wrote (reply to this):

Hi Peff

On 21/12/2023 09:56, Jeff King wrote:
> On Fri, Dec 15, 2023 at 02:46:36PM +0000, Phillip Wood wrote:
>
>>> Yeah. b2() is wrong for passing "2" to a bool.
>>
>> I think it depends what you mean by "wrong". §6.3.1.2 of the standard is quite
>> clear that when any non-zero scalar value is converted to _Bool the result
>> is "1"
>
> Yeah, sorry, I was being quite sloppy with my wording. I meant "wrong"
> as in "I would ideally flag this in review for being weird and
> confusing".

That makes sense; it certainly is confusing.

> Of course there are many reasonable cases where you might pass an
> integer "foo" rather than explicitly booleanizing it with "!!foo". So I
> do agree it's a real potential problem (and I'm sufficiently convinced
> that we should avoid an "int" fallback if we can).

I had a look at gnulib the other day, and the list of limitations in the documentation of their <stdbool.h> fallback makes it look quite unattractive. They helpfully list some compilers where _Bool is not implemented (IRIX, Tru64) or does not work correctly (HP-UX, AIX). As far as I can see, all the bug reports cited are from 2003-2006 and concern obsolete compiler versions, so hopefully _Bool is better supported these days.

Best Wishes

Phillip

> -Peff

-	out = &buffers[which_buffer++];
-	which_buffer %= ARRAY_SIZE(buffers);
+	out = &buffers[which_buffer];
+	which_buffer ^= 1;

This change does not take into account the fact that which_buffer is used as an index into buffers. If the size of buffers ever changes, the XOR would still only toggle between 0 and 1, so this "hack" would silently defeat that change and cause confusion.
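
To make the concern concrete, here is a small sketch that compares the two index updates once the array has more than two slots. The four-element buffers array and the next_modulus()/next_xor() helpers are hypothetical stand-ins, and ARRAY_SIZE() is redefined locally so the snippet is self-contained.

```c
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical stand-in: a rotating set of four buffers instead of two. */
static int buffers[4];

/* Modulus follows the array size, so every slot gets visited in turn. */
static size_t next_modulus(size_t which)
{
	return (which + 1) % ARRAY_SIZE(buffers);
}

/* XOR ignores the array size and only ever flips the lowest bit. */
static size_t next_xor(size_t which)
{
	return which ^ 1;
}
```

Starting from 0, next_modulus() cycles through 0, 1, 2, 3, while next_xor() only alternates between 0 and 1, leaving slots 2 and 3 unused without any compiler diagnostic.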
