Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sideband.c: replace int with size_t for clarity #1625

Closed
wants to merge 1 commit into from

Conversation

Chand-ra
Copy link

@Chand-ra Chand-ra commented Dec 22, 2023

cc: Torsten Bögershausen tboegi@web.de
cc: Taylor Blau me@ttaylorr.com

@Chand-ra
Copy link
Author

/preview

Copy link

gitgitgadget bot commented Dec 22, 2023

Preview email sent as pull.1625.git.1703263989185.gitgitgadget@gmail.com

@Chand-ra
Copy link
Author

/submit

Copy link

gitgitgadget bot commented Dec 22, 2023

Submitted as pull.1625.git.1703264469238.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1625/Chand-ra/dusra-v1

To fetch this version to local tag pr-1625/Chand-ra/dusra-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1625/Chand-ra/dusra-v1

Copy link

gitgitgadget bot commented Dec 22, 2023

On the Git mailing list, Torsten Bögershausen wrote (reply to this):

On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> From: Chandra Pratap <chandrapratap3519@gmail.com>
>
> Replace int with size_t for clarity and remove the
> 'NEEDSWORK' tag associated with it.
>
> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> ---
>     sideband.c: replace int with size_t for clarity
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1625
>
>  sideband.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/sideband.c b/sideband.c
> index 6cbfd391c47..1599e408d1b 100644
> --- a/sideband.c
> +++ b/sideband.c
> @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
>   * of the line. This should be called for a single line only, which is
>   * passed as the first N characters of the SRC array.
>   *
> - * NEEDSWORK: use "size_t n" instead for clarity.
>   */
> -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
>  {
>  	int i;
>

Thanks for working on this.
There is, however, more potential for improvements.
First of all: the "int i" from above.
Further down, we read
	for (i = 0; i < ARRAY_SIZE(keywords); i++) {

However, a size of an array can never be negative, so that
an unsigned data type is a better choice than a signed.
And, arrays can have more elements than an int can address,
at least in theory.
For a reader it makes more sense, to replace
int i;
with
size_t i;


And further down, there is another place for improvments:

		int len = strlen(p->keyword);
		if (n < len)
			continue;

Even here, a strlen is never negative. And a size_t is the choice for len,
since all "modern" implementations declare strlen() to return size_t

Do you think that you can have a look at these changes ?

I will be happy to do a review, and possibly other people as well.

Copy link

gitgitgadget bot commented Dec 22, 2023

User Torsten Bögershausen <tboegi@web.de> has been added to the cc: list.

Copy link

gitgitgadget bot commented Dec 22, 2023

On the Git mailing list, Chandra Pratap wrote (reply to this):

Thanks for the feedback, Torsten. I was working on improving the rest of
sideband.c as you suggested when I encountered this snippet on line 82:

while (0 < n && isspace(*src)) {
strbuf_addch(dest, *src);
src++;
n--;
}

Here, we are decreasing the value of an unsigned type to potentially below
0 which may lead to overflow and result in some nasty bug. In this case,
is it wise to continue with replacing 'int n' with 'size_t n' as the
NEEDSWORK tag suggests or should we improve upon the rest of the file
and revert 'size_t n' to 'int n' ?

On Fri, 22 Dec 2023 at 23:27, Torsten Bögershausen <tboegi@web.de> wrote:
>
> On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> > From: Chandra Pratap <chandrapratap3519@gmail.com>
> >
> > Replace int with size_t for clarity and remove the
> > 'NEEDSWORK' tag associated with it.
> >
> > Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> > ---
> >     sideband.c: replace int with size_t for clarity
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> > Pull-Request: https://github.com/gitgitgadget/git/pull/1625
> >
> >  sideband.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/sideband.c b/sideband.c
> > index 6cbfd391c47..1599e408d1b 100644
> > --- a/sideband.c
> > +++ b/sideband.c
> > @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
> >   * of the line. This should be called for a single line only, which is
> >   * passed as the first N characters of the SRC array.
> >   *
> > - * NEEDSWORK: use "size_t n" instead for clarity.
> >   */
> > -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> > +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
> >  {
> >       int i;
> >
>
> Thanks for working on this.
> There is, however, more potential for improvements.
> First of all: the "int i" from above.
> Further down, we read
>         for (i = 0; i < ARRAY_SIZE(keywords); i++) {
>
> However, a size of an array can never be negative, so that
> an unsigned data type is a better choice than a signed.
> And, arrays can have more elements than an int can address,
> at least in theory.
> For a reader it makes more sense, to replace
> int i;
> with
> size_t i;
>
>
> And further down, there is another place for improvments:
>
>                 int len = strlen(p->keyword);
>                 if (n < len)
>                         continue;
>
> Even here, a strlen is never negative. And a size_t is the choice for len,
> since all "modern" implementations declare strlen() to return size_t
>
> Do you think that you can have a look at these changes ?
>
> I will be happy to do a review, and possibly other people as well.

Copy link

gitgitgadget bot commented Dec 22, 2023

User Chandra Pratap <chandrapratap3519@gmail.com> has been added to the cc: list.

Copy link

gitgitgadget bot commented Dec 22, 2023

On the Git mailing list, Torsten Bögershausen wrote (reply to this):

(PLease, avoid top-posting on this list)
On Fri, Dec 22, 2023 at 11:57:25PM +0530, Chandra Pratap wrote:
> Thanks for the feedback, Torsten. I was working on improving the rest of
> sideband.c as you suggested when I encountered this snippet on line 82:
>
> while (0 < n && isspace(*src)) {
> strbuf_addch(dest, *src);
> src++;
> n--;
> }
>
> Here, we are decreasing the value of an unsigned type to potentially below
> 0 which may lead to overflow and result in some nasty bug. In this case,
> is it wise to continue with replacing 'int n' with 'size_t n' as the
> NEEDSWORK tag suggests or should we improve upon the rest of the file
> and revert 'size_t n' to 'int n' ?

Yes, that could be a solution.
But.
In general, we are are striving to use size_t for all objects that live in
memory, and that is a long term thing.
Careful review is needed, and that is what you just did.

If we look at this code again:
while (0 < n && isspace(*src)) {
  strbuf_addch(dest, *src);
  src++;
  n--;
}

When n is signed, it makes sense to use "0 < n".
However, if I think about it, it should work for unsigned as well,
wouldn't it ?
We can leave it as is, or replace it by

while (n && isspace(*src)) {
  strbuf_addch(dest, *src);
  src++;
  n--;
}




>
> On Fri, 22 Dec 2023 at 23:27, Torsten Bögershausen <tboegi@web.de> wrote:
> >
> > On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> > > From: Chandra Pratap <chandrapratap3519@gmail.com>
> > >
> > > Replace int with size_t for clarity and remove the
> > > 'NEEDSWORK' tag associated with it.
> > >
> > > Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> > > ---
> > >     sideband.c: replace int with size_t for clarity
> > >
> > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> > > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> > > Pull-Request: https://github.com/gitgitgadget/git/pull/1625
> > >
> > >  sideband.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/sideband.c b/sideband.c
> > > index 6cbfd391c47..1599e408d1b 100644
> > > --- a/sideband.c
> > > +++ b/sideband.c
> > > @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
> > >   * of the line. This should be called for a single line only, which is
> > >   * passed as the first N characters of the SRC array.
> > >   *
> > > - * NEEDSWORK: use "size_t n" instead for clarity.
> > >   */
> > > -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> > > +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
> > >  {
> > >       int i;
> > >
> >
> > Thanks for working on this.
> > There is, however, more potential for improvements.
> > First of all: the "int i" from above.
> > Further down, we read
> >         for (i = 0; i < ARRAY_SIZE(keywords); i++) {
> >
> > However, a size of an array can never be negative, so that
> > an unsigned data type is a better choice than a signed.
> > And, arrays can have more elements than an int can address,
> > at least in theory.
> > For a reader it makes more sense, to replace
> > int i;
> > with
> > size_t i;
> >
> >
> > And further down, there is another place for improvments:
> >
> >                 int len = strlen(p->keyword);
> >                 if (n < len)
> >                         continue;
> >
> > Even here, a strlen is never negative. And a size_t is the choice for len,
> > since all "modern" implementations declare strlen() to return size_t
> >
> > Do you think that you can have a look at these changes ?
> >
> > I will be happy to do a review, and possibly other people as well.
>

Copy link

gitgitgadget bot commented Dec 22, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

Torsten Bögershausen <tboegi@web.de> writes:

Just this part.

> Further down, we read
> 	for (i = 0; i < ARRAY_SIZE(keywords); i++) {
>
> However, a size of an array can never be negative, so that
> an unsigned data type is a better choice than a signed.
> And, arrays can have more elements than an int can address,
> at least in theory.
> For a reader it makes more sense, to replace
> int i;
> with
> size_t i;

It is a very good discipline to use size_t to index into an array
whose size is externally controled (e.g., we slurp what the end user
or the server on the other end of the connection gave us into a
piece of memory we allocate) to avoid integer overflows as "int" is
often narrower than "size_t".  But this particular one is a Meh; the
keywords[] is a small hardcoded array whose size and contents are
totally under our control.

@Chand-ra
Copy link
Author

/submit

Copy link

gitgitgadget bot commented Dec 23, 2023

Submitted as pull.1625.v2.git.1703351016486.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1625/Chand-ra/dusra-v2

To fetch this version to local tag pr-1625/Chand-ra/dusra-v2:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1625/Chand-ra/dusra-v2

Copy link

gitgitgadget bot commented Dec 26, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Chandra Pratap via GitGitGadget" <gitgitgadget@gmail.com> writes:

> -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)

Changing the type of the paramter alone might be a good start, but
does not really help anybody, as (1) the callers are not taught to
handle wider integral types for the values they pass and (2) the
function uses "int len" it computes internally to compare with "n".

There are three callers in demultiplex_sideband(), one of whose
paramters is "int len" and is passed to this function in one of
these calls.  Among the other two, one uses "int linelen" derived
from scanning the piece of memory read from sideband via strpbrk(),
and the other uses strlen(b) which is the length of the "remainder"
of the same buffer after the loop that processes one line at a time
using the said strpbrk() is done with the complete lines in the
early part.

The buffer involved in all of the above stores bytes taken from a
packet received via the pkt-line interface, which is capable of
transferring only 64kB at a time.

I _think_ the most productive use of our time is to replace the
NEEDSWORK with a comment saying why it is fine to use "int" here to
avoid tempting the next developer to come up with this patch
again---they will waste their time to do so without thinking it
through if we leave the (incomplete) NEEDSWORK comment, I am afraid.

@Chand-ra
Copy link
Author

/submit

Copy link

gitgitgadget bot commented Dec 27, 2023

Submitted as pull.1625.v3.git.1703672407895.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1625/Chand-ra/dusra-v3

To fetch this version to local tag pr-1625/Chand-ra/dusra-v3:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1625/Chand-ra/dusra-v3

Copy link

gitgitgadget bot commented Dec 27, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Chandra Pratap via GitGitGadget" <gitgitgadget@gmail.com> writes:

>       - * NEEDSWORK: use "size_t n" instead for clarity.
>      ++ * It is fine to use "int n" here instead of "size_t n" as all calls to this
>      ++ * function pass an 'int' parameter.

This does not sound like a sufficient justification, though.

We should also explain why "int" is good enough for these callers.
Otherwise, using size_t throughout the callchain would become
another viable solution.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
@Chand-ra
Copy link
Author

/submit

Copy link

gitgitgadget bot commented Dec 28, 2023

Submitted as pull.1625.v4.git.1703750460527.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-1625/Chand-ra/dusra-v4

To fetch this version to local tag pr-1625/Chand-ra/dusra-v4:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-1625/Chand-ra/dusra-v4

Copy link

gitgitgadget bot commented Dec 28, 2023

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Chandra Pratap via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Subject: Re: [PATCH v4] sideband.c: remove redundant 'NEEDSWORK' tag

The reason for removal is not that it was redundant and we said the
same thing elsewhere.  Rather, what it claimed to be necessary has
turned to be unwanted.  So something like

    Subject: sideband.c: update stale NEEDSWORK comment

    If we really wanted to change the type of the parameter to this
    function to "size_t", we should also update its callers to hold
    the values they use to compute the parameter also in "size_t".

    But in this callchain, "int" is wide enough.  Avoid tempting
    future developers into wasting their time on using "size_t"
    around this function.

or along that line would be more appropriate, perhaps?

Thanks.

> From: Chandra Pratap <chandrapratap3519@gmail.com>
>
> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> ---
>     sideband.c: replace int with size_t for clarity
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v4
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v4
> Pull-Request: https://github.com/gitgitgadget/git/pull/1625
>
> Range-diff vs v3:
>
>  1:  273415aa6a4 ! 1:  8c003256e5b sideband.c: remove redundant 'NEEDSWORK' tag
>      @@ sideband.c: void list_config_color_sideband_slots(struct string_list *list, cons
>         *
>       - * NEEDSWORK: use "size_t n" instead for clarity.
>       + * It is fine to use "int n" here instead of "size_t n" as all calls to this
>      -+ * function pass an 'int' parameter.
>      ++ * function pass an 'int' parameter. Additionally, the buffer involved in
>      ++ * storing these 'int' values takes input from a packet via the pkt-line
>      ++ * interface, which is capable of transferring only 64kB at a time.
>         */
>        static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
>        {
>
>
>  sideband.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/sideband.c b/sideband.c
> index 6cbfd391c47..266a67342be 100644
> --- a/sideband.c
> +++ b/sideband.c
> @@ -69,7 +69,10 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
>   * of the line. This should be called for a single line only, which is
>   * passed as the first N characters of the SRC array.
>   *
> - * NEEDSWORK: use "size_t n" instead for clarity.
> + * It is fine to use "int n" here instead of "size_t n" as all calls to this
> + * function pass an 'int' parameter. Additionally, the buffer involved in
> + * storing these 'int' values takes input from a packet via the pkt-line
> + * interface, which is capable of transferring only 64kB at a time.
>   */
>  static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
>  {
>
> base-commit: 1a87c842ece327d03d08096395969aca5e0a6996

Copy link

gitgitgadget bot commented Dec 28, 2023

This branch is now known as cp/sideband-array-index-comment-fix.

Copy link

gitgitgadget bot commented Dec 28, 2023

This patch series was integrated into seen via git@c14b910.

@gitgitgadget gitgitgadget bot added the seen label Dec 28, 2023
Copy link

gitgitgadget bot commented Dec 28, 2023

This patch series was integrated into seen via git@b06b488.

Copy link

gitgitgadget bot commented Jan 2, 2024

This patch series was integrated into seen via git@0cc53d1.

Copy link

gitgitgadget bot commented Jan 2, 2024

On the Git mailing list, Taylor Blau wrote (reply to this):

On Fri, Dec 22, 2023 at 11:01:37AM -0800, Junio C Hamano wrote:
> Torsten Bögershausen <tboegi@web.de> writes:
>
> Just this part.
>
> > Further down, we read
> > 	for (i = 0; i < ARRAY_SIZE(keywords); i++) {
> >
> > However, a size of an array can never be negative, so that
> > an unsigned data type is a better choice than a signed.
> > And, arrays can have more elements than an int can address,
> > at least in theory.
> > For a reader it makes more sense, to replace
> > int i;
> > with
> > size_t i;
>
> It is a very good discipline to use size_t to index into an array
> whose size is externally controled (e.g., we slurp what the end user
> or the server on the other end of the connection gave us into a
> piece of memory we allocate) to avoid integer overflows as "int" is
> often narrower than "size_t".  But this particular one is a Meh; the
> keywords[] is a small hardcoded array whose size and contents are
> totally under our control.

I certainly agree in theory, though I've always erred on the side of
always using size_t for indexing into arrays, even if they're small. It
removes a potential pitfall if you are working with an
externally-controlled array and happen to forget to use size_t.

But if there is an existing index variable with type "int", and we can
easily validate that it's small, I probably wouldn't bother changing it
if I was editing nearby code.

Thanks,
Taylor

Copy link

gitgitgadget bot commented Jan 2, 2024

User Taylor Blau <me@ttaylorr.com> has been added to the cc: list.

Copy link

gitgitgadget bot commented Jan 3, 2024

This patch series was integrated into seen via git@b5089f4.

Copy link

gitgitgadget bot commented Jan 3, 2024

There was a status update in the "New Topics" section about the branch cp/sideband-array-index-comment-fix on the Git mailing list:

In-code comment fix.
source: <pull.1625.v4.git.1703750460527.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented Jan 3, 2024

This patch series was integrated into seen via git@4d4b0a3.

Copy link

gitgitgadget bot commented Jan 4, 2024

This patch series was integrated into seen via git@4ce4e76.

Copy link

gitgitgadget bot commented Jan 5, 2024

This patch series was integrated into seen via git@3bfaa3f.

Copy link

gitgitgadget bot commented Jan 6, 2024

There was a status update in the "Cooking" section about the branch cp/sideband-array-index-comment-fix on the Git mailing list:

In-code comment fix.

Will merge to 'next'.
source: <pull.1625.v4.git.1703750460527.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented Jan 9, 2024

This patch series was integrated into seen via git@101eee3.

Copy link

gitgitgadget bot commented Jan 9, 2024

This patch series was integrated into next via git@f906bc8.

@gitgitgadget gitgitgadget bot added the next label Jan 9, 2024
Copy link

gitgitgadget bot commented Jan 10, 2024

There was a status update in the "Cooking" section about the branch cp/sideband-array-index-comment-fix on the Git mailing list:

In-code comment fix.

Will merge to 'master'.
source: <pull.1625.v4.git.1703750460527.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented Jan 10, 2024

This patch series was integrated into seen via git@605b24f.

Copy link

gitgitgadget bot commented Jan 12, 2024

There was a status update in the "Cooking" section about the branch cp/sideband-array-index-comment-fix on the Git mailing list:

In-code comment fix.

Will merge to 'master'.
source: <pull.1625.v4.git.1703750460527.gitgitgadget@gmail.com>

Copy link

gitgitgadget bot commented Jan 13, 2024

This patch series was integrated into seen via git@063d663.

Copy link

gitgitgadget bot commented Jan 16, 2024

This patch series was integrated into seen via git@6484eb9.

Copy link

gitgitgadget bot commented Jan 16, 2024

This patch series was integrated into master via git@6484eb9.

Copy link

gitgitgadget bot commented Jan 16, 2024

This patch series was integrated into next via git@6484eb9.

@gitgitgadget gitgitgadget bot added the master label Jan 16, 2024
@gitgitgadget gitgitgadget bot closed this Jan 16, 2024
Copy link

gitgitgadget bot commented Jan 16, 2024

Closed via 6484eb9.

@Chand-ra Chand-ra deleted the dusra branch May 10, 2024 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant