Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nxt_stpecpy(), and use it in isolation code to replace (and optimize) code that is hard to get right #805

Closed
wants to merge 2 commits into from

Conversation

alejandro-colomar
Copy link
Contributor

@alejandro-colomar alejandro-colomar commented Dec 10, 2022

/*
 * SYNOPSIS
 *      char *stpecpy(char *dst, char past_end[0], const char *restrict src);
 *
 * ARGUMENTS
 *      dst     Destination string.  Will be null-terminated after the
 *              call.
 *
 *      past_end
 *              Pointer to one past the last byte in the 'dst' buffer.
 *              This is used to truncate the output string if
 *              necessary.
 *
 *      src     Source string to copy from.  This string should be
 *              terminated with a NUL byte.
 *
 * DESCRIPTION
 *      This function is similar to stpcpy(3), but it truncates the
 *      output string as necessary.
 *
 *      You can chain calls to this function as you would do with
 *      stpcpy(3).  Only after the last call it is necessary to check
 *      for truncation (but don't forget to do so):
 *
 *          past_end = buf + sizeof(buf);
 *          p = buf;
 *          p = stpecpy(p, past_end, "foo");
 *          p = stpecpy(p, past_end, "bar");
 *          p = stpecpy(p, past_end, "baz");
 *          if (p == past_end) {
 *              --p;
 *              // Handle truncation.
 *          }
 *
 *      This function is an improvement over strscpy(9) and strlcpy(3)
 *      (and strlcat(3)), both in terms of usability and performance.
 *
 * RETURN VALUE
 *      This function returns a pointer to the 'end' of the destination
 *      string, that is, a pointer to its NUL byte.
 *
 *      However, if the destination string has been truncated, it
 *      returns a pointer to one past the end (past_end).
 *
 * ERRORS
 *      This function has no errors.  If strlcpy(3)-like behavior is
 *      desired, that is, if you want to crash if the input string is
 *      invalid (doesn't contain a terminating NUL byte), you may use
 *      stpecpyx() instead:
 *
 *      inline char *
 *      stpecpyx(char *dst, const char *restrict src, char past_end[0])
 *      {
 *          if (unlikely(src[strlen(src)] != '\0'))
 *              raise(SIGSEGV);
 *
 *          return stpecpy(dst, src, past_end);
 *      }
 *
 * CAVEATS
 *      Truncation is normally a bug, and you should detect it.  Only
 *      if a truncated string is not a problem, it makes sense to ignore
 *      the return value of this function.
 */

This function is in the public domain, and its design can be found in
the Codidact forum, in the following link.

Link: <https://software.codidact.com/posts/285946/287522#answer-287522>
Signed-off-by: Alejandro Colomar <alx@nginx.com>

@alejandro-colomar
Copy link
Contributor Author

alejandro-colomar commented Dec 10, 2022

BTW, before you ask, this is a perfect case for why restrict is good.

$ grepc nxt_stpecpy
./src/nxt_string.h:79:
NXT_EXPORT char *nxt_stpecpy(char *dst, char *past_end,
    const char *restrict src);


./src/nxt_string.c:154:
char *
nxt_stpecpy(char *dst, char past_end[0], const char *restrict src)
{
    char *p;

    if (dst == past_end) {
        return past_end;
    }

    p = memccpy(dst, src, '\0', past_end - dst);
    if (p != NULL) {
        return p - 1;
    }

    /* truncation detected */
    past_end[-1] = '\0';
    return past_end;
}

restrict tells the programmer that the first and second pointers to the function are pointers to the same object, while the last one points to a distinct object. It helps readability (at the same time that it allows the compiler to better understand the code, and also be able to warn about silly mistakes).

@alejandro-colomar alejandro-colomar marked this pull request as ready for review December 10, 2022 17:36
@alejandro-colomar
Copy link
Contributor Author

I think I will change the function prototype, to pass the past_end pointer next to its dst relative. It changes the "standard" way of putting the limits after all other parameters, but it would improve alignment and therefore readability, and also, there's no standard function accepting past_end, so I have a bit of freedom there, in therms of consistency.

@alejandro-colomar alejandro-colomar linked an issue Dec 10, 2022 that may be closed by this pull request
@alejandro-colomar
Copy link
Contributor Author

Cc: @igorsysoev @VBart @mar0x

/*
 * SYNOPSIS
 *      char *stpecpy(char *dst, char past_end[0],
 *                    const char *restrict src);
 *
 * ARGUMENTS
 *      dst     Destination string.  Will be null-terminated after the
 *              call.
 *
 *      past_end
 *              Pointer to one past the last byte in the 'dst' buffer.
 *              This is used to truncate the output string if
 *              necessary.
 *
 *      src     Source string to copy from.  This string should be
 *              terminated with a NUL byte.
 *
 * DESCRIPTION
 *      This function is similar to stpcpy(3), but it truncates the
 *      output string as necessary.
 *
 *      You can chain calls to this function as you would do with
 *      stpcpy(3).  Only after the last call it is necessary to check
 *      for truncation (but don't forget to do so):
 *
 *          past_end = buf + sizeof(buf);
 *          p = buf;
 *          p = stpecpy(p, past_end, "foo");
 *          p = stpecpy(p, past_end, "bar");
 *          p = stpecpy(p, past_end, "baz");
 *          if (p == past_end) {
 *              --p;
 *              // Handle truncation.
 *          }
 *
 *      This function is an improvement over strscpy(9) and strlcpy(3)
 *      (and strlcat(3)), both in terms of usability and performance.
 *
 * RETURN VALUE
 *      This function returns a pointer to the 'end' of the destination
 *      string, that is, a pointer to its NUL byte.
 *
 *      However, if the destination string has been truncated, it
 *      returns a pointer to one past the end (past_end).
 *
 * ERRORS
 *      This function has no errors.  If strlcpy(3)-like behavior is
 *      desired, that is, if you want to crash if the input string is
 *      invalid (doesn't contain a terminating NUL byte), you may use
 *      stpecpyx() instead:
 *
 *      inline char *
 *      stpecpyx(char *dst, const char *restrict src, char past_end[0])
 *      {
 *          if (unlikely(src[strlen(src)] != '\0'))
 *              raise(SIGSEGV);
 *
 *          return stpecpy(dst, src, past_end);
 *      }
 *
 * CAVEATS
 *      Truncation is normally a bug, and you should detect it.  Only
 *      if a truncated string is not a problem, it makes sense to ignore
 *      the return value of this function.
 */

This function is in the public domain, and its design can be found in
the Codidact forum, in the following link.

Link: <https://software.codidact.com/posts/285946/287522#answer-287522>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
Calling snprintf(3) for concatenating strings is _unsafe_.  The affected
code in this patch has had a bug in two different implementations before
finally being fixed.  That can only be attributed to the lack of an
appropriate API in the C standard library for copying strings with
truncation.  snprintf(3) was being used because there is nothing better.

Let's use the newly added nxt_stpecpy(), which has a simpler interface.

As a side effect, this optimizes the code, since this API is optimal
for copying strings with truncation (for example, it removes calls to
strlen(3); not calling formatting functions also removes overhead).

Closes: <#804>
Signed-off-by: Alejandro Colomar <alx@nginx.com>
@alejandro-colomar
Copy link
Contributor Author

I'm closing temporarily until the next release.

@ac000
Copy link
Member

ac000 commented Dec 11, 2022

I do object to calling the current code unsafe.

Can you point out how in its current form it's unsafe?

Or do you really mean, it's hard to get right?

What about the other dozen or so uses of snprintf(3)?

@alejandro-colomar alejandro-colomar changed the title Add nxt_stpecpy(), and use it in isolation code to replace (and optimize) unsafe code Add nxt_stpecpy(), and use it in isolation code to replace (and optimize) code that is hard to get right Dec 11, 2022
@alejandro-colomar
Copy link
Contributor Author

alejandro-colomar commented Dec 11, 2022

I do object to calling the current code unsafe.

There are different levels of unsafety, and they are different to different people.

Can you point out how in its current form it's unsafe?

There are no bugs or vulnerabilities in your code, if that's the question.

Or do you really mean, it's hard to get right?

Yes, I meant that, which to me is the same as unsafe. I'm sorry if this usage of the term caused any confusion/troubles, and I reworded to clarify.

What about the other dozen or so uses of snprintf(3)?

Oh, there are a few that are really unsafe, for any meaning of unsafe, including nasal demons https://stackoverflow.com/a/45395334. I only had a fast look at those other snprintf(3) calls and found several bugs (UB). Reported here: #795 (comment).

Cheers,
Alex

@hongzhidao
Copy link
Contributor

As Andrew mentioned, the feature of cgroup isolation has been done and there might be no changes next.
Andrew has done a great job, if there is no issue, why not just keep them unchanged?
For any C project, they are not perfect. In my feelings, it's better to focus on new features. That's my two cents.
But maybe others have better ideas. @lcrilly @tippexs @ac000

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

snprintf(3) is hard to get right for concatenating strings
3 participants