Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http: fix use-of-uninitialized-value bug #1295

Merged
merged 1 commit into from
Jun 14, 2024

Conversation

pkillarjun
Copy link
Contributor

oss-fuzz Issue 68458.
My Old PR #1292

@pkillarjun
Copy link
Contributor Author

pkillarjun commented May 29, 2024

What Happened:
I was trying to change the commit message in #1292.

Suddenly, GitHub closed my PR. After that, I force-updated it, which corrupted the git history for that branch and uploaded junk in my PR.

I did fix everything, but I can't reopen that branch, thanks github. isaacs/github#361

I did find found one fix, but before that, I deleted my branch and recreated it. Now I can't use this fix, it seems.

@ac000 ac000 self-requested a review May 29, 2024 19:07
Comment on lines 1182 to 1188

lhq.replace = 0;
lhq.proto = &nxt_http_fields_hash_proto;
lhq.pool = NULL;

for (i = 0; i < count; i++) {
key = NXT_HTTP_FIELD_HASH_INIT;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is needed...

lhq.proto.alloc = nxt_lvlhsh_alloc

Which is

void *
nxt_lvlhsh_alloc(void *data, size_t size)
{
    return nxt_memalign(size, size);
}

Where lhq.pool (data) is not used.

Even with this

nxt_lvlhsh_convert_bucket_to_level()
{
    ...
    proto = lhq->proto;
    ...
    q.proto = proto;                                                        
    q.pool = lhq->pool;
    ...
}

lhq.pool is not used with nxt_lvlhsh_alloc()

It's also not used by

void                                                                            
nxt_lvlhsh_free(void *data, void *p)                                                                            
{                                                                               
    nxt_free(p);                                                                
}

Copy link
Contributor Author

@pkillarjun pkillarjun May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get it. lhq->pool is passed through these functions for ??(probably nginx coding style).

so is it a bug in MSan?
Not actually, MSan gives the responsibility to the caller for initializing all the parameters for the callee.

This particular MSan error is a fuzz-blocker for fuzz_http.

There are currently three fixes.

  • The best, of course is to merge this PR: http: fix use-of-uninitialized-value bug #1295.
  • Disabling fuzz_http for MSan, I don't think that is a good idea.
  • Run this PR as a patch in oss-fuzz.sh compiling script, or run disable_msn.patch patch in oss-fuzz.sh.

disable_msn.patch

index 7a8b3dd..025827b 100644
--- a/src/nxt_lvlhsh.c
+++ b/src/nxt_lvlhsh.c
@@ -284,7 +284,7 @@ nxt_lvlhsh_insert(nxt_lvlhsh_t *lh, nxt_lvlhsh_query_t *lhq)
 }
 
 
-static nxt_int_t
+__attribute__((no_sanitize("memory"))) static nxt_int_t
 nxt_lvlhsh_new_bucket(nxt_lvlhsh_query_t *lhq, void **slot)
 {
     uint32_t  *bucket;
@@ -448,7 +448,7 @@ nxt_lvlhsh_bucket_insert(nxt_lvlhsh_query_t *lhq, void **slot, uint32_t key,
 }
 
 
-static nxt_int_t
+__attribute__((no_sanitize("memory"))) static nxt_int_t
 nxt_lvlhsh_convert_bucket_to_level(nxt_lvlhsh_query_t *lhq, void **slot,
     nxt_uint_t nlvl, uint32_t *bucket)
 {

Also a very simple question: Does this initialization of value break anything in this code-base?

Copy link
Contributor Author

@pkillarjun pkillarjun May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is like c standards or -Wextra. It's not important(Actually, they are important; you don't need them to perform some CPU cycles), but it's good to use it.

src/nxt_http_parse.c Outdated Show resolved Hide resolved
@ac000
Copy link
Member

ac000 commented May 30, 2024

I get it. lhq->pool is passed through these functions for unbeknownst reason.

Not unknown actually. It's simply the prototype for these functions

typedef void *(*nxt_lvlhsh_alloc_t)(void *ctx, size_t size);
typedef void (*nxt_lvlhsh_free_t)(void *ctx, void *p);

Also a very simple question: Does this initialization of value break anything in this code-base?

I can't say with 100% certainty, but I'd doubt it.

However, while I'm not against doing things to appease tools, I want to fully understand why (and it needs to be fully explained in the commit message).

There are of course numerous reasons not to just go initialising everything, but one is that it could hide legitimate bugs.

In this case setting lhq.pool to NULL also seems dubious, as this parameter, for functions that use it, points to an area of memory for dynamic allocation.

On the other hand setting it to NULL could simply mean it's not used... I'm really in two minds about this...

So I have a couple of questions

  1. You say that the second patch hunk wasn't important, why is that?
  2. Similarly, why is it important in the first case? Also I'm curious how exactly we get into the situation where lhq.pool is being used when it shouldn't?

Hmm, is it simply the act of passing lhq.pool (uninitialised) to the alloc() function, even though it isn't used, that it's complaining about?

Actually I'm wondering why we don't get a -Wunused-parameter warning from

void *
nxt_lvlhsh_alloc(void *data, size_t size)
{
    return nxt_memalign(size, size);
} 

for the data parameter...

Oh, damn! we're compiling with -Wno-unused-parameter, I wonder what hell breaks loose if we don't...

I would have no issues with marking data as __attribute__((unused)), if that would be taken into account by the sanitizer?

This does however seem a bit of a misnomer

SUMMARY: MemorySanitizer: use-of-uninitialized-value /src/unit/src/nxt_lvlhsh.c:292:14 in nxt_lvlhsh_new_bucket

As we never actually tried to use it. Like it's valid to get a pointer to 1-past the end of an array, as long to you don't dereference it....

If we had actually tried to use it in the alloc() function, then sure... So in this case, I do think the sanitizer needs to be smarter...

@pkillarjun
Copy link
Contributor Author

pkillarjun commented May 30, 2024

You say that the second patch hunk wasn't important, why is that?

It was overwork. That 'bug' did not affect my fuzzers in any way, shape, or form, so I reverted it.

Similarly, why is it important in the first case? Also I'm curious how exactly we get into the situation where lhq.pool is being used when it shouldn't?

I don't think I understand your question, but I think you are asking about nxt_http_fields_hash;

Hmm, is it simply the act of passing lhq.pool (uninitialised) to the alloc() function, even though it isn't used, that it's complaining about?

Da, Simple sample to reproduce:

file.c
#include <stdio.h>

void test_func(int idx);

int
main(void) {
    int idx;

    test_func(idx);
    return 0;
}

void
test_func(int idx) {
    idx = 0;

    printf("The init is: %d\n", idx);
    return;
}
compile
clang \
    -O0 -fno-omit-frame-pointer -gline-tables-only -Wno-error=enum-constexpr-conversion \
    -Wno-error=incompatible-function-pointer-types -Wno-error=int-conversion -Wno-error=deprecated-declarations \
    -Wno-error=implicit-function-declaration -Wno-error=implicit-int -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION \
    -fsanitize=memory -fsanitize-memory-track-origins \
    file.c -o file

./file

I would have no issues with marking data as attribute((unused)), if that would be taken into account by the sanitizer?

This does however seem a bit of a misnomer

First, it's a bad idea to add it in production because it only works with Clang, and other compilers will probably throw an error.
Second, it's:

__attribute__((no_sanitize("memory"))) static nxt_int_t
 nxt_lvlhsh_new_bucket(nxt_lvlhsh_query_t *lhq, void **slot)

__attribute__((no_sanitize("memory"))) static nxt_int_t
 nxt_lvlhsh_convert_bucket_to_level(nxt_lvlhsh_query_t *lhq, void **slot,

@ac000
Copy link
Member

ac000 commented May 30, 2024

I would have no issues with marking data as attribute((unused)), if that would be taken into account by the sanitizer?
This does however seem a bit of a misnomer

First, it's a bad idea to add it in production because it only works with Clang, and other compilers will probably throw an error.

__attribute__((unused)) is a bad idea?

It works perfectly fine in GCC... we already use it in Unit... I've been using it with GCC for years...

Clang and GCC are currently all we care about... (I just recently removed support for all kinds of weird compilers!).

Second, it's:

__attribute__((no_sanitize("memory"))) static nxt_int_t
 nxt_lvlhsh_new_bucket(nxt_lvlhsh_query_t *lhq, void **slot)

__attribute__((no_sanitize("memory"))) static nxt_int_t
nxt_lvlhsh_convert_bucket_to_level(nxt_lvlhsh_query_t *lhq, void **slot,

That's a little heavy handed, you only want to avoid the sanitizer for certain alloc() functions...

However I'm starting to lean towards applying this. It certainly needs a full explanation in the commit message however, which I'm happy to write, (including about if the compiler was smart enough it'd give a -Wuninitialized warning)...

For example it will catch this

/* u.c */

struct t {
	void *p;
	int len;
};

static void test(void *p __attribute__((unused)), int len)
{
	(void)len;
}

int main(void)
{
	struct t t;

	t.len = 42;
	test(t.p, t.len);

	return 0;
}
$ gcc -Wall -Wextra -O0 u.c
u.c: In function ‘main’:
u.c:16:9: warning: ‘t.p’ is used uninitialized [-Wuninitialized]
   16 |         test(t.p, t.len);
      |         ^~~~~~~~~~~~~~~~
u.c:13:18: note: ‘t’ declared here
   13 |         struct t t;
      |                  ^
$

But not this

/* u2.c */

struct t {
        void *p;
        int len;
};

static void _test(void *p __attribute__((unused)), int len)
{
        (void)len;
}

static void test(struct t *t)
{
        _test(t->p, t->len);
}

int main(void)
{
        struct t t;

        t.len = 42;
        test(&t);

        return 0;
}
$ gcc -Wall -Wextra -O0 u2.c
$

@pkillarjun
Copy link
Contributor Author

pkillarjun commented May 31, 2024

attribute((unused)) is a bad idea?

It works perfectly fine in GCC... we already use it in Unit... I've been using it with GCC for years...

Clang and GCC are currently all we care about... (I just recently removed support for all kinds of weird compilers!).

Microsoft(I don't care about them either, but it's just a thought.) ?

That's a little heavy handed, you only want to avoid the sanitizer for certain alloc() functions...

The patch would be something like this:

diff --git a/src/nxt_lvlhsh.c b/src/nxt_lvlhsh.c
index 7a8b3dd..5821446 100644
--- a/src/nxt_lvlhsh.c
+++ b/src/nxt_lvlhsh.c
@@ -284,7 +284,7 @@ nxt_lvlhsh_insert(nxt_lvlhsh_t *lh, nxt_lvlhsh_query_t *lhq)
 }
 
 
-static nxt_int_t
+MSAN static nxt_int_t
 nxt_lvlhsh_new_bucket(nxt_lvlhsh_query_t *lhq, void **slot)
 {
     uint32_t  *bucket;
@@ -448,7 +448,7 @@ nxt_lvlhsh_bucket_insert(nxt_lvlhsh_query_t *lhq, void **slot, uint32_t key,
 }
 
 
-static nxt_int_t
+MSAN static nxt_int_t
 nxt_lvlhsh_convert_bucket_to_level(nxt_lvlhsh_query_t *lhq, void **slot,
     nxt_uint_t nlvl, uint32_t *bucket)
 {
diff --git a/src/nxt_main.h b/src/nxt_main.h
index 7880e55..929a3db 100644
--- a/src/nxt_main.h
+++ b/src/nxt_main.h
@@ -169,4 +169,10 @@ NXT_EXPORT extern nxt_task_t    nxt_main_task;
 NXT_EXPORT extern nxt_atomic_t  nxt_task_ident;
 
 
+#if defined(__has_feature) && __has_feature(memory_sanitizer)
+    #define MSAN __attribute__((no_sanitize("memory"))) 
+#else
+#define MSAN
+#endif
+
 #endif /* _NXT_LIB_H_INCLUDED_ */

MSAN can be replaced with __no_instrument, __no_instrument can be used with ASAN, UBSAN and TSan.

However I'm starting to lean towards applying this. It certainly needs a full explanation in the commit message however, which I'm happy to write, (including about if the compiler was smart enough it'd give a -Wuninitialized warning)...

Well __attribute__((unused)) isn't working, it still gives the same error.

@ac000
Copy link
Member

ac000 commented May 31, 2024

attribute((unused)) is a bad idea?
It works perfectly fine in GCC... we already use it in Unit... I've been using it with GCC for years...
Clang and GCC are currently all we care about... (I just recently removed support for all kinds of weird compilers!).

Microsoft(I don't care about them either, but it's just a thought.) ?

Well, we don't run on Windows and Mircrosofts C compilers have historically at least been pretty terrible from what I hear.

That's a little heavy handed, you only want to avoid the sanitizer for certain alloc() functions...

The patch would be something like this:
...
MSAN can be replaced with __no_instrument, __no_instrument can be used with ASAN, UBSAN and TSan.

No I mean disabling sanitization for these functions when it's only some of the alloc() functions that need to be skipped, which doesn't seem possible...

However I'm starting to lean towards applying this. It certainly needs a full explanation in the commit message however, which I'm happy to write, (including about if the compiler was smart enough it'd give a -Wuninitialized warning)...

While __attribute__((unused)) is working for Msan, it still gives the same error.

No, I mean the setting of lhq.pool to NULL patch...

@pkillarjun
Copy link
Contributor Author

pkillarjun commented May 31, 2024

No I mean disabling sanitization for these functions when it's only some of the alloc() functions that need to be skipped, which doesn't seem possible...

Oh i Misread you. Yes, a line it not possible.

No, I mean the setting of lhq.pool to NULL patch...

Alright.

It certainly needs a full explanation in the commit message however, which I'm happy to write

You can do it, and thanks for the help;

@pkillarjun
Copy link
Contributor Author

Can you merge this PR and add a good commit message?

@ac000
Copy link
Member

ac000 commented Jun 4, 2024 via email

@ac000
Copy link
Member

ac000 commented Jun 5, 2024

Added a commit message

$ git range-diff 92739f79...f1b35623
1:  92739f79 ! 1:  f1b35623 http: fix use-of-uninitialized-value bug
    @@ Metadata
      ## Commit message ##
         http: fix use-of-uninitialized-value bug
     
    +    This was found via MSan.
    +
    +    In nxt_http_fields_hash() we setup a nxt_lvlhsh_query_t structure and
    +    initialise a couple of its members.
    +
    +    At some point we call
    +
    +      lhq->proto->alloc(lhq->pool, nxt_lvlhsh_bucket_size(lhq->proto));
    +
    +    Which in this case is
    +
    +      void *
    +      nxt_lvlhsh_alloc(void *data, size_t size)
    +      {
    +          return nxt_memalign(size, size);
    +      }
    +
    +    So even though lhq.ppol wasn't previously initialised we don't actually
    +    use it in that particular function.
    +
    +    However MSan triggers on the fact that we are passing an uninitialised
    +    value into that function.
    +
    +    Indeed, compilers will generally complain about such things, e.g
    +
    +      /* u.c */
    +      struct t {
    +            void *p;
    +            int len;
    +      };
    +
    +      static void test(void *p __attribute__((unused)), int len)
    +      {
    +            (void)len;
    +      }
    +
    +      int main(void)
    +      {
    +            struct t t;
    +
    +            t.len = 42;
    +            test(t.p, t.len);
    +
    +            return 0;
    +      }
    +
    +    GCC and Clang will produce a -Wuninitialized warning.
    +
    +    But they won't catch the following...
    +
    +      /* u2.c */
    +      struct t {
    +              void *p;
    +              int len;
    +      };
    +
    +      static void _test(void *p __attribute__((unused)), int len)
    +      {
    +              (void)len;
    +      }
    +
    +      static void test(struct t *t)
    +      {
    +              _test(t->p, t->len);
    +      }
    +
    +      int main(void)
    +      {
    +              struct t t;
    +
    +              t.len = 42;
    +              test(&t);
    +
    +              return 0;
    +      }
    +
    +    Which is why we don't get a compiler warning about lhq.pool.
    +
    +    In this case initialising lhg.pool even though we don't use it here
    +    seems like the right thing to do and maybe compilers will start being
    +    able to catch these in the future.
    +
    +    Actually GCC with -fanalyzer does catch the above
    +
    +      $ gcc -Wall -Wextra -O0 -fanalyzer u2.c
    +      u2.c: In function ‘test’:
    +      u2.c:15:9: warning: use of uninitialized value ‘*t.p’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
    +         15 |         _test(t->p, t->len);
    +            |         ^~~~~~~~~~~~~~~~~~~
    +        ‘main’: events 1-3
    +          |
    +          |   18 | int main(void)
    +          |      |     ^~~~
    +          |      |     |
    +          |      |     (1) entry to ‘main’
    +          |   19 | {
    +          |   20 |         struct t t;
    +          |      |                  ~
    +          |      |                  |
    +          |      |                  (2) region created on stack here
    +          |......
    +          |   23 |         test(&t);
    +          |      |         ~~~~~~~~
    +          |      |         |
    +          |      |         (3) calling ‘test’ from ‘main’
    +          |
    +          +--> ‘test’: events 4-5
    +                 |
    +                 |   13 | static void test(struct t *t)
    +                 |      |             ^~~~
    +                 |      |             |
    +                 |      |             (4) entry to ‘test’
    +                 |   14 | {
    +                 |   15 |         _test(t->p, t->len);
    +                 |      |         ~~~~~~~~~~~~~~~~~~~
    +                 |      |         |
    +                 |      |         (5) use of uninitialized value ‘*t.p’ here
    +                 |
    +
         Signed-off-by: Arjun <pkillarjun@protonmail.com>
    +    Link: <https://clang.llvm.org/docs/MemorySanitizer.html>
    +    [ Commit message - Andrew ]
    +    Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
     
      ## src/nxt_http_parse.c ##
     @@ src/nxt_http_parse.c: nxt_http_fields_hash(nxt_lvlhsh_t *hash,

This was found via MSan.

In nxt_http_fields_hash() we setup a nxt_lvlhsh_query_t structure and
initialise a couple of its members.

At some point we call

  lhq->proto->alloc(lhq->pool, nxt_lvlhsh_bucket_size(lhq->proto));

Which in this case is

  void *
  nxt_lvlhsh_alloc(void *data, size_t size)
  {
      return nxt_memalign(size, size);
  }

So even though lhq.ppol wasn't previously initialised we don't actually
use it in that particular function.

However MSan triggers on the fact that we are passing an uninitialised
value into that function.

Indeed, compilers will generally complain about such things, e.g

  /* u.c */
  struct t {
  	void *p;
  	int len;
  };

  static void test(void *p __attribute__((unused)), int len)
  {
  	(void)len;
  }

  int main(void)
  {
  	struct t t;

  	t.len = 42;
  	test(t.p, t.len);

  	return 0;
  }

GCC and Clang will produce a -Wuninitialized warning.

But they won't catch the following...

  /* u2.c */
  struct t {
          void *p;
          int len;
  };

  static void _test(void *p __attribute__((unused)), int len)
  {
          (void)len;
  }

  static void test(struct t *t)
  {
          _test(t->p, t->len);
  }

  int main(void)
  {
          struct t t;

          t.len = 42;
          test(&t);

          return 0;
  }

Which is why we don't get a compiler warning about lhq.pool.

In this case initialising lhg.pool even though we don't use it here
seems like the right thing to do and maybe compilers will start being
able to catch these in the future.

Actually GCC with -fanalyzer does catch the above

  $ gcc -Wall -Wextra -O0 -fanalyzer u2.c
  u2.c: In function ‘test’:
  u2.c:15:9: warning: use of uninitialized value ‘*t.p’ [CWE-457] [-Wanalyzer-use-of-uninitialized-value]
     15 |         _test(t->p, t->len);
        |         ^~~~~~~~~~~~~~~~~~~
    ‘main’: events 1-3
      |
      |   18 | int main(void)
      |      |     ^~~~
      |      |     |
      |      |     (1) entry to ‘main’
      |   19 | {
      |   20 |         struct t t;
      |      |                  ~
      |      |                  |
      |      |                  (2) region created on stack here
      |......
      |   23 |         test(&t);
      |      |         ~~~~~~~~
      |      |         |
      |      |         (3) calling ‘test’ from ‘main’
      |
      +--> ‘test’: events 4-5
             |
             |   13 | static void test(struct t *t)
             |      |             ^~~~
             |      |             |
             |      |             (4) entry to ‘test’
             |   14 | {
             |   15 |         _test(t->p, t->len);
             |      |         ~~~~~~~~~~~~~~~~~~~
             |      |         |
             |      |         (5) use of uninitialized value ‘*t.p’ here
             |

Signed-off-by: Arjun <pkillarjun@protonmail.com>
Link: <https://clang.llvm.org/docs/MemorySanitizer.html>
[ Commit message - Andrew ]
Signed-off-by: Andrew Clayton <a.clayton@nginx.com>
@ac000
Copy link
Member

ac000 commented Jun 14, 2024

Rebased with master...

$ git range-diff f1b35623...04a24f61
-:  -------- > 1:  4fc50258 ci: Be more specific when to run the main Unit checks
-:  -------- > 2:  e77a0c16 Tests: explicitly specify 'f' prefix to format string before printing
-:  -------- > 3:  c9dced37 Tests: print unit.log on unsuccessful unmount
-:  -------- > 4:  98983f3f Add a GitHub discussions badge to the README
-:  -------- > 5:  a7e3686a Tools: improved error handling for unitc
-:  -------- > 6:  d7ec30c4 ci: Limit when to run checks on pull-requests
1:  f1b35623 = 7:  04a24f61 http: fix use-of-uninitialized-value bug

@ac000 ac000 self-requested a review June 14, 2024 14:10
@ac000 ac000 merged commit 04a24f6 into nginx:master Jun 14, 2024
21 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants