Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formal verification of your test suite #641

Closed
jakub-zwolakowski opened this issue Jul 8, 2020 · 5 comments
Closed

Formal verification of your test suite #641

jakub-zwolakowski opened this issue Jul 8, 2020 · 5 comments

Comments

@jakub-zwolakowski
Copy link

Hi, I'm Jakub from TrustInSoft, an advanced source code analyzer publisher for C and C++. I set up TrustInSoft CI on your tests: https://ci.trust-in-soft.com/projects/jakub-zwolakowski/json-c/3.

I have found few minor issues, all of them probably quite harmless. This tool relies on formal methods and it is able to detect the most subtle Undefined Behaviors, both significant and harmless, so these results are particularly impressive!

  1. Index out of bound (in file 'json_object.c' line 1284)

  2. Invalid pointer comparison (in file 'linkhash.c' line 634)

  3. Incorrect printf format (in file 'json_util.c' line 76, caused by the call to the function '_json_c_set_last_err' in file 'json_object.c' line 1636)

  4. Invalid argument in a call to 'calloc' (in file 'json_tokener.c' line 130).

Can you let me know if you find those findings interesting?

@jakub-zwolakowski
Copy link
Author

jakub-zwolakowski commented Jul 8, 2020

More details about the issues and potential fixes (see https://ci.trust-in-soft.com/projects/jakub-zwolakowski/json-c):

  1. Index out of bound (in file 'json_object.c' line 1284)

    The the field 'idata' of the union field 'c_string' of the struct 'jso' (i.e. 'jso->c_string.idata') is declared with length equal 1 (more exactly: 'char idata[1];' in the declaration of 'struct json_object_string'). And here it is written at index 'len', with variable 'len' larger than 0. This kind of access causes Undefined Behavior, even if the real length of memory allocated underneath is sufficient. The potential fix is to simply replace the 'idata' field's length by 0, i.e. 'char idata[0];'. This is strictly speaking forbidden according to the C standard (C ISO 6.7.6.2p1: Array declarators - Constraints: "If the expression is a constant expression, it shall have a value greater than zero." ), but allowed by a gcc extension (see https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html).

  2. Invalid pointer comparison in file 'linkhash.c' line 634

    The result of the pointer comparison 't->table[n].k == LH_EMPTY' (where 'LH_EMPTY' is equal to '(void *)-1') is not well defined and may differ on different architectures (as the result of converting an integer value to a pointer type is implementation-defined according to C ISO 6.3.2.3p5). The potential fix is to define a new variable (e.g. 'extern int tis_lh_empty;') and use the address of this variable instead of a constant (so '#define LH_EMPTY (void *)(&tis_lh_empty)'). Something similar could be done for the other constant 'LH_FREED'.

  3. Incorrect printf format in file 'json_util.c' line 76

    This is caused by the call to the function '_json_c_set_last_err' in file 'json_object.c' line 1636. The format string passed as an argument here contains the format specifier '%p', which expects a 'void *' pointer value, and the provided value is actually of a 'json_object_to_json_string_fn' function pointer type. The behavior of 'vsnprintf' (called in file 'json_util.c' line 76) in this case is undefined. Casting the pointer to 'void *' solves the problem.

  4. Invalid argument in a call to 'calloc' in file 'json_tokener.c' line 130

    The 'calloc' function is called with the value of variable 'depth' passed as its first formal argument 'nmemb', and this value happens to be equal '-2' in some execution paths. This might be a security risk, but it is extremely difficult to exploit. And adding a simple check (e.g. 'if (depth <= 0) return NULL;') before the call to 'calloc' would be enough to fix it. Although maybe the fact that 'depth' equal '-2' can happen at this point of program is involuntary and there is a functional-level problem hiding somewhere around, which you might find worth investigating upon seeing this.

NOTE: Again, all these problems are pretty innocuous and probably don't have any dangerous consequences at this time. However, if you would like to render your code fully standard-compliant (and be able to use TrustInSoft CI to automatically detect potential serious issues in the future) you might want to consider applying these modification suggestions.

@hawicz
Copy link
Member

hawicz commented Jul 11, 2020

  1. gcc isn't the only compiler, so simply depending on the gcc extension is insufficient. Theoretically, we could use the C99 syntax (char idata[]) but I'm not sure how well that's supported on systems using e.g. VS2010. This would require further investigation to figure out what actually to do.

  2. This seems like it's barely worth doing, since we don't really care what the result of converting the integer to a pointer actually, just that it's consistent. One problem is that making this change would require exporting a couple more symbols, which I'd prefer not to do. I think we'll skip this one.

  3. eh.. I suppose adding the cast isn't a big deal. Will do. (6542d33)

  4. This "fix" makes no sense. If you're going to intentionally call json_tokener_new_ex() with a negative value, well that's the calling code's problem, not json-c's.

hawicz added a commit that referenced this issue Jul 11, 2020
@hawicz
Copy link
Member

hawicz commented Jul 11, 2020

Oh, the other problem with #2 is that, even if we reuse some existing exported symbol (e.g. #define LH_EMPTY (void *)(&lh_table_new)) existing already-compiled code could still be using the old values, and thus this would be an ABI change. I noted this as a possible change to include in a "1.0" major release.

@jakub-zwolakowski
Copy link
Author

What concerns the issue 1: I understand your concern about the portability. Unfortunately the solution you propose (char idata[]) is not better in this case: using a flexible array member in a union is not allowed by the C standard (C11 6.7.2.1p18), but also it is not even a gcc extension (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53548). So another workaround is necessary.

Also: do you think that testing/deploying TrustInSoft CI on json-c to guarantee no Undefined Behavior might be beneficial for the project? What do you think about this sample of the results that the tool can provide? We’re currently testing TrustInSoft CI on public projects on GitHub so any feedback is greatly appreciated.

@hawicz
Copy link
Member

hawicz commented Jul 14, 2020

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

As a special case, the last element of a structure with more than one named member may have an incomplete array type; ...

Ah, I see what you mean, that only applies to a struct, not a union. hmm... If we were to swap the order of the fields and make it an unnamed union, then paragraph 13 applies:

The members of an anonymous structure or union are considered to be members of the containing structure or union
and then idata is the last element of the structure, and thus compilers "should" be perfectly fine with it.

However, I feel like practically speaking making that change would cause more problems, since incomplete array types are less likely to be handled naturally than an array type with some actual size. In short, I think we should leave it as-is, with char idata[1].

A similar consideration or practicality applies for whether I think using TrustinSoft would be beneficial. Though I might get a warm fuzzy feeling having some kind of check against UB, whether the code actually works and whether it is written in a way that doesn't compromise the readability just to gain a "pass" on that check is more important. Reports of newly introduced UB would be welcome, since that stands a good chance of indicating areas that might need careful analysis, but I won't be considering all such potential UB as something that must be changes/fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants