-
Notifications
You must be signed in to change notification settings - Fork 1.7k
CPP: Add query for CWE-758: Reliance on Implementation-Defined Behavior when using malloc with zero size #9088
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ihsinme,
Thanks for another contribution. Here are my initial comments.
In general, these queries are getting fairly long and complex. Have you considered whether some of our more high-level libraries could be of use to you? For instance, we have a Guards library for figuring out which expression is guarding which statements. It seems like this is something you could use to simplify this query quite a bit.
For instance, this query could be restated as: "Find statements that are guarded by the result of a malloc
, but where the length argument hasn't been checked against zero".
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.ql
Outdated
Show resolved
Hide resolved
exists(AssignBitwiseOperation ab | | ||
ab.getLValue().(VariableAccess).getTarget() = fc.getArgument(0).(VariableAccess).getTarget() and | ||
ab.getASuccessor*() = fc | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The QLDoc (and the predicate name) makes it sound like this predicate is looking for guards occurring before the function call fc
, but this case is looking for bitwise assignment operators. Can you add a comment why this is necessary?
Likewise, why do we need a case for Initializer
s and Assignment
s in this predicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps a more extended comment before the predicate would suffice, or make a comment on each block?
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.ql
Outdated
Show resolved
Hide resolved
CI is complaining due to a non-ASCII character:
|
…cWithZeroSize.ql Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
…cWithZeroSize.ql Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
please tell me the error |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are my final review comments for this PR. If you intend to open a bounty for this PR, please do so after you've addressed these issues.
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.ql
Outdated
Show resolved
Hide resolved
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.ql
Outdated
Show resolved
Hide resolved
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-758/semmle/tests/test.cpp
Show resolved
Hide resolved
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-758/semmle/tests/test.cpp
Show resolved
Hide resolved
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-758/semmle/tests/test.cpp
Show resolved
Hide resolved
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-758/semmle/tests/test.cpp
Outdated
Show resolved
Hide resolved
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-758/semmle/tests/test.cpp
Show resolved
Hide resolved
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-758/semmle/tests/test.cpp
Outdated
Show resolved
Hide resolved
…cWithZeroSize.ql Co-authored-by: Mathias Vorreiter Pedersen <mathiasvp@github.com>
cpp/ql/src/experimental/Security/CWE/CWE-758/DangerousUseMallocWithZeroSize.qhelp
Outdated
Show resolved
Hide resolved
Thanks for the suggestion. I looked at the library if you mean replace exists(IfStmt ifs, ComparisonOperation co, Expr ec |
ifs.getCondition().getAChild*() = co and
co.hasOperands(fc.getArgument(0).(VariableAccess).getTarget().getAnAccess(), ec) and
exists(Expr etmp |
etmp.getValue() = ["0", "1"] and
(
ec = etmp or
globalValueNumber(ec) = globalValueNumber(etmp)
)
)
) on the exists(GuardCondition guard, Expr exp |
exp.getValue() = ["0", "1"] and
guard.ensuresEq(globalValueNumber(fc.getArgument(0)).getAnExpr(), exp, _, _, _)
or
guard = globalValueNumber(fc.getArgument(0)).getAnExpr()
) then it won't allow to exclude comparisons anywhere, including after a call to malloc, excluding something like 289:
290: *bits_out = NULL;
291: *len_out = 0;
292: if (len == 0)
293: return ASN1_BAD_LENGTH;
294: unused = *asn1++;
295: len--;
296: if (unused > 7)
297: return ASN1_BAD_FORMAT;
298:
299: bits = malloc(len);
300: if (bits == NULL)
301: return ENOMEM;
302: memcpy(bits, asn1, len);
303: if (len > 1)
304: bits[len - 1] &= (0xff << unused);
305:
306: *bits_out = bits;
307: *len_out = len;
308: return 0;
309:} https://github.com/krb5/krb5/blob/master/src/lib/krb5/asn.1/asn1_encode.c#L299-L299 i did not understand how to make it easier to work with the guard, although I would like to note that it catches situations that I miss if(arg0) , I will think about them if there are many fp. I want to note that I will try to use more high-level libraries. I just really like to clean up the FP and forget about the beauty of the code, sorry. ( |
While I do like beautiful code, it's not just about beautiful code. Using higher-level libraries (like the guards library, or the dataflow library, or the range analysis library) will make your code more efficient (because those libraries are written by extremely competent QL writers) and easier for us to review (because the CodeQL reviewers know those libraries very well, and they capture the intended meaning of a piece of code much better than an ad-hoc list of AST classes). In any case, I appreciate you trying to use the Guards library in this case. For future reference: What I usually do is a mix of both. The standard library will give me most of the cases I care about, and I can add a couple of ad-hoc cases on top of that. |
{ | ||
char *ptr; | ||
int len; | ||
len = a+b; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a good test? Couldn't a
and b
be both 0
? I have very little experience with cpp, so please correct me if I'm wrong 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are absolutely right, this example is not unambiguous.
this came about as a compromise to clean up possible FP situations, and I thought it would be better to remove all possible addition operations, instead of complicating the query with different analysis situations.
but if you think this is a problem, I'll change it to a constant and tweak the request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation! If you want to leave the test as is, I'd suggest adding a small comment for a future reviewer not to have the same question :)
What is the reason behind |
good afternoon @MathiasVP. |
good afternoon @MathiasVP. |
Hi @ihsinme, Your latest changes look good to me! I see that the query doesn't have a whole lot of results on LGTM. It's difficult to say whether this is because people are super careful with their
This will likely find a lot more results (some of which will be FPs, but hopefully some of them will also be true positives). |
Hi @MathiasVP Thanks for the idea of query extension and separately for the implementation paths. what you suggest may have more impact than undefined behavior, i would like to stay in the original logic of this request. if you notice, I suggested expanding the query along the line of undefined behavior by adding two situations that are not related to
I'm hoping the request will pass the reward bar as it stands, but if that doesn't allow it to pass the reward score, then I'll consider what other simple situations of undefined behavior can be added. |
…definedBehavior.qhelp
…definedBehavior.qlref
cpp/ql/src/experimental/Security/CWE/CWE-758/SomeCasesUndefinedBehavior.ql
Fixed
Show fixed
Hide fixed
dangerousUseBufferAndSize(fc) | ||
or | ||
fc.getTarget().getAnAttribute().hasName("noreturn") and | ||
exists(ReturnStmt rs | fc.getTarget().getEntryPoint().getASuccessor+() = rs) and |
Check warning
Code scanning / CodeQL
Expression can be replaced with a cast
cpp/ql/src/experimental/Security/CWE/CWE-758/SomeCasesUndefinedBehavior.ql
Fixed
Show fixed
Hide fixed
( | ||
thisFunctionFread(etmp.(FunctionCall), erttmp) | ||
or | ||
erttmp.getValue() = "0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As Code Scanning is saying here, you're not binding the etmp
variable in this case. This means that you'll get a cartesian product during execution, which will cause your analysis to blow up on large codebases. You should either:
- Bind
etmp
in this branch, or - Move this case out of the
exists
that bringsetmp
into scope.
Does that make sense?
Hi @ihsinme, Sorry I haven't gotten back to you on this. Here are some comments:
I don't think my suggestion changes the impact of the query. The strategy I sketched would still attempt to flag the cases you're looking for, but only do so when an attacker can activate the undefined behavior. The query is growing increasingly more general, and I think it's difficult to write proper documentation for a query that collects a bunch of possible undefined behavior tests. I'll leave it to the SecurityLab team to review the new direction of the query. If Security Lab decides that the query is still not finding interesting results I think we should reconsider whether this query is something we want to merge. |
Good afternoon @MathiasVP. long pauses complicate the analysis (((
In the near future I will correct the errors of the analyzer.
I'm not opposed to your suggestion, but my analysis shows that the ability to influence the amount of allocated memory by user input is very rare in projects with a large number of stars, and by extension of influence, I meant a critically large amount of memory allocation. but I'm ready to do it.
last time #9088 (comment) you said you liked the changes, now you think they are controversial because of the documentation and I need to offer a more detailed description?
this is a very controversial point, you are proposing to decide whether the request is suitable for a merge (not for an award) based on the result of SecurityLab team. this is strange, perhaps there will be few detections on their collection of projects, which only means that it will not pass the reward bar, but this everything can be useful to improve the code. At the same time, I repeat once again, I am not trying to challenge your decision, I am trying to understand it. |
Yeah, I haven't been keeping a good eye on this PR. I'm sorry. I'll make sure that it won't happen again!
Awesome. Thank you!
Alright. That's fair. This was mostly a suggestion for how to change the strategy of your query if that's what you wanted to do eventually.
Sorry if I didn't make this clear enough. I'm saying that I'm not too keen on the direction of this query. I really like the idea of a query that finds places where the argument of allocation functions could be But, in addition to the undefined behavior related to calling I think this should really be 3 queries: One query for Of course, all of these queries should demonstrate that they're finding meaningful results in real-world codebases.
Whether or not a query is suitable for being merged into the CodeQL repository depends on many factors. A few of them are:
For point 1 the analysis of the Security Lab team suggests that the query doesn't find many vulnerabilities in real-world code. For point 2, I think the query is growing increasingly complex, and for that reason, I'd really like to see some more good alerts from this query that warrant the complexity of the query. |
Thanks for the quick response.
I agree with the division of the request into several is good not only in this case. but I don't know if this can be considered as a valid request for a reward or for each small request one will have to look for a CVE
Does this mean that the result of running commands SecurityLab on the current version of the query is already known? |
Oh, no. Sorry. I meant from the first round of reviewing by the Security Lab. |
I agree. @ihsinme I'd suggest taking the basic and organized path here. Let's make this suggestion the way it was thought at first (finding places where the argument of allocation functions could be |
as far as I understand, we have come to the conclusion that @MathiasVP and @jorgectf want it to be a story about zero size, I don’t know the contents of the @jorgectf project collection, but I understand that there is hardly anything simple there, like user input in size. so they will probably tell me that the impact of the request is too low and does not comply with the rules ((( therefore, I would like to understand what point of the rules about rewards I would have violated if I had initially sent these three situations in the request combined in the description as undefined behavior, and did not single out one situation with malloc. perceiving them as such
Or am I misunderstanding everything and we are talking about splitting these situations into 3 different files and still continue to consider the request for a reward as one? @jorgectf in addition, it would be great to understand whether the results appeared in the current version of the query on your collection. because it has increased noticeably on mine. but last time I had a few detects and you didn't have any, so my collection is weaker. |
@ihsinme My last thought wasn't trying to avoid violating any rules, but to polish the contribution in an organized, basic, yet strong query. However, according to how you called the contribution: "CPP: Add query for CWE-758: Reliance on Implementation-Defined Behavior when using malloc with zero size", adding more queries unrelated to the specific objective of the contribution could be counterproductive, as the organized and clear submission of each one in a separated way could be rewarded with higher bounties than the way this contribution is being submitted. Moreover, if you try to make the contribution itself "stronger" by adding more queries with the objective of us finding more results (meaning that the query intended to be finding results does not find enough), that doesn't sound like fair play 🤔 Since this part of the discussion is unrelated to To sum up, with clear suggestions from @MathiasVP that this query as is might not be suitable for a contribution, and from Security Lab that this could be handled in a more organized way to aim for higher bounties, you could either:
Let me know your final decision. |
hello @MathiasVP
therefore, I would like to clarify that the analyzer could not have made a mistake in this place. also about detection |
QHelp previews: cpp/ql/src/experimental/Security/CWE/CWE-758/SomeCasesUndefinedBehavior.qhelpSome cases undefined behavior dangerous.The query considers some cases that can lead to undefined behavior. For example the behavior of the malloc function is not defined when the size value is zero, so it is not possible to evaluate the correctness only by the result of the operation. Add an estimate for the lower bound of the size. ExampleThe following example shows the erroneous and fixed ways to use the malloc function. ...
if (len > 256 || !(ptr = (char *)malloc(len))) // BAD
return;
ptr[len-1] = 0;
...
if (len==0 || len > 256 || !(ptr = (char *)malloc(len))) // GOOD
return;
ptr[len-1] = 0;
...
References
|
cpp/ql/src/experimental/Security/CWE/CWE-758/SomeCasesUndefinedBehavior.ql
Fixed
Show fixed
Hide fixed
While it's very possible that Code Scanning can make a mistake in general, in this case it's correct. See the alert here. The code looks like: exists(Expr erttmp, Expr etmp | // erttmp and etmp are brought into scope here
erttmp.getEnclosingStmt() instanceof ReturnStmt and
ftmp.getTarget().getEntryPoint().getASuccessor*() = erttmp and
(
// NOTE: erttmp and etmp are bound here
thisFunctionFread(etmp, erttmp)
or
// NOTE: Only erttmp is bound here!
erttmp.getValue() = "0"
or
// NOTE: etmp and erttmp is bound in this case
etmp.(AssignExpr).getLValue().(VariableAccess).getTarget() =
erttmp.(VariableAccess).getTarget() and
etmp.(AssignExpr).getRValue().getValue() = "0" and
not exists(VariableAccess vatmp |
vatmp.getTarget() = erttmp.(VariableAccess).getTarget() and
vatmp.getEnclosingStmt() != etmp.getEnclosingStmt() and
vatmp.getEnclosingStmt() != erttmp.getEnclosingStmt() and
etmp.getASuccessor+() = vatmp
)
)
) So as you can see, in the |
Sorry, but despite the good idea, the proposed code is inefficient, and very difficult to maintain. As you declared that you are not willing to improve it further, I am closing this PR. |
I apologize for the late reply, I didn't have a chance to see your comments before yesterday. @MathiasVP I didn't refuse to improve the request and I don't refuse now. On August 31, 2022, I wrote:
implied that I am waiting for the results of the detection from the Security Lab. 10/10/2022 I received your clarification about the complexity of the request and the proposal to wait for the results of the Security Lab.
then I continued the conversation with @jorgectf and moved it into a issue github/securitylab#689 would to separate the requirements of the PR and the requirements of the Security Lab. and here, while I'm willing to do as you say, I just want you to understand my position.
you close the PR without discussing the options, but simply close the PR that you have been working on for almost half a year since May 9, 2022. I am surprised. I want you to understand that you and the other guys I work with on improving the query code are objective CodeQL experts for me, and I always try to follow their recommendations (maybe I don’t always have enough knowledge for this). but there are guys from Security Lab who evaluate the application according to their collections and it happens github/securitylab#475 that after accepting the PR, I have to remove half of the code to satisfy their requirements. ( Now I don't know which is better.
|
this query looks for undefined behaviors associated with a malloc call with size zero.
in this case, we can get a non-zero answer and we will no longer be able to evaluate its correctness.
CVE-2017-6886
CVE-2013-6378
There are two situations of interest, receiving a zero argument, one through an exported function, the second through reading data from a file.