-
Notifications
You must be signed in to change notification settings - Fork 1.7k
CPP: Add query for CVE-2022-37454: Integer addition may overflow inside if statement #12036
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @nmouha! Thanks for your contribution. Since this PR is part of a bounty process the Security Lab team will do an initial review the query, and once they're happy with the intention of the query and its results the CodeQL C/C++ team will review the query. |
Hi @nmouha, your query is looking for integer overflow, but there are no constraints that a, b and c are actually integers. One of the false positives patterns I have noticed is that it flags |
Thank you @JarLob for your feedback! Sorry to overlook this obvious false positive. I've updated the query to make sure that it involves an integral type, and I've also incorporated all previous feedback of @rdmarsh2 in #codeql-writing. If there are further improvements to consider, I'd be happy to follow up! |
The latest query still flags double and int comparison, which is IMHO wrong as int is implicitly converted to double. |
Thanks again @JarLob! I just updated the query to address your feedback. The previous query detected The Regarding potential overlap with other queries, the previous query had an overlap with Looking forward to hear your thoughts! |
Hi @nmouha!
The second version was detecting
To give you some insights, the first version of the query returned 129 results when run on a 1000 projects, the v2 - 6315 on the same projects. v3 - 997. Technically any int size;
int capacity;
int newElements;
if (size + newElements > capacity)
newElements = capacity - size; is good, because its
What I meant is that if the query is going to diverge from the
I think you have enough information in the query to show how the code should be rewritten? I think that would be valuable. Many developers may not understand immediately what is wrong with the code. If that is too complicated, I think at least a message like |
Thanks again @JarLob! I really appreciate that you're guiding me through this. I've updated the query and I'm including test cases below. If I misunderstood something or if certain patterns should be added/removed, just let me know and I'll adjust the query!
|
Hi @nmouha, sorry for the delay. The query looks good. However to clarify, what I suggested is not modifying the description of the query (it is not visible in alerts), but the codeql/cpp/ql/src/experimental/Security/CWE/CWE-190/AllocMultiplicationOverflow.ql Lines 39 to 41 in a45a0ee
"if (a+b>c) a=c-b" was detected where "a + b" may potentially overflow/wraparound. The code can be rewritten to "if (a>c-b) a=c-b" which avoids the overflow for each alert with actual a, b and c .
We definitely would like you to put the test from your comment into a test like in https://github.com/github/codeql/tree/main/cpp/ql/test/experimental/query-tests/Security/CWE/CWE-190/AllocMultiplicationOverflow, but this is what I think @MathiasVP will comment on. One comment I have on the test though is these lines: // allow overlap with SignedOverflowCheck.ql
if (a+b>a) a = a-b; Wouldn't an attempt to detect overflow look like |
Thank you @JarLob! I've reverted the description and I've updated the alert to look more similar to If we want the variable names corresponding to a, b, and c, then maybe the text becomes more difficult to read, and it may also be less clear that only the addition needs to be replaced:
However, if you want, I can make this change with a select statement along these lines:
I've also added a It would be helpful to extend the tests with some false positives that occur in practice, and maybe even improve the query to avoid them. I'm not sure how to approach this; as you mentioned the query returns results for some existing projects, but I don't which of those results are potential false positives that can be added to the tests. |
I see your point. I defer this to the CodeQL team to decide which is the best way to report.
I mentioned false positives because of floating point types, but I think they are gone now. There could be an additional test: Here is an example of current FP because The submission is now in CodeQL review stage, they may have comments. |
That's exactly the type of false positive example that I was looking for. Thank you @JarLob! It seems that the false positive is due to an inherent limitation of codeql/cpp/ql/lib/semmle/code/cpp/rangeanalysis/SimpleRangeAnalysis.qll Lines 1684 to 1685 in 720eed3
which returns true after a few iterations due to A more precise analysis (e.g., using a symbolic execution tool) shows that addition here can never overflow. But such tools have a risk of path explosion, which CodeQL wants to avoid. So rather than looking for any OpenSSL has the same code pattern but with a ternary conditional operator instead. (I can write the query to include this result, however I confirmed that it is also a false positive.) I personally think the following code is a much more readable way to ensure that
But PS: I added the additional test case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @nmouha,
Thanks for your query. This looks great! I have a couple of comments and suggestions. Let me know if something isn't clear in my explanations, and I'll do my best to expand on it. Also, it looks like your code isn't formatted according to our contribution guidelines (see Section 3 of https://github.com/github/codeql/blob/main/CONTRIBUTING.md#submitting-a-new-experimental-query)
* @description Detects "if (a+b>c) a=c-b", which incorrectly implements | ||
* a = min(a,c-b) if a+b overflows. Should be replaced by | ||
* "if (a>c-b) a=c-b". Also detects "if (b+a>c) a=c-b" | ||
* (swapped terms in addition), if (a+b>c) { a=c-b }" | ||
* (assignment inside block), "c<a+b" (swapped operands) and | ||
* ">=", "<", "<=" instead of ">" (all operators). This | ||
* integer overflow is the root cause of the buffer overflow | ||
* in the SHA-3 reference implementation (CVE-2022-37454). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can keep the description short, and just off-load some of these other patterns to the qhelp file. How about this suggestion instead?
* @description Detects "if (a+b>c) a=c-b", which incorrectly implements | |
* a = min(a,c-b) if a+b overflows. Should be replaced by | |
* "if (a>c-b) a=c-b". Also detects "if (b+a>c) a=c-b" | |
* (swapped terms in addition), if (a+b>c) { a=c-b }" | |
* (assignment inside block), "c<a+b" (swapped operands) and | |
* ">=", "<", "<=" instead of ">" (all operators). This | |
* integer overflow is the root cause of the buffer overflow | |
* in the SHA-3 reference implementation (CVE-2022-37454). | |
* @description Writing 'if (a+b>c) a=c-b' incorrectly implements | |
* 'a = min(a,c-b)' if 'a+b' overflows. This integer | |
* overflow is the root cause of the buffer overflow | |
* in the SHA-3 reference implementation (CVE-2022-37454). |
blockstmt.getAStmt() = exprstmt)) and | ||
exprstmt.getExpr() = assignexpr and | ||
assignexpr.getRValue() = subexpr and | ||
((hashCons(addexpr.getLeftOperand()) = hashCons(assignexpr.getLValue()) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I think I see why you need hashCons
here: Does addexpr.getLeftOperand()
not have the same value number as assignexpr.getLValue()
? So you're forced to use hashCons
instead?
(ifstmt.getThen() = blockstmt and | ||
blockstmt.getAStmt() = exprstmt)) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a subtle issue here that the missing auto-formatting is hiding really well 😅. You're only providing a value for blockstmt
in the ifstmt.getThen() = blockstmt and blockstmt.getAStmt() = exprstmt
case, but not in the ifstmt.getThen() = exprstmt
case. If we're not rescued by the CodeQL optimizer this will generate a cartesian product between all the IfStmt ifstmt, RelationalOperation relop, ExprStmt exprstmt, AssignExpr assignexpr, AddExpr addexpr, SubExpr subexpr
values considered by the query, and all BlockStmt
s in the program. To avoid this, you can wrap the ifstmt.getThen() = blockstmt and blockstmt.getAStmt() = exprstmt
case in an exists
like this.
(ifstmt.getThen() = blockstmt and | |
blockstmt.getAStmt() = exprstmt)) and | |
exists(BlockStmt blockstmt | ifstmt.getThen() = blockstmt and blockstmt.getAStmt() = exprstmt) and |
This limits the scope of the blockstmt
and avoids the possibility of a cartesian product (which will hurt performance a lot on large databases).
import semmle.code.cpp.commons.Exclusions | ||
|
||
from IfStmt ifstmt, RelationalOperation relop, ExprStmt exprstmt, BlockStmt blockstmt, AssignExpr assignexpr, AddExpr addexpr, SubExpr subexpr | ||
where ifstmt.getCondition() = relop and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could increase the true-positive rate of this query a lot if you used our semmle.code.cpp.controlflow.Guards
library. This would allow you to detect more patterns such as this one completely automagically:
if (a+b <= c) {
/* ... */
} else { a=c-b; }
To use semmle.code.cpp.controlflow.Guards
, the only changes I think you need to make is:
- Import
semmle.code.cpp.controlflow.Guards
- Replace
from IfStmt ifstmt
withfrom GuardCondition guard
- Replace
ifstmt.getCondition() = relop
with(guard.ensuresLt(_, addexpr, _, block, true) or guard.ensuresLt(addexpr, _, _, block, true))
. This says thataddexpr
must be an operand that is on either side of a comparison. Now admittedly, this isn't as pretty as before. But you can factor this out into a predicate to make it look like the code you had before 🙂.
This would also allow you to get rid of the blockstmt
variable, and get rid of the two explicit cases for:
(ifstmt.getThen() = exprstmt or(ifstmt.getThen() = blockstmt and blockstmt.getAStmt() = exprstmt))
and instead replace them with a single condition: block.getANode() = exprstmt
.
Hi @MathiasVP, Thank you very much for the detailed explanations! I was wondering why so many queries are written using I've updated the pull request according to your suggestions. Below are some explanations: Indeed, I'm suggesting to change
Thanks again and don't hesitate to let me know if you have additional feedback! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with the exception of one small question.
GuardCondition guard, Expr expr, ExprStmt exprstmt, BasicBlock block, AssignExpr assignexpr, | ||
AddExpr addexpr, SubExpr subexpr | ||
where | ||
(guard.ensuresLt(expr, addexpr, 0, block, _) or guard.ensuresLt(addexpr, expr, 0, block, _)) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like you lost your not isFromMacroDefinition(relop) and
condition in 5c6fc2f. Was that intentional?
The expression "if (a+b>c) a=c-b" is incorrect if "a+b" overflows. It should be replaced by "if (a>c-b) a=c-b", which avoids the potential overflow and is much easier to understand. This pattern is the root cause of CVE-2022-37454, a buffer overflow vulnerability in the "official" SHA-3 implementation. It has been confirmed that the addition in https://github.com/openssl/openssl/blob/master/providers/implementations/kdfs/hkdf.c#L534 cannot overflow. So this is only a minor change proposal to avoid a potentially vulnerable code pattern and to improve readability. More information: github/codeql#12036 (comment) CLA: trivial Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from #20990) (cherry picked from commit 56a51b5)
The expression "if (a+b>c) a=c-b" is incorrect if "a+b" overflows. It should be replaced by "if (a>c-b) a=c-b", which avoids the potential overflow and is much easier to understand. This pattern is the root cause of CVE-2022-37454, a buffer overflow vulnerability in the "official" SHA-3 implementation. It has been confirmed that the addition in https://github.com/openssl/openssl/blob/master/providers/implementations/kdfs/hkdf.c#L534 cannot overflow. So this is only a minor change proposal to avoid a potentially vulnerable code pattern and to improve readability. More information: github/codeql#12036 (comment) CLA: trivial Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from #20990) (cherry picked from commit 56a51b5)
The expression "if (a+b>c) a=c-b" is incorrect if "a+b" overflows. It should be replaced by "if (a>c-b) a=c-b", which avoids the potential overflow and is much easier to understand. This pattern is the root cause of CVE-2022-37454, a buffer overflow vulnerability in the "official" SHA-3 implementation. It has been confirmed that the addition in https://github.com/openssl/openssl/blob/master/providers/implementations/kdfs/hkdf.c#L534 cannot overflow. So this is only a minor change proposal to avoid a potentially vulnerable code pattern and to improve readability. More information: github/codeql#12036 (comment) CLA: trivial Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from #20990)
Hi @MathiasVP, The Just FYI: the false positive found by @JarLob resulted in changes to both OpenSSL and |
QHelp previews: cpp/ql/src/experimental/Security/CWE/CWE-190/IfStatementAdditionOverflow.qhelpInteger addition may overflow inside if statementDetects Also detects variants such as This integer overflow is the root cause of the buffer overflow in the SHA-3 reference implementation (CVE-2022-37454). RecommendationReplace by References
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. This LGTM now!
The query detects "if (a+b>c) a=c-b", which is incorrect if a+b overflows. This code should be replaced by "if (a>c-b) a=c-b", which avoids the overflow and is much easier to understand. (It becomes much clearer that it implements "a = min(a,c-b)".)
This pattern is the root cause of CVE-2022-37454, a buffer overflow vulnerability in the "official" SHA-3 implementation that has been fixed in the latest versions of some large projects (including PHP and Python). More info: https://mouha.be/sha-3-buffer-overflow/