-
Notifications
You must be signed in to change notification settings - Fork 1.8k
CPP: Add query for CWE-125 Out-of-bounds Read with different interpretation of the string when use mbtowc #9089
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @ihsinme, Thanks for another contribution. We'll review it very soon! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The qhelp
looks fine, I want to clear up my understanding of the tests, then I will review the QL.
cpp/ql/src/experimental/Security/CWE/CWE-125/DangerousUseMbtowc.qhelp
Outdated
Show resolved
Hide resolved
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-125/semmle/tests/test.cpp
Show resolved
Hide resolved
int len = 9; | ||
for (wchar_t wc; (ret = mbtowc(&wc, ptr, 16)) > 0; len-=ret) { // GOOD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be:
int len = 9; | |
for (wchar_t wc; (ret = mbtowc(&wc, ptr, 16)) > 0; len-=ret) { // GOOD | |
int len = 10; | |
for (wchar_t wc; (ret = mbtowc(&wc, ptr, len)) > 0; len-=ret) { // GOOD |
(if not, what is the intent here?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this and the next situation(goodTest2,goodTest3,goodTest4), there is an indication of the presence of zero at the end of the buffer, so the overflow will not occur. But if the read value is less than the maximum length of this character, then you can get unpredictable results, like badTest1,badTest2 here.
therefore it is necessary to specify some maximum character size value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand now, the rules / assumptions are:
- if the input is not null terminated, we must specify the size of the input to prevent overflow.
- if the input is null terminated, overflow is not a concern because the
0
will prevent it.- but if the size parameter is less than
MB_LEN_MAX
,mbtowc
could fail, so the size parameter should be >=MB_LEN_MAX
(orMB_CUR_MAX
).
- but if the size parameter is less than
- the return value of
mbtowc
should only be used to update the input pointer, not the output pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the clear wording.
If I could write it right away, it would be easier to understand.
I'm sorry.
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-125/semmle/tests/test.cpp
Show resolved
Hide resolved
break; | ||
len-=ret; | ||
ptr+=ret; | ||
wc++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this example. I first assumed wc
is an array of wide characters and wc_len
is its length - but then why are we passing it (via len = wc_len
) it into mbtowc
as the length of the array of byte characters, ptr
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the case of using this buffer length ptr
, I just looked at CVE for a long time and wanted to reduce the FP.
If you want I can change this name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I think I see. In the CVE they allocate a wide string with length equal to the length of the input string, because each character of input produces at most one wide character of output. This might be clearer if you allocate wc
here rather than taking it as a parameter, something like:
wc_len = len; // sufficient because each character of input produces at most one wide character of output
wchar_t *wc = new wchar_t[wc_len];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you want to see it in this particular example or in all of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the same applies to goodTest5
(this case), badTest3
, badTest4
, badTest5
, badTest6
and badTest7
. I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry to be late with the reply.
I fixed it.
cpp/ql/test/experimental/query-tests/Security/CWE/CWE-125/semmle/tests/test.cpp
Show resolved
Hide resolved
…c.qhelp Co-authored-by: Geoffrey White <40627776+geoffw0@users.noreply.github.com>
@geoffw0 look when you have time. for functions "mbstowcs","_mbstowcs_l","mbsrtowcs","MultiByteToWideChar","WideCharToMultiByte"
for work with character conversion
show me the work for all test files so that i can add the correct result. |
I've added the results from the test runner here: ihsinme#1 |
C++: Accept test results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This query is doing too many different things now, and some of them don't correspond with the query name or qhelp provided. Its difficult to review and will be difficult to maintain and use. I think it needs more focus, or breaking up into smaller queries.
A quick variant analysis run found one result in the wild, that was for the case: "Access beyond the allocated memory is possible, the length can change without changing the pointer." (in pwsafe
).
arrayExpr.getArrayBase().(VariableAccess).getTarget().getADeclarationEntry().getType() = | ||
arrayType and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be simplified to:
arrayExpr.getArrayBase().getType() =
arrayType and
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to think, probably this will add to the detection
situation a[]={1,2,3}
, I'll see if we don't lose anything, I'll fix it.
Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On test examples, I did not find a difference and made your suggestion into the code.
) and | ||
arrayType.getArraySize() != arrayType.getByteSize() and | ||
msg = | ||
"The size of the array element is greater than one byte, so the offset will point outside the array." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case seems reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is an attempt to add index confusion detection when element size is not 1
loop.getCondition().(LTExpr).getRightOperand() = sizeofArray | ||
) and | ||
msg = | ||
"This buffer may contain multibyte characters, so attempting to copy may result in part of the last character being lost." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't tell what's going on here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you saw the post here github/securitylab#688 (comment) and noticed the apparent symmetry with the test file, so I didn't explicitly state it. this is an attempt to add the situation described in https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2008/c5t35k3c(v=vs.90).
"Access beyond the allocated memory is possible, the length can change without changing the pointer." | ||
or | ||
exists(AssignPointerAddExpr aetmp | | ||
"This buffer may contain multibyte characters, so an attempt to copy may result in an overflow." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nor here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
You're right, I missed the changes for
I also agree that the query has gotten bigger (although it still considers one error class), but I hope you guide me and I'll combine the predicates into either classes or a common import. |
good afternoon @geoffw0. |
Hi @ihsinme . I have no further suggestions for improvements here. It looks like the checks have failed because |
…deCharacters.expected
…byteOrWideCharacters.qlref
Hi @geoffw0. Can you start checking? |
exists(FunctionCall fc | | ||
fc = exp and | ||
( | ||
exists(Loop lptmp | lptmp = fc.getEnclosingStmt().getParentStmt*()) and |
Check warning
Code scanning / CodeQL
Expression can be replaced with a cast
} | ||
|
||
/** Holds if expression is constant or operator call `sizeof`. */ | ||
predicate argConstOrSizeof(Expr exp) { |
Check warning
Code scanning / CodeQL
Missing QLDoc for parameter
mbFunction.getTarget().getName().matches("_mbs%") or | ||
mbFunction.getTarget().getName().matches("mbs%") or | ||
mbFunction.getTarget().getName().matches("_mbc%") or | ||
mbFunction.getTarget().getName().matches("mbc%") |
Check notice
Code scanning / CodeQL
Use a set literal in place of `or`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
The checks have all passed. I'll merge this now.
in this request, I'm looking for confusion when working with function mbtowc. as practice has shown, we can often run abroad.
CVE-2019-19221