Skip to content

Conversation

ihsinme
Copy link
Contributor

@ihsinme ihsinme commented May 9, 2022

in this request, I'm looking for confusion when working with function mbtowc. as practice has shown, we can often run abroad.

CVE-2019-19221

@ihsinme ihsinme requested a review from a team as a code owner May 9, 2022 13:18
@MathiasVP
Copy link
Contributor

Hi @ihsinme,

Thanks for another contribution. We'll review it very soon!

@geoffw0 geoffw0 self-requested a review May 24, 2022 08:25
@geoffw0 geoffw0 self-assigned this May 24, 2022
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The qhelp looks fine, I want to clear up my understanding of the tests, then I will review the QL.

Comment on lines +32 to +33
int len = 9;
for (wchar_t wc; (ret = mbtowc(&wc, ptr, 16)) > 0; len-=ret) { // GOOD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be:

Suggested change
int len = 9;
for (wchar_t wc; (ret = mbtowc(&wc, ptr, 16)) > 0; len-=ret) { // GOOD
int len = 10;
for (wchar_t wc; (ret = mbtowc(&wc, ptr, len)) > 0; len-=ret) { // GOOD

(if not, what is the intent here?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this and the next situation(goodTest2,goodTest3,goodTest4), there is an indication of the presence of zero at the end of the buffer, so the overflow will not occur. But if the read value is less than the maximum length of this character, then you can get unpredictable results, like badTest1,badTest2 here.
therefore it is necessary to specify some maximum character size value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand now, the rules / assumptions are:

  • if the input is not null terminated, we must specify the size of the input to prevent overflow.
  • if the input is null terminated, overflow is not a concern because the 0 will prevent it.
    • but if the size parameter is less than MB_LEN_MAX, mbtowc could fail, so the size parameter should be >= MB_LEN_MAX (or MB_CUR_MAX).
  • the return value of mbtowc should only be used to update the input pointer, not the output pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the clear wording.
If I could write it right away, it would be easier to understand.
I'm sorry.

break;
len-=ret;
ptr+=ret;
wc++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this example. I first assumed wc is an array of wide characters and wc_len is its length - but then why are we passing it (via len = wc_len) it into mbtowc as the length of the array of byte characters, ptr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the case of using this buffer length ptr, I just looked at CVE for a long time and wanted to reduce the FP.

If you want I can change this name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think I see. In the CVE they allocate a wide string with length equal to the length of the input string, because each character of input produces at most one wide character of output. This might be clearer if you allocate wc here rather than taking it as a parameter, something like:

wc_len = len; // sufficient because each character of input produces at most one wide character of output
wchar_t *wc = new wchar_t[wc_len];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you want to see it in this particular example or in all of them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the same applies to goodTest5 (this case), badTest3, badTest4, badTest5, badTest6 and badTest7. I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry to be late with the reply.
I fixed it.

@ihsinme
Copy link
Contributor Author

ihsinme commented Jul 4, 2022

@geoffw0 look when you have time.
I expanded the query.
added

for functions "mbstowcs","_mbstowcs_l","mbsrtowcs","MultiByteToWideChar","WideCharToMultiByte"

  1. undefined behavior when crossing buffers
  2. undefined behavior with a null destination buffer
  3. situations of absence of zero termination
  4. memory size calculation error

for work with character conversion

  1. direct copy errors
  2. error when establishing termination

show me the work for all test files so that i can add the correct result.

@geoffw0
Copy link
Contributor

geoffw0 commented Jul 12, 2022

show me the work for all test files so that i can add the correct result.

I've added the results from the test runner here: ihsinme#1

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query is doing too many different things now, and some of them don't correspond with the query name or qhelp provided. Its difficult to review and will be difficult to maintain and use. I think it needs more focus, or breaking up into smaller queries.

A quick variant analysis run found one result in the wild, that was for the case: "Access beyond the allocated memory is possible, the length can change without changing the pointer." (in pwsafe).

Comment on lines 120 to 121
arrayExpr.getArrayBase().(VariableAccess).getTarget().getADeclarationEntry().getType() =
arrayType and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be simplified to:

arrayExpr.getArrayBase().getType() =
      arrayType and

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to think, probably this will add to the detection
situation a[]={1,2,3}, I'll see if we don't lose anything, I'll fix it.

Thank you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On test examples, I did not find a difference and made your suggestion into the code.

) and
arrayType.getArraySize() != arrayType.getByteSize() and
msg =
"The size of the array element is greater than one byte, so the offset will point outside the array."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case seems reasonable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an attempt to add index confusion detection when element size is not 1

loop.getCondition().(LTExpr).getRightOperand() = sizeofArray
) and
msg =
"This buffer may contain multibyte characters, so attempting to copy may result in part of the last character being lost."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell what's going on here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you saw the post here github/securitylab#688 (comment) and noticed the apparent symmetry with the test file, so I didn't explicitly state it. this is an attempt to add the situation described in https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2008/c5t35k3c(v=vs.90).

"Access beyond the allocated memory is possible, the length can change without changing the pointer."
or
exists(AssignPointerAddExpr aetmp |
"This buffer may contain multibyte characters, so an attempt to copy may result in an overflow."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nor here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@ihsinme
Copy link
Contributor Author

ihsinme commented Jul 17, 2022

You're right, I missed the changes for qhelp and the name. how do you look at the following correction options.

DangerousUseMbtowc to DangerousWorksWithMultibyteOrWideCharacters

Using function <code>mbtowc</code> with an invalid length argument can result in an out-of-bounds access error or unexpected result. If you are sure you are working with a null-terminated string, use the length macros, if not, use the correctly computed length.
Using a function to convert multibyte or wide characters with an invalid length argument may result in an out-of-range access error or unexpected results.

I also agree that the query has gotten bigger (although it still considers one error class), but I hope you guide me and I'll combine the predicates into either classes or a common import.

@ihsinme
Copy link
Contributor Author

ihsinme commented Aug 3, 2022

good afternoon @geoffw0.
any news on this PR?

@geoffw0
Copy link
Contributor

geoffw0 commented Aug 8, 2022

Hi @ihsinme .

I have no further suggestions for improvements here.

It looks like the checks have failed because DangerousUseMbtowc.ql needs autoformatting. Once that's fixed I will merge.

@ihsinme
Copy link
Contributor Author

ihsinme commented Aug 8, 2022

Hi @geoffw0.
I added corrections, including the text corrections I wrote about above.

Can you start checking?

exists(FunctionCall fc |
fc = exp and
(
exists(Loop lptmp | lptmp = fc.getEnclosingStmt().getParentStmt*()) and

Check warning

Code scanning / CodeQL

Expression can be replaced with a cast

The assignment to [lptmp](1) in the exists(..) can replaced with an instanceof expression.
}

/** Holds if expression is constant or operator call `sizeof`. */
predicate argConstOrSizeof(Expr exp) {

Check warning

Code scanning / CodeQL

Missing QLDoc for parameter

The QLDoc has no documentation for exp, but the QLDoc mentions sizeof
Comment on lines +135 to +138
mbFunction.getTarget().getName().matches("_mbs%") or
mbFunction.getTarget().getName().matches("mbs%") or
mbFunction.getTarget().getName().matches("_mbc%") or
mbFunction.getTarget().getName().matches("mbc%")

Check notice

Code scanning / CodeQL

Use a set literal in place of `or`

This formula of 4 predicate calls can be replaced with a single call on a set literal, improving readability.
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

The checks have all passed. I'll merge this now.

@geoffw0 geoffw0 merged commit db8a310 into github:main Aug 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants