Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions change-notes/1.21/analysis-cpp.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,5 +17,6 @@
| Memory is never freed (`cpp/memory-never-freed`) | More correct results | Support added for more Microsoft-specific allocation functions, including `LocalAlloc`, `GlobalAlloc`, `HeapAlloc` and `CoTaskMemAlloc`. |
| Resource not released in destructor (`cpp/resource-not-released-in-destructor`) | Fewer false positive results | Resource allocation and deallocation functions are now determined more accurately. |
| Comparison result is always the same | Fewer false positive results | The range analysis library is now more conservative about floating point values being possibly `NaN` |
| Wrong type of arguments to formatting function (`cpp/wrong-type-format-argument`) | More correct results and fewer false positive results | This query now more accurately identifies wide and non-wide string/character format arguments on different platforms. Platform detection has also been made more accurate for the purposes of this query. |

## Changes to QL libraries
8 changes: 7 additions & 1 deletion cpp/ql/src/semmle/code/cpp/File.qll
Original file line number Diff line number Diff line change
Expand Up @@ -302,7 +302,13 @@ class File extends Container, @file {
predicate compiledAsMicrosoft() {
exists(Compilation c |
c.getAFileCompiled() = this and
c.getAnArgument() = "--microsoft"
(
c.getAnArgument() = "--microsoft" or
c.getAnArgument().toLowerCase().replaceAll("\\", "/").matches("%/cl.exe")
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Matching on \cl.exe seems fragile to me; for example, what if it's written in upper case or with a forward slash? @ian-semmle, is this the best we can do? Why don't we always get a --microsoft argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed the matter on Slack a few days ago, and agreed that the right way would be for the extractor to tell us explicitly whether a compilation was Microsoft or not.

Right now we might get slightly better accuracy by looking for the existence of the _MSC_VER macro anywhere in the snapshot, though we'll just have to assume that all files are compiled as Microsoft if we see it (so we'll do worse in the probably very rare case of mixed Microsoft and non-Microsoft compilations). Another suggestion was looking for paths beginning C: or similar (which detects a Microsoft file system, rather than compiler, and may work poorly with test path normalization).

I'm happy to implement the _MSC_VER thing if you're convinced it would be preferable. Or just make the cl.exe clause a bit more robust?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, it sounds better to make the cl.exe test more robust. Is it only relevant to test for cl.exe when there's also a --mimic argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think so. My logic was that with matching the \, it's highly unlikely we'll get false results for cl.exe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick Google search shows that some people like to spell it CL.exe, so turning the match into a case-insensitive regex might be necessary. What Compilation argument do we get if the compiler is just invoked as cl, without the .exe? Or the --mimic option only work if .exe is included? I suppose this argument is only ever produced by our own tracer, so your current match might be fine if the tracer normalises everything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made it slash/case insensitive, that seems pretty uncontroversial. Removing the need for .exe seems riskier to me as cl by itself is a very short string that could coincidentally match something else (e.g. some command line flag that happens to be /CL).

) or exists(File parent |
parent.compiledAsMicrosoft() and
parent.getAnIncludedFile() = this
)
}

Expand Down
28 changes: 13 additions & 15 deletions cpp/ql/src/semmle/code/cpp/commons/Printf.qll
Original file line number Diff line number Diff line change
Expand Up @@ -32,19 +32,19 @@ class AttributeFormattingFunction extends FormattingFunction {
* A standard function such as `vprintf` that has a format parameter
* and a variable argument list of type `va_arg`.
*/
predicate primitiveVariadicFormatter(TopLevelFunction f, int formatParamIndex, boolean wide) {
predicate primitiveVariadicFormatter(TopLevelFunction f, int formatParamIndex) {
f.getName().regexpMatch("_?_?va?[fs]?n?w?printf(_s)?(_p)?(_l)?")
and (
if f.getName().matches("%\\_l")
then formatParamIndex = f.getNumberOfParameters() - 3
else formatParamIndex = f.getNumberOfParameters() - 2
) and if f.getName().matches("%w%") then wide = true else wide = false
)
}

private
predicate callsVariadicFormatter(Function f, int formatParamIndex, boolean wide) {
predicate callsVariadicFormatter(Function f, int formatParamIndex) {
exists(FunctionCall fc, int i |
variadicFormatter(fc.getTarget(), i, wide)
variadicFormatter(fc.getTarget(), i)
and fc.getEnclosingFunction() = f
and fc.getArgument(i) = f.getParameter(formatParamIndex).getAnAccess()
)
Expand All @@ -54,11 +54,11 @@ predicate callsVariadicFormatter(Function f, int formatParamIndex, boolean wide)
* Holds if `f` is a function such as `vprintf` that has a format parameter
* (at `formatParamIndex`) and a variable argument list of type `va_arg`.
*/
predicate variadicFormatter(Function f, int formatParamIndex, boolean wide) {
primitiveVariadicFormatter(f, formatParamIndex, wide)
predicate variadicFormatter(Function f, int formatParamIndex) {
primitiveVariadicFormatter(f, formatParamIndex)
or (
not f.isVarargs()
and callsVariadicFormatter(f, formatParamIndex, wide)
and callsVariadicFormatter(f, formatParamIndex)
)
}

Expand All @@ -68,12 +68,10 @@ predicate variadicFormatter(Function f, int formatParamIndex, boolean wide) {
*/
class UserDefinedFormattingFunction extends FormattingFunction {
UserDefinedFormattingFunction() {
isVarargs() and callsVariadicFormatter(this, _, _)
isVarargs() and callsVariadicFormatter(this, _)
}

override int getFormatParameterIndex() { callsVariadicFormatter(this, result, _) }

override predicate isWideCharDefault() { callsVariadicFormatter(this, _, true) }
override int getFormatParameterIndex() { callsVariadicFormatter(this, result) }
}

/**
Expand Down Expand Up @@ -674,8 +672,8 @@ class FormatLiteral extends Literal {
/**
* Gets the char type required by the nth conversion specifier.
* - in the base case this is the default for the formatting function
* (e.g. `char` for `printf`, `wchar_t` for `wprintf`).
* - the `%S` format character reverses wideness.
* (e.g. `char` for `printf`, `char` or `wchar_t` for `wprintf`).
* - the `%C` format character reverses wideness.
* - the size prefixes 'l'/'w' and 'h' override the type character
* to wide or single-byte characters respectively.
*/
Expand Down Expand Up @@ -721,8 +719,8 @@ class FormatLiteral extends Literal {
/**
* Gets the string type required by the nth conversion specifier.
* - in the base case this is the default for the formatting function
* (e.g. `char` for `printf`, `wchar_t` for `wprintf`).
* - the `%S` format character reverses wideness.
* (e.g. `char *` for `printf`, `char *` or `wchar_t *` for `wprintf`).
* - the `%S` format character reverses wideness on some platforms.
* - the size prefixes 'l'/'w' and 'h' override the type character
* to wide or single-byte characters respectively.
*/
Expand Down
55 changes: 39 additions & 16 deletions cpp/ql/src/semmle/code/cpp/models/interfaces/FormattingFunction.qll
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ private Type stripTopLevelSpecifiersOnly(Type t) {
*/
Type getAFormatterWideType() {
exists(FormattingFunction ff |
result = stripTopLevelSpecifiersOnly(ff.getDefaultCharType()) and
result = stripTopLevelSpecifiersOnly(ff.getFormatCharType()) and
result.getSize() != 1
)
}
Expand All @@ -46,6 +46,14 @@ abstract class FormattingFunction extends Function {
/** Gets the position at which the format parameter occurs. */
abstract int getFormatParameterIndex();

/**
* Holds if this `FormattingFunction` is in a context that supports
* Microsoft rules and extensions.
*/
predicate isMicrosoft() {
getFile().compiledAsMicrosoft()
}

/**
* Holds if the default meaning of `%s` is a `wchar_t *`, rather than
* a `char *` (either way, `%S` will have the opposite meaning).
Expand All @@ -55,31 +63,44 @@ abstract class FormattingFunction extends Function {
deprecated predicate isWideCharDefault() { none() }

/**
* Gets the default character type expected for `%s` by this function. Typically
* `char` or `wchar_t`.
* Gets the character type used in the format string for this function.
*/
Type getDefaultCharType() {
result =
Type getFormatCharType() {
result =
stripTopLevelSpecifiersOnly(
stripTopLevelSpecifiersOnly(
getParameter(getFormatParameterIndex()).getType().getUnderlyingType()
).(PointerType).getBaseType()
)
}

/**
* Gets the default character type expected for `%s` by this function. Typically
* `char` or `wchar_t`.
*/
Type getDefaultCharType() {
(
isMicrosoft() and
result = getFormatCharType()
) or (
not isMicrosoft() and
result instanceof PlainCharType
)
}

/**
* Gets the non-default character type expected for `%S` by this function. Typically
* `wchar_t` or `char`. On some snapshots there may be multiple results where we can't tell
* which is correct for a particular function.
*/
Type getNonDefaultCharType() {
(
getDefaultCharType().getSize() = 1 and
result = getAFormatterWideTypeOrDefault()
) or (
getDefaultCharType().getSize() > 1 and
result instanceof PlainCharType
)
(
getDefaultCharType().getSize() = 1 and
result = getWideCharType()
) or (
not getDefaultCharType().getSize() = 1 and
result instanceof PlainCharType
)
}

/**
Expand All @@ -89,10 +110,12 @@ abstract class FormattingFunction extends Function {
*/
Type getWideCharType() {
(
result = getDefaultCharType() or
result = getNonDefaultCharType()
) and
result.getSize() > 1
result = getFormatCharType() and
result.getSize() > 1
) or (
not getFormatCharType().getSize() > 1 and
result = getAFormatterWideTypeOrDefault() // may have more than one result
)
}

/**
Expand Down
2 changes: 1 addition & 1 deletion cpp/ql/src/semmle/code/cpp/security/PrintfLike.qll
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ import external.ExternalArtifact
predicate printfLikeFunction(Function func, int formatArg) {
(formatArg = func.(FormattingFunction).getFormatParameterIndex() and not func instanceof UserDefinedFormattingFunction)
or
primitiveVariadicFormatter(func, formatArg, _)
primitiveVariadicFormatter(func, formatArg)
or
exists(ExternalData data |
// TODO Do this \ to / conversion in the toolchain?
Expand Down
Original file line number Diff line number Diff line change
@@ -1,10 +1,14 @@
| tests.cpp:18:15:18:22 | Hello | This argument should be of type 'char *' but is of type 'char16_t *' |
| tests.cpp:19:15:19:22 | Hello | This argument should be of type 'char *' but is of type 'wchar_t *' |
| tests.cpp:25:17:25:23 | Hello | This argument should be of type 'wchar_t *' but is of type 'char *' |
| tests.cpp:26:17:26:24 | Hello | This argument should be of type 'wchar_t *' but is of type 'char16_t *' |
| tests.cpp:30:17:30:24 | Hello | This argument should be of type 'char *' but is of type 'char16_t *' |
| tests.cpp:31:17:31:24 | Hello | This argument should be of type 'char *' but is of type 'wchar_t *' |
| tests.cpp:33:36:33:42 | Hello | This argument should be of type 'char16_t *' but is of type 'char *' |
| tests.cpp:35:36:35:43 | Hello | This argument should be of type 'char16_t *' but is of type 'wchar_t *' |
| tests.cpp:38:36:38:43 | Hello | This argument should be of type 'char *' but is of type 'char16_t *' |
| tests.cpp:39:36:39:43 | Hello | This argument should be of type 'char *' but is of type 'wchar_t *' |
| tests.cpp:26:17:26:24 | Hello | This argument should be of type 'char *' but is of type 'char16_t *' |
| tests.cpp:27:17:27:24 | Hello | This argument should be of type 'char *' but is of type 'wchar_t *' |
| tests.cpp:29:17:29:23 | Hello | This argument should be of type 'wchar_t *' but is of type 'char *' |
| tests.cpp:30:17:30:24 | Hello | This argument should be of type 'wchar_t *' but is of type 'char16_t *' |
| tests.cpp:34:36:34:43 | Hello | This argument should be of type 'char *' but is of type 'char16_t *' |
| tests.cpp:35:36:35:43 | Hello | This argument should be of type 'char *' but is of type 'wchar_t *' |
| tests.cpp:37:36:37:42 | Hello | This argument should be of type 'char16_t *' but is of type 'char *' |
| tests.cpp:39:36:39:43 | Hello | This argument should be of type 'char16_t *' but is of type 'wchar_t *' |
| tests.cpp:42:37:42:44 | Hello | This argument should be of type 'char *' but is of type 'char16_t *' |
| tests.cpp:43:37:43:44 | Hello | This argument should be of type 'char *' but is of type 'wchar_t *' |
| tests.cpp:45:37:45:43 | Hello | This argument should be of type 'char16_t *' but is of type 'char *' |
| tests.cpp:47:37:47:44 | Hello | This argument should be of type 'char16_t *' but is of type 'wchar_t *' |
Original file line number Diff line number Diff line change
@@ -1,3 +1,3 @@
| tests.cpp:8:5:8:10 | printf | char | char16_t, wchar_t | char16_t, wchar_t |
| tests.cpp:9:5:9:11 | wprintf | wchar_t | char | wchar_t |
| tests.cpp:10:5:10:12 | swprintf | char16_t | char | char16_t |
| tests.cpp:8:5:8:10 | printf | char | char | char16_t, wchar_t | char16_t, wchar_t |
| tests.cpp:9:5:9:11 | wprintf | wchar_t | char | wchar_t | wchar_t |
| tests.cpp:10:5:10:12 | swprintf | char16_t | char | char16_t | char16_t |
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ import cpp
from FormattingFunction f
select
f,
concat(f.getFormatCharType().toString(), ", "),
concat(f.getDefaultCharType().toString(), ", "),
concat(f.getNonDefaultCharType().toString(), ", "),
concat(f.getWideCharType().toString(), ", ")
Original file line number Diff line number Diff line change
Expand Up @@ -22,19 +22,27 @@ void tests() {
printf("%S", u"Hello"); // GOOD
printf("%S", L"Hello"); // GOOD

wprintf(L"%s", "Hello"); // BAD: expecting wchar_t
wprintf(L"%s", u"Hello"); // BAD: expecting wchar_t
wprintf(L"%s", L"Hello"); // GOOD
wprintf(L"%s", "Hello"); // GOOD
wprintf(L"%s", u"Hello"); // BAD: expecting char
wprintf(L"%s", L"Hello"); // BAD: expecting char

wprintf(L"%S", "Hello"); // GOOD
wprintf(L"%S", u"Hello"); // BAD: expecting char
wprintf(L"%S", L"Hello"); // BAD: expecting char
wprintf(L"%S", "Hello"); // BAD: expecting wchar_t
wprintf(L"%S", u"Hello"); // BAD: expecting wchar_t
wprintf(L"%S", L"Hello"); // GOOD

swprintf(buffer, BUF_SIZE, u"%s", "Hello"); // BAD: expecting char16_t
swprintf(buffer, BUF_SIZE, u"%s", u"Hello"); // GOOD
swprintf(buffer, BUF_SIZE, u"%s", L"Hello"); // BAD: expecting char16_t
swprintf(buffer, BUF_SIZE, u"%s", "Hello"); // GOOD
swprintf(buffer, BUF_SIZE, u"%s", u"Hello"); // BAD: expecting char
swprintf(buffer, BUF_SIZE, u"%s", L"Hello"); // BAD: expecting char

swprintf(buffer, BUF_SIZE, u"%S", "Hello"); // GOOD
swprintf(buffer, BUF_SIZE, u"%S", u"Hello"); // BAD: expecting char
swprintf(buffer, BUF_SIZE, u"%S", L"Hello"); // BAD: expecting char
swprintf(buffer, BUF_SIZE, u"%S", "Hello"); // BAD: expecting char16_t
swprintf(buffer, BUF_SIZE, u"%S", u"Hello"); // GOOD
swprintf(buffer, BUF_SIZE, u"%S", L"Hello"); // BAD: expecting char16_t

swprintf(buffer, BUF_SIZE, u"%hs", "Hello"); // GOOD
swprintf(buffer, BUF_SIZE, u"%hs", u"Hello"); // BAD: expecting char
swprintf(buffer, BUF_SIZE, u"%hs", L"Hello"); // BAD: expecting char

swprintf(buffer, BUF_SIZE, u"%ls", "Hello"); // BAD: expecting char16_t
swprintf(buffer, BUF_SIZE, u"%ls", u"Hello"); // GOOD
swprintf(buffer, BUF_SIZE, u"%ls", L"Hello"); // BAD: expecting char16_t
}
Original file line number Diff line number Diff line change
@@ -1,2 +1,3 @@
| printf.cpp:33:31:33:37 | test | This argument should be of type 'char *' but is of type 'char16_t *' |
| printf.cpp:45:29:45:35 | test | This argument should be of type 'char *' but is of type 'char16_t *' |
| printf.cpp:52:29:52:35 | test | This argument should be of type 'char16_t *' but is of type 'wchar_t *' |
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
| printf.cpp:15:5:15:12 | swprintf | char16_t | char | char16_t |
| printf.cpp:15:5:15:12 | swprintf | char | char16_t | char16_t |
| printf.cpp:26:5:26:11 | sprintf | char | char16_t | char16_t |
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ int sprintf(char *dest, char *format, ...);
void test1() {
WCHAR string[20];

swprintf(string, u"test %s", u"test"); // GOOD
swprintf(string, u"test %s", u"test"); // BAD: `char16_t` string parameter read as `char` string
}

void test2() {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@
| printf1.h:45:18:45:20 | ull | This argument should be of type 'unsigned int' but is of type 'unsigned long long' |
| printf1.h:46:18:46:20 | ull | This argument should be of type 'unsigned int' but is of type 'unsigned long long' |
| printf1.h:47:19:47:21 | ull | This argument should be of type 'unsigned int' but is of type 'unsigned long long' |
| printf1.h:126:18:126:19 | wc | This argument should be of type 'char *' but is of type 'wchar_t *' |
| printf1.h:127:18:127:18 | c | This argument should be of type 'wchar_t *' but is of type 'char *' |
| real_world.h:61:21:61:22 | & ... | This argument should be of type 'int *' but is of type 'short *' |
| real_world.h:62:22:62:23 | & ... | This argument should be of type 'short *' but is of type 'int *' |
| real_world.h:63:22:63:24 | & ... | This argument should be of type 'short *' but is of type 'unsigned int *' |
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
| common.h:12:12:12:17 | printf | char | wchar_t | wchar_t |
| common.h:15:12:15:18 | wprintf | wchar_t | char | wchar_t |
| common.h:15:12:15:18 | wprintf | char | wchar_t | wchar_t |
| format.h:4:13:4:17 | error | char | wchar_t | wchar_t |
| real_world.h:8:12:8:18 | fprintf | char | wchar_t | wchar_t |
| real_world.h:33:6:33:12 | msg_out | char | wchar_t | wchar_t |
Original file line number Diff line number Diff line change
Expand Up @@ -119,3 +119,11 @@ void test_chars(char c, wchar_t wc, wint_t wt)
wprintf(L"%C", wc); // GOOD (converts to wint_t)
wprintf(L"%C", wt); // GOOD
}

void test_ws(char *c, wchar_t *wc)
{
wprintf(L"%s", c); // GOOD
wprintf(L"%s", wc); // BAD
wprintf(L"%S", c); // BAD
wprintf(L"%S", wc); // GOOD
}
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,11 @@
| printf1.h:74:19:74:22 | C_ST | This argument should be of type 'ssize_t' but is of type 'unsigned long long' |
| printf1.h:75:19:75:28 | sizeof(<expr>) | This argument should be of type 'ssize_t' but is of type 'unsigned long long' |
| printf1.h:84:23:84:35 | ... - ... | This argument should be of type 'ssize_t' but is of type 'long long' |
| printf1.h:125:18:125:18 | c | This argument should be of type '__wchar_t *' but is of type 'char *' |
| printf1.h:128:18:128:19 | wc | This argument should be of type 'char *' but is of type '__wchar_t *' |
| real_world.h:61:21:61:22 | & ... | This argument should be of type 'int *' but is of type 'short *' |
| real_world.h:62:22:62:23 | & ... | This argument should be of type 'short *' but is of type 'int *' |
| real_world.h:63:22:63:24 | & ... | This argument should be of type 'short *' but is of type 'unsigned int *' |
| real_world.h:64:22:64:24 | & ... | This argument should be of type 'short *' but is of type 'signed int *' |
| wide_string.h:25:18:25:20 | c | This argument should be of type 'char' but is of type 'char *' |
| wide_string.h:29:19:29:22 | c | This argument should be of type 'wchar_t' but is of type '__wchar_t *' |
Original file line number Diff line number Diff line change
Expand Up @@ -119,3 +119,11 @@ void test_chars(char c, wchar_t wc, wint_t wt)
wprintf(L"%C", wc); // BAD [NOT DETECTED]
wprintf(L"%C", wt); // BAD [NOT DETECTED]
}

void test_ws(char *c, wchar_t *wc, wint_t *wt)
{
wprintf(L"%s", c); // BAD
wprintf(L"%s", wc); // GOOD
wprintf(L"%S", c); // GOOD
wprintf(L"%S", wc); // BAD
}
Original file line number Diff line number Diff line change
Expand Up @@ -26,5 +26,5 @@ void test_wchar4(char c, const char cc, wchar_t wc, const wchar_t wcc) {
printf("%wc", wc); // GOOD
printf("%wc", wcc); // GOOD
printf("%wc", L'c'); // GOOD
printf("%wc", L"c"); // BAD [NOT DETECTED]
printf("%wc", L"c"); // BAD
}