CPP: Fix WrongTypeFormatArguments.ql char16_t * issues (and others) #286

geoffw0 · 2018-10-05T13:23:34Z

Fix handling of wide string types in 'WrongTypeFormatArguments.ql'. Previously they were assumed to always be wchar_t *, which led to high numbers of false positive results on projects where, say, char16_t * strings are used. Also fixed:

- an issue where the query would get confused by a snapshot containing multiple types that naturally correspond to a single format character, for example (commonly) multiple pointer sizes and %p.
- an issue where custom formatting functions were assumed to be built-in formatting functions because their names matched exactly.
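For anyone unfamiliar with the underlying C++ issue, here is a minimal sketch of why treating every wide string as wchar_t * goes wrong. The helpers `charWidth` and `sameCharType` are hypothetical names made up for this illustration and are not part of the query or of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// char16_t and wchar_t are distinct types, whatever their sizes happen to be.
static_assert(!std::is_same<char16_t, wchar_t>::value,
              "char16_t and wchar_t are different types");

// Hypothetical helper: size in bytes of the character type behind a string pointer.
template <typename CharT>
constexpr std::size_t charWidth(const CharT *) {
  return sizeof(CharT);
}

// Hypothetical helper: does the argument's character type match the one a
// formatting function expects for its wide-string conversions?
template <typename Expected, typename Actual>
constexpr bool sameCharType(const Actual *) {
  return std::is_same<Expected, Actual>::value;
}

// Example checks. The width of wchar_t is deliberately not asserted because it
// varies by platform (commonly 2 bytes on Windows, 4 bytes on Linux).
inline void exampleChecks() {
  assert(sameCharType<char16_t>(u"text"));  // fits a char16_t-based formatter
  assert(!sameCharType<wchar_t>(u"text"));  // does not fit a wchar_t-based one
  assert(sameCharType<wchar_t>(L"text"));
}
```

A project whose formatting functions take char16_t * is therefore consistent with itself, even though every such argument looks wrong if the expected type is assumed to be wchar_t *.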

The first three commits are from an old pull request by @rdmarsh2, which was never merged due to concerns about mixing string types (these have been addressed here).

I have some smaller fixes to follow, but this PR should eliminate most of the false positives we've been looking at.

@jbj

Certain Microsoft projects, such as CoreCLR and ChakraCore, use a library called the PAL, which enables two-byte strings in the printf family of functions, even when built on a platform with four-byte strings. This adds support for determining the size of a wide character from the definitions of such functions, rather than assuming that they match the compiler's wchar_t.

jbj · 2018-10-05T14:55:48Z

@geoffw0 there are conflicts even though this is a freshly opened PR. Based on the affected file names, I'm guessing that you edited them before all files in the repo got normalised to Unix line endings. Try to rebase on the latest master with git rebase -Xignore-space-change ... or, if that doesn't work, git rebase -Xignore-all-space .... Then check the files to make sure the line endings are all LF.

geoffw0 · 2018-10-05T15:54:26Z

I've fixed the conflicts. The branch was on an old base revision and had two genuine conflicts (one in a test, one in a change note). I didn't notice any issue with line endings, but let me know if there is one!

jbj

I'll merge this now so we can deploy to lgtm.com today. I tested it on ChakraCore and comdb2, and the results LGTM.

jbj · 2018-10-08T07:48:13Z

cpp/ql/src/Likely Bugs/Format/WrongTypeFormatArguments.ql

@@ -25,7 +25,8 @@ private predicate formattingFunctionCallExpectedType(FormattingFunctionCall ffc,
      ffc.getTarget() = f and
      f.getFormatParameterIndex() = i and
      ffc.getArgument(i) = fl and
-      fl.getConversionType(pos) = expected
+      fl.getConversionType(pos) = expected and
+      count(fl.getConversionType(pos)) = 1


I suspect this is only needed because the where clause in this query has its logic the wrong way around for when there are multiple expected types. I'd like to try fixing this, but I won't let it block this PR.

I will look at this in a follow-up.

To clarify, I'm thinking that

not trivialConversion(expected.getUnspecifiedType(), actual.getUnspecifiedType())

should be replaced by

not exists(Type anyExpected |
  formatArgType(ffc, n, anyExpected, arg, actual) and
  trivialConversion(anyExpected.getUnspecifiedType(), actual.getUnspecifiedType())
)

If it's a UI issue to have multiple alert messages for the same line, we could use strictconcat to concatenate the names of the expected types instead of having a Type expected in the from clause.
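To make the difference between the two conditions concrete, here is a rough model in plain C++ (not QL). It assumes types can be compared by name, which is a simplification: `trivialConversion` below is only a stand-in for the real predicate, and `alertsIfAnyMismatch` and `alertsIfNoneMatch` are hypothetical names for the two shapes of the where clause.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the real predicate: here two types convert trivially only when
// their names are identical. This is an assumption made for illustration.
bool trivialConversion(const std::string &expected, const std::string &actual) {
  return expected == actual;
}

// Shape of the current condition: one alert per expected type that does not match.
bool alertsIfAnyMismatch(const std::vector<std::string> &expected,
                         const std::string &actual) {
  return std::any_of(expected.begin(), expected.end(),
                     [&](const std::string &e) { return !trivialConversion(e, actual); });
}

// Shape of the suggested condition: alert only when no expected type matches.
bool alertsIfNoneMatch(const std::vector<std::string> &expected,
                       const std::string &actual) {
  return std::none_of(expected.begin(), expected.end(),
                      [&](const std::string &e) { return trivialConversion(e, actual); });
}

// A snapshot with two pointer sizes offers two candidate types for %p.
inline void pointerSizeExample() {
  const std::vector<std::string> expected = {"void * (4 bytes)", "void * (8 bytes)"};
  assert(alertsIfAnyMismatch(expected, "void * (8 bytes)"));  // spurious alert
  assert(!alertsIfNoneMatch(expected, "void * (8 bytes)"));   // no alert
  assert(alertsIfNoneMatch(expected, "int"));                 // genuine mismatch
}
```

With the second shape, the extra candidate types no longer cause an alert on their own, which is why a restriction such as count(...) = 1 might become unnecessary.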

You were right - see #1255 and https://git.semmle.com/Semmle/code/pull/31630.

jbj · 2018-10-08T07:48:52Z

cpp/ql/src/semmle/code/cpp/models/interfaces/FormattingFunction.qll

@@ -7,6 +7,38 @@
 */

 import semmle.code.cpp.Function
+
+private Type stripTopLevelSpecifiersOnly(Type t) {


Why was it necessary to introduce this? The commit that introduced it didn't come with any test output changes.

It does have test changes, but in a test that's still internal (see internal PR https://git.semmle.com/Semmle/code/pull/28289).

Having said this, looking at it again I think I can do better. I'll do this as follow-up work.

geoffw0 added the C++ label Oct 5, 2018

rdmarsh2 and others added 7 commits October 5, 2018 15:32

C++: document new predicate

5b8925c

C++: consider attributes when finding wide string functions

fe8f7e9

CPP: Allow declarations of library printf functions in source (repairs most of the tests).

6e5207c

CPP: Test fixes as a result of changes.

e74721e

CPP: Annotate test.

39f030b

CPP: Add a test where different wide types are present.

2af56b8

geoffw0 added 10 commits October 5, 2018 16:40

CPP: More test cases.

8005558

CPP: Add a test where different word sizes are present.

1af6c10

CPP: New mechanism for string types in printf.qll.

e2be19b

CPP: Test getDefaultCharType etc.

89c5648

CPP: Replace stripTopLevelSpecifiers to emulate old behaviour.

580471a

CPP: Make getAFormatterWideType more general and move it into FormattingFunction.qll.

2841897

CPP: Lets just not report when we're not sure.

94ff2e5

CPP: Fix for consistency.

605db44

CPP: Simplify getAFormatterWideType.

67a7b75

CPP: Change note.

998b28b

geoffw0 force-pushed the wrongtype16 branch from 2706b76 to 998b28b Compare October 5, 2018 15:52

CPP: Additional test case fixed in combination with typedef work.

99816d7

jbj approved these changes Oct 8, 2018

jbj merged commit 0644e0f into github:master Oct 8, 2018

geoffw0 mentioned this pull request Apr 16, 2019

CPP: WrongTypeFormatArguments.ql Improvements #1255

Merged

aibaars added a commit that referenced this pull request Oct 14, 2021

Merge pull request #286 from github/aibaars/xxe

2f46277

XXE query

smowton pushed a commit to smowton/codeql that referenced this pull request Apr 16, 2022

Merge pull request github#286 from github/kotlin-reference-type-access

2034b1e

Fix type access expression extraction for function/property references

geoffw0 deleted the wrongtype16 branch February 10, 2023 21:11
