Skip to content

Conversation

geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Oct 5, 2018

Fix handling of wide string types in 'WrongTypeFormatArguments.ql'. Previously they were assumed to always be wchar_t *, which lead to high numbers of false positive results on projects where, say, char16_t * strings are used. Also fixed:

  • an issue where the query would get confused by a snapshot which had multiple types that naturally correspond with a single format character, for example (commonly) multiple pointer sizes and %p.
  • an issue where custom formatting functions were assumed to be built-in formatting functions due to exactly matching in name.

The first three commits are from an old pull request by @rdmarsh2, that was never merged due to concerns about mixing string types (which have been fixed here).

I have some smaller fixes to follow, but this PR should eliminate most of the false positives we've been looking at.

@jbj

@geoffw0 geoffw0 added the C++ label Oct 5, 2018
rdmarsh2 and others added 7 commits October 5, 2018 15:32
Certain Microsoft projects, such as CoreCLR and ChakraCore, use a
library called the PAL, which enables two-byte strings in the printf
family of functions, even when built on a platform with four-byte
strings. This adds support for determining the size of a wide character
from the definitions of such functions, rather than assuming that they
match the compiler's wchar_t.
@jbj
Copy link
Contributor

jbj commented Oct 5, 2018

@geoffw0 there are conflicts even though this is a freshly opened PR. Based on the affected file names, I'm guessing that you edited them before all files in the repo got normalised to Unix line endings. Try to rebase on the latest master with git rebase -Xignore-space-change ... or, if that doesn't work, git rebase -Xignore-all-space .... Then check the files to make sure the line endings are all LF.

@geoffw0
Copy link
Contributor Author

geoffw0 commented Oct 5, 2018

I've fixed the conflicts, it was on an old base revision with two genuine conflicts (one in a test, one in a change note). Didn't notice any issue with line endings but let me know if there is one!

Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge this now so we can deploy to lgtm.com today. I tested it on ChakraCore and comdb2, and the results LGTM.

@@ -25,7 +25,8 @@ private predicate formattingFunctionCallExpectedType(FormattingFunctionCall ffc,
ffc.getTarget() = f and
f.getFormatParameterIndex() = i and
ffc.getArgument(i) = fl and
fl.getConversionType(pos) = expected
fl.getConversionType(pos) = expected and
count(fl.getConversionType(pos)) = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this is only needed because the where clause in this query has its logic the wrong way around for when there are multiple expected types. I'd like to try fixing this, but I won't let it block this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will look at this in a follow-up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I'm thinking that

not trivialConversion(expected.getUnspecifiedType(), actual.getUnspecifiedType())

should be replaced by

not exists(Type anyExpected |
  formatArgType(ffc, n, anyExpected, arg, actual) and
  trivialConversion(anyExpected.getUnspecifiedType(), actual.getUnspecifiedType())
)

If it's a UI issue to have multiple alert messages for the same line, we could use strictconcat to concatenate the names of the expected types instead of having a Type expected in the from clause.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -7,6 +7,38 @@
*/

import semmle.code.cpp.Function

private Type stripTopLevelSpecifiersOnly(Type t) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was it necessary to introduce this? The commit that introduced it didn't come with any test output changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does have test changes, but in a test that's still internal (see internal PR https://git.semmle.com/Semmle/code/pull/28289).

Having said this, looking at it again I think I can do better. I'll do this as follow-up work.

@jbj jbj merged commit 0644e0f into github:master Oct 8, 2018
aibaars added a commit that referenced this pull request Oct 14, 2021
smowton pushed a commit to smowton/codeql that referenced this pull request Apr 16, 2022
Fix type access expression extraction for function/property references
@geoffw0 geoffw0 deleted the wrongtype16 branch February 10, 2023 21:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants