-
Notifications
You must be signed in to change notification settings - Fork 1.7k
[CPP-370] Non-constant format
arguments to printf
and friends
#1251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Before I go any further, I have a question regarding In I'm basically trying to obtain a "baseline" as to what is and is not correct behavior for this query. |
On the other hand, the translation of a valid format string will almost always be a valid format string; producing a result everywhere that a translated string is formatted will mean that the vast majority of results in a project using It looks like the |
We don't include any preprocessor logic as AST elements, but we do capture information about what preprocessor directives were encountered, what macro expansions occurred, and what AST elements were partially or completely generated by a particular macro expansion. Take a look at https://help.semmle.com/qldoc/cpp/semmle/code/cpp/Macro.qll/module.Macro.html for the classes and predicates involved in macro handling. |
So it looks like we're baking in a lot of assumptions regarding |
The main danger this rule is intended to prevent is that two strings with different expected format parameters could flow to the same formatting function call, and therefore one of them is provided the wrong set of parameters. If the |
These are deliberate exclusions to keep the 'false positive' (or perhaps I should say 'noisy result') rates down. See https://jira.semmle.com/browse/ODASA-5357 (I think there was a specific one for
A translated string is likely to be loaded from some kind of data store which the developers have similar control over to the source code (admittedly, the truth of this claim will vary) - hence it may be considered 'safe' to a similar degree as a constant string written in the source code.
I find it helpful to imagine that I 'own' the source code in question and come across the alert on it - am I likely to change the code as a result of seeing this, or just moan that those LGTM developers don't understand why my code is good? Furthermore on LGTM we bias towards high accuracy (low false positives) rather than widest coverage (low false negatives), as frankly most people are pushed for time and would rather look at a smaller set of higher quality alerts - though for critical security queries there's some flexibility here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your qldoc change LGTM.
Thanks guys for the |
Regarding this PR, are you planning to push any more changes or shall I merge it now? |
@geoffw0, there will be more changes once I figure out how to implement them. :-) |
So I've pushed my initial attempt at using the global DataFlow library. There are now 2 areas in which I need assistance: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll continue reviewing when I can read the .expected
file and when we've settled the big-picture discussions on what this query should do.
// For the following `...gettext` functions, we assume that | ||
// all translations preserve the type and order of `%` specifiers | ||
// (and hence are safe to use as format strings). This is | ||
// assumption is hard-coded into the query. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment would be more helpful in the query itself. In any case, please tweak the formatting so the line numbers in the test don't get shifted down. Otherwise it's too hard to review NonConstantFormat.expected
, and that's the most important file to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm about to push a "new" version of NonConstantFormat.expected
with the corrected line numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new version of NonConstantFormat.expected
that is in sync with NonConstantFormat.c
and test.cpp
has been pushed. Please let me know if you're still encountering problems.
I looked through |
I think the annotations look reasonable (though the false positives would be nice to fix, obviously). But the I need to figure out how to add additional edges to the dataflow graph via |
On second thought, I do not see why using |
You mean that if a non-constant format string is cast to I expect that if you get the interprocedural data flow library working, then the |
The latter. |
printf(_("Hello, World\n")); // OK | ||
{ | ||
char hello[] = "hello, World\n"; | ||
hello[0] = 'H'; | ||
printf(hello); // NOT OK | ||
printf(_(hello)); // NOT OK | ||
printf(_(hello)); // OK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the way you think here: everything that comes out of gettext
is safe (for the purpose of this query).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The query actually goes a bit further and assumes that _(...)
is always OK, regardless of where the _
macro points at. Do you think this is overly lenient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could look into adjusting it, but I'd prefer keeping this PR focused on the data flow change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
| | ||
// we disallow parameters, since they may be bound to unsafe arguments | ||
// at various call sites. | ||
not v instanceof Parameter and source.asExpr() instanceof StringLiteral |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Joining the two conjuncts on a single line makes it look like and
binds tighter than not
when in fact the precedence is the other way around.
// we disallow parameters, since they may be bound to unsafe arguments | ||
// at various call sites. | ||
not v instanceof Parameter and source.asExpr() instanceof StringLiteral | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This predicate doesn't relate source
and sink
, so it computes a Cartesian product. Let's discuss how to fix it in the next meeting.
I just checked in a rewrite of NonConstantFormat.ql using the taint tracking library. Feel free to review the query as you like, because it is almost certainly not its final version. I've run a 66-project query (https://lgtm.com/query/1255083408835090613/) and see a lot of hits in From a data flow perspective, I believe that the query is correct, but its results are confusing to say the least. What do you guys think should be done? We could just check for string literals at the calls to |
Just discovered that I broke |
This behaviour sounds fine to me, and most of the results look OK as well at a glance. Could you call out one in particular you're not happy with? |
I agree that the results from this query can be confusing, but is this PR making it much worse? It would be great to add path explanations to this query, but can that be deferred to a follow-up PR? |
It's my impression you're now happy with the test suite results relative to what we had before. Then the next step is to test the query on our set of LGTM test projects. If it's hard to get an overview of what's changed, you can try to combine the old and the new queries into two "diff queries", where one shows only the results that were gained relative to the old query, and the other shows only the results that were lost. |
Before: https://lgtm.com/query/3266353611601973701/ At first glance, it appears that the DataFlow-based query is strictly more sensitive than the old query, but I will investigate in more detail. |
The new query (https://lgtm.com/query/2338642181633737976/) generates lots of new alerts (esp. for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pleased to see that the changes lead to new good results for this query. However, I sampled the first ~10 differences between the old and the new queries, and I think there are too many new FPs and FNs. Please try to address some of them so we can confidently say that this query is better than the previous version.
class blab { | ||
void out1(void) { | ||
char *fmt = (char *)__builtin_alloca(10); | ||
diagnostic(fmt); // GOOD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely this is BAD
, not GOOD
? It's flagged by the query, so if you think it's GOOD
, then it's at least GOOD [FALSE POSITIVE]
. But I think it's BAD
since the only reason to allocate a writeable buffer is to write non-constant data to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant BAD [FALSE POSITIVE]
in the comment above. But you're right, this is just plain BAD
.
@@ -18,7 +18,7 @@ extern "C" int snprintf ( char * s, int n, const char * format, ... ); | |||
struct A { | |||
void do_print(const char *fmt0) { | |||
char buf[32]; | |||
snprintf(buf, 32, fmt0); // BAD [should detect at top-most call] | |||
snprintf(buf, 32, fmt0); // GOOD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean BAD
? This line is still flagged by the query. We've been a bit back and forth about formats from parameters being passed to non-vprintf
functions, and I think you've convinced me that it's BAD
.
Overall, it looks like this file has come out of sync with its entries in NonConstantFormat.expected
. Please go through this file (and the others) to ensure that the comment starts with // BAD
if and only if the line is listed in the expected
file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had the opposite recollection, i.e., that we should leave things as they are (wrt handling vprintf
-like and non-vprintf
-like calls). Also, is it true that // BAD
always imply an alert being signaled? In nested.cpp
above, I (originally) marked line 21 as // BAD
since the alert was in the wrong place (i.e., a false positive), but I also marked line 42 as // BAD
since that is where the alert should have gone (a false negative). Am I getting my nomenclature mixed up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will tentatively mark line 21 as // BAD [FALSE POSITIVE]
and line 42 as // GOOD [NOT DETECTED]
. This will adhere to your rule that // BAD
iff the line gets flagged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please go through this file (and the others) to ensure that the comment starts with
// BAD
if and only if the line is listed in theexpected
file.
What I wrote there did not make sense, sorry. Let me try again.
We write // BAD
to mean that the format call ought to be flagged by a perfect query. If it's not actually flagged in the expected
file, we add [NOT DETECTED]
. We write // GOOD
to mean that the format call should not be flagged by a perfect query. If it's actually flagged in the expected
file, we add [FALSE POSITIVE]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that makes more sense, thanks. I still think we should use // FLAGGED
and // NOT FLAGGED
, but that's a discussion for another day. :-)
For the following code fragment /***************************************************************************** * * Display an ARM64 condition code for the conditional instructions */ void emitter::emitDispCond(insCond cond) { const static char* armCond[16] = {"eq", "ne", "hs", "lo", "mi", "pl", "vs", "vc", "hi", "ls", "ge", "lt", "gt", "le", "AL", "NV"}; // The last two are invalid unsigned imm = (unsigned)cond; assert((0 <= imm) && (imm < ArrLen(armCond))); printf(armCond[imm]); } the new query generates an alert whereas the old query did not. The programmer clearly knows what they're doing, so the question really is whether it is acceptable to have a false positive here. Another problem I'm running into is that I absolutely cannot reproduce this result in a smaller file containing just the function above (plus all the dependencies). (The original function is found in the |
After my latest commit (which took care of handling "small" projects: ========================== lastpass/lastpass-cli: 1 true positive curl/curl: 1 false negative tesseract-ocr/tesseract: 1 false negative (which is identical to two other true positives!) Blizzard/s2client-api: 1 negative (true/false?) musescore/MuseScore: 3 true positives systemd/systemd: 1 true positive domoticz/domoticz: 1 true positive cxong/cdogs-sdl: 1 true positive ntpsec/ntpsec: 1 false negative pgsql-jp/jpug-doc: 1 false negative fawkesrobotics/fawkes: 3 negatives (true/false?) nishadg246/pybullet-play: 3 true positives numpy/numpy: 3 true positives protocolbuffers/protobuf: 1 false negative "big" projects: ========================== dotnet/coreclr: 231 -> 363 libretro/RetroArch: 59 -> 54 torvalds/linux: 16 -> 15 wireshark/wireshark: 41 -> 34 coreutils/coreutils: 46 -> 85 ManaPlus/ManaPlus: 12 -> 5 bloomberg/comdb2: 48 -> 39 microsoft/ChakraCore: 16 -> 18 vim/vim: 43 -> 9 FFmpeg/FFmpeg: 4 -> 7 I have not yet analyzed the "big" projects, but if the smaller projects are any indication, I think I would find a smattering of true positives and false negatives. Can anyone think of how to attack the false negative problem? |
I'd like to help with the investigation of those false negatives, but it's unlikely that I'll have time this week. |
I have some good news and some bad news. The good news is that most of the false negatives listed above are true negatives. (The one exception involves aliasing writes that the The bad news is that we're taking a performance hit. I don't have exact numbers, just anecdotal experience with running the new vs. old queries inside of Eclipse. |
Some As of yet, I did not analyze the vast majority of alerts that are common to both queries; there are probably some FPs there. |
For reference: Old query https://lgtm.com/query/3153700718820681798/, new query https://lgtm.com/query/1613184920912057134/ |
For So a pattern is emerging in which we are finding lots of new good results while losing some existing results. |
|
I think the query needs a bit more tweaking. We treat assignment as a non-const operation (which it is), but I think we should allow it if the RHS is a string literal. I keep seeing the idiom whereby a |
Modified new query run: https://lgtm.com/query/9048479769501953958/ |
To speed up the taint analysis in `NonConstantFormat.ql` and to remove FPs that were due to taint spreading from `i` to `a[i]`, this commit stops the taint tracking in `NonConstantFormat.ql` at every node that could not possibly contain a string. I tested performance on Wireshark, and it's fine. Pulling out the `isSanitizerNode` prevented `isSanitizer` from turning into four half-slow RA predicates due to both CPE and `#antijoin_rhs` transformations happening.
I've opened zlaski-semmle#1 as a PR against this PR. It should fix the array access problems. |
C++: NonConstantFormat taint only for string types
After merging in dotnet/coleclr: 316->300, of which 4 TP and 20 FN. |
Created a separate PR - #1533 - to deal with the 1.22 change note describing the improved query. |
No description provided.