C++: Use a TaintTracking::Configuration in three more queries #8382
Looks good, I've made a few comments and questions. I will also do some kind of run on LGTM of each query before approving this to merge.
LGTM runs...
Thanks a lot! I was just in the process of creating diff queries myself.
They're not diff queries - I decided that viewing the difference in presentation (
Yeah, that's fair. I stripped out the path-problem stuff in
On
Thanks for looking at the result changes for
It may also be due to the removal of the barriers on "unpredictable" instructions. If we like the results I don't think we should include that barrier to remove these results.
Indeed, this should have a change note. If nothing else, the change note should mention that these queries are now path-problem queries. I'll write the change note first thing tomorrow if we like the result changes on the other two queries.
On
You might be right, specifically the case:
which might mean we now get taint flowing through things like pointer differences (
On
Indeed, I think we should give it a
There are no barriers based on small types in I might start by porting over the unpredictable instruction barrier to just this one query. I don't want to do too much work on improving the FP rate of a
…ex-validation' (and not the array expression).
Fixed in 693eca2.
@geoffw0 all the new results on We now get a much more manageable difference. Here's a diff query and here are all the results. I think they look good!
Yep, looks much better. 👍
There are still false positives for that query to do with characters being checked against (presumably) 256-long arrays, but I don't think we need to fix that as part of this work.
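For readers unfamiliar with the pattern, here is a hypothetical sketch of the kind of code this probably refers to. It is not taken from any of the analyzed projects, and the names `is_separator`, `init_separators` and `count_separators` are made up for illustration. The index comes from untrusted input, but after conversion to `unsigned char` it can only take 256 values, so a 256-entry table is always indexed in bounds even though there is no explicit bounds check.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Assumption for this sketch: 8-bit chars, so unsigned char is in [0, 255].
static_assert(UCHAR_MAX == 255, "sketch assumes 8-bit chars");

// Hypothetical lookup table with one entry per possible unsigned char value.
static bool is_separator[256] = {};

// Marks the characters that count as separators in this sketch.
void init_separators() {
  is_separator[static_cast<unsigned char>(' ')] = true;
  is_separator[static_cast<unsigned char>(',')] = true;
}

// `input` stands in for untrusted data. Each character is used as an
// array index without an explicit range check, yet the access is safe
// because the table covers every value an unsigned char can hold.
std::size_t count_separators(const std::string& input) {
  std::size_t n = 0;
  for (char c : input) {
    if (is_separator[static_cast<unsigned char>(c)]) {
      ++n;
    }
  }
  return n;
}
```

A barrier based on the (small) type of the index would be one way to suppress results like this, which is why it matters whether the ported query keeps such a barrier.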
Still 👍
I've created the internal PR with the required changes.
…pp/unclear-array-index-validation' to prevent an explosion of new results.
Changes LGTM.
nodeIsBarrierEqualityCandidate(node, access, checkedVar)
)
or
// Don't use dataflow into binary instructions if both operands are unpredictable
I do wonder what case this is supposed to cover besides subtraction ... but the existence of the case below as well suggests there's more to it.
Yeah, these two barriers block flow in ~100 examples on vim/vim. For this first case, it's most often subtraction/addition. The next case also handles stuff like getting the index of some character in a non-constant string (which they know is bounded due to invariants on the string). I imagine we could do something much more clever here if we want to, but that should probably be once we decide to invest effort in rewriting this query to something much more modern.
Initially, I was also going to modernize the sanitizers used in the three queries, but I realized that this code should be changed anyway once we merge the ongoing work on range analysis. So for now I've copied the ad-hoc sanitizers into these three queries so as not to change their behavior.
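To make the two barrier cases discussed above more concrete, here is a small hypothetical C++ sketch of the code shapes they seem to target. This is one reading of the thread, not code from vim/vim, and the function names are invented. In both functions the result is computed from non-constant data, but it is bounded by the length of the string it was derived from.

```cpp
#include <cstddef>
#include <cstring>

// Case 1 (subtraction of two non-constant operands): a pointer difference.
// Both `colon` and `line` depend on the input, but the difference can
// never exceed strlen(line).
std::size_t offset_of_first_colon(const char* line) {
  const char* colon = std::strchr(line, ':');
  if (colon == nullptr) {
    return std::strlen(line);
  }
  return static_cast<std::size_t>(colon - line);
}

// Case 2 (index of a character in a non-constant string): the returned
// index is bounded by an invariant on `flags` (its length) rather than
// by an explicit comparison against a constant.
int flag_index(const char* flags, char wanted) {
  const char* hit = std::strchr(flags, wanted);
  return hit != nullptr ? static_cast<int>(hit - flags) : -1;
}
```

If taint is allowed to flow through operations like these, each later use of the result as an array index can show up as a new result, which fits the jump in result counts seen before the barriers were copied over.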