Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Use TaintTracking::Configuration in TaintedAllocationSize #3519

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

rdmarsh2
Copy link
Contributor

This migrates TaintedAllocationSize to use the standard TaintTracking::Configuration mechanism rather than DefaultTaintTracking. It includes a new SanitizerGuard instance for bounds checks, but it has some false positives where a variable is bounds checked and reassigned if out of bounds.

rdmarsh2 added 4 commits May 19, 2020 11:52
This brings the temporary conflation of pointers and objects at function
arguments and parameters into TaintTrackingUtil, since we don't have
precise flow for indirections yet.
This migrates TaintedAllocationSize to use the standard
TaintTracking::Configuration mechanism rather than DefaultTaintTracking.
This includes a new SanitizerGuard instance, but it has some false
positives where a variable is bounds checked and reassigned if out of
bounds.
This adds a copy of the test from before github#3511. The prior commit fixed
the false positive in the modified version but not the original.
@rdmarsh2 rdmarsh2 added the C++ label May 19, 2020
@rdmarsh2 rdmarsh2 requested review from jbj and geoffw0 May 19, 2020 19:50
@rdmarsh2 rdmarsh2 requested a review from a team as a code owner May 19, 2020 19:50
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Individual changes LGTM, but the three new false positives in the TaintedAllocationSize test are a problem (a setback for https://github.com/github/codeql-c-analysis-team/issues/80). Do you know what's causing them, and how easily they might be addressed?

@rdmarsh2
Copy link
Contributor Author

Yes - they all have a fairly similar pattern:

// x is already tainted
if(x < 100) {
  x = 100;
}
sink(x); // tainted

The issue is that the SanitizerGuard class currently doesn't handle phi operands along a critical edge. I don't think there's a good fix for it without changing the Node type in IR dataflow...

@geoffw0
Copy link
Contributor

geoffw0 commented May 21, 2020

The security / defaulttainttracking library casts a wide 'benefit of doubt' around comparisons, by considering any read of a variable x for which there exists an expression (anywhere) resembling x < y to block taint. This is not very principled, but it does weed out a lot of false positives and still allows most good results through. I'm happy to discuss improving this exception, but I don't think UpperBoundGuard in the PR is sufficient. I've made a PR onto this with my attempt of repairing that mechanism: rdmarsh2#2

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes not already discussed LGTM, and I may have come across negatively in previous comments when I didn't intend to. This is a big and much needed improvement to TaintedAllocationSize. I like the splitting of taint sources into local and remote, I think that design will be helpful in future.

@@ -76,7 +76,7 @@ void processData2(char *start, char *end)
{
char *copy;

copy = new char[end - start]; // GOOD
copy = new char[end - start]; // GOOD [FALSE POSITIVE]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this new result is acceptable. I'm not entirely sure why the old version of the query didn't flag it, and I suspect it was a fluke.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was prevented by the predictability barrier in DefaultTaintTracking, which I did not carry over to this version. The taint source is actually the buffer filled by fread, not its return value. I think when #3123 is merged and we can remove pointer/object conflation on arguments, this should be fixed.

@jbj
Copy link
Contributor

jbj commented May 25, 2020

We've had a lot of FP reports on this particular query, so we can't ignore problems with comparison guards.

The issue is that the SanitizerGuard class currently doesn't handle phi operands along a critical edge. I don't think there's a good fix for it without changing the Node type in IR dataflow...

A major motivation for porting this query is to ensure that our libraries offer what they need for typical queries. Let's discuss in the next team meeting how we can make that happen.

@geoffw0
Copy link
Contributor

geoffw0 commented May 28, 2020

My concerns have been addressed. 👍

@rdmarsh2 rdmarsh2 added the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label May 29, 2020
@rdmarsh2
Copy link
Contributor Author

@rdmarsh2
Copy link
Contributor Author

@rdmarsh2
Copy link
Contributor Author

rdmarsh2 commented Jun 1, 2020

Those changes from the CPP-Differences job don't look good. The new results are all false positives from added flow through pointer differences, and there's about a 1.5% slowdown to the overall suite because of this query, plus some other slowdowns that might be related

@adityasharad adityasharad changed the base branch from master to main August 14, 2020 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ depends on internal PR This PR should only be merged in sync with an internal Semmle PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants