Skip to content

Conversation

jbj
Copy link
Contributor

@jbj jbj commented Sep 10, 2019

This commit adds a semmle.code.cpp.ir.dataflow.DefaultTaintTracking library that's API-compatible with the semmle.code.cpp.security.TaintTracking library. The new library is implemented on top of the IR data flow library.

The idea is to evolve this library until it can replace semmle.code.cpp.security.TaintTracking without decreasing our SAMATE score. Then we'll have the IR in production use, and we will have one less taint-tracking library in production.

This commit adds a `semmle.code.cpp.ir.dataflow.DefaultTaintTracking`
library that's API-compatible with the
`semmle.code.cpp.security.TaintTracking` library. The new library is
implemented on top of the IR data flow library.

The idea is to evolve this library until it can replace
`semmle.code.cpp.security.TaintTracking` without decreasing our SAMATE
score. Then we'll have the IR in production use, and we will have one
less taint-tracking library in production.
@jbj jbj added the C++ label Sep 10, 2019
@jbj jbj requested a review from a team as a code owner September 10, 2019 11:41
@jbj
Copy link
Contributor Author

jbj commented Sep 10, 2019

I've added this library without tests. As we talked about in yesterday's meeting, it's not obvious how to add tests without duplicating both queries and test results. At this point I don't think it's important to track test changes very closely since there are some obvious big-ticket items missing: flow through strcpy etc. and virtual dispatch. I've put todo comments directly in the QL code so it should be as easy at possible to find out both what to work on and where to add it.

Copy link
Contributor

@rdmarsh2 rdmarsh2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the RelationalOperation change, all of these could be TODOs.

* A variable that has any kind of upper-bound check anywhere in the program
*/
private predicate hasUpperBoundsCheck(Variable var) {
exists(BinaryOperation oper, VariableAccess access |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use RelationalOperation rather than checking getOperator().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also in the security.TaintTracking library from which I copied this predicate.

oper.getOperator() = ">" or
oper.getOperator() = ">="
) and
oper.getLeftOperand() = access and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is using left vs right operand as a proxy for whether the value is being checked or is the bound. Would being the lesser or greater operand be a better proxy? (testing could be a TODO)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you note below, checking the whole variable rather than one variable access is itself a proxy. I've added a todo discussing it.

override predicate isBarrier(DataFlow::Node node) {
exists(Variable checkedVar |
accessesVariable(node.asInstruction(), checkedVar) and
hasUpperBoundsCheck(checkedVar)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the long run, I'd like to use guards or range analysis here. Should that be a TODO in here or just a ticket on Jira?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added that as a todo on hasUpperBoundsCheck.

// of `returnArgument` to the `interfaces.Taint` library.
//
// TODO: Flow from input argument to output argument of known functions: Port
// missing parts of `copyValueBetweenArguments` to the `interfaces.Taint`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these should go in interfaces.DataFlow instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

exists(DefaultTaintTrackingCfg cfg, DataFlow::Node sink |
cfg.hasFlow(DataFlow::exprNode(source), sink)
|
sink.asExpr().getConversion*() = tainted
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the conversion instructions are UnaryInstructions and therefore covered by the clause in instructionTaintStep. We should be able to omit the getConversion*().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some edge cases where it makes a difference. I've added a todo.

@rdmarsh2 rdmarsh2 merged commit e71a39f into github:master Sep 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants