Description
I'm trying to model indirect flow through external functions, similar to the example below. I want to follow the taint from taint_source to taint_sink through process_taint and process_taint2. Therefore, I need to model that the taint of the data member is propagated to the outputs.
struct S {
    int data;
    int dummy;
};
int taint_source();
S* process_taint(S* input);
void process_taint2(S* input, S* output);
void taint_sink(int tainted);
void df1() {
    S* s = new S();
    S* t = new S();  // output object for process_taint2 to fill in
    s->data = taint_source();
    process_taint2(s, t);
    taint_sink(t->data);
    taint_sink(t->dummy);
}
void df2() {
    S* u = new S();
    S* v;
    u->data = taint_source();
    v = process_taint(u);
    taint_sink(v->data);
    taint_sink(v->dummy);
}
int main(int argc, char* argv[]) {
    df1();
    df2();
    return 0;
}
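To make the intended semantics concrete, here is a minimal sketch of how I assume the external functions behave. The real bodies are not available to the analysis, so both function bodies below are hypothetical, as is the helper fields_copied_independently. The point is only that each field is expected to flow to the field of the same name in the output.

```cpp
#include <cassert>

struct S {
    int data;
    int dummy;
};

// Hypothetical body: the real function is external. Assumed to return a
// new object whose fields are copied one-to-one from the input.
S* process_taint(S* input) {
    S* result = new S();
    result->data = input->data;    // data flows only to data
    result->dummy = input->dummy;  // dummy flows only to dummy
    return result;
}

// Hypothetical body: assumed to copy each field of input into output.
void process_taint2(S* input, S* output) {
    output->data = input->data;
    output->dummy = input->dummy;
}

// Hypothetical helper: checks the assumed field-by-field behaviour.
bool fields_copied_independently() {
    S in{42, 7};
    S out{0, 0};
    process_taint2(&in, &out);
    S* ret = process_taint(&in);
    bool ok = out.data == 42 && out.dummy == 7 &&
              ret->data == 42 && ret->dummy == 7;
    delete ret;
    return ok;
}
```

Under this assumption, a tainted data member should reach taint_sink(t->data) and taint_sink(v->data), while the dummy reads should stay clean.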
I use this basic query for the example:
/**
 * @kind path-problem
 */
import cpp
import semmle.code.cpp.dataflow.new.TaintTracking
import MyFlow::PathGraph
module MyFlowConf implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) {
    source.asExpr() = any(Call c | c.getTarget().hasName("taint_source"))
  }
  predicate isSink(DataFlow::Node sink) {
    sink.asExpr() = any(Call c | c.getTarget().hasName("taint_sink")).getAnArgument()
  }
}
module MyFlow = TaintTracking::Global<MyFlowConf>;
from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink, source, sink, "Flow"
Using the following MaD (models-as-data) extension, I get the correct taint flow:
extensions:
  - addsTo:
      pack: codeql/cpp-all
      extensible: summaryModel
    data:
      - ["", "", False, "process_taint", "", "", "Argument[*0].Field[data]", "ReturnValue[*].Field[data]", "taint", "manual"]
      - ["", "", False, "process_taint2", "", "", "Argument[*0].Field[data]", "Argument[*1].Field[data]", "taint", "manual"]
However, as far as I can see, that would require copying the rule for every member of S using the same pattern. For JS there seems to be an AnyMember keyword, but it looks like this is not available in C++. Is there a wildcard to specify the same field/access path in the input and output?
Alternatively, I tried to model it as an additional flow step like this:
predicate isAdditionalFlowStep2(DataFlow::Node source, DataFlow::Node sink) {
  // taint_source() assigned to a field: taint the object behind the qualifier
  exists(Assignment a |
    a.getLValue().getAChild() = sink.asIndirectExpr()
    and a.getRValue() = source.asExpr()
    and source.asExpr().(Call).getTarget().hasName("taint_source")
  )
  or
  // process_taint: pointed-to argument flows to the pointed-to return value
  exists(Call c |
    c.getTarget().hasName("process_taint")
    and sink.asIndirectExpr() = c
    and source.asIndirectExpr() = c.getAnArgument()
  )
  or
  // process_taint2: pointed-to argument 0 flows to output argument 1
  exists(Call c |
    c.getTarget().hasName("process_taint2")
    and source.asIndirectExpr() = c.getArgument(0)
    and sink.asDefiningArgument() = c.getArgument(1)
  )
  or
  // tainted object flows to any field read on it
  source.asIndirectExpr() = sink.asExpr().(FieldAccess).getQualifier()
}
This finds the flows but also produces false positives where dummy is passed to taint_sink, even though it should not be tainted.
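As I understand it, the difference between the two approaches comes down to whether the field is remembered across the call. The sketch below is a toy model in C++, not CodeQL's implementation: TaintedFields, field_preserving_step, and whole_object_step are hypothetical names I made up to contrast the two behaviours.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy model, not CodeQL's implementation: the taint state of an object is
// the set of field names that currently hold tainted data.
using TaintedFields = std::set<std::string>;

// Field-preserving step: the output is tainted at exactly the same fields
// as the input (what the per-field MaD rows express for data).
TaintedFields field_preserving_step(const TaintedFields& input) {
    return input;
}

// Coarse step: if any field of the input is tainted, every field of the
// output is reported as tainted. This is roughly what happens when the
// whole object is tainted and every FieldAccess on it becomes a sink hit.
TaintedFields whole_object_step(const TaintedFields& input,
                                const TaintedFields& all_fields) {
    return input.empty() ? TaintedFields{} : all_fields;
}
```

With only data tainted on input, the field-preserving step leaves dummy clean, whereas the coarse step also marks dummy, which corresponds to the false positive I see.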
How can I model the propagation of indirect data flow with the correct access path for external functions?