Description
I'm developing a custom taint tracking rule to analyze native methods in Java, but I'm encountering issues when tainted data involves field access. CodeQL seems to treat fields and their containing objects as separate entities, which breaks taint propagation in certain scenarios.
Example Case
Consider this Java code where d1.data is tainted and d2 (an alias of d1) is passed as a sink:
public class Main {
    static {
        System.loadLibrary("java2nativefieldalias");
    }

    public native void nativeTransfer(Data d1, Data d2);

    public static void main(String[] args) {
        Data d1 = new Data();
        Main m = new Main();

        // Source
        d1.data = SourceSink.source();
        Data d2 = d1;

        // aliased arguments
        m.nativeTransfer(d1, d2); // Should detect taint flow but doesn't

        Data d3 = new Data();
        d3.data = "clean data";

        // non-aliased arguments
        m.nativeTransfer(d1, d3);
    }
}
With the corresponding native implementation:
#include <stdio.h>
#include "Main.h"

void Leak(const char* str) {
    printf("sensitive data: %s\n", str);
}

JNIEXPORT void JNICALL Java_Main_nativeTransfer
  (JNIEnv *env, jobject obj, jobject data1, jobject data2) {
    jclass dataClass = (*env)->GetObjectClass(env, data2);
    jfieldID dataField = (*env)->GetFieldID(env, dataClass, "data", "Ljava/lang/String;");
    jstring dataString = (jstring)(*env)->GetObjectField(env, data2, dataField);
    const char* data = (*env)->GetStringUTFChars(env, dataString, NULL);
    if (data == NULL) return;
    Leak(data); // leak
    (*env)->ReleaseStringUTFChars(env, dataString, data);
}
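For anyone who wants to reproduce the runtime behavior without building the native library, the native side can be approximated in plain Java. This is only a sketch under assumptions: Data is assumed to have a single String field named data (as the GetFieldID call implies), and the class name NativeTransferSketch, the leak method that returns its message, and the "secret" literal standing in for SourceSink.source() are all hypothetical.

```java
public class NativeTransferSketch {
    // Assumed shape of Data: one String field named "data", matching the
    // GetFieldID(env, dataClass, "data", "Ljava/lang/String;") lookup.
    public static class Data {
        public String data;
    }

    // Hypothetical stand-in for the C function Leak(); it returns the
    // message instead of only printing it, so the result is easy to inspect.
    public static String leak(String str) {
        return "sensitive data: " + str;
    }

    // Pure-Java approximation of Java_Main_nativeTransfer:
    // the first argument is ignored and only d2.data is read and leaked.
    public static String nativeTransfer(Data d1, Data d2) {
        return leak(d2.data);
    }

    public static void main(String[] args) {
        Data d1 = new Data();
        d1.data = "secret"; // placeholder for SourceSink.source()
        Data d2 = d1;       // d2 refers to the same object as d1

        Data d3 = new Data();
        d3.data = "clean data";

        System.out.println(nativeTransfer(d1, d2)); // aliased arguments
        System.out.println(nativeTransfer(d1, d3)); // non-aliased arguments
    }
}
```

Because d2 and d1 refer to the same object, reading d2.data in the aliased call yields the value written through d1.data, which is the flow I expect the analysis to report.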
Current Query and Workaround
My current taint tracking configuration includes additional flow steps to handle field propagation and aliasing:
predicate isSource(DataFlow::Node source) {
  // Source from SourceSink.source() method
  exists(MethodAccess ma |
    ma.getMethod().getName() = "source" and
    ma.getMethod().getDeclaringType().hasName("SourceSink") and
    source.asExpr() = ma
  )
}

predicate isSink(DataFlow::Node sink) {
  // Sink: second argument to nativeTransfer method
  exists(MethodAccess ma |
    ma.getMethod().getName() = "nativeTransfer" and
    ma.getMethod().isNative() and
    sink.asExpr() = ma.getArgument(1) // d2 parameter
  )
}

predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
  // Field-to-object propagation
  exists(FieldAccess fa |
    fa.getField().getName() = "data" and
    node1.asExpr() = fa and // Field is tainted
    node2.asExpr() = fa.getQualifier() // Object should also be tainted
  )
  or
  // Alias propagation handling
  exists(AssignExpr assign |
    assign.getRhs() = node1.asExpr() and
    assign.getDest() = node2.asExpr()
  )
}
Questions
- JNI Taint Analysis Best Practices: Are there recommended approaches or built-in mechanisms in CodeQL for handling taint tracking through JNI methods, particularly when dealing with field access patterns like in the example above?
- Alias Propagation Behavior: The second predicate (alias propagation) appears redundant since CodeQL should inherently handle aliasing. However, without it, the taint flow from d1 to d2 via assignment (Data d2 = d1;) isn't detected. Is this expected behavior, or might there be an issue with my configuration?
Expected vs Actual Behavior
Expected: CodeQL should detect the taint flow from d1.data → d1 → d2 → nativeTransfer(d2) → field access in native code → Leak().
Actual: Without the custom aliasing predicate, the taint chain breaks at the object level, even though the field data itself is properly tracked.
The workaround functions but seems suboptimal. I'm seeking guidance on whether this is the correct approach or if there are better solutions available in CodeQL's taint tracking framework.