[Java] Taint Tracking for JNI Field Access and Object Aliasing in Java #20536

@starsalt0124

I'm developing a custom taint tracking rule to analyze native methods in Java, but I'm encountering issues when tainted data involves field access. CodeQL seems to treat fields and their containing objects as separate entities, which breaks taint propagation in certain scenarios.

Example Case

Consider this Java code, where d1.data is tainted and d2 (an alias of d1) is passed to the sink:

public class Main {
    static {
        System.loadLibrary("java2nativefieldalias");
    }

    public native void nativeTransfer(Data d1, Data d2);

    public static void main(String[] args) {
        Data d1 = new Data();
        Main m = new Main();
        // Source
        d1.data = SourceSink.source();
        Data d2 = d1;
        // aliased arguments
        m.nativeTransfer(d1, d2);  // Should detect taint flow but doesn't
        Data d3 = new Data();
        d3.data = "clean data";
        // non-aliased arguments
        m.nativeTransfer(d1, d3);
    }
}

With the corresponding native implementation:

#include <stdio.h>
#include "Main.h"

void Leak(const char* str) {
    printf("sensitive data: %s\n", str);
}

JNIEXPORT void JNICALL Java_Main_nativeTransfer
  (JNIEnv *env, jobject obj, jobject data1, jobject data2) {
    jclass dataClass = (*env)->GetObjectClass(env, data2);
    jfieldID dataField = (*env)->GetFieldID(env, dataClass, "data", "Ljava/lang/String;");
    jstring dataString = (jstring)(*env)->GetObjectField(env, data2, dataField);
    const char* data = (*env)->GetStringUTFChars(env, dataString, NULL);
    if (data == NULL) return;
    Leak(data); // leak
    (*env)->ReleaseStringUTFChars(env, dataString, data);
}
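
Both snippets reference Data and SourceSink without showing them. For reference, here is a minimal sketch of what they might look like. The single String field matches the "Ljava/lang/String;" signature passed to GetFieldID; the constant returned by source() is a hypothetical placeholder, not something from the actual test case:

```java
// Hypothetical helper classes assumed by the example; not part of the original post.
class Data {
    // Field read by the native code via GetFieldID(..., "data", "Ljava/lang/String;").
    String data;
}

class SourceSink {
    // Stand-in taint source; the returned value is an arbitrary placeholder.
    static String source() {
        return "sensitive";
    }
}
```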

Current Query and Workaround

My current taint tracking configuration includes additional flow steps to handle field propagation and aliasing:

predicate isSource(DataFlow::Node source) {
  // Source from SourceSink.source() method
  exists(MethodAccess ma |
    ma.getMethod().getName() = "source" and
    ma.getMethod().getDeclaringType().hasName("SourceSink") and
    source.asExpr() = ma
  )
}

predicate isSink(DataFlow::Node sink) {
  // Sink: second argument to nativeTransfer method
  exists(MethodAccess ma |
    ma.getMethod().getName() = "nativeTransfer" and
    ma.getMethod().isNative() and
    sink.asExpr() = ma.getArgument(1)  // d2 parameter
  )
}

predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
  // Field-to-object propagation
  exists(FieldAccess fa |
    fa.getField().getName() = "data" and
    node1.asExpr() = fa and                // Field is tainted
    node2.asExpr() = fa.getQualifier()     // Object should also be tainted
  )
  or
  // Alias propagation handling
  exists(AssignExpr assign |
    assign.getRhs() = node1.asExpr() and
    assign.getDest() = node2.asExpr()
  )
}

Questions

  1. JNI Taint Analysis Best Practices: Are there recommended approaches or built-in mechanisms in CodeQL for handling taint tracking through JNI methods, particularly when dealing with field access patterns like in the example above?

  2. Alias Propagation Behavior: The second predicate (alias propagation) appears redundant since CodeQL should inherently handle aliasing. However, without it, the taint flow from d1 to d2 via assignment (Data d2 = d1;) isn't detected. Is this expected behavior, or might there be an issue with my configuration?

Expected vs Actual Behavior

Expected: CodeQL should detect the taint flow from d1.data → d1 → d2 → nativeTransfer(d2) → field access in native code → Leak().

Actual: Without the custom aliasing predicate, the taint chain breaks at the object level, even though the field data itself is properly tracked.
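
To make the expected flow concrete, here is a pure-Java sketch of what the native method does at runtime. It is only an illustration of the aliasing, not a description of how CodeQL models the call: readSecondArgument is a hypothetical stand-in for nativeTransfer, the nested Data class is assumed to have a single String field, and the literal "tainted" stands in for SourceSink.source().

```java
public class AliasFlowSketch {
    // Hypothetical stand-in for the Data class used in the example.
    static class Data {
        String data;
    }

    // Java stand-in for the native method: like the C code, it ignores its
    // first argument and reads only the "data" field of the second.
    static String readSecondArgument(Data data1, Data data2) {
        return data2.data;
    }

    public static void main(String[] args) {
        Data d1 = new Data();
        d1.data = "tainted";        // stands in for SourceSink.source()
        Data d2 = d1;               // alias: d1 and d2 refer to the same object
        Data d3 = new Data();
        d3.data = "clean data";     // separate object, never tainted

        System.out.println(readSecondArgument(d1, d2)); // prints "tainted"
        System.out.println(readSecondArgument(d1, d3)); // prints "clean data"
    }
}
```

Because d2 refers to the same object as d1, the value read from the second argument in the aliased call is the source value, while the non-aliased call with d3 yields only clean data.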

The workaround functions but seems suboptimal. I'm seeking guidance on whether this is the correct approach or if there are better solutions available in CodeQL's taint tracking framework.
