Skip to content

IFDSTaintAnalysis skips inter-procedural analysis for functions with sret parameters #809

@leeewee

Description

@leeewee

IFDSTaintAnalysis::getSummaryFlowFunction() incorrectly returns a non-null summary for defined functions that have sret (struct return) parameters, causing the IFDS solver to skip inter-procedural analysis entirely. This results in missed taint flows through functions that return large structs.

Environment

  • PhASAR version: v2510 (latest development branch)
  • LLVM version: 16
  • OS: Ubuntu 20.04

Description

When analyzing code that calls a function with an sret parameter (used for returning large structs by value), the taint analysis fails to trace data flow through the called function. This is because getSummaryFlowFunction() adds the sret argument to the Kill set unconditionally, which causes a non-null summary to be returned. When the IFDS solver sees a non-null summary, it uses the summary instead of performing inter-procedural analysis.

Steps to Reproduce

  1. Create a test file sret_test.c:
#include <stdio.h>

typedef struct { char data[64]; int x; } BigStruct;

BigStruct get_data(char* input) {
    BigStruct s;
    s.data[0] = input[0];
    s.x = 42;
    return s;
}

char* taint_source(void);
void sink_func(char c);

int main() {
    char* tainted = taint_source();
    BigStruct bs = get_data(tainted);  // sret call
    sink_func(bs.data[0]);  // Should be flagged as leak!
    return 0;
}
  1. Compile to LLVM IR:
clang-16 -S -emit-llvm -O0 sret_test.c -o sret_test.ll
  1. Create taint config config.json:
{
  "name": "sret-test",
  "version": 1.0,
  "functions": [
    {
      "name": "taint_source",
      "ret": "source"
    },
    {
      "name": "sink_func",
      "params": {
        "sink": [0]
      }
    }
  ]
}
  1. Run analysis:
phasar-cli --data-flow-analysis ifds-taint --analysis-config config.json --entry-points main --module sret_test.ll

Expected Behavior

The analysis should detect a taint leak at sink_func(bs.data[0]) because:

  1. taint_source() returns tainted data
  2. The tainted pointer is passed to get_data()
  3. Inside get_data(), the tainted data is copied to the struct
  4. The struct is returned via sret to bs
  5. bs.data[0] flows to sink_func()leak

Actual Behavior

No leak is detected. The analysis does not enter get_data() for inter-procedural analysis.

Root Cause Analysis

The issue is in IFDSTaintAnalysis::getSummaryFlowFunction() at lines 466-473:

if (CS->hasStructRetAttr()) {
    const auto *SRet = CS->getArgOperand(0);
    if (!Gen.count(SRet)) {
      // SRet is guaranteed to be written to by the call. If it does not
      // generate it, we can freely kill it
      Kill.insert(SRet);
    }
}

This code adds the sret argument to Kill for all functions with sret, regardless of whether the function is a declaration or a definition.

Later, at line 508-520:

if (Gen.empty()) {
    if (!Leak.empty() || !Kill.empty()) {
      return lambdaFlow([...](d_t Source) -> container_type {
        // ... returns a summary flow function
      });
    }
}

Since Kill is not empty (it contains the sret argument), a non-null summary is returned. The IFDS solver in IDESolver::processCall() then uses this summary instead of the regular call flow, completely bypassing inter-procedural analysis:

// In IDESolver.h, processCall():
if (SpecialSum) {
    // Uses summary - no inter-procedural edge created
    for (n_t ReturnSiteN : ReturnSiteNs) {
        container_type Res = computeSummaryFlowFunction(SpecialSum, d1, d2);
        // ...
    }
} else {
    // Normal call flow - creates inter-procedural edge
    FlowFunctionPtrType Function = CachedFlowEdgeFunctions.getCallFlowFunction(n, SCalledProcN);
    // ...
}

Proposed Fix

Only add sret to the Kill set for declarations (functions without a body). For defined functions, allow inter-procedural analysis to trace the data flow through the function body.

// Only kill sret for declarations - for defined functions, we want
// inter-procedural analysis to trace taints through the function body
if (CS->hasStructRetAttr() && DestFun->isDeclaration()) {
    const auto *SRet = CS->getArgOperand(0);
    if (!Gen.count(SRet)) {
      Kill.insert(SRet);
    }
}

Verification

With the proposed fix applied:

Test Case Original PhASAR Fixed PhASAR
Simple flow (no sret) ✅ Leak detected ✅ Leak detected
Flow through sret function ❌ No leak Leak detected

Additional Context

The generated LLVM IR for the test case shows the sret attribute:

define dso_local void @get_data(ptr noalias sret(%struct.BigStruct) align 4 %0, ptr noundef %1) {
  ; function body
}

; Call site in main:
call void @get_data(ptr sret(%struct.BigStruct) align 4 %3, ptr noundef %5)

When get_data is a defined function (has a body), the analysis should enter it to trace how the tainted %5 (second argument) flows to the struct written through %0 (sret pointer).

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions