-
Notifications
You must be signed in to change notification settings - Fork 155
Description
IFDSTaintAnalysis::getSummaryFlowFunction() incorrectly returns a non-null summary for defined functions that have sret (struct return) parameters, causing the IFDS solver to skip inter-procedural analysis entirely. This results in missed taint flows through functions that return large structs.
Environment
- PhASAR version: v2510 (latest development branch)
- LLVM version: 16
- OS: Ubuntu 20.04
Description
When analyzing code that calls a function with an sret parameter (used for returning large structs by value), the taint analysis fails to trace data flow through the called function. This is because getSummaryFlowFunction() adds the sret argument to the Kill set unconditionally, which causes a non-null summary to be returned. When the IFDS solver sees a non-null summary, it uses the summary instead of performing inter-procedural analysis.
Steps to Reproduce
- Create a test file
sret_test.c:
#include <stdio.h>
typedef struct { char data[64]; int x; } BigStruct;
BigStruct get_data(char* input) {
BigStruct s;
s.data[0] = input[0];
s.x = 42;
return s;
}
char* taint_source(void);
void sink_func(char c);
int main() {
char* tainted = taint_source();
BigStruct bs = get_data(tainted); // sret call
sink_func(bs.data[0]); // Should be flagged as leak!
return 0;
}- Compile to LLVM IR:
clang-16 -S -emit-llvm -O0 sret_test.c -o sret_test.ll- Create taint config
config.json:
{
"name": "sret-test",
"version": 1.0,
"functions": [
{
"name": "taint_source",
"ret": "source"
},
{
"name": "sink_func",
"params": {
"sink": [0]
}
}
]
}- Run analysis:
phasar-cli --data-flow-analysis ifds-taint --analysis-config config.json --entry-points main --module sret_test.llExpected Behavior
The analysis should detect a taint leak at sink_func(bs.data[0]) because:
taint_source()returns tainted data- The tainted pointer is passed to
get_data() - Inside
get_data(), the tainted data is copied to the struct - The struct is returned via
srettobs bs.data[0]flows tosink_func()→ leak
Actual Behavior
No leak is detected. The analysis does not enter get_data() for inter-procedural analysis.
Root Cause Analysis
The issue is in IFDSTaintAnalysis::getSummaryFlowFunction() at lines 466-473:
if (CS->hasStructRetAttr()) {
const auto *SRet = CS->getArgOperand(0);
if (!Gen.count(SRet)) {
// SRet is guaranteed to be written to by the call. If it does not
// generate it, we can freely kill it
Kill.insert(SRet);
}
}This code adds the sret argument to Kill for all functions with sret, regardless of whether the function is a declaration or a definition.
Later, at line 508-520:
if (Gen.empty()) {
if (!Leak.empty() || !Kill.empty()) {
return lambdaFlow([...](d_t Source) -> container_type {
// ... returns a summary flow function
});
}
}Since Kill is not empty (it contains the sret argument), a non-null summary is returned. The IFDS solver in IDESolver::processCall() then uses this summary instead of the regular call flow, completely bypassing inter-procedural analysis:
// In IDESolver.h, processCall():
if (SpecialSum) {
// Uses summary - no inter-procedural edge created
for (n_t ReturnSiteN : ReturnSiteNs) {
container_type Res = computeSummaryFlowFunction(SpecialSum, d1, d2);
// ...
}
} else {
// Normal call flow - creates inter-procedural edge
FlowFunctionPtrType Function = CachedFlowEdgeFunctions.getCallFlowFunction(n, SCalledProcN);
// ...
}Proposed Fix
Only add sret to the Kill set for declarations (functions without a body). For defined functions, allow inter-procedural analysis to trace the data flow through the function body.
// Only kill sret for declarations - for defined functions, we want
// inter-procedural analysis to trace taints through the function body
if (CS->hasStructRetAttr() && DestFun->isDeclaration()) {
const auto *SRet = CS->getArgOperand(0);
if (!Gen.count(SRet)) {
Kill.insert(SRet);
}
}Verification
With the proposed fix applied:
| Test Case | Original PhASAR | Fixed PhASAR |
|---|---|---|
| Simple flow (no sret) | ✅ Leak detected | ✅ Leak detected |
| Flow through sret function | ❌ No leak | ✅ Leak detected |
Additional Context
The generated LLVM IR for the test case shows the sret attribute:
define dso_local void @get_data(ptr noalias sret(%struct.BigStruct) align 4 %0, ptr noundef %1) {
; function body
}
; Call site in main:
call void @get_data(ptr sret(%struct.BigStruct) align 4 %3, ptr noundef %5)When get_data is a defined function (has a body), the analysis should enter it to trace how the tainted %5 (second argument) flows to the struct written through %0 (sret pointer).