circle-ir 3.37.0
Python multi-hop taint propagation (#20)
Closes the indirection-pattern false-negative tail uncovered after #18. The supplement that fixed one-hop direct flows still emitted taint.flows = [] for every aliased / container / round-trip shape — the dominant remaining driver of OWASP BenchmarkPython misses and the blocker for circle-ir-ai#75.
Shapes now detected
- Shape A (configparser round-trip):
  conf.set('s','k', tainted); bar = conf.get('s','k'); cur.execute(f'... {bar}')
- Shape B (list/dict round-trip):
  lst.append(tainted); bar = lst[0]; argList = ['sh','-c', f'echo {bar}']; subprocess.run(argList)
- Shape C (simple alias chain, not in the original bug report):
  bar = uid; sql = "..." + bar; cur.execute(sql). Even one rename of a tainted variable broke the flow.
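For readers who want to see the three shapes as runnable Python, here is a minimal sketch. The function names and the SQL/shell strings are illustrative assumptions, not fixtures from the test suite, and each function returns the value that would reach the sink instead of executing it.

```python
import configparser


def shape_a(tainted: str) -> str:
    """Shape A: value goes into a ConfigParser and is read back out."""
    conf = configparser.ConfigParser()
    conf.add_section("s")
    conf.set("s", "k", tainted)
    bar = conf.get("s", "k")  # same data, new variable name
    # Hypothetical query text; the release notes elide it as '...'.
    return f"SELECT * FROM users WHERE id = '{bar}'"


def shape_b(tainted: str) -> list:
    """Shape B: value is appended to a list and read back by subscript."""
    lst = []
    lst.append(tainted)
    bar = lst[0]
    arg_list = ["sh", "-c", f"echo {bar}"]
    return arg_list  # would be passed to subprocess.run(...)


def shape_c(uid: str) -> str:
    """Shape C: a single rename before string concatenation."""
    bar = uid
    sql = "SELECT * FROM users WHERE id = " + bar  # hypothetical query text
    return sql  # would be passed to cur.execute(...)
```

In all three cases the attacker-controlled input arrives at the sink unchanged at runtime, which is why a scanner that only follows the original variable name misses the flow.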
Fix
Two surgical changes (~50 LOC total):
- detectExpressionScanFlows now accepts code + language and, for Python, expands sourcesWithVar with synthetic source records for every derived/aliased variable produced by buildPythonTaintedVars. Synthetic records inherit the earliest real source's line/type/confidence so emitted flows still anchor at the original request.form.get(...) site, not at the alias.
- buildPythonTaintedVars gained one rule: (\w+)\.(append|extend|insert|add|push|put|appendleft)\(taintedExpr) taints the receiver. This composes with the existing dict-access propagation so list-append-then-subscript-read round-trips correctly.
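The two changes can be illustrated with a small Python sketch. This is not the project's implementation: the Source record, tainted_vars, and expand_sources are hypothetical names, and the propagation is a deliberately simple, flow-insensitive fixed point over regex matches. It only aims to show how a receiver rule and alias tracking could feed synthetic source records that keep the original source's line, type, and confidence.

```python
import re
from dataclasses import dataclass, replace
from typing import List, Set


@dataclass(frozen=True)
class Source:  # hypothetical record shape
    variable: str
    line: int
    type: str
    confidence: float


# Simple assignment at the start of a line: `name = expr` (but not `==`).
ASSIGN = re.compile(r"^\s*(\w+)\s*=(?!=)\s*(.+)$")
# Container-mutating call: `recv.append(expr)` and friends.
RECEIVER = re.compile(
    r"(\w+)\.(append|extend|insert|add|push|put|appendleft)\((.*)\)"
)


def mentions(expr: str, names: Set[str]) -> bool:
    """True if expr contains any of the names as a whole word."""
    return any(re.search(rf"\b{re.escape(n)}\b", expr) for n in names)


def tainted_vars(code: str, seeds: Set[str]) -> Set[str]:
    """Grow the seed set until no line adds a new tainted variable."""
    tainted = set(seeds)
    changed = True
    while changed:
        changed = False
        for line in code.splitlines():
            for pattern, target, expr in ((RECEIVER, 1, 3), (ASSIGN, 1, 2)):
                m = pattern.search(line)
                if m and m.group(target) not in tainted \
                        and mentions(m.group(expr), tainted):
                    tainted.add(m.group(target))
                    changed = True
    return tainted


def expand_sources(sources: List[Source], code: str) -> List[Source]:
    """Append one synthetic record per derived variable.

    Each synthetic record copies the earliest real source, so a flow
    reported through an alias still points at the original source line.
    """
    if not sources:
        return []
    earliest = min(sources, key=lambda s: s.line)
    known = {s.variable for s in sources}
    derived = tainted_vars(code, known) - known
    return sources + [replace(earliest, variable=v) for v in sorted(derived)]
```

Because the sketch matches whole words anywhere in an expression, it over-approximates (for example, a variable name inside a string literal also counts). That trade-off is typical of regex-based propagation and is one reason a real data-flow graph would be more precise.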
Why not a full Python DFG
A proper buildPythonDFG mirroring buildJavaDFG is ~990 LOC plus a separate AST pass for compound-expression arg decomposition. The supplement + rule are deterministic, regex-based, and unblock the entire BenchmarkPython false-negative tail today. Full DFG remains future work and will subsume this supplement.
Java non-regression
Alias expansion is gated on language === 'python'. Java sources rarely set .variable (matched on annotations/types), so sourcesWithVar is empty for Java and the supplement is a no-op. Verified by explicit Java sqli non-regression test + the full 156-case Juliet suite.
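The two layers of protection described above (an explicit language gate, plus Java sources usually carrying no variable) can be sketched as follows. The record shape and the function name alias_expansion_candidates are assumptions for illustration only, written in Python rather than the project's own code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Source:  # hypothetical record shape
    line: int
    type: str
    variable: Optional[str] = None  # often unset for annotation/type matches


def sources_with_var(sources: List[Source]) -> List[Source]:
    """Keep only sources that name a variable."""
    return [s for s in sources if s.variable]


def alias_expansion_candidates(sources: List[Source], language: str) -> List[Source]:
    """Return the sources eligible for alias expansion.

    Layer 1: anything other than Python is skipped outright.
    Layer 2: sources without a variable cannot seed alias tracking.
    """
    if language != "python":
        return []
    return sources_with_var(sources)
```

With this shape, a Java source matched on an annotation yields no candidates under either layer, so the expansion step has nothing to add.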
Tests
6 new end-to-end regression cases (shapes A, B, B-variant, C, #18 one-hop control, Java non-regression). 1931 passing tests (1925 baseline + 6 new).
Not addressed (future work)
- Cross-module / cross-file helper indirection (helpers.db_sqlite.results(cur, sql)), which requires inter-procedural taint summaries.
- Full Python DFG builder.
🤖 Generated with Claude Code