Add support for multi file taint tracking #1

Merged: mfow-nullify merged 7 commits into main from mfow/multi_file on Apr 10, 2026

@mfow-nullify mfow-nullify commented Mar 30, 2026

Previously, taint propagation stopped at file boundaries, so flows that passed through imported helpers or module-level bindings were missed. This PR adds project-scoped taint tracking for rules that opt into interfile: true. The strongest new coverage is for Python import and package flows; callback and higher-order function (HOF) handling is also stabilized elsewhere.

Rather than introducing a separate analysis pipeline, this reuses the existing taint summary machinery and extends it with a shared per-rule interfile context.
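
As an illustration of the flow shape described above, here is a minimal, runnable sketch of a two-file Python project in which a value crosses a file boundary both through an imported helper and through a module-level binding. The file names (taintdemo_helpers.py, taintdemo_app.py) and the read_input/sink functions are hypothetical stand-ins, not fixtures from this PR; the script only demonstrates the runtime flow that a file-local analysis would not see.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical file 1: holds the source and a module-level binding.
HELPERS = textwrap.dedent('''
    def read_input():
        # Stand-in for a taint source such as request data.
        return "user-controlled"

    CACHED = read_input()  # module-level value that carries taint
''')

# Hypothetical file 2: imports both names and reaches the sink.
APP = textwrap.dedent('''
    from taintdemo_helpers import CACHED, read_input

    SINK_CALLS = []

    def sink(value):
        # Stand-in for a taint sink such as a query executor.
        SINK_CALLS.append(value)

    def run():
        sink(read_input())  # flow through an imported helper
        sink(CACHED)        # flow through an imported module-level value
''')


def run_demo():
    """Write both files to a temp dir, run the app, return what hit the sink."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "taintdemo_helpers.py").write_text(HELPERS)
        Path(tmp, "taintdemo_app.py").write_text(APP)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            app = importlib.import_module("taintdemo_app")
            app.run()
            return list(app.SINK_CALLS)
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmp)
            sys.modules.pop("taintdemo_app", None)
            sys.modules.pop("taintdemo_helpers", None)
```

Neither file contains both a source and a sink, so an analysis that stops at file boundaries has nothing to report for either one.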

What changed

  • Build a shared interfile context once per taint rule during scan setup, then reuse it while checking each file.
  • Treat interfile: true as an extension of the existing taint_intrafile summary flow, so the same signature and call-graph machinery can work across files.
  • Build a project-wide call graph from the scan inputs, with import-aware resolution for modules, packages, aliases, wildcard imports, relative imports, top-level code, and imported object initialization.
  • Restrict interfile analysis to the source-to-sink relevant subgraph for the rule, then materialize a rule-specific signature DB and builtin model DB from that graph.
  • Track tainted imported globals and module-level values across direct imports, aliased imports, package re-exports, and wildcard imports.
  • Disable file-local regex prefiltering for interfile rules so files that contain only a source or only a sink are not incorrectly dropped.
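
The restriction to a source-to-sink relevant subgraph can be sketched as the intersection of two reachability passes: forward from the sources and backward from the sinks. This is only a sketch under stated assumptions. The graph shape (a dict of "value may flow from A to B" edges), the function names, and the node names are all hypothetical and are not taken from the implementation in this PR.

```python
from collections import deque


def reachable(graph, starts):
    """Breadth-first set of nodes reachable from any node in `starts`."""
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(graph):
    """Return the same graph with every edge flipped."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, set()).add(src)
    return rev


def relevant_subgraph(graph, sources, sinks):
    """Keep only nodes that lie on some path from a source to a sink."""
    forward = reachable(graph, sources)
    backward = reachable(reverse(graph), sinks)
    keep = forward & backward
    return {
        node: sorted(dst for dst in graph.get(node, ()) if dst in keep)
        for node in sorted(keep)
    }


# Hypothetical project graph: only the first chain connects source to sink.
GRAPH = {
    "helpers.read_input": ["app.run"],
    "app.run": ["db.execute", "log.debug"],
    "utils.format_date": ["app.render"],
}
SUBGRAPH = relevant_subgraph(GRAPH, {"helpers.read_input"}, {"db.execute"})
```

Here `log.debug` is reachable from the source but cannot reach the sink, and `utils.format_date` is never reached from the source, so both are dropped before any rule-specific summaries would be built.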

Stability and propagation fixes included here

  • Normalize function identity and call-site lookup so multi-file call-graph edges resolve more reliably.
  • Improve callback/HOF propagation by recording callback position, adding JS event-handler HOFs (on, addEventListener), and carrying Arg-shaped parameter assumptions into function and lambda summaries.
  • Handle implicit receivers and constructor receivers more consistently during signature instantiation and taint propagation.

Test coverage

  • Adds a large interfile regression suite focused on Python import topologies:
    • direct imports
    • aliased imports
    • local and relative imports
    • package imports and re-exports
    • wildcard imports
    • module-level values
    • imported class/classmethod/instance-method flows
    • multi-hop helper chains
    • multiple sink sites/files
    • scan-order independence
    • shadowing/rebinding cases to avoid overtainting
  • Updates cross-function callback/HOF tests and marks a few unsupported cases explicitly.
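
To make a few of the import topologies above concrete, the following toy resolver follows aliased imports and package re-exports back to the defining module, and treats a local rebinding as shadowing the import. It is a simplified sketch, not the resolver added in this PR: the Value and ImportRef types, the is_tainted function, and the module table are all hypothetical, and each module is assumed to map a name to its final binding only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A name bound directly to a value in its own module."""
    tainted: bool


@dataclass(frozen=True)
class ImportRef:
    """A name bound by importing `name` from `module`."""
    module: str
    name: str


def is_tainted(modules, module, name, _seen=None):
    """Resolve `module.name` through import chains and report its taint."""
    seen = _seen if _seen is not None else set()
    key = (module, name)
    if key in seen:
        return False  # import cycle: stop rather than loop forever
    seen.add(key)
    binding = modules.get(module, {}).get(name)
    if isinstance(binding, Value):
        return binding.tainted
    if isinstance(binding, ImportRef):
        return is_tainted(modules, binding.module, binding.name, seen)
    return False  # unknown name, or a module outside the scan set


MODULES = {
    # Defining module: TOKEN comes from a taint source.
    "pkg.core": {"TOKEN": Value(tainted=True)},
    # Package re-export: from .core import TOKEN
    "pkg": {"TOKEN": ImportRef("pkg.core", "TOKEN")},
    # Aliased import: from pkg import TOKEN as secret
    "app": {"secret": ImportRef("pkg", "TOKEN")},
    # Imports TOKEN, then rebinds it to a constant: the rebinding wins.
    "safe": {"TOKEN": Value(tainted=False)},
}
```

With this table, `is_tainted(MODULES, "app", "secret")` resolves through the alias and the re-export to the tainted definition, while the rebound name in `safe` stays clean, which is the overtainting case the shadowing tests guard against.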

Current limits and assumptions

  • Interfile results only exist for files included in the scan inputs; upstream modules outside the scan set intentionally do not produce findings.
  • Ruby lambda callback invocation and return sanitization are still incomplete.
  • JS callback writes through captured outer variables are not modeled yet.

@mfow-nullify mfow-nullify marked this pull request as ready for review April 10, 2026 01:26
@mfow-nullify mfow-nullify merged commit 3237744 into main Apr 10, 2026
6 checks passed

2 participants