PS: Add field flow #104

Merged
merged 2 commits into from
Sep 26, 2024

MathiasVP (Collaborator) commented Sep 24, 2024

This PR implements field-based dataflow (i.e., "field flow") for PowerShell.

Field flow is what allows us to get flow in examples such as:

$x.f = Source
Sink $x.f

While it may look innocent, field flow is extremely powerful, and extremely hard to get to perform well. Luckily, all the complications are hidden away inside the shared dataflow library, so all we need to do is:

  • Make sure there exist DataFlow::Nodes that represent "a dataflow node into which we have just stored data". These nodes are known as "post-update nodes".
  • Tell the dataflow library what a write into a field looks like. This is the job of storeStep.
  • Tell the dataflow library what a read of a field looks like. This is the job of readStep.
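
To make the three ingredients concrete, here is a toy Python model of them for the `$x.f = Source` / `Sink $x.f` example. It is only a sketch for illustration: the real implementation is written in QL on top of the shared dataflow library, and the `Node`, `StoreStep`, and `ReadStep` classes below are hypothetical names, not part of that library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A dataflow node (hypothetical toy type, not the QL DataFlow::Node)."""
    label: str
    line: int
    post_update: bool = False  # True for "the node after data was stored into it"

    def __str__(self) -> str:
        prefix = "[post] " if self.post_update else ""
        return f"{prefix}{self.label}"


@dataclass(frozen=True)
class StoreStep:
    """A write of `value` into `field` of the object behind `target`."""
    value: Node
    field: str
    target: Node  # always a post-update node


@dataclass(frozen=True)
class ReadStep:
    """A read of `field` from `qualifier`, producing `result`."""
    qualifier: Node
    field: str
    result: Node


# $x.f = Source   (line 1)
source = Node("Source", 1)
post_x = Node("$x", 1, post_update=True)
store = StoreStep(value=source, field="f", target=post_x)

# Sink $x.f       (line 2)
x_use = Node("$x", 2)
x_dot_f = Node("$x.f", 2)
read = ReadStep(qualifier=x_use, field="f", result=x_dot_f)

print(f"store: {store.value} -> {store.target} (field {store.field})")
print(f"read:  {read.qualifier} -> {read.result} (field {read.field})")
```

The point of the post-update node is that the store targets `$x` as it is after the assignment, which is a different node from any earlier read of `$x`.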

Consider an example such as:

$x.f = Source # line 1
$y = $x # line 2
Sink $y.f # line 3

Operationally, what happens is:

  • There is a storeStep from Source to the post-update node for $x (which is printed as [post] $x). Internally, the dataflow library "remembers" that there has been a write to f when it generates PathNodes (that's the part that's extremely hard to get to perform well). When viewing paths, this will be printed as something like [post] $x [f]. The [f] part is known as the "access path" and can contain up to 5 entries (which corresponds to tracking a value that has been stored through 5 levels of nested structs/classes).
  • There's a flow step from [post] $x on line 1 to $x on line 2 (we get this from SSA).
  • There's a flow step from $x to $y on line 2.
  • There's a readStep from $y to $y.f on line 3 that reads f. Because the dataflow library "remembers" that there was a previous write to f, it will "pop off" f from the access path.
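
The walk-through above can be simulated with a small Python worklist that tracks `(node, access path)` pairs. This is a deliberately simplified sketch, not how the shared dataflow library is implemented: the step tables are hand-written for the three-line example, `$y` on lines 2 and 3 is treated as a single node, and once the limit of 5 entries is reached the toy simply refuses further stores.

```python
from collections import deque

MAX_ACCESS_PATH = 5  # the limit quoted in the description above

# Hand-written step relations for the three-line example (assumed, for illustration).
STORE_STEPS = [("Source", "f", "[post] $x")]       # $x.f = Source   (line 1)
LOCAL_STEPS = [("[post] $x", "$x"), ("$x", "$y")]  # SSA step, then $y = $x (line 2)
READ_STEPS = [("$y", "f", "$y.f")]                 # Sink $y.f       (line 3)


def reachable(start):
    """Return every (node, access_path) state reachable from `start`."""
    seen = {(start, ())}
    work = deque(seen)
    while work:
        node, path = work.popleft()
        successors = []
        # Ordinary flow steps keep the access path unchanged.
        for src, dst in LOCAL_STEPS:
            if src == node:
                successors.append((dst, path))
        # A store pushes the field onto the front of the access path.
        for src, field, dst in STORE_STEPS:
            if src == node and len(path) < MAX_ACCESS_PATH:
                successors.append((dst, (field,) + path))
        # A read is only allowed if the field matches the head; it pops it.
        for src, field, dst in READ_STEPS:
            if src == node and path and path[0] == field:
                successors.append((dst, path[1:]))
        for state in successors:
            if state not in seen:
                seen.add(state)
                work.append(state)
    return seen


def flows_to(start, sink):
    """The value itself reaches `sink` only with an empty access path."""
    return (sink, ()) in reachable(start)


print(flows_to("Source", "$y.f"))  # the field read sees the source value
print(flows_to("Source", "$y"))    # $y only holds an object whose field f is tainted
```

Note that `$y` is reached only with access path `[f]`, so the toy reports flow to `$y.f` but not to `$y` itself, which is the distinction the access path exists to capture.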

@MathiasVP MathiasVP merged commit 5803e06 into main Sep 26, 2024
1 check passed
bdrodes pushed a commit that referenced this pull request Jan 9, 2025