-
Notifications
You must be signed in to change notification settings - Fork 1.8k
C#: Adopt shared data flow implementation #1306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C#: Adopt shared data flow implementation #1306
Conversation
- General refactoring to fit with the shared data flow implementation. - Move CFG splitting logic into `ControlFlowReachability.qll`. - Replace `isAdditionalFlowStepIntoCall()` with `TaintedParameterNode`. - Redefine `ReturnNode` to be the actual values that are returned, which should yield better path information. - No longer consider overrides in CIL calls.
I also ran a performance test on a huge customer snapshot; here performance went from 17695s to 84240s, but I have verified that my upcoming improvements roughly restores performance (19518s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great stuff! You have done a lot of work.
@@ -32,6 +33,8 @@ class CallContext extends TCallContext { | |||
/** An empty call context. */ | |||
class EmptyCallContext extends CallContext, TEmptyCallContext { | |||
override string toString() { result = "<empty>" } | |||
|
|||
override Location getLocation() { result instanceof EmptyLocation } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
override EmptyLocation getLocation() { any() }
?
private import DataFlowPrivate | ||
private import DataFlowPublic | ||
|
||
private ControlFlowElement getAScope(boolean exactScope) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like there could be a class ControlFlowScope
here instead
} | ||
|
||
cached | ||
DotNet::Callable getEnclosingCallable(Node node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we use classes here? There seems to be a lot of synthetic dispatch here that would be easier to maintain using QL classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is simply to make sure that the cached predicates are evaluated in the same stage (in the cached module DataFlowPrivateCached
).
@@ -378,7 +379,9 @@ class ArrayCreation extends Expr, @array_creation_expr { | |||
class AnonymousFunctionExpr extends Expr, Callable, @anonymous_function_expr { | |||
override string getName() { result = "<anonymous>" } | |||
|
|||
override Type getReturnType() { result = getType().(DelegateType).getReturnType() } | |||
override Type getReturnType() { | |||
result = getType().(SystemLinqExpressions::DelegateExtType).getDelegateType().getReturnType() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this.
@@ -1,21 +1,13 @@ | |||
edges | |||
| ExposureInTransmittedData.cs:16:32:16:39 | access to local variable password | ExposureInTransmittedData.cs:16:32:16:39 | access to local variable password | | |||
| ExposureInTransmittedData.cs:20:32:20:44 | call to method ToString | ExposureInTransmittedData.cs:20:32:20:44 | call to method ToString | | |||
| ExposureInTransmittedData.cs:24:32:24:41 | access to property Message | ExposureInTransmittedData.cs:24:32:24:41 | access to property Message | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does all of this self-flow come from? It does not seem to be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because the shared implementation represents paths of length 0 as paths of length 1 by adding a synthetic sink node. This is also the reason why the nodes
predicate is no longer needed. (ping @aschackmull)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that these are therefore only additions to the edges
query predicate, not the select
predicate.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Although we might need to add a nodes
predicate in order to save the PathNode.toString
as labels, but that's a separate concern.
#select | ||
| HardcodedCredentials.cs:17:25:17:36 | "myPa55word" | HardcodedCredentials.cs:17:25:17:36 | "myPa55word" | HardcodedCredentials.cs:17:25:17:36 | "myPa55word" | The hard-coded value "myPa55word" flows to $@ which is compared against $@. | HardcodedCredentials.cs:17:25:17:36 | "myPa55word" | "myPa55word" | HardcodedCredentials.cs:17:13:17:20 | access to local variable password | access to local variable password | | ||
| HardcodedCredentials.cs:17:25:17:36 | "myPa55word" | HardcodedCredentials.cs:17:25:17:36 | "myPa55word" | HardcodedCredentials.cs:17:25:17:36 | "myPa55word" | The hard-coded value "myPa55word" flows to $@ which is compared against $@. | HardcodedCredentials.cs:17:25:17:36 | "myPa55word" | "myPa55word" | HardcodedCredentials.cs:17:13:17:20 | access to local variable password | access to local variable password | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There seems to be some unnecessary duplication of the results. I think these duplicated results must be removed (also in other queries).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, this is actually because of a bug in the query; it uses hasFlow
, but should have used hasFlowPath
.
@@ -46,25 +45,22 @@ abstract class ArgumentCallContext extends CallContext { | |||
* Holds if this call context represents the argument at position `i` of the | |||
* call expression `call`. | |||
*/ | |||
abstract predicate isArgument(DotNet::Expr call, int i); | |||
abstract predicate isArgument(Expr call, int i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we don't bother with CIL::Expr
because we don't analyse virtual dispatch in CIL anyway. So it makes sense to remove these from the call context for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this is because these call contexts are now only used when resolving delegate targets in DelegateDataFlow.qll
, and there flow is restricted to C# code. The shared DataFlowImplCommon.qll
has its own notion of a call context.
Finally, the PR that brings C# in line with Java and C/C++ on the data flow implementation. The PR is structured into several commits, which are best reviewed one-by-one:
toString()
change.DataFlowImpl
(Common
).qll
need not be reviewed.ControlFlowReachability.qll
.isAdditionalFlowStepIntoCall()
withTaintedParameterNode
.ReturnNode
to be the actual values that are returned, which should yield better path information.Here are some performance numbers (full report here):
I already have a few performance improvements ready for the shared implementation, which should bring performance back at par, but I would rather do those in a follow-up PR.