Dataflow: Add support for implicit reads #6107
Quick question about how this relates to something we do in the JS libraries. We support custom load steps via:
/**
 * EXPERIMENTAL. This API may change in the future.
 *
 * Holds if the property `prop` of the object `pred` should be loaded into `succ`.
 */
predicate isAdditionalLoadStep(DataFlow::Node pred, DataFlow::Node succ, string prop)
I'm wondering if
It is almost equivalent, yes. There's a slight difference if you also use type pruning, because this would easily remove the flow you gained from
5cbbae0 to d383c0f
Generated file changes for java
- `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,417,,,,,,,,
+ `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,420,,,,,,,,
- Totals,,84,1622,181,13,6,6,,33,1,58
+ Totals,,84,1625,181,13,6,6,,33,1,58
- org.apache.commons.lang3,,,417,,,,,,,,,,,,,,324,93
+ org.apache.commons.lang3,,,420,,,,,,,,,,,,,,292,128
e2fa670 to c06e152
Started a benchmark to check effects on codeql-go (without any default implementation).
Go benchmark results are good (either neutral or slightly beneficial)
Java perf measurements are thrown off by a case of bad magic. I'll need to fix it and restart them.
Generated file changes for java
- `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,417,,,,,,,,
+ `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,420,,,,,,,,
- Totals,,84,1624,181,13,6,6,,33,1,58
+ Totals,,84,1627,181,13,6,6,,33,1,58
- org.apache.commons.lang3,,,417,,,,,,,,,,,,,,324,93
+ org.apache.commons.lang3,,,420,,,,,,,,,,,,,,292,128
Looks like the performance regression caused by the store-as-taint workaround is indeed fixed:
But the new feature added to the dataflow library to achieve this does come with a performance overhead of its own, mainly due to tuple-numbering, as far as I have been able to see.
Looks solid to me.
/** A reference through the contents of some collection-like container. */
private class CollectionContent extends Content, TCollectionContent {
  override string toString() { result = "<element>" }
Why did you change the C++ content toString()s?
To match Java. They were originally just copy-pasted from Java and weren't used in either language. When I added support for collection-flow in Java I updated them to something I thought made more sense (but the old somewhat arbitrary values still lingered in C++). So now that I moved them I thought I might as well update them. I did check with @MathiasVP beforehand that this was ok with him.
Generated file changes for java
- `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,417,,,,,,,,
+ `Apache Commons Lang <https://commons.apache.org/proper/commons-lang/>`_,``org.apache.commons.lang3``,,420,,,,,,,,
- Totals,,84,1634,181,13,6,6,,33,1,58
+ Totals,,84,1637,181,13,6,6,,33,1,58
- org.apache.commons.lang3,,,417,,,,,,,,,,,,,,324,93
+ org.apache.commons.lang3,,,420,,,,,,,,,,,,,,292,128
CPP-Differences doesn't look super good. There's a clear performance regression on all of the dataflow queries. Is such a big slowdown expected?
I'd say unfortunately yes for the time being. It seems like all dataflow queries are taking a hit across all languages. I've been trying to dig into it, but there's no clear single culprit - rather it seems like the wrapping of
I do have some ideas for improving the situation, but I'd rather look into those as follow-up work.
Thanks for the explanation!
This adds a new overridable predicate on configurations:
The purpose is to support sinks and additional taint steps that accept non-empty access paths.
I've also added a default override in taint-tracking configurations, which the language-specific implementations can use to add such default support. This is just a (hopefully helpful) default and can be overridden by specific configurations. For Java, the default implementation is all array-, collection-, and map-value-read steps that match the type at the given sink / additional taint step.
This means that we ought to be able to drop the temporary store-as-taint workaround, which was costing quite some performance.
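To make "accepting a non-empty access path" concrete, here is a toy Python model of the idea. It is only a sketch and not the CodeQL implementation: `reaches_sink`, the content labels, and the two policy functions are hypothetical names invented for this illustration, and the type matching mentioned above is ignored entirely.

```python
from typing import Callable, Tuple

# An access path lists the contents a tracked value is nested in,
# outermost first. ("ArrayElement",) means "the value sits inside an
# array element of the object arriving at the node".
AccessPath = Tuple[str, ...]


def reaches_sink(access_path: AccessPath,
                 allow_implicit_read: Callable[[str], bool]) -> bool:
    """Toy check: does flow with this access path count as reaching the sink?"""
    path = access_path
    # Strip leading contents that the configuration allows to be
    # read implicitly at this node.
    while path and allow_implicit_read(path[0]):
        path = path[1:]
    # Flow is only accepted once nothing is left on the access path.
    return path == ()


def no_implicit_reads(content: str) -> bool:
    # Models a configuration that allows no implicit reads.
    return False


def container_reads(content: str) -> bool:
    # Assumed stand-in for a default along the lines described for Java:
    # array, collection, and map-value reads.
    return content in {"ArrayElement", "CollectionElement", "MapValue"}
```

In this model, a value stored in an array that is passed to a sink is missed under `no_implicit_reads` but found under `container_reads`, without first converting the store into a taint step.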