-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Python: Add module boundary flow steps #4514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python: Add module boundary flow steps #4514
Conversation
Based on https://www.python.org/dev/peps/pep-0328/#guido-s-decision Original "code" is in the Public Domain.
This is the quick-and-dirty solution, as discussed. An even quicker-and-dirtier solution would have used `ModuleValue::attr` and take the `getOrigin` of that as the source of the jump step. However, this turns out to be a bad choice, since `attr` might fail to have a value for the given attribute (for a variety of reasons). Thus, we instead appeal to a helper predicate that keeps track of which names are defined by which right-hand-sides in a given module. (Observe that type tracking works correctly for `x` in `mymodule.py`, even though `x` is never assigned a value in the eyes of the Value API.) This means that points-to is only used to actually figure out if the object we're looking an attribute up on is a module or not. This is the next thing to replace in order to eliminate the dependence on points-to, but this will require some care to ensure that all module lookups are handled correctly. Only two test files needed to be changed for the tests to pass. The first was the fixed false negative in the type tracker, and the other was a bunch of missing flow in the regression test. I have manually removed the `# Flow not found` annotations to make them consistent with the output. Pay particular attention to the annotation on line 117 -- I believe it was misplaced and should have been on line 106 instead (where, indeed, we now have flow where none appeared before).
yoff
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One simple typo, otherwise looks good. It does look like you are correct about the test annotations. It will be glorious once all test annotations are verified :-)
python/ql/src/experimental/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
Co-authored-by: yoff <lerchedahl@gmail.com>
yoff
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
RasmusWL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
I'll just run this on a snapshot locally, and if everything looks good, merge it as well 👍
Local testing shows that the `getDefinition` result for this is a `SSA filter definition`, and not an `AssignmentDefinition`.
RasmusWL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running it against real snapshot showed that we did not quite handle my case. I figured out the problem, and added testcase in tausbn#2
…efinition Python: Add test for tricky module member for type-tracking
I'm slightly suspicious of this fix -- it seems to work, but it makes me wonder if we're potentially missing other kinds of flow, by not handling other kinds of definitions. Also, I feel like this should really be attached to an appropriate post-update node of the given argument. As it is written now, the flow will go from the argument _before_ the call, which obviously misses a step if the argument is modified by the call. In practice, I would expect this to be rather rare.
RasmusWL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 💪 works on the real project I mentioned in tausbn#2 as well 🎉
yoff
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, it seems we might do this in a more principled way later, by referring to ModuleExports, perhaps. But this is fine as bandaid for now.
This is the quick-and-dirty solution, as discussed.
An even quicker-and-dirtier solution would have used
ModuleValue::attrand take thegetOriginof that as the source ofthe jump step. However, this turns out to be a bad choice, since
attrmight fail to have a value for the given attribute (for avariety of reasons). Thus, we instead appeal to a helper predicate
that keeps track of which names are defined by which right-hand-sides
in a given module. (Observe that type tracking works correctly for
xin
mymodule.py, even thoughxis never assigned a value in theeyes of the Value API.)
This means that points-to is only used to actually figure out if the
object we're looking an attribute up on is a module or not. This is
the next thing to replace in order to eliminate the dependence on
points-to, but this will require some care to ensure that all module
lookups are handled correctly.
Only two test files needed to be changed for the tests to pass. The
first was the fixed false negative in the type tracker, and the other
was a bunch of missing flow in the regression test. I have manually
removed the
# Flow not foundannotations to make them consistentwith the output. Pay particular attention to the annotation on line
117 -- I believe it was misplaced and should have been on line 106
instead (where, indeed, we now have flow where none appeared before).
Finally, this PR adds a bunch of test files based on PEP-328 in anticipation
of the import resolution handling that will be needed eventually. The actual
test will come at a later point. (Probably not this PR.)