Python: Add module boundary flow steps #4514

tausbn · 2020-10-19T17:26:37Z

This is the quick-and-dirty solution, as discussed.

An even quicker-and-dirtier solution would have used
`ModuleValue::attr` and taken the `getOrigin` of that as the source of
the jump step. However, this turns out to be a bad choice, since
`attr` might fail to have a value for the given attribute (for a
variety of reasons). Thus, we instead appeal to a helper predicate
that keeps track of which names are defined by which right-hand sides
in a given module. (Observe that type tracking works correctly for `x`
in `mymodule.py`, even though `x` is never assigned a value in the
eyes of the Value API.)
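To make the idea of "which names are defined by which right-hand sides" concrete, here is a minimal Python sketch using the standard `ast` module. It is only an illustration of the bookkeeping, not the QL helper predicate from this PR; the function name `module_level_definitions`, the sample source `MYMODULE`, and the call `tracked()` are all hypothetical.

```python
import ast
from collections import defaultdict


def module_level_definitions(source: str) -> dict:
    """Map each name assigned at module level to the text of its right-hand sides.

    Illustrative only: handles plain `name = value` assignments and ignores
    everything else (augmented assignments, imports, nested scopes, ...).
    """
    tree = ast.parse(source)
    defs = defaultdict(list)
    for stmt in tree.body:  # top-level statements only
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    # ast.unparse requires Python 3.9 or later
                    defs[target.id].append(ast.unparse(stmt.value))
    return dict(defs)


# Hypothetical contents of a module such as mymodule.py
MYMODULE = "x = tracked()\ny = 1\ny = x\n"

definitions = module_level_definitions(MYMODULE)
print(definitions)
```

With a table like this, a read of `mymodule.x` in another file can be linked to the expression(s) assigned to `x` in the module, without needing a resolved value for the attribute.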

This means that points-to is only used to figure out whether the
object we're looking up an attribute on is a module. This is
the next thing to replace in order to eliminate the dependence on
points-to, but this will require some care to ensure that all module
lookups are handled correctly.

Only two test files needed to be changed for the tests to pass. The
first was the fixed false negative in the type tracker, and the second
was previously missing flow in the regression test. I have manually
removed the `# Flow not found` annotations to make them consistent
with the output. Pay particular attention to the annotation on line
117 -- I believe it was misplaced and should have been on line 106
instead (where, indeed, we now have flow where none appeared before).

Finally, this PR adds a bunch of test files based on PEP-328 in anticipation
of the import resolution handling that will be needed eventually. The actual
test will come at a later point. (Probably not this PR.)

Based on https://www.python.org/dev/peps/pep-0328/#guido-s-decision Original "code" is in the Public Domain.
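For readers unfamiliar with the import forms that PEP 328 describes, the following sketch builds a small throwaway package and imports from it using relative imports. It is loosely modelled on the layout in the PEP and is not a copy of the test files added in this PR; the package name `pep328_demo` and all module and variable names are assumptions made for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout; file contents exercise three relative-import forms.
FILES = {
    "pep328_demo/__init__.py": "",
    "pep328_demo/sub1/__init__.py": "",
    "pep328_demo/sub1/module_y.py": "spam = 'spam from module_y'\n",
    "pep328_demo/sub1/module_x.py": (
        "from .module_y import spam        # name from a sibling module\n"
        "from . import module_y            # sibling module itself\n"
        "from ..sub2.module_z import eggs  # name from a module in another subpackage\n"
    ),
    "pep328_demo/sub2/__init__.py": "",
    "pep328_demo/sub2/module_z.py": "eggs = 'eggs from module_z'\n",
}


def build_and_import(root: Path):
    """Write the package under `root` and import pep328_demo.sub1.module_x."""
    for rel, text in FILES.items():
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        return importlib.import_module("pep328_demo.sub1.module_x")
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    module_x = build_and_import(Path(tmp))
    resolved = (module_x.spam, module_x.module_y.spam, module_x.eggs)

print(resolved)
```

Each of the three names ends up bound to a value defined in a different file, which is the kind of cross-module resolution an import-aware analysis would eventually need to follow.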


yoff

One simple typo, otherwise looks good. It does look like you are correct about the test annotations. It will be glorious once all test annotations are verified :-)

python/ql/src/experimental/dataflow/internal/DataFlowPrivate.qll


yoff

LGTM 👍

RasmusWL

LGTM!

I'll just run this on a snapshot locally, and if everything looks good, merge it as well 👍

Local testing shows that the `getDefinition` result for this is a `SSA filter definition`, and not an `AssignmentDefinition`.

RasmusWL

Running it against a real snapshot showed that we did not quite handle my case. I figured out the problem, and added a test case in tausbn#2


I'm slightly suspicious of this fix -- it seems to work, but it makes me wonder if we're potentially missing other kinds of flow, by not handling other kinds of definitions. Also, I feel like this should really be attached to an appropriate post-update node of the given argument. As it is written now, the flow will go from the argument _before_ the call, which obviously misses a step if the argument is modified by the call. In practice, I would expect this to be rather rare.
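As a plain-Python illustration of the concern about the argument being modified by the call (not taken from the PR's tests; the names `register` and `handlers` are hypothetical), consider a module-level value that is passed to a function which mutates it:

```python
def register(handlers):
    """Mutates its argument in place."""
    handlers.append("default")


handlers = []              # module-level definition
before = list(handlers)    # snapshot of the value before the call
register(handlers)         # the call modifies the object
after = list(handlers)     # snapshot of the value after the call

print(before, after)
```

A flow step that starts at the argument as it was before the call corresponds to `before`, whereas another module reading `handlers` at runtime would observe `after`. That gap is what attaching the step to a post-update node would be expected to close.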

RasmusWL

Nice 💪 works on the real project I mentioned in tausbn#2 as well 🎉

yoff

Interesting, it seems we might do this in a more principled way later, by referring to ModuleExports, perhaps. But this is fine as a band-aid for now.


tausbn added 2 commits October 16, 2020 12:03

Python: Add PEP-328 test example

60fcb5e

Based on https://www.python.org/dev/peps/pep-0328/#guido-s-decision Original "code" is in the Public Domain.

tausbn requested a review from a team as a code owner October 19, 2020 17:26

github-actions bot added the Python label Oct 19, 2020

tausbn changed the title ~~Python add module boundary flow steps~~ Python: Add module boundary flow steps Oct 19, 2020

yoff previously approved these changes Oct 19, 2020


python/ql/src/experimental/dataflow/internal/DataFlowPrivate.qll

Python: Fix typo in QLDoc

f5ec548

Co-authored-by: yoff <lerchedahl@gmail.com>

tausbn dismissed yoff’s stale review via f5ec548 October 19, 2020 21:51

tausbn requested a review from yoff October 19, 2020 21:54

yoff previously approved these changes Oct 19, 2020


RasmusWL approved these changes Oct 20, 2020


Python: Add test for tricky module member for type-tracking

045a6c3

Local testing shows that the `getDefinition` result for this is a `SSA filter definition`, and not an `AssignmentDefinition`.

RasmusWL requested changes Oct 20, 2020


Merge pull request #2 from RasmusWL/python-tricky-import-ssa-filter-d…

802a725

…efinition Python: Add test for tricky module member for type-tracking

tausbn dismissed yoff’s stale review via 802a725 October 20, 2020 10:51

tausbn added 2 commits October 20, 2020 13:11

Python: Mark failing test as false negative

860cafe

tausbn requested review from RasmusWL and yoff October 20, 2020 11:29

RasmusWL approved these changes Oct 20, 2020


yoff approved these changes Oct 20, 2020


yoff merged commit 17155b6 into github:main Oct 20, 2020

RasmusWL added a commit to RasmusWL/codeql that referenced this pull request Oct 20, 2020

Python: Django route handlers in different file now works

6920f30

Fixed by github#4514

tausbn deleted the python-add-module-boundary-flow-steps branch February 12, 2021 18:03
