Python: Use `TUnknown` as the result of calls to methods with unknown return types #2915

tausbn · 2020-02-25T15:37:26Z

TL;DR: We were not handling calls to methods like `dict.get` correctly, and thus they would not receive a corresponding `Value`. The change is fairly simple: if we don't know the precise return type, just assume it's `object`. The pointed-to `Value` will then be an unknown instance of `object`, which is fine.

To ease the rollout of this test, currently we only report missing points-to information for nodes that either

- appear as an argument in a call to a function named `check`, or
- appear inside a scope where the first line is annotated with a comment ending in "check".

The idea behind the second version is that once we have points-to running at a level where no node inside a scope that _ought_ to have points-to is missing this information, we can simply remove all uses of `check(...)` from inside this scope, and annotate the entire scope with `# check`. Once this has been done for the entire file, we can then remove all the comments and just require _everything_ to be checked.

Note that I don't expect all nodes to need points-to information. For instance, there are nodes representing scope entry and exit, and for these it doesn't make sense to require that they "point to" anything. Similarly, `NameNode`s appearing in a "store" (i.e. as the left-hand side of an assignment) do not strictly need to have points-to information, although it might be more intuitive if they did. Thus, the `relevant_node` predicate will almost certainly need to be extended to exclude these kinds of nodes.
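As a rough illustration of the two opt-in mechanisms described above, a Python test input might look something like the sketch below. The function names `lookup_with_marker` and `lookup_whole_scope` are hypothetical, the body of `check` is an assumption (the description only says that the function is named `check`), and placing the comment on the `def` line is one reading of "the first line" of a scope.

```python
# Hypothetical test input, not taken from the actual test files in this PR.

def check(*args):
    """Marker function: the test only cares that a call to `check` exists.

    Returning the arguments is an assumption made so this sketch is runnable.
    """
    return args


def lookup_with_marker(d):
    # First mechanism: only the node passed to check(...) is reported
    # if it is missing points-to information.
    value = d.get("key")
    check(value)
    return value


def lookup_whole_scope(d):  # check
    # Second mechanism: the first line of this scope carries a comment ending
    # in "check", so every relevant node inside the scope is reported if it
    # is missing points-to information.
    value = d.get("key")
    return value
```

Once a scope no longer needs any individual `check(...)` calls, it can be switched from the first style to the second.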

RasmusWL

a few nitpicks, but otherwise LGTM 😄

python/ql/src/semmle/python/objects/Callables.qll

python/ql/test/library-tests/PointsTo/new/PointsToMissing.ql

RasmusWL · 2020-02-26T09:35:18Z

I'm actually curious if this would have any performance implications. I don't have an intuitive feel for it, but I'm guessing that anything that changes points-to analysis might be worth investigating? 😄

tausbn · 2020-02-26T13:43:25Z

You're right. I was considering this myself. I'll set up a differences job to see how it fares.

tausbn · 2020-03-12T16:45:35Z

Dist-compare report: https://git.semmle.com/gist/taus/269cda979d7c3d34ebbbf12d4f860eca
Lots of changes in alerts (probably because there is way less points-to pruning now). I still haven't gone through these results to see if they're correct.

tausbn · 2020-03-12T17:31:30Z

Okay, having now looked at the results, I see some problems with them. Currently, we do not consider an unknown instance of `object` to be callable, so we get lots of false positives for "non-callable called".
I'll see if `TUnknown` fares better in this respect.
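To make the false-positive pattern concrete, here is a small Python sketch of code that could plausibly trigger it. The names `handlers` and `dispatch` are hypothetical and not taken from the dist-compare results; the comments describe the behaviour discussed in this thread rather than verified query output.

```python
# Hypothetical example of code that is correct at runtime but could be
# flagged if the result of dict.get is modelled as a non-callable object.

handlers = {
    "greet": lambda name: "hello " + name,
}


def dispatch(command, arg):
    # The precise return type of dict.get is not known to the analysis.
    handler = handlers.get(command)
    if handler is None:
        return None
    # If `handler` is treated as "an unknown instance of object", and such
    # instances are not considered callable, this call looks like a call to
    # a non-callable value even though it works fine when executed.
    return handler(arg)
```

Treating the result as a fully unknown value instead avoids asserting anything about whether it can be called.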

tausbn · 2020-03-16T10:40:31Z

Note: I expect tests to start failing again, since a bunch of Values will now be missing (as we do not at present allow unknown values). I will fix this up in a later commit, as well as address the review comments.

RasmusWL · 2020-03-16T15:03:00Z

I think it looks good, but could we do another dist-compare after the "Python: Use `TUnknown` instead of `TUnknownInstance`." commit?

tausbn · 2020-03-16T15:08:42Z

Indeed. That's why I left the "Awaiting Evaluation" label. Hopefully we should have fresh results tomorrow. 🙂

tausbn · 2020-03-16T20:27:53Z

Latest report: https://git.semmle.com/gist/taus/07875fdb07c028c31a0a7c06a26eab44
Haven't looked at it in detail yet, but it looks decent enough.

RasmusWL

As discussed in our meeting, Taus looked over the results and said they look good, so this is good to go!

RasmusWL · 2020-03-18T16:51:58Z

Except for that merge conflict of course 😄 (probably just got introduced since I just merged 2 PRs)

RasmusWL · 2020-03-19T09:27:52Z

Oh no, tests are failing 😢 @tausbn

tausbn added 3 commits February 25, 2020 16:07

Python: Add tests for missing points-to for built-in methods.

5813209

Python: Use object as default return type for built-ins.

35ada17

tausbn added the Python label Feb 25, 2020

tausbn requested a review from a team as a code owner February 25, 2020 15:37

RasmusWL previously approved these changes Feb 25, 2020

python/ql/src/semmle/python/objects/Callables.qll (Outdated)

python/ql/test/library-tests/PointsTo/new/PointsToMissing.ql (Outdated)

Python: Update test results for ReturnTypes.ql for Python 2.

1526c86

tausbn dismissed RasmusWL’s stale review via 1526c86 February 25, 2020 16:55

tausbn mentioned this pull request Feb 27, 2020

Python: Add example of re.compile missing points-to #2932

Merged

tausbn added the "Awaiting evaluation" label (Do not merge yet, this PR is waiting for an evaluation to finish) Mar 10, 2020

Merge branch 'master' into python-add-points-to-for-missing-builtin-r…

4b5a20d

…eturn-types

Python: Use TUnknown instead of TUnknownInstance.

2d8f3bb

tausbn changed the title ~~Python: Use object as default return type for built-ins.~~ Python: Use TUnknown as the result of calls to methods with unknown return types Mar 16, 2020

tausbn added 2 commits March 16, 2020 12:48

Python: Fix up tests.

81f6877

Python: Fix comment based on review.

5579dfb

tausbn requested a review from RasmusWL March 16, 2020 11:51

RasmusWL previously approved these changes Mar 18, 2020

Merge branch 'master' into python-add-points-to-for-missing-builtin-r…

ae1268f

…eturn-types

tausbn dismissed RasmusWL’s stale review via ae1268f March 18, 2020 16:59

RasmusWL approved these changes Mar 19, 2020

semmle-qlci merged commit 2821b01 into github:master Mar 19, 2020

tausbn deleted the python-add-points-to-for-missing-builtin-return-types branch February 12, 2021 18:03
