Skip to content

Python: Use TUnknown as the result of calls to methods with unknown return types #2915

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

tausbn
Copy link
Contributor

@tausbn tausbn commented Feb 25, 2020

TL;DR: We were not handling calls to methods like dict.get correctly, and thus they would not receive a corresponding Value. The change is fairly simple: If we don't know the precise return type, just assume it's object. The pointed-to Value will then be an unknown instance of object, which is fine.

To ease the rollout of this test, currently we only report missing points-to
information for nodes that either

- appear as an argument in a call to a function named `check`, or
- appear inside a scope where the first line is annotated with a comment ending
  in "check".

The idea behind the second version is that once we have points-to running at a
level where no node inside a scope that _ought_ to have points-to is missing
this information, we can simply remove all uses of `check(...)` from inside this
scope, and annotate the entire scope with `# check`. Once this has been done for
the entire file, we can then remove all the comments and just require
_everything_ to be checked.

Note that I don't expect all nodes to have the need for points-to information.
For instance, there are nodes representing scope entry and exit, and for these
it doesn't make sense to require that they "point-to" anything. Similarly,
`NameNode` appearing in a "store" (i.e. as the left hand side of an assignment)
do not strictly need to have points-to information, although it might be more
intuitive if they did.

Thus, the `relevant_node` predicate will almost certainly need to be extended to
exclude these kinds of nodes.
@tausbn tausbn added the Python label Feb 25, 2020
@tausbn tausbn requested a review from a team as a code owner February 25, 2020 15:37
RasmusWL
RasmusWL previously approved these changes Feb 25, 2020
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few nitpicks, but otherwise LGTM 😄

@RasmusWL
Copy link
Member

I'm actually curious if this would have any performance implications. I don't have an intuitive feel for it, but I'm guessing that anything that changes points-to analysis might be worth investigating? 😄

@tausbn
Copy link
Contributor Author

tausbn commented Feb 26, 2020

You're right. I was considering this myself. I'll set up a differences job to see how it fares.

@tausbn tausbn added the Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish label Mar 10, 2020
@tausbn
Copy link
Contributor Author

tausbn commented Mar 12, 2020

Dist-compare report: https://git.semmle.com/gist/taus/269cda979d7c3d34ebbbf12d4f860eca
Lots of changes in alerts (probably because there is way less points-to pruning now). I still haven't gone through these results to see if they're correct.

@tausbn
Copy link
Contributor Author

tausbn commented Mar 12, 2020

Okay, having now looked at the results, I see some problems with them. Currently, we do not consider an unknown instance of object to be callable, so we get lots of false positives for non-callable called.
I'll see if TUnknown fares better in this aspect.

@tausbn tausbn changed the title Python: Use object as default return type for built-ins. Python: Use TUnknown as the result of calls to methods with unknown return types Mar 16, 2020
@tausbn
Copy link
Contributor Author

tausbn commented Mar 16, 2020

Note: I expect tests to start failing again, since a bunch of Values will now be missing (as we do not at present allow unknown values). I will fix this up in a later commit, as well as address the review comments.

@tausbn tausbn requested a review from RasmusWL March 16, 2020 11:51
@RasmusWL
Copy link
Member

I think it looks good, but could we do an other dist-compare after the Python: Use `TUnknown` instead of `TUnknownInstance`. commit?

@tausbn
Copy link
Contributor Author

tausbn commented Mar 16, 2020

Indeed. That's why I left the "Awaiting Evaluation" label. Hopefully we should have fresh results tomorrow. 🙂

@tausbn
Copy link
Contributor Author

tausbn commented Mar 16, 2020

Latest report: https://git.semmle.com/gist/taus/07875fdb07c028c31a0a7c06a26eab44
Haven't looked at it in detail yet, but it looks decent enough.

RasmusWL
RasmusWL previously approved these changes Mar 18, 2020
Copy link
Member

@RasmusWL RasmusWL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in our meeting, Taus looked over the results and said the look good, so this is good to go!

@RasmusWL
Copy link
Member

Except for that merge conflict of course 😄 (probably just got introduced since I just merged 2 PRs)

@RasmusWL
Copy link
Member

Oh no, tests are failing 😢 @tausbn

@semmle-qlci semmle-qlci merged commit 2821b01 into github:master Mar 19, 2020
@tausbn tausbn deleted the python-add-points-to-for-missing-builtin-return-types branch February 12, 2021 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish Python
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants