Skip to content

Conversation

jbj
Copy link
Contributor

@jbj jbj commented May 3, 2019

See https://jira.semmle.com/browse/CPP-280. This PR is best reviewed one commit at a time.

This PR is supposed to be completely behaviour-preserving. I'll run the full test suite and CPP-Differences to confirm that.

jbj added 6 commits May 3, 2019 10:37
This imports `QualifiedName.qll` from
2f74a456290b9e0850b7308582e07f5d68de3a36 and makes minimal changes so it
compiles.

Original author: Ian Lynagh <ian@semmle.com>
This removes all uses of `Declaration.getQualifiedName` that I think can
be removed without changing any behaviour. The following uses in the
LGTM default suite remain:

* `cpp/ql/src/Security/CWE/CWE-121/UnterminatedVarargsCall.ql` (in `select`).
* `cpp/ql/src/semmle/code/cpp/dataflow/internal/DataFlowDispatch.qll` (needs template args).
* `cpp/ql/src/semmle/code/cpp/security/FunctionWithWrappers.qll` (used for alert messages).
This predicate handles templates differently from the other overloads
with the same name, so it's likely to cause confusion.
@jbj jbj added the C++ label May 3, 2019
The predicate is still deprecated, but we can't mark it as such until
the queries in our internal repo have migrated away from it.
Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any performance figures for this change?

class Namespace extends @namespace {
string toString() { result = "QualifiedName Namespace" }

string getName() { namespaces(this, result) }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be namespaces(underlyingElement(this),result) as it is in Namespace.qll?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because underlyingElement is defined higher up in the stack of abstractions.

@@ -0,0 +1,343 @@
class Namespace extends @namespace {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the same name as a class in Namespace.qll, as do most of the other classes in this file. Are we relying on (QL) namespaces to make them distinct? Is it worth explaining what's going on in a comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I’ll add a comment at the top of the file to explain what’s going on.

The reason for writing this file directly on the dbscheme is that qualified names are needed to improve the ResolveClass mechanism, and ResolveClass sits below the AST classes. That’s what Ian was doing on the branch I took this from.

getQualifiedName() != "g_strdup_printf" and
getQualifiedName() != "__builtin___sprintf_chk" and
getName() != "g_strdup_printf" and
getName() != "__builtin___sprintf_chk" and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be

not hasGlobalName("g_strdup_printf") and
not hasGlobalName("__builtin___sprintf_chk") and

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would give the same result, but I wanted to avoid the negation to minimise the performance risk. Given that we've already restricted the name in the charpred to come from a small set, do you agree it's equivalent?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, I hadn't looked outside this predicate.

jbj added 4 commits May 6, 2019 10:14
Since the types in `QualifiedName.qll` are raw db types, callers need to
use `underlyingElement` and `unresolveElement` as appropriate. This has
no effect right now but will be needed when we switch the AST type
hierarchy to `newtype`s.
@jbj
Copy link
Contributor Author

jbj commented May 6, 2019

I ran https://jenkins.internal.semmle.com/job/Query-Changes/job/CPP-Differences/555 to confirm that this PR doesn't change behaviour, but there were three changes, all in jdk8u.

  • One change is an alert from StackAddressEscapes.ql that disappears in a function named unpacker::abort. I can't find the original alert locally. I think it's just wobble because I can find previous jobs (like Query-Changes/CPP # 658) where this particular query wobbles on this particular snapshot.
  • Two alerts from IntegerOverflowTainted.ql disappeared. That's because the security-pack taint tracking library uses hasGlobalName to identify sources, and hasGlobalName changes in this PR so that the function os::recvfrom is no longer considered to have the global name recvfrom (os is a class, not a namespace). I think that's a change we want to keep.

@jbj jbj marked this pull request as ready for review May 6, 2019 11:39
@jbj jbj requested a review from a team as a code owner May 6, 2019 11:39
@jbj
Copy link
Contributor Author

jbj commented May 6, 2019

I think this PR is now ready for review. I'll post performance results when I have them.

@jbj
Copy link
Contributor Author

jbj commented May 8, 2019

I've measured performance, and it looks good. Compare the 1.20 results to the new results. It's not an apples-to-apples comparison because many unrelated things have changed between the two revisions, and the second evaluation ran with more RAM, but I think it's clear what happened with the qualified-name predicates:

  1. getQualifiedName went from 59 invocations taking a total of 2163 seconds to just 9 invocations taking a total of 252 seconds. I'd like to get it even further down, but this is what I could easily do without changing semantics.
  2. declarationHasQualifiedName (which implements the new multi-argument hasQualifiedName predicates and hasGlobalName) has 2 invocations (one of them is for a join_rhs helper predicate) taking a total of 31 seconds. That number is only for the predicate itself. Its whole cached stage takes 94 seconds, which can be seen in the Per-stage tab of the spreadsheet.

Copy link
Contributor

@rdmarsh2 rdmarsh2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor nits, LGTM otherwise

override string getName() { enumconstants(this, _, _, _, result, _) }

UserType getDeclaringEnum() { enumconstants(this, result, _, _, _, _) }
// Unlike the usual `EnumConstant`, this one doesn't have a `getDeclaringType()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not?

Copy link
Contributor

@geoffw0 geoffw0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only reviewed a few bits. I think this change is a good idea.

* For performance, prefer the multi-argument `hasQualifiedName` or
* `hasGlobalName` predicates since they don't construct so many intermediate
* strings. For debugging, the `semmle.code.cpp.Print` module produces more
* detailed output but are also more expensive to compute.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

* component of `namespaceQualifier`, no declaring type, and a base name of
* `baseName`.
*
* See the 3-argument `hasQualifiedName` for more examples.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for examples as there are no examples here. Though I'd prefer you move the "std", "vector" example to this predicate.

@jbj
Copy link
Contributor Author

jbj commented May 15, 2019

@rdmarsh2 and @geoffw0, I've pushed a commit to address your comments.

@rdmarsh2 rdmarsh2 merged commit 1479586 into github:master May 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants