C++: Fix getQualifiedName performance issues #1303
This imports `QualifiedName.qll` from 2f74a456290b9e0850b7308582e07f5d68de3a36 and makes minimal changes so it compiles. Original author: Ian Lynagh <ian@semmle.com>
This removes all uses of `Declaration.getQualifiedName` that I think can be removed without changing any behaviour. The following uses in the LGTM default suite remain:
* `cpp/ql/src/Security/CWE/CWE-121/UnterminatedVarargsCall.ql` (in `select`).
* `cpp/ql/src/semmle/code/cpp/dataflow/internal/DataFlowDispatch.qll` (needs template args).
* `cpp/ql/src/semmle/code/cpp/security/FunctionWithWrappers.qll` (used for alert messages).
This predicate handles templates differently from the other overloads with the same name, so it's likely to cause confusion.
The predicate is still deprecated, but we can't mark it as such until the queries in our internal repo have migrated away from it.
Do you have any performance figures for this change?
class Namespace extends @namespace {
  string toString() { result = "QualifiedName Namespace" }

  string getName() { namespaces(this, result) }
Should this be `namespaces(underlyingElement(this),result)` as it is in `Namespace.qll`?
No, because `underlyingElement` is defined higher up in the stack of abstractions.
@@ -0,0 +1,343 @@
class Namespace extends @namespace {
This has the same name as a class in `Namespace.qll`, as do most of the other classes in this file. Are we relying on (QL) namespaces to make them distinct? Is it worth explaining what's going on in a comment?
Yes, I’ll add a comment at the top of the file to explain what’s going on.
The reason for writing this file directly on the dbscheme is that qualified names are needed to improve the ResolveClass mechanism, and ResolveClass sits below the AST classes. That’s what Ian was doing on the branch I took this from.
-    getQualifiedName() != "g_strdup_printf" and
-    getQualifiedName() != "__builtin___sprintf_chk" and
+    getName() != "g_strdup_printf" and
+    getName() != "__builtin___sprintf_chk" and
Shouldn't this be the following?

    not hasGlobalName("g_strdup_printf") and
    not hasGlobalName("__builtin___sprintf_chk") and
I think that would give the same result, but I wanted to avoid the negation to minimise the performance risk. Given that we've already restricted the name in the charpred to come from a small set, do you agree it's equivalent?
Ah, yes, I hadn't looked outside this predicate.
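A minimal sketch of the pattern discussed in this thread, assuming a hypothetical class `ExampleFormatter` (not part of this PR) whose characteristic predicate already pins the function to one of three global names. Under that assumption, comparing `getName()` against the excluded names should select the same functions as negating `hasGlobalName`, because every member of the class is already known to be a global function:

```ql
import cpp

/** Hypothetical class for illustration only; it is not part of this PR. */
class ExampleFormatter extends Function {
  ExampleFormatter() {
    // The characteristic predicate restricts the name to a small set of
    // global functions, so every member is known to be in the global namespace.
    this.hasGlobalName("sprintf") or
    this.hasGlobalName("g_strdup_printf") or
    this.hasGlobalName("__builtin___sprintf_chk")
  }

  /** Holds unless this is one of the two excluded variants (hypothetical predicate). */
  predicate isPlainVariant() {
    // Intended to match `not hasGlobalName(...)` here, but written without `not`.
    this.getName() != "g_strdup_printf" and
    this.getName() != "__builtin___sprintf_chk"
  }
}

from ExampleFormatter f
where f.isPlainVariant()
select f
```

The two forms are only interchangeable because of the restriction in the characteristic predicate; on an unrestricted `Function`, `getName()` would also match same-named functions in other namespaces.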
Since the types in `QualifiedName.qll` are raw db types, callers need to use `underlyingElement` and `unresolveElement` as appropriate. This has no effect right now but will be needed when we switch the AST type hierarchy to `newtype`s.
I ran https://jenkins.internal.semmle.com/job/Query-Changes/job/CPP-Differences/555 to confirm that this PR doesn't change behaviour, but there were three changes, all in jdk8u.
I think this PR is now ready for review. I'll post performance results when I have them.
I've measured performance, and it looks good. Compare the 1.20 results to the new results. It's not an apples-to-apples comparison because many unrelated things have changed between the two revisions, and the second evaluation ran with more RAM, but I think it's clear what happened with the qualified-name predicates:
A couple of minor nits, LGTM otherwise
override string getName() { enumconstants(this, _, _, _, result, _) }

UserType getDeclaringEnum() { enumconstants(this, result, _, _, _, _) }
// Unlike the usual `EnumConstant`, this one doesn't have a `getDeclaringType()`.
why not?
Only reviewed a few bits. I think this change is a good idea.
 * For performance, prefer the multi-argument `hasQualifiedName` or
 * `hasGlobalName` predicates since they don't construct so many intermediate
 * strings. For debugging, the `semmle.code.cpp.Print` module produces more
 * detailed output but is also more expensive to compute.
👍
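As a rough illustration of the advice in that doc comment, the sketch below looks up a member function by its components rather than by a concatenated string. The names `myns`, `MyClass`, and `myMethod` are hypothetical placeholders, not identifiers from this PR:

```ql
import cpp

from MemberFunction f
where
  // Preferred form: each component is matched separately, so no
  // "myns::MyClass::myMethod" string needs to be built per declaration.
  f.hasQualifiedName("myns", "MyClass", "myMethod")
  // String-based form that the doc comment advises against for performance:
  //   f.getQualifiedName() = "myns::MyClass::myMethod"
select f, "Hypothetical member function found by its name components."
```

For a function in the global namespace, `f.hasGlobalName("myFunction")` (again a placeholder name) would be the corresponding single-argument form.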
 * component of `namespaceQualifier`, no declaring type, and a base name of
 * `baseName`.
 *
 * See the 3-argument `hasQualifiedName` for more examples.
Just `for examples`, as there are no examples here. Though I'd prefer you move the `"std", "vector"` example to this predicate.
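For reference, a minimal sketch of what a `"std", "vector"` example for the 2-argument predicate could look like as a query. This assumes the 2-argument `hasQualifiedName` takes the namespace qualifier followed by the base name, as the quoted doc comment describes; whether it matches on a given database may depend on how the standard library in that database declares the class:

```ql
import cpp

from Class c
where
  // Namespace component "std", no declaring type, base name "vector".
  c.hasQualifiedName("std", "vector")
select c
```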
See https://jira.semmle.com/browse/CPP-280. This PR is best reviewed one commit at a time.
This PR is supposed to be completely behaviour-preserving. I'll run the full test suite and CPP-Differences to confirm that.