Cyclomatic Complexity at Type Level for C++ #683
I have a question. How should we handle nested classes?
class A {
  void f();
  void g();
  class B {
    void h();
  };
};
Should we add the complexity of the inner class to that of the containing class? Conceptually, I like the idea of penalizing containing classes for each of their inner classes, since these fall under them as a logical unit. This would also discourage excessive class nesting overall. However, from a practical standpoint, it might be inconvenient for the user to have complexities counted twice, or even more times with deeper nesting. |
@Seeker04 As far as I remember, we counted "nested functions" (lambdas) towards a function's McCabe metric in #689, since they make that function harder to understand. (Should be checked.) By the same argument, a nested type increases the complexity of the containing type, therefore I would count it in. |
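To make the "count nested types in" option concrete, here is a minimal sketch of a type-level McCabe sum that recurses into nested types. The `Method` and `Type` structs and the `typeMcCabe` function are hypothetical names introduced only for illustration; CodeCompass keeps this data in database tables rather than in an in-memory tree, so this is not the project's implementation. It assumes C++17, where `std::vector` may be declared with an incomplete element type.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory model of a method with its function-level McCabe value.
struct Method {
  std::string name;
  unsigned mccabe;
};

// Hypothetical in-memory model of a type that may contain nested types.
struct Type {
  std::string name;
  std::vector<Method> methods;
  std::vector<Type> nested;
};

// Type-level McCabe: the sum of the type's own method complexities
// plus, recursively, the type-level complexity of every nested type.
unsigned typeMcCabe(const Type& type) {
  unsigned sum = 0;
  for (const Method& method : type.methods) {
    sum += method.mccabe;
  }
  for (const Type& inner : type.nested) {
    sum += typeMcCabe(inner);
  }
  return sum;
}
```

With this definition, the complexity of `B::h` is reported once for `B` and once more as part of `A`, which is exactly the double counting the question above is concerned about.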
Progress
Running the following query on the parsing result of tinyxml2:
select astValue, value from (CppAstNodeMetrics m join CppAstNode n on m.astNodeId = n.id) where type = 3;
yields the following table:
Most values are correct; however, there are a few problems:
struct Entity {
const char* pattern;
int length;
char value;
};
|
Note: In order to use valgrind, I needed to run This solved the valgrind error: valgrind: m_libcfile.c:66 (vgPlain_safe_fd): Assertion 'newfd >= VG_(fd_hard_limit)' failed.
Segmentation fault (core dumped) The image was built using the Docker guide. |
The failed assertion occurs because of this:
|
In the current state, the following query on tinyxml2:
select astValue, value from (CppAstNodeMetrics m join CppAstNode n on m.astNodeId = n.id) where type = 3;
yields:
Classes with only implicit methods are no longer present, and the metrics of the others are now correct. Template classes still cause some duplicates. A potential solution would be to call getTemplateSpecializationKind in the parser and check the returned TemplateSpecializationKind. Then a new |
As discussed today, the types of the project should be considered first, and only then should the methods of those types be queried. This should also be more efficient. |
I did some investigation. We see results like this for template classes:
because some methods are only generated for the original template AST node and not for the template instantiations. In case of
My first hunch was that these are the ones that do not depend on the template argument, but that's not true, because It looks like the method AST nodes for the template instantiations are only generated on demand, i.e., if they are used in the translation unit. This would also explain the even more diverse set of values for This is covered in [temp.inst] in the C++ standard (in C++11, this is §14.7.1/10; in C++14, it is §14.7.1/11; and in C++17, it is §17.7.1/9; excerpt from C++17 below) - source
The "shall not" part suggests that this is not enforced and is probably compiler-dependent. Simpler/older compilers might generate all methods, but g++ 11.4.0 does not. I still think the best way to remove these duplicates would be to somehow include information about a record being a template class or template instantiation in the database (by tagging it, for example), because there is no way to tell the templates and template instantiations apart by observing the current database (i.e., If we could distinguish between the template AST node and its instantiations, then we could simply include only the former in the type McCabe metrics. |
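The on-demand behaviour described above can be observed with a small, self-contained example. This is only an illustrative sketch; `NoLess`, `Box`, `useGetOnly` and `useLessThan` are hypothetical names that do not come from tinyxml2 or CodeCompass. `Box<NoLess>::lessThan` could not compile if its definition were instantiated, yet the code is well-formed because that member is never used.

```cpp
// Hypothetical type that deliberately has no operator<.
struct NoLess {
  int v;
};

template <typename T>
struct Box {
  T value;

  // Usable for any T.
  const T& get() const { return value; }

  // Requires T to support operator<. For T = NoLess this body would be
  // ill-formed, but it is only instantiated if the member is actually used.
  bool lessThan(const Box& other) const { return value < other.value; }
};

// Implicitly instantiates Box<NoLess>, but only the definition of get().
int useGetOnly() {
  Box<NoLess> box{NoLess{42}};
  return box.get().v;
}

// Uses lessThan, so the definition of Box<int>::lessThan is instantiated too.
bool useLessThan() {
  Box<int> a{1};
  Box<int> b{2};
  return a.lessThan(b);
}
```

In this sketch, only `Box<int>` has an instantiated `lessThan` definition, which would explain why per-instantiation sums of method complexities can differ from each other and from the template itself.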
In programming language standards, "shall" is a synonym for "must" and "shall not" is a synonym for "must not". "Should" and "should not" express optionality. So it is not compiler-dependent (and never was).
Let's proceed with that approach then. Is it valid though that the complexity numbers are so different for these template method instantiations? |
Yes, it is valid. The three methods which are not used by the template instantiations of I implemented the logic that marks and then skips template instantiations - other than explicit specializations - during the metric calculation. It's in the latest commit: 0bf929a Now, only the proper values appear for the template classes |
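The filtering rule described in this comment could look roughly like the sketch below. It is not the code from the commit: `SpecKind` is a local stand-in whose enumerators mirror Clang's `TemplateSpecializationKind` values (`TSK_Undeclared`, `TSK_ImplicitInstantiation`, and so on), and `RecordInfo`, `countsTowardsTypeMetric` and `recordsToMeasure` are hypothetical names. It also assumes that non-template classes and the template definition itself are tagged as `Undeclared`.

```cpp
#include <string>
#include <vector>

// Local stand-in mirroring the enumerators of Clang's TemplateSpecializationKind.
enum class SpecKind {
  Undeclared,                        // assumed: plain class or the template itself
  ImplicitInstantiation,
  ExplicitSpecialization,
  ExplicitInstantiationDeclaration,
  ExplicitInstantiationDefinition
};

// Hypothetical record entry as it might be tagged by the parser.
struct RecordInfo {
  std::string name;
  SpecKind kind;
  unsigned mccabe;
};

// Keep ordinary records and explicit specializations; skip every other
// kind of template instantiation.
bool countsTowardsTypeMetric(SpecKind kind) {
  return kind == SpecKind::Undeclared ||
         kind == SpecKind::ExplicitSpecialization;
}

// Returns the records that should receive a type-level McCabe value.
std::vector<RecordInfo> recordsToMeasure(const std::vector<RecordInfo>& all) {
  std::vector<RecordInfo> kept;
  for (const RecordInfo& record : all) {
    if (countsTowardsTypeMetric(record.kind)) {
      kept.push_back(record);
    }
  }
  return kept;
}
```

Explicit specializations are kept because they carry their own hand-written method bodies, so their complexity is not a duplicate of the template's.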
There are still duplicate entities in xerces-c for the following query:
// Lookup the definition (different AST node if not defined in class body)
const auto methodDef = _ctx.db->query_one<AstNode>(
  odb::query<AstNode>::entityHash == methodAstNode->entityHash &&
  odb::query<AstNode>::symbolType == AstNode::SymbolType::Function &&
  odb::query<AstNode>::astType == AstNode::AstType::Definition);
These duplicates can be extracted with the query:
select astValue, count(*) from CppAstNode
where symbolType = 1 and astType = 3
group by entityHash having count(*) > 1
order by astValue;
which yields the following result:
This is not a parsing defect; these functions indeed have the displayed number of definitions in the project. This happens because they are compiled into different binaries, so the ODR is not violated. They go into several test binaries, see tests/CMakeLists.txt.
An interesting observation is that the On the CodeCompass GUI, if I select a node for This means that the conventional queries are also not smart enough to find the definition that belongs to the corresponding translation unit.
I suspect this is not possible with the current database. CodeCompass does not handle cases where symbols with the same hash (e.g. same signature) get compiled into different binaries. Either the hash would have to be extended (by adding a mangled translation unit name, perhaps?), or we would need some new field to store this information in
@mcserep @intjftw If yes, we should decide how to proceed. If not, then I can replace the
In xerces-c, 14 of the 15886 function definition nodes have such duplicate entities. This is a ratio of about 0.09%, and it also includes functions that are not methods (like |
@Seeker04 See #198 about the issue of project-level but not ODR-level (program-level) duplications. That patch stalled, and I highly suspect the main reason is that it was developed in a world prior to incremental analysis (#266) making its way into CodeCompass, and as it stands, #198 is highly incompatible with the latter development. |
…are compiled to different binaries (Ericsson#683)
@whisperity Thank you, this is indeed the information we would need to make this metric more accurate. @mcserep @intjftw I pushed a workaround that takes the first function definition found in these cases (and I also rebased my branch). Now there are no failed assertions, and all CI checks pass. |
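The workaround of taking the first definition among several that share an entity hash can be sketched as follows. This is a simplified illustration under stated assumptions, not the code that was pushed: `DefinitionNode` and `firstDefinition` are hypothetical names, the nodes are held in a plain vector instead of being queried from the database, and "first" simply means first in the order of that vector. It requires C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a function definition AST node.
struct DefinitionNode {
  std::uint64_t entityHash;  // shared by all definitions of the "same" function
  std::string file;          // file containing this particular definition
  unsigned mccabe;           // function-level cyclomatic complexity
};

// Returns the first definition with the given hash, in the order the nodes
// appear in `defs`, or an empty optional if there is none. Further
// definitions with the same hash are ignored instead of causing a failure.
std::optional<DefinitionNode> firstDefinition(
    const std::vector<DefinitionNode>& defs, std::uint64_t entityHash) {
  for (const DefinitionNode& def : defs) {
    if (def.entityHash == entityHash) {
      return def;
    }
  }
  return std::nullopt;
}
```

The trade-off is that the chosen definition may belong to a different binary than the class being measured, so the metric stays approximate for the small number of affected functions.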
See #682 for Cyclomatic Complexity at Function Level.
Adapted to the object-oriented paradigm, this metric can also be defined for a class as the sum of the complexity metrics of its methods.