Static Analysis & Code Metrics #403

Closed
robodude666 opened this issue Apr 21, 2015 · 11 comments

Comments

@robodude666
commented Apr 21, 2015

Would be cool to support metrics and graphs over time. Can start with "simple" metrics like Lines of Code, Cyclomatic Complexity, Maintainability Index, number of broken rules, etc.

@rubberduck203

Member

commented Apr 21, 2015

👍 Consider it on the roadmap. Feel free to fork and start implementing. It could be a while before we get there.

@retailcoder retailcoder added this to the Version 2.0 milestone Sep 28, 2015

@retailcoder

Member

commented Sep 28, 2015

I want that in 2.x, but it needs to be specified in more detail:

  • Lines of code: easy. Just count the non-empty trimmed lines, and separate comment lines from lines with executable instructions.
  • Cyclomatic complexity: probably a job for a parse tree listener. Not exactly a walk in the park, and I'm not sure how I'd calculate this metric, but it's certainly feasible.
  • Maintainability index: even harder; it requires the above two metrics and a "Halstead Volume" figure.
  • Nesting: report how deeply nested the code is.
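
The line-counting rule in the first bullet could be sketched roughly as follows. This is a minimal Python illustration, not Rubberduck code: the function name `count_lines` is hypothetical, a comment line is assumed to be one that starts with an apostrophe or the `Rem` keyword, and line continuations are not handled. Every other non-empty line is counted as code, which also includes declarations.

```python
def count_lines(source: str) -> dict:
    """Classify the non-empty lines of a VBA module as code or comment."""
    counts = {"code": 0, "comment": 0}
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are not counted at all
        lowered = line.lower()
        # Assumption: a comment line starts with an apostrophe or the Rem keyword.
        if line.startswith("'") or lowered == "rem" or lowered.startswith("rem "):
            counts["comment"] += 1
        else:
            counts["code"] += 1
    return counts


# Hypothetical sample module, used only for illustration.
sample = """\
' Greets the user
Public Sub Greet()

    Rem legacy-style comment
    MsgBox "Hello"  ' a trailing comment stays with its code line
End Sub
"""
print(count_lines(sample))
```

A line that mixes code and a trailing comment is counted once, as code; whether that is the desired behaviour is one of the details that would still need specifying.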

Maintainability Index = MAX(0,(171 - 5.2 * ln(Halstead Volume) - 0.23 * (Cyclomatic Complexity) - 16.2 * ln(Lines of Code))*100 / 171)

http://blogs.msdn.com/b/codeanalysis/archive/2007/11/20/maintainability-index-range-and-meaning.aspx
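
As a rough sketch, the formula above translates directly into Python. The function name `maintainability_index` and the input validation are assumptions made for this illustration; the three inputs are expected to be computed elsewhere.

```python
import math


def maintainability_index(halstead_volume: float,
                          cyclomatic_complexity: int,
                          lines_of_code: int) -> float:
    """Maintainability Index rescaled to 0..100, per the formula quoted above."""
    # Assumption: reject inputs for which the natural logarithm is undefined.
    if halstead_volume <= 0 or lines_of_code <= 0:
        raise ValueError("halstead_volume and lines_of_code must be positive")
    raw = (171
           - 5.2 * math.log(halstead_volume)
           - 0.23 * cyclomatic_complexity
           - 16.2 * math.log(lines_of_code))
    return max(0.0, raw * 100 / 171)


# Illustrative inputs only, not measurements from a real module.
print(round(maintainability_index(1000.0, 5, 50), 1))
```

The `max(0.0, ...)` clamp mirrors the `MAX(0, ...)` in the formula, so very large or very complex procedures bottom out at 0 instead of going negative.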

Halstead Volume is $V = N \times \log_2 \eta$, where $\eta$ is the number of distinct operators plus the number of distinct operands, and $N$ is the total number of operators plus the total number of operands, which seems to roughly correspond to the number of tokens.

https://en.wikipedia.org/wiki/Halstead_complexity_measures
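
A minimal Python sketch of the volume calculation, assuming the tokens have already been split into operators and operands (that classification is the hard part and is not shown here). It uses the standard Halstead vocabulary, i.e. distinct operators plus distinct operands. The function name `halstead_volume` is hypothetical.

```python
import math


def halstead_volume(operators: list, operands: list) -> float:
    """Compute V = N * log2(eta) from pre-classified tokens."""
    length = len(operators) + len(operands)                  # N: total occurrences
    vocabulary = len(set(operators)) + len(set(operands))    # eta: distinct symbols
    if vocabulary == 0:
        return 0.0  # assumption: an empty token stream has zero volume
    return length * math.log2(vocabulary)


# Tokens of the statement `x = x + 1`, classified by hand for illustration.
print(halstead_volume(["=", "+"], ["x", "x", "1"]))  # N = 5, eta = 4 -> 10.0
```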


Assuming "number of broken rules" stands for code inspection issues found, I don't think I would factor that in: the code inspections evolve with each new release and are essentially things we think are worth fixing, not to mention the false positives. Besides, the number of code inspection results ("issues") is already an available figure.

I'd be more than happy to see the 3 metrics above, as an initial release.

@rubberduck203

Member

commented Sep 28, 2015

Re: cyclomatic complexity, Wikipedia can help us out there.

https://en.wikipedia.org/wiki/Cyclomatic_complexity

The cyclomatic complexity of a section of source code is the number of linearly independent paths within it. For instance, if the source code contained no control flow statements (conditionals or decision points), such as IF statements, the complexity would be 1, since there is only a single path through the code. If the code had one single-condition IF statement, there would be two paths through the code: one where the IF statement evaluates to TRUE and another one where it evaluates to FALSE, so the complexity would be 2 for a single IF statement with a single condition. Two nested single-condition IFs, or one IF with two conditions, would produce a complexity of 3.

Mathematically, the cyclomatic complexity of a structured program is defined with reference to the control flow graph of the program, a directed graph containing the basic blocks of the program, with an edge between two basic blocks if control may pass from the first to the second. The complexity M is then defined as

M = E − N + 2P,
where

E = the number of edges of the graph.
N = the number of nodes of the graph.
P = the number of connected components.
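
To make the definition concrete, here is a small Python sketch that evaluates M = E − N + 2P on a control flow graph given as a list of edges. The function name and the node labels are hypothetical, and P is taken to be the number of weakly connected components, found here with a simple union-find.

```python
def cyclomatic_complexity(edges: list) -> int:
    """Evaluate M = E - N + 2P for a graph given as (source, target) pairs."""
    nodes = set()
    for source, target in edges:
        nodes.update((source, target))

    # Union-find over the nodes to count weakly connected components (P).
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for source, target in edges:
        parent[find(source)] = find(target)

    components = len({find(node) for node in nodes})
    return len(edges) - len(nodes) + 2 * components


# Straight-line code: a single path through the procedure.
print(cyclomatic_complexity([("entry", "body"), ("body", "exit")]))  # 2 - 3 + 2 = 1

# One single-condition IF: the condition node either enters the body or skips it.
single_if = [("entry", "cond"), ("cond", "then"), ("cond", "exit"), ("then", "exit")]
print(cyclomatic_complexity(single_if))  # 4 - 4 + 2 = 2
```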

@retailcoder retailcoder modified the milestones: Future Versions, Version 2.0 Dec 24, 2015

@autoboosh autoboosh self-assigned this Feb 13, 2016

@autoboosh

Contributor

commented Feb 13, 2016

The attached commit takes a first step towards code metrics. It adds the following:

  1. Control flow analysis
  2. Control flow graph
  3. The CFG can be used for a lot of optimizations, measurements, etc. The commit demonstrates some of these by including implementations of cyclomatic complexity, maintainability index, and lines of code.

The CFG doesn't support macros yet. Also, errors are not 100% correct (not sure if that's even possible).

The CFG's print() method allows the output to be visualized with tools like Webgraphviz. For example, the following code:

Public Sub Test()
    For Each bar in foo
        If x > 5 Then
            Call TestFunc
        End If
    Next n
End Sub

Would be represented in the following way:

digraph Test {
    ENTRY -> ForEachStmtContext_2_4_6_9
    ForEachStmtContext_2_4_6_9 -> BlockIfThenElseContext_3_8_5_8
    ForEachStmtContext_2_4_6_9 -> EXIT
    BlockIfThenElseContext_3_8_5_8 -> EXIT
    BlockIfThenElseContext_3_8_5_8 -> ForEachStmtContext_2_4_6_9
    BlockIfThenElseContext_3_8_5_8 -> IfConditionStmtContext_3_11_3_15_ExplicitCallStmtContext_4_12_4_17
    IfConditionStmtContext_3_11_3_15_ExplicitCallStmtContext_4_12_4_17 -> EXIT
    IfConditionStmtContext_3_11_3_15_ExplicitCallStmtContext_4_12_4_17 -> ForEachStmtContext_2_4_6_9
}

Which would produce this image:

[image: rendering of the graph above]

There are probably still quite a few bugs (just found one while writing this).

What do you guys think?

@PeterMTaylor

commented Feb 13, 2016

Nice and looks useful.

@autoboosh

Contributor

commented Feb 15, 2016

Peephole optimizations? Maybe there are certain instructions in VBA that should always be replaced by better versions?

Also, we could borrow some ideas from Google's Closure Compiler: https://github.com/google/closure-compiler/tree/222e62cac00cb9e1b144f3471f9f87f418a3425f/src/com/google/javascript/jscomp

@retailcoder

Member

commented Feb 15, 2016

@autoboosh nah, I'd rather have these as suggestion-level inspections, like the one that suggests using Left$ over Left to avoid implicit type conversions, for example.

@Vogel612

Member

commented Nov 24, 2017

#3522 includes "lines", "cyclomatic complexity", and "nesting level". It does not include a maintainability index. The design should be extensible enough not to go completely bonkers when somebody wants to implement it. I think we can close this issue when the PR is merged, though.

@MDoerner

Contributor

commented Nov 25, 2017

On the module level, we could think about adding two instability metrics:

  1. Let No be the number of module-to-module references from the module to other modules and Ni be the number of those from other modules to the module. The metric would be No/(Ni+No).

  2. Let Fo be the number of members of the module depending on members of another module and Fi the number of members of other modules depending on members of the module. The metric would be Fo/(Fo+Fi).
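
The first of these metrics could be sketched in Python as below. This is only an illustration under stated assumptions: references are modelled as (from_module, to_module) pairs, every pair counts once, self-references are ignored, and a module with no references in either direction yields `None`, since the ratio is undefined there. The function and module names are hypothetical. The second metric has the same shape, with dependent members counted instead of references.

```python
from typing import Iterable, Optional, Tuple


def module_instability(module: str,
                       references: Iterable[Tuple[str, str]]) -> Optional[float]:
    """Return No / (Ni + No) for one module, or None if it has no references."""
    outgoing = 0  # No: references from this module to other modules
    incoming = 0  # Ni: references from other modules to this module
    for source, target in references:
        if source == target:
            continue  # assumption: a module referencing itself is ignored
        if source == module:
            outgoing += 1
        elif target == module:
            incoming += 1
    total = outgoing + incoming
    if total == 0:
        return None  # assumption: undefined for an isolated module
    return outgoing / total


# Hypothetical project: two modules use Utils, and Utils uses Logger.
refs = [("Module1", "Utils"), ("Module2", "Utils"), ("Utils", "Logger")]
for name in ("Module1", "Utils", "Logger"):
    print(name, module_instability(name, refs))
```

A value near 1 means the module mostly depends on others, and a value near 0 means others mostly depend on it.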

@ThunderFrame

Member

commented Nov 29, 2017

Would be nice to be able to arrange modules by:

  • Alphabetical name
  • Module type, then alphabetical name
  • Folders

@Vogel612

Member

commented Dec 18, 2018

Follow up for the Maintainability Index in #4657
