Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assignment not used #5128

Closed
wants to merge 35 commits into from

Conversation

retailcoder
Copy link
Member

Fixes known false positives by tracking and ignoring conditional assignments & references, and assignments made inside a loop body.

Copy link
Contributor

@MDoerner MDoerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get a few test with an actual expression tree, e.g. nested conditionals and conditionals in loops?

Rubberduck.Parsing/Symbols/VariableDeclaration.cs Outdated Show resolved Hide resolved
case VBAParser.ElseBlockContext _:
node = new BranchNode(tree);
lastAssignment = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really a concern of a code path analysis? This seems to be more specialized than the type suggests. When code path analysis API matures, this might need to be refactored into more specialized type. That said, I'm fine with merging it as-is as long it has a TODO marker to note that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lastAssignment is an AssignmentNode whose Usages grow with each ReferenceNode objects as the tree is traversed - that said I need to add more tests, not sure this state reset for Else blocks is covered yet.

@retailcoder
Copy link
Member Author

How should we go about walking GoTo/GoSub...Return execution paths?

@bclothier
Copy link
Contributor

bclothier commented Sep 3, 2019

If the code path analysis doesn't handle the GoTo & GoSub/Return, that probably needs to be its own PR to handle all edge cases with the arbitrary jumps. I suspect we may need to do 2 passes, one to enumerate all labels as a jump target, then to trace the path among the jumps.

/// <summary>
/// Gets all nodes reading this assignment's value.
/// </summary>
public IEnumerable<INode> Usages => _usages;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be IEnumerable<ReferenceNode>?

/// </summary>
public IEnumerable<INode> Usages => _usages;

internal void AddUsage(INode node) => _usages.Add(node);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AddUsage(ReferenceNode node)?

ValueNode = value;
}

public AssignmentNode ValueNode { get; }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: not used for AssignmentNotUsedInspection - remove?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is the vanilla code path analysis API, I would definitely want to remove. I think the API itself needs to be made generic, which an inspection can then specialize for their own uses to keep the concerns clean, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK, maybe I'm misunderstanding what the code path analysis API is supposed to do, but the way I see it, it's very much its job to create things like ReferenceNode and AssignmentNode - where AssignmentNode has its Usages already tracked, therefore where ReferenceNode already knows what assignment/value it's referring to, i.e. the client code (e.g. inspections) doesn't need to further process the trees much.
I'll still remove it because YAGNI as far as this PR is concerned, but I feel like the boundaries of this API need to be discussed/clarified.

{
INode node = default;
VBAParser.BooleanExpressionContext branchCondition;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assigned, but not used - feels confusing & wrong

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. A proper solution would be to have a true expression evaluation engine (which is a open issue ATM as we have 3 engines, each not working 100% to specs) but that's definitely beyond this PR's scope.

@retailcoder
Copy link
Member Author

Something feels off with the way we're flattening the trees for each Block - while this correctly separates execution paths, it makes it difficult to identify blocks belonging to loop or conditional execution paths.

INode.GetAssignmentNodes (extension method) does this:

var blocks = node.GetNodes(typeof(BlockNode));
foreach (var block in blocks)
{
    var flattened = block.GetFlattenedNodes(typeof(GenericNode), typeof(BlockNode));

And then proceeds to yield the AssignmentNode instances within the flattened block tree.

I'm thinking BlockNode needs an easy way to tell what type of parent it's under, and AssignmentNode needs to track which BlockNode it lives in. That way this code...

Public Sub DoSomething(ByVal foo As Long)
    If foo > 10 Then
        foo = 10
    Else
        foo = 0
    End If
    Debug.Print foo
End Sub

Makes a tree like this (tracking foo)...

  • BlockNode
    • ReferenceNode (foo > 10)
    • BranchNode (if)
      • BlockNode
        • AssignmentNode (foo = 10)
    • BranchNode (else)
      • BlockNode
        • AssignmentNode (foo = 0)
    • ReferenceNode (Debug.Print foo)

...and then ReferenceNode would link back to not just one AssignmentNode, but every AssignmentNode in each possible execution path, e.g. here the Debug.Print foo reference node would know that the value it's using is referencing foo = 10 in the If branch, and foo = 0 in the Else branch... but that's moot if ReferenceNode isn't linking back to an AssignmentNode... I think the AssignmentNotUsed inspection is actually going to need ReferenceNode to know this, in order to properly handle branch nodes; given a ReferenceNode, the inspection would be able to know whether its value is assigned in all execution paths, and consequently given an AssignmentNode the inspection would be able to know if it's referenced in any ReferenceNode in the same execution path, or any execution path that follows (barring GoTo/GoSub/Resume/Return jumps for now).

@retailcoder retailcoder added the PR-Status: WIP Pull request is work-in-progress, more commits should be expected. label Sep 7, 2019

public interface IExecutableNode : INode
{
bool HasExecuted { get; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I'm not sure bool is what we want here. If we need to measure the reachability, we need to also consider the conditionals. I'm thinking an flag bitmask is what we want:

Unreachable = 0,
Reachable = 1,
Conditional = 2,
Unconditional = 4,
Error = 8,

// Use those for assignment
ConditionallyReachable = Conditional | Reachable,
AlwaysReachable = Unconditional | Reachable,
ConditionalError = Conditional | Error

I also used the word reachable rather than execute because we are not literally executing; we are only assessing whether we could execute this node.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See analogous class under Parsing: the flag is contextual - I've yet to define what's on IExecutionContext but the idea is to have each code path carry its own context.


public class ReturnJumpNode : JumpNode<VBAParser.ReturnStmtContext>
{
public ReturnJumpNode(VBAParser.ReturnStmtContext tree, JumpNode<VBAParser.GoSubStmtContext> origin)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be the statement immediately following the GoSub, rather than the GoSub itself?

var node = new ResumeJumpNode(tree, label?.Context);
CurrentNode = node;
}
// todo: handle Resume Next?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I follow, this is handling the Resume <label> case only. We can't resolve both Resume and Resume Next. Both have similar behavior, the primary difference being that Resume repeats the same statement that failed whereas Resume Next picks up from the next statement. However, in both cases, they don't return to same place; they could return to anywhere within the procedure because errors can happen anywhere. Thus, we don't have a defined return point for those 2 statement at the compile time, unlike the Resume <label>.

I think we want to flag where Resume and Resume Next are used and simply presume that they will come from a reachable path without worrying about exactly where they will go back.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything under Rubberduck.CodeAnalysis/CodePathAnalysis is going to end up deleted or rewritten - that said I'll be treating OnErrorGoTo statements as jump nodes going to a label; I intend CPA to traverse them.

@@ -20,7 +20,7 @@ private IEnumerable<IInspectionResult> GetInspectionResults(string code, bool in
using (var state = MockParser.CreateAndParse(vbe.Object))
{

var inspection = new AssignmentNotUsedInspection(state, new Walker());
var inspection = new AssignmentNotUsedInspection(state, new ProcedureTreeVisitor());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't new things up - use the arrange pattern.

Copy link
Contributor

@MDoerner MDoerner Sep 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My next PR will actually introduce a simple base class for inspection tests.


var inspection = new ExcelMemberMayReturnNothingInspection(state);
var visitor = new ProcedureTreeVisitor();
var inspection = new ExcelMemberMayReturnNothingInspection(state, visitor);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, don't new up things here. There are several news in the file that should be an arrange.

{
public partial class VBAParser
{
public partial class ConstSubStmtContext : IIdentifierContext, IAnnotatedContext
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought - I notice that several of the classes have same code repeated for the IIdentifierContext & IAnnottedContext. Maybe consider delegating to a privately-backed class (or a privately-backed class for each interface) so that all public members are now public Foo => _backingField.Foo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed the same thing, and it tickles. I think the problem is that the nodes need to implement the interfaces, since they are being cast to these interfaces in a couple of places.. and implementing CPA traversal will be much easier if nodes can be cast to (or filtered by) the CPA interfaces (e.g. IExecutableNode) they're implementing.

For example there could easily be an ExecutableNodeImpl class:

public class ExecutableNodeImpl : IExecutableNode
{
    private readonly IDictionary<IExecutionContext, bool> _hasExecuted
        = new Dictionary<IExecutionContext, bool>();

    public bool HasExecuted(IExecutionContext context) 
        => _hasExecuted.TryGetValue(context, out var value) && value;

    public void Execute(IExecutionContext context)
        => _hasExecuted[context] = true;
}

And then the GoToStmtContext extension could do this:

private readonly ExecutableNodeImpl _executableNode = new ExecutableNodeImpl();

But then, it's no longer implementing IJumpNode : IExecutableNode, so it would need to define the members anyway.

I'm concluding that as long as we want/need tree nodes to implement interfaces, we'll have to put up with stupid-simple, but repetitive code in all the partial extension classes.

@retailcoder
Copy link
Member Author

This is going to be WIP for a while; will re-open as a draft PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR-Status: WIP Pull request is work-in-progress, more commits should be expected.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants