Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add basic intra-procedural flow analysis on high level IOperation tre… #1541

Merged
merged 4 commits into from
Jan 24, 2018

Conversation

mavasani
Copy link
Contributor

…e to improve preciseness of rules

This PR adds NullAnalysis and StringContentAnalysis to help improve results of ReviewSqlQueriesForSecurityVulnerabilities rule (CA2100). We track null state and string content (literal/non-literal state) of locals and parameters. In future this can be improved by tracking fields with points to analysis, analyzing special string functions as such as string.Format, string.Substring, string.Replace, etc.

NOTE: Currently we do not generate any back-edges for any loops or track the results of conditional expression for improving the if/else branch analysis - this work will be done once we use the CFG exposed by Microsoft.CodeAnalysis (see dotnet/roslyn#24263).

…e to improve preciseness of rules

This PR adds NullAnalysis and StringContentAnalysis to help improve results of ReviewSqlQueriesForSecurityVulnerabilities rule (CA2100). We track null state and string content (literal/non-literal state) of locals and parameters. In future this can be improved by tracking fields with points to analysis, analyzing special string functions as such as string.Format, string.Substring, string.Replace, etc.

NOTE: Currently we do not generate any back-edges for any loops or track the results of conditional expression for improving the if/else branch analysis - this work will be done once we use the CFG exposed by Microsoft.CodeAnalysis (see dotnet/roslyn#24263).
@mavasani mavasani requested a review from a team January 19, 2018 23:03
public static IBlockOperation GetTopmostParentBlock(this IOperation operation)
{
IOperation currentOperation = operation;
IBlockOperation topMostBlockOperation = null;
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

M [](start = 31, length = 1)

nit: either lowercase to 'm' or upper case in the method name. #Resolved

namespace Microsoft.CodeQuality.Analyzers.Exp.UnitTests.Maintainability
{
public partial class ReviewSQLQueriesForSecurityVulnerabilitiesTests : DiagnosticAnalyzerTestBase
{
Copy link
Contributor Author

@mavasani mavasani Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend starting with reviewing the tests to understand the additional level of preciseness we are getting from the current version of flow analysis and our scope for improving on it in future. #Closed

@@ -14,5 +14,18 @@ internal static class IDictionaryExtensions
dictionary.Add(key, value);
}
}
public static void Reset<TKey, TValue>(
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add an empty line #Resolved


public override IDictionary<TKey, TValue> Bottom => new Dictionary<TKey, TValue>();

public override int Compare(IDictionary<TKey, TValue> oldValue, IDictionary<TKey, TValue> newValue)
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare [](start = 28, length = 7)

this one does not work like a regular Compare:
Compare(a,b) == -Compare(b,a) #Resolved

Copy link
Contributor Author

@mavasani mavasani Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will take a look. This file is directly ported from Edgardo's prototype, I will understand the logic further. #Resolved

{
var value = base.VisitVariableDeclarator(operation, argument);

// Handle variabl declarations without initializer (IVariableInitializerOperation).
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

variabl [](start = 22, length = 7)

variable #Resolved

{
var value = base.VisitInvocation(operation, argument);

// Current, we are not performing flow analysis for invocations of lambda or delegate or local function.
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current [](start = 15, length = 7)

Currently? #Resolved


// Current, we are not performing flow analysis for invocations of lambda or delegate or local function.
// Pessimistically assume that all the current state could change and reset all our current analysis data.
// TODO: Analyze lambda and local functions and flow the values from it's exit block to CurrentAnalysisData.
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO [](start = 15, length = 4)

Work item for the TODO? #Resolved

/// <summary>
/// Operation visitor to flow the abstract dataflow analysis values across a given statement in a basic block.
/// </summary>
internal abstract class DataFlowOperationWalker<TAnalysisData, TAbstractAnalysisValue> : OperationVisitor<object, TAbstractAnalysisValue>
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nWalker [](start = 44, length = 7)

Maybe Visitor is more consistent #Resolved

// The newly computed abstract values for each basic block
// must be always greater or equal than the previous value
// to ensure termination.
Debug.Assert(compare <= 0, "The newly computed abstract value must be greater or equal than the previous one.");
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compare <= 0 [](start = 29, length = 12)

This semantic is not standard for Compare. Should review naming #Resolved

/// <returns>A new copy of <paramref name="value"/></returns>
public virtual T Copy(T value)
{
return value;
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return value; [](start = 12, length = 13)

Is this a Copy? #Resolved

/// <para>Less than zero: <paramref name="oldValue"/> is less than <paramref name="newValue"/>.</para>
/// <para>Zero: <paramref name="oldValue"/> equals <paramref name="newValue"/>.</para>
/// <para>Greater than zero: <paramref name="oldValue"/> is greater than <paramref name="newValue"/>.</para>
///</returns>
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this seems to be the right semantic! #Resolved

internal void AddPredecessor(BasicBlock block)
{
_predecessors.Add(block);
}
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for consistency with methods just above, you can use expression bodied method #Resolved


/// <summary>
/// Result from execution of <see cref="StringContentAnalysis"/> on a basic block.
/// It store string content values for symbols at the start and end of the basic block.
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

store [](start = 11, length = 5)

stores #Resolved

/// <summary>
/// Operation visitor to flow the string content values across a given statement in a basic block.
/// </summary>
private sealed class StringContentDataFlowOperationWalker : DataFlowOperationWalker<StringContentAnalysisData, StringContentAbstractValue>
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Walker [](start = 59, length = 6)

Visitor? #Resolved


public override StringContentAbstractValue VisitObjectCreation(IObjectCreationOperation operation, object argument)
{
// TODO: Analyze string constructor
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO [](start = 19, length = 4)

work item? #Resolved


public override StringContentAbstractValue VisitInvocation_NonLambdaOrDelegateOrLocalFunction(IInvocationOperation operation, object argument)
{
// TODO: Handle invocations of string methods (Format, SubString, Replace, etc.)
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO [](start = 19, length = 4)

Work item? #Resolved

Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concat is another method to look out for. #Resolved

@jamesqo
Copy link
Contributor

jamesqo commented Jan 19, 2018

Wow, this is really cool. Shouldn't this be implemented in Roslyn itself though instead of here? #Resolved


public override NullAbstractValue VisitDeclarationExpression(IDeclarationExpressionOperation operation, object argument)
{
var unused = base.VisitDeclarationExpression(operation, argument);
Copy link
Contributor

@ivanbasov ivanbasov Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var unused = base.VisitDeclarationExpression(operation, argument); [](start = 16, length = 66)

Do we have a state in the base class we affect with those calls? The semantic is confusing. #Resolved

Copy link
Contributor Author

@mavasani mavasani Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base class is storing the value for every visited operation in a map (see Visit override in the base class). This map is eventually added to the final analysis result. If a subtype of the visitor skips visiting some nodes, then it will lead to missing entries in this map. It relies on standard visitor implementation pattern, that every single node in the tree will be visited.

However, I agree that we must add a debug check or assert to verify this contract, otherwise it can easily be violated in a subtype. I will do so in a separate commit. #Resolved

Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can use discards here, or omit the assignment entirely? #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added debug code in DataflowOperationVisitor.Flow that will enforce that we Visit every single descendant operation.


In reply to: 162760854 [](ancestors = 162760854)

Copy link
Contributor

@ivanbasov ivanbasov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🕐

@mavasani
Copy link
Contributor Author

mavasani commented Jan 20, 2018

@jamesqo Yes, we hope eventually we will be able to expose a CFG and a framework for DFA from Roslyn, but we have decided to start the work here to give us faster iterations and end-to-end testing and also help make quicker progress on the DFA based rules. #Resolved

/// <summary>
/// Gets a collection of the string literals that could possibly make up the contents of this string <see cref="Operand"/>.
/// </summary>
public ImmutableHashSet<string> LiteralValues { get; }
Copy link
Member

@333fred 333fred Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably convenient if these were ImmutableOrderedSets instead of unordered HashSets. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, scratch that.


In reply to: 162766847 [](ancestors = 162766847)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have simplified the implementation of this type to just track: (1) Set of possible literal values and (2) Non literal state. Hopefully, the Merge function is also more intuitive now.


In reply to: 162766874 [](ancestors = 162766874,162766847)

{
result = value1;
}
else if (value1 != value2)
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add this assert as documentation (or a comment to this effect):

Debug.Assert((value1 == MaybeNull && value2 == NotNull) ||
             (value1 == NotNull && value2 == MaybeNull);
``` #Resolved

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I felt this is evident from the previous conditions. I will add a comment.


In reply to: 162772691 [](ancestors = 162772691)

result = NullAbstractValue.MaybeNull;
}
else
{
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add this assert to make clear what's happening:

Debug.Assert(value1 == value2);
``` #WontFix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is needed.


In reply to: 162772716 [](ancestors = 162772716)

}
else
{
return DefaultValue;
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would this happen? #Resolved


public override NullAbstractValue VisitDeclarationExpression(IDeclarationExpressionOperation operation, object argument)
{
var unused = base.VisitDeclarationExpression(operation, argument);
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can use discards here, or omit the assignment entirely? #Resolved

return GetValueBasedOnInstanceOrReferenceValue(operation.Operation);
}

public override NullAbstractValue VisitAddressOf(IAddressOfOperation operation, object argument)
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this isolated from all of the other methods that don't look like copy&paste? #Resolved

/// Result from execution of <see cref="NullAnalysis"/> on a basic block.
/// It store null values for symbols at the start and end of the basic block.
/// </summary>
internal class NullBlockAnalysisResult: AbstractBlockAnalysisResult<NullAnalysisData, NullAbstractValue>
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: space before colon #Resolved

return other != null &&
LiteralState == other.LiteralState &&
NonLiteralState == other.NonLiteralState &&
LiteralValues.Equals(other.LiteralValues) &&
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think immutable collections do contents equality, they just use the default object.Equals I believe. Was that your intent here? #Resolved

mergedLiteralState = StringContainsState.Yes;
if (otherState.LiteralState == StringContainsState.Yes)
{
// FxCop compat: Only merge literalValues if both states are Yes
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What other possibilities could there be? #Resolved


public override StringContentAbstractValue VisitInvocation_NonLambdaOrDelegateOrLocalFunction(IInvocationOperation operation, object argument)
{
// TODO: Handle invocations of string methods (Format, SubString, Replace, etc.)
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concat is another method to look out for. #Resolved


public BasicBlockKind Kind { get; private set; }
public ImmutableArray<IOperation> Statements => _statements.ToImmutable();
public ImmutableHashSet<BasicBlock> Successors => _successors.ToImmutable();
Copy link
Contributor

@jamesqo jamesqo Jan 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, why is an ImmutableHashSet necessary here? #WontFix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Edgardo's original prototype could invoke AddSuccessor with same block multiple times. This is temporary code that will go away once we move to compiler's CFG, so I am leaving it as is.


In reply to: 162773459 [](ancestors = 162773459)

@mavasani
Copy link
Contributor Author

Ping, can I get another look? Thanks

Copy link
Contributor

@ivanbasov ivanbasov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@mavasani
Copy link
Contributor Author

@dotnet/roslyn-analysis can I get one more review please?

@@ -1,6 +1,6 @@
Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio 15
VisualStudioVersion = 15.0.27110.0
VisualStudioVersion = 15.0.27130.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert?

Copy link
Member

@333fred 333fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@jinujoseph
Copy link
Contributor

@AlekseyTs FYI

@mavasani mavasani merged commit 3cf60c6 into dotnet:master Jan 24, 2018
@mavasani mavasani deleted the FlowAnalysis branch January 24, 2018 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants