-
Notifications
You must be signed in to change notification settings - Fork 470
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add basic intra-procedural flow analysis on high level IOperation tre… #1541
Conversation
…e to improve preciseness of rules This PR adds NullAnalysis and StringContentAnalysis to help improve results of ReviewSqlQueriesForSecurityVulnerabilities rule (CA2100). We track null state and string content (literal/non-literal state) of locals and parameters. In future this can be improved by tracking fields with points to analysis, analyzing special string functions as such as string.Format, string.Substring, string.Replace, etc. NOTE: Currently we do not generate any back-edges for any loops or track the results of conditional expression for improving the if/else branch analysis - this work will be done once we use the CFG exposed by Microsoft.CodeAnalysis (see dotnet/roslyn#24263).
public static IBlockOperation GetTopmostParentBlock(this IOperation operation) | ||
{ | ||
IOperation currentOperation = operation; | ||
IBlockOperation topMostBlockOperation = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
M [](start = 31, length = 1)
nit: either lowercase to 'm' or upper case in the method name. #Resolved
namespace Microsoft.CodeQuality.Analyzers.Exp.UnitTests.Maintainability | ||
{ | ||
public partial class ReviewSQLQueriesForSecurityVulnerabilitiesTests : DiagnosticAnalyzerTestBase | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend starting with reviewing the tests to understand the additional level of preciseness we are getting from the current version of flow analysis and our scope for improving on it in future. #Closed
@@ -14,5 +14,18 @@ internal static class IDictionaryExtensions | |||
dictionary.Add(key, value); | |||
} | |||
} | |||
public static void Reset<TKey, TValue>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add an empty line #Resolved
|
||
public override IDictionary<TKey, TValue> Bottom => new Dictionary<TKey, TValue>(); | ||
|
||
public override int Compare(IDictionary<TKey, TValue> oldValue, IDictionary<TKey, TValue> newValue) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compare [](start = 28, length = 7)
this one does not work like a regular Compare:
Compare(a,b) == -Compare(b,a) #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I will take a look. This file is directly ported from Edgardo's prototype, I will understand the logic further. #Resolved
{ | ||
var value = base.VisitVariableDeclarator(operation, argument); | ||
|
||
// Handle variabl declarations without initializer (IVariableInitializerOperation). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
variabl [](start = 22, length = 7)
variable #Resolved
{ | ||
var value = base.VisitInvocation(operation, argument); | ||
|
||
// Current, we are not performing flow analysis for invocations of lambda or delegate or local function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current [](start = 15, length = 7)
Currently? #Resolved
|
||
// Current, we are not performing flow analysis for invocations of lambda or delegate or local function. | ||
// Pessimistically assume that all the current state could change and reset all our current analysis data. | ||
// TODO: Analyze lambda and local functions and flow the values from it's exit block to CurrentAnalysisData. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO [](start = 15, length = 4)
Work item for the TODO? #Resolved
/// <summary> | ||
/// Operation visitor to flow the abstract dataflow analysis values across a given statement in a basic block. | ||
/// </summary> | ||
internal abstract class DataFlowOperationWalker<TAnalysisData, TAbstractAnalysisValue> : OperationVisitor<object, TAbstractAnalysisValue> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nWalker [](start = 44, length = 7)
Maybe Visitor is more consistent #Resolved
// The newly computed abstract values for each basic block | ||
// must be always greater or equal than the previous value | ||
// to ensure termination. | ||
Debug.Assert(compare <= 0, "The newly computed abstract value must be greater or equal than the previous one."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
compare <= 0 [](start = 29, length = 12)
This semantic is not standard for Compare. Should review naming #Resolved
/// <returns>A new copy of <paramref name="value"/></returns> | ||
public virtual T Copy(T value) | ||
{ | ||
return value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return value; [](start = 12, length = 13)
Is this a Copy? #Resolved
/// <para>Less than zero: <paramref name="oldValue"/> is less than <paramref name="newValue"/>.</para> | ||
/// <para>Zero: <paramref name="oldValue"/> equals <paramref name="newValue"/>.</para> | ||
/// <para>Greater than zero: <paramref name="oldValue"/> is greater than <paramref name="newValue"/>.</para> | ||
///</returns> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this seems to be the right semantic! #Resolved
internal void AddPredecessor(BasicBlock block) | ||
{ | ||
_predecessors.Add(block); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for consistency with methods just above, you can use expression bodied method #Resolved
|
||
/// <summary> | ||
/// Result from execution of <see cref="StringContentAnalysis"/> on a basic block. | ||
/// It store string content values for symbols at the start and end of the basic block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
store [](start = 11, length = 5)
stores #Resolved
/// <summary> | ||
/// Operation visitor to flow the string content values across a given statement in a basic block. | ||
/// </summary> | ||
private sealed class StringContentDataFlowOperationWalker : DataFlowOperationWalker<StringContentAnalysisData, StringContentAbstractValue> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Walker [](start = 59, length = 6)
Visitor? #Resolved
|
||
public override StringContentAbstractValue VisitObjectCreation(IObjectCreationOperation operation, object argument) | ||
{ | ||
// TODO: Analyze string constructor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO [](start = 19, length = 4)
work item? #Resolved
|
||
public override StringContentAbstractValue VisitInvocation_NonLambdaOrDelegateOrLocalFunction(IInvocationOperation operation, object argument) | ||
{ | ||
// TODO: Handle invocations of string methods (Format, SubString, Replace, etc.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO [](start = 19, length = 4)
Work item? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concat
is another method to look out for. #Resolved
Wow, this is really cool. Shouldn't this be implemented in Roslyn itself though instead of here? #Resolved |
|
||
public override NullAbstractValue VisitDeclarationExpression(IDeclarationExpressionOperation operation, object argument) | ||
{ | ||
var unused = base.VisitDeclarationExpression(operation, argument); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var unused = base.VisitDeclarationExpression(operation, argument); [](start = 16, length = 66)
Do we have a state in the base class we affect with those calls? The semantic is confusing. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Base class is storing the value for every visited operation in a map (see Visit override in the base class). This map is eventually added to the final analysis result. If a subtype of the visitor skips visiting some nodes, then it will lead to missing entries in this map. It relies on standard visitor implementation pattern, that every single node in the tree will be visited.
However, I agree that we must add a debug check or assert to verify this contract, otherwise it can easily be violated in a subtype. I will do so in a separate commit. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can use discards here, or omit the assignment entirely? #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added debug code in DataflowOperationVisitor.Flow
that will enforce that we Visit every single descendant operation.
In reply to: 162760854 [](ancestors = 162760854)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🕐
@jamesqo Yes, we hope eventually we will be able to expose a CFG and a framework for DFA from Roslyn, but we have decided to start the work here to give us faster iterations and end-to-end testing and also help make quicker progress on the DFA based rules. #Resolved |
/// <summary> | ||
/// Gets a collection of the string literals that could possibly make up the contents of this string <see cref="Operand"/>. | ||
/// </summary> | ||
public ImmutableHashSet<string> LiteralValues { get; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would probably convenient if these were ImmutableOrderedSets instead of unordered HashSets. #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have simplified the implementation of this type to just track: (1) Set of possible literal values and (2) Non literal state. Hopefully, the Merge function is also more intuitive now.
In reply to: 162766874 [](ancestors = 162766874,162766847)
{ | ||
result = value1; | ||
} | ||
else if (value1 != value2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add this assert as documentation (or a comment to this effect):
Debug.Assert((value1 == MaybeNull && value2 == NotNull) ||
(value1 == NotNull && value2 == MaybeNull);
``` #Resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I felt this is evident from the previous conditions. I will add a comment.
In reply to: 162772691 [](ancestors = 162772691)
result = NullAbstractValue.MaybeNull; | ||
} | ||
else | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add this assert to make clear what's happening:
Debug.Assert(value1 == value2);
``` #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
} | ||
else | ||
{ | ||
return DefaultValue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would this happen? #Resolved
|
||
public override NullAbstractValue VisitDeclarationExpression(IDeclarationExpressionOperation operation, object argument) | ||
{ | ||
var unused = base.VisitDeclarationExpression(operation, argument); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can use discards here, or omit the assignment entirely? #Resolved
return GetValueBasedOnInstanceOrReferenceValue(operation.Operation); | ||
} | ||
|
||
public override NullAbstractValue VisitAddressOf(IAddressOfOperation operation, object argument) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this isolated from all of the other methods that don't look like copy&paste? #Resolved
/// Result from execution of <see cref="NullAnalysis"/> on a basic block. | ||
/// It store null values for symbols at the start and end of the basic block. | ||
/// </summary> | ||
internal class NullBlockAnalysisResult: AbstractBlockAnalysisResult<NullAnalysisData, NullAbstractValue> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: space before colon #Resolved
return other != null && | ||
LiteralState == other.LiteralState && | ||
NonLiteralState == other.NonLiteralState && | ||
LiteralValues.Equals(other.LiteralValues) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think immutable collections do contents equality, they just use the default object.Equals
I believe. Was that your intent here? #Resolved
mergedLiteralState = StringContainsState.Yes; | ||
if (otherState.LiteralState == StringContainsState.Yes) | ||
{ | ||
// FxCop compat: Only merge literalValues if both states are Yes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What other possibilities could there be? #Resolved
|
||
public override StringContentAbstractValue VisitInvocation_NonLambdaOrDelegateOrLocalFunction(IInvocationOperation operation, object argument) | ||
{ | ||
// TODO: Handle invocations of string methods (Format, SubString, Replace, etc.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concat
is another method to look out for. #Resolved
|
||
public BasicBlockKind Kind { get; private set; } | ||
public ImmutableArray<IOperation> Statements => _statements.ToImmutable(); | ||
public ImmutableHashSet<BasicBlock> Successors => _successors.ToImmutable(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, why is an ImmutableHashSet
necessary here? #WontFix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe Edgardo's original prototype could invoke AddSuccessor with same block multiple times. This is temporary code that will go away once we move to compiler's CFG, so I am leaving it as is.
In reply to: 162773459 [](ancestors = 162773459)
Ping, can I get another look? Thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dotnet/roslyn-analysis can I get one more review please? |
@@ -1,6 +1,6 @@ | |||
Microsoft Visual Studio Solution File, Format Version 12.00 | |||
# Visual Studio 15 | |||
VisualStudioVersion = 15.0.27110.0 | |||
VisualStudioVersion = 15.0.27130.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Revert?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AlekseyTs FYI |
…e to improve preciseness of rules
This PR adds NullAnalysis and StringContentAnalysis to help improve results of ReviewSqlQueriesForSecurityVulnerabilities rule (CA2100). We track null state and string content (literal/non-literal state) of locals and parameters. In future this can be improved by tracking fields with points to analysis, analyzing special string functions as such as string.Format, string.Substring, string.Replace, etc.
NOTE: Currently we do not generate any back-edges for any loops or track the results of conditional expression for improving the if/else branch analysis - this work will be done once we use the CFG exposed by Microsoft.CodeAnalysis (see dotnet/roslyn#24263).