Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Static analysis for .NET 5 #40739

Open
stephentoub opened this issue Sep 1, 2019 · 25 comments

Comments

@stephentoub
Copy link
Member

commented Sep 1, 2019

In FxCop's hey day, many rules were written to help validate the correctness and performance of .NET code. .NET has evolved significantly since then, however, and the creation of new rules has not kept up with the times; we've added very few new analyzers to help validate the correctness and performance of code using the wealth of new types, methods, and patterns that have made their way into .NET in the interim.

For .NET 5, we should augment https://github.com/dotnet/roslyn-analyzers/tree/master/src/Microsoft.NetCore.Analyzers with new analyzers to help further validate the correctness and improve the performance of code written for .NET.

Guidelines:

  • Every rule we add should be applicable to many libraries targeting netcoreapp and/or netstandard, including but not limited to those that compose .NET Core itself.
  • Every rule we add should be done so with the goal of enabling it in all of the libraries that compose .NET Core itself.
  • Any rule we add must have a very low false positive rate. We should be able to enable the rule in all of the libraries that compose .NET Core with few-to-no suppressions.
  • Auto-fixes should be included whenever possible. However, they should only trigger for cases of extremely high confidence that the revised code is as good or better than the original.
  • When in doubt, heuristics should err on the side of not warning, i.e. we prefer false negatives over false positives. If a "confidence" level system is ever added to analyzers, this can be revisited.

This issue exists to collect and catalog ideas for such rules to be implemented in the .NET 5 timeframe.

Note that the notes below are just starting ideas... cases called out may not be the right ones to check, may be missing additional cases of importance, etc. When we decide to tackle one of these, part of the work item (which should be split off into a separate issue) will be determining the right heuristics to employ (and potentially even deciding that the false positive rate will be too high to include such a rule at all).

Correctness

System

  • HashCode usage. Find GetHashCode implementations doing manual hashcode combination and switch to HashCode.Combine.

  • ArgumentException arguments. Find places where nameof(…) is passed as the first (rather than second) argument to ArgumentException. Also places where nameof is used as the argument name in an ArgumentException (or derived type) but where it's not referring to a parameter.

  • Local DateTime math. Find places where DateTimes known to be in local time (e.g. resulting from DateTime.Now) are used in math.

  • Buffer.BlockCopy. The count argument is in bytes, but it's easy to accidentally do something like Buffer.BlockCopy(src, 0, dst, 0, src.Length), and if src is an array of elements larger than bytes, this is a bug.

System.Buffers

  • ArrayPool misuse. The rule should flag cases where a buffer is potentially being returned to the shared pool multiple times, e.g. where a buffer is returned and then still used after the Return; where a method taking an array by ref Returns the array but doesn't null out the ref; where there's a Return call in a try block, where that same instance is also returned in a catch block, and where something after the Return call in the try block could cause an exception that would trigger the same instance to be returned again in the catch; etc.

  • MemoryManager finalizers. Adding a finalizer to a MemoryManager<T>-derived type is likely an indication of a bug, as it suggests a native resource that could have been handed out in a Span<T> is getting cleaned up and potentially while it’s still in use by the Span<T>.

System.Linq

  • PLINQ nops. Using .AsParallel() at the end of a LINQ query, e.g. foreach (var item in src.Select(…).Where(…).AsParallel(…)), is a nop and should either be removed or the AsParallel moved earlier in the query. I’ve even seen developers write foreach (var item in src.AsParallel()) thinking it parallelizes the foreach loop, which it doesn’t… it’d be good to warn about such misuse.

System.Runtime.InteropServices

  • P/Invoke errors. If #40740 goes ahead, we should flag cases where we can detect very likely misuse.

System.Text.Json

  • Utf8JsonReader misuse. The rule should flag cases where a reader is passed around by value.

System.Threading

  • readonly SpinLock fields. SpinLock is a mutable struct, meant only for advanced scenarios. Accidentally making a SpinLock field readonly can result in silent but significant problems, as any mutations to the instance (e.g. Enter, Exit) will be done on a compiler-generated copy and thus be ignored, making the lock an expensive nop. (It might make sense to extend this analyzer to additional mutable struct types where storing them in a readonly field is likely a bug, e.g. GCHandle.)
  • CancellationToken flowing. The rule should try to identify places where a CancellationToken should have been passed but wasn’t, e.g. in an async method that takes a CancellationToken, a method is called that has an overload that takes a CancellationToken but a shorter overload that doesn’t take a CancellationToken was used instead. cc: @marklio

System.Threading.Tasks

  • Passing TaskContinuationOptions to new TaskCompletionSource<T>(object state). TCS has a ctor that takes a TaskCreationOptions options and another ctor that takes an object state. There’s a similar enum to TaskCreationOptions that’s only meant to be used with ContinueWith: TaskContinuationsOptions. It’s easy to accidentally pass a TaskContinuationOptions to the TCS ctor, in which case it binds to the overload accepting an object and isn’t treating as options at all (and, adding insult to “your options aren’t respected” injury, it also boxes).
  • ValueTask/ValueTask<T> correctness. The rule should detect cases where there's a strong liklihood a ValueTask{<T>} is being used incorrectly, e.g. where a single instance may be awaited multiple times, where an instance may be awaited and then also returned out of a method, where an instance may have .GetAwaiter().GetResult() called on it when it's not obviously already completed, when one is stored into a static field or a dictionary or some other publishing mechanism, etc.

Performance

System

Span
  • string.Concat with substrings. The rule should flag instances of a pattern like str1 + str2.Substring(…) + str3 or string.Concat(str1, str2.Substring(…), str3) and instead switch to using the span-based string.Concat(str1, str2.AsSpan(…), str3).

  • AsSpan instead of Substring. Somewhat more generally, any time string.Substring is used as an argument to something where there's an equivalent overload that takes a ReadOnlySpan<char> (e.g. StringBuilder.Append(string) vs StringBuilder.Append(ReadOnlySpan<char>)), the case can be flagged to be changed to use AsSpan instead.

  • Span.SequenceEquals. Identify open-coded comparison loops between two spans/arrays and suggest replacing with SequenceEquals.

  • static ReadOnlySpan<byte> properties. With a pattern like private static readonly byte[] s_array = new byte[] { ... } where the static readonly byte[] is internal/private, all consumers of s_array could instead operate on a span, and the field is initialized to a new byte[] of constant values, it can be changed instead to static ReadOnlySpan<byte> Data => new byte[] { ... }, and the C# compiler will optimize the implementation.

  • Replace local allocations with span stackallocs.. Flag places where known small temporary arrays of primitives (e.g. with a small constant length / where the total size of sizeof(T)*length can be determined to be < some threshold) not inside any loop and not passed around could be replaced by span stackallocs.

String
  • string.Concat consolidation. Various patterns of string concatenation generate unnecessary intermediate strings, e.g. string result = s1 + s2; return result + s3; will create an unnecessary string allocation, as will string result = s1 + s2; if (condition) result+= s3; return result; which could be rewritten as string result = condition ? s1 + s2 + s3 : s1 + s2;. The rule would find and offer fixes for such patterns.

  • Primitive substring parsing. The rule should flag instances of a pattern like int.Parse(str.Substring(…)) and instead switch to using the span-based int.Parse(str.AsSpan(…)). This would apply to all of the primitive types, and more generally potentially anything that has an overload taking a ReadOnlySpan<char> instead of a string.

  • String.IndexOf(...) == -1. Calls to String.IndexOf(...) where the result is then just compared to -1 can instead be replaced by calls to String.Contains(...).

StringBuilder
  • StringBuilder.Append(char vs string). It's common to see calls to StringBuilder.Append(string) with a const string containing a single character, e.g. ",". These would be slightly cheaper as calls using a const char instead.

  • StringBuilder.Append(primitive.ToString()). The primitive should be passed directly instead.

Stream
  • Stream.ReadByte/WriteByte missing overrides. The rule should flag custom Stream-derived types that don’t override ReadByte or WriteByte.

  • Stream.ReadAsync/WriteAsync missing overrides. The rule should flag custom Stream-derived types that override BeginRead/EndRead or BeginWrite/EndWrite but that don’t override ReadAsync or WriteAsync. And it should flag custom Stream-derived types that override the array-based ReadAsync or WriteAsync but that don’t override the Memory-based overloads of the same name. (Potentially the same should be done for the Span-based overloads, but as the array-based Read and Write methods are abstract and thus must be overridden, it’s harder to say whether those should be or not.)

  • Stream.Read/WriteAsync overload usage. Find places where await stream.Read/WriteAsync(array, offset, length, …) are used and recommend they be replaced by calls to the overloads that take {ReadOnly}Memory<byte>, to benefit from the return type being ValueTask<int>.

Tuples
  • Tuple instead of ValueTuple. The rule should find and flag cases where a Tuple<…> is being used but where a ValueTuple<…> would suffice, ideally with the C# language syntax employed. There are some cases where a Tuple<…> is beneficial, however, so the patterns identified here would be constrained.
Nullable
  • Nullable.GetValueOrDefault. After checking a Nullable<T>.HasValue, it's common to see calls to Nullable<T>.Value; instead of calling Value, it's less work to call GetValueOrDefault(), as Value repeats the HasValue check. It's possible a future JIT could optimize away the duplicate check, but if nothing else using GetValueOrDefault() makes the job of the JIT easier.
Other
  • params array allocation in loops. Find calls to System.* methods inside loops, where those methods take params arrays and those params arrays are being allocated, and hoist the allocation to before the loop if possible.

  • Lifting arrays of consts to statics. Arrays of consts passed to known methods on types like System.String (e.g. string.IndexOfAny(new[] { ',', '.' })) can be lifted out to static readonly fields.

System.Collections

  • Dictionary.ContainsKey(key) followed by Dictionary.Remove(key). The pair can be combined into just the Remove call.

  • Dictionary.ContainsKey(key) followed by Dictionary.this[key]. The pair can be combined into just a TryGetValue call.

  • !Dictionary.ContainsKey(key) followed by Dictionary.Add. The pair can be combined into just a TryAdd call.

  • Dictionary.ContainsKey(key), followed by Dictionary.this[key], followed by Dictionary.Remove(key). The trio can be combined into just the Remove call, using the overload accepting out TValue value.

System.Collections.Concurrent

  • Count instead of IsEmpty. Find places where a concurrent collection’s Count is accessed and compared to 0, then replace with IsEmpty.

System.Runtime.InteropServices

  • sizeof vs Marshal.SizeOf. sizeof is much more efficient than Marshal.SizeOf. In cases where they'll produce the same answer, the former should be preferred (it'll generally require using unsafe, though, so this should only fire if unsafe is allowed in the project).

System.Threading

  • Replace “old” synchronization primitives with newer ones. e.g. find places where a ManualResetEvent is created without a name, where its never passed to WaitHandle.WaitAll/Any, etc., and replace usage with ManualResetEventSlim.

System.Threading.Tasks

  • Using ValueTask<T> instead of Task<T>. Flagging internal/private methods that returns Ts that won’t be entirely cached (e.g. bool) and where every caller of the method only ever awaits its result directly.

  • Task.Delay in Task.WhenAny. Flag places where a Task.Delay is used as an argument to WhenAny and where that Task.Delay doesn’t take a cancellation token, in which case the Task.Delay is likely leaving a timer running for longer than is necessary.

  • Task.WhenAll with one argument. There’s no reason to call WhenAll with a single Task; just use the Task.

  • Task.WaitAll with one argument. Task.Wait can be used instead.

Style

System.Runtime.InteropServices

  • DllImport defaults. Flag places where an attribute is set on a DllImport that’s already the default value.

System.Threading

  • CancellationToken.ThrowIfCancellationRequested. Flag code that does if (token.IsCancellationRequested) throw new OperationCanceledException(); or if (token.IsCancellationRequested) throw new OperationCanceledException(token); and replace with token.ThrowIfCancellationRequested();.

cc: @jaredpar, @mavasani, @danmosemsft

@stephentoub stephentoub added this to the 5.0 milestone Sep 1, 2019
@mavasani

This comment has been minimized.

Copy link

commented Sep 2, 2019

Tagging @mikadumont, who is driving the effort to add the FxCop analyzers package by default to all the new C#/VB netstandard and netcore proejcts created in Visual Studio. Having the above analyzers light by default for new projects would be awesome!

@Therzok

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2019

The string.Concat optimization looks very close to an analyzer I suggested for System.Text.StringBuilder.Append.

Maybe we can piggy back on the implementation PR to get a similar set of analyzers done quicker?

@Therzok

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2019

Similarly, there's another suggestion that could be done in System.Collections.Concurrent: dotnet/roslyn-analyzers#2274

@Evangelink

This comment has been minimized.

Copy link

commented Sep 2, 2019

Is it something community developers could help with? I'd like to be able to help with the development of some (of those) analyzers.

@KrisVandermotten

This comment has been minimized.

Copy link

commented Sep 2, 2019

I have some of these implemented. Should I contribute?

@scalablecory

This comment has been minimized.

Copy link
Member

commented Sep 2, 2019

Assuming you can't find an existing one, I'd start by creating an issue for each analyzer you'd want to add, over on dotnet/roslyn-analyzers.

@Siderite

This comment has been minimized.

Copy link

commented Sep 3, 2019

I have rarely used IsEmpty because the overwhelming scenario is IsNullOrEmpty, which I always have to create in an extension method. .Net should consider an out of the box solution, similar to String
IsNullOrEmpty.

@Siderite

This comment has been minimized.

Copy link

commented Sep 3, 2019

... and no to x?.IsEmpty != true

@DaZombieKiller

This comment has been minimized.

Copy link

commented Sep 3, 2019

readonly SpinLock fields

GCHandle is another example where such an analyzer would be useful

@mareklinka

This comment has been minimized.

Copy link
Collaborator

commented Sep 3, 2019

Hi everyone, I'll be co-organizing a small in-company hackathon next weekend and we'd like to pick up some of these (most likely the ArgumentException, PLINQ nops, and maybe the read-only spinlock fields). I'll start by creating new issues in the dotnet/roslyn-analyzers repo to get the discussion started.

Anything else I'll need to do to get the ball rolling?

@GSPP

This comment has been minimized.

Copy link

commented Sep 4, 2019

Giving some feedback:

ArgumentException arguments is a great case for a rule that is not relevant in some development scenarios. In typical LOB (line of business e.g. company internal software implementing business processes) applications, the standards for code quality are lower than in systems code (such as CoreFX and library code). I'm sure that the code I work on has this particular defect hundreds of times but I don't care in the slightest way about it. It does not affect productivity.

Another example is the category of performance rules. In LOB apps, 99% of code is cold. Performance does not matter to a large extent. This makes all performance-related rules noise.

In my experience, it is hard to find "linter" rules that help productivity in LOB apps. False positives are not just caused by inappropriate rule firing, they also are caused by correct rule firing in places that nobody cares about.

The difference between a good and a bad codebase is not the ruleset implemented by C# warnings, FxCop or proposed by this issue. If a codebase is bad, the issues usually are fundamental. Such issues cannot be expressed in terms of rules.

This is not to say that this effort is without value. Rather, I want to point out how varied the development scenarios are. This needs to be kept in mind when designing rule sets.

From personal experience, I know that developers get tired of warning systems very quickly if the warnings are not actionable.

If we want high actionability (and I think that is critical), this forces a tradeoff: Either the ruleset must be very small, or there must be a very good management system for unactionable warnings. I wonder how "industrial strength" analysis software does this (such as Coverity; Eric Lippert works there now).

@mavasani

This comment has been minimized.

Copy link

commented Sep 4, 2019

Any rule we add must have a very low false positive rate. We should be able to enable the rule in all of the libraries that compose .NET Core with few-to-no suppressions.
When in doubt, heuristics should err on the side of not warning, i.e. we prefer false negatives over false positives. If a "confidence" level system is ever added to analyzers, this can be revisited.

I agree with this in general, and this should definitely be the default behavior. However, note that we do support configurable options in the FxCop analyzers package, which can allow end users to customize bunch of aspects of the rule execution:

  1. Force the rule to be more aggressive at the expense of additional false positives. There are lot of users who prefer false positives/suppressions over false negatives. Additionally, one may also want to do a one-off exercise of switching to more aggressive mode to just scan additional potential issues.
  2. Parameterize the execution to meet each user's and/or project's requirements - for example, consider feature request dotnet/roslyn-analyzers#2799 to allow existing CA2241 rule to support custom string formatting methods.

Obviously, these configurability support can come as follow-up additions based on user requests, but would allow lot of flexibility in cases where the default analyzer behavior is not unanimously acceptable.

@sixlettervariables

This comment has been minimized.

Copy link

commented Sep 4, 2019

String.IndexOf(...) == 0. Calls to String.IndexOf(...) where the result is then just compared to 0 can instead be replaced by calls to String.Contains(...).

@stephentoub Should Contains actually be String.StartsWith(...)? Or was this meant to be for String.IndexOf(...) >= 0?

@stephentoub

This comment has been minimized.

Copy link
Member Author

commented Sep 4, 2019

Should Contains actually be String.StartsWith(...)? Or was this meant to be for String.IndexOf(...) >= 0?

Typo. It was actually meant to be comparing to -1 instead of to 0. Fixed. Thanks.

GCHandle is another example where such an analyzer would be useful

Yup. Thanks. Updated to mention it as well.

@xt0rted

This comment has been minimized.

Copy link

commented Sep 4, 2019

Having more analyzers is great, but is there a way to easily share configs through NuGet similar to ESLint or Stylelint configs on NPM? I haven't seen anything on doing this and it would greatly simplify adding it to all of my projects.

@mavasani

This comment has been minimized.

Copy link

commented Sep 4, 2019

Having more analyzers is great, but is there a way to easily share configs through NuGet similar to ESLint or Stylelint configs on NPM? I haven't seen anything on doing this and it would greatly simplify adding it to all of my projects.

Editorconfig file is the chosen and recommended analyzer configuration format starting Dev16.3. You can share configurations between projects by creating an editorconfig at the root of your solution folder. However, Roslyn compiler does not support specifying editorconfig files from arbitrary directories. I presume your feature request is similar to MicrosoftDocs/visualstudio-docs#3409 (comment) and hopefully my replies on that issue clarifies the current state.

@jaredpar

This comment has been minimized.

Copy link
Member

commented Sep 4, 2019

Suggestion for consideration:

Correctness

Span

  • Span.SequenceEquals: flag loops which manually compare Span<T> / ReadOnlySpan<T> instances and suggest SequenceEquals.
@stephentoub

This comment has been minimized.

Copy link
Member Author

commented Sep 4, 2019

@jaredpar, added under perf. Thanks.

@Siderite

This comment has been minimized.

Copy link

commented Sep 4, 2019

@ericstj

This comment has been minimized.

Copy link
Member

commented Sep 10, 2019

We could also catch bad usage of BinaryFormatter, and I don't just mean any usage as some might prefer. Where the type passed in to BinaryFormatter doesn't contain Serializable, where the type cast to from deserialization doesn't have Serializable. When Serializable is on a type we could check that all the types contained in that type are Serializable. Perhaps this is more of a compatibility rule, but I know that it would help solve a number of issues and runtime failures I've looked into in 3.0.

@stephentoub

This comment has been minimized.

Copy link
Member Author

commented Sep 10, 2019

@ericstj, we could do something like that. However, we'll run into the same issue highlighted in:
dotnet/roslyn-analyzers#1775 (comment)
The ref assemblies don't have [Serializable] today.

@ericstj

This comment has been minimized.

Copy link
Member

commented Sep 10, 2019

That's something we could fix. Serializable is effectively part of the contract and something that we must maintain once we add it. We have always been conservative about what we put in the reference assemblies, and it wasn't until this release we started adding and validating the presence of attributes. Compile-time validation of Serializeable seems like a reasonable value proposition of adding the attributes. That said we probably won't go back and add it to things already shipped so that would limit the usefulness of the analyzer.

@hughbe

This comment has been minimized.

Copy link
Collaborator

commented Sep 23, 2019

Thoughts on an analyser that would flag up cases like if (a || b && c) telling you to use bracketing (i.e. if ((a || b) && c) or if (a || (b && c))`)

see dotnet/winforms#1955

@stephentoub

This comment has been minimized.

Copy link
Member Author

commented Sep 23, 2019

Thoughts on an analyser that would flag up cases like if (a || b && c) telling you to use bracketing (i.e. if ((a || b) && c) or if (a || (b && c))`)

StyleCop already has such analyzers, e.g.
https://github.com/DotNetAnalyzers/StyleCopAnalyzers/blob/master/documentation/SA1408.md

@xt0rted

This comment has been minimized.

Copy link

commented Sep 23, 2019

There's an analyzer for that in Roslynator (RCS1123) as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.