-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Description
Analyzer proposal
I would like to suggest an idea for a code analyzer that can detect if an allocation is made only to perform a dictionary or hash set lookup where the type of key is string
. For example, in the following cases an allocation can be avoided:
var dictionary = new Dictionary<string, string>();
var key = "...";
var keySpan = key.AsSpan();
dictionary[key.Substring(1)] = value;
dictionary[key.Remove(1)] = value;
dictionary[key[..2]] = value;
dictionary[keySpan.ToString()] = value;
The analyzer could recommend to avoid allocation by creating a ReadOnlySpan<char>
instead and using GetAlternateLookup<ReadOnlySpan<char>>
method exposed by various dictionary types as well as HashSet<T>
.
In case if ToLower{Invariant}
or ToUpper{Invariant}
is called on a key to perform a lookup, it might even be easier to specify StringComparer
when dictionary is constructed.
Examples
For example, for a code like this:
var dictionary = new Dictionary<string, string>();
...
var key = smth.Substring(1);
if (dictionary.TryGetValue(key, out var value) { ... }
the analyzer could recommend using GetAlternateLookup
:
var dictionary = new Dictionary<string, string>();
...
var key = smth.AsSpan(1);
var alternate = dictionary.GetAlternateLookup<ReadOnlySpan<char>>();
if (alternate.TryGetValue(key, out var value) { ... }
And in this case:
var dictionary = new Dictionary<string, string>();
...
var key = smth.ToLowerInvariant();
if (dictionary.TryGetValue(key, out var value) { ... }
it could recommend using StringComparer.InvariantCultureIgnoreCase
if it's possible:
var dictionary = new Dictionary<string, string>(StringComparer.InvariantCultureIgnoreCase);
...
var key = smth;
if (dictionary.TryGetValue(key, out var value) { ... }
Analysis
The analyzer can look for the following calls on instances of Dictionary<string, V>
, ConcurrentDictionary<string, V>
, and FrozenDictionary<string, V>
types:
- Indexer access
Add
,ContainsKey
,Remove
,TryAdd
,TryGetValue
calls
as well as for calls to Add
, Contains
, Remove
or TryGetValue
on instances of HashSet<string>
.
Then it can look at expression used as a key for a lookup. If it's a variable, it may continue to analyze assignments to that variable. In case of a simple expression like a conditional expression or coalesce operation, it may also analyze parts of that expression recursively. The analyzer could report diagnostic if an inline expression passed to an indexer/method or an expression assigned to a variable contains one of these calls:
Substring
,Remove
,Replace
,Trim
,TrimStart
,TrimEnd
,ToLower{Invariant}
,ToUpper{Invariant}
- Range expression
ToString
call on a{ReadOnly}Span<char>
While analyzing assignments to a variable, the analyzer should probably make sure that the variable is not used anywhere except in the lookup. For example, if the variable is passed to some API that takes a string
, the analyzer cannot recommend using Span
instead of string
:
key = smth.Substring(1);
Use(key); // some API that takes a string
if (dictionary.TryGetValue(key, out var value) { ... }
Real examples
I prototyped such an analyzer and ran it as a standalone analysis tool against projects in dotnet/runtime repo. Here are some real examples:
- In AdapterUtil.Common.cs, submitted PR #120603
- In CreateRuntimeRootDescriptorFile.cs on lines 167, 178, 526, 531, 556, 607 (no quick fix as the project also targets net472).