[Question] Tips and tricks for performance #25259
So far I have only found the following:

Can you just run a .NET profiler to see what's expensive in your scenarios? I'm personally a fan of DotPeek.
Analyzer Performance

How to profile

You can look at StyleCopTester, which is a program written for testing performance in the StyleCop Analyzers project. You can also run a profiler over Visual Studio, as @CyrusNajmabadi suggested.

General Tips

Enable Concurrent Execution

Newer versions of Roslyn have an API that lets analysis run in parallel: call EnableConcurrentExecution() on the analysis context in your analyzer's Initialize method.

Syntax over Semantics

The syntax APIs are far less expensive than the semantic APIs. This means that if you can do all or most of your analysis on syntax nodes, things will be fast. Most of our analyzers follow the pattern of doing quick syntactic checks first, so we bail out early before doing more expensive semantic work.

Prefer Stateless Analyzers over Stateful Analyzers

The cost of a registered action depends on how much information it needs:
- Syntax-only actions: very fast, only parsing needs to be done
- Symbol and operation actions: not as fast, some binding will need to be done
- Semantic-model actions: slow, everything needs to be done
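Putting the concurrency and registration advice above together, a minimal sketch of an analyzer's setup might look like this (the analyzer name, descriptor, and empty callback are all invented for the example; only the AnalysisContext calls are the real APIs being illustrated):

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class ExampleAnalyzer : DiagnosticAnalyzer
{
    // Hypothetical descriptor, just to make the sketch complete.
    private static readonly DiagnosticDescriptor Rule = new(
        "EX0001", "Example", "Example message", "Usage",
        DiagnosticSeverity.Warning, isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        // Opt in to running callbacks in parallel.
        context.EnableConcurrentExecution();

        // Usually you also want to skip generated code.
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);

        // Stateless, per-node registration: the cheapest kind of callback.
        context.RegisterSyntaxNodeAction(
            ctx => { /* syntactic checks go here */ },
            SyntaxKind.InvocationExpression);
    }
}
```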
@jmarolf that information is great, but it's horribly buried here in an obscure issue. Is it available in any other documentation anywhere? It really should be, but I've never found it.

I have a related question. I've seen vague references to the operation API being faster than symbols because it somehow requires less binding, but I've never found a good resource that discusses this in any detail. Is that true, and if so, in what way? The operations mostly seem to just provide references to symbols, so I don't understand how they can really be an alternative. I prefer to use the operation APIs where I can.
@SamPruden
I wasn't comparing to syntax, which is obviously the fastest. I've seen it suggested that the operation API is faster than working with symbols directly.
Who has suggested it? Please point out the places where this has been stated so we can investigate.
I've come across a few passing references when googling discussions about Roslyn analyzer performance. These are two I can remember:
Without being intimately familiar with the implementation details of these things, it's hard to judge whether they imply specific guidelines we should be following for performance reasons. It's not clear to me which (if any) APIs trigger additional expensive binding operations or other forms of lazy loading. The text of dotnet/roslyn-analyzers#2103 seems to imply that less binding is required when using the operation API.
@CyrusNajmabadi I believe that comment was here. @SamPruden
Looking at one case, this appears to be algorithmic. I.e., in the original code it needed to do this:

```csharp
while (cur.Parent != null)
{
    SyntaxNode pNode = cur.Parent;
    ISymbol sym = pNode.GetDeclaredOrReferencedSymbol(model);
    // ...
}
```

meaning that not only did it hit every node and ask it for the symbol, it would then walk up the entire node chain above it. That's at minimum n^2 (or n log n) to try to look at the 'tree of information'. The point here is that IOperation already gives you the tree, so instead of having to go figure it out again, you can just use it in its entirety. So, if you need to just look at a node and its possible meaning, do so. If you need to look at a whole lot of surrounding nodes around a node to figure out what's going on, you might as well just use IOperation, since it literally just gives that to you :)
I forgot about that, but I'd seen that too, yes. Thanks for that info, that's helpful. I'm particularly curious whether accessing the symbols provided by operations has any significant cost. As an example, I have an operation walker with an override like this:

```csharp
public override void VisitBinaryOperator(IBinaryOperation operation)
{
    var method = operation.OperatorMethod;

    // Skip built-in operators
    if (method == null) return;

    // Actual analysis happens here
}
```

I know that visiting every operation in the block probably seems crazy, but trust me, I do need to do that. Might reading `OperatorMethod` be doing expensive work behind the scenes?

This general pattern is one that comes up with many different operations throughout my analyzers, so I'm curious about general tips rather than just this specific example.
Other explanations in dotnet/roslyn-analyzers#2103
The primary concern here is that semantic models are our caching mechanism, and the analyzer infrastructure for node and operation analysis is already written to only look at one at a time. If, during its work, an analyzer goes and makes other semantic models, this is not good for perf. That's because it basically causes all the information to be recreated for that other semantic model, only to be thrown away when control returns to the analysis engine, before the engine calls into the next node.

Doing this over and over again defeats the purpose of our analysis passes, where we attempt to create a semantic model only once for a file and then use that same semantic model for all the analyzers that want to examine that file. This approach allows the information computed by any analyzer to be cached and reused by other analyzers.

Note: this benefit applies to analyzers doing GetSymbolInfo as well as those doing GetOperation. It is primarily about ensuring that we create only one semantic model per document and don't create unnecessary ones over and over again.
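To make the caching point concrete, here is a small sketch (the callback name is invented) contrasting the engine-provided semantic model with creating a fresh one:

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

internal static class SemanticModelUsage
{
    // Hypothetical node callback registered via RegisterSyntaxNodeAction.
    private static void AnalyzeNode(SyntaxNodeAnalysisContext context)
    {
        // Good: reuse the semantic model the engine already built for this file.
        SymbolInfo info = context.SemanticModel.GetSymbolInfo(
            context.Node, context.CancellationToken);

        // Bad: this builds a second semantic model for the same tree, whose
        // cached binding information is thrown away when the callback returns.
        // SemanticModel fresh = context.Compilation.GetSemanticModel(context.Node.SyntaxTree);
    }
}
```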
It does not.
It's not crazy. All this will do is allocate some memory as the operation tree is hydrated out of the internal compiler 'bound nodes'.
Yes. If you can avoid semantic work based on easy-to-check things in syntax, that's almost always a good idea.
Yes. If you can algorithmically do less work, then do less work :)
General tips:
- Checking syntax local to you is nearly free. If you do need to walk, know that trees can be arbitrarily large, though.
- Doing a few local semantic checks is not that expensive. If you do need to walk, though, this will scale up accordingly.
- Things are affected by the number of calls you make, not the type of check you're doing. If you need to make 1000 checks because of the size of the tree, that's more expensive than a handful of checks done locally. As you scale to more and more checks, your system will do worse.
Thank you for all of the info!
You've got me a little bit confused here. If accessing symbols through operations doesn't add any real cost, I'm not sure where the line between the two approaches falls. In my particular case I do need to be at the operation level for everything, because I unavoidably access and assess the symbols involved.
Sorry, I read it as "I'm wondering whether I should be falling back to syntax".
Depends. If your check is a syntactic check, then certainly use syntax for it...
Hi folks,
We have noticed quite a lot of performance issues with our analyzers, but we cannot find any way to measure or profile them correctly. I know there is a benchmarking project for analyzers in progress, but in the meantime I am wondering if you have tips on things to avoid, or practices to enforce, when developing an analyzer, so that we avoid big performance pitfalls.
For example, when checking for a type, is it recommended to check the string value before getting to the symbol? Should we avoid using `FirstAncestorOrSelf<T>`, `ChildNodes().Where`, or `DescendantNodes().Where`...?
Cheers
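For what it's worth, the "check the string value before getting to the symbol" idea in the question above can be sketched like this (the target type name StringBuilder is an assumption; the helper is hypothetical):

```csharp
using Microsoft.CodeAnalysis.CSharp.Syntax;

internal static class NameFirstCheck
{
    // Purely syntactic filter: compare the written name before doing any binding.
    internal static bool MightBeStringBuilder(TypeSyntax type) => type switch
    {
        IdentifierNameSyntax id => id.Identifier.ValueText == "StringBuilder",
        QualifiedNameSyntax q => q.Right.Identifier.ValueText == "StringBuilder",
        _ => false,
    };

    // Only when this returns true would you call SemanticModel.GetTypeInfo
    // to confirm the type really is System.Text.StringBuilder.
}
```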