Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UtilityAnalyzer: Roll back the reversal of #6576 and bring back RegisterSemanticModelAction #7368

Closed
martin-strecker-sonarsource opened this issue Jun 12, 2023 · 7 comments · Fixed by #8412
Assignees
Labels
Type: Performance It takes too long.
Projects
Milestone

Comments

@martin-strecker-sonarsource
Copy link
Contributor

#6576 added support for RegisterSemanticModelAction. The Roslyn compiler pipeline exposes several entry points into the compilation pipeline. RegisterSemanticModelAction is one of them and currently missing. #6576 was rolled back by #7262 (reasoning can be found there). The part of #6576 that added the registration including the tests should be added back to the code base. The registration of the UtilityAnalyzer should be changed to use RegisterSemanticModelAction to re-use the semantic model but only after all UtilityAnalyzer are re-worked in a way to reduce the queries to the semantic model to the absolute minimum. If this is not possible, some UtilityAnalyzer (like e.g. the symref analyzer) might opt to create there own semantic model.

@pavel-mikula-sonarsource
Copy link
Contributor

This has proven to cause the regression by design. Instead, the other explored improvements (reducing model quering) should be done.

We should close this as a dead end.

@martin-strecker-sonarsource
Copy link
Contributor Author

Closing this ignores the fact that we do not re-use the semantic model but create a new one for each UtilityAnalyzer. We should have at least the option to re-use it and opt out only if we see fit. There might be reasons to allow any analyzer to operate on its own semantic model (SE I'm looking at you here), but it should be a balanced decision. At the moment it is:

That isn't well-balanced but more like a quick patch to fix the major problems caused by querying the semantic model for each identifier by the token type analyzer.

RegisterSemanticModelAction is needed if we want to be able to do more fine-grained re-use of the semantic model as described in #4217.

@martin-strecker-sonarsource
Copy link
Contributor Author

martin-strecker-sonarsource commented Jul 3, 2023

How do we measure the regression?

  • Ask Marry for the lists of projects, probably RavenDB (commit: the one on peach) and others
  • Clone locally
  • Current master of sonar-dotnet, Release build
    • Kill all msbuild/dotnet processes before run
    • Run the Scanner, local SQ SonarWay profile, if not reproducible use "All"
    • Produce a binlog
    • Build time is the base-line (also note the total analyzer time and the analyzer time of the TokenTypeAnalyzer)
    • Do this three times
  • Create a PR with the Rollback
    • Do the same with the Release build from the PR
    • This is the second base and should be worse than the first baseline
  • Fix the issues with caching one by one see UtilityAnalyzer: Reduce lock contention in ShouldGenerateMetrics #7411
  • Make a new measurement after each change

Merge acceptance criteria?

  • It needs to be at least as good as the current master

How to measure with dotnet trace

  • Attach to the AnalyzerRunner
  • Use dottrace "Sampling" profiling type
  • Filter for "Lock contention"

Consider also making the analyzer stateless: #7288

@mary-georgiou-sonarsource
Copy link
Contributor

The projects that had a notable regression at analysis time (see also graphs here) and were used also later on for measurements are:

ravendb: baae94d12fec8cbd7d4828f1ab3754eff2a1e74f
akka.net: a920b0723625e918e0b3e93e4b7f4a33e5533f5b
fluentassertions: 371ab153276c19fd04c94333b3d32adb5a13b203

The two analyzers that had a notable impact were the TokenTypeAnalyzer and the SymbolReferenceAnalyzer.

@pavel-mikula-sonarsource
Copy link
Contributor

Once we have the need to register for semantic model, we can add the possibility to register for it any time in the future. Without using it for the utility analyzers where it didn't work.

Data has proven that possibly creating new model is much faster than trying to use the nicer registration. At the same time, Roslyn might cache those for us. Or their creation is not as expensive as the thread starvation.

@pavel-mikula-sonarsource
Copy link
Contributor

@martin-strecker-sonarsource I'm missing an explanation why this was reopened. The evidence proves that this approach doesn't work.

@martin-strecker-sonarsource
Copy link
Contributor Author

martin-strecker-sonarsource commented Jul 12, 2023

Both kinds of registrations do the same thing. I logged how the current registration uses semantic models in this branch https://github.com/SonarSource/sonar-dotnet/tree/Martin/UtilityPerf_LogSemModelCreation
The results are:

  • The SemanticModel of a tree is reused between the UtilityAnalyzers and the rules in 99,79% of the cases (see below)
  • In 30% of the cases, the rule sees the SemanticModel before any of the UtilityAnalyzers
  • In 25% of the cases, the rule sees the SemanticModel after all of the UtilityAnalyzers

It will be interesting to see how the other registration changes the picture here. It also reuses the SemanticModel, but I expect changes in the reuse of the semantic model and the order between rules and utility analyzers.

image

I logged only a single rule usage of the semantic model, so we may miss some vital information.

For the models that were not reused, it looks like so:

  • C:\Projects\Sprints\UtilityAnalyzerPerf\Benchmark\Projects\akka.net\src\benchmark\SpawnBenchmark\Program.cs
    • First compilation:
      • The model of the rule was reused by 2 UtilityAnalyzer, while 4 got a shared second one.
    • Second compilation
      • The model of the rule was reused by 2 UtilityAnalyzer, while 4 got a shared second one.
    • Third compilation
      • The model of the rule was reused by 3 UtilityAnalyzer, while 3 got a shared second one.
  • C:\Projects\Sprints\UtilityAnalyzerPerf\Benchmark\Projects\akka.net\src\benchmark\SpawnBenchmark\RootActor.cs
    • First compilation:
      • Rule got one SemanticModel 2 times and shared it with 4 UtilityAnalyzer
      • 2 UtilityAnalyzer got a second shared SemanticModel
    • Second compilation
      • Rule got one SemanticModel 2 times and shared it with 2 UtilityAnalyzer
      • 4 UtilityAnalyzer got a second shared SemanticModel
    • Third compilation
      • Rule got one SemanticModel 2 times and shared it with 3 UtilityAnalyzer
      • 3 UtilityAnalyzer got a second shared SemanticModel
  • C:\Projects\Sprints\UtilityAnalyzerPerf\Benchmark\Projects\akka.net\src\contrib\cluster\Akka.DistributedData.LightningDB\LmdbDurableStore.cs
    • First compilation
      • The rule got 2 SemanticModel one of which was used by 6 UtilityAnalyzer before

It looks like the semantic model cache is cleared, and a new one is created sometimes. This can happen to a SyntaxNode rule as well (see the last case).

@github-actions github-actions bot added this to Review in progress in Best Kanban Nov 28, 2023
@github-actions github-actions bot moved this from Review in progress to Review approved in Best Kanban Nov 29, 2023
Best Kanban automation moved this from Review approved to Validate Peach Nov 30, 2023
@martin-strecker-sonarsource martin-strecker-sonarsource moved this from Validate Peach to Done in Best Kanban Dec 4, 2023
@sebastien-marichal sebastien-marichal added this to the 9.15 milestone Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Performance It takes too long.
Projects
Best Kanban
  
Done
Development

Successfully merging a pull request may close this issue.

4 participants