Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for registering an analyzer for concurrent execution of its actions #7184

Merged
merged 8 commits into from
Dec 4, 2015

Conversation

mavasani
Copy link
Contributor

@mavasani mavasani commented Dec 2, 2015

By default, analyzer driver never makes concurrent callbacks into a single analyzer instance. This change adds a new public API AnalysisContext.RegisterConcurrentExecution to enable thread-safe analyzers to opt into receiving concurrent callbacks, which should theoretically improve performance of such analyzers for concurrent builds.

Fixes #6737

This change replaces synchronous lock statements in the core analyzer executor and analyzer state tracker (for IDE live analysis) with semaphore slim DisposableWaitAsync invocations. This ensures that most of the analyzer driver and analyzer executor code is now async, with only the eventual callback into the analyzers being a synchronous method ([ExecuteAndCatchIfThrows_NoLock](http://source.roslyn.io/Microsoft.CodeAnalysis/R/02d7df8203d3591e.html))

Testing: I verified that the self-build time of Roslyn.sln (which uses a few analyzers), is almost identical after this change. We are hoping that this will improve build performance when there is lot of lock contention, which has shown up from perf traces (see dotnet#6053).

Fixes dotnet#6399
…ts actions.

By default, analyzer driver never makes concurrent callbacks into a single analyzer instance. This change adds a new public API 'AnalysisContext.RegisterConcurrentExecution' to enable thread-safe analyzers to receive concurrent callbacks, which should theoretically improve performance of such analyzers for concurrent builds.

Fixes dotnet#6737
@@ -76,6 +76,7 @@ virtual Microsoft.CodeAnalysis.MetadataReferenceResolver.ResolveMissingAssembly(
virtual Microsoft.CodeAnalysis.SourceReferenceResolver.ReadText(string resolvedPath) -> Microsoft.CodeAnalysis.Text.SourceText
Microsoft.CodeAnalysis.SemanticModel.GetOperation(Microsoft.CodeAnalysis.SyntaxNode node, System.Threading.CancellationToken cancellationToken = default(System.Threading.CancellationToken)) -> Microsoft.CodeAnalysis.Semantics.IOperation
abstract Microsoft.CodeAnalysis.SemanticModel.GetOperationCore(Microsoft.CodeAnalysis.SyntaxNode node, System.Threading.CancellationToken cancellationToken) -> Microsoft.CodeAnalysis.Semantics.IOperation
abstract Microsoft.CodeAnalysis.Diagnostics.AnalysisContext.RegisterConcurrentExecution() -> void
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the public API being added.

@mavasani
Copy link
Contributor Author

mavasani commented Dec 2, 2015

@JohnHamby @gafter @srivatsn @heejaechang can you please review?

@msJohnHamby
Copy link
Contributor

In a quick scan of the code, I didn't see anything preventing concurrent execution of actions that are too closely related. For example, it's probably not sound to simultaneously execute two syntax node actions registered by a code block start action for a given code block (because the two syntax node actions are very likely to be sharing state for the code block). Requiring an analyzer that does this to have no concurrency is too restrictive.

At some point in the past I wrote up a proposed set of rules for when two actions are too closely related to execute in parallel.

/// For example, end actions registered on any analysis unit (compilation, code block, operation block, etc.) are by definition semantically dependent on analysis from non-end actions registered on the same analysis unit.
/// Hence, end actions are never executed concurrently with non-end actions operating on the same analysis unit.
/// </remarks>
public abstract void RegisterConcurrentExecution();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it decide so that people can't flip it more than once?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the other methods named "Register***" register an action. Maybe "EnableConcurrency" or "EnableConcurrentActions" or "EnableConcurrentExecution"?

@heejaechang's question is interesting. I don't think we need to support flipping concurrence on and off, but if someone can make a case for it we should discuss that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +1 to @JohnHamby's point about Register versus Enable. I like EnableConcurrentExecution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

@heejaechang
Copy link
Contributor

👍

@mavasani
Copy link
Contributor Author

mavasani commented Dec 2, 2015

it's probably not sound to simultaneously execute two syntax node actions registered by a code block start action for a given code block (because the two syntax node actions are very likely to be sharing state for the code block).

I am unsure why we would add such a restriction. What if these actions themselves are thread-safe and can deal with access/update of shared state concurrently? Adding such a restriction will prevent further performance improvements for such an analyzer.

I think we are talking about 3 buckets here:

  1. Completely non-concurrent
  2. Maximally concurrent and thread-safe
  3. Partially concurrent and thread safe

I propose we treat the new API to enforce analyzer needs to be (2) - where only semantically dependent actions are not executed concurrently. End actions by definition depend on non-end actions executing on same analysis unit (compilation, code block, operation block, etc.) and hence cannot be executed before them - driver guarantees that. However, sort of related actions that you mention can be written in thread-safe way and executed concurrently without any functional change in the output.
If and when we get to adding support for 3., we can allow authors to define subset of actions or action kinds that it deems thread safe.

@heejaechang
Copy link
Contributor

since we never promised ordering between same level actions (syntax tree, syntax node, code block), I think running those concurrently doesnt break any contracts.

we never promised we will run syntax node actions in in-order tree traversal or pre or post, or tree in the order it shows up in compilation and etc.

so, I think it is fine running them concurrently and let the author deal with managing concurrency.

@heejaechang
Copy link
Contributor

by the way, one thing, I think we need to explicitly state that even if one is registered to be executed concurrently, Host can still decide it runs it non-concurrent manner. basically, the call is to let host know characteristic of the analyzer, not to control behavior of the host.

host behavior is completely owned by the host (driver) and analyzer provider can't change it in any way.

in that sense, why is this a method call (runtime behavior) rather than metadata (attribute?) on the analyzer? first one feels like it is controlling behavior of host, when second feels like I am annotating characteristic of the analyzer.

@msJohnHamby
Copy link
Contributor

I hunted up the most succinct statement of the "related actions" rules I could find, and pasted it below. The incentive for such restrictions is to make a very high percentage of analyzers achieve maximal safe concurrency without an author having to take locks. Getting locking correct is very difficult, and we shouldn't push the problem entirely onto analyzer authors. The ideal experience is that an author turns on concurrency and an analyzer keeps working.

Here's the last draft of the rules. What we're doing with shared state will necessitate some refinement:

  1.  The Initialize methods of two or more different analyzers can run concurrently. We make no guarantees about state shared between analyzers, or about static state.
    
  2.  The actions of two or more different analyzers can run concurrently. We make no guarantees about state shared between analyzers.
    
  3.  Any number of actions that were registered directly by a single analyzer’s Initialize method can run concurrently. We make no guarantees about static state, or about cross-compilation state.
    
  4.  An action A is an ancestor of action B if B was registered either by A or by another action that has A as an ancestor. Two actions X and Y are related if any of these hold:
    
    a. X is an ancestor of Y.
    b. Y is an ancestor of X.
    c. X and Y have a common ancestor.
    d. X and Y are the same action, and that action has an ancestor.
  5. Any number of actions of a single analyzer can run concurrently so long as no two of those actions are related.

@heejaechang
Copy link
Contributor

@JohnHamby I dont think user can understand that rule. so either user will not mark it as concurrent or use lock regardless.

@mavasani
Copy link
Contributor Author

mavasani commented Dec 2, 2015

Getting locking correct is very difficult, and we shouldn't push the problem entirely onto analyzer authors. The ideal experience is that an author turns on concurrency and an analyzer keeps working.

I think analyzer author signing up for concurrency have to ensure their actions are thread safe (via locking, concurrent data structures, etc.). They cannot just turn on concurrency and expect their existing analyzer which shares state between actions to keep working.

I agree with your definition of related actions, but this won't be easy for any analyzer author to understand. We shouldn't require them to figure out what actions are related or unrelated and then go and implement thread safety.

Additionally, one big drawback of this restriction would be that all actions registered within a compilation start action would be related actions and hence wouldn't be executed concurrently - I believe majority of analyzers are written as compilation start analyzers just to fetch some well known types from the compilations, but have no related actions or state sharing.


/// <summary>
/// Register for concurrent execution of analyzer actions registered by this analyzer.
/// An analyzer that registers for concurrent execution should be more performant then a non-concurrent analyzer.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "can have better performance than" or "can be more efficient than"? I still haven't accepted "performant" as word--the first definition of it I could find online included "Commonly used by Microsofties but not a real word."

At least, "then" should become "than".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will change.

@mavasani
Copy link
Contributor Author

mavasani commented Dec 3, 2015

@Pilchie @JohnHamby I have addressed your feedback and updated the PR.

@Pilchie
Copy link
Member

Pilchie commented Dec 3, 2015

👍

@msJohnHamby
Copy link
Contributor

My "related actions" proposal turns out to be too pessimistic for some very important cases. There will be a follow-on discussion about the best way to deal with throttling concurrency in the presence of mutable state.

I sign off.

@mavasani
Copy link
Contributor Author

mavasani commented Dec 4, 2015

Thanks John.

mavasani added a commit that referenced this pull request Dec 4, 2015
Add support for registering an analyzer for concurrent execution of its actions
@mavasani mavasani merged commit 069d016 into dotnet:master Dec 4, 2015
@mavasani
Copy link
Contributor Author

mavasani commented Dec 4, 2015

I have filed #7228 to track John's proposal for enhancing the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants