-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Make IExperimentationService properly free-threaded #36013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
95b92e4 to
2910be3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure why it moved to this factory model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The separation is described in the initial PR description above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The explanation doesn't make sense to me -- it says both the IExperimentationServiceFactory can be fetched without threading dependencies (which is incorrect since it has UI thread dependencies -- see comments below), and then says IExperimentationService can be used without threading dependencies. I'm assuming that the description is wrong and you were trying to split this between a UI-thread dependent and UI-thread independent portion and just have a typo there. Put another way: if both can be used without threading dependencies, why split them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... it says both the IExperimentationServiceFactory can be fetched without threading dependencies (which is incorrect since it has UI thread dependencies - ...
It says IExperimentationServiceFactory can be fetched from the MEF catalog without threading dependencies. This statement is correct. The service cannot be used without threading dependencies, but that's normal for asynchronous services.
... then says IExperimentationService can be used without threading dependencies ...
This is also correct. Once you have an instance of IExperimentationService, that instance can be used without threading dependencies.
... if both can be used without threading dependencies, why split them?
Only IExperimentationService can be used without threading dependencies.
src/VisualStudio/Core/Def/Experimentation/VisualStudioExperimentationService.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure whether we can to throw here unless we know that it won't crash VS. otherwise, I would catch and fire NFW and return false so that we don't crash VS due to experiment failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now a direct call instead of going through reflection. The latter had many more potential failure points. We can check if the implementation of this service ever throws; it may have its own internal error handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This worries me: we'll be taking this for 16.2 GA so we may want to keep the catch in there to mitigate and then remove in 16.3.
src/VisualStudio/Core/Def/Implementation/Experimentation/AnalyzerVsixSuggestedActionCallback.cs
Outdated
Show resolved
Hide resolved
2910be3 to
821015c
Compare
821015c to
b800e9e
Compare
1e59c1e to
3293e67
Compare
3293e67 to
507fc7e
Compare
| if (_experimentationService is null) | ||
| { | ||
| try | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my preference will be just kicking off task here. and then in IsExperimentEnabled, check whether services are set, otherwise, wait the task under JTF.Run rather than make everything async.
or if really want to make things async, just make IsExperiementEnabled Async.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a more accurate representation of the asynchronous nature of the services involved. IsExperimentEnabled has no intrinsic asynchronous portion or cross-thread dependencies, but obtaining an instance of the service does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's nothing wrong with making IExperimentationService async -- the remote implementation we checked in is doing this today:
roslyn/src/Workspaces/Remote/Core/Services/RemoteExperimentationService.cs
Lines 13 to 17 in 6ead316
| public bool IsExperimentEnabled(string experimentName) | |
| { | |
| var assetSource = AssetStorage.Default.AssetSource; | |
| return assetSource?.IsExperimentEnabledAsync(experimentName, CancellationToken.None).Result ?? false; | |
| } |
We didn't make it async at the top level because of the uses in the constructor that you're fixing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.Result
This is not a safe operation. Its use outside of ContinueWith is a code smell, and a design flaw if the API author has control over the scenario.
jasonmalinowski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'm confused by the overall approach here: although IExperimentationService is free-threaded, the fetching of it is still not, and anybody who calls GetExperimentationServiceAsync() and then blocks the UI thread waiting on it (even indirectly) will still deadlock. The real value of this change seems to be the deferring of the use of the experimentation service from the .editorconfig options provider, which is where the proximate cause of the deadlock is and could be just done directly without all the extra creation of two interfaces that I think only muddies the thread-safety state of this even further than it already is.
Thoughts?
In any case, I suspect this is a 16.2 regression -- the .editorconfig code using the experimentation service more aggressively now is causing us to see a spike, so we're going to have to backport this to that branch.
src/VisualStudio/Core/Def/Experimentation/VisualStudioExperimentationService.cs
Show resolved
Hide resolved
| { | ||
| } | ||
| }); | ||
| var featureFlags = (IVsFeatureFlags)await _serviceProvider.GetServiceAsync(typeof(SVsFeatureFlags)).ConfigureAwait(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still a deadlock risk since the IVsFeatureFlags will cause COM RPC and risk deadlock -- are we following a rule that it shouldn't be waited for on the UI thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
➡️ This is updated to perform the casts on the main thread until we verify that both interface implementations are managed objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they are both managed objects, then it means the only jump to the UI thread behind GetServiceAsync was because the services weren't marked as free-threaded, and maybe this gives a different line of attack -- are the implementations actually free-threaded in practice but just need to be marked as such? If so, we can just ask the shell to fix that, this code then all becomes correct and the PR isn't needed at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The explanation doesn't make sense to me -- it says both the IExperimentationServiceFactory can be fetched without threading dependencies (which is incorrect since it has UI thread dependencies -- see comments below), and then says IExperimentationService can be used without threading dependencies. I'm assuming that the description is wrong and you were trying to split this between a UI-thread dependent and UI-thread independent portion and just have a typo there. Put another way: if both can be used without threading dependencies, why split them?
| { | ||
| if (_enabled is null) | ||
| { | ||
| var experimentationService = await _experimentationServiceFactory.GetExperimentationServiceAsync(cancellationToken).ConfigureAwait(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this still a deadlock risk if somebody were to ask for options on the UI thread and we hadn't already initialized it? Or are we taking a tactic that it's still not correct but "good enough"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't this still a deadlock risk if somebody were to ask for options on the UI thread and we hadn't already initialized it?
The deadlock occurred because MEF isn't JTF aware. We moved the part with thread dependencies outside of the MEF construction path where code is expected to be JTF aware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📝 There is still a deadlock risk if users block on the completion of this asynchronous method. This is the same deadlock risk that applies to all cases in Roslyn where code blocks on the completion of an asynchronous operation (currently no cases are proven correct, therefore the only safe assumption is all cases are potential deadlock scenarios).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but right now the only callers of this method are Roslyn code, so this would have to follow Roslyn threading rules instead of JTF rules. "Code is expected to be JTF aware" might be a guideline, but it's patently not the case here.
src/EditorFeatures/Core.Wpf/Options/LegacyEditorConfigDocumentOptionsProvider.cs
Show resolved
Hide resolved
| internal interface IExperimentationServiceFactory : IWorkspaceService | ||
| { | ||
| bool IsExperimentEnabled(string experimentName); | ||
| ValueTask<IExperimentationService> GetExperimentationServiceAsync(CancellationToken cancellationToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments please as to why this is now split up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The separation falls out from the default explanation for all MEF services:
- The constructor for
IExperimentationServiceFactorycannot call asynchronous code - The
IsExperimentEnabledmethod needs to be callable from synchronous code paths - The
IsExperimentEnabledmethod has a precondition that an asynchronous code path was called
None of these items seems especially worthwhile to put in code for this service, but if we have documentation somewhere for MEF practices we could mention it.
| return KnownUIContexts.SolutionExistsAndFullyLoadedContext.IsActive; | ||
| } | ||
|
|
||
| private async Task<bool> IsPartialLoadModeExperimentEnabledAsync(CancellationToken cancellationToken) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems we're doing the caching pattern like this in a bunch of places, can we make the pattern more common? I'm still not sure why we're not just making this IsExperimentEnabledAsync() that just does the GetExperimentationService + IsExperimentEnabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not sure why we're not just making this IsExperimentEnabledAsync() that just does the GetExperimentationService + IsExperimentEnabled.
Some of the callers are not asynchronous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make the pattern more common?
I haven't found a common pattern that doesn't come with a hit to performance, clarity, and/or correctness.
- The JTF-aware
AsyncLazy<T>conflicts with a Roslyn type of the same name, impairing clarity. - Roslyn's
AsyncLazy<T>is not safe for use on code paths involving JTF operations. For this service it could negatively impact correctness. - It would be simpler to not cache the value, but I wasn't sure which methods were called too frequently to make that viable.
src/VisualStudio/Core/Test.Next/Remote/RemoteHostClientServiceFactoryTests.cs
Show resolved
Hide resolved
| // don't enable partial load mode for ones that are not in experiment yet | ||
| // Don't enable partial load mode for ones that are not in experiment yet. If the experimentation | ||
| // service isn't available by the time this is called, the VS Service instance is used and it will | ||
| // check the value of the experiment at the time individual features are needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm having trouble with the wording. This is an opportunistic fast-path. We try to check the experimentation service, and if that succeeds and the experiment is not enabled, we return a lighter instance of IWorkspaceStatusService. If we can't check the experiment here, we use VisualStudioWorkspaceStatusServiceFactory.Service, and each call to that service will check the experiment.
This is the expected behavior. In general, it is not safe to block on the completion of a Task which is not known to be already completed. A bunch of code in Roslyn tries to be "safe" with respect to this, but there is no compile-time validation of the prerequisites so we repeatedly run into cases where it breaks down. All of my efforts to prove correctness (e.g. dotnet/roslyn-analyzers#2400) in constrained scenarios encountered cases where assumptions did not hold and deadlocks are a possibility. In the absence of a proof of correctness, it is never safe to assume that blocking on a task completion will work. |
This change removes the use of asynchronous methods during the MEF construction portion of the construction of
IExperimentationService.fixes :https://dev.azure.com/devdiv/DevDiv/_workitems/edit/934446