-
Notifications
You must be signed in to change notification settings - Fork 4.2k
Simplify diagnostic tagging by making it use the standard tagging model. #23448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify diagnostic tagging by making it use the standard tagging model. #23448
Conversation
|
Tagging @heejaechang @sharwell . Note: this approach in incorrect in a very fundamental way currently. Wanted to start the discussion with you on how we might fix that. |
|
@heejaechang I took the appraoch whereby the event passes along the diagnostic id that was updated. However, this approach is fundamentally flawed with how tagging works currently. The reason for that is that tagging is hugely cancellable. i.e. any time tagging gets an event, it cancels the current work and enqueues the new recomputation. The assumption is that any recomputation supersedes any previous or inflight computation. That's not the case with this optimization. For example, if we heard about a global "recompute tags for all providers" notification, we can't supersede that with a request to recompute tags for a single provider id. We have a few options here afaict (other suggestions welcome!):
Thoughts? |
|
I think we can go with simple approach and do some perf investigation and see whether we need to optimize more. ... by the way, even with this approach, it doesnt remove the fact that diagnostic service is based on solution, and tagger is based on buffer and dynamically buffer to workspace and buffer to document association can be changed. in other word, we still need to filter out diagnostic events that is not associated with this subject buffer (see https://github.com/dotnet/roslyn/pull/22920/files#r153666048 if we want document specific events) .. we also can make diagnostic event source to aggregate object ids as we aggregate text changes on text change event source. |
That already happens here: https://github.com/dotnet/roslyn/blob/master/src/EditorFeatures/Core/Shared/Tagging/EventSources/TaggerEventSources.DiagnosticsChangedEventSource.cs#L28 |
Do you mean approach #1 ? |
|
ha, we had that before? interesting... |
|
@CyrusNajmabadi yep. I mean (1). basically whenever there is event source changes, we do what we currently do for initial tags. ask all diagnostics reported on this file, filter out empty and not related ones (suggestion/unnecessary/regular), and report rest. we no longer need sub taggers since we always report all of them. really simple. and for common case, probably good enough (since most of them will be empty or have small number of errors). probably will have issues for corner cases where file contains thousands of errors since it will repeatedly report those. |
Well, we'll still always diff the spans against the old ones. So we'll only report the change. |
|
Ok. I pushed the simpler model. take a look when you get a chance. |
93063d5 to
8c6a67c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
|
Tagging @sharwell @heejaechang @dotnet/roslyn-ide -500 lines in diagnostic tagging FTW. Note: this should probably be smoke tested and also perf tested. While is simplifies things greatly, and pushes a lot of work to the BG. It may end up performing more work and make have unintended consequences. |
|
Tagging @dotnet/roslyn-ide |
|
📝 I'm planning to look at this for 15.7+ unless we get indications of more substantial problems. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add ActiveContextChanged Event source here as well. and remove this (http://source.roslyn.io/#Microsoft.CodeAnalysis.Features/Diagnostics/DiagnosticAnalyzerService_IncrementalAnalyzer.cs,58)
looks like we added the code above as a quick way to handle context change when it is first introduced to workspace. but that is not right way to handle it. DiagnosticAnalyzerService shouldn't care about buffer. it should only care about solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hrmm.. can a buffer even cross workspaces though? i think it can (from Misc Workspace to normal workspace, and vice versa). So i think we need WorkspaceRegistrationChanged as well. I will add ActiveContextChanged though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can buffer ever be null? I don't believe snapshot can have null buffer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably not. though i was just preserving hte code from before. so i would prefer to not change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we move this try/catch down to ProduceTags for one specific updateArg so that even if 1 fails, we still can get other analyzers squiggles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
awesome! things got really simpler! and tests are all passed! |
|
probably need to do some manual testing before we check this in. @jasonmalinowski what would be branch we can check this in not affecting ask mode? like post 15.6 branch or something? |
|
@heejaechang We have no such branch -- even if we did nobody would be looking at it. If your desire is for some dogfooding/testing on it, there's a few things we can do. |
|
if we have a few release left for 15.6 RTM, then I think we should just check this in to 15.6 |
jasonmalinowski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole determination of editorSnapshot independent of requestedSnapshot seems really fishy here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this preferred over just GetText()? It should be available either way so be equally fast. But if something goes sideways, this version is broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. i can change to just using GetText.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this ever different than spanToTag.SnapshotSpan.Snapshot? This code and then the .TranslateTo in ProduceTags(...) implies with much vigor they can be different, but I don't see how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think some of this logic dates back like 10ish years. It's quite possible that it was back at a time where we had less invariants about Documents/SourceTexts/Snapshots. I'm happy to remove (or assert it must be hte same).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it can only date 8 years ago... ;-)
Where did this move from originally? Was originally the snapshot you're mapping from related to the snapshot the diagnostics were computed for? It implies I can get stale spans, but since those buffers will be the same we're not correctly mapping things forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment on that says:
// Make sure we can find an editor snapshot for these errors. Otherwise we won't
// be able to make ITagSpans for them. If we can't, just bail out. This happens
// when the solution crawler is very far behind. However, it will have a more
// up to date document within it that it will eventually process. Until then
// we just keep around the stale tags we have.
I think a lot of the complexity here was that in the old system you had notifications from the diagnostic engine about a set of data, and we were trying to relate it to the editor-based data. There were moving at different speeds, with different invariants, and we had code like this to try to be resilient to that sort of thing.
If we're now just querying the diagnostic subsystem with teh correct tagger info we have, i don't think we have any issues adn we can likely clean this up even more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, looking at this, i don't think we need it anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
old one did translate to since DiagnosticUpdated event is not part of tag event source. due to that it didn't update internal interval tree when diagnostic is updated, rather it just hold onto snapshot at the moment. and later did "translate to" to move spans to right one itself. so at the time, 2 documents were actually different (one saved when DiagnsoticUpdated is called, and one when event source is raised - ProduceTags is called)
this new one no longer does that, now DiagnosticUpdated is part of tag event source. when diagnostic changed, it save spans in taggers internal interval tree and let tagger's own mechanism to handle translate to. so now there is only 1 document. one that is from SpansToTag.Document. the code seems just artifact of some copy over from existing code. but I think it is just matter of how it get to same data. I dont think it will cause bug since 2 should be same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"// Make sure we can find an editor snapshot for these errors. Otherwise we won't
// be able to make ITagSpans for them. If we can't, just bail out. This happens
// when the solution crawler is very far behind. However, it will have a more
// up to date document within it that it will eventually process. Until then
// we just keep around the stale tags we have."
agree this is not needed anymore. since there is no longer 2 documents as we used to.
|
Not a problem (i hope). |
fd8c45e to
c6b74a2
Compare
|
@jasonmalinowski Did i do this properly? |
|
@CyrusNajmabadi Well done! |
|
Thanks @CyrusNajmabadi |
|
just realized this is missing ask mode template |
|
@jasonmalinowski Can you help with the template? |
|
Also, i would recommend this be smoke tested. |
|
@jinujoseph added ask mode template. |
|
@heejaechang I updated the ask mode template to link the other bugs |
|
thanks @sharwell ! |
|
@jinujoseph @mattscheffer we probably want to have manual testing as soon as possible. |
|
@richaverma1 to help with manual testing |
|
The scenario meets the bar. Consider this approved, pending satisfactory results from the manual testing. @richaverma1 Can you add the "Approved to merge" label when testing is complete and you are happy with the results? |
|
merged! thank you everyone! thank you @CyrusNajmabadi |
|
Any time! |
|
Pardon me for being a github beginner, but I'm in real pain because of this issue.. would appreciate some assistance on how to pull & install the fix |
|
@yuvalshtemer latest VS should have this fix, you don't need to pull anything. just move to latest VS. |
Customer scenario
User is working on a big solution and switching between git branches multiple times or close and reopen solutions multiple times. and on some unfortunate cases, VS will crash with out of memory exception.
Bugs this fixes
Fixes #24055
Fixes DevDiv 512757
Supersedes #22920, #23377, #23409, #23411
Workarounds, if any
after each git branch switching or solution open/close, give VS sometimes to process pending works enqueued by the operation.
Risk
this simplify our diagnostic tagger dramatically. so there is a risk where behavior might not exactly same as before. but we believe this is right direction to go.
Performance impact
this should remove OOM due to too many pending UI work items completely. that is source of most of our OOM crash.
Is this a regression from a previous update?
No
Root cause analysis
previously, diagnostic service didn't support pull model for all diagnostics source. so, tagger used event (push model) to hold onto last reported diagnostics and later use that to report tags. and that made us to use custom logic for the tagger which ends up forcing us to use UI thread to synchronize many states to remove potential race. and that caused us to push too many work items to UI thread in certain case such as git branch switching.
now, diagnostic service fully supporting pull model for all sources, this moves diagnostic taggers to follow our tagger framework which doesn't require UI thread for state synchronization. removing the root cause of OOM from the picture completely.
How was the bug found?
MemoryWatson
....
more dev detail.
From a conversation with @heejaechang #23409 (comment)
A while back the diagnostics subsystem had a limitation where you could only hear about some diagnostics if you explicitly listened for diagnostic events. i.e. if you weren't listenting and capturing those events, you couldn't go back and ask for those diagnostics later. This meant that we couldn't do diagnostic tagging (squiggles/fading/suggestions) like we did normal tagging. Normal tagging hears about events, pauses a bit, then goes and gets all the data necessary later to produce the tags. Because that data wasn't available 'later', diagnostic tagging had to aggregate the info and contort things to fit into the tagging infrastructure.
This restriction from the diagnostics service no longer exists. THat means we can great simplify how we do our tagging computation.