Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid overloading ls with completion resolve requests #10539

Merged
merged 2 commits into from
Apr 22, 2024

Conversation

pascalkuthe
Copy link
Member

@pascalkuthe pascalkuthe commented Apr 21, 2024

While we don't have any reports about this on github I saw a bunch of people having issues with crashing/overloaded lsps on reddit/matrix in the latest release. While working on larger improvements to the completion system I noticed the root cause for that.

While moving completion resolve to the event system in #9668 we introduced what is essentially a "DOS attack" on slow LSPs. Completion resolve requests were made in the render loop and debounced with a timeout. Once the timeout expired the resolve request was made. The problem is the next frame would immediately request a new completion resolve request (and mark the old one as obsolete but because lsp has no nothing of cancelation the server would still process it). So we were in essence sending one completion request to the server every 150ms and only spotted if the server managed to respond before we rendered a new frame. This caused overload on slower machines/with slower lsp.

In this PR I revamped the resolve handler so that a request is only ever resolved once. Both by checking if a request is already inflight and by marking failed resolve requests as resolved.

This PR was split off on larger changes to the completions since this is a somewhat major regression in the last release that I want to fix ASAP. To avoid a huge amount of merge conflicts on my side I kept one additional commit from the larger changes (that the resolvers changes were based on). That commit replaces the usize we used for the language server id with a proper type (using slotmap). That also turns looking up a language server by id/by name from O(n) to O(1) operations.

@pascalkuthe pascalkuthe added this to the 24.04 milestone Apr 21, 2024
@pascalkuthe pascalkuthe added C-bug Category: This is a bug A-language-server Area: Language server client E-medium Call for participation: Experience needed to fix: Medium / intermediate S-waiting-on-review Status: Awaiting review from a maintainer. C-perf labels Apr 21, 2024
pub tags: Vec<DiagnosticTag>,
pub source: Option<String>,
pub data: Option<serde_json::Value>,
}

// TODO turn this into an enum + feature flag when lsp becomes optional
pub type DiagnosticProvider = LanguageServerId;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 this will be useful when introducing spell checking diagnostics

helix-core/src/diagnostic.rs Outdated Show resolved Hide resolved
helix-core/src/diagnostic.rs Outdated Show resolved Hide resolved

let res = self.call::<lsp::request::ResolveCompletionItem>(completion_item);
Some(async move { Ok(serde_json::from_value(res.await?)?) })
completion_item: &lsp::CompletionItem,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice, I remember thinking there was an unnecessary clone in completion resolving because of the type here lacking a ref 👍

the-mikedavis
the-mikedavis previously approved these changes Apr 21, 2024
Copy link
Member

@the-mikedavis the-mikedavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I think I've seen the completion resolve bug cause some odd effects in ErlangLS (which is relatively rather slow).

Both the refactor towards a slotmap for language servers and moving the completion resolve handler to its own file are 👍. The diff is a little large but I think that's a good sign that we should have a more formal ID for language servers anyways.

@archseer
Copy link
Member

Agreed, the changes look good! Do we also need a newtype ID for debuggers?

Just one nit, can you copy the problem description/solution part of the PR description into the last commit message? It wasn't clear to me what "don't resolve lsp with completion resolve" meant until I went back and read the description

While moving completion resolve to the event system in #9668 we introduced what
is essentially a "DOS attack" on slow LSPs. Completion resolve requests were
made in the render loop and debounced with a timeout. Once the timeout expired
the resolve request was made. The problem is the next frame would immediately
request a new completion resolve request (and mark the old one as obsolete but
because LSP has no notion of cancelation the server would still process it). So
we were in essence sending one completion request to the server every 150ms and
only stopped if the server managed to respond before we rendered a new frame.
This caused overload on slower machines/with slower LS.

In this PR I revamped the resolve handler so that a request is only ever
resolved once. Both by checking if a request is already in-flight and by marking
failed resolve requests as resolved.
@pascalkuthe
Copy link
Member Author

Agreed, the changes look good! Do we also need a newtype ID for debuggers?

Just one nit, can you copy the problem description/solution part of the PR description into the last commit message? It wasn't clear to me what "don't resolve lsp with completion resolve" meant until I went back and read the description

I amended the commit message. Adding newtype ids for DAP would be nice too (in general I almost always prefer newtyped integers to raw integers for ids) but I would defer that to a different PR.

@archseer archseer merged commit 38ee845 into master Apr 22, 2024
6 checks passed
@archseer archseer deleted the completion_resolve_dos branch April 22, 2024 03:27
@pascalkuthe pascalkuthe mentioned this pull request Apr 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-language-server Area: Language server client C-bug Category: This is a bug C-perf E-medium Call for participation: Experience needed to fix: Medium / intermediate S-waiting-on-review Status: Awaiting review from a maintainer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants