Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autocomplete: Various latency related tweaks and new eager cancellation experiment #3096

Merged
merged 4 commits into from
Feb 9, 2024

Conversation

philipp-spiess
Copy link
Contributor

A few small tweaks from my learnings of looking at some traces:

  1. Fixes a bug where the debounce time was increased for non-local models
  2. Sets the same debounce time for single-line and multi-line
  3. Remove some config evaluations off the critical path. Those are heavily cached but it would still cause the very first completion to be slower
  4. Add a new eager cancellation experiment that will cancel requests as soon as a new request is created and reduces the debounce time significantly to try and counter the latency regression

Test plan

  • For the new experiment, I added a abort handler in the fireworks client and ensured it was heavily hit
  • For the rest, just made sure completions still work. The changes are trivial.

@philipp-spiess philipp-spiess requested review from valerybugakov and a team February 9, 2024 11:31
@philipp-spiess philipp-spiess self-assigned this Feb 9, 2024
const isEagerCancellationEnabled = completionProviderConfig.getPrefetchedFlag(
FeatureFlag.CodyAutocompleteEagerCancellation
)
const debounceInterval = isLocalProvider ? 125 : isEagerCancellationEnabled ? 10 : 75
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 ☄️

Comment on lines +147 to +148
if (!eagerCancellation) {
this.testIfResultCanBeRecycledForInflightRequests(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep this functionality when eagerCancellation === true? We can throttle (is it the right word here?) requests — with low debounce, we will have them almost on every keystroke, but instead of canceling all of them but the last one, we can keep on request in every number of requests made in the previous 100ms + always keep the tail one.

This way, we preserve the nice UX where the completion is generated early, and a user continues typing as suggested AND decrease the tail completion delay by 65ms. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure I understand this right. If we want to keep this, we would have to keep more than just the last request alive (so that another request can actually answer another one). Are you recommending we keep all but the last 2 completions active?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed on the call. We're going to follow-up on that in a separate PR.

@philipp-spiess philipp-spiess merged commit 703afda into main Feb 9, 2024
15 checks passed
@philipp-spiess philipp-spiess deleted the ps/ac-latency-experiments-and-tweaks branch February 9, 2024 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants