Autocomplete: Various latency related tweaks and new eager cancellation experiment #3096

Merged
merged 4 commits on Feb 9, 2024
3 changes: 3 additions & 0 deletions lib/shared/src/experimentation/FeatureFlagProvider.ts
@@ -25,6 +25,9 @@ export enum FeatureFlag {
CodyAutocompleteUserLatency = 'cody-autocomplete-user-latency',
// Dynamically decide whether to show a single line or multiple lines for completions.
CodyAutocompleteDynamicMultilineCompletions = 'cody-autocomplete-dynamic-multiline-completions',
// Completion requests will be cancelled as soon as a new request comes in and the debounce time
// will be reduced to try and counter the latency impact.
CodyAutocompleteEagerCancellation = 'cody-autocomplete-eager-cancellation',
// Continue generations after a single-line completion and use the response to see the next line
// if the first completion is accepted.
CodyAutocompleteHotStreak = 'cody-autocomplete-hot-streak',
1 change: 1 addition & 0 deletions vscode/CHANGELOG.md
@@ -15,6 +15,7 @@ This is a log of all notable changes to Cody for VS Code. [Unreleased] changes a
### Changed

- Custom Command: The `description` field is now optional and will default to use the command prompt. [pull/3025](https://github.com/sourcegraph/cody/pull/3025)
- Autocomplete: Move some work off the critical path in an attempt to further reduce latency. [pull/3096](https://github.com/sourcegraph/cody/pull/3096)

## [1.4.0]

2 changes: 2 additions & 0 deletions vscode/src/completions/completion-provider-config.ts
@@ -16,6 +16,8 @@ class CompletionProviderConfig {
FeatureFlag.CodyAutocompleteHotStreak,
FeatureFlag.CodyAutocompleteSingleMultilineRequest,
FeatureFlag.CodyAutocompleteFastPath,
FeatureFlag.CodyAutocompleteUserLatency,
FeatureFlag.CodyAutocompleteEagerCancellation,
] as const

private get config() {
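The two new entries mean these flags are resolved ahead of time by CompletionProviderConfig, so the request path can read them synchronously through getPrefetchedFlag instead of awaiting featureFlagProvider.evaluateFeatureFlag on every keystroke. The class internals are not part of this diff; a minimal sketch of what such prefetching could look like (names and structure are assumptions, not the actual implementation):

```typescript
// Hypothetical sketch of flag prefetching; the real CompletionProviderConfig
// internals are not shown in this diff.
type EvaluateFlag = (flag: string) => Promise<boolean>

class PrefetchedFlags {
    private values = new Map<string, boolean>()

    // Resolve every flag once, up front (e.g. during extension activation).
    public async init(flags: readonly string[], evaluate: EvaluateFlag): Promise<void> {
        const results = await Promise.all(flags.map(flag => evaluate(flag)))
        flags.forEach((flag, i) => this.values.set(flag, results[i]))
    }

    // Synchronous read on the completion request path: no await, no network.
    public getPrefetchedFlag(flag: string): boolean {
        return this.values.get(flag) ?? false
    }
}
```

The important property is that the only await happens once, up front; per-request reads are plain map lookups.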
22 changes: 13 additions & 9 deletions vscode/src/completions/inline-completion-item-provider.ts
@@ -3,7 +3,6 @@ import * as vscode from 'vscode'
import {
ConfigFeaturesSingleton,
FeatureFlag,
featureFlagProvider,
isCodyIgnoredFile,
RateLimitError,
wrapInActiveSpan,
@@ -173,6 +172,10 @@ export class InlineCompletionItemProvider
}
)
)

// Warm caches for the config feature configuration to avoid the first completion call
// having to block on this.
void ConfigFeaturesSingleton.getInstance().getConfigFeatures()
}

/** Set the tracer (or unset it with `null`). */
@@ -222,11 +225,6 @@ export class InlineCompletionItemProvider
this.lastCompletionRequestTimestamp = start
}

// We start feature flag requests early so that we have a high chance of getting a response
// before we need it.
const userLatencyPromise = featureFlagProvider.evaluateFeatureFlag(
FeatureFlag.CodyAutocompleteUserLatency
)
const tracer = this.config.tracer ? createTracerForInvocation(this.config.tracer) : undefined

let stopLoading: (() => void) | undefined
@@ -304,7 +302,9 @@
}

const latencyFeatureFlags: LatencyFeatureFlags = {
user: await userLatencyPromise,
user: completionProviderConfig.getPrefetchedFlag(
FeatureFlag.CodyAutocompleteUserLatency
),
}

const artificialDelay = getArtificialDelay(
@@ -315,6 +315,10 @@
)

const isLocalProvider = isLocalCompletionsProvider(this.config.providerConfig.identifier)
const isEagerCancellationEnabled = completionProviderConfig.getPrefetchedFlag(
FeatureFlag.CodyAutocompleteEagerCancellation
)
const debounceInterval = isLocalProvider ? 125 : isEagerCancellationEnabled ? 10 : 75
Member:

🔥 ☄️


try {
const result = await this.getInlineCompletions({
@@ -328,8 +332,8 @@
requestManager: this.requestManager,
lastCandidate: this.lastCandidate,
debounceInterval: {
singleLine: isLocalProvider ? 75 : 125,
multiLine: 125,
singleLine: debounceInterval,
multiLine: debounceInterval,
},
setIsLoading,
abortSignal: abortController.signal,
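The provider constructor now fires ConfigFeaturesSingleton.getInstance().getConfigFeatures() once so the first completion request does not block on that fetch. The singleton itself is not shown in this diff; a rough sketch of the warm-cache pattern it relies on, with invented names:

```typescript
// Illustrative warm-cache pattern only; the actual ConfigFeaturesSingleton
// is not part of this diff.
class ConfigCache<T> {
    private pending: Promise<T> | undefined

    constructor(private readonly fetch: () => Promise<T>) {}

    // The first caller starts the fetch; every later caller shares the same promise.
    public get(): Promise<T> {
        if (!this.pending) {
            this.pending = this.fetch()
        }
        return this.pending
    }
}

// Fire-and-forget at construction time so the promise is (usually) already
// resolved by the time the first completion request awaits it:
//   void configCache.get()
```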
28 changes: 23 additions & 5 deletions vscode/src/completions/request-manager.ts
@@ -2,7 +2,7 @@ import { partition } from 'lodash'
import { LRUCache } from 'lru-cache'
import type * as vscode from 'vscode'

import { isDefined, wrapInActiveSpan } from '@sourcegraph/cody-shared'
import { FeatureFlag, isDefined, wrapInActiveSpan } from '@sourcegraph/cody-shared'

import { addAutocompleteDebugEvent } from '../services/open-telemetry/debug-utils'

@@ -23,6 +23,8 @@ import type { ContextSnippet } from './types'
import { lines, removeIndentation } from './text-processing'
import { logDebug } from '../log'
import { isLocalCompletionsProvider } from './providers/experimental-ollama'
import { completionProviderConfig } from './completion-provider-config'
import { forkSignal } from './utils'

export interface RequestParams {
/** The request's document */
@@ -72,6 +74,9 @@ export class RequestManager
private latestRequestParams: null | RequestsManagerParams = null

public async request(params: RequestsManagerParams): Promise<RequestManagerResult> {
const eagerCancellation = completionProviderConfig.getPrefetchedFlag(
FeatureFlag.CodyAutocompleteEagerCancellation
)
this.latestRequestParams = params

const { requestParams, provider, context, isCacheEnabled, tracer } = params
@@ -89,7 +94,10 @@
// When request recycling is enabled, we do not pass the original abort signal forward as to
// not interrupt requests that are no longer relevant. Instead, we let all previous requests
// complete and try to see if their results can be reused for other inflight requests.
const abortController: AbortController = new AbortController()
const abortController: AbortController =
eagerCancellation && params.requestParams.abortSignal
? forkSignal(params.requestParams.abortSignal)
: new AbortController()

const request = new InflightRequest(requestParams, abortController)
this.inflightRequests.add(request)
@@ -135,7 +143,13 @@
})

request.lastCompletions = processedCompletions
this.testIfResultCanBeRecycledForInflightRequests(request, processedCompletions)

if (!eagerCancellation) {
this.testIfResultCanBeRecycledForInflightRequests(
Comment on lines +147 to +148

Member:

Can we keep this functionality when eagerCancellation === true? We can throttle (is that the right word here?) requests: with the low debounce we will get one on almost every keystroke, but instead of cancelling all of them except the last one, we can keep one request out of those made in the previous 100ms and always keep the tail one.

This way we preserve the nice UX where a completion is generated early and the user continues typing as suggested, AND we decrease the tail completion delay by 65ms. WDYT?

Contributor Author:

I'm not sure I understand this right. If we want to keep this, we would have to keep more than just the last request alive (so that one request can actually answer another one). Are you recommending we keep all but the last 2 completions active?

Member:

Discussed on the call. We're going to follow up on that in a separate PR.

request,
processedCompletions
)
}
}

// Save hot streak completions for later use.
@@ -154,7 +168,9 @@
)
}

this.cancelIrrelevantRequests()
if (!eagerCancellation) {
this.cancelIrrelevantRequests()
}
}
} catch (error) {
request.reject(error as Error)
@@ -163,7 +179,9 @@
}
}

this.cancelIrrelevantRequests()
if (!eagerCancellation) {
this.cancelIrrelevantRequests()
}

void wrapInActiveSpan('autocomplete.generate', generateCompletions)
return request.promise
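With eager cancellation enabled, the request manager now forks the caller's abort signal instead of creating an independent controller, so cancelling the provider request also cancels the underlying network request rather than letting it finish for recycling. forkSignal is imported from ./utils and its body is not shown here; it presumably behaves roughly like this sketch:

```typescript
// Assumed shape of forkSignal: a child AbortController that aborts whenever the
// parent signal aborts, but can also be aborted independently.
export function forkSignal(signal: AbortSignal): AbortController {
    const controller = new AbortController()
    if (signal.aborted) {
        controller.abort()
    } else {
        signal.addEventListener('abort', () => controller.abort(), { once: true })
    }
    return controller
}
```

Without the flag the behavior is unchanged: a detached AbortController keeps superseded requests alive so their results can still be recycled for other in-flight requests.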
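The review thread above proposes a follow-up, deferred to a separate PR: rather than cancelling every superseded request, keep one recent request per ~100ms window alive in addition to the tail request so its result can still be recycled. A purely hypothetical sketch of that idea, with all names invented:

```typescript
// Hypothetical throttled cancellation: always keep the newest request, plus at
// most one request started within the last 100ms so its result can be recycled.
const WINDOW_MS = 100

interface TrackedRequest {
    startedAt: number
    abort: () => void
}

// `requests` is assumed to be ordered oldest to newest.
function cancelSuperseded(requests: TrackedRequest[], now: number): void {
    if (requests.length === 0) {
        return
    }
    const tail = requests[requests.length - 1]
    let keptRecent: TrackedRequest | undefined
    for (const request of requests) {
        if (request === tail) {
            continue // the tail request is always kept
        }
        if (keptRecent === undefined && now - request.startedAt <= WINDOW_MS) {
            keptRecent = request // keep one recent request alive for recycling
            continue
        }
        request.abort() // cancel everything else eagerly
    }
}
```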