Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

introduce ContextMentionProvider API for mentionable context sources #3883

Merged
merged 5 commits into from
Apr 22, 2024

Conversation

sqs
Copy link
Member

@sqs sqs commented Apr 20, 2024

A ContextMentionProvider can provide context items that the user can @-mention in chat. It exposes an API with:

  • triggerPrefixes: eg npm: to show a list of possible npm packages to mention when the user types @npm:
  • queryContextItems: return a list of possible npm packages when the user types in @npm:left-p into the chat
  • resolveContextItem (optional): enrich the context item with more content, used right before the context is sent to the LLM and only run on items that the user explicitly mentioned (so it can be slower)

The experimental URL mention feature was also ported to use this new API. It was designed to support more mentionable context sources, such as #3866 (@-mentionable packages like @npm:left-pad).

Status: experimental. For now, this API is only enabled (and these context sources are only available) when the cody.experimental.noodle VS Code setting is true. If this setting is off, it is not possible for any new ContextMentionProviders to be triggered, which is how we can experiment with adding new ones with less risk to the overall stability of Cody.

Go to API for context sources (Sourcegraph community forum post) to read and share feedback/ideas about how you'd want to use this API as a dev. See OpenCtx for a related concept that we intend to work into Cody through this API as well.

Test plan

CI (which includes the revived URL mention e2e test)

@sqs sqs force-pushed the sqs/context-api branch 3 times, most recently from b81a838 to 11db5f0 Compare April 20, 2024 21:03
@sqs sqs marked this pull request as ready for review April 20, 2024 21:05
@PriNova
Copy link
Contributor

PriNova commented Apr 20, 2024

Hey @sqs

Will the ContextMentionProvider API be general in the sense, that the user can (in whatever way) fetch context programmatically, like scraping web content, executing custom scripts, fetching weather data or whatever the user likes to include as context?

@sqs
Copy link
Member Author

sqs commented Apr 21, 2024

@PriNova:

Will the ContextMentionProvider API be general in the sense, that the user can (in whatever way) fetch context programmatically, like scraping web content, executing custom scripts, fetching weather data or whatever the user likes to include as context?

Yes, but initially (1) it's experimental and we don't recommend anyone else building on it because it will change a lot and (2) it's only for explicitly at-mentioned context, not for automatically included context. BTW in case you haven't see it, https://openctx.org/ is a similar concept that we have that will help us bring more context in.

I just opened https://community.sourcegraph.com/t/api-for-context-sources-contextmentionprovider-openctx/152 for discussion about this API and building on it. Would love to hear your ideas and feedback over there!

@PriNova
Copy link
Contributor

PriNova commented Apr 21, 2024

Would love to hear your ideas and feedback over there!

Perfect @sqs . That is a great addition. Since Cody can fetch URL content, I can write a local server which acts as a custom content provider for whatever I like to put into context. Very excited about.

Copy link
Member

@keegancsmith keegancsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed this to learn some TS and also cause gonna work on top of this. Thanks! Will leave actual approval to someone who knows more, but it LGTM :)

lib/shared/src/mentions/api.ts Outdated Show resolved Hide resolved
if (item.content !== undefined) {
return [item as ContextItemWithContent]
}
const content = await fetchContentForURLContextItem(item.uri.toString(), true, signal)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in what situation would this happen? It seems you only return content set from queryContextItems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is currently correct. I could see this URL mention thing doing more work to extract content in the future for the specifically mentioned item (but not for the ones it shows in a menu). @thenamankumar's PR needs this separate query vs. resolve step, and I wanted to illustrate it here. I'll just remove it though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sqs in my working branch I am also using this function, just in the case of this implementation it didn't seem necessary. I think its worth keeping TBH, makes sense to have something which can defer more expensive work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even URL content fetching on mention is expensive and should be deferred. Right now it is not the case, that is why it takes a few seconds for the select option to show up.

I can fix it.

lib/shared/src/mentions/providers/urlMentions.ts Outdated Show resolved Hide resolved
lib/shared/src/mentions/query.ts Show resolved Hide resolved
Comment on lines +63 to +66
return provider.queryContextItems(
mentionQuery.text,
convertCancellationTokenToAbortSignal(cancellationToken)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are not respecting MAX_RESULTS here. Not sure if we need to? Should we passing this in as a parameter so the context providers are aware of this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave it unlimited and see how we can do this as we write a few more providers.

default:

default: {
for (const provider of getEnabledContextMentionProviders()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noob question: is it possible for the return value of getEnabledContextMentionProviders to change between the earlier call and this call? In practice this is fine since we will just return no context, but just interested. I assume the answer is no since there is no async stuff going on, but I suppose it is possible for there to be a blocking call somewhere here?

vscode/src/editor/utils/editor-context.ts Outdated Show resolved Hide resolved
return {
...contextItem,
content,
size: contextItem.size ?? TokenCounter.countTokens(content),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: you don't need to set size here, resolveContextItem will fallback to countTokens on content if unset.

Comment on lines 34 to 36
} else if (mentionQuery.text.length === 1) {
telemetryRecorder?.withType(mentionQuery.type)
telemetryRecorder?.withType(mentionQuery.provider)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is accurate anymore? We will only know what the mention provider is after many more characters depending on the length of the triggerPrefix. Deciding when to log which provider we are using likely will have to happen at a different time with more complicated logic? I'm not even sure if this logic was that great before since if we have any debouncing we will miss logging this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Good point. I think we can just log the mentionQuery.provider since this call is debounced enough so that it won't be event-spammy.

Copy link
Contributor

@dominiccooney dominiccooney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, some feedback inline.

lib/shared/src/mentions/query.ts Show resolved Hide resolved
lib/shared/src/mentions/providers/urlMentions.ts Outdated Show resolved Hide resolved
return CONTEXT_MENTION_PROVIDERS
}

const isURLProviderEnabled =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could I get a ping before we ship these? The @-namespace is shared.

CC @kalanchan this is the sort of thing a feature flag flip to stable should check—if it adds to this namespace, are there any conflicts with existing or planned context providers (eg prefixes without a separator), Objective-C keywords, common Python decorators/Java annotations (...apologies Ruby and PHP programmers...) etc. etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

contextItem: ContextItem,
editor: Editor
): Promise<ContextItemWithContent> {
const content =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing actionable here now, but we should push PromptString into ContextItems to record the provenance of context as close to the source as possible. If you have input it would be great to hear it.

CC @philipp-spiess

export function isURLContextItem(item: Pick<ContextItem, 'uri'>): boolean {
return item.uri.scheme === 'http' || item.uri.scheme === 'https'
try {
const content = await fetchContentForURLContextItem(url.toString(), false, signal)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize you're just moving stuff around here, but since you're making an extension point now...

Is there any cooldown on this? These fetches are visible, so https://customer.source could observe when people are on their way to typing https://customer.sourcegraph.com and so on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting point. There is a debounce, but it's still possible. Do you think this is a problem solved by sufficiently long debounce, or an explicit commit step?

*
* This API is *experimental* and subject to rapid, unannounced change.
*/
export interface ContextMentionProvider<ID extends ContextMentionProviderID = ContextMentionProviderID> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a straightforward starting point for this purpose.

It would raise the comfort level immensely to see a JetBrains native autocomplete hooked up to the same source... There's something @beyang hand rolled for at-file mentions in chat right now, but we need real autocomplete on @-tags for edits...

lib/shared/src/mentions/providers/urlMentions.ts Outdated Show resolved Hide resolved
const isAllEnabled =
vscode.workspace.getConfiguration('cody').get<boolean>('experimental.noodle') === true
if (isAllEnabled) {
return CONTEXT_MENTION_PROVIDERS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sqs @thenamankumar I was looking at the package context PR. I was wondering how you imagine we inject the dependency on the sourcegraph graphql client. I think the way this is currently implemented needs to be changed a bit so we can inject dependencies like that.

sqs added 5 commits April 22, 2024 14:32
A ContextMentionProvider can provide context items that the user can `@`-mention in chat. It exposes an API with:

- triggerPrefixes: eg `npm:` to show a list of possible npm packages to mention when the user types `@npm:`
- queryContextItems: return a list of possible npm packages when the user types in `@npm:left-pa` into the chat
- resolveContextItem (optional): enrich the context item with more content, used right before the context is sent to the LLM and only run on items that the user explicitly mentioned (so it can be slower)

The experimental URL mention feature was also ported to use this new API.
@sqs sqs merged commit 799ba20 into main Apr 22, 2024
20 checks passed
@sqs sqs deleted the sqs/context-api branch April 22, 2024 23:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants