
chat: add multiple 'initiator' types to provideLanguageModelResponse #250018


Open · connor4312 wants to merge 4 commits into main

Conversation

connor4312
Member

MCP servers can make chat requests via sampling. This swaps `extensionId`
in `provideLanguageModelResponse` with an `initiator` so we can
represent these calls properly.
@connor4312 connor4312 requested review from roblourens and jrieken May 28, 2025 22:07
@connor4312 connor4312 assigned connor4312 and unassigned roblourens and jrieken May 28, 2025
@vs-code-engineering vs-code-engineering bot added this to the May 2025 milestone May 28, 2025
roblourens
roblourens previously approved these changes May 31, 2025
@jrieken
Member

jrieken commented Jun 2, 2025

> This swaps `extensionId` in `provideLanguageModelResponse` with an `initiator` so we can represent these calls properly.

I don't understand this. AFAIK the extensionId in the provideLanguageModelResponse direction isn't used for presentation but for extension-specific checks, e.g. to enable different models for friend extensions or for extension-specific token limits.

@connor4312
Member Author

> I don't understand this. AFAIK the extensionId in the provideLanguageModelResponse direction isn't used for presentation but for extension-specific checks, e.g. to enable different models for friend extensions or for extension-specific token limits.

In copilot-chat this is indeed the case. However, we lack any way to say that a chat didn't come from an extension, and giving an invalid extension ID will cause copilot-chat to throw. This API change allows us to express the initiator correctly.

@jrieken
Member

jrieken commented Jun 2, 2025

> However, we lack any way to say that a chat didn't come from an extension, and giving an invalid extension ID will cause copilot-chat to throw. This API change allows us to express the initiator correctly.

AFAIK that's something we never really used. There is an EXP from last year that serves as a kill switch (in case an extension goes bad) and then there is some CAPI logic with the x-onbehalf-on header. cc @lramos15 to check if CAPI ever used this

@connor4312 are you planning on extending that to MCP servers? IMO we should reconsider this and, instead of glorifying the initiator, just make extensionId optional and eventually phase it out. There will also be cases in which the initiator is neither MCP nor an extension but just VS Code itself.

Member

@jrieken jrieken left a comment

Let's first agree on this being needed

@jrieken
Member

jrieken commented Jun 2, 2025

So, I suggest passing undefined or '' as the extension id and handling this case gracefully in the chat extension, like not doing any of the extension-specific checks we have there, and later revisiting whether we can remove all of this.
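
For illustration, a minimal sketch of what handling an empty extension id gracefully could look like on the chat-extension side; the helper names below are hypothetical stand-ins for the extension-specific checks mentioned above, not code from either repository.

```ts
// Hypothetical helpers standing in for the extension-specific checks
// (friend-extension model access, per-extension token limits).
declare function isFriendExtension(extensionId: string): boolean;
declare function tokenLimitForExtension(extensionId: string): number | undefined;

// Sketch: if the extension id is missing or empty, skip the extension-specific
// checks instead of throwing on an unknown id.
function resolveRequestPolicy(extensionId: string | undefined): { friendModelAccess: boolean; tokenLimit?: number } {
	if (!extensionId) {
		// Request did not come from an extension (e.g. MCP sampling or core itself).
		return { friendModelAccess: false };
	}
	return {
		friendModelAccess: isFriendExtension(extensionId),
		tokenLimit: tokenLimitForExtension(extensionId),
	};
}
```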

@connor4312
Member Author

Currently in main I just use the copilot extension ID to make it work, which is a hack. If we don't need this I'm good with making the extension ID optional instead.

> There will also be cases in which the initiator is neither MCP nor an extension but just VS Code itself.

Agreed; my thought was that if we keep the initiator, we can add an additional type to the union when this case arises.

> are you planning on extending that to MCP servers?

Yes, MCP supports "sampling", which is basically making an LM request. At the moment, until we get the registry fully in, we don't know the 'identity' of MCP servers and only have their user-defined label. But once we get the registry in, we'll know (for registry-based servers) their name in a deterministic way.
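
For context, a rough illustration of what such a sampling request looks like; the shape follows the MCP `sampling/createMessage` method as described in the MCP spec, and the prompt text and server label are made up.

```ts
// Illustrative only: the rough shape of a sampling request an MCP server sends
// to the client, asking it to run a language model request on its behalf.
const samplingRequest = {
	jsonrpc: '2.0',
	id: 1,
	method: 'sampling/createMessage',
	params: {
		messages: [
			{ role: 'user', content: { type: 'text', text: 'Summarize the open issues.' } },
		],
		maxTokens: 256,
	},
};

// Today the only identity the client has for a local stdio server is its
// user-defined label from mcp.json, e.g. 'my-local-server' (made-up name).
```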

@jrieken
Member

jrieken commented Jun 3, 2025

> At the moment, until we get the registry fully in, we don't know the 'identity' of MCP servers and only have their user-defined label.

Tho, stdio servers that I have locally and that I configure via mcp.json always have a user-defined label, right?

@connor4312
Member Author

> Tho, stdio servers that I have locally and that I configure via mcp.json always have a user-defined label, right?

Yes

@connor4312
Member Author

Also, I think we may need/want this because the extension's onDidReceiveLanguageModelResponse is what's used to drive the usage graph, which we will want for MCP too

@jrieken
Member

jrieken commented Jun 3, 2025

> Also, I think we may need/want this because the extension's onDidReceiveLanguageModelResponse is what's used to drive the usage graph,

That's only needed because the chat extension itself doesn't use the API to make LM requests. For all other cases we know the initiator of LM requests (based on the scoped API) and this event is something we won't finalize.

@jrieken
Member

jrieken commented Jun 3, 2025

> CAPI does utilize this value for some switching and usage reporting. Does it make sense to have MCP servers send a distinct header? Is the Copilot Chat extension the initiator in this case?

Then let's plan a future with CAPI where

  • this value can be an extension id (as today and guaranteed to be correct)
  • this value can be an MCP id (in case of MCP servers from the registry)
  • this value can be a random, untrusted string (local MCP server)
  • this value can be an identifier for core itself (for vscode native LM usage outside of any extension)
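
As a minimal sketch of how those four cases could be modelled, assuming hypothetical type and property names that are not part of the actual proposal:

```ts
// Hypothetical modelling of the four initiator cases listed above.
type InitiatorSketch =
	| { readonly kind: 'extension'; readonly extensionId: string }  // guaranteed-correct extension id
	| { readonly kind: 'mcpRegistry'; readonly serverId: string }   // MCP server known from the registry
	| { readonly kind: 'mcpLocal'; readonly label: string }         // random, untrusted local label
	| { readonly kind: 'core' };                                    // VS Code itself
```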

@lramos15 can you check with CAPI folks on this please?

@connor4312 connor4312 modified the milestones: May 2025, June 2025 Jun 6, 2025
@connor4312
Member Author

I made some API updates to reflect that, lmk what you think (or feel free to just push changes)

@connor4312 connor4312 requested a review from jrieken June 6, 2025 21:59
Member

@jrieken jrieken left a comment

Given we aim to finalize this API soon, I have left some API nits.

(Tho, I am unsure if the initiator aspect is something we would finalise or not?)

```ts
	reason: string;
}

export type LanguageModelRequestInitiator = ExtensionLanguageModelRequestInitiator | McpServerLanguageModelRequestInitiator | InternalLanguageModelRequestInitiator;
```
Member

In the API we don't do kind-discriminated or-types but plain simple classes, like InlineValue. I think we have two options:

  • have different initiators be represented as classes, or
  • have one fits-all initiator which has the kind-attribute and something generic like identifier
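
A rough sketch of the two options above, using hypothetical names; the actual shape is what this thread is still settling:

```ts
// Option 1: separate initiator classes, following the pattern of plain API classes like InlineValue.
class ExtensionInitiator {
	constructor(readonly extensionId: string) { }
}
class McpServerInitiator {
	constructor(readonly label: string) { }
}

// Option 2: one fits-all initiator with a kind attribute and a generic identifier.
interface Initiator {
	readonly kind: 'extension' | 'mcpServer' | 'core';
	readonly identifier?: string;
}
```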

Member Author

The one case I was looking at was quickpick items vs separators that have a kind like we have here. The only reason I didn't do classes is that there's no reason for an extension to instantiate these types, but that could also be done with a protected or private constructor.

Member

Yikes, that quick pick. Tho, it is same same but different: there is no or-type, just a single type that's different depending on the value of kind. That's something we can have too (basically my 2nd suggestion).


```diff
-provideLanguageModelResponse(messages: Array<LanguageModelChatMessage | LanguageModelChatMessage2>, options: LanguageModelChatRequestOptions, extensionId: string, progress: Progress<ChatResponseFragment2>, token: CancellationToken): Thenable<any>;
+provideLanguageModelResponse(messages: Array<LanguageModelChatMessage | LanguageModelChatMessage2>, options: LanguageModelChatRequestOptions, initiator: LanguageModelRequestInitiator, progress: Progress<ChatResponseFragment2>, token: CancellationToken): Thenable<any>;
```
Member

Likely out of scope for this PR, but the initiator argument should probably be merged with the options argument and then become a new, fresh type for the provider side, so that LanguageModelChatRequestOptions is reused (in a cyclic API) and so that things are contained nicely.
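
A sketch of what merging the initiator into a provider-side options type could look like; `ProvideLanguageModelResponseOptions` and the stubbed types below are hypothetical stand-ins for the proposed API, not the actual proposal.

```ts
import type { CancellationToken, Progress } from 'vscode';

// Stubs standing in for the proposed types discussed in this PR (sketch only).
type LanguageModelRequestInitiator = unknown;
interface LanguageModelChatRequestOptions { modelOptions?: { [name: string]: unknown } }
interface ChatResponseFragment2 { part: unknown }
type LanguageModelChatMessage2 = unknown;

// Hypothetical provider-side options bag that folds in the initiator, so the
// provider method takes one options argument instead of a separate parameter.
interface ProvideLanguageModelResponseOptions extends LanguageModelChatRequestOptions {
	readonly initiator: LanguageModelRequestInitiator;
}

interface LanguageModelChatProviderSketch {
	provideLanguageModelResponse(
		messages: LanguageModelChatMessage2[],
		options: ProvideLanguageModelResponseOptions,
		progress: Progress<ChatResponseFragment2>,
		token: CancellationToken
	): Thenable<unknown>;
}
```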

```diff
 /**
  * Represents a large language model that accepts ChatML messages and produces a streaming response
  */
 export interface LanguageModelChatProvider {

 	// TODO@API remove or keep proposed?
-	onDidReceiveLanguageModelResponse2?: Event<{ readonly extensionId: string; readonly participant?: string; readonly tokenCount?: number }>;
+	onDidReceiveLanguageModelResponse2?: Event<{ readonly initiator: LanguageModelRequestInitiator; readonly participant?: string; readonly tokenCount?: number }>;
```
Member

I know it follows the beaten path but this needs revisiting. Core should just know when a LM has been used and AFAIK this is only needed because our extension makes requests without going through the API. Maybe @sandy081 can clarify if/how this is still needed
