Proposal: Envoy Support for Model Context Protocol (MCP) #39174

Comments
Seems this proposal contains lots of features. What will be the initial target, or is there a roadmap? For now, I am interested in 1 and 5. I think 5 should be a traditional HTTP filter which responds to the tools/resources/prompts list requests and then converts the related JSON-RPC requests to traditional HTTP calls based on their content.
+1. Interested in supporting this effort. Is 2 related to …
Thanks @wbpcode. Right, 5 with the transcoder support to REST/gRPC backends could be a great use case, and it will combine with 2, the parser of the MCP protocol. One traditional filter is reasonable if we don't consider JSON-RPC batching, which was added as an MCP RFC ~3 weeks ago. I don't see much usage of JSON-RPC batch as of now, but as an incremental effort, we still want to keep the door open for this 1:N and N:1 fan-out support in Envoy. There are several options, though; for example, a terminal filter could achieve this. There are some other use cases we need to consider, like SSE with JSON-RPC batch, which adds more complexity.
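For illustration, the 1:N and N:1 fan-out semantics discussed above could be sketched as follows. This is only a sketch of the batching behavior from the JSON-RPC side, not an Envoy filter; the function names are invented for this example.

```python
import json

def split_jsonrpc(body: str):
    """Split an HTTP body into individual JSON-RPC messages.

    A JSON-RPC batch is a JSON array of message objects; a single call
    is a lone object. Returns (messages, was_batch) so the response
    path knows which shape to reassemble.
    """
    parsed = json.loads(body)
    if isinstance(parsed, list):
        if not parsed:
            raise ValueError("an empty batch is invalid JSON-RPC")
        return parsed, True  # 1:N — each element can be routed separately
    return [parsed], False

def reassemble(responses, was_batch):
    """N:1 side: a batch request gets an array back, preserving order;
    a single request gets a single response object."""
    return responses if was_batch else responses[0]
```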
I believe we have the idea, and want to discuss more usage patterns in Envoy with the community. @kyessenov mentioned a very useful case in the maintainer channel, which I think is a great case study for how MCP is supported in Envoy. Let's say we want the LLM to manage each Envoy instance's counter status and configuration dump status, adjust log levels, and read aggregated metrics and logs. The goal is to use these capabilities to help debug issues and gain operational insights. How would we design this service with scalability in mind? We have two fundamental approaches: (1) add native MCP support directly to Envoy's admin APIs, or (2) use an MCP adapter/transcoder to connect to Envoy's existing REST admin APIs. One architecture that leverages Envoy's strengths while providing elegant scalability for large fleets could be the following:

```mermaid
graph TD
    LLM[LLM Client] -- MCP --> LB[Envoy MCP API Gateway]
    LB -- MCP /cluster-a/* --> MCPS1[MCP Envoy with transcoder - cluster-a]
    LB -- MCP /cluster-b/* --> MCPS2[MCP Envoy with transcoder - cluster-b]
    LB -- MCP /global/* --> MCPS3[MCP Envoy with transcoder - global-cluster]
    MCPS1 -- REST API --> ENV_A1[Envoy A1 Admin API]
    MCPS1 -- REST API --> ENV_A2[Envoy A2 Admin API]
    MCPS2 -- REST API --> ENV_B1[Envoy B1 Admin API]
    MCPS2 -- REST API --> ENV_B2[Envoy B2 Admin API]
    MCPS3 -- REST API --> ENV_ALL[Global Envoy Metric Service APIs]
    MCPS3 -- Query/Store --> DB
```
What I find elegant about this example is that it's "Envoy all the way down" - using Envoy's own capabilities to solve the MCP integration challenge. The front-end gateway routes MCP requests based on JSON-RPC methods, resources URIs, and tool group usage; the second layer contains the specialized transcoders that serve different clusters with the mapping between the MCP schema and REST APIs; and we maintain a clean separation between individual and aggregated metrics. This includes 1, 2, 5, and potentially 8 and SSE stream support.
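The front-end routing step in the diagram could be sketched as a small dispatch on the request URI prefix. The cluster names here are illustrative only, mirroring the /cluster-a/*, /cluster-b/*, /global/* split above:

```python
def route_cluster(path: str) -> str:
    """Route an MCP request to a backend transcoder cluster by URI
    prefix, as the front-end Envoy MCP API Gateway would."""
    routes = (
        ("/cluster-a/", "mcp-envoy-cluster-a"),
        ("/cluster-b/", "mcp-envoy-cluster-b"),
        ("/global/", "mcp-envoy-global"),
    )
    for prefix, cluster in routes:
        if path.startswith(prefix):
            return cluster
    raise LookupError(f"no MCP route for {path}")
```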
I think Envoy's debug use case would be a good one to solve.
Yeah, this can be a good case study, which includes the cases of supporting pure MCP server backends and REST backends. To support a pure MCP backend, there could be some complexity around the … We also want to gather feedback on which use cases are the most popular and closest to real-world scenarios.
I am interested in this too, and would like a roadmap so I can contribute to it, since I have also been working on MCP recently.
👀 Interested too!
This is great! Higress has already developed an MCP Server Wasm plugin based on Go 1.24, which is functionally consistent with the basic direction described here. This is the documentation for the plugin: https://higress.cn/en/ai/mcp-server

We have also developed a tool that can automatically convert OpenAPI to the configuration of this Wasm plugin: https://github.com/higress-group/openapi-to-mcpserver

Based on this mechanism, we have built the Higress MCP marketplace: https://mcp.higress.ai/

However, the Wasm ABI used by the plugin and the implementation of proxy-wasm-cpp-host are slightly different from those currently used by the Envoy community, so it cannot be used directly in the official Envoy distribution yet. If the community is interested in this solution, we can do some work next to make this plugin usable in the official Envoy distribution as well. We have aligned the changes on the proxy-wasm-cpp-host side in this PR: proxy-wasm/proxy-wasm-cpp-host#433
Great! I have a small question: is it possible for the Wasm Go plugin to start an SSE server if the plugin is used to convert MCP to REST?
@StarryVae This plugin only implements MCP-to-REST based on the Streamable HTTP protocol. If you need to be compatible with the first version of the MCP protocol, which is based solely on POST + SSE, you will need to address the state synchronization issue between the two requests (currently, this is achieved in Higress through another filter that connects to Redis). However, I personally believe that the POST + SSE MCP protocol will gradually be replaced by the Streamable HTTP protocol. Here is our analysis and comparison: https://higress.ai/en/blog/mcp-protocol-why-is-streamable-http-the-best-choice
Oh, I see, your plugin only implements MCP-to-REST based on the Streamable HTTP protocol (POST); a GET request to open an SSE stream is not supported yet? @johnlanni
@StarryVae Yes, we only implemented the stateless part of the Streamable HTTP protocol, because most MCP-to-REST scenarios are stateless.
@johnlanni, thank you for sharing! The official Envoy repository doesn’t maintain the Wasm plugin—it only supports built-in C++ filters. A good way to contribute is through the I think the Streamable HTTP protocol is still in its early stages and comes with several challenges, such as stateful management, Auth arch, multi-phase initialization, and the community is influencing the protocol. That said, one of the core components is a JSON-RPC parser. Most MCP servers currently support only single-call JSON-RPC, but there’s a new RFC requiring batch support: modelcontextprotocol/modelcontextprotocol#228 A good starting point would be building a parser that maps each JSON-RPC request 1:1 to an OpenAPI call. At the same time, I’d like to keep the door open for batch support — I'm considering a terminal filter that performs fan-out and reuses the existing router. |
@botengyao Thank you, we will contribute the Wasm code of the Higress MCP server to …
Objective
We’d like to initiate an issue and discussion around supporting the Model Context Protocol (MCP) in Envoy as a gateway.
What is MCP?
MCP is an open, stateless/stateful protocol that allows GenAI applications to retrieve and exchange context (e.g. source code, files, documents) with LLMs, using JSON-RPC semantics. A significant MCP streamable HTTP update last month introduced OAuth 2.1-based authorization, streamable HTTP transport, JSON-RPC batching, and tool annotations. This is a major update that makes MCP workable as a remote server.
Details
MCP can use transports such as stdio or Streamable HTTP. With the streamable HTTP update, bidirectional JSON-RPC messages can be exchanged over HTTP POST and GET, and SSE is wrapped into streamable HTTP. The following diagrams show the HTTP transport and capability negotiation in MCP:
Transport
Capability Negotiation
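As a concrete illustration of the transport, the first message a client sends over HTTP POST is a JSON-RPC 2.0 `initialize` request that starts capability negotiation. The sketch below only builds the message body; the exact params follow the MCP schema and are abbreviated here, and the protocol version string is an example:

```python
import json

def make_request(method, params=None, id=1):
    """Build a JSON-RPC 2.0 request object as used by MCP over HTTP POST."""
    msg = {"jsonrpc": "2.0", "id": id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg

# An abbreviated MCP initialize request; real clients send their full
# capability set here, and the server answers with its own.
init = make_request("initialize", {
    "protocolVersion": "2025-03-26",  # example version string
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.1"},
})
body = json.dumps(init)  # the HTTP POST body
```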
This proposal explores how Envoy can serve as a gateway between MCP clients and servers — helping route, process, and secure MCP messages in a scalable and extensible way.
Design Proposal
With MCP gaining traction as a standard way for AI tools to interact with contextual data, we believe Envoy can play an important role in enabling infrastructure-level routing, load balancing, and observability for these interactions.
This issue proposes a set of functions that enable Envoy to act as a gateway between MCP clients and servers, covering the following use cases, listed in rough order of complexity and implementation effort:
Proposed Functionality
1. MCP session-aware load balancing based on the MCP endpoint (HTTP request URI).
2. Parsing of the MCP protocol to make Envoy aware of MCP request properties such as method/id, call arguments, or return values.
3. Authentication of MCP requests using OAuth2, JWT, or API keys based on MCP request properties.
4. Authorization of MCP requests and messages using RBAC (for example, authorizing specific MCP methods based on the caller identity). This authorization will apply to both client and server requests.
5. Transcoding of JSON-RPC messages to existing API surfaces, for example gRPC and OpenAPI.
6. Rate limiting of MCP requests.
7. Customizable business logic for MCP messages, similar to HTTP filters, including remote callouts for MCP messages.
8. Load balancing and fan-out of individual MCP messages (i.e. based on method) from a single HTTP stream.
9. Gateway-initiated SSE stream support with session resumption and JSON-RPC batch support.
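As a rough illustration of the transcoding use case, a transcoder could keep a table mapping MCP tool names to REST endpoints and translate each `tools/call` into the shape of an HTTP request. The tool names and routes below are hypothetical, in the spirit of an OpenAPI-derived configuration:

```python
import json

# Hypothetical mapping from MCP tool name to a REST endpoint.
TOOL_ROUTES = {
    "get_user": {"method": "GET", "path": "/v1/users/{user_id}"},
    "create_user": {"method": "POST", "path": "/v1/users"},
}

def transcode(jsonrpc_msg):
    """Translate an MCP tools/call JSON-RPC request into the shape of
    a REST call: (HTTP method, path, body)."""
    if jsonrpc_msg.get("method") != "tools/call":
        raise ValueError("only tools/call is transcoded in this sketch")
    params = jsonrpc_msg["params"]
    route = TOOL_ROUTES[params["name"]]
    args = params.get("arguments", {})
    path = route["path"].format(**args)  # fill path templates from args
    body = None if route["method"] == "GET" else json.dumps(args)
    return route["method"], path, body
```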
Note
MCP is closely related to the A2A protocol proposed for agent-to-agent communication. Both protocols use JSON-RPC, streaming semantics, and stateful sessions. While this proposal covers MCP, it does not preclude extending the same functions to the A2A protocol. Some of the functions are agnostic of the underlying protocol, and business logic specific to A2A can be implemented in its own extension, sharing common implementation, such as the JSON-RPC parser and framing, with MCP.
Acknowledgments
This proposal was framed collaboratively with @htuch, @yanavlasov, and @botengyao. This issue is intended to surface the proposal for public discussion, gather feedback, and coordinate OSS collaboration.
We welcome thoughts, feedback, and ideas from the community as we continue to iterate on this direction.