SEP-2663: Tasks Extension by LucaButBoring · Pull Request #2663 · modelcontextprotocol/modelcontextprotocol

LucaButBoring · 2026-04-29T01:07:48Z

This SEP defines an extension that allows a server to respond to a tools/call request with an asynchronous task handle instead of a final result; the client then retrieves the eventual result by polling. The extension introduces three methods: tasks/get, tasks/update, and tasks/cancel; a polymorphic-result discriminator (resultType: "task"); and a Task shape that carries a task status, in-progress server-to-client requests, and a final result or error. Task creation is server-directed: the client signals support by including the extension in its per-request capabilities, and the server decides on a per-request basis whether to materialize a task.
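
To make the polymorphic-result idea concrete, here is a minimal sketch of how a client might branch on the discriminator. Only `resultType: "task"` and `taskId` come from this thread; the `OrdinaryToolResult` shape, the assumption that ordinary results do not carry `resultType: "task"`, and the helper names are hypothetical and may not match the final schema.

```typescript
// Hypothetical shapes for illustration only. `resultType: "task"` and `taskId`
// appear in the discussion; every other name here is an assumption.
interface TaskHandleResult {
  resultType: "task";
  taskId: string;
}

interface OrdinaryToolResult {
  content: unknown[];
  isError?: boolean;
}

type ToolCallResponse = TaskHandleResult | OrdinaryToolResult;

// Type guard: true when the server chose to materialize a task.
function isTaskHandle(res: ToolCallResponse): res is TaskHandleResult {
  return (res as { resultType?: unknown }).resultType === "task";
}

// Decide what the client does next with a tools/call response.
function describeResult(res: ToolCallResponse): string {
  if (isTaskHandle(res)) {
    return `poll task ${res.taskId}`;
  }
  return res.isError ? "tool reported an error" : "final result";
}
```

Because task creation is server-directed, the same call site has to handle both branches; the client cannot know in advance which one it will get.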

Tasks will become a foundational building block of MCP and are expected to be supported in future protocol versions. The experimental tasks feature in the 2025-11-25 specification served as a stopgap until the protocol's extension mechanism was available. Now that extensions have been formalized, moving tasks to an official extension gives the feature time to incubate and evolve based on additional real-world implementation feedback, without being constrained by the core specification's release cadence. Once the extension has stabilized and achieved broad adoption, it is intended to be promoted into the core protocol.

This proposal removes the version of tasks specified in the 2025-11-25 release. It is shaped by implementation feedback since that release and by several changes to the base protocol expected to arrive in the 2026-06-30 specification:

Motivation and Context

The experimental tasks feature served as an alternate execution mode for tool calls, elicitation, and sampling, allowing receivers to return a poll handle instead of blocking until a final result was ready. Implementation experience surfaced several challenges:

The handshake is fragile. Tasks today expose method-level capabilities (tasks.requests.tools.call declares that tools/call MAY be task-augmented) alongside a tool-level execution.taskSupport field that declares whether a particular tool will accept the augmentation. Clients express their own support for tasks by passing a task parameter on their requests, but MUST NOT include it if the method/tool does not support tasks. A client that wants to opt into tasks must therefore prime its state with a tools/list call before issuing any task-augmented request, and cannot blindly attach a task parameter to every request to handle tools isomorphically. This is confusing, implicit, and easy to get wrong.
tasks/result is a blocking trap. In the current flow, a client that observes input_required is required to call tasks/result prematurely so that the server has an SSE stream on which to side-channel elicitation or sampling requests. tasks/result then blocks until the entire operation completes. This forces long-lived persistent connections that many clients and servers do not want to implement, and it conflicts with SEP-2260, which disallows unsolicited server-to-client requests outright. Under SEP-2260, the SSE semantics that justified the blocking behavior no longer apply.
tasks/list scoping cannot be defined. To avoid clients cancelling or retrieving results for tasks they shouldn't have access to, all tasks should be bound to some sort of "authorization context," the implementation of which is left to individual servers according to their existing bespoke permission models. However, in many cases, it is not possible to perform this binding, in which case the task ID becomes the only line of defense against contamination. In this scenario, it is unsafe for a server to support tasks/list at all. While it was possible for tasks to instead be bound to a session, SEP-2567 removes sessions from the protocol. There is no other natural scope a server can define unilaterally — task IDs can be unguessable handles that a server can recognize one at a time, but servers cannot reliably correlate two unrelated handles to the same caller without additional state.

Beyond implementation challenges, tasks face another structural issue: Client-hosted tasks are no longer expressible. SEP-1686 permitted clients to host tasks for elicitation and sampling, in part to avoid coupling tasks to tool calls. SEP-2260 makes any unsolicited server-to-client request invalid; every server-to-client polling request under client-hosted tasks would be unsolicited by definition.

This proposal intends to solve the above issues by redesigning certain aspects of the feature and moving tasks out to an official extension. Redefining tasks as an official extension gives the feature more time to incubate and evolve independently of the core specification, promoting adoption. As part of the redesign, this proposal consolidates the polling lifecycle into tasks/get and a new tasks/update to remove the blocking tasks/result method. The redesign allows servers to return tasks unsolicited (in response to ordinary, non-task-flagged requests) to eliminate the per-request opt-in and the tools/list warmup, relying instead on the extension capability as the single handshake point. Finally, this proposal removes client-hosted elicitation and sampling tasks in compliance with SEP-2260.
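
The consolidated lifecycle can be pictured as a single decision the client makes after each tasks/get response. The sketch below is an assumption-laden illustration, not the SEP's schema: the status names `input_required`, `completed`, `failed`, and `cancelled` appear in this thread, `working` is assumed to carry over from the 2025-11-25 tasks feature, and `TaskSnapshot`, `NextStep`, `nextStep`, and the 1000 ms fallback are hypothetical.

```typescript
// Assumed status set; "working" is not named in this thread.
type TaskStatus = "working" | "input_required" | "completed" | "failed" | "cancelled";

// Hypothetical view of what a tasks/get response might carry.
interface TaskSnapshot {
  taskId: string;
  status: TaskStatus;
  pollIntervalMilliseconds?: number;
  inputRequests?: Record<string, unknown>;
}

type NextStep =
  | { kind: "wait"; delayMs: number }   // poll tasks/get again after a delay
  | { kind: "update"; keys: string[] }  // answer these requests via tasks/update
  | { kind: "done"; status: TaskStatus };

const DEFAULT_POLL_MS = 1000; // assumed fallback, not from the SEP

function nextStep(task: TaskSnapshot): NextStep {
  switch (task.status) {
    case "input_required":
      return { kind: "update", keys: Object.keys(task.inputRequests ?? {}) };
    case "working":
      return { kind: "wait", delayMs: task.pollIntervalMilliseconds ?? DEFAULT_POLL_MS };
    default:
      // completed, failed, and cancelled are treated as terminal here
      return { kind: "done", status: task.status };
  }
}
```

Nothing in this loop blocks on a long-lived stream: every step is an ordinary request and response, which is the property the removal of tasks/result is meant to provide.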

How Has This Been Tested?

Conformance test suite: modelcontextprotocol/conformance#262

Breaking Changes

Described in proposal.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update

Checklist

I have read the MCP Documentation
My code follows the repository's style guidelines
New and existing tests pass locally
I have added appropriate error handling
I have added or updated documentation as needed

Additional context

Supersedes #2557.

AI Use Disclosure: The extension SEP document in this PR was initially drafted using claude.ai with the previous iteration as a reference. I rewrote/rephrased many sections myself and verified its correctness, using claude.ai as a reviewer to iteratively scrub out issues.

LucaButBoring · 2026-04-29T01:26:03Z

Moving discussion from #2557 over here @CaitieM20 @markdroth @Randgalt @kurtisvg @localden @pja-ant @dsp-ant @maxisbey @maciej-kisiel @ylxlpl

(Tagged everyone who commented on #2557)

localden · 2026-04-29T01:26:49Z

Thanks for putting this together, @LucaButBoring - I'll post the comment that I was typing earlier in #2557 and let you validate how much of this is still relevant.

Some notes beyond the other bits I called out in the review. There's a few places where I think the SEP is a little underspecified:

CreateTaskResult and GetTaskResult both carry resultType: "task" but have different shapes. In schema.ts, CreateTaskResult is { task: Task } (nested), while GetTaskResult is Result & DetailedTask (flat). So a client switching on resultType === "task" then has to also check whether it got result.task.taskId or result.taskId. Is the nesting on CreateTaskResult intentional, or a holdover from before GetTaskResult was flattened?
The inputRequests key contract stops at "SHOULD dedupe." The tasks spec says clients should dedupe by key, and the inputResponses JSDoc says keys match inputRequests keys. But it doesn't say whether a key is unique for the task's lifetime or can be reused after the server consumes the response.
Retry of tasks/get with inputResponses. What happens if a client sends tasks/get { inputResponses: { k: ... } }, network blips, and client retries? If the response was a sampling result the server feeds into a downstream API call, that call just ran twice. IMO the smallest fix is to say the server MUST treat inputResponses keyed on a request it has already consumed as a no-op. That makes the key the idempotency token, and it's almost what the key-matching contract says already.
"The same requests will be included" is ambiguous for partial responses. The Input Requests section says if the client polls again before providing all responses, the same requests reappear. Does "the same requests" mean the full original set (client must re-send what it already provided) or only the still-unfulfilled remainder? I'd read it as the remainder, but the text doesn't say so, and it interacts with the idempotency point above.
The cancel behavior has two stories. The spec says servers MAY ignore cancellation but MUST support tasks/cancel, which I read as: always return a valid CancelTaskResult, possibly with a non-cancelled status. But the "Cancellation Not Supported" example returns a -32603 JSON-RPC error instead. Those are different contracts. Can we formalize that the response carries the task's current status, which may not be cancelled, and drop the -32603 example.
ttl and pollInterval are now in different units. The schema still documents pollInterval in milliseconds while the SEP moves ttl to seconds. So { ttl: 60, pollInterval: 5000 } is 60 seconds next to 5000 milliseconds. @pja-ant raised this before and I don't see it landed yet. Both fields should match.
The Failed example might be the wrong status under the new rule. The Task Flow Change section says failed is for JSON-RPC errors and application faults go to completed with isError: true. The Failed example shows error: { code: -32603, message: "API rate limit exceeded" }. A downstream API rate-limiting the tool is an application fault (exactly the case the new rule routes to completed). If -32603 here means the MCP server itself fell over, the message should say that; otherwise the example is the case the rule says not to use failed for.
Is taskId alone always sufficient for tasks/get? requestState lets a server externalize lookup state to the client (a backend job ID, a serialized continuation) so it doesn't have to keep a mapping table - that makes sense. But in a fully stateless deployment a server could push that to the limit and put the entire task record in requestState, keeping nothing locally. At that point tasks/get { taskId } without requestState has nothing to look up, which runs into the "MUST NOT return CreateTaskResult until tasks/get would find it" guarantee. Should we be explicit about the taskId always being sufficient as a standalone index of a task?

A couple of schema regressions I noticed too:

CallToolRequestParams, CreateMessageRequestParams, and the ElicitRequest*Params types no longer extend anything after TaskAugmentedRequestParams was removed, so they've lost the RequestParams base and _meta? with it.
ServerRequest still includes GetTaskRequest and CancelTaskRequest even though client-hosted tasks are removed.

LucaButBoring · 2026-04-29T17:16:59Z

@localden Thanks for the feedback, going through this:

CreateTaskResult and GetTaskResult both carry resultType: "task" but have different shapes. In schema.ts, CreateTaskResult is { task: Task } (nested), while GetTaskResult is Result & DetailedTask (flat). So a client switching on resultType === "task" then has to also check whether it got result.task.taskId or result.taskId. Is the nesting on CreateTaskResult intentional, or a holdover from before GetTaskResult was flattened?

This revision limits resultType: "task" to CreateTaskResult to avoid any ambiguity, noticed that issue while rewriting this. GetTaskResult was always flat, the distinction was that we made CreateTaskResult nested at the last minute in 2025-11-25 to allow switching on it. That nesting is a holdover from before we had resultType, so we can actually flatten CreateTaskResult, too.

edit: updated
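
For readers following along, the ambiguity being removed looks roughly like this. The two shapes are reduced to the fields named in the question above; `NestedCreateTaskResult`, `FlatTaskResult`, and `taskIdOf` are hypothetical names used only for this sketch.

```typescript
// Pre-flattening shape: the task sits under a `task` property.
type NestedCreateTaskResult = { resultType: "task"; task: { taskId: string } };

// Flat shape: task fields sit directly on the result.
type FlatTaskResult = { resultType: "task"; taskId: string };

// With both shapes sharing one discriminator value, a client needs a
// second check to find the ID. Flattening makes this helper unnecessary.
function taskIdOf(res: NestedCreateTaskResult | FlatTaskResult): string {
  return "task" in res ? res.task.taskId : res.taskId;
}
```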

The inputRequests key contract stops at "SHOULD dedupe." The tasks spec says clients should dedupe by key, and the inputResponses JSDoc says keys match inputRequests keys. But it doesn't say whether a key is unique for the task's lifetime or can be reused after the server consumes the response.

This revision does require keys to be unique over the lifetime of a task, and not reused between distinct requests.

Retry of tasks/get with inputResponses. What happens if a client sends tasks/get { inputResponses: { k: ... } }, network blips, and client retries? If the response was a sampling result the server feeds into a downstream API call, that call just ran twice. IMO the smallest fix is to say the server MUST treat inputResponses keyed on a request it has already consumed as a no-op. That makes the key the idempotency token, and it's almost what the key-matching contract says already.

Yup, that's how tasks/update works in this revision.
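
A minimal sketch of what "the key is the idempotency token" could look like on the server side, assuming an in-memory record per task. `TaskInputLedger` and `apply` are hypothetical names, and a real server would likely persist the consumed keys alongside the task.

```typescript
// Tracks which inputRequests keys have already been answered for one task.
class TaskInputLedger {
  private readonly consumed = new Set<string>();

  // Returns the keys whose responses are new and should be acted on.
  // Responses for keys that were already consumed are silently ignored,
  // so a retried tasks/update does not trigger downstream work twice.
  apply(inputResponses: Record<string, unknown>): string[] {
    const fresh: string[] = [];
    for (const key of Object.keys(inputResponses)) {
      if (this.consumed.has(key)) continue; // retry of a consumed key: no-op
      this.consumed.add(key);
      fresh.push(key);
    }
    return fresh;
  }
}
```

This only works because keys are unique over the lifetime of a task; if a key could be reused for a different request, the second response would be dropped as a duplicate.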

"The same requests will be included" is ambiguous for partial responses. The Input Requests section says if the client polls again before providing all responses, the same requests reappear. Does "the same requests" mean the full original set (client must re-send what it already provided) or only the still-unfulfilled remainder? I'd read it as the remainder, but the text doesn't say so, and it interacts with the idempotency point above.

I struck out that phrasing in this revision, now it can actually be either, as tasks/update is eventually-consistent - but the new key uniqueness constraint means that this is fine from the client's perspective, now.

The cancel behavior has two stories. The spec says servers MAY ignore cancellation but MUST support tasks/cancel, which I read as: always return a valid CancelTaskResult, possibly with a non-cancelled status. But the "Cancellation Not Supported" example returns a -32603 JSON-RPC error instead. Those are different contracts. Can we formalize that the response carries the task's current status, which may not be cancelled, and drop the -32603 example.

To deal with that, in this revision, tasks/cancel no longer has any result (and is also eventually-consistent, like tasks/update).

ttl and pollInterval are now in different units. The schema still documents pollInterval in milliseconds while the SEP moves ttl to seconds. So { ttl: 60, pollInterval: 5000 } is 60 seconds next to 5000 milliseconds. @pja-ant raised this before and I don't see it landed yet. Both fields should match.

A TTL in integer seconds makes sense, but I'm not sure if a polling interval in integer seconds does - 500ms would be a reasonable polling interval for a relatively quick, but high-variance (1s-20s) task. A duration is probably better-expressed with units included in the value (e.g. "500ms"), but that would be nonstandard for us - I suppose I could name it pollIntervalMilliseconds, but that feels awkward and inconsistent in its own right, since nothing else includes units in the field name so far.

edit: updated to include units in the field names

The Failed example might be the wrong status under the new rule. The Task Flow Change section says failed is for JSON-RPC errors and application faults go to completed with isError: true. The Failed example shows error: { code: -32603, message: "API rate limit exceeded" }. A downstream API rate-limiting the tool is an application fault (exactly the case the new rule routes to completed). If -32603 here means the MCP server itself fell over, the message should say that; otherwise the example is the case the rule says not to use failed for.

Noted, I'll update the phrasing here - it actually doesn't really mean the MCP server fell over either, the literal intent is just that if the inner request returns a JSON-RPC error, that's failed, and in every other case (including a tool call with isError: true), that's completed.

edit: updated
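
Put as code, the rule described here is a one-line classification. This is a sketch of the stated intent only; `InnerOutcome` and `terminalStatus` are hypothetical names, and the shapes are trimmed to the fields that matter for the decision.

```typescript
interface JsonRpcError {
  code: number;
  message: string;
}

// Outcome of the inner request that the task wraps.
type InnerOutcome =
  | { error: JsonRpcError }            // the inner request returned a JSON-RPC error
  | { result: { isError?: boolean } }; // the inner request returned a result

// JSON-RPC error => failed; anything else, including isError: true => completed.
function terminalStatus(outcome: InnerOutcome): "failed" | "completed" {
  return "error" in outcome ? "failed" : "completed";
}
```

Under this rule a rate-limited downstream API surfaces as `completed` with `isError: true` on the tool result, not as `failed`.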

Is taskId alone always sufficient for tasks/get? requestState lets a server externalize lookup state to the client (a backend job ID, a serialized continuation) so it doesn't have to keep a mapping table - that makes sense. But in a fully stateless deployment a server could push that to the limit and put the entire task record in requestState, keeping nothing locally. At that point tasks/get { taskId } without requestState has nothing to look up, which runs into the "MUST NOT return CreateTaskResult until tasks/get would find it" guarantee. Should we be explicit about the taskId always being sufficient as a standalone index of a task?

I don't think there's an inconsistency here? requestState is already on the request shape for tasks/get - the requirement is that the client echoes whatever the server gives it. So, in the case where the full task record is in requestState, the server would return the initial value in CreateTaskResult, the client would pick that up, and then it would echo it in tasks/get, maintaining the full record through that flow.

edit: updated, I misinterpreted this - noted here

A couple of schema regressions I noticed too:

CallToolRequestParams, CreateMessageRequestParams, and the ElicitRequest*Params types no longer extend anything after TaskAugmentedRequestParams was removed, so they've lost the RequestParams base and _meta? with it.

ServerRequest still includes GetTaskRequest and CancelTaskRequest even though client-hosted tasks are removed.

Ah, I missed that on #2557 - I'll make sure this is handled correctly when I write the schema changes here.

He-Pin · 2026-04-29T17:59:00Z

This is great, allows integration of various organizational extensions.

pja-ant · 2026-04-29T18:33:47Z

A TTL in integer seconds makes sense, but I'm not sure if a polling interval in integer seconds does - 500ms would be a reasonable polling interval for a relatively quick, but high-variance (1s-20s) task. A duration is probably better-expressed with units included in the value (e.g. "500ms"), but that would be nonstandard for us - I suppose I could name it pollIntervalMilliseconds, but that feels awkward and inconsistent in its own right, since nothing else includes units in the field name so far.

The option space is:

1. Have everything as seconds
2. Allow different units, but don't include it in the name or value
3. Allow different units, but use a string (e.g. "500ms")
4. Allow different units, and add it to the name

IMO:

1. Too limiting - seconds isn't appropriate for everything
2. Strongly prefer we don't do this. We know what happens: https://en.wikipedia.org/wiki/Mars_Climate_Orbiter
3. An option, but IMO having to parse is just annoying.
4. My strong preference. It's simple and avoids any confusion. It's a little more verbose.

I agree that (4) is non-standard, but IMO we just make it the standard starting now and make sure that TTL lists also adopts this standard.
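
To illustrate why option (4) avoids confusion, here is a small sketch using the { ttl: 60, pollInterval: 5000 } example from earlier in the thread. `pollIntervalMilliseconds` is the name floated above; `ttlSeconds`, `TaskTiming`, and `maxPollsBeforeExpiry` are assumptions made for this example and may differ from the final schema.

```typescript
// Units live in the field names, so mixed units are visible at the call site.
interface TaskTiming {
  ttlSeconds: number;
  pollIntervalMilliseconds: number;
}

// Rough upper bound on how many polls fit inside the TTL.
function maxPollsBeforeExpiry(t: TaskTiming): number {
  const ttlMilliseconds = t.ttlSeconds * 1000; // explicit conversion, no guessing
  return Math.floor(ttlMilliseconds / t.pollIntervalMilliseconds);
}
```

With `{ ttlSeconds: 60, pollIntervalMilliseconds: 5000 }` the conversion is forced into the open; with bare `ttl` and `pollInterval` the same division would silently mix seconds and milliseconds.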

LucaButBoring · 2026-05-12T13:37:28Z

We've decided not to add it for now, we're holding off until someone very specifically needs an updating value for their use case, as it was adding a lot of requirements to this spec for not much value in exchange.

For use cases where you need an arbitrary but unchanging value, you can use the task ID field instead, and do something like encode a JWT or similar into it (this is also true of tasks today).

LucaButBoring · 2026-05-13T05:45:38Z

Ported over all corrections after review round on modelcontextprotocol/experimental-ext-tasks#2

CaitieM20

This looks good.

A couple of things: let's mark this as Final (see comment).
Also, I think there are examples in schema/draft/examples that we should delete as well:

GetTaskPayloadRequest
GetTaskPayloadResult
GetTaskPayloadResultResponse
TaskInputResponseRequest
TaskInputResponseRequestParams
Can you do a quick pass and make sure we've deleted all the examples that are tied to the schema we are removing?

Also, I think we are missing a changelog entry.

LucaButBoring changed the title ~~SEP-XXXX: Tasks Extension~~ SEP-2663: Tasks Extension Apr 29, 2026

SEP-2663: Tasks Extension

7715860

LucaButBoring force-pushed the feat/ext-tasks branch from 02b7564 to 7715860 Compare April 29, 2026 01:14

LucaButBoring requested review from a team as code owners April 29, 2026 01:14

LucaButBoring assigned CaitieM20 Apr 29, 2026

LucaButBoring added this to the 2026-06-30-RC milestone Apr 29, 2026

LucaButBoring added SEP in-review SEP proposal ready for review. extension roadmap/agents Roadmap: Agent Communication (Tasks lifecycle) labels Apr 29, 2026

github-project-automation Bot added this to SEP Review Pipeline Apr 29, 2026

LucaButBoring mentioned this pull request Apr 29, 2026

SEP-2557: Adapt Tasks for Stateless & Sessionless Protocol #2557

Closed

9 tasks

localden moved this to In Review in SEP Review Pipeline Apr 29, 2026

localden moved this from In Review to Review Batch in SEP Review Pipeline Apr 29, 2026

CaitieM20 reviewed Apr 29, 2026

Comment thread seps/2663-tasks-extension.md Outdated

CaitieM20 reviewed Apr 29, 2026

Comment thread seps/2663-tasks-extension.md Outdated

dsp-ant reviewed Apr 29, 2026

Comment thread seps/2663-tasks-extension.md

pja-ant reviewed Apr 29, 2026

Comment thread seps/2663-tasks-extension.md

LucaButBoring added 2 commits April 29, 2026 11:15

SEP-2663: Flatten CreateTaskResult

edb77c8

SEP-2663: Put under Agents WG for now

5b125bd

LucaButBoring added 4 commits April 29, 2026 11:42

SEP-2663: Add requestState security guidance

87a3f98

SEP-2663: Rephrase distinction between protocol-level errors and othe…

e224fc0

…r errors

SEP-2663: Reformat document

23d2392

SEP-2663: Append units to duration fields

d3ad509

Merge branch 'main' into feat/ext-tasks

a9ea0a4

mikekistler mentioned this pull request May 12, 2026

SEP-2663: Tasks Extension modelcontextprotocol/csharp-sdk#1573

Open

LucaButBoring and others added 13 commits May 12, 2026 15:24

SEP-2663: Specify error for servers that require tasks

6e4fd57

SEP-2663: Remove redundant 'On success' from cancellation ack

46394d2

Co-authored-by: Caitie McCaffrey <caitiem20@github.com>

SEP-2663: Trim partial-response sentence in tasks/update

7828857

Co-authored-by: Caitie McCaffrey <caitiem20@github.com>

SEP-2663: Rename section to 'Task Update Requests'

c248297

SEP-2663: Link to MRTR spec for inputRequests

782d37a

SEP-2663: Add auth check requirement to security implications

527e5c5

SEP-2663: Rename notifications/tasks/status to notifications/tasks

1d3813a

SEP-2663: Move detailed task types into task status section

25d1c9d

SEP-2663: Add explicit polling response requirements

b15331e

SEP-2663: Disallow progress/logging notifications on tasks

2dba297

SEP-2663: Add spec language for client behavior on inputRequests

1152e58

SEP-2663: Use relative MRTR link instead of absolute URL

7a66448

SEP-2663: Reformat document

f2aca22

localden added accepted SEP accepted by core maintainers, but still requires final wording and reference implementation. and removed accepted-with-changes labels May 13, 2026

SEP-2663: Accepted

470072b

CaitieM20 reviewed May 15, 2026

Comment thread seps/2663-tasks-extension.md Outdated

CaitieM20 requested changes May 15, 2026

LucaButBoring and others added 3 commits May 15, 2026 12:02

Update seps/2663-tasks-extension.md

5b2c24d

Co-authored-by: Caitie McCaffrey <caitiem20@github.com>

SEP-2663: Remove task examples and reformat

91d72e9

SEP-2663: Add changelog entry

38ece65

CaitieM20 added final SEP finalized. and removed accepted SEP accepted by core maintainers, but still requires final wording and reference implementation. labels May 15, 2026

Merge branch 'main' into feat/ext-tasks

c47bd84

Resolve conflicts keeping tasks out of core schema (moved to extension), incorporate SEP-2260 final status, SEP-2549 TTL additions, and add MRTR changelog entry from main. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

CaitieM20 approved these changes May 15, 2026

CaitieM20 merged commit 3395973 into modelcontextprotocol:main May 15, 2026
5 checks passed

CaitieM20 merged 58 commits into modelcontextprotocol:main from LucaButBoring:feat/ext-tasks

11 participants
