Merged
Add a new experiment-based setting under ConfigKey.Advanced:
- github.copilot.chat.agent.backgroundTodoAgent.enabled (default: false)

When enabled, this gate controls whether the main agent's manage_todo_list tool is disabled and a background copilot-fast model maintains the todo list instead.
Add hideTodoPromptInstructions prop to AgentPromptProps and DefaultAgentPromptProps. When true, both todo-tool guidance and markdown-checkbox fallback sections are suppressed silently.

Updated prompt files:
- agentPrompt.tsx: thread prop to system prompt classes
- defaultAgentInstructions.tsx: suppress ternary in AlternateGPTPrompt
- gpt5Prompt.tsx, gpt51Prompt.tsx, gpt52Prompt.tsx, gpt53CodexPrompt.tsx: guard fallback blocks with hideTodoPromptInstructions check
New files for the background todo agent:
- backgroundTodoDelta.ts: High-watermark cursor that tracks which tool-call rounds have been processed. Produces deltas with only new activity for efficient background passes.
- backgroundTodoProcessor.ts: State machine (Idle/InProgress/Failed) with cancellation, concurrent update coalescing, and automatic cursor advancement on success or failure.
- backgroundTodoPrompt.tsx: Prompt-tsx element for copilot-fast with prioritized sections (system > user request > current todos > deltas) so prompt-tsx pruning preserves critical context.
In AgentIntent:
- Add session-keyed BackgroundTodoProcessor map with cleanup on dispose
- Add getOrCreateBackgroundTodoProcessor getter

In AgentIntentInvocation:
- Add isBackgroundTodoAgentEnabled helper and isTodoToolExplicitlyEnabled (checks request.toolReferences for #todo mention)
- Disable manage_todo_list in getAgentTools when experiment is on
- Set hideTodoPromptInstructions on AgentPromptProps
- Trigger _maybeStartBackgroundTodoPass after main prompt render
- _executeBackgroundTodoPass: resolve copilot-fast endpoint, render BackgroundTodoPrompt, send request with single tool schema, parse and invoke manage_todo_list tool call
- Add GDPR-classified backgroundTodoAgent telemetry event
- backgroundTodoDelta.spec.ts (7 tests): first invocation, round tracking, cursor advancement, history turns, markRoundsProcessed, reset
- backgroundTodoProcessor.spec.ts (8 tests): state transitions, cursor advancement on success/failure, concurrent coalescing, cancellation, parent token propagation
- backgroundTodoEnablement.spec.ts (9 tests):
  - isTodoToolExplicitlyEnabled: empty refs, unrelated tools, #todo ref, manage_todo_list ref, mixed refs, default tool picker state
  - getAgentTools integration: experiment on disables tool, tool picker default does not override, unrelated toolReferences do not override
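The explicit-enablement check covered by these tests could look roughly like the sketch below. It is not the code from the PR: the `ToolReference` shape, the function name suffix, and the assumption that a `#todo` mention surfaces as a reference named `todo` are all hypothetical.

```typescript
// Hypothetical, simplified stand-in for an entry of request.toolReferences.
interface ToolReference {
  name: string;
}

// Assumption: a "#todo" mention arrives as a reference named "todo",
// and the tool can also be referenced by its full name.
const TODO_REFERENCE_NAMES = new Set<string>(['todo', 'manage_todo_list']);

// Returns true only when the user explicitly referenced the todo tool.
// Empty or unrelated references leave the tool disabled under the experiment.
function isTodoToolExplicitlyEnabledSketch(
  toolReferences: readonly ToolReference[] | undefined
): boolean {
  return (toolReferences ?? []).some(ref => TODO_REFERENCE_NAMES.has(ref.name));
}
```

Under these assumptions, a request with only unrelated references keeps manage_todo_list disabled, while a mixed list containing a todo reference re-enables it.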
…taTracker
- Rename getDelta to peekDelta (does not advance cursor)
- Keep getDelta as convenience alias
- Add IBackgroundTodoDeltaMetadata with newRoundCount, newToolCallCount, isInitialDelta, isRequestOnly
- Add tests for metadata fields and peek/commit semantics
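A minimal sketch of the peek/commit semantics, assuming rounds are append-only so a single index can serve as the high-watermark. The class and interface names are hypothetical, only a subset of the metadata fields is modelled, and treating isInitialDelta as "nothing committed yet" is an assumption.

```typescript
// Hypothetical, simplified shapes; the PR's real interfaces are not shown here.
interface ToolCallRound {
  id: string;
  toolCalls: string[];
}

interface DeltaMetadata {
  newRoundCount: number;
  newToolCallCount: number;
  isInitialDelta: boolean; // assumption: true until the first commit
}

interface Delta {
  rounds: ToolCallRound[];
  metadata: DeltaMetadata;
}

class DeltaTrackerSketch {
  private cursor = 0; // high-watermark: index of the first unprocessed round
  private committedOnce = false;

  // Peek: compute the delta without advancing the cursor.
  peekDelta(allRounds: readonly ToolCallRound[]): Delta {
    const rounds = allRounds.slice(this.cursor);
    const newToolCallCount = rounds.reduce((n, r) => n + r.toolCalls.length, 0);
    return {
      rounds,
      metadata: {
        newRoundCount: rounds.length,
        newToolCallCount,
        isInitialDelta: !this.committedOnce,
      },
    };
  }

  // Commit: advance the cursor past the rounds covered by a delta.
  markRoundsProcessed(delta: Delta): void {
    this.cursor += delta.rounds.length;
    this.committedOnce = true;
  }

  // Forget all progress, e.g. when a session starts over.
  reset(): void {
    this.cursor = 0;
    this.committedOnce = false;
  }
}
```

Separating peek from commit lets a caller inspect the pending delta to make a policy decision and only advance the cursor once a pass has actually consumed it.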
- Add shouldRun() method with typed Run/Wait/Skip decisions and reasons
- Add IBackgroundTodoPolicyInput for external context
- Add executePass() convenience method using built-in execution logic
- Move _executeBackgroundTodoPass and telemetry into processor as static _doExecute and _sendTelemetry
- Add IBackgroundTodoExecutionContext for service injection
- Add 8 policy unit tests in backgroundTodoPolicy.spec.ts
- Replace 6 inline guard checks with processor.shouldRun() call
- Use processor.executePass() instead of processor.start() with callback
- Remove _executeBackgroundTodoPass and _sendBackgroundTodoTelemetry (moved to BackgroundTodoProcessor in previous commit)
- Remove unused imports (IBackgroundTodoDelta, BackgroundTodoPrompt, etc.)
- Add BACKGROUND_TODO_AGENT.md architecture documentation
…und todo processor
- Add meaningfulToolCallCount and contextToolCallCount to delta metadata using classifyTool() to distinguish meaningful/context/excluded tools
- Implement tiered shouldRun() policy:
  - Initial request-only → create plan if no todos exist
  - Meaningful activity (≥1 call) → run immediately
  - Context-only activity → batch by threshold (5 calls)
  - Below threshold → wait for more activity
- Add todoListExists to IBackgroundTodoPolicyInput
- Track hasCreatedTodos on processor (set on outcome: 'success')
- Move history compression (classifyTool, extractTarget, compressHistory, collectAllRounds, renderGroupedProgress, renderLatestRound) into processor
- Update BackgroundTodoPrompt to render from IBackgroundTodoHistory with prioritized sections: latest round (850), assistant context (820), grouped progress (800)
- Add progress signal guidance to system prompt
- Update policy tests: 18 cases covering all decision paths
- classifyTool: context/meaningful/excluded/unknown/subagent (6 tests)
- extractTarget: file paths, terminal, tests, subagent, unparseable (7 tests)
- collectAllRounds: history + current ordering, empty (2 tests)
- compressHistory: empty, single round, grouping by file, meaningful-first sorting, excluded filtering, truncation, assistant context (8 tests)
- renderGroupedProgress: empty, dedup, context count (3 tests)
- renderLatestRound: targets, empty response (2 tests)
Previously executePass was invoked for both Run and Wait decisions, relying on the processor to coalesce. That coupling made it harder to reason about when the bg agent fires. Tighten the call site to only fire on Run, and pass an ILogService into the processor so its policy/coalescing decisions show up in logs.
Policy: drop the context-only firing branch so research-only requests (e.g. "read these files and summarise") with many read_file/list_dir calls but no mutating actions never trigger the bg agent. The initial-request case now also waits for activity instead of guessing a plan from the user message alone.

History compression: pass toolCallResults into compressHistory and extract subagent (search/explore/execution/run) outputs as ISubagentDigest entries so the bg agent sees what exploration discovered, not just that it happened. Raise the latest-round assistant response cap to 1500 chars (older rounds stay at 400) and additionally surface the longest mid-trajectory assistant message, which is the typical place the agent states its plan.

Logging: add structured debug logs for every policy decision, pass start/completion, coalesced delta, and history-compression summary so background agent behaviour is traceable end-to-end.
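The policy after this change can be sketched as below. This is an illustration rather than the processor's actual shouldRun(): the input shape, the function name, and the choice to return Skip only when the experiment is off are assumptions.

```typescript
// Hypothetical decision and input shapes; the real
// IBackgroundTodoPolicyInput carries more context than shown here.
interface PolicyDecision {
  kind: 'run' | 'wait' | 'skip';
  reason: string;
}

interface PolicyInputSketch {
  experimentEnabled: boolean;
  meaningfulToolCallCount: number; // e.g. edits or terminal runs
  contextToolCallCount: number; // e.g. read_file / list_dir
}

function shouldRunSketch(input: PolicyInputSketch): PolicyDecision {
  // Assumption: a disabled experiment is the only Skip case modelled here.
  if (!input.experimentEnabled) {
    return { kind: 'skip', reason: 'experiment disabled' };
  }
  // Any meaningful (mutating) activity triggers a pass.
  if (input.meaningfulToolCallCount >= 1) {
    return { kind: 'run', reason: 'meaningful activity' };
  }
  // Context-only activity never fires, however many calls accumulate.
  if (input.contextToolCallCount > 0) {
    return { kind: 'wait', reason: 'context-only activity' };
  }
  // Initial request with no tool activity: wait instead of guessing a plan.
  return { kind: 'wait', reason: 'no tool activity yet' };
}
```

With this shape, a research-only turn that issues twenty read_file calls still yields a Wait decision, which is the behaviour the commit message describes.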
Rewrites the background todo system prompt to bias toward complete, forward-looking plans rather than per-file mirroring of recent activity.
- Adds an explicit ABORT CONDITIONS block so research-only prompts ("read the following files", "do NOT write any code"), pure read-only activity, and one-item-per-file temptations short-circuit to silence.
- Adds a PLAN COMPLETENESS section requiring the list to cover the full user request and to be derived from the user request + agent's stated plan, with subagent findings and grouped progress as supporting evidence only.
- Renders the new subagentDigests block at lower priority with a label that explicitly tells the model not to mirror its structure as the todo list, so prompt-tsx prunes it before higher-signal context.
- Default to silence: enumerate the only conditions under which the bg agent should call manage_todo_list (creation, completion, advancing to next, genuinely new work). Forbid re-affirming, re-wording or re-marking 'in-progress'.
- Sequential execution: enforce exactly one 'in-progress' item; require the previous 'in-progress' to be marked 'completed' in the same update before promoting the next item.
- Status transitions: codify the allowed moves and forbid regression from 'completed'.

Reduces churn from speculative or duplicate updates and keeps execution strictly serial.
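These rules live in the system prompt, but two of them are mechanical enough to express as a check. The sketch below is purely illustrative and not part of the PR: the `TodoItemSketch` shape and the 'not-started' status name are assumptions, and it tolerates zero in-progress items so a fully completed list passes.

```typescript
// Assumed status vocabulary; only 'in-progress' and 'completed' appear in the text above.
type TodoStatusSketch = 'not-started' | 'in-progress' | 'completed';

// Hypothetical item shape for illustration.
interface TodoItemSketch {
  id: number;
  title: string;
  status: TodoStatusSketch;
}

// Returns a list of rule violations for a proposed update (empty means acceptable).
function validateTodoUpdateSketch(
  previous: readonly TodoItemSketch[],
  next: readonly TodoItemSketch[]
): string[] {
  const errors: string[] = [];

  // Sequential execution: never more than one item in progress.
  const inProgressCount = next.filter(t => t.status === 'in-progress').length;
  if (inProgressCount > 1) {
    errors.push(`expected at most one in-progress item, found ${inProgressCount}`);
  }

  // Status transitions: a completed item must stay completed.
  const previousStatus = new Map<number, TodoStatusSketch>();
  for (const item of previous) {
    previousStatus.set(item.id, item.status);
  }
  for (const item of next) {
    if (previousStatus.get(item.id) === 'completed' && item.status !== 'completed') {
      errors.push(`item ${item.id} regressed from completed to ${item.status}`);
    }
  }
  return errors;
}
```

A check like this could back a unit test or a defensive guard, though the PR itself relies on prompt wording to keep the model within these rules.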
Previously compressHistory truncated assistant responses to 400/1500 chars and only kept 3 hand-picked snippets (latest + first + longest middle). This dropped useful planning context before prompt-tsx ever saw it.
- Remove MAX_RESPONSE_LENGTH, MAX_LATEST_RESPONSE_LENGTH and the truncateResponse helper.
- extractAssistantContext now returns every non-empty assistant response in chronological order, untruncated.
- BackgroundTodoPrompt renders each snippet as its own UserMessage with priority Math.max(700, 850 - age * 30) so the prompt-tsx renderer prunes the oldest snippets first when the budget is tight.
- Update history specs for the new no-truncation / chronological behaviour.
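To make the priority formula concrete, the sketch below evaluates it for the first few snippets. The function name is hypothetical, and it assumes age 0 denotes the most recent snippet, which matches the stated goal of pruning the oldest snippets first.

```typescript
// Priority for an assistant-response snippet, using the formula from the commit message.
// Assumption: age 0 is the newest snippet and age grows toward older responses.
function snippetPrioritySketch(age: number): number {
  return Math.max(700, 850 - age * 30);
}

// Priorities for the seven most recent snippets, newest first.
const snippetPriorities = [0, 1, 2, 3, 4, 5, 6].map(snippetPrioritySketch);
console.log(snippetPriorities.join(', '));
// prints "850, 820, 790, 760, 730, 700, 700"
```

The priority drops by 30 per step and reaches the floor of 700 at age 5, so every older snippet shares that floor and none of them falls below it.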
Extract verbose inline JSX prompt text into reusable constants. Add granularity rules to prefer 2-4 high-level phase items over file-level implementation details. Strengthen sequential state and ordering rules so items complete in order with skip-reordering. Add new-task deduplication rules. Gate tool calls on an explicit diff check to avoid redundant no-change updates.
… todos
Add executeFinalReview() that fires after the agent loop ends so the last round's completions are not stuck as in-progress. Cache the most recent execution context and guard against duplicate finalize passes. Store the pending work callback alongside the pending delta so coalesced finalize passes retain their isFinalReview closure. Extract tool notes (explanation/description/goal) and surface them in the latest-round detail. Support multi_replace_string_in_file targets by reading paths from replacements[]. Pick the last manage_todo_list call when the model emits multiple in one response.
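The last two points can be sketched as follows. This is an illustration under assumptions, not the PR's code: the `ToolCallSketch` shape with JSON-encoded arguments, the helper names, and the `filePath` field on each replacements[] entry are all hypothetical.

```typescript
// Hypothetical tool-call shape: a name plus JSON-encoded arguments.
interface ToolCallSketch {
  name: string;
  arguments: string;
}

// When the model emits several manage_todo_list calls in one response,
// keep only the last one, since it reflects the model's final list.
function pickLastTodoCallSketch(calls: readonly ToolCallSketch[]): ToolCallSketch | undefined {
  for (let i = calls.length - 1; i >= 0; i--) {
    const call = calls[i];
    if (call && call.name === 'manage_todo_list') {
      return call;
    }
  }
  return undefined;
}

// Collect unique file paths from a multi_replace_string_in_file call.
// Assumption: each replacements[] entry carries a string `filePath`.
function extractMultiReplaceTargetsSketch(call: ToolCallSketch): string[] {
  if (call.name !== 'multi_replace_string_in_file') {
    return [];
  }
  try {
    const parsed = JSON.parse(call.arguments) as { replacements?: unknown } | null;
    const replacements = parsed?.replacements;
    if (!Array.isArray(replacements)) {
      return [];
    }
    const paths = new Set<string>();
    for (const entry of replacements) {
      const filePath = (entry as { filePath?: unknown } | null)?.filePath;
      if (typeof filePath === 'string') {
        paths.add(filePath);
      }
    }
    return Array.from(paths);
  } catch {
    // Unparseable arguments yield no targets rather than an error.
    return [];
  }
}
```

Deduplicating paths keeps a multi-edit that touches one file several times from showing up as several targets in the latest-round detail.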
Cover executeFinalReview no-ops (no context, no todos created). Verify coalesced pending delta runs its own queued work callback. Add extractTarget tests for multi-edit single/few/many file paths. Test that explanation/description notes attach to latestRound summaries.
… referenced
Co-authored-by: Copilot <copilot@github.com>
Pull request overview
This PR adds a background todo-maintenance path to the Copilot agent flow so todo lists can be updated by a separate copilot-fast pass instead of relying on the main agent model to call manage_todo_list directly. It touches the agent intent/prompt pipeline, adds a new background processor + prompt, and updates prompt behavior/tests around todo enablement.
Changes:
- Adds a session-scoped background todo processor and delta tracker that decide when to run, render a compressed history prompt, and invoke manage_todo_list.
- Wires the feature into AgentIntent, including experiment-gated tool availability and prompt instruction suppression.
- Adds supporting tests and documentation for enablement, policy, delta tracking, history compression, and processor state handling.
| File | Description |
|---|---|
| extensions/copilot/src/platform/configuration/common/configurationService.ts | Adds the experiment-backed config key for the background todo agent. |
| extensions/copilot/src/extension/prompts/node/agent/test/backgroundTodoProcessor.spec.ts | Adds unit tests for processor state transitions, coalescing, cancellation, and final review behavior. |
| extensions/copilot/src/extension/prompts/node/agent/test/backgroundTodoPolicy.spec.ts | Adds policy tests for run/wait/skip decisions based on prompt context and tool activity. |
| extensions/copilot/src/extension/prompts/node/agent/test/backgroundTodoHistory.spec.ts | Adds tests for tool classification, history compression, rendering, and subagent digest extraction. |
| extensions/copilot/src/extension/prompts/node/agent/test/backgroundTodoDelta.spec.ts | Adds tests for delta tracking, metadata, and peek/commit semantics. |
| extensions/copilot/src/extension/prompts/node/agent/openai/gpt5Prompt.tsx | Suppresses some todo/planning guidance when the background agent is enabled. |
| extensions/copilot/src/extension/prompts/node/agent/openai/gpt53CodexPrompt.tsx | Suppresses some todo/planning guidance for the Codex prompt variant. |
| extensions/copilot/src/extension/prompts/node/agent/openai/gpt52Prompt.tsx | Suppresses some todo/planning guidance for the GPT-5.2 prompt variant. |
| extensions/copilot/src/extension/prompts/node/agent/openai/gpt51Prompt.tsx | Suppresses some todo/planning guidance for the GPT-5.1 prompt variant. |
| extensions/copilot/src/extension/prompts/node/agent/defaultAgentInstructions.tsx | Adds the shared hideTodoPromptInstructions prompt prop and uses it in alternate GPT instructions. |
| extensions/copilot/src/extension/prompts/node/agent/backgroundTodoPrompt.tsx | Introduces the prompt-tsx element used to drive the background todo model. |
| extensions/copilot/src/extension/prompts/node/agent/backgroundTodoProcessor.ts | Implements the background processor, execution flow, history compression, and telemetry. |
| extensions/copilot/src/extension/prompts/node/agent/backgroundTodoDelta.ts | Implements per-session delta/high-watermark tracking for new tool-call rounds. |
| extensions/copilot/src/extension/prompts/node/agent/agentPrompt.tsx | Threads the new hide-todo-instructions prop into prompt selection. |
| extensions/copilot/src/extension/prompts/node/agent/BACKGROUND_TODO_AGENT.md | Documents architecture, enablement logic, request flow, and tests for the feature. |
| extensions/copilot/src/extension/intents/node/test/backgroundTodoEnablement.spec.ts | Adds tests for explicit todo-tool enablement and agent tool gating. |
| extensions/copilot/src/extension/intents/node/agentIntent.ts | Wires background todo processors into agent sessions, tool filtering, prompt props, and final-review execution. |
Copilot's findings
- Files reviewed: 16/16 changed files
- Comments generated: 11
bhavyaus
reviewed
May 3, 2026
bhavyaus
requested changes
May 3, 2026
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
bhavyaus
reviewed
May 4, 2026
bhavyaus
previously approved these changes
May 4, 2026
Co-authored-by: Copilot <copilot@github.com>
roblourens
previously approved these changes
May 4, 2026
DonJayamanne
approved these changes
May 4, 2026
Adds a background todo agent that automatically maintains the chat session's todo list during agent turns, removing the need to prompt the main model to call manage_todo_list itself.

What it does
- Tracks new tool-call rounds with BackgroundTodoDeltaTracker (peek/commit semantics with high-watermark tracking).
- BackgroundTodoProcessor decides when to fire using a tiered invocation policy (skip / wait / run) based on meaningful vs. context-only tool calls and an experiment gate.
- Resolves the copilot-fast endpoint and invokes manage_todo_list.

Notable details
- Gated behind the BackgroundTodoAgentEnabled experiment setting.
- Telemetry event backgroundTodoAgent tracks outcome, duration, and token usage.