Add background todo agent #313911

Merged
vritant24 merged 38 commits into main from dev/vritant24/bg-todo-watcher
May 4, 2026

@vritant24
Member

Adds a background todo agent that automatically maintains the chat session's todo list during agent turns, removing the need to prompt the main model to call manage_todo_list itself.

What it does

  • Watches agent activity via a BackgroundTodoDeltaTracker (peek/commit semantics with high-watermark tracking).
  • A BackgroundTodoProcessor decides when to fire using a tiered invocation policy (skip / wait / run) based on meaningful vs. context-only tool calls and an experiment gate.
  • When triggered, renders a compressed history (grouped progress + latest-round detail + assistant context + subagent digests) against the copilot-fast endpoint and invokes manage_todo_list.
  • Coalesces concurrent updates so at most one pass runs at a time, and runs a final-review pass after the agent loop ends to capture work completed on the last round.
  • Hides the regular todo prompt instructions when the experiment is enabled.

Notable details

  • New experiment key: BackgroundTodoAgentEnabled.
  • Telemetry event backgroundTodoAgent tracks outcome, duration, and token usage.

vritant24 and others added 21 commits May 1, 2026 13:24
Add a new experiment-based setting under ConfigKey.Advanced:
- github.copilot.chat.agent.backgroundTodoAgent.enabled (default: false)

When enabled, this gate controls whether the main agent's
manage_todo_list tool is disabled and a background copilot-fast
model maintains the todo list instead.

Add hideTodoPromptInstructions prop to AgentPromptProps and
DefaultAgentPromptProps. When true, both todo-tool guidance and
markdown-checkbox fallback sections are suppressed silently.

Updated prompt files:
- agentPrompt.tsx: thread prop to system prompt classes
- defaultAgentInstructions.tsx: suppress ternary in AlternateGPTPrompt
- gpt5Prompt.tsx, gpt51Prompt.tsx, gpt52Prompt.tsx, gpt53CodexPrompt.tsx:
  guard fallback blocks with hideTodoPromptInstructions check

New files for the background todo agent:

- backgroundTodoDelta.ts: High-watermark cursor that tracks which
  tool-call rounds have been processed. Produces deltas with only
  new activity for efficient background passes.

- backgroundTodoProcessor.ts: State machine (Idle/InProgress/Failed)
  with cancellation, concurrent update coalescing, and automatic
  cursor advancement on success or failure.

- backgroundTodoPrompt.tsx: Prompt-tsx element for copilot-fast with
  prioritized sections (system > user request > current todos > deltas)
  so prompt-tsx pruning preserves critical context.
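The processor's coalescing behaviour can be pictured with a small synchronous sketch. Only the state names (Idle/InProgress/Failed) come from the commit message above; the class, its method names, and the "newest delta wins" rule are assumptions made for illustration, and cancellation and cursor advancement are left out.

```typescript
// Hypothetical sketch: names and the "latest delta wins" rule are assumptions.
type ProcessorState = 'Idle' | 'InProgress' | 'Failed';

class CoalescingProcessor<TDelta> {
  private state: ProcessorState = 'Idle';
  private pending: TDelta | undefined;

  getState(): ProcessorState {
    return this.state;
  }

  /**
   * Offer a delta. Returns the delta if the caller should start a pass now,
   * or undefined if a pass is already running and the delta was queued.
   */
  submit(delta: TDelta): TDelta | undefined {
    if (this.state === 'InProgress') {
      this.pending = delta; // assumption: the newest delta replaces any queued one
      return undefined;
    }
    this.state = 'InProgress';
    return delta;
  }

  /**
   * Report the end of the active pass. Returns the queued delta to run next,
   * or undefined when nothing is waiting.
   */
  complete(succeeded: boolean): TDelta | undefined {
    const next = this.pending;
    this.pending = undefined;
    if (next !== undefined) {
      return next; // stay InProgress and run the queued work
    }
    this.state = succeeded ? 'Idle' : 'Failed';
    return undefined;
  }
}
```

With this shape, any number of updates arriving during a pass collapse into one follow-up pass, so at most one pass is ever active.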
In AgentIntent:
- Add session-keyed BackgroundTodoProcessor map with cleanup on dispose
- Add getOrCreateBackgroundTodoProcessor getter

In AgentIntentInvocation:
- Add isBackgroundTodoAgentEnabled helper and isTodoToolExplicitlyEnabled
  (checks request.toolReferences for #todo mention)
- Disable manage_todo_list in getAgentTools when experiment is on
- Set hideTodoPromptInstructions on AgentPromptProps
- Trigger _maybeStartBackgroundTodoPass after main prompt render
- _executeBackgroundTodoPass: resolve copilot-fast endpoint, render
  BackgroundTodoPrompt, send request with single tool schema, parse
  and invoke manage_todo_list tool call
- Add GDPR-classified backgroundTodoAgent telemetry event

- backgroundTodoDelta.spec.ts (7 tests): first invocation, round
  tracking, cursor advancement, history turns, markRoundsProcessed, reset

- backgroundTodoProcessor.spec.ts (8 tests): state transitions,
  cursor advancement on success/failure, concurrent coalescing,
  cancellation, parent token propagation

- backgroundTodoEnablement.spec.ts (9 tests):
  - isTodoToolExplicitlyEnabled: empty refs, unrelated tools, #todo ref,
    manage_todo_list ref, mixed refs, default tool picker state
  - getAgentTools integration: experiment on disables tool, tool picker
    default does not override, unrelated toolReferences do not override
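The enablement cases listed above suggest a check along the following lines. This is a sketch under stated assumptions: the `ToolReference` shape, the reference name `todo` for a `#todo` mention, and the rule that an explicit reference keeps the tool on the main agent are all inferred from the test descriptions, not taken from the code.

```typescript
// Hypothetical shape: only the `name` field of a tool reference is modelled.
interface ToolReference {
  name: string;
}

// Assumption: a "#todo" mention surfaces as a reference named 'todo',
// and the tool can also be referenced directly as 'manage_todo_list'.
const TODO_REFERENCE_NAMES: ReadonlySet<string> = new Set(['todo', 'manage_todo_list']);

function isTodoToolExplicitlyEnabled(refs: readonly ToolReference[] | undefined): boolean {
  return (refs ?? []).some(ref => TODO_REFERENCE_NAMES.has(ref.name));
}

// Assumption: an explicit reference keeps manage_todo_list available to the
// main agent even when the experiment is on; otherwise the experiment removes it.
function isTodoToolAvailableToMainAgent(
  experimentOn: boolean,
  refs: readonly ToolReference[] | undefined
): boolean {
  return !experimentOn || isTodoToolExplicitlyEnabled(refs);
}
```

Unrelated references and an empty list leave the gate untouched, which matches the "does not override" cases in the spec list.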
…taTracker

- Rename getDelta to peekDelta (does not advance cursor)
- Keep getDelta as convenience alias
- Add IBackgroundTodoDeltaMetadata with newRoundCount, newToolCallCount,
  isInitialDelta, isRequestOnly
- Add tests for metadata fields and peek/commit semantics

- Add shouldRun() method with typed Run/Wait/Skip decisions and reasons
- Add IBackgroundTodoPolicyInput for external context
- Add executePass() convenience method using built-in execution logic
- Move _executeBackgroundTodoPass and telemetry into processor as
  static _doExecute and _sendTelemetry
- Add IBackgroundTodoExecutionContext for service injection
- Add 8 policy unit tests in backgroundTodoPolicy.spec.ts

- Replace 6 inline guard checks with processor.shouldRun() call
- Use processor.executePass() instead of processor.start() with callback
- Remove _executeBackgroundTodoPass and _sendBackgroundTodoTelemetry
  (moved to BackgroundTodoProcessor in previous commit)
- Remove unused imports (IBackgroundTodoDelta, BackgroundTodoPrompt, etc.)
- Add BACKGROUND_TODO_AGENT.md architecture documentation
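The peek/commit split on the delta tracker described above can be sketched as a high-watermark cursor. The method names `peekDelta`, `markRoundsProcessed`, and `reset`, and the metadata field names, come from the commit messages; the `Round` shape, the `endIndex` field, and the exact meaning given to `isInitialDelta` and `isRequestOnly` are assumptions for illustration.

```typescript
// Hypothetical, simplified round shape.
interface Round {
  id: string;
  toolCalls: string[];
}

interface Delta {
  rounds: Round[];
  endIndex: number;        // hypothetical: total rounds seen when the delta was taken
  newRoundCount: number;
  newToolCallCount: number;
  isInitialDelta: boolean; // assumption: true until the first commit
  isRequestOnly: boolean;  // assumption: true when no tool-call rounds exist yet
}

class DeltaTracker {
  private highWatermark = 0; // number of rounds already committed
  private committedOnce = false;

  // Peek: compute the unprocessed rounds without moving the cursor.
  peekDelta(allRounds: readonly Round[]): Delta {
    const rounds = allRounds.slice(this.highWatermark);
    return {
      rounds,
      endIndex: allRounds.length,
      newRoundCount: rounds.length,
      newToolCallCount: rounds.reduce((sum, r) => sum + r.toolCalls.length, 0),
      isInitialDelta: !this.committedOnce,
      isRequestOnly: allRounds.length === 0,
    };
  }

  // Commit: advance the cursor past the rounds a finished pass consumed.
  // Math.max keeps the cursor monotonic if an older delta is committed late.
  markRoundsProcessed(delta: Delta): void {
    this.highWatermark = Math.max(this.highWatermark, delta.endIndex);
    this.committedOnce = true;
  }

  reset(): void {
    this.highWatermark = 0;
    this.committedOnce = false;
  }
}
```

Because peeking never moves the cursor, a pass that is skipped or deferred by the policy does not lose rounds; they reappear in the next delta.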
…und todo processor

- Add meaningfulToolCallCount and contextToolCallCount to delta metadata
  using classifyTool() to distinguish meaningful/context/excluded tools
- Implement tiered shouldRun() policy:
  - Initial request-only → create plan if no todos exist
  - Meaningful activity (≥1 call) → run immediately
  - Context-only activity → batch by threshold (5 calls)
  - Below threshold → wait for more activity
- Add todoListExists to IBackgroundTodoPolicyInput
- Track hasCreatedTodos on processor (set on outcome: 'success')
- Move history compression (classifyTool, extractTarget, compressHistory,
  collectAllRounds, renderGroupedProgress, renderLatestRound) into processor
- Update BackgroundTodoPrompt to render from IBackgroundTodoHistory with
  prioritized sections: latest round (850), assistant context (820),
  grouped progress (800)
- Add progress signal guidance to system prompt
- Update policy tests: 18 cases covering all decision paths

- classifyTool: context/meaningful/excluded/unknown/subagent (6 tests)
- extractTarget: file paths, terminal, tests, subagent, unparseable (7 tests)
- collectAllRounds: history + current ordering, empty (2 tests)
- compressHistory: empty, single round, grouping by file, meaningful-first
  sorting, excluded filtering, truncation, assistant context (8 tests)
- renderGroupedProgress: empty, dedup, context count (3 tests)
- renderLatestRound: targets, empty response (2 tests)

Previously executePass was invoked for both Run and Wait decisions, relying on the processor to coalesce. That coupling made it harder to reason about when the bg agent fires. Tighten the call site to only fire on Run, and pass an ILogService into the processor so its policy/coalescing decisions show up in logs.
Policy: drop the context-only firing branch so research-only requests (e.g. "read these files and summarise") with many read_file/list_dir calls but no mutating actions never trigger the bg agent. The initial-request case now also waits for activity instead of guessing a plan from the user message alone.
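A minimal sketch of the resulting policy, assuming a simplified input of tool names rather than the real delta metadata. `read_file` and `list_dir` are the context-only examples named above; the excluded set, the treatment of unlisted tools as meaningful, the use of Wait (rather than Skip) for context-only activity, and the reason strings are assumptions.

```typescript
type ToolClass = 'meaningful' | 'context' | 'excluded';

// read_file and list_dir are the context-only tools named in the commit message.
const CONTEXT_TOOLS: ReadonlySet<string> = new Set(['read_file', 'list_dir']);
// Assumption: the todo tool itself does not count as agent activity.
const EXCLUDED_TOOLS: ReadonlySet<string> = new Set(['manage_todo_list']);

function classifyTool(name: string): ToolClass {
  if (EXCLUDED_TOOLS.has(name)) {
    return 'excluded';
  }
  if (CONTEXT_TOOLS.has(name)) {
    return 'context';
  }
  return 'meaningful'; // assumption: anything unlisted is treated as a mutating action
}

interface Decision {
  kind: 'Run' | 'Wait' | 'Skip';
  reason: string; // hypothetical reason strings
}

function shouldRun(experimentEnabled: boolean, newToolCalls: readonly string[]): Decision {
  if (!experimentEnabled) {
    return { kind: 'Skip', reason: 'experimentDisabled' };
  }
  const meaningful = newToolCalls.filter(name => classifyTool(name) === 'meaningful').length;
  if (meaningful >= 1) {
    return { kind: 'Run', reason: 'meaningfulActivity' };
  }
  // Covers both the initial request with no activity and context-only rounds.
  return { kind: 'Wait', reason: 'noMeaningfulActivity' };
}
```

Under these assumptions a research-only request never reaches Run however many reads it performs, while a single edit triggers a pass immediately.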

History compression: pass toolCallResults into compressHistory and extract subagent (search/explore/execution/run) outputs as ISubagentDigest entries so the bg agent sees what exploration discovered, not just that it happened. Raise the latest-round assistant response cap to 1500 chars (older rounds stay at 400) and additionally surface the longest mid-trajectory assistant message — the typical place the agent states its plan.

Logging: add structured debug logs for every policy decision, pass start/completion, coalesced delta, and history-compression summary so background agent behaviour is traceable end-to-end.

Rewrites the background todo system prompt to bias toward complete, forward-looking plans rather than per-file mirroring of recent activity.

- Adds an explicit ABORT CONDITIONS block so research-only prompts ("read the following files", "do NOT write any code"), pure read-only activity, and one-item-per-file temptations short-circuit to silence.

- Adds a PLAN COMPLETENESS section requiring the list to cover the full user request and to be derived from the user request + agent's stated plan, with subagent findings and grouped progress as supporting evidence only.

- Renders the new subagentDigests block at lower priority with a label that explicitly tells the model not to mirror its structure as the todo list, so prompt-tsx prunes it before higher-signal context.
- Default to silence: enumerate the only conditions under which the bg agent should call manage_todo_list (creation, completion, advancing to next, genuinely new work). Forbid re-affirming, re-wording or re-marking 'in-progress'.

- Sequential execution: enforce exactly one 'in-progress' item; require the previous 'in-progress' to be marked 'completed' in the same update before promoting the next item.

- Status transitions: codify the allowed moves and forbid regression from 'completed'.

Reduces churn from speculative or duplicate updates and keeps execution strictly serial.
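The prompt rules above are instructions to the model, but two of them are mechanical enough to express as a check. The sketch below is illustrative only and is not part of the PR: the `TodoItem` shape, the `not-started` status name, and the function itself are assumptions. It flags more than one 'in-progress' item and any regression from 'completed'.

```typescript
// Assumption: 'not-started' is the third status; only 'in-progress' and
// 'completed' are named in the commit message.
type TodoStatus = 'not-started' | 'in-progress' | 'completed';

interface TodoItem {
  id: number; // hypothetical stable identifier
  title: string;
  status: TodoStatus;
}

function findViolations(previous: readonly TodoItem[], next: readonly TodoItem[]): string[] {
  const violations: string[] = [];

  // Sequential execution: never more than one item in progress.
  const inProgress = next.filter(item => item.status === 'in-progress');
  if (inProgress.length > 1) {
    violations.push(`${inProgress.length} items are in-progress; at most one is allowed`);
  }

  // Status transitions: a completed item must stay completed.
  const previousById = new Map<number, TodoItem>();
  for (const item of previous) {
    previousById.set(item.id, item);
  }
  for (const item of next) {
    const before = previousById.get(item.id);
    if (before !== undefined && before.status === 'completed' && item.status !== 'completed') {
      violations.push(`item ${item.id} regressed from completed to ${item.status}`);
    }
  }
  return violations;
}
```

An update that marks the current item 'completed' and promotes the next one in the same call passes both checks, which is the transition the prompt asks for.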
Previously compressHistory truncated assistant responses to 400/1500 chars and only kept 3 hand-picked snippets (latest + first + longest middle). This dropped useful planning context before prompt-tsx ever saw it.

- Remove MAX_RESPONSE_LENGTH, MAX_LATEST_RESPONSE_LENGTH and the truncateResponse helper.

- extractAssistantContext now returns every non-empty assistant response in chronological order, untruncated.

- BackgroundTodoPrompt renders each snippet as its own UserMessage with priority Math.max(700, 850 - age * 30) so the prompt-tsx renderer prunes the oldest snippets first when the budget is tight.

- Update history specs for the new no-truncation / chronological behaviour.
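As a worked example of the priority formula above, the snippet below evaluates `Math.max(700, 850 - age * 30)` for a few ages. The formula is from the commit message; the function name and the convention that age 0 is the most recent snippet are assumptions.

```typescript
// Assumption: age 0 is the most recent assistant snippet, age 1 the one before it, and so on.
function snippetPriority(age: number): number {
  return Math.max(700, 850 - age * 30);
}

const priorities = [0, 1, 2, 3, 4, 5, 6, 10].map(snippetPriority);
// → [850, 820, 790, 760, 730, 700, 700, 700]
```

The priority drops by 30 per step and reaches the floor of 700 at age 5, so all snippets of age 5 or older share the same priority and the formula alone no longer distinguishes between them.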
Extract verbose inline JSX prompt text into reusable constants.
Add granularity rules to prefer 2-4 high-level phase items over
file-level implementation details. Strengthen sequential state and
ordering rules so items complete in order with skip-reordering.
Add new-task deduplication rules. Gate tool calls on an explicit
diff check to avoid redundant no-change updates.
… todos

Add executeFinalReview() that fires after the agent loop ends so the
last round's completions are not stuck as in-progress. Cache the most
recent execution context and guard against duplicate finalize passes.
Store the pending work callback alongside the pending delta so
coalesced finalize passes retain their isFinalReview closure.

Extract tool notes (explanation/description/goal) and surface them
in the latest-round detail. Support multi_replace_string_in_file
targets by reading paths from replacements[]. Pick the last
manage_todo_list call when the model emits multiple in one response.
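Two of the behaviours above are easy to picture in code. This is a sketch, not the PR's implementation: the `ToolCall` shape, both function names, and the `filePath` field inside each `replacements[]` entry are assumptions.

```typescript
// Hypothetical tool-call shape: a name plus JSON-encoded arguments.
interface ToolCall {
  name: string;
  arguments: string;
}

// Pick the last manage_todo_list call when a response contains several.
function pickLastTodoCall(calls: readonly ToolCall[]): ToolCall | undefined {
  for (let i = calls.length - 1; i >= 0; i--) {
    if (calls[i].name === 'manage_todo_list') {
      return calls[i];
    }
  }
  return undefined;
}

// Read target paths from replacements[] for multi_replace_string_in_file.
// Assumption: each replacement carries its path in a `filePath` string field.
function extractMultiReplaceTargets(argsJson: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(argsJson);
  } catch {
    return []; // unparseable arguments yield no targets
  }
  const replacements = (parsed as { replacements?: unknown } | null)?.replacements;
  if (!Array.isArray(replacements)) {
    return [];
  }
  const seen = new Set<string>(); // de-duplicate while keeping first-seen order
  for (const entry of replacements) {
    const path = (entry as { filePath?: unknown } | null)?.filePath;
    if (typeof path === 'string') {
      seen.add(path);
    }
  }
  return Array.from(seen);
}
```

Taking the last call means a model that revises its own todo update within one response is judged by its final version.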
Cover executeFinalReview no-ops (no context, no todos created).
Verify coalesced pending delta runs its own queued work callback.
Add extractTarget tests for multi-edit single/few/many file paths.
Test that explanation/description notes attach to latestRound summaries.
… referenced

Co-authored-by: Copilot <copilot@github.com>
Copilot AI review requested due to automatic review settings May 2, 2026 23:20
@vritant24 vritant24 self-assigned this May 2, 2026
@vritant24 vritant24 marked this pull request as draft May 2, 2026 23:21
@github-actions
Contributor

github-actions Bot commented May 2, 2026

Screenshot Changes

Base: 85f836e2 Current: d83916b0

Changed (2)

  • agentSessionsViewer/WithBadge/Dark (before/after screenshots)
  • agentSessionsViewer/WithBadge/Light (before/after screenshots)

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a background todo-maintenance path to the Copilot agent flow so todo lists can be updated by a separate copilot-fast pass instead of relying on the main agent model to call manage_todo_list directly. It touches the agent intent/prompt pipeline, adds a new background processor + prompt, and updates prompt behavior/tests around todo enablement.

Changes:

  • Adds a session-scoped background todo processor and delta tracker that decide when to run, render a compressed history prompt, and invoke manage_todo_list.
  • Wires the feature into AgentIntent, including experiment-gated tool availability and prompt instruction suppression.
  • Adds supporting tests and documentation for enablement, policy, delta tracking, history compression, and processor state handling.

Summary per file

| File | Description |
| --- | --- |
| extensions/copilot/src/platform/configuration/common/configurationService.ts | Adds the experiment-backed config key for the background todo agent. |
| extensions/copilot/src/extension/prompts/node/agent/test/backgroundTodoProcessor.spec.ts | Adds unit tests for processor state transitions, coalescing, cancellation, and final review behavior. |
| extensions/copilot/src/extension/prompts/node/agent/test/backgroundTodoPolicy.spec.ts | Adds policy tests for run/wait/skip decisions based on prompt context and tool activity. |
| extensions/copilot/src/extension/prompts/node/agent/test/backgroundTodoHistory.spec.ts | Adds tests for tool classification, history compression, rendering, and subagent digest extraction. |
| extensions/copilot/src/extension/prompts/node/agent/test/backgroundTodoDelta.spec.ts | Adds tests for delta tracking, metadata, and peek/commit semantics. |
| extensions/copilot/src/extension/prompts/node/agent/openai/gpt5Prompt.tsx | Suppresses some todo/planning guidance when the background agent is enabled. |
| extensions/copilot/src/extension/prompts/node/agent/openai/gpt53CodexPrompt.tsx | Suppresses some todo/planning guidance for the Codex prompt variant. |
| extensions/copilot/src/extension/prompts/node/agent/openai/gpt52Prompt.tsx | Suppresses some todo/planning guidance for the GPT-5.2 prompt variant. |
| extensions/copilot/src/extension/prompts/node/agent/openai/gpt51Prompt.tsx | Suppresses some todo/planning guidance for the GPT-5.1 prompt variant. |
| extensions/copilot/src/extension/prompts/node/agent/defaultAgentInstructions.tsx | Adds the shared hideTodoPromptInstructions prompt prop and uses it in alternate GPT instructions. |
| extensions/copilot/src/extension/prompts/node/agent/backgroundTodoPrompt.tsx | Introduces the prompt-tsx element used to drive the background todo model. |
| extensions/copilot/src/extension/prompts/node/agent/backgroundTodoProcessor.ts | Implements the background processor, execution flow, history compression, and telemetry. |
| extensions/copilot/src/extension/prompts/node/agent/backgroundTodoDelta.ts | Implements per-session delta/high-watermark tracking for new tool-call rounds. |
| extensions/copilot/src/extension/prompts/node/agent/agentPrompt.tsx | Threads the new hide-todo-instructions prop into prompt selection. |
| extensions/copilot/src/extension/prompts/node/agent/BACKGROUND_TODO_AGENT.md | Documents architecture, enablement logic, request flow, and tests for the feature. |
| extensions/copilot/src/extension/intents/node/test/backgroundTodoEnablement.spec.ts | Adds tests for explicit todo-tool enablement and agent tool gating. |
| extensions/copilot/src/extension/intents/node/agentIntent.ts | Wires background todo processors into agent sessions, tool filtering, prompt props, and final-review execution. |

Copilot's findings

  • Files reviewed: 16/16 changed files
  • Comments generated: 11

Comment thread extensions/copilot/src/extension/prompts/node/agent/openai/gpt51Prompt.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/openai/gpt52Prompt.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/backgroundTodoDelta.ts Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/backgroundTodoProcessor.ts Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/backgroundTodoProcessor.ts Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/backgroundTodoProcessor.ts Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/backgroundTodoPrompt.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/openai/gpt5Prompt.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/openai/gpt53CodexPrompt.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/defaultAgentInstructions.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/agentPrompt.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/backgroundTodoProcessor.ts Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/backgroundTodoProcessor.ts Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/backgroundTodoDelta.ts Outdated
@vritant24 vritant24 marked this pull request as ready for review May 4, 2026 03:14
vritant24 and others added 3 commits May 3, 2026 20:14
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Comment thread extensions/copilot/src/extension/prompts/node/agent/openai/gpt51Prompt.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/openai/gpt53CodexPrompt.tsx Outdated
Comment thread extensions/copilot/src/extension/prompts/node/agent/openai/gpt5Prompt.tsx Outdated
bhavyaus previously approved these changes May 4, 2026
vritant24 and others added 2 commits May 3, 2026 20:47
roblourens previously approved these changes May 4, 2026
@vritant24 vritant24 merged commit db9c113 into main May 4, 2026
26 checks passed
@vritant24 vritant24 deleted the dev/vritant24/bg-todo-watcher branch May 4, 2026 04:43
@vs-code-engineering vs-code-engineering Bot added this to the 1.119.0 milestone May 4, 2026
5 participants