## Summary
The Copilot CLI's hosted mode (no BYOK; default Copilot Enterprise
premium-request path) authenticates and runs the agent loop
correctly, but it does not autonomously call `write` / `edit` /
`create_file` tools even when the prompt explicitly asks for code
to be generated and the workspace is empty. The agent makes
exploratory tool calls (`view`, `glob`, `shell`, `grep`) and emits
text responses, then closes the session without writing any files.
This makes the Copilot CLI unsuitable for unattended autonomous
codegen workloads (e.g. CI pipelines, factory tiers) regardless of
which model is selected.
## Environment

- copilot CLI: 0.0.353 / 1.0.36-0
- github-copilot-sdk: 0.3.0 (Python wrapper)
- Auth: GitHub fine-grained PAT (`github_pat_*`) with Copilot scope
  in the `COPILOT_GITHUB_TOKEN` env var
- Models tried (selected via the SDK `model=` parameter):
  `claude-sonnet-4.5`, `gpt-5`
## Reproducer

```python
import asyncio

from copilot import CopilotClient, SubprocessConfig


async def main() -> None:
    client = CopilotClient(
        config=SubprocessConfig(
            github_token="<github_pat_with_copilot_scope>",
            use_logged_in_user=True,
        ),
    )
    session = await client.create_session(
        model="claude-sonnet-4.5",  # also seen with gpt-5
        working_directory="/tmp/empty-dir",
        github_token="<github_pat_with_copilot_scope>",
        on_permission_request=async_handler_returning_allow,
    )
    # Realistic codegen prompt — full NLSpec/LangSpec inlined,
    # detailed task description, target package layout, acceptance
    # criteria, the works. Same prompt that produces 14 passing
    # files via Anthropic's Claude Agent SDK.
    response = await session.send_and_wait(prompt, timeout=60 * 30)
    print(await session.get_messages())  # zero write tool calls


asyncio.run(main())
```
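`async_handler_returning_allow` above is our own helper, not an SDK export. As a rough sketch of what it does (the request/response dict shape is an assumption from our wrapper, not a documented Copilot SDK contract):

```python
# Hypothetical sketch of the auto-allow handler passed to
# on_permission_request. The dict shapes here are assumptions from
# our wrapper, not a documented Copilot SDK contract; adapt to
# whatever the SDK actually sends.
async def async_handler_returning_allow(request: dict) -> dict:
    # Log what the agent asked for, then approve unconditionally.
    # Appropriate only for throwaway sandboxes like /tmp/empty-dir.
    print(f"permission requested: {request.get('tool', '<unknown>')}")
    return {"behavior": "allow"}
```

With this handler every wire-level permission request is granted, so any refusal to write cannot be blamed on denied approvals.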
## Expected behavior

The agent — given an explicit prompt asking for code to be written
to a workspace, with permissions auto-granted — calls `write_file`
/ `edit_file` / `create_file` (or whatever tool name the CLI's tool
catalog uses) to produce the requested artifacts. The same prompt
produces working code via Anthropic's Claude Agent SDK
(`claude_agent_sdk.query`) and OpenAI's Codex SDK
(`codex_app_server.Codex`).
## Actual behavior

- 12 successful LLM round-trips against `claude-sonnet-4.5` over
  ~110 seconds; ~12 premium requests consumed.
- Tool calls observed in the Copilot CLI log: `view` ×4, `glob` ×3,
  `shell` ×2, `grep` ×1. Zero `write` / `edit` / `create_file` calls.
- Session closes cleanly without errors. Workspace stays empty.
- Same outcome with `gpt-5` instead of Claude Sonnet 4.5
  (different scaffolding test under BYOK→OpenAI direct).
- Same outcome with `qwen3-coder-next:51b` under BYOK→Ollama (where
  the CLI auto-allows but the model still doesn't commit writes).
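The zero-write outcome is mechanically detectable, which matters for unattended runs. A minimal sketch of a guard over the transcript returned by `session.get_messages()` (the `tool_calls`/`name` dict shape is an assumed normalization of our wrapper's output, not a documented Copilot CLI schema):

```python
from collections import Counter

# Tools that mutate the workspace, vs. read-only exploration.
# Tool names vary by CLI version, so we match a superset.
WRITE_TOOLS = {"write", "edit", "create_file", "write_file", "edit_file"}


def tally_tool_calls(messages: list[dict]) -> Counter:
    """Count tool calls by name across a transcript of message dicts.

    Assumes each message may carry a "tool_calls" list of
    {"name": ...} entries. That shape is an assumption about our
    wrapper's normalized output, not a documented schema.
    """
    counts: Counter = Counter()
    for msg in messages:
        for call in msg.get("tool_calls", []):
            counts[call["name"]] += 1
    return counts


def wrote_anything(messages: list[dict]) -> bool:
    """True if any write-capable tool was called at least once."""
    return any(name in WRITE_TOOLS for name in tally_tool_calls(messages))
```

A CI runner can fail fast on `wrote_anything(...) == False` instead of discovering an empty workspace downstream.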
The pattern is consistent: the Copilot CLI's agent loop appears
biased against autonomous file writes regardless of the underlying
model. The CLI's system prompt or tool-use posture is likely tuned
for interactive developer assistance ("answer the question,
suggest, let the human review") rather than autonomous codegen.
## Why this matters
We're standing up a code-agent abstraction layer above Microsoft
Agent Framework where each vendor's CLI / SDK is a peer adapter.
Anthropic's Claude Agent SDK (`claude-sonnet-4.5`) and OpenAI's
Codex SDK (`gpt-5.4` via Foundry) both produce passing test suites
on the same prompt that yields zero files via Copilot CLI. The
Copilot CLI is the outlier; the model is fine.
For organizations with GitHub Copilot Enterprise quota who want to
use Copilot CLI for unattended autonomous codegen (CI runners,
factory tiers, batch jobs), this behavior gap is a hard block —
even though the auth, billing, and wire layers all work.
## Suggested fixes

- Add a `permission_mode: "bypassPermissions"`-style option
  that biases the agent loop toward autonomous file writes when
  the calling environment is non-interactive, mirroring the Claude
  Agent SDK's `permission_mode="bypassPermissions"`. The
  `on_permission_request` callback handles the wire-level
  approval; the agent loop's "should I write or just suggest"
  disposition is separate and currently leans "suggest."
- Document the autonomous-codegen posture clearly, so operators
  know the CLI is interactive-first and should not expect it to
  write files autonomously without explicit prompting tweaks.
- Surface a system-prompt override / agent-mode selector so
  the SDK caller can opt the agent loop into
  "implement-and-write" mode (the current behavior would remain
  the default for human-in-the-loop usage).
## Workarounds we don't think work cleanly

- Prompt engineering ("you MUST write the files using the
  `write_file` tool, ABSOLUTELY DO NOT just describe what you
  would do"): the agent's posture appears system-prompt-rooted,
  not user-prompt steerable in our experiments.
- Pre-creating empty stub files in the workspace: the agent calls
  `view` to inspect them but still doesn't `edit` to populate them.
- Different model selection: tested with `claude-sonnet-4.5`,
  `gpt-5`, and `qwen3-coder-next:51b` (BYOK Ollama). Same outcome
  across all three.
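The remaining fallback is scraping the agent's text response for fenced code blocks and writing the files ourselves, which reintroduces exactly the brittleness the agent loop is supposed to remove. A minimal sketch, assuming (optimistically) that the model labels each fence with a `path=` hint, which in practice it rarely does consistently:

```python
import re
from pathlib import Path

# Matches fences like: ```python path=src/app.py ... ```
# The "path=" labelling convention is our own assumption; models
# rarely emit it reliably, which is why this is not a clean
# workaround either.
FENCE = re.compile(r"```\w*\s+path=(\S+)\n(.*?)```", re.DOTALL)


def scrape_files(response_text: str, root: Path) -> list[Path]:
    """Write every path-labelled fenced block under root; return paths."""
    written: list[Path] = []
    for rel_path, body in FENCE.findall(response_text):
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(body)
        written.append(target)
    return written
```

Unlabelled fences, partial files, and "rest unchanged" ellipses all defeat this, so we treat it as a measure of the gap rather than a fix.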
Companion issue: #
Related (different scope):