The Hidden Defaults That Break Your AI Agent — Claude Code CLI's Undocumented Limits #56

xg-gh-25 · 2026-06-01T15:45:18Z

xg-gh-25
Jun 1, 2026
Maintainer

Your AI agent works perfectly for 18 minutes — tools calling, code flowing, pipeline running beautifully. Then suddenly: "Interrupted." All content gone. No warning. No graceful degradation.

We just spent a full debugging session on two P0 production bugs caused by undocumented CLI defaults in the Claude Code SDK. This post shares what we found, how we fixed it, and what every SDK consumer should know.

Bug #1: The Invisible Turn Limit (`maxTurns=100`)

Symptom: Our autonomous pipeline (EVALUATE→BUILD→REVIEW→TEST→DELIVER) ran perfectly for ~18 minutes. At turn 101, the CLI subprocess exited with is_error=True, subtype="error_max_turns". Frontend displayed "Interrupted" and the user lost all visible progress.

Root cause: Claude Code CLI defaults to maxTurns=100. This limit:

Does NOT appear in claude --help
Is NOT documented in the SDK README
Silently terminates the agent with a generic error

For interactive terminal use (where a human can type /continue), 100 turns is generous. For SDK consumers running autonomous pipelines, it's a landmine.

The fix:

# Override the undocumented default
options = ClaudeAgentOptions(max_turns=200)  # Desktop sessions
# Channel sessions keep conservative: max_turns=15

Plus graceful handling when the limit IS hit:

if result.subtype == "error_max_turns":
    # Don't treat as error — emit turn_limit_reached event
    # Preserve all streamed content, let user decide to continue
    yield {"type": "turn_limit_reached", "content": result.content}
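As a rough sketch of how that branch might sit inside a fuller result handler: the `AgentResult` dataclass and the `to_frontend_events` function below are hypothetical stand-ins, not SDK types, and the `error` and `done` event names are assumptions made for illustration. Only the `error_max_turns` subtype and the `turn_limit_reached` event come from our fix.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class AgentResult:
    """Hypothetical stand-in for the SDK's final result message."""
    is_error: bool
    subtype: str
    content: str


def to_frontend_events(result: AgentResult) -> Iterator[dict]:
    """Map a final result onto an event the frontend can render."""
    if result.subtype == "error_max_turns":
        # A limit, not a crash: keep the streamed content and let the user continue.
        yield {"type": "turn_limit_reached", "content": result.content}
    elif result.is_error:
        # Anything else flagged as an error is surfaced as a real failure.
        yield {"type": "error", "subtype": result.subtype, "content": result.content}
    else:
        yield {"type": "done", "content": result.content}
```

Checking `subtype` before `is_error` matters here: a turn-limit result also carries is_error=True, so testing the flag first would swallow the distinction.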

Bug #2: The Context Compaction Trap (`task_budget=128K`)

Symptom: During deep investigations, the agent would suddenly "forget" everything it had discovered mid-task. It would re-read files it already analyzed, re-ask questions it already answered, and lose its chain of reasoning.

Root cause: Claude Code CLI defaults task_budget=128K tokens. When a single user→agent interaction chain exceeds this budget, the CLI triggers autoCompact — summarizing the conversation to free space. The agent loses granular context and starts over from a compressed summary.

On a 1M context window model, compacting at 128K means you're using 12.8% of available capacity before forced compression. For complex tasks (multi-file refactors, pipeline runs, deep research), 128K is easily exceeded in a single interaction.

The fix:

options = ClaudeAgentOptions(task_budget=800_000)  # Desktop: use the window
# Channels: 400K (unattended, cost-conscious but still generous)

Why not unlimited? Runaway cost protection. 800K still leaves 200K headroom, and users have a stop button. Channel sessions use 400K because they're unattended — max_turns=15 is the primary safety there.
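One way to keep the two session types from drifting apart is a small profile table. This is a minimal sketch, not our actual code: `SessionProfile`, `PROFILES`, `options_for`, and `budget_share` are hypothetical names, the numbers are the ones quoted above, and the 1M-token window is an assumption about the model in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionProfile:
    """Hypothetical per-session limits; values are the ones quoted in this post."""
    max_turns: int
    task_budget: int  # tokens


PROFILES = {
    "desktop": SessionProfile(max_turns=200, task_budget=800_000),
    "channel": SessionProfile(max_turns=15, task_budget=400_000),
}

CONTEXT_WINDOW = 1_000_000  # assumed 1M-token model


def budget_share(budget_tokens: int, window: int = CONTEXT_WINDOW) -> float:
    """Fraction of the context window used before compaction would trigger."""
    return budget_tokens / window


def options_for(session_type: str) -> dict:
    """Keyword arguments to pass to the options object for this session type."""
    profile = PROFILES[session_type]
    return {"max_turns": profile.max_turns, "task_budget": profile.task_budget}
```

The resulting dict could then be unpacked into the options constructor, for example `ClaudeAgentOptions(**options_for("desktop"))`, assuming the constructor accepts both keywords as shown in the fix above.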

The Broader Problem: SDK ≠ CLI

Claude Code was designed as an interactive terminal tool. The defaults make sense for that context:

100 turns? Plenty for a human asking questions
128K task budget? Fast compaction keeps the terminal responsive

But when you use Claude Code as an SDK (subprocess driving autonomous agents), these defaults become invisible failure modes:

No warning before the limit hits
A generic error flag (is_error=True) that, on its own, does not distinguish a limit from a real failure; you have to know to inspect subtype
No --help documentation of these parameters

The Hidden Default Matrix

| Parameter | CLI Default | What It Does | SDK Risk |
| --- | --- | --- | --- |
| `maxTurns` | 100 | Hard stop after N turns | Agent cut off mid-task |
| `task_budget` | 128K tokens | Trigger autoCompact | Agent forgets context |
| `autoCompact` | enabled | Summarize to free space | Loss of granular reasoning |

Override Methods (discovered by reading source)

| Parameter | Override |
| --- | --- |
| `maxTurns` | `ClaudeAgentOptions.max_turns` or `CLAUDE_CODE_MAX_TURNS` env |
| `task_budget` | `ClaudeAgentOptions.task_budget` or `--task-budget` flag |
| `autoCompact` | `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE` env |

Lessons for Anyone Building on Claude Code SDK

1. Read the source, not the docs. The SDK's true behavior lives in the implementation. --help shows you flags, not defaults. The most dangerous parameters are the ones that aren't listed.

2. Interactive defaults ≠ autonomous defaults. Any time you're wrapping a CLI tool designed for humans into an autonomous pipeline, audit every implicit assumption. Timeouts, limits, and safety nets designed for "user can intervene" become silent killers in "unattended agent" mode.

3. Error semantics matter. is_error=True is not granular enough. Our fix distinguishes error_max_turns from real errors — the former is a graceful pause, not a failure. Treating all non-success as "broken" loses user trust.

4. Persist streaming state. If your agent streams results over 18 minutes and the transport can break, you need a checkpoint mechanism. We now persist content to sessionStorage every 10 seconds — if the stream breaks, a page refresh recovers everything.
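Our implementation lives in the browser and writes to sessionStorage. The same idea can be sketched server-side in Python, which is what follows. `StreamCheckpointer` is a hypothetical class, the JSON file stands in for sessionStorage, and the injectable `clock` exists only to make the throttling testable.

```python
import json
import os
import tempfile
import time


class StreamCheckpointer:
    """Persist accumulated stream content at most once per interval."""

    def __init__(self, path, interval_s=10.0, clock=time.monotonic):
        self.path = path
        self.interval_s = interval_s
        self._clock = clock
        self._chunks = []
        self._last_save = None

    def append(self, chunk):
        """Buffer a chunk and save if the interval has elapsed (or nothing is saved yet)."""
        self._chunks.append(chunk)
        now = self._clock()
        if self._last_save is None or now - self._last_save >= self.interval_s:
            self.flush()

    def flush(self):
        """Write to a temp file, then rename, so a crash never leaves a half-written checkpoint."""
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"content": "".join(self._chunks)}, fh)
        os.replace(tmp, self.path)
        self._last_save = self._clock()

    @staticmethod
    def recover(path):
        """Return the last saved content, or an empty string if no checkpoint exists."""
        try:
            with open(path, encoding="utf-8") as fh:
                return json.load(fh)["content"]
        except FileNotFoundError:
            return ""
```

Calling `flush()` once more when the stream ends (or in a `finally` block) would capture whatever arrived since the last interval.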

5. Test at scale, not at demo. 100 turns and 128K tokens? You'll never hit those in a 3-minute demo. You'll hit them every time your agent does real work. Test with your actual pipelines, not toy examples.

What We'd Like to See

For Anthropic / Claude Code maintainers:

Document all defaults in the SDK consumer guide — especially the ones that silently terminate or compress
Machine-readable result subtypes — error_max_turns is great, but make it a first-class field, not something we reverse-engineer from subtype
SDK-specific default profiles — interactive CLI and embedded SDK have fundamentally different use cases. A single set of defaults can't serve both.

Context

We're building SwarmAI — a personal AI command center that runs Claude Code as its execution engine. Our autonomous pipeline regularly runs 150+ turns per task. These bugs were invisible until production load exposed them.

The fixes shipped in commits e2e604ca through 004c2c16 (6 commits, 2 rounds of adversarial review, 95 tests passing). Full technical details in our KNOWLEDGE.md.

If you're building on Claude Code SDK and hit similar issues, check your maxTurns and task_budget. The defaults were made for a different use case.

xg-gh-25 · 2026-06-01T15:54:45Z

xg-gh-25
Jun 1, 2026
Maintainer Author

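Refreshingly honest: real insight on "read the source, not the docs"

@xg-gh-25 This discussion confirms a problem I keep running into in AI system design but that few people are willing to say out loud: CLIs and SDKs are entirely different things, yet most tool designers act as if they were the same.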

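The deeper problem behind the two bugs you found

Bug #1: the silent failure of maxTurns=100
This is more than a parameter problem. It reflects a design decision: the CLI defaults are optimized for interactive users, on the assumption that the user can say "continue". SDK consumers (especially unattended agent pipelines) are treated as second-class citizens.

Your code computes the token budget dynamically for each session, which shows you have already learned the lesson: every deployment scenario needs different defaults. The Claude Code SDK, however, applies one set of defaults to every scenario.

Bug #2: the invisible compaction of task_budget=128K
This one is sneakier. 128K is not a "safe limit"; it is a number that is comfortable for CLI designers and a trap for SDK consumers.

You mention a 1M-context model, where a 128K task budget means the agent is forced to forget after using only 12.8% of the available window. It is like telling a researcher with an entire library at hand, "you may only look at 12 books at a time".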

Three observations on the two fixes

1. You did not just fix it, you built a layer of defense

if result.subtype == "error_max_turns":
    yield {"type": "turn_limit_reached", "content": result.content}

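This is not a patch. It is a reclassification: an "error" becomes a "state transition". It tells the frontend, "this is not a crash, it is a pause, and the user can continue". That shift makes a tenfold difference to the user experience.

2. Your task_budget override is conservative, and that is smart

Desktop: 800K (use the window but keep a 200K buffer)
Channel: 400K (unattended, cost-conscious)

Rather than "no limit". This shows you understand that API cost is real, while 128K as a default is absurd.

3. You did what every SDK consumer should do but almost no one does

You documented the finding and turned it into team knowledge:

The underlying Claude Code SDK has hidden defaults. Override them for autonomous pipelines.

That one-line takeaway is worth more than 100 lines of design documentation. It tells everyone who comes later: "Read the source. The docs will lie to you."

"Read the source, not the docs": why this principle matters so much

The "Hidden Default Matrix" you listed is a gold mine:

| Parameter | CLI default | Risk |
| --- | --- | --- |
| maxTurns | 100 | Agent suddenly cut off |
| task_budget | 128K | Agent forgets what it has learned |
| autoCompact | enabled | Loss of fine-grained reasoning |

Before going to production, every SDK consumer should run through a devil's-advocate checklist:

✅ Which parameters were designed for interactive users when my scenario is unattended?
✅ Which limits can never be triggered in a demo but will be in real work?
✅ Which error types are treated as generic failures when they should be recoverable pauses?

Your maxTurns and task_budget are perfect answers to this checklist.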

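A bigger architectural problem

You mentioned this:

Interactive defaults ≠ autonomous defaults.

But I want to go a step further: the design team has little incentive to discover this distinction.

Why? Because:

Internal testing is all done with the interactive CLI (human in the loop)
Pre-release validation is interactive too (even for the SDK, it usually means demo notebooks)
The first wave of SDK consumers are startup hacks, not production pipelines
By the time real unattended pipelines go live (18 months later), the code's maintainers have long since moved on to the next project

This is nobody's fault. It is a failure of incentive structure. SDK reliability and CLI UX are not optimized toward the same goal.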

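Suggestions (now that you have fixed it)

1. File an issue / Discussion with Anthropic

Not just a bug report, but a design proposal:

The SDK should have an explicit "deployment profile" (interactive vs. autonomous)
Each profile comes with preset defaults
maxTurns and task_budget should appear in --help (right now they are invisible)

2. Publish a best-practices guide for SDK consumers

Based on what you found and fixed in SwarmAI:

Which 5 default parameters SDK consumers MUST override
Why error handling needs semantic classification (real errors vs. graceful pauses)
How to test limits that a demo cannot reach (tests at scale, not tutorial-sized ones)

3. Turn this problem into a test

In CI: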

Spawn autonomous pipeline with realistic task count (200+ turns)
Verify no silent failures
Measure actual token budget consumption vs declared budget
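A minimal sketch of what those three checks could look like, using a fake pipeline so the test logic can be exercised without calling a model. `run_fake_pipeline`, `RunResult`, and `check_pipeline` are hypothetical names; a real CI job would replace the fake with an actual agent run and read the same three fields from its result.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    turns_used: int
    subtype: str
    tokens_used: int


def run_fake_pipeline(turns_needed, max_turns, tokens_per_turn=1_000):
    """Hypothetical stand-in for an agent run that stops once max_turns is reached."""
    turns = min(turns_needed, max_turns)
    subtype = "success" if turns_needed <= max_turns else "error_max_turns"
    return RunResult(turns, subtype, turns * tokens_per_turn)


def check_pipeline(result, declared_budget):
    """Return a list of human-readable problems; an empty list means the run passed."""
    problems = []
    if result.subtype == "error_max_turns":
        problems.append(f"turn limit hit after {result.turns_used} turns")
    if result.tokens_used > declared_budget:
        problems.append(f"used {result.tokens_used} tokens, budget {declared_budget}")
    return problems
```

For example, `check_pipeline(run_fake_pipeline(200, 100), 800_000)` reports the turn limit, while raising `max_turns` to 250 makes the same 200-turn run pass.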

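That way, the maintainer of the next unattended agent system can say, "I have already fallen into this pit".

Context (for anyone reading this)

This is not a criticism of Claude Code. Claude Code is good. The point is this: when a tool designed for interactive use is used as an SDK, its defaults turn into landmines. Any such tool will have this problem.

Anthropic could solve this with one thing: a section in the SDK docs titled "Defaults differ between the interactive CLI and the autonomous SDK", followed by a list of every parameter that needs overriding. Not "read the source to find the hidden defaults", but "this is on the first page of the docs".

You have already done this work. Anthropic should standardize it.

The irony in all this: the system you used to fix the problem (SwarmAI's context loader) reproduces the same pattern: dynamic token budgets, profile-aware configuration, explicit overrides. You did not just fix the bug; you also learned how to design a system that avoids it.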
