Bug: Auto-compaction fails to trigger when context window fills up — garbled gzip error responses
The problem
When a Droid CLI session fills its context window, the session hard-fails instead of auto-compacting. The user is left with an unresponsive session and must manually run /compress to recover. This happens every time the context window fills up.
The underlying cause is that the API error response body is gzip-compressed but never decompressed before being parsed. Instead of receiving a readable error message like "prompt is too long: 215043 tokens > 200000 token limit", Droid receives garbled binary data. Because it can't read the error, it classifies it as reason: "unknown" and treats it as an unrecoverable failure rather than a signal to auto-compact.
Manual /compress works immediately after the failure, confirming the compaction system itself is fine.
Related: #161 (auto-compress not triggering with custom models). That issue attributes the problem to missing usage stats. This report identifies a different root cause: gzip-encoded error responses that can't be parsed.
Environment
- Droid CLI: v0.65.0
- OS: macOS 15.4 (darwin 25.3.0)
- Terminal: Ghostty 1.2.3
- Model: Claude Opus routed through a local proxy to use a Claude Pro/Max subscription (BYOK)
How to reproduce
- Use Droid CLI in a long session until the context approaches the model's token limit
- Continue working until the next LLM call would exceed the context window
- Expected: Droid auto-compacts and continues
- Actual: Droid retries 4 times across different providers; every attempt fails with a garbled error, and the session hard-fails
What the logs show
All logs are from ~/.factory/logs/droid-log-single.log.
1. The session was working normally, then hit the context limit
The session made 212 successful LLM calls over ~57 minutes. The last successful call shows the context was nearly full:
[Agent] Streaming result | cacheReadInputTokens: 199198, contextCount: 497, outputTokens: 110
The very next call fails.
2. The error response body is gzip-compressed binary, not readable JSON
Every error response has the same structure — a valid JSON envelope with a type field that's readable, but a message field that's garbled binary:
{
"error": {
"message": "\u001f\ufffd\b\u0000\u0000\u0000\u0000\u0000\u0000\u00034\ufffd\n\ufffd0\u0014E\ufffd\ufffd\ufffd\u000e...",
"type": "invalid_request_error"
}
}
The first bytes of the message field are 0x1F 0x8B 0x08 0x00 — the gzip magic header (0x1F 0x8B = gzip identification, 0x08 = deflate compression, 0x00 = no flags). This is gzip-compressed data being treated as a UTF-8 string, which is why many bytes show up as \ufffd (the Unicode replacement character for invalid byte sequences).
Each retry attempt has different compressed bytes (fresh compression each time) but the same approximate length (~160-170 bytes), consistent with a short error message being independently compressed per request. The original error message is unrecoverable from the logs because the invalid bytes were replaced with U+FFFD during UTF-8 encoding.
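The diagnosis is easy to sanity-check locally. A minimal Node/TypeScript sketch (the error string here is the hypothetical limit message quoted above, not the actual upstream text, which the logs can't recover):

```typescript
import { gzipSync } from "node:zlib";

// Reproduce the symptom from the logs: gzip-compress a short error message,
// then (incorrectly) decode the compressed bytes as UTF-8.
const original = "prompt is too long: 215043 tokens > 200000 token limit";
const compressed = gzipSync(Buffer.from(original, "utf-8"));

// The first bytes are the gzip magic header described above: 1f 8b 08 00
console.log(compressed.subarray(0, 4).toString("hex")); // "1f8b0800"

// Decoding the compressed bytes as UTF-8 replaces invalid sequences with
// U+FFFD, producing the same kind of garbled string seen in the "message"
// field of the error envelope.
const mangled = new TextDecoder("utf-8").decode(compressed);
console.log(JSON.stringify(mangled.slice(0, 12))); // starts with "\u001f" followed by replacement characters
```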
3. All four retry attempts fail identically across three different upstream providers
Attempt 1: apiProvider: "anthropic" → garbled 400
Attempt 2: apiProvider: "bedrock_anthropic" → garbled 400
Attempt 3: apiProvider: "vertex_anthropic" → garbled 400
Attempt 4: apiProvider: "anthropic" → garbled 400
The same gzip encoding issue occurs across Anthropic direct, AWS Bedrock, and Google Vertex. It's unlikely that all three upstream providers independently gzip-encode their error responses in the same way. The common factor is Factory's gateway layer sitting in front of all three.
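If that hypothesis is right, the fix belongs wherever the raw upstream bytes are still available, most likely that gateway layer. A sketch of the shape such a fix could take, assuming a Node-style handler; the function and parameter names are illustrative, not Factory's actual code:

```typescript
import { gunzipSync } from "node:zlib";

// Hypothetical gateway-side handling: decompress the upstream error body
// (per Content-Encoding, with the gzip magic header as a fallback) before
// converting it to a string and embedding it in the JSON error envelope.
function readUpstreamErrorBody(raw: Uint8Array, contentEncoding?: string): string {
  const looksGzipped = raw.length >= 2 && raw[0] === 0x1f && raw[1] === 0x8b;
  const bytes =
    contentEncoding === "gzip" || looksGzipped ? gunzipSync(raw) : Buffer.from(raw);
  return bytes.toString("utf-8"); // now readable, e.g. "prompt is too long: ..."
}
```

Per section 2, the key constraint is that this has to happen before any UTF-8 decoding: once the invalid bytes have been replaced with U+FFFD, nothing downstream (including Droid itself) can recover the message.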
4. The error is classified as "unknown", so auto-compaction never fires
After all retries are exhausted, the session fails with:
[Chat route failure] | reason: "unknown", statusCode: 400, severity: "severe"
[metrics_log_agent_progress_count] | outcome: "error"
The reason: "unknown" classification is the key. Whatever logic determines whether an error is a recoverable context-length error (and should trigger auto-compaction) cannot match against garbled binary. So the error falls through to the "unknown" path, and the session dies.
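Droid's actual classifier isn't visible from the logs, but the failure mode has the generic shape below: a matcher keyed on readable error text has nothing to match in gzip bytes, so everything falls through to the unrecoverable path. The names and patterns here are illustrative, not Droid's internals:

```typescript
// Illustrative only -- not Droid's real classification logic.
type FailureReason = "context_length_exceeded" | "unknown";

function classifyLLMError(errorMessage: string): FailureReason {
  // A readable context-limit error matches and would trigger auto-compaction.
  if (/prompt is too long|maximum context length|context window/i.test(errorMessage)) {
    return "context_length_exceeded";
  }
  // Garbled gzip bytes match nothing, so the error falls through to "unknown"
  // and the session hard-fails, as seen in the Chat route failure log.
  return "unknown";
}

classifyLLMError("prompt is too long: 215043 tokens > 200000 token limit"); // "context_length_exceeded"
classifyLLMError("\u001f\uFFFD\b\u0000..."); // "unknown"
```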
5. Manual compaction works immediately after
Running /compress manually right after the failure succeeds without issue — the summarizer processes all 431 messages and creates a new session. The session continues normally from there. The compaction machinery is fine; it just never gets triggered automatically.
Raw log excerpts
Last successful LLM call before failure
[2026-03-01T18:50:20.362Z] INFO: [Agent] Streaming result | Context: {
"count": 1,
"cacheReadInputTokens": 199198,
"contextCount": 497,
"outputTokens": 110,
"hasReasoningContent": false,
"isByok": true
}
First error (attempt 1 — anthropic provider)
[2026-03-01T18:50:21.817Z] WARN: [useLLMStreaming] LLM error | Context: {
"error": {
"name": "Error",
"message": "400 {\"error\":{\"message\":\"\\u001f\ufffd\\b\\u0000\\u0000\\u0000\\u0000\\u0000\\u0000\\u00034\ufffd\ufffd\\n\ufffd0\\u0014E\ufffd\ufffd\ufffd\\u000e\ufffd\\nJ\\u0006\ufffd:\ufffdW\ufffd\\u0010l\\b\ufffd&/\ufffd$j)\ufffd\ufffdr.\ufffd3a\\u0019\ufffdE\ufffd\ufffd\ufffd\\u0018...\\u0000\\u0000\",\"type\":\"invalid_request_error\"}}",
"stack": "Error: 400 {...}\n at generate (../../node_modules/@anthropic-ai/sdk/core/error.mjs:37:24)\n at makeRequest (../../node_modules/@anthropic-ai/sdk/client.mjs:309:30)\n at processTicksAndRejections (native:7:39)"
},
"attempt": 1,
"modelId": "claude-opus-4-6-thinking-32000",
"apiProvider": "anthropic"
}
Retries across all three providers (attempts 2-4)
[2026-03-01T18:50:24.200Z] WARN: [useLLMStreaming] LLM error | attempt: 2, apiProvider: "bedrock_anthropic"
[2026-03-01T18:50:29.758Z] WARN: [useLLMStreaming] LLM error | attempt: 3, apiProvider: "vertex_anthropic"
[2026-03-01T18:50:35.773Z] WARN: [useLLMStreaming] LLM error | attempt: 4, apiProvider: "anthropic"
All three contain the same garbled gzip message field with "type": "invalid_request_error".
Fatal session failure
[2026-03-01T18:50:47.446Z] WARN: [Chat route failure] | Context: {
"reason": "unknown",
"statusCode": 400,
"severity": "severe"
}
[2026-03-01T18:50:47.449Z] INFO: [metrics_log_agent_progress_count] | Context: {
"outcome": "error",
"errorMessage": "400 {\"error\":{\"message\":\"\\u001f\ufffd\\b...garbled...\",\"type\":\"invalid_request_error\"}}"
}
Manual compaction succeeds immediately after
[2026-03-01T18:51:16.881Z] INFO: [Compaction] Start | reason: "manual", source: "tui"
[2026-03-01T18:51:16.881Z] INFO: [Summarizer] Start | messagesToSummarizeCount: 431
[2026-03-01T18:51:16.881Z] INFO: [Summarizer] Using model | modelId: "claude-opus-4-6-thinking-32000"
[2026-03-01T18:52:24.655Z] INFO: [Summarizer] Response OK | summaryLength: 9136
[2026-03-01T18:52:25.207Z] INFO: [Compaction] End | succeeded: true, compactionDurationMs: 68326, numMessagesRemoved: 431
[2026-03-01T18:52:25.207Z] INFO: [Compaction] New session created