-
Notifications
You must be signed in to change notification settings - Fork 1
Pipeline Design 242
Now I have the full picture. Here's the ADR:
Claude CLI's --output-format json can produce either a JSON array ([{...}]) or a JSON object ({...}). Two functions in sw-loop.sh process this output:
-
_extract_text_from_json()(lines 561-618) — extracts.resulttext from Claude's JSON response. This function was already fixed in prior iterations: it now handles both array and object formats (Case 2, lines 579-606), so the misleading "jq not available" warning (Case 3, line 610) is only reachable when jq is genuinely absent. Tests 17-19 and 23 validate this. -
accumulate_loop_tokens()(lines 511-556) — parses token counts. This function is still broken: line 516 checkshead -c1 | grep -q '\[', so JSON objects (starting with{) fall through to the regex fallback (lines 548-554), silently losing structured token/cost data.
- Bash 3.2 compatibility (no associative arrays, no
readarray) -
jqmay not be available — regex fallback must always exist - Atomic changes only — this is a bugfix, not a refactor
Fix accumulate_loop_tokens() to handle JSON objects the same way _extract_text_from_json() already does — check for both [ and { as the first character, then use the appropriate jq expression (.[-1].usage.* for arrays, .usage.* for objects).
This mirrors the pattern already validated in _extract_text_from_json() and keeps the two JSON-parsing functions symmetric.
Claude CLI (--output-format json)
│
│ JSON array: [{result: "...", usage: {...}}]
│ — OR —
│ JSON object: {result: "...", usage: {...}}
│
├──→ _extract_text_from_json() [ALREADY FIXED]
│ ├─ Case 2a: array + jq → jq '.[-1].result'
│ ├─ Case 2b: object + jq → jq '.result'
│ ├─ Case 3: JSON + no jq → raw copy + warning
│ └─ Case 4: not JSON → raw copy
│
└──→ accumulate_loop_tokens() [NEEDS FIX]
├─ Branch A: array + jq → jq '.[-1].usage.*' ← exists
├─ Branch B: object + jq → jq '.usage.*' ← MISSING
└─ Fallback: regex parsing ← exists
// accumulate_loop_tokens(log_file: string): void
// Precondition: log_file is a path to a file (may not exist)
// Postcondition: LOOP_INPUT_TOKENS, LOOP_OUTPUT_TOKENS, LOOP_COST_MILLICENTS
// are incremented by values parsed from the file
// Error contract: never fails — all parse errors default to 0
// _extract_text_from_json(json_file: string, log_file: string, err_file: string): 0
// Precondition: json_file path provided (file may be empty/missing)
// Postcondition: log_file contains extracted text or placeholder
// Error contract: always returns 0 — never causes pipeline failureaccumulate_loop_tokens(log_file)
→ read first char of file
→ if '{' or '[' AND jq available:
→ if '[': jq '.[-1].usage.input_tokens // 0' (array path)
→ if '{': jq '.usage.input_tokens // 0' (object path) ← NEW
→ accumulate into LOOP_INPUT_TOKENS, LOOP_OUTPUT_TOKENS
→ parse cost_usd or estimate from model rates
→ else:
→ regex fallback (unchanged)
| Component | Error Source | Handling |
|---|---|---|
accumulate_loop_tokens |
jq parse failure | ` |
accumulate_loop_tokens |
missing file | early return 0
|
accumulate_loop_tokens |
non-numeric jq output |
${var:-0} default |
_extract_text_from_json |
jq failure | ` |
-
Unified normalization layer — Pre-process all Claude output into a canonical array format before any parsing.
- Pros: Single parsing path, DRY
- Cons: Adds a new abstraction for a two-function problem; over-engineering for a bugfix; risk of breaking existing array handling
-
Remove the jq path entirely, use only regex — Since regex fallback exists, just always use it.
- Pros: Simpler, no jq dependency
- Cons: Loses structured parsing (cost_usd, cache tokens); regex is fragile and can't extract nested fields; significant accuracy regression
-
Wrap Claude CLI to force array output — Add
| jq '[.]'or similar wrapper.- Pros: Guarantees array format downstream
- Cons: Brittle coupling to Claude CLI internals; breaks if CLI changes; adds a jq dependency to the wrapper itself
-
Files to modify:
-
scripts/sw-loop.sh— Fixaccumulate_loop_tokens()(lines 516-546) to handle{first-char alongside[, using.usage.*instead of.[-1].usage.*for objects -
scripts/sw-loop-test.sh— Add test foraccumulate_loop_tokens()with JSON object input (functional test, not just grep-based)
-
-
Files to create: None
-
Dependencies: None (jq is already optional with regex fallback)
-
Risk areas:
- Arithmetic evaluation in bash: ensure jq output is always numeric before
$(( ))— already handled by${var:-0}pattern - The
head -c1approach assumes no leading whitespace — same assumption as_extract_text_from_json, acceptable since Claude CLI output is deterministic
- Arithmetic evaluation in bash: ensure jq output is always numeric before
-
accumulate_loop_tokens()correctly parsesinput_tokens,output_tokens,cache_read_input_tokens,cache_creation_input_tokens, andtotal_cost_usdfrom a JSON object (not just array) - New functional test: given
{"type":"result","result":"...","usage":{"input_tokens":100,"output_tokens":50},"total_cost_usd":0.003}, token accumulators reflect correct values - Existing Test 23 continues to pass — no "jq not available" warning when jq IS present and output is a JSON object
- All 69 existing sw-loop-test.sh tests continue to pass
- Regex fallback still works when jq is absent (no regression)
- No unrelated changes included in the diff