Bug Report
Source: tiny-model-ground-truth parity checker (0/59 passing)
Severity: Critical. Blocks all programmatic parity checks for SmolLM and Qwen2 (50/59)
apr version: 0.2.16 (0f0bcae)
Related: #239 (now fixed — inference works for SmolLM/Qwen2)
Description
After the #239 fix, apr run successfully runs inference on SmolLM-135M and Qwen2-0.5B APR models. However, the --json flag (and --format json) is completely ignored: output is always human-readable, with ANSI color codes, decorative headers, and markdown formatting, and is never JSON.
This blocks all programmatic parity checking since parity_check.py cannot parse the output.
Evidence
# Both --json and --format json produce identical human-readable output
$ apr run models/smollm-135m-int8.apr -p "The capital of France is" -n 32 --json 2>/dev/null
=== APR Run ===
Source: models/smollm-135m-int8.apr
Output:
\`\`\`
I would like to know if there is a way to get the capital of a country...
Completed in 5.87s (cached)

Expected JSON output:
{
  "text": "...",
  "tokens": [1234, 5678, ...],
  "token_count": 32,
  "model": "models/smollm-135m-int8.apr"
}

Note: --json IS listed in apr run --help and is accepted without error. Debug logs correctly go to stderr. The issue is only with stdout formatting.
What Needs to Happen
When --json or --format json is passed:
- Suppress ANSI color codes and decorative headers on stdout
- Output a JSON object with at minimum: text, tokens (token IDs), token_count
- Keep debug/info logs on stderr only
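The minimum fields listed above can be expressed as a small consumer-side check. This is only a sketch, not the actual parity_check.py code: the name `validate_run_output` is hypothetical, the field types (string, list of integers, integer) are assumed from the expected-output example in this report, and the requirement that `token_count` equals the length of `tokens` is likewise an assumption.

```python
import json


def validate_run_output(stdout: str) -> dict:
    """Parse stdout from `apr run --json` and check the minimum fields.

    Hypothetical helper; field types are assumed from the expected
    JSON example in this report, not from apr documentation.
    """
    result = json.loads(stdout)  # raises json.JSONDecodeError on non-JSON
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object")

    missing = [k for k in ("text", "tokens", "token_count") if k not in result]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    if not isinstance(result["text"], str):
        raise ValueError("'text' must be a string")
    tokens = result["tokens"]
    if not isinstance(tokens, list) or not all(isinstance(t, int) for t in tokens):
        raise ValueError("'tokens' must be a list of integer token IDs")
    # Assumption: token_count is the number of entries in tokens.
    if result["token_count"] != len(tokens):
        raise ValueError("'token_count' does not match len(tokens)")
    return result
```

With the current behavior, the human-readable output fails at the `json.loads` step, before any field check runs.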
Impact
- Blocks 50/59 parity checks (all SmolLM + Qwen2 canary, token-parity, quant-drift, roundtrip)
- Any tooling that pipes apr run output cannot parse results
Reproduction
cd tiny-model-ground-truth
make convert
apr run models/smollm-135m-int8.apr -p "test" -n 32 --json 2>/dev/null
# Expected: JSON on stdout
# Actual: Human-readable text with ANSI codes
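The manual check above could also be scripted so the result is unambiguous. A rough sketch, assuming `apr` is on PATH and the model from this report has been converted; `classify_stdout` and `reproduce` are hypothetical names, and the regular expression only covers CSI escape sequences (the kind used for colors), not every possible terminal escape.

```python
import json
import re
import subprocess

# Matches CSI escape sequences such as "\x1b[1;32m" (colors, bold).
ANSI_CSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")


def classify_stdout(stdout: str) -> str:
    """Return 'json', 'ansi-text', or 'plain-text' for captured stdout."""
    try:
        if isinstance(json.loads(stdout), dict):
            return "json"
    except json.JSONDecodeError:
        pass
    return "ansi-text" if ANSI_CSI.search(stdout) else "plain-text"


def reproduce(model: str = "models/smollm-135m-int8.apr") -> str:
    """Run the command from this report and classify its stdout."""
    proc = subprocess.run(
        ["apr", "run", model, "-p", "test", "-n", "32", "--json"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # equivalent to 2>/dev/null
        text=True,
        check=False,
    )
    return classify_stdout(proc.stdout)
```

Calling `reproduce()` from the tiny-model-ground-truth directory should return `"json"` once the flag is honored; any other value indicates the bug is still present.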