v2.3.0

Released by @github-actions on 10 May 04:13 · commit 1e22fd2 · 78 commits to main since this release

Highlights

Parser hotfix. Strong models (Claude, GPT-4o, etc.) wrap their AgentOutput JSON in ```json ... ``` markdown fences roughly 30% of the time, despite the system-prompt instruction not to. In v2.2.0 the strict parser failed on fenced output and fell back to an empty action list. RunAgent::execute then aborted at step 0 via the 2-streak stall guard, which made the executor effectively unusable for most calls to modern Claude and GPT-4o models.

v2.3.0 makes the parser defensive without weakening the prompt.

What's new

ras-agent — fence-tolerant parser

  • parse_agent_output extracted into its own module (ras_agent::application::parse_output).
  • Two-stage strategy:
    • Fast path: serde_json::from_str on the raw content (no extra allocation before parsing; behavior unchanged for unfenced output).
    • Slow path: strip a leading ```json, ```JSON, ```Json, or plain ``` opening fence + a trailing ``` closing fence + surrounding whitespace, then retry.
  • If both paths fail, falls through to the existing empty-action fallback. Never panics on malformed model output.
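
The two-stage strategy above can be sketched roughly as follows. This is a std-only illustration, not the code in ras_agent::application::parse_output: the names strip_markdown_fence and parse_fence_tolerant are hypothetical, the parse closure stands in for serde_json::from_str, and the tag match here accepts "json" in any letter case, which is slightly broader than the three spellings listed above.

```rust
// Three backticks, written with escapes so this listing contains no literal fence.
const FENCE: &str = "\x60\x60\x60";

/// Strip one surrounding markdown fence, if present (hypothetical helper).
/// Input without an opening fence is returned trimmed but otherwise unchanged.
fn strip_markdown_fence(raw: &str) -> &str {
    let trimmed = raw.trim();
    // No opening fence: nothing to strip.
    let Some(rest) = trimmed.strip_prefix(FENCE) else {
        return trimmed;
    };
    // Drop an optional "json" language tag (assumption: any letter case).
    let rest = match rest.get(..4) {
        Some(tag) if tag.eq_ignore_ascii_case("json") => &rest[4..],
        _ => rest,
    };
    // Drop the closing fence when there is one, then surrounding whitespace.
    rest.strip_suffix(FENCE).unwrap_or(rest).trim()
}

/// Two-stage parse: try the raw text first, then retry with the fence removed.
/// `parse` is a stand-in for a real deserializer such as `serde_json::from_str`.
/// `None` means both stages failed and the caller should apply its fallback.
fn parse_fence_tolerant<T, E>(raw: &str, parse: impl Fn(&str) -> Result<T, E>) -> Option<T> {
    parse(raw)
        .or_else(|_| parse(strip_markdown_fence(raw)))
        .ok()
}
```

Under these assumptions, a caller would pass its deserializer as the closure and map None to the empty-action fallback, so malformed model output never reaches a panic path. Trying the raw text first keeps the common unfenced case on a single parse attempt.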

Prompt — unchanged

The "no markdown fences" instruction stays as a hint. The parser is now defensive because strong models ignore the hint anyway; a prompt change alone would not have fixed the problem.

Tests

  • 8 new unit tests in parse_output covering:
    • unfenced JSON (existing behavior)
    • ```json … ``` with newline after the open fence
    • ```JSON … ``` and ```Json … ``` (case variants)
    • ``` … ``` with no language tag
    • leading/trailing whitespace around the fence block
    • fenced but invalid JSON → empty-action fallback
    • unfenced and invalid JSON → empty-action fallback
  • New integration test agent_recovers_from_markdown_fenced_response — feeds a fenced response through ScriptedLlm + RunAgent, asserts output.action is non-empty and the navigate call reaches the mock BrowserPort.

Reproduction

Before v2.3.0:

$ RAS_MODEL=anthropic/claude-haiku-4.5 cargo run --example claude_code_oauth_cosmium
ras_agent::application::run_agent: model returned empty action list (streak=1); treating as stalled
ras_agent::application::run_agent: model returned empty action list (streak=2); treating as stalled
ras_agent::application::run_agent: agent stalled: 2 consecutive empty action lists, aborting
[done] (no final result returned)

After v2.3.0: the fenced JSON parses, navigate reaches the browser, the loop progresses past step 0.
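
For readers unfamiliar with the stall guard in the log above, the behavior can be pictured as a counter of consecutive empty action lists. The sketch below is an assumption-laden illustration, not the implementation in run_agent: StallGuard, observe, and the reset-on-progress rule are hypothetical, and only the limit of 2 comes from the log.

```rust
/// Counts consecutive empty action lists and reports when to abort
/// (hypothetical type; the real guard lives inside run_agent).
struct StallGuard {
    streak: u32,
    limit: u32,
}

impl StallGuard {
    fn new(limit: u32) -> Self {
        Self { streak: 0, limit }
    }

    /// Record one step's action count; returns true when the run should abort.
    fn observe(&mut self, action_count: usize) -> bool {
        if action_count == 0 {
            self.streak += 1;
        } else {
            // Assumption: any non-empty step resets the streak.
            self.streak = 0;
        }
        self.streak >= self.limit
    }
}
```

With a limit of 2, two fenced responses in a row under v2.2.0 each parsed to an empty list, so a guard like this would trip before any action ran.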

Migration

No code changes required: v2.3.0 is a drop-in replacement for 2.2.0. To upgrade, run cargo update -p ras-agent --precise 2.3.0 (or name any other workspace crate; the workspace crates are bumped together).

Compatibility

  • No public API changes.
  • No breaking changes.
  • Workspace MSRV unchanged.

Verification

  • cargo test --workspace --no-fail-fast — all suites pass (8 new unit tests + 1 new integration test green)
  • cargo clippy --workspace --all-targets -- -D clippy::unwrap_used -D clippy::dbg_macro — clean
  • cargo fmt --all -- --check — clean
  • cargo doc --workspace --no-deps — clean

Artifacts

  • Linux x86_64: ras-x86_64-unknown-linux-gnu, ras-daemon-x86_64-unknown-linux-gnu
  • macOS arm64: ras-aarch64-apple-darwin, ras-daemon-aarch64-apple-darwin
  • crates.io: all ras-* workspace crates published at 2.3.0 once publish.yml finishes

Pull requests

  • #22 fix(agent): parse markdown-fenced LLM responses (v2.3.0)
  • #23 release: v2.3.0 (parser hotfix)

Full changelog: v2.2.0...v2.3.0