rf-mcp v0.30.0 Release Notes

Small LLM Optimization (ADR-006 / ADR-007 / ADR-008 / ADR-009 / ADR-010)

This release focuses on making rf-mcp usable with small and medium-sized LLMs (8K-32K context) that previously failed to operate the tool surface reliably. Five new ADRs introduce a layered optimization strategy.

Intent Action Tool (ADR-007)

The new intent_action tool provides a library-agnostic abstraction over execute_step. Instead of requiring the LLM to know that Browser Library uses Click while SeleniumLibrary uses Click Element, a single tool handles intent resolution:

intent_action(intent="click", target="text=Login", session_id="...")

Supported intents: navigate, click, fill, hover, select, assert_visible, extract_text, wait_for. The server resolves the intent to the correct keyword and locator syntax based on the session's active library.
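
As a rough illustration of the resolution step, the sketch below maps an intent to a keyword per library. Only the click row (Click vs. Click Element) comes from these notes; the fill and hover rows use real keyword names from each library but are assumptions about what rf-mcp actually chooses. The names INTENT_KEYWORDS and resolve_intent are hypothetical, and the sketch passes the locator through unchanged rather than translating its syntax.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup table. Only the "click" entries are stated in the
# release notes; "fill" and "hover" are assumed for illustration.
INTENT_KEYWORDS: Dict[str, Dict[str, str]] = {
    "Browser": {"click": "Click", "fill": "Fill Text", "hover": "Hover"},
    "SeleniumLibrary": {
        "click": "Click Element",
        "fill": "Input Text",
        "hover": "Mouse Over",
    },
}


def resolve_intent(library: str, intent: str, target: str) -> Tuple[str, List[str]]:
    """Return (keyword, arguments) for an intent on the given library."""
    try:
        keyword = INTENT_KEYWORDS[library][intent]
    except KeyError:
        raise ValueError(
            f"unsupported intent {intent!r} for library {library!r}"
        ) from None
    # Locator translation is out of scope here: the target is passed as-is.
    return keyword, [target]
```

With this table, the same intent yields Click for a Browser session and Click Element for a SeleniumLibrary session, which is the lookup the LLM no longer has to perform itself.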

Dynamic Tool Profiles (ADR-006)

Tool profiles dynamically control which MCP tools are visible to the LLM. Smaller models see fewer tools with compact descriptions, reducing token overhead from ~7,000 tokens to ~1,000 tokens for the active profile.

Available profiles: browser_exec, api_exec, discovery, minimal_exec, full. Profiles can be activated via manage_session(action="set_tool_profile", tool_profile="browser_exec") or the ROBOTMCP_TOOL_PROFILE environment variable.
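
A minimal sketch of how a profile name might be chosen from the two activation paths. The profile names and the ROBOTMCP_TOOL_PROFILE variable come from these notes; the function name select_profile, the precedence (explicit request over environment), and the fallback to full are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

KNOWN_PROFILES = ("browser_exec", "api_exec", "discovery", "minimal_exec", "full")


def select_profile(
    requested: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    default: str = "full",  # assumed fallback, not confirmed by the notes
) -> str:
    """Pick a profile: explicit request first, then the env var, then default."""
    candidate = requested or env.get("ROBOTMCP_TOOL_PROFILE") or default
    name = candidate.strip().lower()
    if name not in KNOWN_PROFILES:
        raise ValueError(f"unknown tool profile: {candidate!r}")
    return name
```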

Response Optimization (ADR-008)

Most tools now accept a detail_level parameter that sets response verbosity to one of three levels: minimal, standard, or full. Lower levels reduce response token consumption for models with limited context budgets.
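
One way to picture the three levels is as progressively larger sets of response fields. The sketch below is purely illustrative: the field names (success, error, output, hint) and the helper shape_response are hypothetical and are not taken from the rf-mcp response format.

```python
from typing import Any, Dict

# Hypothetical field groups per level; "full" returns everything.
FIELDS_BY_LEVEL = {
    "minimal": {"success", "error"},
    "standard": {"success", "error", "output", "hint"},
}


def shape_response(response: Dict[str, Any], detail_level: str = "standard") -> Dict[str, Any]:
    """Drop fields that the requested detail level does not include."""
    if detail_level == "full":
        return dict(response)
    allowed = FIELDS_BY_LEVEL[detail_level]
    return {key: value for key, value in response.items() if key in allowed}
```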

Type-Constrained Parameters (ADR-009)

All string-typed action/mode/strategy parameters now use Literal types, producing enum constraints in the JSON Schema. This eliminates value hallucination (e.g., an LLM inventing action="setup" instead of action="init").

Affected parameters across 9 tools:

manage_session action (20 valid values including aliases)
intent_action intent (8 values)
find_keywords strategy (4 values)
execute_step mode (2 values)
execute_flow structure (3 values)
recommend_libraries mode (5 values)
analyze_scenario context (6 values)
manage_library_plugins action (3 values)
manage_attach action (11 values)
detail_level on all tools (3 values)

All values accept case-insensitive input with whitespace trimming.
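
The relationship between a Literal type, the resulting enum constraint, and the lenient input handling can be sketched with the standard library alone. The detail_level values are from these notes; the helpers enum_schema and normalize are hypothetical, and the schema fragment is hand-built here to stay dependency-free rather than produced by whatever schema generator the server uses.

```python
from typing import Literal, get_args

DetailLevel = Literal["minimal", "standard", "full"]


def enum_schema(literal_type) -> dict:
    """Build the JSON Schema fragment that a Literal type corresponds to."""
    return {"type": "string", "enum": list(get_args(literal_type))}


def normalize(value: str, literal_type) -> str:
    """Trim whitespace and lowercase, then check against the allowed values."""
    cleaned = value.strip().lower()
    allowed = get_args(literal_type)
    if cleaned not in allowed:
        raise ValueError(f"{value!r} is not one of {allowed}")
    return cleaned
```

Under these assumptions, an input such as "  Full " normalizes to "full", while an invented value is rejected with a message that lists the valid options.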

Parameter Coercion and Guided Recovery (ADR-010)

Server-side resilience for common small LLM mistakes:

Array coercion: libraries: "[\"Browser\", \"BuiltIn\"]" (string) is automatically parsed to ["Browser", "BuiltIn"] (array). Also handles comma-separated strings and single values.
Deprecated keyword guidance: GET, POST, PUT, DELETE from RequestsLibrary automatically map to their On Session equivalents with a hint in the response.
Session ID hints: Init responses include explicit guidance to reuse the session ID.
Catalog strategy guidance: find_keywords(strategy="catalog") error message now explains that an active session is required.
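
The array coercion described above could look roughly like the following. The function name coerce_to_list is hypothetical, and the fallback for malformed JSON (treating the text as comma-separated) is an assumption rather than documented behavior.

```python
import json
from typing import Any, List


def coerce_to_list(value: Any) -> List[str]:
    """Accept a real list, a JSON-encoded list, a comma-separated string,
    or a single value, and return a list of strings."""
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    if isinstance(value, str):
        text = value.strip()
        if text.startswith("["):
            try:
                parsed = json.loads(text)
            except json.JSONDecodeError:
                parsed = None  # assumed: fall through to comma splitting
            if isinstance(parsed, list):
                return [str(item) for item in parsed]
        return [part.strip() for part in text.split(",") if part.strip()]
    raise TypeError(f"cannot coerce {type(value).__name__} to a list")
```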

Concurrency Fix (v0.29.1 → v0.29.2)

Race condition in keyword execution: Removed _suppress_stdout() from the keyword execution path. The os.dup2(2, 1) redirect is process-global and caused a race where concurrent asyncio.to_thread() keyword executions could redirect MCP responses to stderr. console='none' in RobotSettings is sufficient to suppress RF output during runner.run().
BuiltIn keyword availability: Added safety checks before every keyword execution to verify BuiltIn is in the RF namespace's keyword store, re-importing if missing.
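
To see why a dup2-based redirect cannot be scoped to one execution, the sketch below reproduces the effect with two temporary files standing in for stdout and stderr, so the real descriptors 1 and 2 are never touched. It is a simplified, deterministic illustration of the shared descriptor table, not the code that was removed; demo_shared_fd_table is a hypothetical name.

```python
import os
import tempfile
import threading


def demo_shared_fd_table():
    """Show that os.dup2 in one thread changes where another thread writes."""
    with tempfile.TemporaryFile() as fake_stdout, tempfile.TemporaryFile() as fake_stderr:
        out_fd = os.dup(fake_stdout.fileno())  # stands in for fd 1
        err_fd = fake_stderr.fileno()          # stands in for fd 2

        # The main thread plays the "suppress stdout" role: like
        # os.dup2(2, 1), this repoints out_fd for the whole process.
        os.dup2(err_fd, out_fd)

        # A second thread writes a response to what it believes is stdout.
        writer = threading.Thread(target=os.write, args=(out_fd, b"mcp response"))
        writer.start()
        writer.join()
        os.close(out_fd)

        fake_stdout.seek(0)
        fake_stderr.seek(0)
        return fake_stdout.read(), fake_stderr.read()
```

The write issued by the second thread ends up in the stand-in for stderr and nothing reaches the stand-in for stdout, which is the misrouting the fix avoids by dropping the redirect altogether.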

OpenCode E2E Testing with Small LLMs

New E2E test infrastructure for validating rf-mcp with small LLMs via OpenRouter:

tests/e2e/test_intent_action_models.py: pytest-based tests verifying intent_action tool discovery and usage across multiple models. Gated by RUN_INTENT_E2E=true.
tests/e2e/run_realistic_e2e.py: Standalone script running realistic multi-step prompts (TodoMVC, REST API, Demoshop) with tool call efficiency metrics.
CI integration: New opencode-e2e job in the weekly E2E workflow runs Qwen3 Coder and GLM-4.5 AIR via OpenRouter.
Model override: OPENCODE_MODELS env var allows overriding the default model list.

Tested models: GLM-4.7, GLM-4.5 AIR, gpt-oss-20b, Qwen3 Coder, Llama-4 Scout, GLM-4.7 Flash.
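
A small sketch of how the two environment switches might be read. RUN_INTENT_E2E and OPENCODE_MODELS are named in these notes; the helper names, the case-insensitive comparison, and the comma-separated format for OPENCODE_MODELS are assumptions.

```python
from typing import List, Mapping, Sequence


def e2e_enabled(env: Mapping[str, str]) -> bool:
    """The intent E2E tests run only when RUN_INTENT_E2E is set to true."""
    return env.get("RUN_INTENT_E2E", "").strip().lower() == "true"


def resolve_models(default_models: Sequence[str], env: Mapping[str, str]) -> List[str]:
    """Use OPENCODE_MODELS (assumed comma-separated) if set, else the defaults."""
    raw = env.get("OPENCODE_MODELS", "")
    override = [name.strip() for name in raw.split(",") if name.strip()]
    return override or list(default_models)
```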

Navigate Intent Fallback

When intent_action(intent="navigate", target="https://...") fails because no browser or page is open, the server now detects the error and automatically executes the appropriate recovery sequence before retrying:

Browser Library: New Browser + New Page (no browser) or just New Page (page closed)
SeleniumLibrary: Open Browser about:blank chrome

The response includes fallback_applied: true and a fallback_steps count so the LLM knows that recovery happened. Small LLMs no longer need to handle "no browser open" errors themselves, saving 2-4 tool calls.
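
The recovery sequences above can be sketched as a small planning function. The keyword sequences and the fallback_applied / fallback_steps fields come from these notes; the state labels no_browser and page_closed and the function names are hypothetical, and real error detection would work from the library's error message rather than from a label.

```python
from typing import Any, Dict, List, Tuple

Step = Tuple[str, ...]  # keyword name followed by its arguments


def plan_navigate_fallback(library: str, state: str) -> List[Step]:
    """Return the recovery steps to run before retrying the navigation."""
    if library == "Browser":
        if state == "no_browser":
            return [("New Browser",), ("New Page",)]
        if state == "page_closed":
            return [("New Page",)]
    elif library == "SeleniumLibrary":
        return [("Open Browser", "about:blank", "chrome")]
    return []


def annotate_response(response: Dict[str, Any], steps: List[Step]) -> Dict[str, Any]:
    """Tell the caller that recovery happened and how many steps it took."""
    if not steps:
        return dict(response)
    return {**response, "fallback_applied": True, "fallback_steps": len(steps)}
```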

Strict Mode Hint Improvement

When a Browser Library keyword fails with "resolved to N elements" (Playwright strict mode), the error hint now:

Shows the actual element count in the message
Suggests >> nth=0 with a note that nth is zero-based
Includes >> nth=1 and >> visible=true alternatives
Uses the actual failing keyword name in examples
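
A hint with those four properties could be assembled along these lines. The "resolved to N elements" pattern and the >> nth / >> visible=true suggestions are from these notes; the exact wording of the hint and the name strict_mode_hint are made up for illustration.

```python
import re
from typing import Optional

_STRICT_RE = re.compile(r"resolved to (\d+) elements")


def strict_mode_hint(error: str, keyword: str, selector: str) -> Optional[str]:
    """Build a hint for a strict mode violation, or None for other errors."""
    match = _STRICT_RE.search(error)
    if match is None:
        return None
    count = int(match.group(1))
    return (
        f"Selector matched {count} elements. Try one of: "
        f"{keyword}    {selector} >> nth=0 (nth is zero-based), "
        f"{keyword}    {selector} >> nth=1, "
        f"{keyword}    {selector} >> visible=true"
    )
```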

Bug Fixes

Pattern store cleanup: cleanup_old_entries(max_age_days=0) now correctly removes all entries (changed > to >= comparison).
Windows CI: Fixed benchmark threshold comparisons and fd-redirect tests for Windows compatibility.
Build test suite newline escaping: build_test_suite now escapes literal \n, \r, \t in keyword arguments so they don't break the generated .robot file's line structure. Previously, an argument like "123 Flow Street\nSan Francisco" would produce a line break in the output instead of an escaped \\n.
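
Two of the fixes above are easy to illustrate in isolation. The sketch below assumes a pattern store shaped like a dict of timestamps, which is a simplification and not the real data structure, and it escapes only the three characters named in the notes. Both function bodies are hypothetical reconstructions of the described behavior.

```python
from datetime import datetime, timedelta
from typing import Dict


def cleanup_old_entries(
    entries: Dict[str, datetime], max_age_days: int, now: datetime
) -> Dict[str, datetime]:
    """Keep only entries younger than max_age_days."""
    cutoff = timedelta(days=max_age_days)
    # With ">=", an age of exactly zero meets a zero cutoff, so
    # max_age_days=0 removes everything. With ">" such entries survived.
    return {key: ts for key, ts in entries.items() if not (now - ts >= cutoff)}


def escape_argument(arg: str) -> str:
    """Replace raw newline, carriage return, and tab with escaped forms
    so the argument stays on one line of the generated .robot file."""
    return arg.replace("\n", "\\n").replace("\r", "\\r").replace("\t", "\\t")
```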

Test Suite

Suite	v0.29.0	v0.30.0
Unit	2,286	3,138
Integration	397	510
Benchmarks	140	256

New coverage areas: intent resolution (579 tests), tool profile services (379), response optimization (517), ADR-009 type aliases (663), ADR-010 coercion (840), ADR integration (366), ADR benchmarks (895), intent action E2E (7 models), navigate fallback (47 tests), strict mode hints (6 tests), argument escaping (9 tests).
