fix(grader): support 'arguments' key for tool calls in task_workflow#306

Merged
olearycrew merged 1 commit into pinchbench:main from mgoulart:fix/grader-workflow-arguments-key on Apr 14, 2026
@mgoulart
Contributor

Problem

The grader in task_workflow.md checks whether the agent read config.json by inspecting tool call parameters:

params = item.get("params", {})

This only works for agents that serialize tool parameters under the "params" key (Cursor, Windsurf). OpenClaw and Claude Code use "arguments" instead, which is the key name used by the OpenAI tool call spec. For those agents the lookup returns an empty dict, so read_config always scores 0 even when they correctly read the file.
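To make the failure mode concrete, here is a minimal sketch of the old lookup applied to both serialization styles. The item shapes are hypothetical: only the "params" and "arguments" keys come from this PR, while the "name" and "path" fields are assumptions made up for illustration.

```python
# Hypothetical transcript items. Only the "params" / "arguments" key names
# are taken from the PR; "name" and "path" are illustrative assumptions.
params_style = {"name": "read_file", "params": {"path": "config.json"}}
arguments_style = {"name": "read_file", "arguments": {"path": "config.json"}}


def old_lookup(item):
    """The lookup as it was before this fix: only the "params" key is checked."""
    return item.get("params", {})


# The "params" style is found, the "arguments" style silently yields {}.
print(old_lookup(params_style))
print(old_lookup(arguments_style))
```

Because the second call yields an empty dict rather than raising, the grader has nothing to match against and the check fails quietly.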

Fix

Fall back to "arguments" when "params" is absent:

# Support both "params" (Cursor/Windsurf) and "arguments" (OpenClaw/Claude Code)
params = item.get("params", item.get("arguments", {}))
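As a rough sketch of how the fallback behaves inside a check like read_config, assuming a transcript that is a list of dict items: the helper names `tool_params` and `read_config_score` are hypothetical, and the real grader's matching logic in task_workflow.md is not shown in this PR and may differ.

```python
def tool_params(item):
    """Return tool call parameters from either serialization style.

    Mirrors the one-line fix: prefer "params", fall back to "arguments".
    """
    return item.get("params", item.get("arguments", {}))


def read_config_score(transcript):
    """Hypothetical stand-in for the read_config check (not the real grader).

    Scores 1.0 if any tool call mentions config.json in its parameters.
    str() is used so the check also works if the parameters arrive as a
    JSON-encoded string instead of a dict.
    """
    for item in transcript:
        if "config.json" in str(tool_params(item)):
            return 1.0
    return 0.0


# Illustrative transcripts; the field values are assumptions.
params_transcript = [{"params": {"path": "config.json"}}]
arguments_transcript = [{"arguments": {"path": "config.json"}}]
unrelated_transcript = [{"arguments": {"path": "README.md"}}]
```

With this sketch, both `params_transcript` and `arguments_transcript` score 1.0, while `unrelated_transcript` scores 0.0.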

Verified

Confirmed by running task_10_workflow (pre-rename) against kimi-k2p5 on Fireworks. The agent correctly read config.json — visible in the transcript — but scored 0 on read_config before this fix and 1.0 after.

🤖 Generated with Claude Code

OpenClaw and Claude Code serialize tool call parameters under the
'arguments' key, while Cursor and Windsurf use 'params'. The grader
only checked 'params', so read_config always scored 0 for OpenClaw/
Claude Code agents even when they correctly read config.json.

Fix: fall back to 'arguments' when 'params' is absent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kilo-code-bot
Contributor

kilo-code-bot Bot commented Apr 14, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

The fix correctly handles both "params" and "arguments" key formats for tool call parameters, matching the OpenAI spec used by Claude Code/OpenClaw while preserving compatibility with Cursor/Windsurf. The fallback chain item.get("params", item.get("arguments", {})) is safe and idiomatic.
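One edge case worth noting about the fallback chain: `dict.get` only falls back when the key is absent, not when it is present with a `None` or empty value. The sketch below contrasts the merged form with a truthiness-based variant. The variant and both function names are hypothetical, and whether any agent actually emits `"params": None` alongside `"arguments"` is an assumption not established by this PR.

```python
def merged_fallback(item):
    """The form merged in this PR: falls back only if "params" is absent."""
    return item.get("params", item.get("arguments", {}))


def truthy_fallback(item):
    """Hypothetical variant: also falls back if "params" is None or empty."""
    return item.get("params") or item.get("arguments") or {}


# Illustrative item where "params" is present but null.
null_params_item = {"params": None, "arguments": {"path": "config.json"}}

print(merged_fallback(null_params_item))   # None, because the key exists
print(truthy_fallback(null_params_item))   # {'path': 'config.json'}
```

For transcripts where each item carries exactly one of the two keys, the two forms behave the same.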

Files Reviewed (1 file)
  • tasks/task_workflow.md

Reviewed by claude-4.6-sonnet-20260217 · 58,447 tokens

@olearycrew
Member

@mgoulart thanks for this fix!

@olearycrew olearycrew merged commit 816b771 into pinchbench:main Apr 14, 2026
1 check passed