Conversation
SDK (task.py): - Add ProgressInfo dataclass and progress extraction from status endpoint - Update await_task_completion with stderr output, JSONL logging, and progress callback (triggers on any count change, not just completed) - Add session URL display and ETA estimates MCP server (server.py): - Add everyrow_agent_submit, everyrow_rank_submit (non-blocking submit) - Add everyrow_progress (12s server-side blocking, chaining instructions) - Add everyrow_results (fetch data, save CSV, cleanup session) - Update auth from whoami to get_billing (v0.2.0 API change) - Keep existing blocking tools for backward compatibility Plugin + hooks: - Add PostToolUse hooks for submit/progress/results tracking - Add Stop hook guard to prevent agent stopping during active tasks - Add SessionEnd cleanup hook - Add status line script for persistent progress bar Includes 33 MCP tests, 5 hook test suites, e2e verification script. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add structured_output=False to all MCP tool decorators to suppress FastMCP's structuredContent generation (Claude Code displays JSON blob otherwise) - Server writes /tmp/everyrow-task.json directly instead of hooks parsing tool_response (avoids fragile double-escaped JSON) - Simplify hook scripts to no-ops (submit/progress) since server handles state - Add dev-claude.sh for local development with plugin + local engine - Fix verify_transcript.sh to use substring matching for tool names Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
||
| @mcp.tool(name="everyrow_progress", structured_output=False) | ||
| async def everyrow_progress(params: ProgressInput) -> list[TextContent]: | ||
| """Check progress of a running everyrow task. Blocks ~12s before returning. |
| _active_tasks: dict[str, dict[str, Any]] = {} | ||
|
|
||
| PROGRESS_POLL_DELAY = 12 # seconds to block in everyrow_progress before returning | ||
| TASK_STATE_FILE = "/tmp/everyrow-task.json" |
There was a problem hiding this comment.
This seems like a pretty bad idea, and blocks anyone from using more than one Claude to do anything with everyrow. As far as I can see, we always know the task context we're interested in when handling this file, so could we call this /tmp/everyrow-task-{task-id}.json instead?
|
|
||
| # Track active tasks for the submit/poll pattern. | ||
| # Maps task_id -> {session, client, total, session_url, started_at} | ||
| _active_tasks: dict[str, dict[str, Any]] = {} |
There was a problem hiding this comment.
do we need to make the server stateful here? seems like it shouldn't be needed?
| _active_tasks[task_id] = { | ||
| "session": session, | ||
| "session_ctx": session_ctx, | ||
| "client": client, | ||
| "total": total, | ||
| "session_url": session_url, | ||
| "started_at": time.monotonic(), | ||
| "input_csv": params.input_csv, | ||
| "prefix": "ranked", | ||
| } |
There was a problem hiding this comment.
I'm not convinced any of this is needed. maybe an @cache function somewhere for task id-> session, but everything else seems like unnecessary state. especially in the context of potentially moving this MCP to the web
src/everyrow/task.py
Outdated
|
|
||
|
|
||
| def _default_progress_output(progress: ProgressInfo, total: int, elapsed: float) -> None: | ||
| """Print a progress line to stderr.""" |
There was a problem hiding this comment.
why stderr? is this to avoid stdio output stream getting filled with stuff not related to MCP messages?
seems like bad practice to dump stuff to stderr
ARCHITECTURE.md
Outdated
| - `everyrow_merge` — Join two CSVs by intelligent entity matching | ||
| - `everyrow_agent` — Run web research agents on each row | ||
|
|
||
| Submit/poll tools (for long-running operations): |
There was a problem hiding this comment.
why not re-use the _async naming pattern of the sdk? submit doesn't seem very descriptive
ARCHITECTURE.md
Outdated
| - `everyrow_progress` — Poll task status (blocks ~12s server-side, returns progress text) | ||
| - `everyrow_results` — Retrieve completed results, save to CSV | ||
|
|
||
| All tools use `@mcp.tool(structured_output=False)` to suppress FastMCP's `structuredContent` field. Without this, Claude Code displays raw JSON blobs instead of clean text (see [claude-code#9962](https://github.com/anthropics/claude-code/issues/9962)). |
There was a problem hiding this comment.
does it matter much what claude code sees? I'd probably optimise MCP user experience for claude desktop/GUI clients. It's possible that they also display it weirdly...
ARCHITECTURE.md
Outdated
| Long-running operations (agent_map, rank) use a submit/poll pattern because: | ||
| - Operations take 1–10+ minutes | ||
| - LLMs cannot tell time and will hallucinate if asked to wait ([arXiv:2601.13206](https://arxiv.org/abs/2601.13206)) | ||
| - Client-side timeouts (60s in Codex CLI) kill blocking calls |
There was a problem hiding this comment.
if we used a streaming api, it could start by returning the task id, then if the timeout kills it the client can fallback to fetching/polling with the additional endpoint? that way we smoothly transition from short blockable tasks, to those that just miss out on the timeout, without killing the task if the timeout is hit
ARCHITECTURE.md
Outdated
| Stop: | ||
| Runs `everyrow-stop-guard.sh`, reads `/tmp/everyrow-task.json`. If a task is running, outputs `{"decision": "block", "reason": "..."}` which prevents Claude from ending its turn. The reason text instructs Claude to call `everyrow_progress` to check status. |
There was a problem hiding this comment.
wait so we continually burn through CC tokens until the task finishes, trapping CC in a loop of needing to call everyrow_progress?
ARCHITECTURE.md
Outdated
|
|
||
| Server-Sent Events (SSE): Would replace polling with push notifications. Adds complexity (persistent connections, reconnection logic) for marginal gain. The 12s polling cadence already provides smooth UX. MCP's stdio transport doesn't support SSE natively. | ||
|
|
||
| Hook-based state tracking: The original design had PostToolUse hooks on `_submit` and `_progress` tools parse `tool_response` JSON and write the task state file. This was fragile because plugin MCP tool responses are double-escaped JSON strings (`{"result": "<escaped>"}`) that required careful parsing. Moving state writes into the MCP server itself (`_write_task_state()`) was simpler and more reliable. |
There was a problem hiding this comment.
I still don't follow why we need state at all
End users don't need to override the default API URL
It's subject to change. I don't want the docs to fall out of date.
These async variants have been around for a while, and we haven't felt the need to document them. I'm not convinced it's worth it now.
|
I think I've addressed all the feedback. This PR is rather unwieldy, so I'm planning to merge it shortly and fix any remaining issues in follow-up PRs. Some CI failures are expected since the MCP package depends on the published SDK package (see #146). I will verify that the errors are due to that and not something else. |
See for example: