Skip to content

Implement MCP and SDK streaming messages#114

Merged
rgambee merged 68 commits intomainfrom
streaming-sdk
Feb 12, 2026
Merged

Implement MCP and SDK streaming messages#114
rgambee merged 68 commits intomainfrom
streaming-sdk

Conversation

@dschwarz26
Copy link
Contributor

See for example:

Screenshot 2026-02-08 at 12 32 56 PM

dschwarz26 and others added 5 commits February 8, 2026 12:29
SDK (task.py):
- Add ProgressInfo dataclass and progress extraction from status endpoint
- Update await_task_completion with stderr output, JSONL logging, and
  progress callback (triggers on any count change, not just completed)
- Add session URL display and ETA estimates

MCP server (server.py):
- Add everyrow_agent_submit, everyrow_rank_submit (non-blocking submit)
- Add everyrow_progress (12s server-side blocking, chaining instructions)
- Add everyrow_results (fetch data, save CSV, cleanup session)
- Update auth from whoami to get_billing (v0.2.0 API change)
- Keep existing blocking tools for backward compatibility

Plugin + hooks:
- Add PostToolUse hooks for submit/progress/results tracking
- Add Stop hook guard to prevent agent stopping during active tasks
- Add SessionEnd cleanup hook
- Add status line script for persistent progress bar

Includes 33 MCP tests, 5 hook test suites, e2e verification script.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add structured_output=False to all MCP tool decorators to suppress
  FastMCP's structuredContent generation (Claude Code displays JSON
  blob otherwise)
- Server writes /tmp/everyrow-task.json directly instead of hooks
  parsing tool_response (avoids fragile double-escaped JSON)
- Simplify hook scripts to no-ops (submit/progress) since server
  handles state
- Add dev-claude.sh for local development with plugin + local engine
- Fix verify_transcript.sh to use substring matching for tool names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@mcp.tool(name="everyrow_progress", structured_output=False)
async def everyrow_progress(params: ProgressInput) -> list[TextContent]:
"""Check progress of a running everyrow task. Blocks ~12s before returning.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why block?

_active_tasks: dict[str, dict[str, Any]] = {}

PROGRESS_POLL_DELAY = 12 # seconds to block in everyrow_progress before returning
TASK_STATE_FILE = "/tmp/everyrow-task.json"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a pretty bad idea, and blocks anyone from using more than one Claude to do anything with everyrow. As far as I can see, we always know the task context we're interested in when handling this file, so could we call this /tmp/everyrow-task-{task-id}.json instead?


# Track active tasks for the submit/poll pattern.
# Maps task_id -> {session, client, total, session_url, started_at}
_active_tasks: dict[str, dict[str, Any]] = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to make the server stateful here? seems like it shouldn't be needed?

Comment on lines +724 to +733
_active_tasks[task_id] = {
"session": session,
"session_ctx": session_ctx,
"client": client,
"total": total,
"session_url": session_url,
"started_at": time.monotonic(),
"input_csv": params.input_csv,
"prefix": "ranked",
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced any of this is needed. maybe an @cache function somewhere for task id-> session, but everything else seems like unnecessary state. especially in the context of potentially moving this MCP to the web



def _default_progress_output(progress: ProgressInfo, total: int, elapsed: float) -> None:
"""Print a progress line to stderr."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why stderr? is this to avoid stdio output stream getting filled with stuff not related to MCP messages?
seems like bad practice to dump stuff to stderr

ARCHITECTURE.md Outdated
- `everyrow_merge` — Join two CSVs by intelligent entity matching
- `everyrow_agent` — Run web research agents on each row

Submit/poll tools (for long-running operations):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not re-use the _async naming pattern of the sdk? submit doesn't seem very descriptive

ARCHITECTURE.md Outdated
- `everyrow_progress` — Poll task status (blocks ~12s server-side, returns progress text)
- `everyrow_results` — Retrieve completed results, save to CSV

All tools use `@mcp.tool(structured_output=False)` to suppress FastMCP's `structuredContent` field. Without this, Claude Code displays raw JSON blobs instead of clean text (see [claude-code#9962](https://github.com/anthropics/claude-code/issues/9962)).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it matter much what claude code sees? I'd probably optimise MCP user experience for claude desktop/GUI clients. It's possible that they also display it weirdly...

ARCHITECTURE.md Outdated
Long-running operations (agent_map, rank) use a submit/poll pattern because:
- Operations take 1–10+ minutes
- LLMs cannot tell time and will hallucinate if asked to wait ([arXiv:2601.13206](https://arxiv.org/abs/2601.13206))
- Client-side timeouts (60s in Codex CLI) kill blocking calls
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we used a streaming api, it could start by returning the task id, then if the timeout kills it the client can fallback to fetching/polling with the additional endpoint? that way we smoothly transition from short blockable tasks, to those that just miss out on the timeout, without killing the task if the timeout is hit

ARCHITECTURE.md Outdated
Comment on lines +116 to +117
Stop:
Runs `everyrow-stop-guard.sh`, reads `/tmp/everyrow-task.json`. If a task is running, outputs `{"decision": "block", "reason": "..."}` which prevents Claude from ending its turn. The reason text instructs Claude to call `everyrow_progress` to check status.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait so we continually burn through CC tokens until the task finishes, trapping CC in a loop of needing to call everyrow_progress?

ARCHITECTURE.md Outdated

Server-Sent Events (SSE): Would replace polling with push notifications. Adds complexity (persistent connections, reconnection logic) for marginal gain. The 12s polling cadence already provides smooth UX. MCP's stdio transport doesn't support SSE natively.

Hook-based state tracking: The original design had PostToolUse hooks on `_submit` and `_progress` tools parse `tool_response` JSON and write the task state file. This was fragile because plugin MCP tool responses are double-escaped JSON strings (`{"result": "<escaped>"}`) that required careful parsing. Moving state writes into the MCP server itself (`_write_task_state()`) was simpler and more reliable.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't follow why we need state at all

@rgambee
Copy link
Contributor

rgambee commented Feb 12, 2026

I think I've addressed all the feedback. This PR is rather unwieldy, so I'm planning to merge it shortly and fix any remaining issues in follow-up PRs.

Some CI failures are expected since the MCP package depends on the published SDK package (see #146). I will verify that the errors are due to that and not something else.

@rgambee rgambee merged commit daef48d into main Feb 12, 2026
9 checks passed
@rgambee rgambee deleted the streaming-sdk branch February 12, 2026 23:05
@rgambee rgambee mentioned this pull request Feb 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants