Replace curl-download with paginated inline results + widget JSON endpoint#232
Conversation
In HTTP mode, everyrow_results now returns a small schema preview (5 rows) plus a curl command to download the full CSV, instead of paginating up to 200 rows of JSON into the context window. This saves tokens and lets Claude use code execution to analyse data with pd.read_csv(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…N endpoint
- Default page_size=50 rows returned inline (was curl command)
- Widget fetches full dataset via results_url (?format=json endpoint)
- Remove curl_mode, _build_curl_result_response, curl_preview_rows setting
- Remove Redis-unavailable fallback (Redis always available)
- Fix docker-compose.local.yaml to use variable substitution for MCP_SERVER_URL
- Tool docstring instructs agent to ask user how many rows to ingest
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
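The paginated inline results described above can be sketched roughly as follows; the function name and response shape are illustrative, not the actual everyrow-mcp signatures:

```python
# Minimal sketch of inline pagination with a default page_size of 50,
# assuming rows is the full result set already loaded as dicts.
def paginate_rows(rows: list[dict], offset: int = 0, page_size: int = 50) -> dict:
    page = rows[offset : offset + page_size]
    return {
        "rows": page,
        "offset": offset,
        "page_size": page_size,
        "total": len(rows),
        # Tells the agent whether another everyrow_results call is needed.
        "has_more": offset + page_size < len(rows),
    }
```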
@claude code review
This comment was marked as outdated.
The progress tool's completion message now tells the agent to ask the user how many rows they'd like to review before fetching results. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
    session_url,
    mcp_server_url=mcp_server_url,
)
Bug: The function try_store_result incorrectly returns None when its type hint specifies list[TextContent], and the caller does not handle the None case.
Severity: HIGH
Suggested Fix
Update try_store_result to handle the expired token case without returning None. For example, it could raise an appropriate HTTP exception or return an empty list. Alternatively, change the function's return type hint to list[TextContent] | None and ensure the calling function everyrow_results_http correctly handles the None case.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: everyrow-mcp/src/everyrow_mcp/tools.py#L925-L932
Potential issue: The function `try_store_result` is type-hinted to return
`list[TextContent]`, but it can return `None` if a poll token has expired (typically
after 24 hours). The calling function, `everyrow_results_http`, does not handle the
`None` case and returns it directly. This mismatch between the declared return type and
the actual return value will lead to a runtime error when a client attempts to retrieve
results for a task whose poll token has expired, breaking the results retrieval flow for
older tasks.
… endpoint
- try_store_result: raise RuntimeError instead of returning None when poll token expires (matches tightened return type)
- JSON download endpoint: use pandas + _sanitize_records instead of csv.DictReader to preserve numeric types and handle NaN/Inf
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
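The pandas-based JSON conversion path mentioned in this commit can be sketched as below. `_sanitize_records` here is a guess at what the real helper does (NaN/Inf are invalid in strict JSON, so they become None):

```python
import io
import math
import pandas as pd

def _sanitize_records(records: list[dict]) -> list[dict]:
    """Replace non-finite floats (NaN/Inf) with None for JSON safety."""
    def clean(v):
        if isinstance(v, float) and not math.isfinite(v):
            return None
        return v
    return [{k: clean(v) for k, v in row.items()} for row in records]

def csv_to_json_records(csv_text: str) -> list[dict]:
    # Unlike dtype=str, letting pandas infer dtypes preserves numeric
    # values so the widget can sort and filter them correctly.
    df = pd.read_csv(io.StringIO(csv_text))
    return _sanitize_records(df.to_dict(orient="records"))
```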
- Add size guard (413) to ?format=json to prevent memory exhaustion - Add X-Content-Type-Options: nosniff header to JSON responses - Widget retry button now mints fresh download token instead of reusing consumed one - Strengthen progress completion message to prevent premature everyrow_results calls - Tell agent it can only see N of total rows (full dataset visible in widget) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. Atomic pop_download_token via Lua script (TOCTOU race fix) 2. Re-enable network policy in values.yaml 3. Lower JSON conversion threshold to 10MB (memory amplification) 4. Validate format query param — reject unknown values with 400 5. Use dtype=str in pd.read_csv for JSON conversion path 6. Widget uses fresh token on every full-results fetch (no stale URL) 7. Remove pre-minted results_url — widget mints tokens on demand Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All four instruction locations now consistently require asking the user how many rows to load — no more "if small, use default" loopholes: - System instructions (app.py) - Progress completion message (tool_helpers.py) - Tool docstring (tools.py) - Parameter description (models.py) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Skip widget JSON for offset > 0 — the widget already has all data from the first page, so subsequent pages only need the text summary
- Revert Lua script for pop_download_token back to simple TTL + GETDEL; the TOCTOU race has no exploitable consequence
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The model was telling users "load into our conversation" which is confusing. Updated all 5 instruction locations to say "load into my context so I can read them" and clarify the user always has full access via the widget. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude code review
Claude finished @RafaelPo's task in 1m 51s. Code Review:
Clean PR — the architecture shift from curl-download to paginated inline results + widget JSON endpoint is well-motivated and the test coverage is solid. A few items worth addressing: 1. Unhandled exception from
Case-insensitive search across all columns with 150ms debounce. Input expands on focus and clears when new data loads. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dtype=str forced all values to strings in the JSON response, breaking numeric sorting/filtering in the widget. _sanitize_records already handles NaN/Inf safely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Search input now sits in the session-link row, right-aligned - Widget hides itself entirely when it receives non-widget data (e.g. text summary from offset > 0 page requests) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Body starts with display:none and is only shown when ontoolresult receives recognized widget data (fetch_full_results, preview, or array). Eliminates the brief flash on subsequent page requests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ontoolresult fires per TextContent item. The second item (summary text) was hiding the widget that the first item (JSON) just showed. Now non-JSON events are simply ignored instead of hiding the body. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Setting display="" removes the inline style, falling back to the CSS rule (display:none). Must explicitly set display:block. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
display:none hides content but the iframe still takes up space. Using height:0 + overflow:hidden makes the iframe auto-collapse to zero. Restored to height:auto with padding when widget has data. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
max_json_size = settings.max_upload_size_bytes // 5
if len(csv_text) > max_json_size:
Bug: The code compares the character count of csv_text using len() against max_json_size, a limit in bytes. This can underestimate the true memory size for multi-byte UTF-8 characters.
Severity: MEDIUM
Suggested Fix
The check should compare the byte length of the string, not the character count. Encode the string to bytes before checking its length: if len(csv_text.encode('utf-8')) > max_json_size:.
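The divergence between character count and byte count is easy to demonstrate; a minimal check along the lines of the suggested fix, with csv_text and max_json_size as in the snippet above:

```python
def exceeds_json_limit(csv_text: str, max_json_size: int) -> bool:
    # Measure bytes, not characters: len() on a str counts code points,
    # which undercounts multi-byte UTF-8 text against a byte-based limit.
    return len(csv_text.encode("utf-8")) > max_json_size

# "é" is 1 character but 2 UTF-8 bytes, so a 10-character string of "é"
# occupies 20 bytes and should trip a 15-byte limit.
```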
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: everyrow-mcp/src/everyrow_mcp/routes.py#L308-L309
Potential issue: A check intended to prevent memory exhaustion compares the character
count of `csv_text` with a byte-based limit, `max_json_size`. Since `len()` on a string
returns the number of characters, not bytes, this check is inaccurate for data
containing multi-byte UTF-8 characters (e.g., accented or CJK characters). A string that
passes the character count check could be significantly larger in bytes, potentially
bypassing the intended memory limit and leading to excessive memory usage or
Out-of-Memory errors during the subsequent JSON conversion.
Summary
- Add ?format=json to download endpoint so the widget can fetch the full dataset independently via results_url
- Fix docker-compose.local.yaml to use variable substitution for MCP_SERVER_URL (was hardcoded, breaking tunnel setups)
- Remove curl_mode, _build_curl_result_response, curl_preview_rows setting, and Redis-unavailable fallback
Test plan
🤖 Generated with Claude Code