feat(ai-proxy): abort upstream read on client disconnect during streaming (#13254)
Conversation
…ming

When a downstream client disconnects mid-stream, the proxy was continuing to read all remaining chunks from the LLM, performing SSE parsing, token counting, and protocol conversion unnecessarily.

Fix by passing `wait=true` to `lua_response_filter` in the streaming path. `ngx.flush(true)` returns an error when the client connection is gone, at which point we close the upstream httpc connection and return early.

Changes:
- `plugin.lua`: add optional `wait` param to `lua_response_filter`; return `(ok, err)` so callers can detect client disconnection
- `ai-providers/base.lua`: use `wait=true` in `parse_streaming_response` output loop; on flush failure close upstream and return immediately
Pull request overview
This PR improves APISIX’s AI proxy streaming behavior by detecting downstream client disconnects and promptly aborting the upstream streaming read, avoiding wasted CPU and LLM quota.
Changes:
- Extend `apisix.plugin.lua_response_filter` with an optional `wait` parameter to allow synchronous flushing and surface disconnect errors via `(ok, err)` returns.
- Update the AI streaming loop to flush synchronously, detect disconnects, and close the upstream HTTP client immediately.
- Add an integration test that simulates a client disconnect and asserts upstream streaming stops early.
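The control flow of the extended filter can be sketched standalone in plain Lua. This is only an illustration of the PR description: the real function lives in `apisix/plugin.lua` and calls the OpenResty `ngx` API, which is mocked here, and the `client_connected` flag is an invented stand-in for the downstream connection state.

```lua
-- Mocked ngx API: a synchronous flush (wait=true) can report a dead
-- client, while an async flush never does. `client_connected` is an
-- illustrative flag simulating the downstream connection.
client_connected = true

local ngx = {
  print = function(chunk) return true end,
  flush = function(wait)
    if wait and not client_connected then
      return nil, "client aborted connection"
    end
    return true
  end,
}

-- Sketch of lua_response_filter with the new optional `wait` parameter.
-- It returns (ok, err) so callers can detect a disconnect; `wait`
-- defaults to false, leaving existing callers unaffected.
function lua_response_filter(chunk, wait)
  ngx.print(chunk)
  if wait then
    return ngx.flush(true)   -- synchronous: surfaces disconnect errors
  end
  return ngx.flush()         -- async: never reports a dead client
end
```

With `client_connected = false`, only the `wait=true` path reports an error, which is what lets the streaming loop bail out while plain callers keep their old behavior.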
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `apisix/plugin.lua` | Adds `wait` option and return values to enable disconnect detection during streaming flush. |
| `apisix/plugins/ai-providers/base.lua` | Uses `wait=true` during streaming output and closes upstream on flush failure. |
| `t/plugin/ai-proxy-client-disconnect.t` | New integration test validating upstream abort behavior after downstream disconnect. |
…ct handler

- update docstring to accurately reflect that `lua_response_filter` always returns `(ok, err)` regardless of the `wait` parameter
- avoid passing nil to `ngx_flush`: explicitly call `ngx_flush(true)` when `wait == true`, `ngx_flush()` otherwise
- extract an `abort_on_disconnect` local helper in `parse_streaming_response` to deduplicate the log+close+mark-done pattern

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
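The deduplicated helper might look roughly like this, a sketch inferred from the commit message only: the actual names and logging calls in `base.lua` may differ, and `httpc`, `log`, and `done` are stubs invented here so the snippet runs under plain Lua.

```lua
-- Stubs standing in for the real http client and error logger.
local httpc = { closed = false }
function httpc:close() self.closed = true end

local logged = {}
local log = {
  error = function(...) logged[#logged + 1] = table.concat({ ... }) end,
}

-- One place for the log + close + mark-done pattern that every
-- disconnect path in the streaming loop needs.
local done = false
local function abort_on_disconnect(err)
  log.error("client disconnected, abort upstream read: ", err)
  httpc:close()
  done = true
end

abort_on_disconnect("client aborted connection")
```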
…sconnect test

Lua-style comments (`--`) placed outside `content_by_lua_block` are parsed as nginx directives, causing nginx to fail to start with 'unknown directive "--"'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Lua table serialization via `cjson` does not guarantee field order, so the response may contain `output_tokens` before `input_tokens`. Use a lookahead regex to match both fields regardless of order.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
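The order-independence fix can be illustrated in plain Lua. The PCRE lookahead shown in the comment is illustrative, not the exact pattern from the test, and `has_both_token_fields` is an invented stand-in: Lua patterns lack lookahead, so it checks each field independently, which is semantically what the lookaheads do.

```lua
-- Either serialization order is valid cjson output:
local a = '{"usage":{"input_tokens":10,"output_tokens":20}}'
local b = '{"usage":{"output_tokens":20,"input_tokens":10}}'

-- A PCRE lookahead matches both orders in one pattern, e.g.:
--   (?=.*"input_tokens":\d+)(?=.*"output_tokens":\d+)
-- Lua patterns have no lookahead, so the equivalent is two
-- independent matches:
local function has_both_token_fields(body)
  return body:find('"input_tokens":%d+') ~= nil
     and body:find('"output_tokens":%d+') ~= nil
end
```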
Also fixed a pre-existing test failure: the curl to port 9100 was firing before the etcd stream-route change propagated to the stream workers, causing 'matched route: null' and skipping DNS resolution. Add a 1s sleep after the admin PUT to let the config sync complete before making the test request.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixed the …
…and improve test robustness
# Conflicts:
#	apisix/plugins/ai-providers/base.lua
Summary
When a downstream client disconnects mid-stream (browser tab closed, Ctrl+C, request cancelled), the proxy continues reading all remaining chunks from the LLM and performing SSE parsing, token counting, and protocol conversion — burning CPU and LLM API quota for no benefit.
Root Cause

`lua_response_filter` used `ngx.flush()` (async, no wait), which never surfaces client disconnection errors. There was no mechanism to detect a dead downstream in the streaming loop.

Fix

Add an optional `wait` parameter to `lua_response_filter`. When `wait=true`, it uses `ngx.flush(true)` (synchronous flush), which returns an error if the client connection is gone. The streaming path in `parse_streaming_response` now passes `wait=true` and, on flush failure, immediately closes the upstream connection and exits the read loop.

The `wait` parameter defaults to `false`, so all existing callers are unaffected.

Changes

- `apisix/plugin.lua`: add optional `wait` param to `lua_response_filter`; return `(ok, err)` for disconnect detection
- `apisix/plugins/ai-providers/base.lua`: use `wait=true` in `parse_streaming_response` output loop; on flush failure close upstream and return
- `t/plugin/ai-proxy-client-disconnect.t`: integration test verifying upstream is aborted after client disconnect

Test
The new test sets up a slow SSE mock (30ms/chunk, up to 2000 chunks) and a client that reads 3 chunks then closes. It verifies via shared dict that the upstream served well under 50 chunks total (stopped shortly after the disconnect).
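The flow the test exercises can be simulated in plain Lua. Everything here is a stand-in: `read_chunk`, `send_chunk`, and the mock `httpc` are invented names, and in the real code the send step is `lua_response_filter(chunk, true)` flushing through `ngx.flush(true)`.

```lua
local chunks_read = 0
local httpc = { closed = false }
function httpc:close() self.closed = true end

-- Upstream mock: an endless supply of SSE chunks (the real mock
-- serves up to 2000 chunks, 30ms apart).
local function read_chunk()
  chunks_read = chunks_read + 1
  return "data: chunk " .. chunks_read .. "\n\n"
end

-- Downstream mock: the client reads 3 chunks then disconnects,
-- after which a synchronous flush starts failing.
local function send_chunk(chunk)
  if chunks_read > 3 then
    return nil, "client aborted connection"
  end
  return true
end

-- The fixed loop: on flush failure, close upstream and stop reading
-- instead of draining the remaining chunks.
while true do
  local chunk = read_chunk()
  local ok = send_chunk(chunk)
  if not ok then
    httpc:close()
    break
  end
end
```

The loop stops on the first failed send after the disconnect, mirroring the test's assertion that the upstream serves far fewer than its 2000 available chunks.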