feat: migrate to Praxis filter-based proxy architecture #27
franciscojavierarceo wants to merge 1 commit into
Replace the Axum HTTP server with Praxis as the core proxy runtime. All request handling logic is now implemented as composable Praxis filters (responses_proxy, ogx_state, agentic_loop, tool_dispatch), wired together via YAML configuration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
@@ -0,0 +1,22 @@
admin:
We need documentation of what each field in this YAML file means and what it is used for.
Some fields are hard to interpret. For example, store_base_url sounds related to storage, but it is not a database URL?
Overall, the Praxis library is relatively new, and I'm not sure it is suitable to rely on; maintaining the dependency is costly.
This PR is difficult to review and judge, since it requires the agentic-api maintainers to be familiar with Praxis.
Meanwhile, writing our own HTTP request gateway natively would give us flexibility, especially for SSE streaming and tool calls.
As for testing agentic-api as a whole system, its functionality now relies entirely on Praxis being maintained and tested. If we hit bugs in Praxis, we would have to wait for fixes there.
I thought that, based on the last community meeting, we would use OGX for CRUD features, as it is a well-maintained project?
Thanks for putting this together. I like the direction of making gateway concerns more composable, and Praxis does look like a strong proxy framework. The filter model is useful for auth, rate limiting, tenant routing, quotas, policy, request validation, header injection, deployment guardrails, etc. But I don't think Praxis should be the core boundary for agentic-api. My main concern is that this moves too much of the actual agentic runtime into proxy filters. Things like:
These, I feel, are not really generic proxy concerns. They are the main state machine of agentic-api. Splitting this into filters like state_hydration, agentic_loop, tool_dispatch, and responses_proxy may look composable, but in practice they are tightly coupled by shared state, response semantics, and stream ordering. I worry we end up encoding the main transaction as middleware, which is harder to reason about and test.

Praxis can still be useful, in an architecture like:
client / codex cli / agent harness
So Praxis as an outer gateway in front of agentic-api, not where we decompose the agentic loop itself.

As for OGX, I think it can be a backend/service provider for built-in tools like file_search, vector stores, files, or other stateful services, but agentic-api should still decide when and how those tools participate in the responses loop.

So my recommendation is: don't make Praxis the core architecture boundary for agentic-api. If we support Praxis, I'd rather make it an optional outer gateway integration, or a very thin adapter that delegates into an explicit agentic-api orchestration core.
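To make the "explicit orchestration core" idea concrete, here is a minimal sketch of the agentic loop written as a state machine rather than as middleware. Every name in it (`ModelOutput`, `LoopState`, `step`, `run`) is hypothetical and does not come from agentic-api or Praxis. Inference and tool execution are passed in as plain closures so the loop can be tested without a gateway.

```rust
/// Hypothetical model output: a final answer or a request to call tools.
#[derive(Debug, Clone, PartialEq)]
enum ModelOutput {
    Final(String),
    ToolCalls(Vec<String>),
}

/// Every state of the loop is spelled out, so transitions are easy to audit.
#[derive(Debug, Clone, PartialEq)]
enum LoopState {
    Infer { history: Vec<String> },
    DispatchTools { history: Vec<String>, calls: Vec<String> },
    Done(String),
    Failed(String),
}

/// Advance the machine by exactly one transition.
fn step(
    state: LoopState,
    infer: &dyn Fn(&[String]) -> ModelOutput,
    run_tool: &dyn Fn(&str) -> String,
) -> LoopState {
    match state {
        LoopState::Infer { history } => match infer(history.as_slice()) {
            ModelOutput::Final(text) => LoopState::Done(text),
            ModelOutput::ToolCalls(calls) => LoopState::DispatchTools { history, calls },
        },
        LoopState::DispatchTools { mut history, calls } => {
            // Append each tool result to the history, then infer again.
            for call in &calls {
                history.push(format!("tool:{}={}", call, run_tool(call)));
            }
            LoopState::Infer { history }
        }
        // Done and Failed are terminal: stepping them changes nothing.
        terminal => terminal,
    }
}

/// Drive the machine until it terminates or the step budget runs out.
fn run(
    input: &str,
    max_steps: usize,
    infer: &dyn Fn(&[String]) -> ModelOutput,
    run_tool: &dyn Fn(&str) -> String,
) -> LoopState {
    let mut state = LoopState::Infer { history: vec![input.to_string()] };
    for _ in 0..max_steps {
        state = step(state, infer, run_tool);
        if matches!(state, LoopState::Done(_) | LoopState::Failed(_)) {
            return state;
        }
    }
    LoopState::Failed("step budget exhausted".to_string())
}
```

Because the whole transaction lives in `step`, shared state and ordering are visible in one place, which is the property the filter decomposition makes harder to see.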
Captures the three-layer crate design (core library, axum server, thin gateway adapters) and key architectural decisions:
- Agentic loop as explicit state machine, not proxy filters
- No Python (OGX) in core request paths
- Praxis routes to agentic-api as a backend service
- Standalone mode is first-class
- Gateway adapters are thin (one filter per gateway)

Shapes PR vllm-project#24 as the foundation for agentic-core/agentic-server. Supersedes PR vllm-project#27's filter-based decomposition approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>
- Add note about superseding ADR-01 language decision (D3)
- Remove axum from Layer 3 diagram (it belongs in Layer 2)
- Soften PR vllm-project#27 language to "if accepted"
- Clarify PR vllm-project#24 relationship as forward-looking

Signed-off-by: Sébastien Han <seb@redhat.com>
All great points. I'm proposing a design that should align all parties, so let's discuss over this ADR #28 :)
- Use correct Praxis terms: HttpFilter, filter chain, branch chains (not "pipeline nodes", "DAG", or "re-entrance")
- Each agentic-core function is wrapped in an HttpFilter, composed into a filter chain with branch support for tool-call looping
- Standalone mode uses execute() with plain Rust control flow
- PR vllm-project#27 aligns in direction but should delegate to agentic-core functions rather than implementing logic directly in filters

Signed-off-by: Sébastien Han <seb@redhat.com>
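A rough sketch of the delegation pattern this commit describes may help: the logic lives in plain functions, a gateway filter only forwards to them, and standalone mode calls the same functions directly. The `agentic_core` module below is a stand-in for the real crate, and `RequestFilter` is a made-up trait, not the actual Praxis `HttpFilter` signature. The substring check for `function_call` is a placeholder for real response parsing.

```rust
/// Stand-in for the core library: plain functions, no gateway types.
mod agentic_core {
    /// Prepend stored conversation history to the new input.
    pub fn hydrate(history: &[&str], input: &str) -> String {
        let mut parts: Vec<&str> = history.to_vec();
        parts.push(input);
        parts.join("\n")
    }

    /// Placeholder check: does the model response ask for a tool call?
    pub fn needs_tool(response: &str) -> bool {
        response.contains("function_call")
    }
}

/// Hypothetical gateway hook; NOT the real Praxis HttpFilter trait.
trait RequestFilter {
    fn on_request(&self, body: String) -> String;
}

/// Thin adapter: holds configuration and delegates to the core function.
struct HydrationFilter {
    history: Vec<&'static str>,
}

impl RequestFilter for HydrationFilter {
    fn on_request(&self, body: String) -> String {
        agentic_core::hydrate(&self.history, &body)
    }
}

/// Standalone mode: the same core functions, wired with plain control flow.
/// Returns the model response and whether a tool call was requested.
fn execute(history: &[&str], input: &str, infer: &dyn Fn(&str) -> String) -> (String, bool) {
    let prompt = agentic_core::hydrate(history, input);
    let response = infer(&prompt);
    let wants_tool = agentic_core::needs_tool(&response);
    (response, wants_tool)
}
```

Under this split, the filter and `execute()` cannot drift apart, since neither contains logic of its own.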
Summary
Replaces the hand-rolled Axum proxy with Praxis, a composable filter-based reverse proxy framework built on Pingora. Each gateway concern — proxying, auth, state hydration, tool dispatch, agentic looping — is an independent filter wired together via YAML configuration.
Why Praxis
- `HttpFilter` with hooks for request/response. Filters don't know about each other.

What changed
Filters introduced:
- `responses_proxy`: sets `ctx.upstream` to vLLM's `/v1/responses` endpoint and injects auth credentials. Praxis/Pingora handles the actual proxying and streaming natively.
- `state_hydration`: stub filter for conversation-state hydration. Inspects request body for `previous_response_id` and will call the state store to hydrate conversation history.
- `agentic_loop`: stub filter for agentic re-inference. Inspects response body for `function_call` output items and will re-enter the inference loop.
- `tool_dispatch`: stub filter for tool execution. Inspects response body for tool calls and will dispatch them.

Removed:
- `src/app.rs`, `src/proxy.rs`, `src/server.rs`: replaced by filters + Praxis server runtime
- `benches/proxy_bench.rs`: benchmark harness for the old Axum proxy (will be re-added)

Dependencies:
- `praxis`, `praxis-proxy-core`, `praxis-proxy-filter`, `praxis-test-utils`) via git at rev `2f7ea31`
- `/v1` are normalized to avoid `/v1/v1/responses` double-prefix

Docs:
- `README.md` with architecture diagram, filter table, and run instructions
- `docs/index.md` with architecture overview and Praxis context
- `docs/architecture/index.md` with Mermaid diagram, filter pipeline reference, streaming details, and component descriptions

Health endpoint:
- `admin: { address: "127.0.0.1:9901" }` in config)

Filter pipeline
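As an illustration only, the request phase of a filter chain with a shared context could look like the sketch below. The `Filter` trait, the `Ctx` struct, the filter ordering, and the `http://localhost:8000` base address are all assumptions made for this example; none of them is taken from Praxis or from this PR's configuration. The `previous_response_id` check is a substring match standing in for real JSON parsing.

```rust
/// Hypothetical per-request context shared by all filters in the chain.
#[derive(Debug, Default)]
struct Ctx {
    upstream: Option<String>,
    notes: Vec<String>,
}

/// Hypothetical filter interface; NOT the Praxis HttpFilter trait.
trait Filter {
    fn name(&self) -> &'static str;
    fn on_request(&self, ctx: &mut Ctx, body: &str);
}

/// Stub: only records that hydration would be needed.
struct StateHydration;

impl Filter for StateHydration {
    fn name(&self) -> &'static str {
        "state_hydration"
    }
    fn on_request(&self, ctx: &mut Ctx, body: &str) {
        if body.contains("\"previous_response_id\"") {
            ctx.notes.push("would hydrate history".to_string());
        }
    }
}

/// Selects the upstream; the proxy runtime would do the actual forwarding.
struct ResponsesProxy {
    vllm_base: String,
}

impl Filter for ResponsesProxy {
    fn name(&self) -> &'static str {
        "responses_proxy"
    }
    fn on_request(&self, ctx: &mut Ctx, _body: &str) {
        ctx.upstream = Some(format!("{}/v1/responses", self.vllm_base));
    }
}

/// Run every filter in order against one context; return it with the visit order.
fn run_chain(filters: &[Box<dyn Filter>], body: &str) -> (Ctx, Vec<&'static str>) {
    let mut ctx = Ctx::default();
    let mut visited = Vec::new();
    for filter in filters {
        filter.on_request(&mut ctx, body);
        visited.push(filter.name());
    }
    (ctx, visited)
}
```

The filters never call each other; they communicate only through `Ctx`, which is also where any coupling between them would show up.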
Test plan
- `cargo build` succeeds
- `cargo clippy --all-targets -- -D warnings` clean
- `cargo fmt -- --check` clean
- `pre-commit run --all-files` clean
- `test_non_stream_passthrough`: JSON request/response round-trip
- `test_stream_passthrough`: SSE streaming passthrough
- `test_auth_injection`: API key injected from config
- `test_client_auth_precedence`: client-supplied auth preserved
- `test_vllm_http_error_passthrough`: upstream 429 forwarded
- `test_mid_stream_failure_closes_cleanly`: partial stream handled
- `test_connect_error_maps_to_502`: unreachable vLLM returns 502
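Two of the behaviours exercised above, `/v1` normalization and auth precedence, are small enough to sketch as pure functions. This is a sketch under assumptions, not the code in this PR: the function names are hypothetical, and the `Bearer` scheme for the injected key is a guess at how the configured API key is sent.

```rust
/// Build the responses URL, tolerating a base that already ends in `/v1`
/// (with or without a trailing slash) so the result never contains `/v1/v1`.
fn responses_url(base: &str) -> String {
    let trimmed = base.trim_end_matches('/');
    let root = trimmed.strip_suffix("/v1").unwrap_or(trimmed);
    format!("{}/v1/responses", root)
}

/// Decide which Authorization value goes upstream.
/// A client-supplied header wins; otherwise the configured key is injected.
fn resolve_auth(client_header: Option<&str>, configured_key: Option<&str>) -> Option<String> {
    match (client_header, configured_key) {
        (Some(header), _) => Some(header.to_string()),
        // Assumption: the configured key is sent as a Bearer token.
        (None, Some(key)) => Some(format!("Bearer {}", key)),
        (None, None) => None,
    }
}
```

Keeping these as pure functions means they can be unit-tested without starting a proxy, alongside the end-to-end tests listed above.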