Skip to content

feat: wire plumbing — real query data, ProxyError, streaming body limit#59

Merged
yairfalse merged 2 commits into
mainfrom
feat/wire-plumbing
Mar 21, 2026
Merged

feat: wire plumbing — real query data, ProxyError, streaming body limit#59
yairfalse merged 2 commits into
mainfrom
feat/wire-plumbing

Conversation

@yairfalse
Copy link
Copy Markdown
Collaborator

Summary

  • Real query data: rauta status, rauta routes list, rauta diagnose now return live gateway state instead of empty placeholders
  • ProxyError enum: Eliminates e.starts_with("TIMEOUT:") string matching — errors are now typed with proper HTTP status codes
  • Streaming body limit: http_body_util::Limited enforces 10MB cap during reading, not after full collection (prevents OOM)

What changed

Router — new list_routes() method returns Vec<RouteSnapshot> with backend health scores and draining status

CircuitBreakerManager — new snapshot_all() and open_count() for iterating all breakers via ArcSwap

RateLimiter — new snapshot_all(), exhausted_count(), TokenBucket::capacity(), TokenBucket::refill_rate()

LocalGatewayQuery — fully wired:

  • snapshot() reports real open_circuits and exhausted_rate_limiters
  • list_routes() / list_circuit_breakers() / list_rate_limiters() return real data with filtering
  • diagnose() feeds real route/CB/RL data to the diagnostics engine

ProxyError — replaces Result<..., String> in forward_to_backend and handle_request:

  • ProxyError::Timeout → 504
  • ProxyError::BackendError → 502
  • ProxyError::BodyTooLarge → 413
  • From<String> impl for backward compat

Body streaminghttp_body_util::Limited::new(body, MAX_BODY_SIZE) wraps the body stream, aborting read at 10MB instead of buffering everything first

Test plan

  • 220 tests pass across 5 crates
  • clippy clean, fmt clean
  • Manual: curl localhost:9091/api/v1/status shows real open_circuits count
  • Manual: curl localhost:9091/api/v1/routes shows configured routes

🤖 Generated with Claude Code

Wire LocalGatewayQuery to return real data:
- Router.list_routes() exposes all routes as RouteSnapshot with backend
  health scores and draining status
- CircuitBreakerManager.snapshot_all() and open_count() iterate the
  ArcSwap breaker map
- RateLimiter.snapshot_all() and exhausted_count() iterate the ArcSwap
  bucket map, expose TokenBucket.capacity() and refill_rate()
- LocalGatewayQuery.snapshot() now reports real open_circuits and
  exhausted_rate_limiters counts
- list_routes/list_circuit_breakers/list_rate_limiters support filtering
  and return sorted results
- diagnose() passes real route/CB/RL data to the diagnostics engine

Wire ProxyError through the proxy (eliminates string matching):
- forward_to_backend returns Result<..., ProxyError> instead of String
- Timeout errors use ProxyError::Timeout (not "TIMEOUT:" prefix)
- Body limit errors use ProxyError::BodyTooLarge
- record_request_metrics uses e.status_code()/e.status_str() instead
  of e.starts_with("TIMEOUT:")
- error_to_response takes ProxyError, maps via status_code()
- From<String> impl provides backward compat for legacy error sites

Streaming body limit (prevents OOM):
- Use http_body_util::Limited to enforce 10MB cap during streaming
- Body reading stops at the limit instead of collecting everything first
- Returns 413 (ProxyError::BodyTooLarge) without buffering oversized data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR wires the control-plane “query” surfaces (status/routes/diagnose) to real in-process gateway state, replaces stringly-typed proxy errors with a structured ProxyError, and enforces request body size limits while streaming to prevent buffering large payloads.

Changes:

  • Add snapshot/list APIs on Router, CircuitBreakerManager, and RateLimiter to expose live route/CB/RL state to admin/CLI/MCP queries.
  • Introduce ProxyError with HTTP status mapping, and propagate it through request handling/forwarding/metrics.
  • Enforce a 10MB request body cap during streaming reads using http_body_util::Limited.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
control/src/proxy/router.rs Adds list_routes() to expose configured routes as agent-api snapshots including backend draining/health hints.
control/src/proxy/request_handler.rs Switches handler pipeline to ProxyError, updates metrics/error mapping, and keeps retry logic working under typed errors.
control/src/proxy/rate_limiter.rs Exposes token bucket capacity/refill rate and adds snapshot_all() + exhausted_count() for admin queries.
control/src/proxy/forwarder.rs Uses ProxyError and enforces streaming body size limit with http_body_util::Limited.
control/src/proxy/circuit_breaker.rs Adds snapshot_all() + open_count() for admin status/diagnostics.
control/src/error.rs Introduces ProxyError enum with status mapping and legacy From<String> conversion.
control/src/admin/local_query.rs Replaces placeholder query responses with live router/CB/RL data and adds filtering/sorting.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread control/src/proxy/forwarder.rs Outdated
Comment on lines +105 to +106
let err_str = e.to_string();
if err_str.contains("length limit exceeded") {
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Body-size limit detection is currently based on e.to_string().contains("length limit exceeded"), which is brittle and may break if the error message changes. Prefer matching on the concrete error type/variant returned by http_body_util::Limited (or using an API like is_limit_exceeded if available) rather than string inspection.

Suggested change
let err_str = e.to_string();
if err_str.contains("length limit exceeded") {
// Detect body-size limit violations using the concrete error type
// produced by `http_body_util::Limited`, rather than inspecting
// the error message string.
if e.is::<http_body_util::LengthLimitError>() {

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 589fa15 — now uses e.is::<http_body_util::LengthLimitError>() for type-safe detection instead of string matching.

Comment thread control/src/proxy/forwarder.rs Outdated
"Request body exceeded size limit"
);
return ProxyError::BodyTooLarge {
size: MAX_BODY_SIZE + 1,
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProxyError::BodyTooLarge is constructed with size: MAX_BODY_SIZE + 1, which is not the actual received size and can be misleading in logs/API responses. If the limiting layer doesn’t expose the real size, consider reporting size: MAX_BODY_SIZE (or omitting/setting to 0/None) and keeping max as the authoritative limit.

Suggested change
size: MAX_BODY_SIZE + 1,
size: MAX_BODY_SIZE,

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 589fa15 — now reports size: 0 since the actual received size is unknown when streaming was aborted at the limit. The max field still shows the configured cap.

Comment on lines +794 to 798
Err(ref e) => {
let status_code = e.status_code();
let status_str = e.status_str();

HTTP_REQUESTS_TOTAL
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With ProxyError now carrying client/proxy-side conditions (e.g., BodyTooLarge = 413), treating every Err(e) uniformly as a backend error can skew metrics/health. Consider branching on the ProxyError variant here so client-side errors don’t get counted the same way as backend failures (and aren’t later attributed to backend health/circuit breaker).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 589fa15BodyTooLarge errors now skip record_backend_response and circuit_breaker.record_failure. Only backend-originated errors (Timeout, BackendError) affect health metrics.

Comment thread control/src/error.rs Outdated
Comment on lines +74 to +77
/// Allow `?` on String errors (legacy compatibility during migration)
impl From<String> for ProxyError {
fn from(s: String) -> Self {
ProxyError::BackendError { message: s }
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From<String> for ProxyError maps all legacy String errors to BackendError (502). This can misclassify internal/proxy errors that still use String with ?, leading to incorrect HTTP status/metrics. Consider removing this blanket conversion once migration is complete, or narrowing it (e.g., only for known backend-connection strings) and updating call sites to construct the correct ProxyError variant explicitly.

Suggested change
/// Allow `?` on String errors (legacy compatibility during migration)
impl From<String> for ProxyError {
fn from(s: String) -> Self {
ProxyError::BackendError { message: s }
/// Allow `?` on String errors (legacy compatibility during migration).
/// Generic `String` errors are treated as internal/filter failures (500),
/// not as backend connection errors (502), to avoid misclassification.
impl From<String> for ProxyError {
fn from(s: String) -> Self {
ProxyError::FilterError { message: s }

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added transitional docs in 589fa15. The From<String> impl is explicitly marked as a migration shim with a note that callers should construct specific variants directly. Will be removed once all error sites are migrated.

Comment on lines 658 to +662
let retry_req = match retry_builder.body(Full::new(Bytes::new())) {
Ok(r) => r,
Err(e) => {
last_result = Err(format!("Failed to build retry request: {}", e));
last_result = Err(ProxyError::BackendError {
message: format!("Failed to build retry request: {}", e),
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When rebuilding the retry request, ensure the retry URI preserves the original request’s query string. Currently the retry path construction earlier in this block uses path (from req.uri().path()), so retries can silently drop ?query=... parameters and change GET/HEAD semantics.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 589fa15 — retry URI now uses path_and_query (from req.uri().path_and_query()) instead of just path. Query parameters like ?foo=bar are preserved across retries.

- Use LengthLimitError type check instead of string matching for body
  limit detection (fixes brittle e.to_string().contains() pattern)
- Report size: 0 in BodyTooLarge since actual size is unknown when
  streaming was aborted at the limit
- Don't count BodyTooLarge (client error) against backend health/circuit
  breaker — only backend-originated errors affect health metrics
- Preserve query string in retry URIs (was silently dropping ?query=...)
- Add transitional docs to From<String> for ProxyError

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yairfalse yairfalse merged commit 56d8ca4 into main Mar 21, 2026
2 checks passed
@yairfalse yairfalse deleted the feat/wire-plumbing branch March 21, 2026 15:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants