During rollout testing for #550, I noticed that every rollout produced one 503, caused by a connection timeout between the gateway and the backend. Without request tracing, the current logs make it too hard to correlate with the GKE LB logs and identify which request is actually failing. However, the timeout is recorded 3m13s after vLLM receives its first request, and my backend timeout is 3m, so the failing request must be one of the very early ones.
The test in this case is generating load such that the p99 request latency is about 87s, so it's unlikely the request itself requires 3m to complete. I don't see a preemption record.
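To make the timing argument concrete, here is a rough sketch of the arithmetic. It assumes the gateway's backend timeout clock starts when the request is forwarded to the backend, and it uses the readiness timestamp as a stand-in for the first request time (the two are close but not identical). The variable and function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Values taken from the observations in this issue.
BACKEND_TIMEOUT = timedelta(minutes=3)
TIMEOUT_AFTER_FIRST_REQUEST = timedelta(minutes=3, seconds=13)
P99_LATENCY = timedelta(seconds=87)


def failing_request_offset(timeout_offset, backend_timeout):
    """How long after the first request the failing request must have started."""
    return timeout_offset - backend_timeout


offset = failing_request_offset(TIMEOUT_AFTER_FIRST_REQUEST, BACKEND_TIMEOUT)

# Assumption: the first request arrived at roughly the readiness time.
first_request = datetime(2025, 3, 21, 13, 11, 22, tzinfo=timezone.utc)
failing_start = first_request + offset

print(f"failing request started ~{offset.total_seconds():.0f}s after the first request")
print(f"approximate start time: {failing_start.isoformat()}")
print(f"backend timeout is {BACKEND_TIMEOUT / P99_LATENCY:.1f}x the p99 latency")
```

Under these assumptions the failing request started within about 13 seconds of the first one, and the 3m timeout is more than twice the p99 latency, which is why a slow-but-normal request is an unlikely explanation.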
Deserves further investigation with request tracing on.
The startup probe delays the start of readiness until `2025-03-21T13:11:22Z`, and you can see that the first request timestamp is very close to that time (EPP starts sending traffic to the backend fairly quickly thanks to ready -> endpointslice -> EPP propagation).
GKE L7LB logs: