Task Summary
access-control-service/.../AccessControlResource.scala hosts two JAX-RS classes — LiteLLMProxyResource (/chat/*) and LiteLLMModelsResource (/models) — that exist only to forward HTTP requests to LiteLLM with the deployment's master API key. PR #5421 hardened these by adding @RolesAllowed("REGULAR", "ADMIN"), but a JVM service whose only job is to copy a request bytewise to another HTTP service is the wrong architecture:
- adds an extra network hop, and its latency, to every request (frontend → access-control-service → LiteLLM)
- couples LLM availability to the access-control-service deployment
- forces Texera's Scala code to maintain a hand-rolled HTTP proxy (header stripping and forwarding, status passthrough, error wrapping), where every line is a regression risk
- forces the LLM API key to live in the same process that handles unrelated routing concerns
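To make the regression-risk point concrete, here is a minimal sketch of the kind of header handling a hand-rolled proxy has to get right. It is not the code in AccessControlResource.scala: the object name `ProxyHeaders`, the function `forwardable`, and the exact set of headers dropped are assumptions chosen for illustration.

```scala
import java.util.Locale

// Hypothetical illustration only; names and header choices are assumptions,
// not the existing implementation.
object ProxyHeaders {
  // Hop-by-hop headers that a proxy should not forward to the upstream.
  private val hopByHop: Set[String] = Set(
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade"
  )

  // Headers the proxy recomputes or replaces itself.
  private val replaced: Set[String] = Set("authorization", "host", "content-length")

  /** Drop hop-by-hop and replaced headers, then attach the master key. */
  def forwardable(incoming: Map[String, String], masterKey: String): Map[String, String] = {
    val kept = incoming.filter { case (name, _) =>
      val n = name.toLowerCase(Locale.ROOT)
      !hopByHop.contains(n) && !replaced.contains(n)
    }
    kept + ("Authorization" -> s"Bearer $masterKey")
  }
}
```

Even this small sketch leaves out cases a real proxy must handle, such as headers named inside the `Connection` header value or repeated headers. Handling of that kind comes built into a dedicated reverse proxy.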
Possible replacements:
- Frontend talks to LiteLLM directly through Envoy / API gateway, with auth checked at the gateway and a short-lived per-user LLM token issued by access-control-service
- Move the proxy to a generic reverse-proxy in front of the cluster (NGINX, Envoy) with auth offloaded to JWT validation at the edge
- Use a managed AI gateway product (Vercel AI Gateway, LiteLLM's own UI gateway) instead of running our own JAX-RS class
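As a rough sketch of the first option, the short-lived per-user token could be an HMAC-signed value that access-control-service issues and the gateway verifies. Everything here is an assumption for illustration: the object name `ShortLivedLlmToken`, the `userId:expiry` payload format, and the shared-secret scheme. A real deployment would more likely issue a standard JWT so that the gateway's built-in JWT validation can check it.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical token format: base64url("userId:expiryEpochSec") + "." + base64url(HMAC-SHA256)
object ShortLivedLlmToken {
  private val encoder = Base64.getUrlEncoder.withoutPadding()
  private val decoder = Base64.getUrlDecoder

  private def sign(secret: Array[Byte], payload: String): Array[Byte] = {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(new SecretKeySpec(secret, "HmacSHA256"))
    mac.doFinal(payload.getBytes(UTF_8))
  }

  /** Issue a token for userId that expires ttlSec seconds after nowEpochSec. */
  def issue(secret: Array[Byte], userId: String, nowEpochSec: Long, ttlSec: Long): String = {
    val payload = s"$userId:${nowEpochSec + ttlSec}"
    encoder.encodeToString(payload.getBytes(UTF_8)) + "." +
      encoder.encodeToString(sign(secret, payload))
  }

  /** Return the user id if the signature is valid and the token has not expired. */
  def verify(secret: Array[Byte], token: String, nowEpochSec: Long): Option[String] =
    token.split('.') match {
      case Array(encodedPayload, encodedSig) =>
        try {
          val payload = new String(decoder.decode(encodedPayload), UTF_8)
          // Constant-time comparison of the presented and expected signatures.
          val sigOk = MessageDigest.isEqual(decoder.decode(encodedSig), sign(secret, payload))
          val idx = payload.lastIndexOf(':')
          if (!sigOk || idx < 0) None
          else {
            val expiry = payload.substring(idx + 1).toLong
            if (nowEpochSec < expiry) Some(payload.substring(0, idx)) else None
          }
        } catch {
          // Covers malformed base64 and a non-numeric expiry (NumberFormatException).
          case _: IllegalArgumentException => None
        }
      case _ => None
    }
}
```

With a scheme like this, the master API key would stay at the gateway or in LiteLLM's configuration, and the frontend would only ever hold a token that expires within minutes.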
PR #5421 leaves the existing proxy in place so the hardening can ship without a bigger architectural change; this issue tracks the replacement.
Task Type