Replace the LiteLLM HTTP proxy in access-control-service with a direct frontend → LiteLLM flow #5422

@Yicong-Huang

Task Summary

access-control-service/.../AccessControlResource.scala hosts two JAX-RS classes, LiteLLMProxyResource (/chat/*) and LiteLLMModelsResource (/models), that exist only to forward HTTP requests to LiteLLM with the deployment's master API key. PR #5421 hardened them by adding @RolesAllowed("REGULAR", "ADMIN"), but JVM resource classes whose only job is to copy a request byte for byte to another HTTP service are the wrong architecture. This approach:

  • adds an extra network hop, and its latency, to every request (frontend → access-control-service → LiteLLM)
  • couples LLM availability to the access-control-service deployment
  • forces Texera's Scala code to maintain a hand-rolled HTTP proxy (header stripping and forwarding, status passthrough, error wrapping), where every line is a regression risk
  • forces the LLM API key to live in the same process that handles unrelated routing concerns

Possible replacements:

  • Have the frontend talk to LiteLLM directly through Envoy or another API gateway, with auth checked at the gateway and a short-lived per-user LLM token issued by access-control-service
  • Move the proxy into a generic reverse proxy in front of the cluster (NGINX, Envoy), with auth offloaded to JWT validation at the edge
  • Use a managed AI gateway product (Vercel AI Gateway, LiteLLM's own UI gateway) instead of running our own JAX-RS class
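
To make the first option more concrete, here is a minimal sketch of how a short-lived per-user token could be issued and then checked at the gateway. It is written in Python with only the standard library so it can be run as-is; it is not the Texera implementation. The function names (issue_token, verify_token), the token layout, and the 300-second default lifetime are all assumptions made for illustration. A real deployment would more likely use a standard JWT library on the access-control-service side and the gateway's own JWT validation.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Roles taken from the @RolesAllowed annotation mentioned in this issue.
ALLOWED_ROLES = {"REGULAR", "ADMIN"}


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    """Inverse of _b64: restore the padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, body: str) -> str:
    return _b64(hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest())


def issue_token(secret: bytes, user_id: str, role: str,
                ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Hypothetical issuer side (access-control-service): mint a signed token."""
    if role not in ALLOWED_ROLES:
        raise PermissionError(f"role {role!r} may not use the LLM endpoints")
    issued_at = int(time.time() if now is None else now)
    payload = {"sub": user_id, "role": role, "exp": issued_at + ttl_seconds}
    body = _b64(json.dumps(payload, sort_keys=True).encode("utf-8"))
    return f"{body}.{_sign(secret, body)}"


def verify_token(secret: bytes, token: str,
                 now: Optional[float] = None) -> Optional[dict]:
    """Hypothetical gateway side: return the claims, or None if invalid or expired."""
    try:
        body, signature = token.split(".")
        expected = _sign(secret, body)
        # Constant-time comparison, done before the payload is parsed.
        if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
            return None
        payload = json.loads(_unb64(body))
    except ValueError:
        # Wrong number of parts, non-ASCII input, bad base64, or bad JSON.
        return None
    current = time.time() if now is None else now
    if payload.get("role") not in ALLOWED_ROLES or payload.get("exp", 0) <= current:
        return None
    return payload
```

With this shape the master API key never leaves the gateway or LiteLLM: the frontend only ever holds a token that expires within minutes, and access-control-service is involved once per token rather than once per chat request.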

PR #5421 leaves the existing proxy in place so the hardening can ship without a bigger architectural change; this issue tracks the replacement.

Task Type

  • Refactor / Cleanup
