Feature/openwebui litellm deployment #342

jeremyeder · 2025-11-19T01:35:23Z

This is a prototype UX for using Open WebUI to interact with the new Amber codebase agent added in #337.

Add a Kubernetes-native deployment of Open WebUI with LiteLLM proxy for chatting with Claude models. This Phase 1 implementation provides a quick, dev-friendly deployment to Kind cluster with minimal configuration.

Components:
- Base manifests (namespace, deployments, services, PVC, RBAC)
- LiteLLM proxy configured for Claude Sonnet 4.5, 3.7, and Haiku 3.5
- Open WebUI frontend with persistent storage
- Phase 1 overlay for Kind deployment with nginx-ingress
- Comprehensive documentation (README, Phase 1 guide, Phase 2 plan)
- Makefile for deployment automation

Architecture:
- Namespace: openwebui (isolated from ACP)
- Ingress: vteam.local/chat (reuses Kind cluster from e2e)
- Auth: Disabled in Phase 1 (dev/testing only)
- Storage: 500Mi PVC for chat history
- Images: ghcr.io/berriai/litellm, ghcr.io/open-webui/open-webui

Phase 2 (planned):
- OAuth authentication via oauth2-proxy
- Long-running Claude Code service for Amber integration
- Production hardening (secrets, RBAC, monitoring)
- OpenShift compatibility (Routes, SCC compliance)

Deployment:

```bash
cd components/open-webui-llm
# Edit overlays/phase1-kind/secrets.yaml with API key
make phase1-deploy
# Access: http://vteam.local:8080/chat (Podman) or /chat (Docker)
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

- Increase memory limit from 512Mi to 2Gi to prevent OOMKilled crashes
- Increase CPU limit from 500m to 1000m for better performance
- Update health probe paths to LiteLLM-specific endpoints:
  - /health/liveliness for liveness probe
  - /health/readiness for readiness probe
- Increase resource requests for stability

Fixes LiteLLM pod crash loop due to insufficient memory allocation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

github-actions · 2025-11-19T01:38:57Z

Claude Code Review

Summary

This PR introduces a well-structured deployment of Open WebUI + LiteLLM for interacting with Claude models via the Amber codebase agent. The implementation is clean, well-documented, and follows Kubernetes best practices with Kustomize overlays. This is clearly marked as "Phase 1" - a development/prototype deployment with authentication disabled, which is appropriate for initial integration testing.

Overall Assessment: Strong implementation with excellent documentation. However, there are critical security issues that must be addressed before merge, plus several important improvements for production readiness.

Issues by Severity

🚫 Blocker Issues

1. Missing SecurityContext - Pod Security Standards Violation ⚠️

Location: base/litellm/deployment.yaml, base/open-webui/deployment.yaml
Issue: No securityContext defined for containers, violating CLAUDE.md backend development standards
Risk: Pods run as root by default, fail on OpenShift, bypass security policies

Required Fix: Add SecurityContext per CLAUDE.md patterns (components/backend/handlers/sessions.go:470+):

securityContext:
  allowPrivilegeEscalation: false
  runAsNonRoot: true
  capabilities:
    drop:
      - ALL
  # readOnlyRootFilesystem: true  # Only if /tmp writes not needed

Why Blocker: Security vulnerability + won't deploy to OpenShift (SecurityContextConstraints)

2. Hardcoded Secrets in Git Repository 🔐

Location: overlays/phase1-kind/secrets.yaml:10-11,22
Issue: Master keys and placeholder API keys committed to git
Risk: Security anti-pattern, even for dev keys
Required Fix:
- Remove secrets.yaml from git, add to .gitignore
- Create secrets.yaml.example with placeholder values
- Update README to instruct: cp secrets.yaml.example secrets.yaml
Why Blocker: Violates security best practices, sets bad precedent
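
A committed secrets.yaml.example could look roughly like the sketch below. The Secret name litellm-secrets and both key names are assumptions for illustration (only openwebui-secrets is named in the manifests reviewed here); the placeholder values are meant to be replaced in a local, git-ignored copy.

```yaml
# overlays/phase1-kind/secrets.yaml.example
# Copy to secrets.yaml (listed in .gitignore) and fill in real values locally.
apiVersion: v1
kind: Secret
metadata:
  name: litellm-secrets              # hypothetical name
  namespace: openwebui
type: Opaque
stringData:
  ANTHROPIC_API_KEY: "REPLACE_ME"    # hypothetical key name
  LITELLM_MASTER_KEY: "REPLACE_ME"   # hypothetical key name
```

Using stringData keeps the template readable, since values do not need to be base64-encoded by hand.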

3. Authentication Disabled in Production-Capable Deployment 🔓

Location: base/open-webui/deployment.yaml:36-37
Issue: WEBUI_AUTH: "false" with no clear runtime override mechanism
Risk: Anyone with network access can use UI and consume API quota
Required Fix:
- Keep auth disabled for Phase 1 BUT add prominent warnings in:
  - Deployment manifests (comments)
  - README (⚠️ WARNING sections)
  - Makefile help text
- OR: Enable basic auth by default with documented override for dev
Why Blocker: Too easy to accidentally deploy to production without auth
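
If auth stays disabled for Phase 1, the manifest warning could be as simple as the following excerpt. This is a sketch of the comment wording only; WEBUI_AUTH and its value are taken from the PR, and the surrounding structure of the env list is assumed.

```yaml
# base/open-webui/deployment.yaml (container env excerpt)
env:
  # WARNING: authentication is DISABLED. Phase 1 / local Kind use only.
  # Anyone who can reach the Service can chat and consume API quota.
  # Do not apply this manifest to a shared or production cluster.
  - name: WEBUI_AUTH
    value: "false"
```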

🔴 Critical Issues

4. Resource Limits Too High for Default Kind Cluster

Location: base/litellm/deployment.yaml:54-55, base/open-webui/deployment.yaml:44-46
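Issue: Total limits = 2 CPU, 3Gi RAM. Kind nodes share whatever the host or container VM provides, so on a typical local setup (for example a 4 CPU, 8Gi RAM Docker VM) this is a large share of the cluster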
Impact: May not deploy on constrained environments, wastes resources
Recommendation:
- Lower limits in base: litellm (500m CPU, 1Gi RAM), openwebui (500m CPU, 512Mi RAM)
- Override in production overlay for higher limits
- Add resource quota monitoring in Makefile
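
One way to keep the base small and raise limits per environment is a strategic merge patch in an overlay. The sketch below assumes a hypothetical overlays/production directory and assumes the Deployment and container are both named litellm; the limit values are the ones from the PR's second commit.

```yaml
# overlays/production/litellm-resources-patch.yaml (hypothetical overlay)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: litellm              # assumed Deployment name
spec:
  template:
    spec:
      containers:
        - name: litellm      # assumed container name; must match the base
          resources:
            limits:
              cpu: 1000m
              memory: 2Gi
```

The overlay's kustomization.yaml would reference this file under its patches list (as a path entry), leaving the lower defaults in base untouched.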

5. Using :main and :main-latest Image Tags 🏷️

Location: base/kustomization.yaml:22-24
Issue: Mutable tags mean non-reproducible deployments
Impact: Breaks in prod if upstream changes, hard to debug version issues

Recommendation: Pin to specific SHA or version tags:

- name: ghcr.io/berriai/litellm
  newTag: v1.49.3  # Example version; to pin by digest, set the digest: field instead of newTag
- name: ghcr.io/open-webui/open-webui
  newTag: v0.3.19

6. No Network Policies 🌐

Location: Missing entirely
Issue: Pods can egress to any external service
Risk: Data exfiltration, unnecessary attack surface
Recommendation: Add NetworkPolicy to restrict:
- LiteLLM → only Anthropic API (anthropic.com)
- Open WebUI → only LiteLLM service
- Deny all other egress
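
A minimal sketch of such policies follows. Standard NetworkPolicy selects by pod, namespace, or CIDR rather than by DNS name, so restricting LiteLLM to the Anthropic hostname specifically would need a CNI-specific policy type or an egress proxy; the sketch settles for DNS plus outbound HTTPS. The app: open-webui label and port 4000 for LiteLLM are assumptions, and enforcement depends on the cluster's CNI plugin.

```yaml
# Default-deny egress for every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-egress
  namespace: openwebui
spec:
  podSelector: {}
  policyTypes:
    - Egress
---
# LiteLLM: allow DNS lookups and outbound HTTPS only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: litellm-egress
  namespace: openwebui
spec:
  podSelector:
    matchLabels:
      app: litellm
  policyTypes:
    - Egress
  egress:
    - ports:                 # DNS
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    - ports:                 # HTTPS to external APIs
        - protocol: TCP
          port: 443
---
# Open WebUI: allow DNS and traffic to the LiteLLM pods only.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: open-webui-egress
  namespace: openwebui
spec:
  podSelector:
    matchLabels:
      app: open-webui        # assumed label
  policyTypes:
    - Egress
  egress:
    - ports:                 # DNS
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    - to:
        - podSelector:
            matchLabels:
              app: litellm
      ports:
        - protocol: TCP
          port: 4000         # assumed LiteLLM container port
```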

7. Probes Reference Wrong Endpoints

Location: base/litellm/deployment.yaml:58,64
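Issue: Using /health/liveliness, which reads like a typo of /health/liveness; LiteLLM's health-check documentation does list /health/liveliness, so the spelling is likely intentional and should be confirmed against the deployed version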
Impact: Probes may fail if endpoint doesn't exist (need to verify upstream)
Action: Test actual LiteLLM endpoints or use /health for both
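
For reference while verifying, the probe wiring would look roughly like this. The paths are the ones the PR currently uses; port 4000 and all timing values are assumptions and should be adjusted to the actual container spec.

```yaml
# base/litellm/deployment.yaml (container excerpt)
livenessProbe:
  httpGet:
    path: /health/liveliness
    port: 4000               # assumed container port
  initialDelaySeconds: 30    # assumed timing
  periodSeconds: 15
readinessProbe:
  httpGet:
    path: /health/readiness
    port: 4000               # assumed container port
  initialDelaySeconds: 10    # assumed timing
  periodSeconds: 10
```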

8. No Horizontal Pod Autoscaling

Location: Missing HPA resources
Issue: Single replica can't handle load spikes
Recommendation: Add HPA for both deployments (min: 1, max: 3, target: 70% CPU)
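
A sketch of the suggested autoscaler for LiteLLM, assuming the Deployment is named litellm. CPU utilization is measured against the container's CPU request, and the HPA needs a metrics source such as metrics-server, which Kind does not install by default.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: litellm
  namespace: openwebui
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: litellm            # assumed Deployment name
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

The same shape would apply to Open WebUI, though scaling it past one replica may be limited by how its PVC is mounted, so that is worth checking first.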

🟡 Major Issues

9. Missing Test Coverage

Location: No test files in components/open-webui-llm/
Issue: No automated tests for deployment health, connectivity, or functionality
Recommendation: Add tests following e2e pattern:
- Smoke test: deploy → wait for ready → curl health endpoints → undeploy
- Integration test: send test message via API → verify response
- See: e2e/cypress/e2e/vteam.cy.ts for reference

10. PVC Has No Backup Strategy

Location: base/open-webui/pvc.yaml
Issue: Chat history lost on PVC deletion, no backup documented
Recommendation:
- Add VolumeSnapshot CRD usage example in docs
- Document export/import procedure in README
- Add Makefile target: make phase1-backup, make phase1-restore
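
The VolumeSnapshot example for the docs could be as short as the sketch below. The snapshot name, the snapshot class, and the PVC name open-webui-data are all hypothetical. Snapshots also require the snapshot CRDs, a snapshot controller, and a CSI driver that supports them, which a default Kind cluster may not provide.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: open-webui-data-snapshot       # hypothetical name
  namespace: openwebui
spec:
  volumeSnapshotClassName: csi-snapclass   # hypothetical; cluster-specific
  source:
    persistentVolumeClaimName: open-webui-data   # assumed PVC name
```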

11. Secrets Reference Missing Secret Objects

Location: base/open-webui/deployment.yaml:31-35
Issue: References openwebui-secrets but it's only in overlay, not base
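Impact: If the base is applied without the overlay, the Secret does not exist; because optional: true is already set the pods still start, but the variables are silently unset, which is confusing to debug
Recommendation: Move a secret template to base/, or document in the manifest that the reference is optional and that the overlay supplies the Secret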
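
For documentation purposes, an optional Secret reference looks like the excerpt below. The Secret name openwebui-secrets comes from the PR; the variable and key names are hypothetical.

```yaml
# base/open-webui/deployment.yaml (container env excerpt)
env:
  - name: OPENAI_API_KEY             # hypothetical variable name
    valueFrom:
      secretKeyRef:
        name: openwebui-secrets
        key: OPENAI_API_KEY          # hypothetical key name
        optional: true   # container starts even if the Secret or key is absent
```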

12. Ingress Path Rewrite May Break Assets

Location: overlays/phase1-kind/ingress.yaml:9
Issue: rewrite-target: /$2 may break asset loading if app expects /chat prefix
Action: Test thoroughly, document known issues
Alternative: Use dedicated subdomain (chat.vteam.local) to avoid rewrites
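
The subdomain alternative could be sketched as follows. No rewrite annotation is needed because the app is served from the root path. The Ingress name, the Service name open-webui, and the Service port 8080 are assumptions, and chat.vteam.local would need a hosts-file entry alongside vteam.local.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: open-webui               # assumed name
  namespace: openwebui
spec:
  ingressClassName: nginx
  rules:
    - host: chat.vteam.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: open-webui   # assumed Service name
                port:
                  number: 8080     # assumed Service port
```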

13. No Health Check in Makefile Test Target

Location: Makefile:58-67
Issue: Test target only checks curl success, doesn't validate response
Recommendation: Parse JSON responses, check for expected fields

14. Hardcoded Master Key Exposed in Test Commands

Location: Makefile:65, README examples
Issue: Master key sk-litellm-dev-master-key shown in plaintext
Recommendation: Read from secret at runtime

🔵 Minor Issues

15. Inconsistent Label Usage

Observation: Some resources use app: litellm, others add app.kubernetes.io/name: litellm
Recommendation: Standardize on both everywhere (Kubernetes recommended labels)

16. Documentation Uses Emojis Inconsistently

Location: README.md, PHASE1.md
Issue: CLAUDE.md states "Only use emojis if the user explicitly requests it"
Recommendation: Remove emojis from docs (✅, ❌, ⚠️) for consistency

17. Namespace Not Configurable via Kustomize

Location: base/namespace.yaml:3, Makefile:5
Issue: Hardcoded to openwebui, limits multi-environment deployments
Recommendation: Use Kustomize namespace transformation in overlays
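
In an overlay this is a one-line change. The sketch below uses a hypothetical namespace openwebui-dev and assumes the overlay sits two levels below the component root next to base/; the Makefile's namespace variable would need to match.

```yaml
# overlays/phase1-kind/kustomization.yaml (excerpt)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: openwebui-dev     # hypothetical per-environment namespace
resources:
  - ../../base
```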

18. No Pod Disruption Budget

Location: Missing PDB resources
Impact: Rolling updates may cause downtime
Recommendation: Add PDB for production overlay (minAvailable: 1)
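
A sketch of the suggested budget for LiteLLM, using the app: litellm label already present on some resources. PDBs govern voluntary evictions such as node drains, and with a single replica minAvailable: 1 blocks those evictions entirely, so this is best paired with at least two replicas in the production overlay.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: litellm
  namespace: openwebui
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: litellm
```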

19. Missing Prometheus ServiceMonitor

Location: No observability resources
Recommendation: Add ServiceMonitor if Prometheus available, document metrics endpoints

20. Verbose ConfigMap Could Be Externalized

Location: base/litellm/configmap.yaml:10-34
Observation: 25-line YAML in ConfigMap, hard to maintain
Recommendation: Mount from file in overlays for easier editing
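
One option is Kustomize's configMapGenerator, sketched below. The ConfigMap name litellm-config and the file name are hypothetical, and the existing ConfigMap in base would need to be removed or reconciled with the generated one.

```yaml
# overlays/phase1-kind/kustomization.yaml (excerpt)
configMapGenerator:
  - name: litellm-config                    # assumed ConfigMap name
    files:
      - config.yaml=litellm-config.yaml     # hypothetical file next to this kustomization
```

By default the generator appends a content hash to the ConfigMap name and rewrites references to it, so editing the file also triggers a rollout of the pods that mount it.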

Positive Highlights

✨ Excellent Documentation

Comprehensive README with troubleshooting section
Clear phase separation (Phase 1 vs Phase 2 planning)
Step-by-step deployment instructions
Thoughtful Makefile with helpful targets

✨ Proper Kustomize Structure

Clean base/overlay separation
Environment-specific patches (pvc-patch, secrets)
Reusable base resources

✨ Good Operational Tooling

Makefile abstracts complexity well
Health check commands included
Shell access targets for debugging
Comprehensive help text

✨ Well-Organized Component Structure

Logical directory layout
Clear separation of concerns (litellm/, open-webui/)
Phase-based roadmap (PHASE2.md shows forward planning)

✨ Recent Commit Shows Iteration

Fixed real issues (memory limits, health probes)
Demonstrates testing and refinement
Good commit message with rationale

Recommendations

Immediate (Before Merge)

✅ Add SecurityContext to both deployments (see Blocker 1)
✅ Remove secrets.yaml, create .example file (see Blocker 2)
✅ Add prominent auth warning to README and deployment comments (see Blocker 3)
✅ Pin image tags to specific versions (see Critical issue 5)
✅ Verify health probe endpoints (/health/liveness vs /liveliness)
✅ Add basic smoke test (deploy → health check → cleanup)

Short-Term (Next PR)

🔄 Add NetworkPolicies for egress restriction
🔄 Reduce base resource limits, move higher limits to production overlay
🔄 Add HPA resources for scalability
🔄 Document backup/restore procedure for PVC

Long-Term (Phase 2)

🚀 Implement OAuth per PHASE2.md plan
🚀 Add OpenShift Routes alternative to Ingress
🚀 Integrate Amber agent for long-running sessions
🚀 Add comprehensive test suite following e2e patterns

Architecture Fit

Integration with Ambient Code Platform: ✅ Good

Uses same ingress pattern as existing e2e tests
Compatible with Kind cluster setup
Namespace isolation follows platform patterns
Ready for future integration with Amber (Feature/add codebase agent #337)

Deviations from CLAUDE.md:

❌ Missing SecurityContext (Critical - see backend/operator standards)
⚠️ No tests (Expected per "Testing Strategy" section)
⚠️ Secrets in git (Violates "Production Considerations" → Security)

Final Verdict

Do Not Merge until Blocker issues resolved (#1, #2, #3).

After fixes, this is a solid Phase 1 prototype that:

Provides immediate value (chat with Claude via UI)
Sets good foundation for Phase 2
Demonstrates proper Kubernetes patterns
Includes excellent documentation

The core implementation is strong - the issues are mostly about security hardening and production readiness, which is expected for a Phase 1 prototype. Focus on the 3 blockers, then this is good to merge.

Estimated Effort for the Immediate (Before Merge) Items: ~2-3 hours

SecurityContext: 30 min
Secrets refactor: 60 min
Auth warnings: 30 min
Image pinning: 15 min
Testing: 30 min

Great work on this integration! The documentation quality and phased approach show excellent planning. 🎯

jeremyeder and others added 2 commits November 18, 2025 20:16

jeremyeder enabled auto-merge (squash) November 19, 2025 01:35
