Feature/openwebui litellm deployment #342
Add a Kubernetes-native deployment of Open WebUI with LiteLLM proxy for chatting with Claude models. This Phase 1 implementation provides a quick, dev-friendly deployment to Kind cluster with minimal configuration.

Components:
- Base manifests (namespace, deployments, services, PVC, RBAC)
- LiteLLM proxy configured for Claude Sonnet 4.5, 3.7, and Haiku 3.5
- Open WebUI frontend with persistent storage
- Phase 1 overlay for Kind deployment with nginx-ingress
- Comprehensive documentation (README, Phase 1 guide, Phase 2 plan)
- Makefile for deployment automation

Architecture:
- Namespace: openwebui (isolated from ACP)
- Ingress: vteam.local/chat (reuses Kind cluster from e2e)
- Auth: Disabled in Phase 1 (dev/testing only)
- Storage: 500Mi PVC for chat history
- Images: ghcr.io/berriai/litellm, ghcr.io/open-webui/open-webui

Phase 2 (planned):
- OAuth authentication via oauth2-proxy
- Long-running Claude Code service for Amber integration
- Production hardening (secrets, RBAC, monitoring)
- OpenShift compatibility (Routes, SCC compliance)

Deployment:
```bash
cd components/open-webui-llm
# Edit overlays/phase1-kind/secrets.yaml with API key
make phase1-deploy
# Access: http://vteam.local:8080/chat (Podman) or /chat (Docker)
```

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
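For readers unfamiliar with LiteLLM, the proxy configuration mentioned in the component list might look roughly like the sketch below. The PR's actual ConfigMap is not shown in this conversation, so the model aliases, the dated Anthropic model identifiers, and the environment variable names are all assumptions chosen for illustration.

```yaml
# Hypothetical LiteLLM proxy config; not copied from the PR.
model_list:
  - model_name: claude-sonnet-4-5            # alias exposed to Open WebUI (assumed)
    litellm_params:
      model: anthropic/claude-sonnet-4-5-20250929
      api_key: os.environ/ANTHROPIC_API_KEY  # read from the pod environment (assumed name)
  - model_name: claude-3-7-sonnet            # assumed alias
    litellm_params:
      model: anthropic/claude-3-7-sonnet-20250219
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: claude-3-5-haiku             # assumed alias
    litellm_params:
      model: anthropic/claude-3-5-haiku-20241022
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY  # assumed variable name
```

Referencing keys through `os.environ/...` keeps the API key and master key out of the ConfigMap, so they can be supplied from a Secret instead.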
- Increase memory limit from 512Mi to 2Gi to prevent OOMKilled crashes
- Increase CPU limit from 500m to 1000m for better performance
- Update health probe paths to LiteLLM-specific endpoints:
  - /health/liveliness for liveness probe
  - /health/readiness for readiness probe
- Increase resource requests for stability

Fixes LiteLLM pod crash loop due to insufficient memory allocation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
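As a rough illustration of what this commit describes, the LiteLLM container spec could end up looking something like the fragment below. The limits and probe paths come from the commit message; the container name, port, request values, and probe timings are assumptions, since the commit does not state them.

```yaml
# Sketch of a container spec fragment; image and other fields omitted.
containers:
  - name: litellm                  # assumed container name
    ports:
      - containerPort: 4000        # assumed: LiteLLM proxy default port
    resources:
      requests:
        memory: "1Gi"              # assumed; the commit only says requests were increased
        cpu: "500m"                # assumed
      limits:
        memory: "2Gi"              # raised from 512Mi (from the commit message)
        cpu: "1000m"               # raised from 500m (from the commit message)
    livenessProbe:
      httpGet:
        path: /health/liveliness   # LiteLLM-specific path named in the commit
        port: 4000
      initialDelaySeconds: 30      # assumed timing
      periodSeconds: 10            # assumed timing
    readinessProbe:
      httpGet:
        path: /health/readiness    # LiteLLM-specific path named in the commit
        port: 4000
      initialDelaySeconds: 10      # assumed timing
      periodSeconds: 5             # assumed timing
```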
Claude Code Review

Summary

This PR introduces a well-structured deployment of Open WebUI + LiteLLM for interacting with Claude models via the Amber codebase agent. The implementation is clean, well-documented, and follows Kubernetes best practices with Kustomize overlays. This is clearly marked as "Phase 1" - a development/prototype deployment with authentication disabled, which is appropriate for initial integration testing.

Overall Assessment: Strong implementation with excellent documentation. However, there are critical security issues that must be addressed before merge, plus several important improvements for production readiness.

Issues by Severity

🚫 Blocker Issues

1. Missing SecurityContext - Pod Security Standards Violation
2. Hardcoded Secrets in Git Repository 🔐
3. Authentication Disabled in Production-Capable Deployment 🔓
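The detail behind blocker 1 is collapsed in this view, so the reviewer's exact recommendation is not reproduced here. As a minimal sketch, a Deployment that aims at the "restricted" Pod Security Standard typically carries settings like the following. The container name is hypothetical, and whether the upstream images actually run as a non-root user has not been verified, so `runAsNonRoot` may need image-specific adjustments.

```yaml
# Hypothetical Deployment fragment; field values follow the restricted
# Pod Security Standard, not the PR's actual manifests.
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true                 # may fail if the image runs as root
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: open-webui                 # assumed container name
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
```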
🔴 Critical Issues

4. Resource Limits Too High for Default Kind Cluster
5. Using
6. No Network Policies 🌐
7. Probes Reference Wrong Endpoints
8. No Horizontal Pod Autoscaling
🟡 Major Issues

9. Missing Test Coverage
10. PVC Has No Backup Strategy
11. Secrets Reference Missing Secret Objects
12. Ingress Path Rewrite May Break Assets
13. No Health Check in Makefile Test Target
14. Hardcoded Master Key Exposed in Test Commands
🔵 Minor Issues

15. Inconsistent Label Usage
16. Documentation Uses Emojis Inconsistently
17. Namespace Not Configurable via Kustomize
18. No Pod Disruption Budget
19. Missing Prometheus ServiceMonitor
20. Verbose ConfigMap Could Be Externalized
Positive Highlights

✨ Excellent Documentation
✨ Proper Kustomize Structure
✨ Good Operational Tooling
✨ Well-Organized Component Structure
✨ Recent Commit Shows Iteration
Recommendations

Immediate (Before Merge)
Short-Term (Next PR)
Long-Term (Phase 2)
Architecture Fit

Integration with Ambient Code Platform: ✅ Good
Deviations from CLAUDE.md:
Final Verdict

Do Not Merge until the Blocker issues are resolved (#1, #2, #3). After fixes, this is a solid Phase 1 prototype that:
The core implementation is strong - the issues are mostly about security hardening and production readiness, which is expected for a Phase 1 prototype. Focus on the 3 blockers, then this is good to merge. Estimated Effort to Fix Blockers: ~2-3 hours
Great work on this integration! The documentation quality and phased approach show excellent planning. 🎯
This is a prototype UX for using Open WebUI to interact with the new Amber codebase agent added in #337.